Vipina K. Keloth, PhD
Associate Research Scientist in Biomedical Informatics and Data Science
Publications
2025
Medical foundation large language models for comprehensive text analysis and beyond
Xie Q, Chen Q, Chen A, Peng C, Hu Y, Lin F, Peng X, Huang J, Zhang J, Keloth V, Zhou X, Qian L, He H, Shung D, Ohno-Machado L, Wu Y, Xu H, Bian J. Medical foundation large language models for comprehensive text analysis and beyond. Npj Digital Medicine 2025, 8: 141. PMID: 40044845, PMCID: PMC11882967, DOI: 10.1038/s41746-025-01533-1.
Peer-Reviewed Original Research
Keywords: Text analysis tasks; Analysis tasks; Language model; Domain-specific knowledge; Zero-Shot; Human evaluation; Supervised setting; Task-specific instructions; Clinical data sources; Specialized medical knowledge; ChatGPT; Text analysis; Pretraining; Task; Data sources; Medical applications; Medical knowledge; Enhanced performance; Text; Performance
2024
Using clinical entity recognition for curating an interface terminology to aid fast skimming of EHRs
Kollapally N, Dehkordi M, Perl Y, Geller J, Deek F, Liu H, Keloth V, Elhanan G, Einstein A, Zhou S. Using clinical entity recognition for curating an interface terminology to aid fast skimming of EHRs. 2024, 00: 6427-6434. DOI: 10.1109/bibm62325.2024.10822845.
Peer-Reviewed Original Research
Keywords: Electronic health records; Entity recognition; Volume of electronic health records; EHR interoperability; Clinical entity recognition; Clinical terminology; Neural network model; Clinical NER; Transfer learning; SNOMED CT concepts; Interface terminology; Network model; SNOMED CT; Health records; Higher granularity; CT concepts; Overworked physicians; Healthcare providers; Dense volume; Recognition; Interoperability; Curation; Granularity; Cardiology patients; NER

Detection of Gastrointestinal Bleeding With Large Language Models to Aid Quality Improvement and Appropriate Reimbursement
Zheng N, Keloth V, You K, Kats D, Li D, Deshpande O, Sachar H, Xu H, Laine L, Shung D. Detection of Gastrointestinal Bleeding With Large Language Models to Aid Quality Improvement and Appropriate Reimbursement. Gastroenterology 2024, 168: 111-120.e4. PMID: 39304088, DOI: 10.1053/j.gastro.2024.09.014.
Peer-Reviewed Original Research
Keywords: Electronic health records; Overt gastrointestinal bleeding; Gastrointestinal bleeding; Recurrent bleeding; Machine learning models; Health records; Clinically relevant applications; Nursing notes; Language model; Acute gastrointestinal bleeding; Quality improvement; Learning models; Detection of gastrointestinal bleeding; Reimbursement; Identification of clinical conditions; Separate hospitals; Quality measures; Hospital; Bleeding; Clinical conditions; Patient management; Early identification; Patients; Reimbursement codes; Coding algorithm

A Study of Biomedical Relation Extraction Using GPT Models.
Zhang J, Wibert M, Zhou H, Peng X, Chen Q, Keloth V, Hu Y, Zhang R, Xu H, Raja K. A Study of Biomedical Relation Extraction Using GPT Models. AMIA Joint Summits On Translational Science Proceedings 2024, 2024: 391-400. PMID: 38827097, PMCID: PMC11141827.
Peer-Reviewed Original Research

543 IDENTIFYING OVERT SIGNS OF ACUTE GASTROINTESTINAL BLEEDING IN THE ELECTRONIC HEALTH RECORD WITH LARGE LANGUAGE MODELS
Zheng N, Keloth V, You K, Li D, Xu H, Laine L, Shung D. 543 IDENTIFYING OVERT SIGNS OF ACUTE GASTROINTESTINAL BLEEDING IN THE ELECTRONIC HEALTH RECORD WITH LARGE LANGUAGE MODELS. Gastroenterology 2024, 166: s-124-s-125. DOI: 10.1016/s0016-5085(24)00776-5.
Peer-Reviewed Original Research

1244 AUTOMATED IDENTIFICATION OF RECURRENT GASTROINTESTINAL BLEEDING USING ELECTRONIC HEALTH RECORDS AND LARGE LANGUAGE MODELS
Zheng N, Keloth V, You K, Li D, Xu H, Laine L, Shung D. 1244 AUTOMATED IDENTIFICATION OF RECURRENT GASTROINTESTINAL BLEEDING USING ELECTRONIC HEALTH RECORDS AND LARGE LANGUAGE MODELS. Gastroenterology 2024, 166: s-292. DOI: 10.1016/s0016-5085(24)01152-1.
Peer-Reviewed Original Research

Ensemble pretrained language models to extract biomedical knowledge from literature
Li Z, Wei Q, Huang L, Li J, Hu Y, Chuang Y, He J, Das A, Keloth V, Yang Y, Diala C, Roberts K, Tao C, Jiang X, Zheng W, Xu H. Ensemble pretrained language models to extract biomedical knowledge from literature. Journal Of The American Medical Informatics Association 2024, 31: 1904-1911. PMID: 38520725, PMCID: PMC11339500, DOI: 10.1093/jamia/ocae061.
Peer-Reviewed Original Research
Keywords: Natural language processing; Natural language processing systems; Language model; Expansion of biomedical literature; Zero-shot setting; Manually annotated corpus; Knowledge graph development; Task-specific models; Domain-specific models; Zero-Shot; Entity recognition; Billion parameters; Ensemble learning; Location information; Knowledge bases; Biomedical entities; Language processing; Free text; Graph development; Biomedical concepts; Automated technique; Biomedical literature; Detection method; Predictive performance; Biomedical knowledge

Advancing entity recognition in biomedicine via instruction tuning of large language models
Keloth V, Hu Y, Xie Q, Peng X, Wang Y, Zheng A, Selek M, Raja K, Wei C, Jin Q, Lu Z, Chen Q, Xu H. Advancing entity recognition in biomedicine via instruction tuning of large language models. Bioinformatics 2024, 40: btae163. PMID: 38514400, PMCID: PMC11001490, DOI: 10.1093/bioinformatics/btae163.
Peer-Reviewed Original Research
Keywords: Named Entity Recognition; Sequence labeling task; Natural language processing; Biomedical NER datasets; Language model; NER datasets; Entity recognition; Labeling task; Text generation; Field of natural language processing; Biomedical NER; Few-shot learning capability; Reasoning tasks; Multi-domain scenarios; Domain-specific models; End-to-end; Minimal fine-tuning; SOTA performance; F1 score; Healthcare applications; Biomedical entities; Biomedical domain; Language processing; Multi-tasking; PubMedBERT model

FedFSA: Hybrid and federated framework for functional status ascertainment across institutions
Fu S, Jia H, Vassilaki M, Keloth V, Dang Y, Zhou Y, Garg M, Petersen R, St Sauver J, Moon S, Wang L, Wen A, Li F, Xu H, Tao C, Fan J, Liu H, Sohn S. FedFSA: Hybrid and federated framework for functional status ascertainment across institutions. Journal Of Biomedical Informatics 2024, 152: 104623. PMID: 38458578, PMCID: PMC11005095, DOI: 10.1016/j.jbi.2024.104623.
Peer-Reviewed Original Research
Keywords: Natural language processing; Electronic health records; Status information; Information extraction; Functional status information; Rule-based information extraction; Federated learning framework; Private local data; Natural language processing framework; Healthcare sites; Patient's functional status; Multiple healthcare institutions; Federated learning; PyTorch library; Concept normalization; BERT model; Learning framework; Collaborative development efforts; Corpus annotation; Language processing; Healthcare institutions; Functional status; Predictor of health outcomes; Activities of daily living; Natural language processing performance

Improving large language models for clinical named entity recognition via prompt engineering
Hu Y, Chen Q, Du J, Peng X, Keloth V, Zuo X, Zhou Y, Li Z, Jiang X, Lu Z, Roberts K, Xu H. Improving large language models for clinical named entity recognition via prompt engineering. Journal Of The American Medical Informatics Association 2024, 31: 1812-1820. PMID: 38281112, PMCID: PMC11339492, DOI: 10.1093/jamia/ocad259.
Peer-Reviewed Original Research
Keywords: Clinical NER tasks; NER task; Task-specific prompts; Entity recognition; Language model; Training samples; State-of-the-art models; Few-shot learning; State-of-the-art; Minimal training data; Task-specific knowledge; F1-score; Annotated samples; Concept extraction; Model performance; Annotated datasets; Training data; F1 score; Task description; Format specifications; Complex clinical data; Optimal performance; Task; Evaluation schema; GPT model
Academic Achievements & Community Involvement
News
Get In Touch
Contacts
Locations
100 College Street
Academic Office
New Haven, CT 06510
Events
Tuesday, Apr 29, 2025
- Restricted, Multi-session Event: Yujia Zhou, Vincent Zhang, Lingfei Qian, Vipina K. Keloth, PhD, Al Pacelli, Nathaniel Price