Vipina K. Keloth, PhD
Associate Research Scientist in Biomedical Informatics and Data Science
About
Titles
Associate Research Scientist in Biomedical Informatics and Data Science
Biography
Dr. Vipina Keloth is an Associate Research Scientist in the Department of Biomedical Informatics and Data Science at Yale School of Medicine. Previously, she was a Postdoctoral Associate at Yale BIDS and, before that, a Postdoctoral Research Fellow at the School of Biomedical Informatics at the University of Texas Health Science Center at Houston. She earned her doctoral degree in Computer Science from New Jersey Institute of Technology (NJIT) in 2021. She has also worked as an assistant lecturer in the Department of Mathematical and Computational Sciences at the National Institute of Technology Karnataka, India. Her research interests lie broadly in biomedical ontologies and terminologies and in clinical and biomedical natural language processing.
Appointments
Biomedical Informatics & Data Science
Associate Research Scientist (Primary)
Other Departments & Organizations
- Biomedical Informatics & Data Science
- Clinical NLP Lab
Education & Training
- Postdoctoral Associate, Yale University (2024)
- Postdoctoral Research Fellow, University of Texas Health Science Center at Houston (2023)
- PhD, New Jersey Institute of Technology, Computer Science (2021)
- MS, National Institute of Technology Karnataka, Systems Analysis and Computer Applications (2014)
- MSc, Mahatma Gandhi University, Computer Applications (2010)
- BS, Kannur University, Physics (2007)
Research
ORCID: 0000-0001-6919-1122
Yale Co-Authors
- Hua Xu, PhD
- Qingyu Chen, PhD
- Kalpana Raja, PhD, MRSB, CSci
- Aline Pedroso, PhD
- Cynthia Brandt, MD, MPH
- Hamita Sachar, MD
Research Interests
- Biological Ontologies
- Natural Language Processing
- Social Determinants of Health
Publications
2025
Scientific Writing in the Era of Large Language Models: A Computational Analysis of AI- Versus Human-Created Content
Khera R, Pedroso A, Keloth V, Xu H, Silva G, Schwamm L. Scientific Writing in the Era of Large Language Models: A Computational Analysis of AI- Versus Human-Created Content. Stroke 2025, 56: 3078-3083. PMID: 40814778, DOI: 10.1161/strokeaha.125.051913.
Peer-Reviewed Original Research
Concepts: Language model; Artificial intelligence; AI-generated; Linguistic features; Detection tools; AI-generated content; Human-written text; Language perplexity; Human experts; Performance of experts; Linguistic differences; Scientific texts; Grade level; Word count; Essay; Language; Scientific communication; Scientific writing; Computer synthesis; Higher grade levels; Text; Scientific content; Readability scores; Perplexity; Flesch-Kincaid
Improving Large Language Models’ Summarization Accuracy by Adding Highlights to Discharge Notes: Comparative Evaluation
Dehkordi M, Perl Y, Deek F, He Z, Keloth V, Liu H, Elhanan G, Einstein A. Improving Large Language Models’ Summarization Accuracy by Adding Highlights to Discharge Notes: Comparative Evaluation. JMIR Medical Informatics 2025, 13: e66476. PMID: 40705416, PMCID: PMC12332456, DOI: 10.2196/66476.
Peer-Reviewed Original Research
Concepts: Electronic health records; Summarization accuracy; Contents of Electronic Health Records; Electronic health record notes; Text summarization; Language model; Evaluation metrics; Machine learning; Interface terminology; MIMIC-III database; Discharge notes; Simplification steps; Health records; Erroneous information; Comparative evaluation; Accuracy; Information; Summarization; Header; Language; American Medical Association
Ontology enrichment using a large language model: Applying lexical, semantic, and knowledge network-based similarity for concept placement
Kollapally N, Geller J, Keloth V, He Z, Xu J. Ontology enrichment using a large language model: Applying lexical, semantic, and knowledge network-based similarity for concept placement. Journal Of Biomedical Informatics 2025, 168: 104865. PMID: 40543734, PMCID: PMC12371725, DOI: 10.1016/j.jbi.2025.104865.
Peer-Reviewed Original Research
Concepts: Semantic triples; Seed ontology; Human experts; Logical axioms; PubMed abstracts; Similarity search techniques; State-of-the-art; Real-world concepts; Network-based filter; Language model; Semantic correctness; Text corpus; Network-based search; Source of text; Domain view; Search technique; Network-based similarity; SemMedDB; Ontology tools; Ontology; Identified concepts; Source of concepts; Pipeline; Domains of social determinants of health; Axioms
Developing and sustaining inclusive language in biomedical informatics communications: an AMIA Board of Directors endorsed paper on the Inclusive Language and Context Style Guidelines
Bear Don't Walk O, Haldar S, Wei D, Huang H, Rivera R, Fan J, Keloth V, Leung T, Desai P, Korngiebel D, Grossman Liu L, Pichon A, Subbian V, Solomonides A, Wiley L, Ogunyemi O, Jackson G, Dankwa-Mullan I, Dirks L, Everhart A, Parker A, Iott B, Kronk C, Foraker R, Martin K, Anand T, Volpe S, Yung N, Rizvi R, Lucero R, Bright T. Developing and sustaining inclusive language in biomedical informatics communications: an AMIA Board of Directors endorsed paper on the Inclusive Language and Context Style Guidelines. Journal Of The American Medical Informatics Association 2025, 32: 1380-1387. PMID: 40523007, PMCID: PMC12277697, DOI: 10.1093/jamia/ocaf096.
Peer-Reviewed Original Research
Social determinants of health extraction from clinical notes across institutions using large language models
Keloth V, Selek S, Chen Q, Gilman C, Fu S, Dang Y, Chen X, Hu X, Zhou Y, He H, Fan J, Wang K, Brandt C, Tao C, Liu H, Xu H. Social determinants of health extraction from clinical notes across institutions using large language models. Npj Digital Medicine 2025, 8: 287. PMID: 40379919, PMCID: PMC12084648, DOI: 10.1038/s41746-025-01645-8.
Peer-Reviewed Original Research
Benchmarking large language models for biomedical natural language processing applications and recommendations
Chen Q, Hu Y, Peng X, Xie Q, Jin Q, Gilson A, Singer M, Ai X, Lai P, Wang Z, Keloth V, Raja K, Huang J, He H, Lin F, Du J, Zhang R, Zheng W, Adelman R, Lu Z, Xu H. Benchmarking large language models for biomedical natural language processing applications and recommendations. Nature Communications 2025, 16: 3280. PMID: 40188094, PMCID: PMC11972378, DOI: 10.1038/s41467-025-56989-2.
Peer-Reviewed Original Research
Concepts: Language model; Natural language processing applications; Biomedical natural language processing; Medical question answering; Language processing applications; Natural language processing; Growth of biomedical literature; Missing information; Few-shot; Question Answering; Zero-Shot; Knowledge curation; Language processing; Processing applications; BioNLP; BART model; Performance gap; Biomedical literature; General domain; Task; Benchmarks; BERT; Information; Performance; LLM
The Development Landscape of Large Language Models for Biomedical Applications
Cao Z, Keloth V, Xie Q, Qian L, Liu Y, Wang Y, Shi R, Zhou W, Yang G, Zhang J, Peng X, Zhen E, Weng R, Chen Q, Xu H. The Development Landscape of Large Language Models for Biomedical Applications. Annual Review Of Biomedical Data Science 2025, 8: 251-274. PMID: 40169010, PMCID: PMC12372014, DOI: 10.1146/annurev-biodatasci-102224-074736.
Peer-Reviewed Original Research
Concepts: Language model; Task-specific fine-tuning; Privacy concerns; Improve data sharing; Computational resources; Specialized medical applications; Biomedical data; Data sharing; Fine-tuning; Biomedical literature; Transform healthcare; Model access; Development process; Medical applications; Multimodal integration; ChatGPT; Privacy; Applications; Model characteristics; Training; Architecture; LLM; Medical research
Medical foundation large language models for comprehensive text analysis and beyond
Xie Q, Chen Q, Chen A, Peng C, Hu Y, Lin F, Peng X, Huang J, Zhang J, Keloth V, Zhou X, Qian L, He H, Shung D, Ohno-Machado L, Wu Y, Xu H, Bian J. Medical foundation large language models for comprehensive text analysis and beyond. Npj Digital Medicine 2025, 8: 141. PMID: 40044845, PMCID: PMC11882967, DOI: 10.1038/s41746-025-01533-1.
Peer-Reviewed Original Research
Concepts: Text analysis tasks; Analysis tasks; Language model; Domain-specific knowledge; Zero-Shot; Human evaluation; Supervised setting; Task-specific instructions; Clinical data sources; Specialized medical knowledge; ChatGPT; Text analysis; Pretraining; Task; Data sources; Medical applications; Medical knowledge; Enhanced performance; Text; Performance
2024
Using clinical entity recognition for curating an interface terminology to aid fast skimming of EHRs
Kollapally N, Dehkordi M, Perl Y, Geller J, Deek F, Liu H, Keloth V, Elhanan G, Einstein A, Zhou S. Using clinical entity recognition for curating an interface terminology to aid fast skimming of EHRs. 2024, 00: 6427-6434. DOI: 10.1109/bibm62325.2024.10822845.
Peer-Reviewed Original Research
Concepts: Electronic health records; Entity recognition; Volume of electronic health records; EHR interoperability; Clinical entity recognition; Clinical terminology; Neural network model; Clinical NER; Transfer learning; SNOMED CT concepts; Interface terminology; Network model; SNOMED CT; Health records; Higher granularity; CT concepts; Overworked physicians; Healthcare providers; Dense volume; Recognition; Interoperability; Curation; Granularity; Cardiology patients; NER
Detection of Gastrointestinal Bleeding With Large Language Models to Aid Quality Improvement and Appropriate Reimbursement
Zheng N, Keloth V, You K, Kats D, Li D, Deshpande O, Sachar H, Xu H, Laine L, Shung D. Detection of Gastrointestinal Bleeding With Large Language Models to Aid Quality Improvement and Appropriate Reimbursement. Gastroenterology 2024, 168: 111-120.e4. PMID: 39304088, DOI: 10.1053/j.gastro.2024.09.014.
Peer-Reviewed Original Research
Concepts: Electronic health records; Overt gastrointestinal bleeding; Gastrointestinal bleeding; Recurrent bleeding; Machine learning models; Health records; Clinically relevant applications; Nursing notes; Language model; Acute gastrointestinal bleeding; Quality improvement; Learning models; Detection of gastrointestinal bleeding; Reimbursement; Identification of clinical conditions; Separate hospitals; Quality measures; Hospital; Bleeding; Clinical conditions; Patient management; Early identification; Patients; Reimbursement codes; Coding algorithm
Academic Achievements & Community Involvement
Activities
- JAMIA Open: Reviewer (Journal Service), 04/03/2023 - Present
- PLOS ONE: Reviewer (Journal Service), 03/14/2023 - Present
- American Medical Informatics Association (AMIA): Member (Professional Organizations), 05/15/2018 - Present
- Journal of Biomedical Informatics: Reviewer (Journal Service), 06/30/2021 - Present
- BMC Supplements: Reviewer (Journal Service), 06/01/2021 - Present
Get In Touch
Locations
Academic Office
100 College Street
New Haven, CT 06510