Kalpana Raja, PhD, MRSB, CSci
Instructor of Biomedical Informatics and Data Science
Contact Info
Biomedical Informatics & Data Science
100 College St
New Haven, CT 06510
United States
About
Titles
Instructor of Biomedical Informatics and Data Science
Biography
Kalpana Raja, PhD joined the Section of Biomedical Informatics & Data Science (BIDS) at Yale School of Medicine in February 2023. Before moving to New Haven, CT, Kalpana worked as an assistant professor at the School of Biomedical Informatics, University of Texas Health Science Center (UTHealth) at Houston, TX. She also worked as a scientist at Sema4, a patient-centered healthcare company located in Stamford, CT. Kalpana completed her bachelor’s degree in pharmacy at Tamil Nadu Dr. M.G.R. Medical University in Chennai, India. She is a registered pharmacist with the Indian Pharmacy Council. With a vision to develop software for biological applications, she completed her master’s degree in computing, with a focus on software technology, at The Robert Gordon University, Aberdeen, UK. She developed ProfileSKiM, an intelligent document retrieval tool, and reported the findings in her MSc thesis. ProfileSKiM received an award from the Robert Gordon University in 2005 and the Technology Award from the British Computer Society, London, UK in 2006. Kalpana completed her second master’s degree in bioinformatics and her PhD in computing: software technology – bioinformatics (interdisciplinary) at Bharathiar University, Coimbatore, India. She presented findings from her PhD research at the 2012 Asia Pacific Bioinformatics Conference (APBC) held in Melbourne, Australia, and at the BioCreative Conference V held in Washington, DC.
Kalpana’s research interests include natural language processing (NLP) and machine learning. She has developed methodologies and software for information retrieval, information extraction, knowledge summarization, literature-based discovery, and automated hypothesis generation. She has applied her approaches to various biological domains such as protein-protein interaction, protein phosphorylation, drug-drug interactions, adverse drug events, drug repurposing, and disease comorbidity. She has also provided NLP support for various genomics and transcriptomics projects. Kalpana has published more than 70 articles in peer-reviewed journals, books, and conference proceedings. She has reviewed research articles submitted to journals such as Briefings in Bioinformatics and serves as an associate editor of the Journal of Embryology & Stem Cell Research.
Kalpana was elected a “Member of the Royal Society of Biology” (MRSB) in 2019 by the Royal Society of Biology, London, UK. More recently, she was honored as a “Chartered Scientist” (CSci) by the Royal Society of Biology, London, UK. Kalpana also received the “2019 Women Scientist Award” from the Society for Bioinformatics and Biological Sciences, a non-profit professional society based in India.
Areas of Expertise
- Natural Language Processing
- Artificial Intelligence (AI)
- Large Language Models (LLMs)
- Deep Learning
- Machine Learning
- Biomedical Informatics
Google scholar
Appointments
Biomedical Informatics & Data Science
Instructor (Primary)
Other Departments & Organizations
- Biomedical Informatics & Data Science
- Clinical NLP Lab
Education & Training
- PhD, Bharathiar University
- MSc, Bharathiar University
- MSc, The Robert Gordon University
- BPharm, Tamilnadu Dr M G R Medical University
Research
Overview
Medical Research Interests
ORCID
0000-0002-3156-4197
Research at a Glance
Yale Co-Authors
- Hua Xu, PhD
- Vipina K. Keloth, PhD
- Qingyu Chen, PhD
- Jeffrey Zhang
Research Interests
- Machine Learning
Publications
2024
A Study of Biomedical Relation Extraction Using GPT Models.
Zhang J, Wibert M, Zhou H, Peng X, Chen Q, Keloth V, Hu Y, Zhang R, Xu H, Raja K. A Study of Biomedical Relation Extraction Using GPT Models. AMIA Joint Summits On Translational Science Proceedings 2024, 2024: 391-400. PMID: 38827097, PMCID: PMC11141827.
Peer-Reviewed Original Research
Advancing entity recognition in biomedicine via instruction tuning of large language models
Keloth V, Hu Y, Xie Q, Peng X, Wang Y, Zheng A, Selek M, Raja K, Wei C, Jin Q, Lu Z, Chen Q, Xu H. Advancing entity recognition in biomedicine via instruction tuning of large language models. Bioinformatics 2024, 40: btae163. PMID: 38514400, PMCID: PMC11001490, DOI: 10.1093/bioinformatics/btae163.
Peer-Reviewed Original Research
Concepts: Named Entity Recognition; Sequence labeling task; Natural language processing; Biomedical NER datasets; Language model; NER datasets; Entity recognition; Labeling task; Text generation; Field of natural language processing; Biomedical NER; Few-shot learning capability; Reasoning tasks; Multi-domain scenarios; Domain-specific models; End-to-end; Minimal fine-tuning; SOTA performance; F1 score; Healthcare applications; Biomedical entities; Biomedical domain; Language processing; Multi-tasking; PubMedBERT model
2023
Serial KinderMiner (SKiM) discovers and annotates biomedical knowledge using co-occurrence and transformer models
Millikin R, Raja K, Steill J, Lock C, Tu X, Ross I, Tsoi L, Kuusisto F, Ni Z, Livny M, Bockelman B, Thomson J, Stewart R. Serial KinderMiner (SKiM) discovers and annotates biomedical knowledge using co-occurrence and transformer models. BMC Bioinformatics 2023, 24: 412. PMID: 37915001, PMCID: PMC10619245, DOI: 10.1186/s12859-023-05539-y.
Peer-Reviewed Original Research
Concepts: Literature-based discovery; Knowledge graph; Web interface; Literature-based discovery tools; User-defined concepts; Functional web-interface; Open-source tool; Biomedical domain; Machine-learning models; Open-source web interface; Biomedical concepts; Type labels; Knowledge domains; Literature domain; Transformation model; Algorithm; Biomedical knowledge; Graph; Querying thousands; PubMed archive
Towards precise PICO extraction from abstracts of randomized controlled trials using a section-specific learning approach
Hu Y, Keloth V, Raja K, Chen Y, Xu H. Towards precise PICO extraction from abstracts of randomized controlled trials using a section-specific learning approach. Bioinformatics 2023, 39: btad542. PMID: 37669123, PMCID: PMC10500081, DOI: 10.1093/bioinformatics/btad542.
Peer-Reviewed Original Research
Concepts: Natural language processing; Micro-F1 score; COVID-19 dataset; NLP pipeline; F1 score; Entity recognition model; AD dataset; PICO elements; Sentence classification; NER model; Recognition model; Language processing; Learning approach; Learning model; End evaluation; Supplementary data; Dataset; Pipeline; Extraction; Information; RCT abstracts; Annotation; Sentences; Bioinformatics; Complexity
2022
A haplotype-resolved genome assembly of the Nile rat facilitates exploration of the genetic basis of diabetes
Toh H, Yang C, Formenti G, Raja K, Yan L, Tracey A, Chow W, Howe K, Bergeron L, Zhang G, Haase B, Mountcastle J, Fedrigo O, Fogg J, Kirilenko B, Munegowda C, Hiller M, Jain A, Kihara D, Rhie A, Phillippy A, Swanson S, Jiang P, Clegg D, Jarvis E, Thomson J, Stewart R, Chaisson M, Bukhman Y. A haplotype-resolved genome assembly of the Nile rat facilitates exploration of the genetic basis of diabetes. BMC Biology 2022, 20: 245. PMID: 36344967, PMCID: PMC9641963, DOI: 10.1186/s12915-022-01427-8.
Peer-Reviewed Original Research
Concepts: Vertebrate Genomes Project; Genome assembly; Chromosome-level reference genome assembly; Levels of genomic resolution; Genes associated with type 2 diabetes; Reference genome assembly; Nile rat; Contig N50; Scaffold N50; Diet-induced diabetes; Genomic resolution; Duplicated Genes; Segmental duplications; Genomic features; Genome Project; Model organisms; Robust diurnal rhythm; Parental haplotypes; Genetic basis; House mice; Mus musculus; Cone-rich retina; Genes; N50; Genetic modification
Integrated Approaches to Identify miRNA Biomarkers Associated with Cognitive Dysfunction in Multiple Sclerosis Using Text Mining, Gene Expression, Pathways, and GWAS
Prabahar A, Raja K. Integrated Approaches to Identify miRNA Biomarkers Associated with Cognitive Dysfunction in Multiple Sclerosis Using Text Mining, Gene Expression, Pathways, and GWAS. Diagnostics 2022, 12: 1914. PMID: 36010264, PMCID: PMC9406323, DOI: 10.3390/diagnostics12081914.
Peer-Reviewed Original Research
Concepts: Genome-wide association studies; Genome-wide association study signals; Genome-wide association study catalog; ECM-receptor signaling pathway; Pathway analysis; Signaling pathway; Susceptibility genetic loci; Susceptibility genetic variants; Receptor Signaling Pathway; Association signals; Genetic loci; Genomic etiology; Association studies; Gene regulation; Transcriptomic studies; Genetic variants; PI3K/Akt Signaling Pathway; Gene expression; Comprehensive repository; Genetic risk; Pathway; Hsa-miR-148b-3p; Genes; Experimental associations; Association networks
Biomedical Literature Mining and Its Components
Raja K. Biomedical Literature Mining and Its Components. Methods In Molecular Biology 2022, 2496: 1-16. PMID: 35713856, DOI: 10.1007/978-1-0716-2305-3_1.
Peer-Reviewed Original Research
Concepts: Mining protocol; Information retrieval approach; Biomedical literature; Published biomedical articles; User queries; Mining tasks; Information retrieval; Information extraction; Knowledge discovery; Biomedical text; Biomedical entities; Biomedical articles; Retrieval approach; Automatic extraction; Retrieving information; Mining approach; Relevant documents; PubMed titles; Manual extraction; Mining; Source of knowledge; Drug mentions; Information; Exponential rate; Population information
A Text Mining Protocol for Extracting Drug–Drug Interaction and Adverse Drug Reactions Specific to Patient Population, Pharmacokinetics, Pharmacodynamics, and Disease
Shukkoor M, Baharuldin M, Raja K. A Text Mining Protocol for Extracting Drug–Drug Interaction and Adverse Drug Reactions Specific to Patient Population, Pharmacokinetics, Pharmacodynamics, and Disease. Methods In Molecular Biology 2022, 2496: 259-282. PMID: 35713869, DOI: 10.1007/978-1-0716-2305-3_14.
Peer-Reviewed Original Research
2021
Basics of Fungal Siderophores: Classification, Iron Transport and Storage, Chemistry and Biosynthesis, Application, and More
Arputhanantham S, Raja K, Shanmugam L, Raman V. Basics of Fungal Siderophores: Classification, Iron Transport and Storage, Chemistry and Biosynthesis, Application, and More. Fungal Biology 2021, 1-12. DOI: 10.1007/978-3-030-53077-8_1.
Peer-Reviewed Original Research
Concepts: Fungal siderophores; Essential proteins; Secrete siderophores; Cellular homeostasis; Cellular functions; Siderophore; Published biomedical literature; Optimal growth; Iron uptake; Iron transport; Biosynthesis; Fungi; Bacteria; Microorganisms; Accumulation of iron; Biomedical literature; Iron-chelating agent; Protein; Intrinsic mechanism; Homeostasis; Host
2020
Caenorhabditis elegans as a possible model to screen anti-Alzheimer's therapeutics
Paul D, Chipurupalli S, Justin A, Raja K, Mohankumar S. Caenorhabditis elegans as a possible model to screen anti-Alzheimer's therapeutics. Journal Of Pharmacological And Toxicological Methods 2020, 106: 106932. PMID: 33091537, DOI: 10.1016/j.vascn.2020.106932.
Peer-Reviewed Original Research
Concepts: C. elegans; Caenorhabditis elegans; AD therapeutics; Alzheimer's disease; Development of AD therapeutics; Pathology of AD; Combat AD; AD pathogenesis; Endoplasmic reticulum; Mitochondria dysfunction; Human AD; Molecular mechanisms; Molecular cascades; Caenorhabditis; Anti-Alzheimer; In vivo; In vitro; Therapeutics; Pathological processes; Mitochondria; Pharmacological approaches; Genes; Reticulum; Concurrent advances; Experimental pharmacological approaches
Academic Achievements & Community Involvement
Honors
- Full Member (12/04/2023, National Award): Sigma Xi, NC, USA
- Outstanding Reviewer (11/30/2023, International Award): Multidisciplinary Digital Publishing Institute (MDPI), Basel, Switzerland
- Chartered Scientist (CSci) (04/01/2022, International Award): The Royal Society, London, UK
- 2019 Women Scientist Award (12/18/2020, International Award): The Society for Bioinformatics and Biological Sciences, India
- Chair Person (12/14/2020, International Award): International Conference on Agriculture and Biological Sciences at Kathmandu (Lalitpur), Nepal
News
- September 16, 2025 (Source: NIH): Yale Team Recognized in NIH $1 Million Data Sharing Challenge
- September 27, 2024: Biomedical Informatics and Data Science (BIDS) Secures a $7.88 Million NIH Grant to Advance Mental Health Research Using AI Technology
- June 17, 2024: Hot off the Press: Natural Language Processing in Biomedicine