2025
RouterRetriever: Routing over a Mixture of Expert Embedding Models
Lee H, Soldaini L, Cohan A, Seo M, Lo K. RouterRetriever: Routing over a Mixture of Expert Embedding Models. Proceedings Of The AAAI Conference On Artificial Intelligence 2025, 39: 11995-12003. DOI: 10.1609/aaai.v39i11.33306.
Peer-Reviewed Original Research
Concepts: Embedding model; Routing mechanism; General domain datasets; Multi-task training; Domain-specific data; Information retrieval methods; Multi-task model; Domain-specific experts; Expert retrieval; Information retrieval; Language model; Routing techniques; Retrieval model; Underperforming models; Retrieval method; Retrieval; Specialized domains; Dataset; Generation research; Experts; Query; Information; Language; Training; Embedding
mFollowIR: A Multilingual Benchmark for Instruction Following in Retrieval
Weller O, Chang B, Yang E, Yarmohammadi M, Barham S, MacAvaney S, Cohan A, Soldaini L, Van Durme B, Lawrie D. mFollowIR: A Multilingual Benchmark for Instruction Following in Retrieval. Lecture Notes In Computer Science 2025, 15573: 295-310. DOI: 10.1007/978-3-031-88711-6_19.
Peer-Reviewed Original Research
Automated transformation of unstructured cardiovascular diagnostic reports into structured datasets using sequentially deployed large language models
Vasisht Shankar S, Dhingra L, Aminorroaya A, Adejumo P, Nadkarni G, Xu H, Brandt C, Oikonomou E, Pedroso A, Khera R. Automated transformation of unstructured cardiovascular diagnostic reports into structured datasets using sequentially deployed large language models. European Heart Journal - Digital Health 2025, ztaf030. DOI: 10.1093/ehjdh/ztaf030.
Peer-Reviewed Original Research
Igniting Language Intelligence: The Hitchhiker's Guide from Chain-of-Thought Reasoning to Language Agents
Zhang Z, Yao Y, Zhang A, Tang X, Ma X, He Z, Wang Y, Gerstein M, Wang R, Liu G, Zhao H. Igniting Language Intelligence: The Hitchhiker's Guide from Chain-of-Thought Reasoning to Language Agents. ACM Computing Surveys 2025, 57: 1-39. DOI: 10.1145/3719341.
Peer-Reviewed Original Research
Concepts: Language agents; Complex reasoning tasks; Reasoning capabilities; Reasoning methodology; Language model; COTS techniques; Reasoning approach; Reasoning tasks; Theoretical proof; Language instruction; Linguistic context; Empirical performance; Language intelligence; Reasoning performance; Hitchhiker’s Guide; Language; Survey article; Enhance interpretation; Advanced cognitive abilities; COTS; Research dimensions; Executive action; Prospective research avenues; Cognitive abilities; COTS approach
Medical foundation large language models for comprehensive text analysis and beyond
Xie Q, Chen Q, Chen A, Peng C, Hu Y, Lin F, Peng X, Huang J, Zhang J, Keloth V, Zhou X, Qian L, He H, Shung D, Ohno-Machado L, Wu Y, Xu H, Bian J. Medical foundation large language models for comprehensive text analysis and beyond. Npj Digital Medicine 2025, 8: 141. PMID: 40044845, PMCID: PMC11882967, DOI: 10.1038/s41746-025-01533-1.
Peer-Reviewed Original Research
Concepts: Text analysis tasks; Analysis tasks; Language model; Domain-specific knowledge; Zero-Shot; Human evaluation; Supervised setting; Task-specific instructions; Clinical data sources; Specialized medical knowledge; ChatGPT; Text analysis; Pretraining; Task; Data sources; Medical applications; Medical knowledge; Enhanced performance; Text; Performance
Assessing and alleviating state anxiety in large language models
Ben-Zion Z, Witte K, Jagadish A, Duek O, Harpaz-Rotem I, Khorsandian M, Burrer A, Seifritz E, Homan P, Schulz E, Spiller T. Assessing and alleviating state anxiety in large language models. Npj Digital Medicine 2025, 8: 132. PMID: 40033130, PMCID: PMC11876565, DOI: 10.1038/s41746-025-01512-6.
Peer-Reviewed Original Research
Editorial Comment: Large Language Models Have Potential to Improve Follow-Up But May Have Unforeseen Impact on Radiology Reporting.
Mezrich J. Editorial Comment: Large Language Models Have Potential to Improve Follow-Up But May Have Unforeseen Impact on Radiology Reporting. American Journal Of Roentgenology 2025. PMID: 39936860, DOI: 10.2214/ajr.25.32734.
Peer-Reviewed Original Research
Concepts: Language model; Language
Improving entity recognition using ensembles of deep learning and fine-tuned large language models: A case study on adverse event extraction from VAERS and social media
Li Y, Viswaroopan D, He W, Li J, Zuo X, Xu H, Tao C. Improving entity recognition using ensembles of deep learning and fine-tuned large language models: A case study on adverse event extraction from VAERS and social media. Journal Of Biomedical Informatics 2025, 163: 104789. PMID: 39923968, DOI: 10.1016/j.jbi.2025.104789.
Peer-Reviewed Original Research
Concepts: Traditional deep learning models; Deep learning models; Recurrent neural network; Learning models; Entity recognition; Language model; F1 score; Ensemble of deep learning; Advances of natural language processing; Effectiveness of ensemble methods; Micro-averaged F1; Bidirectional Encoder Representations; Extensive labeled data; Natural language processing; Fine-tuned models; Biomedical text mining; Feature representation; Encoder Representations; Event extraction; Entity types; Text data; Deep learning; Sequential data; GPT-2; Neural network
Collaborative large language models for automated data extraction in living systematic reviews
Khan M, Ayub U, Naqvi S, Khakwani K, Sipra Z, Raina A, Zhou S, He H, Saeidi A, Hasan B, Rumble R, Bitterman D, Warner J, Zou J, Tevaarwerk A, Leventakos K, Kehl K, Palmer J, Murad M, Baral C, Riaz I. Collaborative large language models for automated data extraction in living systematic reviews. Journal Of The American Medical Informatics Association 2025, ocae325. PMID: 39836495, DOI: 10.1093/jamia/ocae325.
Peer-Reviewed Original Research
Using natural language processing to identify emergency department patients with incidental lung nodules requiring follow‐up
Moore C, Socrates V, Hesami M, Denkewicz R, Cavallo J, Venkatesh A, Taylor R. Using natural language processing to identify emergency department patients with incidental lung nodules requiring follow‐up. Academic Emergency Medicine 2025, 32: 274-283. PMID: 39821298, DOI: 10.1111/acem.15080.
Peer-Reviewed Original Research
Concepts: Natural language processing; Incidental lung nodules; Follow-up; Chest CTs; CT reports; F1 score; Lung nodules; Emergency department; Language processing; Follow-up of incidental findings; Incidental finding; Natural language processing developers; Absence of malignancy; Metrics of precision; Natural language processing pipeline; Natural language processing metrics; Chest CT reports; Recommended follow-up; Emergency department patients; Follow-up rate; Language model; Lung cancer; Reduce errors; Malignancy; Department patients
BiomedRAG: A retrieval augmented large language model for biomedicine
Li M, Kilicoglu H, Xu H, Zhang R. BiomedRAG: A retrieval augmented large language model for biomedicine. Journal Of Biomedical Informatics 2025, 162: 104769. PMID: 39814274, PMCID: PMC11837810, DOI: 10.1016/j.jbi.2024.104769.
Peer-Reviewed Original Research
2024
A predictive language model for SARS-CoV-2 evolution
Ma E, Guo X, Hu M, Wang P, Wang X, Wei C, Cheng G. A predictive language model for SARS-CoV-2 evolution. Signal Transduction And Targeted Therapy 2024, 9: 353. PMID: 39710752, PMCID: PMC11663983, DOI: 10.1038/s41392-024-02066-x.
Peer-Reviewed Original Research
Concepts: SARS-CoV-2 evolution; Hot mutation spots; Frequency of mutations; Sequence data; S1 sequences; Vital mutations; Predicted mutations; SARS-CoV-2 variants; Mutations; Viral evolution; SARS-CoV-2; Viral pathogens; Mutation profiles; Variants; Immune evasion; Sequence; Mutation spots; Predictive language model; Viral mutations; Language model; Strain; Viral infection; Semantic representation
ChatGPT and frequently asked patient questions for upper eyelid blepharoplasty surgery
Watane A, Perzia B, Weiss M, Tooley A, Li E, Habib L, Tenzel P, Maeng M. ChatGPT and frequently asked patient questions for upper eyelid blepharoplasty surgery. Orbit 2024, ahead-of-print: 1-4. PMID: 39671176, DOI: 10.1080/01676830.2024.2435930.
Peer-Reviewed Original Research
Concepts: Oculofacial plastic surgeons; American Society of Ophthalmic Plastic and Reconstructive Surgery; Blepharoplasty procedure; Mean Likert scale score; Ophthalmic Plastic and Reconstructive Surgery; Upper eyelid blepharoplasty surgery; Cross-sectional survey; Online health information seekers; Health information seekers; Likert scale scores; Non-inferior accuracy; Plastic and Reconstructive Surgery; Patient education; Patient questions; Clinical reasoning; Blepharoplasty surgery; Reconstructive surgery; Likert scale; Plastic surgeons; Scale score; Patients; SD; Language model; ChatGPT; Surgery
Identifying Deprescribing Opportunities with Large Language Models in Older Adults: Retrospective Cohort Study (Preprint)
Socrates V, Wright D, Huang T, Fereydooni S, Dien C, Chi L, Albano J, Patterson B, Kanaparthy N, Wright C, Loza A, Chartash D, Iscoe M, Taylor R. Identifying Deprescribing Opportunities with Large Language Models in Older Adults: Retrospective Cohort Study (Preprint). JMIR Aging 2024. DOI: 10.2196/69504.
Peer-Reviewed Original Research
Concepts: Language model; Older adults
Large language models surpass human experts in predicting neuroscience results
Luo X, Rechardt A, Sun G, Nejad K, Yáñez F, Yilmaz B, Lee K, Cohen A, Borghesani V, Pashkov A, Marinazzo D, Nicholas J, Salatiello A, Sucholutsky I, Minervini P, Razavi S, Rocca R, Yusifov E, Okalova T, Gu N, Ferianc M, Khona M, Patil K, Lee P, Mata R, Myers N, Bizley J, Musslick S, Bilgin I, Niso G, Ales J, Gaebler M, Ratan Murty N, Loued-Khenissi L, Behler A, Hall C, Dafflon J, Bao S, Love B. Large language models surpass human experts in predicting neuroscience results. Nature Human Behaviour 2024, 9: 305-315. PMID: 39604572, PMCID: PMC11860209, DOI: 10.1038/s41562-024-02046-9.
Peer-Reviewed Original Research
Concepts: Human experts; Language model; Human information processing capacity; Neuroscience results; Knowledge-intensive endeavour; Information processing capacity; Processing capacity; Neuroscience literature; Experimental outcomes; Experts; High confidence; Scientific discovery; Decades of research; Language; Neuroscience; LLM; Task
Optimizing Ingredient Substitution Using Large Language Models to Enhance Phytochemical Content in Recipes
Rita L, Southern J, Laponogov I, Higgins K, Veselkov K. Optimizing Ingredient Substitution Using Large Language Models to Enhance Phytochemical Content in Recipes. Machine Learning And Knowledge Extraction 2024, 6: 2738-2752. DOI: 10.3390/make6040131.
Peer-Reviewed Original Research
Artificial Intelligence in Diagnosing and Managing Vascular Surgery Patients: An Experimental Study Using the GPT-4 Model
Alexiou V, Sumpio B, Vassiliou A, Kakkos S, Geroulakos G. Artificial Intelligence in Diagnosing and Managing Vascular Surgery Patients: An Experimental Study Using the GPT-4 Model. Annals Of Vascular Surgery 2024, 111: 260-267. PMID: 39586530, DOI: 10.1016/j.avsg.2024.11.014.
Peer-Reviewed Original Research
Concepts: Natural language processing; AI models; Artificial intelligence; Machine learning algorithms; Language model; Learning algorithms; Vascular surgery patients; Relevant answers; Language processing; AI chatbots; Introduction of artificial intelligence; Standalone solution; Medical classification systems; Test scenarios; Surgery patients; Medical information; Clinical scenarios; Complex problems; Intelligence; Scientific fields; Complex clinical scenarios; Scenarios; Statistically significant difference; Clinically relevant answers; Performance variation
Accuracy of Spanish and English-generated ChatGPT responses to commonly asked patient questions about labor epidurals: a survey-based study among bilingual obstetric anesthesia experts
Gonzalez Fiol A, Mootz A, He Z, Delgado C, Ortiz V, Reale S. Accuracy of Spanish and English-generated ChatGPT responses to commonly asked patient questions about labor epidurals: a survey-based study among bilingual obstetric anesthesia experts. International Journal Of Obstetric Anesthesia 2024, 61: 104290. PMID: 39579604, DOI: 10.1016/j.ijoa.2024.104290.
Peer-Reviewed Original Research
Concepts: Labor epidurals; Non-English-speaking patients; Mode of delivery; Health inequalities; Labor course; Cesarean delivery; Obstetric anesthesiologist; Patient questions; Survey-based study; Median score; Medical advice; Anesthesia experts; Epidurals; Likert scale; English; Language model; English answers; Perpetuate misinformation; Patients; Language; Spanish; Scores; Accuracy scores; Delivery; Questions
The “David Vs Goliath” Study: Application of Large Language Models (LLM) for Automatic Medical Information Retrieval from Multiple Data Sources to Accelerate Clinical and Translational Research in Hematology
Delleani M, D'Amico S, Sauta E, Asti G, Zazzetti E, Campagna A, Lanino L, Maggioni G, Grondelli M, Forcina Barrero A, Morandini P, Ubezio M, Todisco G, Russo A, Tentori C, Buizza A, Bonometti A, Lancellotti C, Di Tommaso L, Rahal D, Bicchieri M, Savevski V, Santoro A, Santini V, Sole F, Platzbecker U, Fenaux P, Diez-Campelo M, Komrokji R, Garcia-Manero G, Haferlach T, Kordasti S, Zeidan A, Castellani G, Della Porta M. The “David Vs Goliath” Study: Application of Large Language Models (LLM) for Automatic Medical Information Retrieval from Multiple Data Sources to Accelerate Clinical and Translational Research in Hematology. Blood 2024, 144: 3597-3597. DOI: 10.1182/blood-2024-205621.
Peer-Reviewed Original Research
Concepts: Generative Pretrained Transformer; Information retrieval; Healthcare data; Language model; Artificial intelligence; Natural language processing tasks; Semi-supervised training process; Medical information retrieval; Automatic information retrieval; Original dataset; Language processing tasks; Validation framework; Data collection tasks; Retrieval information systems; Reducing human effort; Potential of artificial intelligence; Learning statistical relationships; Statistical fidelity; Self-supervision; Pretrained Transformer; Standard datasets; Language generation; Privacy limitations; Data model format; Collection tasks
Generation of Multimodal Longitudinal Synthetic Data By Artificial Intelligence to Improve Personalized Medicine in Hematology
D'Amico S, Delleani M, Sauta E, Asti G, Zazzetti E, Campagna A, Lanino L, Maggioni G, Ubezio M, Todisco G, Russo A, Tentori C, Buizza A, Bicchieri M, Zampini M, Brindisi M, Ficara F, Riva E, Ventura D, Crisafulli L, Pinocchio N, Jacobs F, Zambelli A, Savevski V, Santoro A, Sanavia T, Rollo C, Sartori F, Fariselli P, Sanz G, Santini V, Sole F, Platzbecker U, Fenaux P, Diez-Campelo M, Kordasti S, Komrokji R, Garcia-Manero G, Haferlach T, Zeidan A, Castellani G, Della Porta M. Generation of Multimodal Longitudinal Synthetic Data By Artificial Intelligence to Improve Personalized Medicine in Hematology. Blood 2024, 144: 4981-4981. DOI: 10.1182/blood-2024-209541.
Peer-Reviewed Original Research
Concepts: Deep learning-based framework; Learning-based framework; Privacy preservation; Synthetic data; Synthetic patients; Performance of classification; XGBoost classification model; Disease classification; Privacy-compliant; Conditional GAN; Privacy protection; Privacy risks; Language model; Generation pipeline; Multimodal features; Feature distribution; Generative AI; Multimodal data; Model training; Mosaic framework; Classification model; Data integration; Artificial intelligence; Privacy; Clipping module