2025
RouterRetriever: Routing over a Mixture of Expert Embedding Models
Lee H, Soldaini L, Cohan A, Seo M, Lo K. RouterRetriever: Routing over a Mixture of Expert Embedding Models. Proceedings Of The AAAI Conference On Artificial Intelligence 2025, 39: 11995-12003. DOI: 10.1609/aaai.v39i11.33306.
Peer-Reviewed Original Research
Embedding model, Routing mechanism, General domain datasets, Multi-task training, Domain-specific data, Information retrieval methods, Multi-task model, Domain-specific experts, Expert retrieval, Information retrieval, Language model, Routing techniques, Retrieval model, Underperforming models, Retrieval method, Retrieval, Specialized domains, Dataset, Generation research, Experts, Query, Information, Language, Training, Embedding
mFollowIR: A Multilingual Benchmark for Instruction Following in Retrieval
Weller O, Chang B, Yang E, Yarmohammadi M, Barham S, MacAvaney S, Cohan A, Soldaini L, Van Durme B, Lawrie D. mFollowIR: A Multilingual Benchmark for Instruction Following in Retrieval. Lecture Notes In Computer Science 2025, 15573: 295-310. DOI: 10.1007/978-3-031-88711-6_19.
Peer-Reviewed Original Research
Igniting Language Intelligence: The Hitchhiker's Guide from Chain-of-Thought Reasoning to Language Agents
Zhang Z, Yao Y, Zhang A, Tang X, Ma X, He Z, Wang Y, Gerstein M, Wang R, Liu G, Zhao H. Igniting Language Intelligence: The Hitchhiker's Guide from Chain-of-Thought Reasoning to Language Agents. ACM Computing Surveys 2025, 57: 1-39. DOI: 10.1145/3719341.
Peer-Reviewed Original Research
Language agents, Complex reasoning tasks, Reasoning capabilities, Reasoning methodology, Language model, COTS techniques, Reasoning approach, Reasoning tasks, Theoretical proof, Language instruction, Linguistic context, Empirical performance, Language intelligence, Reasoning performance, Hitchhiker’s Guide, Language, Survey article, Enhance interpretation, Advanced cognitive abilities, COTS, Research dimensions, Executive action, Prospective research avenues, Cognitive abilities, COTS approach
Assessing and alleviating state anxiety in large language models
Ben-Zion Z, Witte K, Jagadish A, Duek O, Harpaz-Rotem I, Khorsandian M, Burrer A, Seifritz E, Homan P, Schulz E, Spiller T. Assessing and alleviating state anxiety in large language models. Npj Digital Medicine 2025, 8: 132. PMID: 40033130, PMCID: PMC11876565, DOI: 10.1038/s41746-025-01512-6.
Peer-Reviewed Original Research
Editorial Comment: Large Language Models Have Potential to Improve Follow-Up But May Have Unforeseen Impact on Radiology Reporting.
Mezrich J. Editorial Comment: Large Language Models Have Potential to Improve Follow-Up But May Have Unforeseen Impact on Radiology Reporting. American Journal Of Roentgenology 2025 PMID: 39936860, DOI: 10.2214/ajr.25.32734.
Peer-Reviewed Original Research
Language model, Language
Improving the Measures of Phonological Ability in the Russian Language: IRT and CART Modeling Application
Markov I, Kharitonova K, Grigorenko E. Improving the Measures of Phonological Ability in the Russian Language: IRT and CART Modeling Application. Reading Research Quarterly 2025, 60 DOI: 10.1002/rrq.604.
Peer-Reviewed Original Research
Levels of phonological ability, Phonological abilities, Measures of phonological ability, Repetitive testing, Phonological working memory, Development of literacy, Item analysis, Language-universal, Language acquisition, Russian language, Item response theory modeling framework, Transparent language, Pseudoword repetition, Phonological awareness, Linguistic metrics, Language, Sample of children, Test development strategies, Pseudowords, Working memory, Effect of grade, Item subsets, Test adaptation, Item selection, Items
BiomedRAG: A retrieval augmented large language model for biomedicine
Li M, Kilicoglu H, Xu H, Zhang R. BiomedRAG: A retrieval augmented large language model for biomedicine. Journal Of Biomedical Informatics 2025, 162: 104769. PMID: 39814274, PMCID: PMC11837810, DOI: 10.1016/j.jbi.2024.104769.
Peer-Reviewed Original Research
Language of administration and academic test performance in Ghanaian children
Garcia J, Kulesz P, Grigorenko E. Language of administration and academic test performance in Ghanaian children. International Journal Of Educational Research 2025, 130: 102549. DOI: 10.1016/j.ijer.2025.102549.
Peer-Reviewed Original Research
Academic test performance, Language of test administration, Item characteristics, Test performance, Performance of school-aged children, Functioning of students, Language of administration, School-aged children, Phonological items, Test administration, Item responses, Academic performance, Items, Children, Assessment approach, Phonology, Orthography, Mathematics, Students, Administration, Language, Ghana
2024
Large language models surpass human experts in predicting neuroscience results
Luo X, Rechardt A, Sun G, Nejad K, Yáñez F, Yilmaz B, Lee K, Cohen A, Borghesani V, Pashkov A, Marinazzo D, Nicholas J, Salatiello A, Sucholutsky I, Minervini P, Razavi S, Rocca R, Yusifov E, Okalova T, Gu N, Ferianc M, Khona M, Patil K, Lee P, Mata R, Myers N, Bizley J, Musslick S, Bilgin I, Niso G, Ales J, Gaebler M, Ratan Murty N, Loued-Khenissi L, Behler A, Hall C, Dafflon J, Bao S, Love B. Large language models surpass human experts in predicting neuroscience results. Nature Human Behaviour 2024, 9: 305-315. PMID: 39604572, PMCID: PMC11860209, DOI: 10.1038/s41562-024-02046-9.
Peer-Reviewed Original Research
Concepts: Human experts, Language model, Human information processing capacity, Neuroscience results, Knowledge-intensive endeavour, Information processing capacity, Processing capacity, Neuroscience literature, Experimental outcomes, Experts, High confidence, Scientific discovery, Decades of research, Language, Neuroscience, LLM, Task
Cross-Cultural Validation of the Sexual Desire Inventory (SDI-2) in 42 Countries and 26 Languages
Castro-Calvo J, Beltrán-Martínez P, Ballester-Arnal R, Nagy L, Koós M, Kraus S, Demetrovics Z, Potenza M, Batthyány D, Bergeron S, Billieux J, Briken P, Burkauskas J, Cárdenas-López G, Carvalho J, Chen L, Ciocca G, Corazza O, Csakó R, Fernandez D, Fernandez E, Fujiwara H, Fuss J, Gabrhelík R, Gewirtz-Meydan A, Gjoneska B, Gola M, Grubbs J, Hashim H, Hsieh Y, Islam S, Ismail M, Jiménez-Martínez M, Jurin T, Kalina O, Klein V, Költő A, Lee S, Lewczuk K, Lin C, Lochner C, Lopez-Alvarado S, Lukavská K, Mayta-Tristán P, Miller D, Orosova O, Orosz G, Team S, Ponce F, Quintana G, Garzola G, Ramos-Diaz J, Rigaud K, Rousseau A, De Tubino Scanavino M, Schulmeyer M, Sharan P, Shibata M, Shoib S, Sigre-Leirós V, Sniewski L, Spasovski O, Steibliene V, Stein D, Štulhofer A, Ünsal B, Vaillancourt-Morel M, Van Hout M, Bőthe B. Cross-Cultural Validation of the Sexual Desire Inventory (SDI-2) in 42 Countries and 26 Languages. The Journal Of Sex Research 2024, ahead-of-print: 1-14. PMID: 39560207, DOI: 10.1080/00224499.2024.2417023.
Peer-Reviewed Original Research
Sexual Desire Inventory, Sexual desire, Desire Inventory, SDI-2, Non-clinical sample, Measurement invariance testing, Psychometrically robust measure, Sexuality-related variables, Confirmatory factor analysis, Group-based differences, Cross-cultural validity, Sexual orientation, Cross-cultural studies, Psychometric properties, Invariance testing, Effect size, Factor analysis, Well-being, Expression of sexual desire, Weak-to-moderate, Sexual function, Inventory, Sex surveys, Language, Desire
Accuracy of Spanish and English-generated ChatGPT responses to commonly asked patient questions about labor epidurals: a survey-based study among bilingual obstetric anesthesia experts
Gonzalez Fiol A, Mootz A, He Z, Delgado C, Ortiz V, Reale S. Accuracy of Spanish and English-generated ChatGPT responses to commonly asked patient questions about labor epidurals: a survey-based study among bilingual obstetric anesthesia experts. International Journal Of Obstetric Anesthesia 2024, 61: 104290. PMID: 39579604, DOI: 10.1016/j.ijoa.2024.104290.
Peer-Reviewed Original Research
Labor epidurals, Non-English-speaking patients, Mode of delivery, Health inequalities, Labor course, Cesarean delivery, Obstetric anesthesiologist, Patient questions, Survey-based study, Median score, Medical advice, Anesthesia experts, Epidurals, Likert scale, English, Language model, English answers, Perpetuate misinformation, Patients, Language, Spanish, Scores, Accuracy scores, Delivery, Questions
A professional musician with progressive visuospatial concerns: a case study and review of musical alexia
Ficek-Tani B, Tun S, Frolov A, Sharp E, Fredericks C. A professional musician with progressive visuospatial concerns: a case study and review of musical alexia. Neurocase 2024, 30: 214-225. PMID: 39655794, DOI: 10.1080/13554794.2024.2438413.
Peer-Reviewed Original Research
Cross-cultural Validation of the Arizona Sexual Experience Scale (ASEX) in 42 Countries and 26 Languages
Ballester-Arnal R, Elipe-Miravet M, Castro-Calvo J, Beltrán-Martínez P, Nagy L, Koós M, Kraus S, Demetrovics Z, Potenza M, Batthyány D, Bergeron S, Billieux J, Briken P, Burkauskas J, Cárdenas-López G, Carvalho J, Chen J, Chen L, Ciocca G, Corazza O, Csako R, Fernandez D, Fernandez E, Fujiwara H, Fuss J, Gabrhelík R, Gewirtz-Meydan A, Gjoneska B, Gola M, Grubbs J, Hashim H, Islam M, Ismail M, Jiménez-Martínez M, Jurin T, Kalina O, Klein V, Költő A, Lee S, Lewczuk K, Lin C, Lochner C, López-Alvarado S, Lukavská K, Mayta-Tristán P, Miller D, Orosová O, Orosz G, Ponce F, Quintana G, Garzola G, Ramos-Diaz J, Rigaud K, Rousseau A, De Tubino Scanavino M, Schulmeyer M, Sharan P, Shibata M, Shoib S, Sigre-Leirós V, Sniewski L, Spasovski O, Steibliene V, Stein D, Ünsal B, Vaillancourt-Morel M, Van Hout M, Bőthe B. Cross-cultural Validation of the Arizona Sexual Experience Scale (ASEX) in 42 Countries and 26 Languages. Sexuality Research And Social Policy 2024, 1-23. DOI: 10.1007/s13178-024-01040-0.
Peer-Reviewed Original Research
Arizona Sexual Experience Scale, Sexual Experience Scale, Sexual function problems, Experiences Scale, One-factor solution, Sexual function, Cross-cultural validity, Sexual orientation, Cross-cultural differences, Multi-national sample, Asexual participants, Residual invariance, Psychometric examination, Convergent validity, Psychometric properties, Sexual function issues, Sex drive, Internal consistency, Policy ImplicationsThe findings, ImplicationsThe findings, Functional problems, Well-being, Tailored interventions, Orgasm, Language
L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models
Ni A, Yin P, Zhao Y, Riddell M, Feng T, Shen R, Yin S, Liu Y, Yavuz S, Xiong C, Joty S, Zhou Y, Radev D, Cohan A. L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models. Transactions Of The Association For Computational Linguistics 2024, 12: 1311-1329. DOI: 10.1162/tacl_a_00705.
Peer-Reviewed Original Research
Language model, Natural language input, Semantic parsing, Human evaluation, Pretraining data, Model architecture, Model size, Generation capability, Confidence calibration, Learning paradigm, Python program, Project website, Task, Capability, Language input, Parsing, Comprehensive evaluation, Language, Python, Architecture, Code, Evaluation, LLM, Framework1, Learning
Pain disparities attributed to linguistic minoritization in health care settings
Lim P, Fortier M, Kain Z. Pain disparities attributed to linguistic minoritization in health care settings. Journal Of Pain 2024, 104688. PMID: 39357614, DOI: 10.1016/j.jpain.2024.104688.
Peer-Reviewed Original Research
Spoken language, English-speaking children, Pain disparities, English-speaking families, Pain assessment, Pain outcomes, English, Language, Pain concepts, Communication process, Minoritized children, Health care settings, Pediatric pain assessment, Narrative review, Interpreter services, Hypothesized factors, Randomized controlled trials, Pain communication, Care settings, Hospital setting, Clinician bias, Design interventions, Controlled trials, Empirical research, Systemic factors
Changes in the structure of spontaneous speech predict the disruption of hierarchical brain organization in first‐episode psychosis
He R, Alonso‐Sánchez M, Sepulcre J, Palaniyappan L, Hinzen W. Changes in the structure of spontaneous speech predict the disruption of hierarchical brain organization in first‐episode psychosis. Human Brain Mapping 2024, 45: e70030. PMID: 39301700, PMCID: PMC11413563, DOI: 10.1002/hbm.70030.
Peer-Reviewed Original Research
Concepts: First-episode psychosis, Spontaneous speech, Cortical hierarchy, Hierarchical brain organization, Higher-order association cortices, Picture description, Mode network, Brain organization, Association cortex, Psychosis, Cognitive function, Mental dysfunction, Primary sensorimotor, Speech patterns, Syntactic associations, Cortical organization, Semantic network, Sensorimotor, Situational language, Speech, Language, Hierarchical organization, Hierarchical distance, Neurocognition, FMRI
What R Mandarin Chinese /ɹ/s? – acoustic and articulatory features of Mandarin Chinese rhotics
Chen S, Whalen D, Mok P. What R Mandarin Chinese /ɹ/s? – acoustic and articulatory features of Mandarin Chinese rhotics. Phonetica 2024, 81: 509-552. PMID: 39279469, PMCID: PMC11449382, DOI: 10.1515/phon-2023-0023.
Peer-Reviewed Original Research
Tongue shape, Rhotic sounds, Syllable position, Effect of syllable position, Phonetic variation, Phonetic features, Vowel contexts, Higher F2, Speech production, Articulatory features, Fricative noise, Rhotics, Acoustic differences, Mandarin, Acoustic features, Prevocalic, Language, Tongue, Postvocalic, Retroflex, Vowels, Articulatory, Speakers, Sound, Speech
A Secret Shopper Study of Language Accessibility of Community-Based Behavioral Health Services for Children in Families Who Speak Spanish and English
Lomax S, Klusaritz H, Jimenez M, Frausto B, Cahen V, Njoroge W, Yun K. A Secret Shopper Study of Language Accessibility of Community-Based Behavioral Health Services for Children in Families Who Speak Spanish and English. The Journal Of Pediatrics 2024, 276: 114275. PMID: 39218205, PMCID: PMC11645237, DOI: 10.1016/j.jpeds.2024.114275.
Peer-Reviewed Original Research
Behavioral health facilities, Medicaid-insured children, Health facilities, Preferred language, Behavioral health care, Substance abuse services, Spanish-speaking families, Modifiable barriers, Health schedules, Health care, Abuse services, Outpatient facilities, Standardized script, Primary outcome, Telephone numbers, Appointment, Children, English, Callers, Language, Scripted calls, Access training, Facilities, Spanish, Family
Can large language models estimate public opinion about global warming? An empirical assessment of algorithmic fidelity and bias
Lee S, Peng T, Goldberg M, Rosenthal S, Kotcher J, Maibach E, Leiserowitz A. Can large language models estimate public opinion about global warming? An empirical assessment of algorithmic fidelity and bias. PLOS Climate 2024, 3: e0000429. DOI: 10.1371/journal.pclm.0000429.
Peer-Reviewed Original Research
Public opinion, Estimate public opinion, Measures of public opinion, Language model, Question format, Social science research, Language, Voting behavior, Black Americans, Global warming, Opinion, Empirical assessment, Human attitudes, Algorithmic bias, Attitudes, Science research, LLM, Predictive beliefs, Survey responses, Model selection
Clinical Documentation of Patient Identities in the Electronic Health Record: Ethical Principles to Consider
Decker S, Farook M, Meshberg-Cohen S, Matsuura T, Manning M, Abel E, Blakley L, Prelli F. Clinical Documentation of Patient Identities in the Electronic Health Record: Ethical Principles to Consider. Psychological Services 2024, 21: 589-600. PMID: 37917474, DOI: 10.1037/ser0000816.
Peer-Reviewed Original Research
Multicultural guidelines, American Psychological Association’s multicultural guidelines, Fluid identities, Language, Lived experience, Identity, Set of questions, Identity variables, Purpose of documentation, Ethical dilemmas, Documentation approach, Association Code, Psychologists, Patient identity, Ethical principles, Organizational mandates, Documentation, General principles, Article, Dilemma, Questions, Experience, Little guidance