2025
Medical foundation large language models for comprehensive text analysis and beyond
Xie Q, Chen Q, Chen A, Peng C, Hu Y, Lin F, Peng X, Huang J, Zhang J, Keloth V, Zhou X, Qian L, He H, Shung D, Ohno-Machado L, Wu Y, Xu H, Bian J. Medical foundation large language models for comprehensive text analysis and beyond. npj Digital Medicine 2025, 8: 141. PMID: 40044845, PMCID: PMC11882967, DOI: 10.1038/s41746-025-01533-1. Peer-Reviewed Original Research
Concepts: Text analysis tasks, Analysis tasks, Language model, Domain-specific knowledge, Zero-Shot, Human evaluation, Supervised setting, Task-specific instructions, Clinical data sources, Specialized medical knowledge, ChatGPT, Text analysis, Pretraining, Task, Data sources, Medical applications, Medical knowledge, Enhanced performance, Text, Performance
2023
Blockchain-enabled immutable, distributed, and highly available clinical research activity logging system for federated COVID-19 data analysis from multiple institutions
Kuo T, Pham A, Edelson M, Kim J, Chan J, Gupta Y, Ohno-Machado L, Anderson D, Balacha C, Bath T, Baxter S, Becker-Pennrich A, Bell D, Bernstam E, Ngan C, Day M, Doctor J, DuVall S, El-Kareh R, Florian R, Follett R, Geisler B, Ghigi A, Gottlieb A, Hinske L, Hu Z, Ir D, Jiang X, Kim K, Kim J, Knight T, Koola J, Kuo T, Lee N, Mansmann U, Matheny M, Meeker D, Mou Z, Neumann L, Nguyen N, Nick A, Ohno-Machado L, Park E, Paul P, Pletcher M, Post K, Rieder C, Scherer C, Schilling L, Soares A, SooHoo S, Soysal E, Steven C, Tep B, Toy B, Wang B, Wu Z, Xu H, Yong C, Zheng K, Zhou Y, Zucker R. Blockchain-enabled immutable, distributed, and highly available clinical research activity logging system for federated COVID-19 data analysis from multiple institutions. Journal of the American Medical Informatics Association 2023, 30: 1167-1178. PMID: 36916740, PMCID: PMC10198529, DOI: 10.1093/jamia/ocad049. Peer-Reviewed Original Research
Concepts: Federated data analysis, User activity logs, Smart contract deployment, Run-time efficiency, Data analysis system, Data analysis activities, Activity logs, Data discovery, Querying time, Blockchain system, Blockchain technology, Network transactions, COVID-19 data analysis, Multiple institutions, Low deployment, Blockchain, GitHub repository, Multiple nodes, Large networks, Queries, Analysis activities, High availability, Language code, Baseline solution, Data analysis
A hierarchical strategy to minimize privacy risk when linking “De-identified” data in biomedical research consortia
Ohno-Machado L, Jiang X, Kuo T, Tao S, Chen L, Ram P, Zhang G, Xu H. A hierarchical strategy to minimize privacy risk when linking “De-identified” data in biomedical research consortia. Journal of Biomedical Informatics 2023, 139: 104322. PMID: 36806328, PMCID: PMC10975485, DOI: 10.1016/j.jbi.2023.104322. Peer-Reviewed Original Research
Concepts: Privacy of individuals, Appropriate privacy protection, Data-driven models, Privacy protection, Privacy risks, Data Coordination Center, Data hub, Data repository, Hierarchical strategy, Privacy, Biomedical discovery, Data sets, Record linkage, Data Coordinating Center, Repository, Complex strategies, Coordination center, Technology, Technique, Data, Parties, Set, Hierarchy
2022
The All of Us Research Program: Data quality, utility, and diversity
Ramirez A, Sulieman L, Schlueter D, Halvorson A, Qian J, Ratsimbazafy F, Loperena R, Mayo K, Basford M, Deflaux N, Muthuraman K, Natarajan K, Kho A, Xu H, Wilkins C, Anton-Culver H, Boerwinkle E, Cicek M, Clark C, Cohn E, Ohno-Machado L, Schully S, Ahmedani B, Argos M, Cronin R, O’Donnell C, Fouad M, Goldstein D, Greenland P, Hebbring S, Karlson E, Khatri P, Korf B, Smoller J, Sodeke S, Wilbanks J, Hentges J, Mockrin S, Lunt C, Devaney S, Gebo K, Denny J, Carroll R, Glazer D, Harris P, Hripcsak G, Philippakis A, Roden D, Program T, Ahmedani B, Johnson C, Ahsan H, Antoine-LaVigne D, Singleton G, Anton-Culver H, Topol E, Baca-Motes K, Steinhubl S, Wade J, Begale M, Jain P, Sutherland S, Lewis B, Korf B, Behringer M, Gharavi A, Goldstein D, Hripcsak G, Bier L, Boerwinkle E, Brilliant M, Murali N, Hebbring S, Farrar-Edwards D, Burnside E, Drezner M, Taylor A, Channamsetty V, Montalvo W, Sharma Y, Chinea C, Jenks N, Cicek M, Thibodeau S, Holmes B, Schlueter E, Collier E, Winkler J, Corcoran J, D’Addezio N, Daviglus M, Winn R, Wilkins C, Roden D, Denny J, Doheny K, Nickerson D, Eichler E, Jarvik G, Funk G, Philippakis A, Rehm H, Lennon N, Kathiresan S, Gabriel S, Gibbs R, Rico E, Glazer D, Grand J, Greenland P, Harris P, Shenkman E, Hogan W, Igho-Pemu P, Pollan C, Jorge M, Okun S, Karlson E, Smoller J, Murphy S, Ross M, Kaushal R, Winford E, Wallace F, Khatri P, Kheterpal V, Ojo A, Moreno F, Kron I, Peterson R, Menon U, Lattimore P, Leviner N, Obedin-Maliver J, Lunn M, Malik-Gagnon L, Mangravite L, Marallo A, Marroquin O, Visweswaran S, Reis S, Marshall G, McGovern P, Mignucci D, Moore J, Munoz F, Talavera G, O'Connor G, O'Donnell C, Ohno-Machado L, Orr G, Randal F, Theodorou A, Reiman E, Roxas-Murray M, Stark L, Tepp R, Zhou A, Topper S, Trousdale R, Tsao P, Weidman L, Weiss S, Wellis D, Whittle J, Wilson A, Zuchner S, Zwick M. The All of Us Research Program: Data quality, utility, and diversity. Patterns 2022, 3: 100570. 
PMID: 36033590, PMCID: PMC9403360, DOI: 10.1016/j.patter.2022.100570. Peer-Reviewed Original Research
IMI-CDE: an interactive interface for collaborative mapping of study variables to common data elements
Tao S, Chou W, Li J, Du J, Ram P, Abeysinghe R, Xu H, Jiang X, Rose P, Ohno-Machado L, Zhang G. IMI-CDE: an interactive interface for collaborative mapping of study variables to common data elements. 2022, 00: 465-468. DOI: 10.1109/ichi54592.2022.00070. Peer-Reviewed Original Research
2021
Privacy-protecting, reliable response data discovery using COVID-19 patient observations
Kim J, Neumann L, Paul P, Day M, Aratow M, Bell D, Doctor J, Hinske L, Jiang X, Kim K, Matheny M, Meeker D, Pletcher M, Schilling L, SooHoo S, Xu H, Zheng K, Ohno-Machado L, Anderson D, Anderson N, Balacha C, Bath T, Baxter S, Becker-Pennrich A, Bernstam E, Carter W, Chau N, Choi Y, Covington S, DuVall S, El-Kareh R, Florian R, Follett R, Geisler B, Ghigi A, Gottlieb A, Hu Z, Ir D, Knight T, Koola J, Kuo T, Lee N, Mansmann U, Mou Z, Murphy R, Neumann L, Nguyen N, Niedermayer S, Park E, Perkins A, Post K, Rieder C, Scherer C, Soares A, Soysal E, Tep B, Toy B, Wang B, Wu Z, Zhou Y, Zucker R. Privacy-protecting, reliable response data discovery using COVID-19 patient observations. Journal of the American Medical Informatics Association 2021, 28: 1765-1776. PMID: 34051088, PMCID: PMC8194878, DOI: 10.1093/jamia/ocab054. Peer-Reviewed Original Research
2020
How do we share data in COVID-19 research? A systematic review of COVID-19 datasets in PubMed Central Articles
Zuo X, Chen Y, Ohno-Machado L, Xu H. How do we share data in COVID-19 research? A systematic review of COVID-19 datasets in PubMed Central Articles. Briefings in Bioinformatics 2020, 22: 800-811. PMID: 33757278, PMCID: PMC7799277, DOI: 10.1093/bib/bbaa331. Peer-Reviewed Reviews, Practice Guidelines, Standards, and Consensus Statements
COVID-19 TestNorm: A tool to normalize COVID-19 testing names to LOINC codes
Dong X, Li J, Soysal E, Bian J, DuVall S, Hanchrow E, Liu H, Lynch K, Matheny M, Natarajan K, Ohno-Machado L, Pakhomov S, Reeves R, Sitapati A, Abhyankar S, Cullen T, Deckard J, Jiang X, Murphy R, Xu H. COVID-19 TestNorm: A tool to normalize COVID-19 testing names to LOINC codes. Journal of the American Medical Informatics Association 2020, 27: 1437-1442. PMID: 32569358, PMCID: PMC7337837, DOI: 10.1093/jamia/ocaa145. Peer-Reviewed Original Research
Concepts: Electronic health records, LOINC codes, Secondary use, Rule-based tool, Online web application, Open-source package, Critical data elements, Web application, Data networks, End users, Data elements, Independent test set, Health records, Test set, Key challenges, Data normalization, Critical resources, Test names, Routine clinical practice data, Code, Clinical practice data, Coronavirus disease 2019, COVID-19 diagnostic tests, Tool, Developers
Coronavirus: indexed data speed up solutions
Ohno-Machado L, Xu H. Coronavirus: indexed data speed up solutions. Nature 2020, 584: 192-192. PMID: 32782375, DOI: 10.1038/d41586-020-02331-3. Commentaries, Editorials and Letters
2018
DataMed – an open source discovery index for finding biomedical datasets
Chen X, Gururaj A, Ozyurt B, Liu R, Soysal E, Cohen T, Tiryaki F, Li Y, Zong N, Jiang M, Rogith D, Salimi M, Kim H, Rocca-Serra P, Gonzalez-Beltran A, Farcas C, Johnson T, Margolis R, Alter G, Sansone S, Fore I, Ohno-Machado L, Grethe J, Xu H. DataMed – an open source discovery index for finding biomedical datasets. Journal of the American Medical Informatics Association 2018, 25: 300-308. PMID: 29346583, PMCID: PMC7378878, DOI: 10.1093/jamia/ocx121. Peer-Reviewed Original Research
Concepts: Ingestion pipeline, Biomedical datasets, Search engines, Biomedical domain, Advanced natural language processing, Relevant datasets, User-entered query, Data discovery system, Unified metadata model, Data ingestion pipelines, Natural language processing, Open-source package, Retrieval engine, Terminology services, Metadata model, Metadata information, Discovery system, Data reuse, DataMed, Benchmark datasets, Biomedical data, Data index, Average precision, Language processing, Source package
2017
User needs analysis and usability assessment of DataMed – a biomedical data discovery index
Dixit R, Rogith D, Narayana V, Salimi M, Gururaj A, Ohno-Machado L, Xu H, Johnson T. User needs analysis and usability assessment of DataMed – a biomedical data discovery index. Journal of the American Medical Informatics Association 2017, 25: 337-344. PMID: 29202203, PMCID: PMC7378884, DOI: 10.1093/jamia/ocx134. Peer-Reviewed Original Research
Concepts: Data discovery, Usability evaluation, Information needs, User interface, Biomedical data, Iterative usability evaluations, Information retrieval tools, User interface needs, High-quality metadata, Researchers information, Common search engines, Discovery system, Retrieval tools, DataMed, User study, Relevance judgments, Search engines, User needs, Dataset exploration, Usability assessment, Retrieval techniques, New retrieval technique, Incomplete metadata, Metadata, Users
DATS, the data tag suite to enable discoverability of datasets
Sansone S, Gonzalez-Beltran A, Rocca-Serra P, Alter G, Grethe J, Xu H, Fore I, Lyle J, Gururaj A, Chen X, Kim H, Zong N, Li Y, Liu R, Ozyurt I, Ohno-Machado L. DATS, the data tag suite to enable discoverability of datasets. Scientific Data 2017, 4: 170059. PMID: 28585923, PMCID: PMC5460592, DOI: 10.1038/sdata.2017.59. Peer-Reviewed Original Research
Finding useful data across multiple biomedical data repositories using DataMed
Ohno-Machado L, Sansone S, Alter G, Fore I, Grethe J, Xu H, Gonzalez-Beltran A, Rocca-Serra P, Gururaj A, Bell E, Soysal E, Zong N, Kim H. Finding useful data across multiple biomedical data repositories using DataMed. Nature Genetics 2017, 49: 816-819. PMID: 28546571, PMCID: PMC6460922, DOI: 10.1038/ng.3864. Peer-Reviewed Original Research
Concepts: Biomedical data repositories, Health big data, Data sets, Knowledge discovery, Big data, Multiple repositories, Search engines, Data index, FAIR principles, DataMed, Data repository, Service providers, Knowledge initiatives, Knowledge experts, Biomedical research community, Research community, Repository, Science landscape, Useful data, Interoperability, Metadata, Findability, Set, Engine, Data
A publicly available benchmark for biomedical dataset retrieval: the reference standard for the 2016 bioCADDIE dataset retrieval challenge
Cohen T, Roberts K, Gururaj A, Chen X, Pournejati S, Alter G, Hersh W, Demner-Fushman D, Ohno-Machado L, Xu H. A publicly available benchmark for biomedical dataset retrieval: the reference standard for the 2016 bioCADDIE dataset retrieval challenge. Database 2017, 2017: bax061. PMID: 29220453, PMCID: PMC5737202, DOI: 10.1093/database/bax061. Peer-Reviewed Original Research
Information retrieval for biomedical datasets: the 2016 bioCADDIE dataset retrieval challenge
Roberts K, Gururaj A, Chen X, Pournejati S, Hersh W, Demner-Fushman D, Ohno-Machado L, Cohen T, Xu H. Information retrieval for biomedical datasets: the 2016 bioCADDIE dataset retrieval challenge. Database 2017, 2017: bax068. DOI: 10.1093/database/bax068. Peer-Reviewed Original Research
Concepts: Biomedical datasets, Retrieval challenges, Information retrieval techniques, Advanced query processing, Biomedical data repositories, Advanced retrieval methods, Query processing, Information retrieval, Test queries, Retrieval system, Rank framework, Retrieval approach, Retrieval techniques, Data repository, Retrieval method, Top precision, Dataset, Queries, Repository, Challenges, Retrieval, Task, Learning, System, Corpus
2014
PhenDisco: phenotype discovery system for the database of genotypes and phenotypes
Doan S, Lin K, Conway M, Ohno-Machado L, Hsieh A, Feupe S, Garland A, Ross M, Jiang X, Farzaneh S, Walker R, Alipanah N, Zhang J, Xu H, Kim H. PhenDisco: phenotype discovery system for the database of genotypes and phenotypes. Journal of the American Medical Informatics Association 2014, 21: 31-36. PMID: 23989082, PMCID: PMC3912702, DOI: 10.1136/amiajnl-2013-001882. Peer-Reviewed Original Research
Concepts: New information retrieval system, Information retrieval systems, Information retrieval tools, Database of Genotypes, Text processing tools, Retrieval system, Search scenarios, Discovery system, Retrieval tools, Authorized users, Non-standardized way, Cross-study validation, Search comparison, Processing tools, Promising performance, Users, Phenotype information, Database, Information, Biotechnology Information, Queries, Metadata, Entrez, Resources, System