2024
Leveraging error-prone algorithm-derived phenotypes: Enhancing association studies for risk factors in EHR data
Lu Y, Tong J, Chubak J, Lumley T, Hubbard R, Xu H, Chen Y. Leveraging error-prone algorithm-derived phenotypes: Enhancing association studies for risk factors in EHR data. Journal Of Biomedical Informatics 2024, 157: 104690. PMID: 39004110, DOI: 10.1016/j.jbi.2024.104690.
Peer-Reviewed Original Research
Concepts: Electronic health records; Electronic health record data; Kaiser Permanente Washington; EHR-derived phenotypes; Association studies; Health records; Colon cancer recurrence; Phenotyping errors; Computable phenotype; Risk factors; Cancer recurrence; Multiple phenotypes; Reduce bias; Improve estimation accuracy; Simulation study; Bias reduction; Kaiser; Reduction of bias; Bias; Estimation accuracy; Association; Study; Outcomes; Risk; Estimation efficiency
Develop and validate a computable phenotype for the identification of Alzheimer's disease patients using electronic health record data
He X, Wei R, Huang Y, Chen Z, Lyu T, Bost S, Tong J, Li L, Zhou Y, Li Z, Guo J, Tang H, Wang F, DeKosky S, Xu H, Chen Y, Zhang R, Xu J, Guo Y, Wu Y, Bian J. Develop and validate a computable phenotype for the identification of Alzheimer's disease patients using electronic health record data. Alzheimer's & Dementia Diagnosis Assessment & Disease Monitoring 2024, 16: e12613. PMID: 38966622, PMCID: PMC11220631, DOI: 10.1002/dad2.12613.
Peer-Reviewed Original Research
Concepts: Electronic health record data; Electronic health records; Computable phenotype; Health record data; Manual chart review; Health records; Alzheimer's disease; Diagnosis codes; Record data; Chart review; UTHealth; Alzheimer's disease patients; University of Minnesota; AD diagnosis; AD identification; Disease patients; Patients; Alzheimer; AD patients; Demographics; Diagnosis; Disease; Code; Data; University
Kamino: A Scalable Architecture to Support Medical AI Research Using Large Real World Data
Lin F, Young P, He H, Huang J, Gagne R, Rice D, Price N, Byron W, Hu Y, Felker D, Button W, Meeker D, Hsiao A, Xu H, Torre C, Schulz W. Kamino: A Scalable Architecture to Support Medical AI Research Using Large Real World Data. 2024, 00: 500-504. DOI: 10.1109/ichi61247.2024.00072.
Peer-Reviewed Original Research
Concepts: Electronic health records; AI research; Natural language processing tasks; Electronic health record data; Language processing tasks; Computing resource management; Large-scale data retrieval; Medical AI research; Leveraging electronic health records; Standard data model; Kubernetes orchestrator; Scalable architecture; Processing tasks; Resource allocation systems; Security considerations; Access management; Data retrieval; Data model; Architectural solutions; OMOP CDM; Real World Data; World Data; Health records; OMOP; Data
Developing deep learning-based strategies to predict the risk of hepatocellular carcinoma among patients with nonalcoholic fatty liver disease from electronic health records
Li Z, Lan L, Zhou Y, Li R, Chavin K, Xu H, Li L, Shih D, Zheng W. Developing deep learning-based strategies to predict the risk of hepatocellular carcinoma among patients with nonalcoholic fatty liver disease from electronic health records. Journal Of Biomedical Informatics 2024, 152: 104626. PMID: 38521180, DOI: 10.1016/j.jbi.2024.104626.
Peer-Reviewed Original Research
Concepts: Deep learning models; Electronic health records; HCC risk prediction; Health records; Time-varying covariates; Learning models; Electronic health record data; Risk prediction; Health record data; Accuracy of deep learning models; Deep learning-based strategy; Covariate imbalance; Disease prediction tasks; Learning-based strategy; Deep learning performance; Disease risk prediction; EHR database; Classification problem; Length of follow-up; Transfer learning; Fatty liver disease; Prediction task; Carcinoma risk; Model training; Record data
2023
Prediction of Brain Metastases Development in Patients With Lung Cancer by Explainable Artificial Intelligence From Electronic Health Records.
Li Z, Li R, Zhou Y, Rasmy L, Zhi D, Zhu P, Dono A, Jiang X, Xu H, Esquenazi Y, Zheng W. Prediction of Brain Metastases Development in Patients With Lung Cancer by Explainable Artificial Intelligence From Electronic Health Records. JCO Clinical Cancer Informatics 2023, 7: e2200141. PMID: 37018650, PMCID: PMC10281421, DOI: 10.1200/cci.22.00141.
Peer-Reviewed Original Research
Concepts: Brain metastases; Explainable artificial intelligence; Feature attribution methods; Lung cancer; EHR data; Artificial intelligence; Cerner Health Facts database; BM development; Explainable artificial intelligence approach; Brain metastasis development; Health Facts database; Electronic health record data; Recurrent neural network model; Artificial intelligence approach; Health record data; Model decision process; Structured EHR data; Neural network model; Decision process; Attribution methods; High-quality cohort; Electronic health records; Prompt treatment; Metastasis development; Intelligence approach
2022
The All of Us Research Program: Data quality, utility, and diversity
Ramirez A, Sulieman L, Schlueter D, Halvorson A, Qian J, Ratsimbazafy F, Loperena R, Mayo K, Basford M, Deflaux N, Muthuraman K, Natarajan K, Kho A, Xu H, Wilkins C, Anton-Culver H, Boerwinkle E, Cicek M, Clark C, Cohn E, Ohno-Machado L, Schully S, Ahmedani B, Argos M, Cronin R, O’Donnell C, Fouad M, Goldstein D, Greenland P, Hebbring S, Karlson E, Khatri P, Korf B, Smoller J, Sodeke S, Wilbanks J, Hentges J, Mockrin S, Lunt C, Devaney S, Gebo K, Denny J, Carroll R, Glazer D, Harris P, Hripcsak G, Philippakis A, Roden D, Program T, Ahmedani B, Johnson C, Ahsan H, Antoine-LaVigne D, Singleton G, Anton-Culver H, Topol E, Baca-Motes K, Steinhubl S, Wade J, Begale M, Jain P, Sutherland S, Lewis B, Korf B, Behringer M, Gharavi A, Goldstein D, Hripcsak G, Bier L, Boerwinkle E, Brilliant M, Murali N, Hebbring S, Farrar-Edwards D, Burnside E, Drezner M, Taylor A, Channamsetty V, Montalvo W, Sharma Y, Chinea C, Jenks N, Cicek M, Thibodeau S, Holmes B, Schlueter E, Collier E, Winkler J, Corcoran J, D’Addezio N, Daviglus M, Winn R, Wilkins C, Roden D, Denny J, Doheny K, Nickerson D, Eichler E, Jarvik G, Funk G, Philippakis A, Rehm H, Lennon N, Kathiresan S, Gabriel S, Gibbs R, Rico E, Glazer D, Grand J, Greenland P, Harris P, Shenkman E, Hogan W, Igho-Pemu P, Pollan C, Jorge M, Okun S, Karlson E, Smoller J, Murphy S, Ross M, Kaushal R, Winford E, Wallace F, Khatri P, Kheterpal V, Ojo A, Moreno F, Kron I, Peterson R, Menon U, Lattimore P, Leviner N, Obedin-Maliver J, Lunn M, Malik-Gagnon L, Mangravite L, Marallo A, Marroquin O, Visweswaran S, Reis S, Marshall G, McGovern P, Mignucci D, Moore J, Munoz F, Talavera G, O'Connor G, O'Donnell C, Ohno-Machado L, Orr G, Randal F, Theodorou A, Reiman E, Roxas-Murray M, Stark L, Tepp R, Zhou A, Topper S, Trousdale R, Tsao P, Weidman L, Weiss S, Wellis D, Whittle J, Wilson A, Zuchner S, Zwick M. The All of Us Research Program: Data quality, utility, and diversity. Patterns 2022, 3: 100570. 
PMID: 36033590, PMCID: PMC9403360, DOI: 10.1016/j.patter.2022.100570.
Peer-Reviewed Original Research
Recurrent neural network models (CovRNN) for predicting outcomes of patients with COVID-19 on admission to hospital: model development and validation using electronic health record data
Rasmy L, Nigo M, Kannadath B, Xie Z, Mao B, Patel K, Zhou Y, Zhang W, Ross A, Xu H, Zhi D. Recurrent neural network models (CovRNN) for predicting outcomes of patients with COVID-19 on admission to hospital: model development and validation using electronic health record data. The Lancet Digital Health 2022, 4: e415-e425. PMID: 35466079, PMCID: PMC9023005, DOI: 10.1016/s2589-7500(22)00049-8.
Peer-Reviewed Original Research
Concepts: Light Gradient Boost Machine; Feature engineering; Gradient-boosting machine; Multiple machine learning models; Electronic health record data; Neural network-based model; Real-world datasets; Recurrent neural network model; Complex feature engineering; Machine learning models; Binary classification task; Specific feature selection; Logistic regression algorithm; Neural network model; Health record data; Recurrent neural network-based model; Binary classification model; Network-based model; Traditional machine; Extensive data preprocessing; High prediction accuracy; Multiple external datasets; Classification task; Data preprocessing; Feature selection
2020
Representation of EHR data for predictive modeling: a comparison between UMLS and other terminologies
Rasmy L, Tiryaki F, Zhou Y, Xiang Y, Tao C, Xu H, Zhi D. Representation of EHR data for predictive modeling: a comparison between UMLS and other terminologies. Journal Of The American Medical Informatics Association 2020, 27: 1593-1599. PMID: 32930711, PMCID: PMC7647355, DOI: 10.1093/jamia/ocaa180.
Peer-Reviewed Original Research
Concepts: Unified Medical Language System; Recurrent neural network; Neural network; Prediction performance; Logistic regression; Predictive modeling; Deep learning; Data aggregation; Electronic health record data; Machine learning; Risk prediction; Better prediction performance; Dengue hemorrhagic fever; Health record data; EHR data; Cancer prediction; Large vocabulary; Different tasks; Predictive model; Heart failure; Diabetes patients; Pancreatic cancer; Clinical data; Hemorrhagic fever; ICD-9
Time event ontology (TEO): to support semantic representation and reasoning of complex temporal relations of clinical events
Li F, Du J, He Y, Song H, Madkour M, Rao G, Xiang Y, Luo Y, Chen H, Liu S, Wang L, Liu H, Xu H, Tao C. Time event ontology (TEO): to support semantic representation and reasoning of complex temporal relations of clinical events. Journal Of The American Medical Informatics Association 2020, 27: 1046-1056. PMID: 32626903, PMCID: PMC7647306, DOI: 10.1093/jamia/ocaa058.
Peer-Reviewed Original Research
Concepts: Time Event Ontology; Complex temporal relations; Event ontology; Natural language processing field; Temporal relations; Time-related queries; Information annotation; Processing field; Temporal information; Data properties; Relation representation; Clinical narratives; Semantic representation; Electronic health record data; Rich set; Health record data; Ontology; Strong capability; Reasoning; Set; Queries; Order relation; Record data; Representation; Primitives
2019
Discovery of Noncancer Drug Effects on Survival in Electronic Health Records of Patients With Cancer: A New Paradigm for Drug Repurposing
Wu Y, Warner J, Wang L, Jiang M, Xu J, Chen Q, Nian H, Dai Q, Du X, Yang P, Denny J, Liu H, Xu H. Discovery of Noncancer Drug Effects on Survival in Electronic Health Records of Patients With Cancer: A New Paradigm for Drug Repurposing. JCO Clinical Cancer Informatics 2019, 3: cci.19.00001. PMID: 31141421, PMCID: PMC6693869, DOI: 10.1200/cci.19.00001.
Peer-Reviewed Original Research
Concepts: Vanderbilt University Medical Center; Cancer survival; Mayo Clinic; Drug repurposing; Noncancer drugs; Electronic health record data; Cancer registry data; EHR data; Clinical trial evaluation; Overall cancer survival; University Medical Center; Health record data; Electronic health records; Treatment of cancer; Clinical trials; Drug classes; Registry data; Medical Center; Drug effects; Significant association; Longitudinal EHR; New indications; Patients; Cancer; Health records
2018
A study of generalizability of recurrent neural network-based predictive models for heart failure onset risk using a large and heterogeneous EHR data set
Rasmy L, Wu Y, Wang N, Geng X, Zheng W, Wang F, Wu H, Xu H, Zhi D. A study of generalizability of recurrent neural network-based predictive models for heart failure onset risk using a large and heterogeneous EHR data set. Journal Of Biomedical Informatics 2018, 84: 11-16. PMID: 29908902, PMCID: PMC6076336, DOI: 10.1016/j.jbi.2018.06.011.
Peer-Reviewed Original Research
Concepts: Recurrent neural network; Onset risk; Capability of RNN; Cerner Health Facts; Heterogeneous EHR data; Heart failure patients; Data sets; Electronic health record data; Deep learning models; Different patient populations; Neural network-based predictive model; Different patient groups; Health record data; EHR data sets; Predictive modeling; Small data sets; Failure patients; Patient group; Patient population; Reduction of AUC; Neural network; RNN model; RETAIN model; Health Facts; Hospital
2013
Applying active learning to high-throughput phenotyping algorithms for electronic health records data
Chen Y, Carroll R, Hinz E, Shah A, Eyler A, Denny J, Xu H. Applying active learning to high-throughput phenotyping algorithms for electronic health records data. Journal Of The American Medical Informatics Association 2013, 20: e253-e259. PMID: 23851443, PMCID: PMC3861916, DOI: 10.1136/amiajnl-2013-001945.
Peer-Reviewed Original Research
Concepts: Active learning; Unrefined features; Supervised Machine Learning Algorithms; Refined features; Phenotyping algorithm; Electronic health record data; Machine Learning Algorithms; Health record data; Venous thromboembolism; Rheumatoid arthritis; Feature engineering; Domain experts; Domain knowledge; Phenotyping tasks; Learning algorithm; Feature sets; Learning approach; Colorectal cancer; AL approach; Curve score; Passive learning approach; High-throughput phenotyping methods; Algorithm; Small set; Record data