2024
Leveraging error-prone algorithm-derived phenotypes: Enhancing association studies for risk factors in EHR data
Lu Y, Tong J, Chubak J, Lumley T, Hubbard R, Xu H, Chen Y. Leveraging error-prone algorithm-derived phenotypes: Enhancing association studies for risk factors in EHR data. Journal Of Biomedical Informatics 2024, 157: 104690. PMID: 39004110, DOI: 10.1016/j.jbi.2024.104690.
Peer-Reviewed Original Research
Concepts: Electronic health records; Electronic health record data; Kaiser Permanente Washington; EHR-derived phenotypes; Association studies; Health records; Colon cancer recurrence; Phenotyping errors; Computable phenotype; Risk factors; Cancer recurrence; Multiple phenotypes; Reduce bias; Improve estimation accuracy; Simulation study; Bias reduction; Kaiser; Reduction of bias; Bias; Estimation accuracy; Association; Study; Outcomes; Risk; Estimation efficiency
Develop and validate a computable phenotype for the identification of Alzheimer's disease patients using electronic health record data
He X, Wei R, Huang Y, Chen Z, Lyu T, Bost S, Tong J, Li L, Zhou Y, Li Z, Guo J, Tang H, Wang F, DeKosky S, Xu H, Chen Y, Zhang R, Xu J, Guo Y, Wu Y, Bian J. Develop and validate a computable phenotype for the identification of Alzheimer's disease patients using electronic health record data. Alzheimer's & Dementia Diagnosis Assessment & Disease Monitoring 2024, 16: e12613. PMID: 38966622, PMCID: PMC11220631, DOI: 10.1002/dad2.12613.
Peer-Reviewed Original Research
Concepts: Electronic health record data; Electronic health records; Computable phenotype; Health record data; Manual chart review; Health records; Alzheimer's disease; Diagnosis codes; Record data; Chart review; UTHealth; Alzheimer's disease patients; University of Minnesota; AD diagnosis; AD identification; Disease patients; Patients; Alzheimer; AD patients; Demographics; Diagnosis; Disease; Code; Data; University
Developing deep learning-based strategies to predict the risk of hepatocellular carcinoma among patients with nonalcoholic fatty liver disease from electronic health records
Li Z, Lan L, Zhou Y, Li R, Chavin K, Xu H, Li L, Shih D, Zheng W. Developing deep learning-based strategies to predict the risk of hepatocellular carcinoma among patients with nonalcoholic fatty liver disease from electronic health records. Journal Of Biomedical Informatics 2024, 152: 104626. PMID: 38521180, DOI: 10.1016/j.jbi.2024.104626.
Peer-Reviewed Original Research
Concepts: Deep learning models; Electronic health records; HCC risk prediction; Health records; Time-varying covariates; Learning models; Electronic health record data; Risk prediction; Health record data; Accuracy of deep learning models; Deep learning-based strategy; Covariate imbalance; Disease prediction tasks; Learning-based strategy; Deep learning performance; Disease risk prediction; EHR database; Classification problem; Length of follow-up; Transfer learning; Fatty liver disease; Prediction task; Carcinoma risk; Model training; Record data
FedFSA: Hybrid and federated framework for functional status ascertainment across institutions
Fu S, Jia H, Vassilaki M, Keloth V, Dang Y, Zhou Y, Garg M, Petersen R, St Sauver J, Moon S, Wang L, Wen A, Li F, Xu H, Tao C, Fan J, Liu H, Sohn S. FedFSA: Hybrid and federated framework for functional status ascertainment across institutions. Journal Of Biomedical Informatics 2024, 152: 104623. PMID: 38458578, PMCID: PMC11005095, DOI: 10.1016/j.jbi.2024.104623.
Peer-Reviewed Original Research
Concepts: Natural language processing; Electronic health records; Status information; Information extraction; Functional status information; Rule-based information extraction; Federated learning framework; Private local data; Natural language processing framework; Healthcare sites; Patient's functional status; Multiple healthcare institutions; Federated learning; PyTorch library; Concept normalization; BERT model; Learning framework; Collaborative development efforts; Corpus annotation; Language processing; Healthcare institutions; Functional status; Predictor of health outcomes; Activities of daily living; Natural language processing performance
Mapping Clinical Documents to the Logical Observation Identifiers, Names and Codes (LOINC) Document Ontology using Electronic Health Record Systems Structured Metadata.
Khan H, Mosa A, Paka V, Rana M, Mandhadi V, Islam S, Xu H, McClay J, Sarker S, Rao P, Waitman L. Mapping Clinical Documents to the Logical Observation Identifiers, Names and Codes (LOINC) Document Ontology using Electronic Health Record Systems Structured Metadata. AMIA Annual Symposium Proceedings 2024, 2023: 1017-1026. PMID: 38222329, PMCID: PMC10785913.
Peer-Reviewed Original Research
Concepts: Document ontology; Electronic health records; Bag-of-words approach; Natural language processing techniques; Free-text documents; Language processing techniques; Clinical documentation; Logical Observation Identifiers; Text documents; Structured metadata; Words approach; Computational scalability; Metadata; Health records; EHR documentation; Electronic health record fields; Processing techniques; Ontology; Documents; Automated pipeline; NLP; Scalability; Clinical care; Framework; LOINC
Standardizing Multi-site Clinical Note Titles to LOINC Document Ontology: A Transformer-based Approach.
Zuo X, Zhou Y, Duke J, Hripcsak G, Shah N, Banda J, Reeves R, Miller T, Waitman L, Natarajan K, Xu H. Standardizing Multi-site Clinical Note Titles to LOINC Document Ontology: A Transformer-based Approach. AMIA Annual Symposium Proceedings 2024, 2023: 834-843. PMID: 38222429, PMCID: PMC10785935.
Peer-Reviewed Original Research
2023
Suicide Tendency Prediction from Psychiatric Notes Using Transformer Models
Li Z, Ameer I, Hu Y, Abdelhameed A, Tao C, Selek S, Xu H. Suicide Tendency Prediction from Psychiatric Notes Using Transformer Models. 2023, 00: 481-483. DOI: 10.1109/ichi57859.2023.00074.
Peer-Reviewed Original Research
Concepts: Weighted F1 score; F1 score; Machine learning models; Electronic health records; Learning models; State-of-the-art models; State-of-the-art; Binary classification task; Health records; Binary classification model; Standard diagnosis codes; Classification task; Multiclass classification; Health informatics; Classification model; Mental health informatics; Transformation model; Prediction algorithm; Psychiatric notes; Initial psychiatric evaluation; Suicidal tendencies; Machine; Random forest model; Suicidal ideation; Performance
Representing and utilizing clinical textual data for real world studies: An OHDSI approach
Keloth V, Banda J, Gurley M, Heider P, Kennedy G, Liu H, Liu F, Miller T, Natarajan K, V Patterson O, Peng Y, Raja K, Reeves R, Rouhizadeh M, Shi J, Wang X, Wang Y, Wei W, Williams A, Zhang R, Belenkaya R, Reich C, Blacketer C, Ryan P, Hripcsak G, Elhadad N, Xu H. Representing and utilizing clinical textual data for real world studies: An OHDSI approach. Journal Of Biomedical Informatics 2023, 142: 104343. PMID: 36935011, PMCID: PMC10428170, DOI: 10.1016/j.jbi.2023.104343.
Peer-Reviewed Original Research
Concepts: Natural language processing; Common data model; Textual data; NLP solution; Observational Health Data Sciences; OMOP Common Data Model; Specific use cases; Observational Medical Outcomes Partnership Common Data Model; Health Data Sciences; Representation of information; Use cases; Electronic health records; Real-world evidence generation; Data science; Clinical text; Data model; Clinical notes; Language processing; Health records; Load data; Clinical documentation; Current applications; Information; Workflow; Evidence generation
2022
Assess the documentation of cognitive tests and biomarkers in electronic health records via natural language processing for Alzheimer’s disease and related dementias
Chen Z, Zhang H, Yang X, Wu S, He X, Xu J, Guo J, Prosperi M, Wang F, Xu H, Chen Y, Hu H, DeKosky S, Farrer M, Guo Y, Wu Y, Bian J. Assess the documentation of cognitive tests and biomarkers in electronic health records via natural language processing for Alzheimer’s disease and related dementias. International Journal Of Medical Informatics 2022, 170: 104973. PMID: 36577203, PMCID: PMC11325083, DOI: 10.1016/j.ijmedinf.2022.104973.
Peer-Reviewed Original Research
Concepts: Electronic health records; Patients' electronic health records; Cognitive tests; Cognitive test scores; Florida health system; Severity categories; Health records; AD-related dementia; AD/ADRD research; AD/ADRD; Patient level; Alzheimer's disease; Clinical narratives; Health system; Biomarkers; Different severity; Disease; Severity; Patients; ADRD research; Standardized approach; Dementia; Test scores; Population characteristics; Scores
ClinicalLayoutLM: A Pre-trained Multi-modal Model for Understanding Scanned Document in Electronic Health Records
Wei Q, Zuo X, Anjum O, Hu Y, Denlinger R, Bernstam E, Citardi M, Xu H. ClinicalLayoutLM: A Pre-trained Multi-modal Model for Understanding Scanned Document in Electronic Health Records. 2022, 00: 2821-2827. DOI: 10.1109/bigdata55660.2022.10020569.
Peer-Reviewed Original Research
Concepts: Optical character recognition; Multi-modal model; Electronic health records; Clinical documents; Natural language processing tasks; Information extraction technology; Pre-trained models; Health records; Language processing tasks; Information extraction; Image information; F1 score; Character recognition; Layout analysis; Processing tasks; Multi-modal approach; Clinical corpus; Baseline model; Documents; Open domain; Task; Extraction technology; Clinical operations; Different categories; Text
2020
Efficient and Accurate Extracting of Unstructured EHRs on Cancer Therapy Responses for the Development of RECIST Natural Language Processing Tools: Part I, the Corpus
Li Y, Luo Y, Wampfler J, Rubinstein S, Tiryaki F, Ashok K, Warner J, Xu H, Yang P. Efficient and Accurate Extracting of Unstructured EHRs on Cancer Therapy Responses for the Development of RECIST Natural Language Processing Tools: Part I, the Corpus. JCO Clinical Cancer Informatics 2020, 4: cci.19.00147. PMID: 32364754, PMCID: PMC7265793, DOI: 10.1200/cci.19.00147.
Peer-Reviewed Original Research
Concepts: Natural language processing tools; Electronic health records; Language processing tools; Gold standard data; Unstructured electronic health records; Processing tools; Amount of data; Clinical notes; Standard data; Mayo Clinic electronic health records; Clinic's electronic health record; Environment tools; Accurate annotation; Health records; Informatics tools; Effective analysis; Data sets; Textual sources; Corpus; Tool; Information; Data extraction; Set; Extracting; Annotation