2024
Extracting Systemic Anticancer Therapy and Response Information From Clinical Notes Following the RECIST Definition
Zuo X, Kumar A, Shen S, Li J, Cong G, Jin E, Chen Q, Warner J, Yang P, Xu H. Extracting Systemic Anticancer Therapy and Response Information From Clinical Notes Following the RECIST Definition. JCO Clinical Cancer Informatics 2024, 8: e2300166. PMID: 38885475, DOI: 10.1200/cci.23.00166. Peer-Reviewed Original Research.
MeSH Keywords: Algorithms; Data Mining; Deep Learning; Electronic Health Records; Humans; Machine Learning; Natural Language Processing; Neoplasms; Response Evaluation Criteria in Solid Tumors.
Concepts: Natural language processing; Domain-specific language models; Natural language processing systems; Information extraction system; Rule-based module; Narrative clinical texts; NLP tasks; Entity recognition; Text normalization; Assertion classification; Language model; Information extraction; Clinical text; Electronic health records; Learning-based; Clinical notes; Language processing; Test set; System performance; Health records; Response extraction; Time-consuming; Anticancer therapy; Information; Assessment information.
Mapping Clinical Documents to the Logical Observation Identifiers, Names and Codes (LOINC) Document Ontology using Electronic Health Record Systems Structured Metadata.
Khan H, Mosa A, Paka V, Rana M, Mandhadi V, Islam S, Xu H, McClay J, Sarker S, Rao P, Waitman L. Mapping Clinical Documents to the Logical Observation Identifiers, Names and Codes (LOINC) Document Ontology using Electronic Health Record Systems Structured Metadata. AMIA Annual Symposium Proceedings 2024, 2023: 1017-1026. PMID: 38222329, PMCID: PMC10785913. Peer-Reviewed Original Research.
MeSH Keywords: Documentation; Electronic Health Records; Humans; Logical Observation Identifiers Names and Codes; Metadata.
Concepts: Document ontology; Electronic health records; Bag-of-words approach; Natural language processing techniques; Free-text documents; Language processing techniques; Clinical documentation; Logical Observation Identifiers; Text documents; Structured metadata; Words approach; Computational scalability; Metadata; Health records; EHR documentation; Electronic health record fields; Processing techniques; Ontology; Documents; Automated pipeline; NLP; Scalability; Clinical care; Framework; LOINC.
Standardizing Multi-site Clinical Note Titles to LOINC Document Ontology: A Transformer-based Approach.
Zuo X, Zhou Y, Duke J, Hripcsak G, Shah N, Banda J, Reeves R, Miller T, Waitman L, Natarajan K, Xu H. Standardizing Multi-site Clinical Note Titles to LOINC Document Ontology: A Transformer-based Approach. AMIA Annual Symposium Proceedings 2024, 2023: 834-843. PMID: 38222429, PMCID: PMC10785935. Peer-Reviewed Original Research.
MeSH Keywords: Electronic Health Records; Humans; Information Storage and Retrieval; Logical Observation Identifiers Names and Codes; Semantics.
2023
Prediction of Brain Metastases Development in Patients With Lung Cancer by Explainable Artificial Intelligence From Electronic Health Records
Li Z, Li R, Zhou Y, Rasmy L, Zhi D, Zhu P, Dono A, Jiang X, Xu H, Esquenazi Y, Zheng W. Prediction of Brain Metastases Development in Patients With Lung Cancer by Explainable Artificial Intelligence From Electronic Health Records. JCO Clinical Cancer Informatics 2023, 7: e2200141. PMID: 37018650, PMCID: PMC10281421, DOI: 10.1200/cci.22.00141. Peer-Reviewed Original Research.
MeSH Keywords: Artificial Intelligence; Brain Neoplasms; Early Detection of Cancer; Electronic Health Records; Humans; Lung Neoplasms.
Concepts: Brain metastases; Explainable artificial intelligence; Feature attribution methods; Lung cancer; EHR data; Artificial intelligence; Cerner Health Facts database; BM development; Explainable artificial intelligence approach; Brain metastasis development; Health Facts database; Electronic health record data; Recurrent neural network model; Artificial intelligence approach; Health record data; Model decision process; Structured EHR data; Neural network model; Decision process; Attribution methods; High-quality cohort; Electronic health records; Prompt treatment; Metastasis development; Intelligence approach.
Representing and utilizing clinical textual data for real world studies: An OHDSI approach
Keloth V, Banda J, Gurley M, Heider P, Kennedy G, Liu H, Liu F, Miller T, Natarajan K, V Patterson O, Peng Y, Raja K, Reeves R, Rouhizadeh M, Shi J, Wang X, Wang Y, Wei W, Williams A, Zhang R, Belenkaya R, Reich C, Blacketer C, Ryan P, Hripcsak G, Elhadad N, Xu H. Representing and utilizing clinical textual data for real world studies: An OHDSI approach. Journal Of Biomedical Informatics 2023, 142: 104343. PMID: 36935011, PMCID: PMC10428170, DOI: 10.1016/j.jbi.2023.104343. Peer-Reviewed Original Research.
MeSH Keywords: Data Science; Electronic Health Records; Humans; Medical Informatics; Narration; Natural Language Processing.
Concepts: Natural language processing; Common data model; Textual data; NLP solution; Observational Health Data Sciences; OMOP Common Data Model; Specific use cases; Observational Medical Outcomes Partnership Common Data Model; Health Data Sciences; Representation of information; Use cases; Electronic health records; Real-world evidence generation; Data science; Clinical text; Data model; Clinical notes; Language processing; Health records; Load data; Clinical documentation; Current applications; Information; Workflow; Evidence generation.
2022
Assess the documentation of cognitive tests and biomarkers in electronic health records via natural language processing for Alzheimer’s disease and related dementias
Chen Z, Zhang H, Yang X, Wu S, He X, Xu J, Guo J, Prosperi M, Wang F, Xu H, Chen Y, Hu H, DeKosky S, Farrer M, Guo Y, Wu Y, Bian J. Assess the documentation of cognitive tests and biomarkers in electronic health records via natural language processing for Alzheimer’s disease and related dementias. International Journal Of Medical Informatics 2022, 170: 104973. PMID: 36577203, PMCID: PMC11325083, DOI: 10.1016/j.ijmedinf.2022.104973. Peer-Reviewed Original Research.
MeSH Keywords: Alzheimer Disease; Biomarkers; Documentation; Electronic Health Records; Humans; Natural Language Processing.
Concepts: Electronic health records; Patients' electronic health records; Cognitive tests; Cognitive test scores; Florida health system; Severity categories; Health records; AD-related dementia; AD/ADRD research; AD/ADRD; Patient level; Alzheimer's disease; Clinical narratives; Health system; Biomarkers; Different severity; Disease; Severity; Patients; ADRD research; Standardized approach; Dementia; Test scores; Population characteristics; Scores.
Assessment of Electronic Health Record for Cancer Research and Patient Care Through a Scoping Review of Cancer Natural Language Processing
Wang L, Fu S, Wen A, Ruan X, He H, Liu S, Moon S, Mai M, Riaz I, Wang N, Yang P, Xu H, Warner J, Liu H. Assessment of Electronic Health Record for Cancer Research and Patient Care Through a Scoping Review of Cancer Natural Language Processing. JCO Clinical Cancer Informatics 2022, 6: e2200006. PMID: 35917480, PMCID: PMC9470142, DOI: 10.1200/cci.22.00006. Peer-Reviewed Original Research.
Recurrent neural network models (CovRNN) for predicting outcomes of patients with COVID-19 on admission to hospital: model development and validation using electronic health record data
Rasmy L, Nigo M, Kannadath B, Xie Z, Mao B, Patel K, Zhou Y, Zhang W, Ross A, Xu H, Zhi D. Recurrent neural network models (CovRNN) for predicting outcomes of patients with COVID-19 on admission to hospital: model development and validation using electronic health record data. The Lancet Digital Health 2022, 4: e415-e425. PMID: 35466079, PMCID: PMC9023005, DOI: 10.1016/s2589-7500(22)00049-8. Peer-Reviewed Original Research.
MeSH Keywords: COVID-19; Electronic Health Records; Hospitals; Humans; Neural Networks, Computer; Retrospective Studies.
Concepts: Light Gradient Boost Machine; Feature engineering; Gradient-boosting machine; Multiple machine learning models; Electronic health record data; Neural network-based model; Real-world datasets; Recurrent neural network model; Complex feature engineering; Machine learning models; Binary classification task; Specific feature selection; Logistic regression algorithm; Neural network model; Health record data; Recurrent neural network-based model; Binary classification model; Network-based model; Traditional machine; Extensive data preprocessing; High prediction accuracy; Multiple external datasets; Classification task; Data preprocessing; Feature selection.
2021
Comprehensive Characterization of COVID-19 Patients with Repeatedly Positive SARS-CoV-2 Tests Using a Large U.S. Electronic Health Record Database
Dong X, Zhou Y, Shu X, Bernstam E, Stern R, Aronoff D, Xu H, Lipworth L. Comprehensive Characterization of COVID-19 Patients with Repeatedly Positive SARS-CoV-2 Tests Using a Large U.S. Electronic Health Record Database. Microbiology Spectrum 2021, 9: 10.1128/spectrum.00327-21. PMID: 34406805, PMCID: PMC8552669, DOI: 10.1128/spectrum.00327-21. Peer-Reviewed Original Research.
Concepts: Positive SARS-CoV-2 test; SARS-CoV-2 test; Second positive test; Electronic health record database; Cases of reinfection; Health record database; Positive test; Positive SARS-CoV-2 PCR test results; Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) testing; SARS-CoV-2 PCR test results; Record database; Severe acute respiratory syndrome coronavirus 2; Intensive care unit admission; Acute respiratory syndrome coronavirus 2; SARS-CoV-2 infection; Respiratory syndrome coronavirus 2; Long-term health consequences; Large electronic health record database; Potential long-term health consequences; Care unit admission; Overweight/obese; Chronic medical conditions; Positive molecular test; COVID-19 patients; Syndrome coronavirus 2.
Privacy-protecting, reliable response data discovery using COVID-19 patient observations
Kim J, Neumann L, Paul P, Day M, Aratow M, Bell D, Doctor J, Hinske L, Jiang X, Kim K, Matheny M, Meeker D, Pletcher M, Schilling L, SooHoo S, Xu H, Zheng K, Ohno-Machado L, Anderson D, Anderson N, Balacha C, Bath T, Baxter S, Becker-Pennrich A, Bernstam E, Carter W, Chau N, Choi Y, Covington S, DuVall S, El-Kareh R, Florian R, Follett R, Geisler B, Ghigi A, Gottlieb A, Hu Z, Ir D, Knight T, Koola J, Kuo T, Lee N, Mansmann U, Mou Z, Murphy R, Neumann L, Nguyen N, Niedermayer S, Park E, Perkins A, Post K, Rieder C, Scherer C, Soares A, Soysal E, Tep B, Toy B, Wang B, Wu Z, Zhou Y, Zucker R. Privacy-protecting, reliable response data discovery using COVID-19 patient observations. Journal Of The American Medical Informatics Association 2021, 28: 1765-1776. PMID: 34051088, PMCID: PMC8194878, DOI: 10.1093/jamia/ocab054. Peer-Reviewed Original Research.
COVID-19 SignSym: a fast adaptation of a general clinical NLP tool to identify and normalize COVID-19 signs and symptoms to OMOP common data model
Wang J, Abu-El-Rub N, Gray J, Pham H, Zhou Y, Manion F, Liu M, Song X, Xu H, Rouhizadeh M, Zhang Y. COVID-19 SignSym: a fast adaptation of a general clinical NLP tool to identify and normalize COVID-19 signs and symptoms to OMOP common data model. Journal Of The American Medical Informatics Association 2021, 28: 1275-1283. PMID: 33674830, PMCID: PMC7989301, DOI: 10.1093/jamia/ocab015. Peer-Reviewed Original Research.
MeSH Keywords: COVID-19; Deep Learning; Electronic Health Records; Humans; Information Storage and Retrieval; Natural Language Processing; Symptom Assessment.
Concepts: Natural language processing tools; Common data model; Language processing tools; Electronic health records; Clinical natural language processing tools; Data model; Deep learning-based model; Processing tools; OMOP Common Data Model; Pattern-based rules; Observational Medical Outcomes Partnership Common Data Model; Learning-based models; Specific information needs; Use cases; NLP tools; Clinical text; Free text; Extensive evaluation; Downloadable package; Information needs; Hybrid approach; Research community; Health records; Data sources; High performance.
2020
Representation of EHR data for predictive modeling: a comparison between UMLS and other terminologies
Rasmy L, Tiryaki F, Zhou Y, Xiang Y, Tao C, Xu H, Zhi D. Representation of EHR data for predictive modeling: a comparison between UMLS and other terminologies. Journal Of The American Medical Informatics Association 2020, 27: 1593-1599. PMID: 32930711, PMCID: PMC7647355, DOI: 10.1093/jamia/ocaa180. Peer-Reviewed Original Research.
MeSH Keywords: Aged; Databases, Factual; Electronic Health Records; Female; Humans; Male; Middle Aged; ROC Curve; Unified Medical Language System; Vocabulary, Controlled.
Concepts: Unified Medical Language System; Recurrent neural network; Neural network; Prediction performance; Logistic regression; Predictive modeling; Deep learning; Data aggregation; Electronic health record data; Machine learning; Risk prediction; Better prediction performance; Dengue hemorrhagic fever; Health record data; EHR data; Cancer prediction; Large vocabulary; Different tasks; Predictive model; Heart failure; Diabetes patients; Pancreatic cancer; Clinical data; Hemorrhagic fever; ICD-9.
COVID-19 TestNorm: A tool to normalize COVID-19 testing names to LOINC codes
Dong X, Li J, Soysal E, Bian J, DuVall S, Hanchrow E, Liu H, Lynch K, Matheny M, Natarajan K, Ohno-Machado L, Pakhomov S, Reeves R, Sitapati A, Abhyankar S, Cullen T, Deckard J, Jiang X, Murphy R, Xu H. COVID-19 TestNorm: A tool to normalize COVID-19 testing names to LOINC codes. Journal Of The American Medical Informatics Association 2020, 27: 1437-1442. PMID: 32569358, PMCID: PMC7337837, DOI: 10.1093/jamia/ocaa145. Peer-Reviewed Original Research.
Concepts: Electronic health records; LOINC codes; Secondary use; Rule-based tool; Online web application; Open-source package; Critical data elements; Web application; Data networks; End users; Data elements; Independent test set; Health records; Test set; Key challenges; Data normalization; Critical resources; Test names; Routine clinical practice data; Code; Clinical practice data; Coronavirus disease 2019; COVID-19 diagnostic tests; Tool; Developers.
Learning from local to global: An efficient distributed algorithm for modeling time-to-event data
Duan R, Luo C, Schuemie M, Tong J, Liang C, Chang H, Boland M, Bian J, Xu H, Holmes J, Forrest C, Morton S, Berlin J, Moore J, Mahoney K, Chen Y. Learning from local to global: An efficient distributed algorithm for modeling time-to-event data. Journal Of The American Medical Informatics Association 2020, 27: 1028-1036. PMID: 32626900, PMCID: PMC7647322, DOI: 10.1093/jamia/ocaa044. Peer-Reviewed Original Research.
Time event ontology (TEO): to support semantic representation and reasoning of complex temporal relations of clinical events
Li F, Du J, He Y, Song H, Madkour M, Rao G, Xiang Y, Luo Y, Chen H, Liu S, Wang L, Liu H, Xu H, Tao C. Time event ontology (TEO): to support semantic representation and reasoning of complex temporal relations of clinical events. Journal Of The American Medical Informatics Association 2020, 27: 1046-1056. PMID: 32626903, PMCID: PMC7647306, DOI: 10.1093/jamia/ocaa058. Peer-Reviewed Original Research.
MeSH Keywords: Biological Ontologies; Decision Support Systems, Clinical; Electronic Health Records; Humans; Natural Language Processing; Semantic Web; Time.
Concepts: Time Event Ontology; Complex temporal relations; Event ontology; Natural language processing field; Temporal relations; Time-related queries; Information annotation; Processing field; Temporal information; Data properties; Relation representation; Clinical narratives; Semantic representation; Electronic health record data; Rich set; Health record data; Ontology; Strong capability; Reasoning; Set; Queries; Order relation; Record data; Representation; Primitives.
Efficient and Accurate Extracting of Unstructured EHRs on Cancer Therapy Responses for the Development of RECIST Natural Language Processing Tools: Part I, the Corpus
Li Y, Luo Y, Wampfler J, Rubinstein S, Tiryaki F, Ashok K, Warner J, Xu H, Yang P. Efficient and Accurate Extracting of Unstructured EHRs on Cancer Therapy Responses for the Development of RECIST Natural Language Processing Tools: Part I, the Corpus. JCO Clinical Cancer Informatics 2020, 4: cci.19.00147. PMID: 32364754, PMCID: PMC7265793, DOI: 10.1200/cci.19.00147. Peer-Reviewed Original Research.
MeSH Keywords: Electronic Health Records; Humans; Natural Language Processing; Neoplasms; Response Evaluation Criteria in Solid Tumors.
Concepts: Natural language processing tools; Electronic health records; Language processing tools; Gold standard data; Unstructured electronic health records; Processing tools; Amount of data; Clinical notes; Standard data; Mayo Clinic electronic health records; Clinic's electronic health record; Environment tools; Accurate annotation; Health records; Informatics tools; Effective analysis; Data sets; Textual sources; Corpus; Tool; Information; Data extraction; Set; Extracting; Annotation.
Achievability to Extract Specific Date Information for Cancer Research.
Wang L, Wampfler J, Dispenzieri A, Xu H, Yang P, Liu H. Achievability to Extract Specific Date Information for Cancer Research. AMIA Annual Symposium Proceedings 2020, 2019: 893-902. PMID: 32308886, PMCID: PMC7153063. Peer-Reviewed Original Research.
Electronic Health Records for Drug Repurposing: Current Status, Challenges, and Future Directions
Xu H, Li J, Jiang X, Chen Q. Electronic Health Records for Drug Repurposing: Current Status, Challenges, and Future Directions. Clinical Pharmacology & Therapeutics 2020, 107: 712-714. PMID: 32012237, PMCID: PMC10815929, DOI: 10.1002/cpt.1769. Peer-Reviewed Original Research.
2019
Learning from electronic health records across multiple sites: A communication-efficient and privacy-preserving distributed algorithm
Duan R, Boland M, Liu Z, Liu Y, Chang H, Xu H, Chu H, Schmid C, Forrest C, Holmes J, Schuemie M, Berlin J, Moore J, Chen Y. Learning from electronic health records across multiple sites: A communication-efficient and privacy-preserving distributed algorithm. Journal Of The American Medical Informatics Association 2019, 27: 376-385. PMID: 31816040, PMCID: PMC7025371, DOI: 10.1093/jamia/ocz199. Peer-Reviewed Original Research.
Editorial: The second international workshop on health natural language processing (HealthNLP 2019)
Wang Y, Xu H, Uzuner O. Editorial: The second international workshop on health natural language processing (HealthNLP 2019). BMC Medical Informatics And Decision Making 2019, 19: 233. PMID: 31801516, PMCID: PMC6894102, DOI: 10.1186/s12911-019-0930-9. Peer-Reviewed Original Research.