2023
Representing and utilizing clinical textual data for real world studies: An OHDSI approach
Keloth V, Banda J, Gurley M, Heider P, Kennedy G, Liu H, Liu F, Miller T, Natarajan K, V Patterson O, Peng Y, Raja K, Reeves R, Rouhizadeh M, Shi J, Wang X, Wang Y, Wei W, Williams A, Zhang R, Belenkaya R, Reich C, Blacketer C, Ryan P, Hripcsak G, Elhadad N, Xu H. Representing and utilizing clinical textual data for real world studies: An OHDSI approach. Journal Of Biomedical Informatics 2023, 142: 104343. PMID: 36935011, PMCID: PMC10428170, DOI: 10.1016/j.jbi.2023.104343.
Peer-Reviewed Original Research
Concepts: Natural language processing; Common data model; Textual data; NLP solution; Observational Health Data Sciences; OMOP Common Data Model; Specific use cases; Observational Medical Outcomes Partnership Common Data Model; Health Data Sciences; Representation of information; Use cases; Electronic health records; Real-world evidence generation; Data science; Clinical text; Data model; Clinical notes; Language processing; Health records; Load data; Clinical documentation; Current applications; Information; Workflow; Evidence generation
2021
COVID-19 SignSym: a fast adaptation of a general clinical NLP tool to identify and normalize COVID-19 signs and symptoms to OMOP common data model
Wang J, Abu-El-Rub N, Gray J, Pham H, Zhou Y, Manion F, Liu M, Song X, Xu H, Rouhizadeh M, Zhang Y. COVID-19 SignSym: a fast adaptation of a general clinical NLP tool to identify and normalize COVID-19 signs and symptoms to OMOP common data model. Journal Of The American Medical Informatics Association 2021, 28: 1275-1283. PMID: 33674830, PMCID: PMC7989301, DOI: 10.1093/jamia/ocab015.
Peer-Reviewed Original Research
Concepts: Natural language processing tools; Common data model; Language processing tools; Electronic health records; Clinical natural language processing tools; Data model; Deep learning-based model; Processing tools; OMOP Common Data Model; Pattern-based rules; Observational Medical Outcomes Partnership Common Data Model; Learning-based models; Specific information needs; Use cases; NLP tools; Clinical text; Free text; Extensive evaluation; Downloadable package; Information needs; Hybrid approach; Research community; Health records; Data sources; High performance
2019
Cost-sensitive Active Learning for Phenotyping of Electronic Health Records.
Ji Z, Wei Q, Franklin A, Cohen T, Xu H. Cost-sensitive Active Learning for Phenotyping of Electronic Health Records. AMIA Joint Summits On Translational Science Proceedings 2019, 2019: 829-838. PMID: 31259040, PMCID: PMC6568101.
Peer-Reviewed Original Research
Concepts: Annotation time; Electronic health records; Active learning; Machine learning-based methods; Cost-sensitive active learning; Large annotated dataset; Learning-based methods; Health records; Use cases; Annotated dataset; User 1; AL algorithm; User 2; Phenotyping algorithm; AL approach; Secondary use; Algorithm; Better performance; Actual time; Learning; Experimental results; Breast cancer patients; Dataset; Model performance; Passive learning
2017
CLAMP – a toolkit for efficiently building customized clinical natural language processing pipelines
Soysal E, Wang J, Jiang M, Wu Y, Pakhomov S, Liu H, Xu H. CLAMP – a toolkit for efficiently building customized clinical natural language processing pipelines. Journal Of The American Medical Informatics Association 2017, 25: 331-336. PMID: 29186491, PMCID: PMC7378877, DOI: 10.1093/jamia/ocx132.
Peer-Reviewed Original Research
Concepts: Graphic user interface; User interface; User-friendly graphic user interface; Natural language processing systems; Clinical natural language processing (NLP) systems; Natural language processing pipeline; Knowledge Extraction System; Language processing pipeline; Clinical Text Analysis; Language processing system; NLP components; NLP toolkit; Information extraction; NLP pipeline; Use cases; Entity recognition; Clinical text; End users; NLP community; Processing pipeline; Processing system; Individual tasks; Individual applications; Text analysis; Better performance
Search Datasets in Literature: A Case Study of GWAS.
Dong X, Zhang Y, Xu H. Search Datasets in Literature: A Case Study of GWAS. AMIA Joint Summits On Translational Science Proceedings 2017, 2017: 40-49. PMID: 28815103, PMCID: PMC5543360.
Peer-Reviewed Original Research
Concepts: Recognition system; MEDLINE abstracts; Dataset search engine; Pattern-based rules; Text mining methods; Data sets; Underlying data set; Search datasets; Data discoverability; Use cases; Search engines; Dataset attributes; Mining methods; F-measure; Domain dictionary; Scalable approach; Hybrid approach; DatasetFinder; Retrieving literature; Discoverability; Ultimate goal; Case study; Set; Scientific publications
Knowledge-Based Approach for Named Entity Recognition in Biomedical Literature: A Use Case in Biomedical Software Identification
Amith M, Zhang Y, Xu H, Tao C. Knowledge-Based Approach for Named Entity Recognition in Biomedical Literature: A Use Case in Biomedical Software Identification. Lecture Notes In Computer Science 2017, 10351: 386-395. DOI: 10.1007/978-3-319-60045-1_40.
Peer-Reviewed Original Research
Concepts: Entity recognition; Natural language processing; Contextual semantic information; Named Entity Recognition; Entity recognition method; Features of ontology; Machine learning approaches; Knowledge-based approach; Software entities; Software names; Information extraction; Use cases; Biomedical software; Semantic information; Software identification; Language processing; Recognition method; Learning approach; Biomedical literature; Recognition; Ontology; Entities; Software; Research abstracts; Task
CATTLE (CAncer treatment treasury with linked evidence): An integrated knowledge base for personalized oncology research and practice
Soysal E, Lee H, Zhang Y, Huang L, Chen X, Wei Q, Zheng W, Chang J, Cohen T, Sun J, Xu H. CATTLE (CAncer treatment treasury with linked evidence): An integrated knowledge base for personalized oncology research and practice. CPT Pharmacometrics & Systems Pharmacology 2017, 6: 188-196. PMID: 28296354, PMCID: PMC5351410, DOI: 10.1002/psp4.12174.
Peer-Reviewed Original Research