2024
Extracting Systemic Anticancer Therapy and Response Information From Clinical Notes Following the RECIST Definition
Zuo X, Kumar A, Shen S, Li J, Cong G, Jin E, Chen Q, Warner J, Yang P, Xu H. Extracting Systemic Anticancer Therapy and Response Information From Clinical Notes Following the RECIST Definition. JCO Clinical Cancer Informatics 2024, 8: e2300166. PMID: 38885475, DOI: 10.1200/cci.23.00166.
Peer-Reviewed Original Research
Concepts: Natural language processing; Domain-specific language models; Natural language processing systems; Information extraction system; Rule-based module; Narrative clinical texts; NLP tasks; Entity recognition; Text normalization; Assertion classification; Language model; Information extraction; Clinical text; Electronic health records; Learning-based; Clinical notes; Language processing; Test set; System performance; Health records; Response extraction; Time-consuming; Anticancer therapy; Information; Assessment information
Development of Clinical NLP Systems
Xu H, Demner Fushman D. Development of Clinical NLP Systems. Cognitive Informatics In Biomedicine And Healthcare 2024, 301-324. DOI: 10.1007/978-3-031-55865-8_11.
Peer-Reviewed Original Research
Ensemble pretrained language models to extract biomedical knowledge from literature
Li Z, Wei Q, Huang L, Li J, Hu Y, Chuang Y, He J, Das A, Keloth V, Yang Y, Diala C, Roberts K, Tao C, Jiang X, Zheng W, Xu H. Ensemble pretrained language models to extract biomedical knowledge from literature. Journal Of The American Medical Informatics Association 2024, 31: 1904-1911. PMID: 38520725, PMCID: PMC11339500, DOI: 10.1093/jamia/ocae061.
Peer-Reviewed Original Research
Concepts: Natural language processing; Natural language processing systems; Language model; Expansion of biomedical literature; Zero-shot setting; Manually annotated corpus; Knowledge graph development; Task-specific models; Domain-specific models; Zero-Shot; Entity recognition; Billion parameters; Ensemble learning; Location information; Knowledge bases; Biomedical entities; Language processing; Free text; Graph development; Biomedical concepts; Automated technique; Biomedical literature; Detection method; Predictive performance; Biomedical knowledge
2017
CLAMP – a toolkit for efficiently building customized clinical natural language processing pipelines
Soysal E, Wang J, Jiang M, Wu Y, Pakhomov S, Liu H, Xu H. CLAMP – a toolkit for efficiently building customized clinical natural language processing pipelines. Journal Of The American Medical Informatics Association 2017, 25: 331-336. PMID: 29186491, PMCID: PMC7378877, DOI: 10.1093/jamia/ocx132.
Peer-Reviewed Original Research
Concepts: Graphic user interface; User interface; User-friendly graphic user interface; Natural language processing systems; Clinical natural language processing (NLP) systems; Natural language processing pipeline; Knowledge Extraction System; Language processing pipeline; Clinical Text Analysis; Language processing system; NLP components; NLP toolkit; Information extraction; NLP pipeline; Use cases; Entity recognition; Clinical text; End users; NLP community; Processing pipeline; Processing system; Individual tasks; Individual applications; Text analysis; Better performance
A hybrid approach to automatic de-identification of psychiatric notes
Lee H, Wu Y, Zhang Y, Xu J, Xu H, Roberts K. A hybrid approach to automatic de-identification of psychiatric notes. Journal Of Biomedical Informatics 2017, 75: s19-s27. PMID: 28602904, PMCID: PMC5705430, DOI: 10.1016/j.jbi.2017.06.006.
Peer-Reviewed Original Research
Concepts: Psychiatric notes; CEGS N-GRID; Natural language processing systems; Rule-based component; Task Track 1; Language processing system; Rule-based approach; De-identification; Domain adaptation; Rich features; Processing system; Hybrid approach; N grid; Track 1; Clinical data; Test set; System performance; Machine; Health information; Hybrid system; System; Clinical application; Task; Information; Data
Interweaving Domain Knowledge and Unsupervised Learning for Psychiatric Stressor Extraction from Clinical Notes
Zhang O, Zhang Y, Xu J, Roberts K, Zhang X, Xu H. Interweaving Domain Knowledge and Unsupervised Learning for Psychiatric Stressor Extraction from Clinical Notes. Lecture Notes In Computer Science 2017, 10351: 396-406. DOI: 10.1007/978-3-319-60045-1_41.
Peer-Reviewed Original Research
Concepts: Natural language processing systems; Word representation features; Psychiatric stressors; Language processing system; Deep learning; Domain knowledge; Electronic health records; Unsupervised learning; Inexact matching; Clinical notes; F-measure; Representation features; Processing system; Health records; Psychiatric notes; Important problem; Multiple sources; Experimental results; Learning; Algorithm; Challenges; Matching; Narrative text; Stressor data; Recall
2015
A study of active learning methods for named entity recognition in clinical text
Chen Y, Lasko T, Mei Q, Denny J, Xu H. A study of active learning methods for named entity recognition in clinical text. Journal Of Biomedical Informatics 2015, 58: 11-18. PMID: 26385377, PMCID: PMC4934373, DOI: 10.1016/j.jbi.2015.09.010.
Peer-Reviewed Original Research
Concepts: Clinical NER tasks; Machine learning; Annotation cost; F-measure; Entity recognition; NER task; Active learning; Learning methods; I2b2/VA NLP challenge; Natural language processing systems; Performance of ML; Clinical natural language processing (NLP) systems; Sequential labeling tasks; Supervised machine learning; AL methods; Language processing system; Diversity-based method; Real-time setting; Active learning methods; New AL methods; NER corpus; Domain experts; Uncertainty sampling; Annotation effort; Clinical text
A Preliminary Study of Clinical Abbreviation Disambiguation in Real Time
Wu Y, Denny J, Rosenbloom S, Miller R, Giuse D, Song M, Xu H. A Preliminary Study of Clinical Abbreviation Disambiguation in Real Time. Applied Clinical Informatics 2015, 06: 364-374. PMID: 26171081, PMCID: PMC4493336, DOI: 10.4338/aci-2014-10-ra-0088.
Peer-Reviewed Original Research
Concepts: Electronic health record systems; User study; Clinical documentation system; Natural language processing systems; Clinical NLP systems; Preliminary user study; Abbreviation recognition; Extra time cost; Language processing system; WSD method; Health record systems; Documentation system; Prototype application; Word sense disambiguation method; NLP systems; Correct senses; Note generation; Prototype system; Clinical sentences; Cost of time; Clinical documents; Document entry; Disambiguation module; Sense disambiguation method; Healthcare records
2013
A prototype application for real-time recognition and disambiguation of clinical abbreviations
Wu Y, Denny J, Rosenbloom S, Miller R, Giuse D, Song M, Xu H. A prototype application for real-time recognition and disambiguation of clinical abbreviations. 2013, 7-8. DOI: 10.1145/2512089.2512096.
Peer-Reviewed Original Research
Concepts: Electronic health record systems; Prototype application; Clinical documentation system; Natural language processing systems; Clinical abbreviations; Clinical NLP systems; Real-time recognition; Language processing system; Average response time; Health record systems; Documentation system; Response time; Word sense disambiguation method; NLP systems; Note generation; Prototype system; Clinical documents; Sense disambiguation method; Healthcare records; Processing system; Abbreviation disambiguation; Card system; Disambiguation method; Abbreviation recognition; System design
2012
A study of transportability of an existing smoking status detection module across institutions.
Liu M, Shah A, Jiang M, Peterson N, Dai Q, Aldrich M, Chen Q, Bowton E, Liu H, Denny J, Xu H. A study of transportability of an existing smoking status detection module across institutions. AMIA Annual Symposium Proceedings 2012, 2012: 577-86. PMID: 23304330, PMCID: PMC3540509.
Peer-Reviewed Original Research
Concepts: Detection module; Natural language processing systems; Knowledge Extraction System; EMR data; Rule-based classifier; Clinical Text Analysis; Highest F-measure; Language processing system; Electronic medical records; F-measure; Levels of classification; Processing system; Specific tasks; Text analysis; Classifier; Desirable performance; Module; Modest effort; Extraction system; CTAKES; Smoking module; Machine; System; Task; Classification
Portability of an algorithm to identify rheumatoid arthritis in electronic health records
Carroll R, Thompson W, Eyler A, Mandelin A, Cai T, Zink R, Pacheco J, Boomershine C, Lasko T, Xu H, Karlson E, Perez R, Gainer V, Murphy S, Ruderman E, Pope R, Plenge R, Kho A, Liao K, Denny J. Portability of an algorithm to identify rheumatoid arthritis in electronic health records. Journal Of The American Medical Informatics Association 2012, 19: e162-e169. PMID: 22374935, PMCID: PMC3392871, DOI: 10.1136/amiajnl-2011-000583.
Peer-Reviewed Original Research
2010
MedEx: a medication information extraction system for clinical narratives
Xu H, Stenner S, Doan S, Johnson K, Waitman L, Denny J. MedEx: a medication information extraction system for clinical narratives. Journal Of The American Medical Informatics Association 2010, 17: 19-24. PMID: 20064797, PMCID: PMC2995636, DOI: 10.1197/jamia.m3378.
Peer-Reviewed Original Research
Concepts: Clinic visit notes; Visit notes; Medication information; Clinical notes; Discharge summaries; Electronic medical record data; Medical record data; Electronic medical records; Medication data; Medical records; Clinical data; Clinical research; Record data; Healthcare safety; Drug names; Medex; F-measure; Clinical narratives; Natural language processing systems; Information extraction system
2009
Development of a natural language processing system to identify timing and status of colonoscopy testing in electronic medical records.
Denny J, Peterson J, Choma N, Xu H, Miller R, Bastarache L, Peterson N. Development of a natural language processing system to identify timing and status of colonoscopy testing in electronic medical records. AMIA Annual Symposium Proceedings 2009, 2009: 141. PMID: 20351837, PMCID: PMC2815478.
Peer-Reviewed Original Research
Concepts: Natural language processing; Natural language processing systems; Electronic medical records; Language processing system; NLP systems; Identifier system; Language processing; Medical records; Processing system; Electronic texts; Colorectal cancer screening rates; Cancer screening rates; Primary care population; Colonoscopy testing; Screening rates; Care population; Billing codes; Queries; Colonoscopy; System; Status indicators; Algorithm; Code; Processing; Status
2007
A study of abbreviations in clinical notes.
Xu H, Stetson P, Friedman C. A study of abbreviations in clinical notes. AMIA Annual Symposium Proceedings 2007, 2007: 821-5. PMID: 18693951, PMCID: PMC2655910.
Peer-Reviewed Original Research
Concepts: Unified Medical Language System; Natural language processing systems; Language processing system; Narrative clinical notes; Detection method; Clinical notes; Different knowledge sources; Sense inventory; Domain experts; NLP systems; Correct senses; Decision support; Text corpora; Knowledge sources; Error detection; Processing system; Biomedical literature; Study of abbreviations; Language system; Patient information; Ambiguity rate; Better detection methods; Database; Annotation; Abbreviations
2006
Natural language processing and visualization in the molecular imaging domain
Tulipano P, Tao Y, Millar W, Zanzonico P, Kolbert K, Xu H, Yu H, Chen L, Lussier Y, Friedman C. Natural language processing and visualization in the molecular imaging domain. Journal Of Biomedical Informatics 2006, 40: 270-281. PMID: 17084109, DOI: 10.1016/j.jbi.2006.08.002.
Peer-Reviewed Original Research
MeSH Keywords: Animals; Cell Line; Computational Biology; Databases, Bibliographic; Databases, Genetic; Diagnostic Imaging; Genomics; Humans; Information Storage and Retrieval; Natural Language Processing; Phenotype; Programming Languages; Software; Systems Integration; Terminology as Topic; User-Computer Interface; Vocabulary, Controlled
Concepts: Imaging domain; Natural language processing systems; Natural language processing; Language processing system; Java viewer; NLP systems; Formal evaluation studies; Language processing; Information resources; Processing system; Medical imaging; Index images; System performance; Biological information; Information; Images; Visualization; BioMedLEE; Performance; NLP; Evaluation study; Domain; Genomics literature; System; Simultaneous visualization