2020
Opioid2FHIR: A system for extracting FHIR-compatible opioid prescriptions from clinical text
Wang J, Mathews W, Pham H, Xu H, Zhang Y. Opioid2FHIR: A system for extracting FHIR-compatible opioid prescriptions from clinical text. 2020, 00: 1748-1751. DOI: 10.1109/bibm49941.2020.9313258. Peer-Reviewed Original Research.
Fast Healthcare Interoperability Resources; Information extraction; Natural language processing techniques; Language processing techniques; Medical concept normalization; Opioid information; Post-processing rules; Clinical decision support; Manual effort; Concept normalization; Clinical text; F-measure; NLP applications; Prescription records; Clinical data standards; Data standards; Decision support; Free text; Processing tools; Prescription drug monitoring programs; National public health emergency; Processing techniques; Prescription opioid overdose; Drug monitoring programs; Drug overdose deaths
2019
Recognizing software names in biomedical literature using machine learning
Wei Q, Zhang Y, Amith M, Lin R, Lapeyrolerie J, Tao C, Xu H. Recognizing software names in biomedical literature using machine learning. Health Informatics Journal 2019, 26: 21-33. PMID: 31566474, PMCID: PMC7334865, DOI: 10.1177/1460458219869490. Peer-Reviewed Original Research.
Concepts: Software names; F-measure; Natural language processing methods; Biomedical literature; Word representation features; Language processing methods; Entity recognition system; Software catalog; Software repositories; Feature engineering; Biomedical software; Recognition system; Software tools; Biomedical domain; Representation features; MEDLINE abstracts; Word embeddings; Knowledge features; Manual curation; Software; Machine; Processing methods; Best system; Repository; System
Developing Customizable Cancer Information Extraction Modules for Pathology Reports Using CLAMP
Soysal E, Warner J, Wang J, Jiang M, Harvey K, Jain S, Dong X, Song H, Siddhanamatha H, Wang L, Dai Q, Chen Q, Du X, Tao C, Yang P, Denny J, Liu H, Xu H. Developing Customizable Cancer Information Extraction Modules for Pathology Reports Using CLAMP. 2019, 264: 1041-1045. PMID: 31438083, PMCID: PMC7359882, DOI: 10.3233/shti190383. Peer-Reviewed Original Research.
Concepts: Electronic health records; NLP solution; Natural language processing technology; Information extraction module; Language processing technology; Information extraction tasks; User-friendly interface; Best F-measure; Information extraction; Extraction module; Extraction task; Customizable modules; NLP systems; F-measure; Academic use; Health records; Comparable performance; Processing technology; Vanderbilt University Medical Center; Module; Diverse types; Information; NLP; Substantial effort; System
2017
Accurate Identification of Fatty Liver Disease in Data Warehouse Utilizing Natural Language Processing
Redman J, Natarajan Y, Hou J, Wang J, Hanif M, Feng H, Kramer J, Desiderio R, Xu H, El-Serag H, Kanwal F. Accurate Identification of Fatty Liver Disease in Data Warehouse Utilizing Natural Language Processing. Digestive Diseases And Sciences 2017, 62: 2713-2718. PMID: 28861720, DOI: 10.1007/s10620-017-4721-9. Peer-Reviewed Original Research.
Concepts: Data warehouse; Fatty liver disease; Language processing; Natural language processing; Liver disease; F-measure; Algorithm development; Veterans Affairs Corporate Data Warehouse; Magnetic resonance imaging reports; Outcomes of patients; Algorithm; Expert radiologists; Validation method; Electronic medical records; Corporate Data Warehouse; Warehouse; Abdominal ultrasound; Manual review; Hepatic steatosis; Medical records; Random national sample; Clinical studies; Large cohort; Computerized tomography; Imaging reports
Search Datasets in Literature: A Case Study of GWAS.
Dong X, Zhang Y, Xu H. Search Datasets in Literature: A Case Study of GWAS. AMIA Joint Summits On Translational Science Proceedings 2017, 2017: 40-49. PMID: 28815103, PMCID: PMC5543360. Peer-Reviewed Original Research.
Recognition system; MEDLINE abstracts; Dataset search engine; Pattern-based rules; Text mining methods; Data sets; Underlying data set; Search datasets; Data discoverability; Use cases; Search engines; Dataset attributes; Mining methods; F-measure; Domain dictionary; Scalable approach; Hybrid approach; DatasetFinder; Retrieving literature; Discoverability; Ultimate goal; Case study; Set; Scientific publications
Interweaving Domain Knowledge and Unsupervised Learning for Psychiatric Stressor Extraction from Clinical Notes
Zhang O, Zhang Y, Xu J, Roberts K, Zhang X, Xu H. Interweaving Domain Knowledge and Unsupervised Learning for Psychiatric Stressor Extraction from Clinical Notes. Lecture Notes In Computer Science 2017, 10351: 396-406. DOI: 10.1007/978-3-319-60045-1_41. Peer-Reviewed Original Research.
Natural language processing systems; Word representation features; Psychiatric stressors; Language processing system; Deep learning; Domain knowledge; Electronic health records; Unsupervised learning; Inexact matching; Clinical notes; F-measure; Representation features; Processing system; Health records; Psychiatric notes; Important problem; Multiple sources; Experimental results; Learning; Algorithm; Challenges; Matching; Narrative text; Stressor data; Recall
2015
Recognizing Disjoint Clinical Concepts in Clinical Text Using Machine Learning-based Methods.
Tang B, Chen Q, Wang X, Wu Y, Zhang Y, Jiang M, Wang J, Xu H. Recognizing Disjoint Clinical Concepts in Clinical Text Using Machine Learning-based Methods. AMIA Annual Symposium Proceedings 2015, 2015: 1184-93. PMID: 26958258, PMCID: PMC4765674. Peer-Reviewed Original Research.
Classification of Cancer Primary Sites Using Machine Learning and Somatic Mutations
Chen Y, Sun J, Huang L, Xu H, Zhao Z. Classification of Cancer Primary Sites Using Machine Learning and Somatic Mutations. BioMed Research International 2015, 2015: 491502. PMID: 26539502, PMCID: PMC4619847, DOI: 10.1155/2015/491502. Peer-Reviewed Original Research.
Concepts: Machine learning; F-measure; Available big data; Support vector machine; Big data; Vector machine; Classification experiments; Accurate classification; Cancer classification; Gene function information; Machine; Somatic mutation information; Classification; Mutation information; Function information; Learning; Gene symbols; Information; Gene features; Great opportunity; Performance; Somatic mutation data; Mutation data; Accuracy; Prediction
A study of active learning methods for named entity recognition in clinical text
Chen Y, Lasko T, Mei Q, Denny J, Xu H. A study of active learning methods for named entity recognition in clinical text. Journal Of Biomedical Informatics 2015, 58: 11-18. PMID: 26385377, PMCID: PMC4934373, DOI: 10.1016/j.jbi.2015.09.010. Peer-Reviewed Original Research.
Concepts: Clinical NER tasks; Machine learning; Annotation cost; F-measure; Entity recognition; NER task; Active learning; Learning methods; I2b2/VA NLP challenge; Natural language processing systems; Performance of ML; Clinical natural language processing (NLP) systems; Sequential labeling tasks; Supervised machine learning; AL methods; Language processing system; Diversity-based method; Real-time setting; Active learning methods; New AL methods; NER corpus; Domain experts; Uncertainty sampling; Annotation effort; Clinical text
A comparison of conditional random fields and structured support vector machines for chemical entity recognition in biomedical literature
Tang B, Feng Y, Wang X, Wu Y, Zhang Y, Jiang M, Wang J, Xu H. A comparison of conditional random fields and structured support vector machines for chemical entity recognition in biomedical literature. Journal Of Cheminformatics 2015, 7: s8. PMID: 25810779, PMCID: PMC4331698, DOI: 10.1186/1758-2946-7-s1-s8. Peer-Reviewed Original Research.
Machine learning-based systems; Conditional Random Fields; Learning-based system; Entity recognition system; Support vector machine; Entity recognition; Recognition system; F-measure; Challenge organizers; Drug Named Entity Recognition; Vector machine; Structured support vector machine; Micro F-measure; Information extraction tasks; Word representation features; Named Entity Recognition; Test set; Random fields; Primary evaluation measure; Brown clustering; Document indexing; Individual subtasks; Extraction task; Random Indexing; Biomedical domain
2014
Extracting and standardizing medication information in clinical text - the MedEx-UIMA system.
Jiang M, Wu Y, Shah A, Priyanka P, Denny J, Xu H. Extracting and standardizing medication information in clinical text - the MedEx-UIMA system. AMIA Joint Summits On Translational Science Proceedings 2014, 2014: 37-42. PMID: 25954575, PMCID: PMC4419757. Peer-Reviewed Original Research.
Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks
Tang B, Cao H, Wang X, Chen Q, Xu H. Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks. BioMed Research International 2014, 2014: 240403. PMID: 24729964, PMCID: PMC3963372, DOI: 10.1155/2014/240403. Peer-Reviewed Original Research.
Concepts: Biomedical Named Entity Recognition; Word representations; Named Entity Recognition (NER) task; Machine learning-based approach; Word representation features; Natural language processing; Learning-based approach; Entity recognition task; Named Entity Recognition; Cluster-based representation; JNLPBA corpus; Entity recognition; Biomedical domain; F-measure; Language processing; Representation features; Word embeddings; Recognition task; WR algorithm; Distributional representations; Task; Better performance; Algorithm; Representation; Different types
2012
A study of transportability of an existing smoking status detection module across institutions.
Liu M, Shah A, Jiang M, Peterson N, Dai Q, Aldrich M, Chen Q, Bowton E, Liu H, Denny J, Xu H. A study of transportability of an existing smoking status detection module across institutions. AMIA Annual Symposium Proceedings 2012, 2012: 577-86. PMID: 23304330, PMCID: PMC3540509. Peer-Reviewed Original Research.
Concepts: Detection module; Natural language processing systems; Knowledge Extraction System; EMR data; Rule-based classifier; Clinical Text Analysis; Highest F-measure; Language processing system; Electronic medical records; F-measure; Levels of classification; Processing system; Specific tasks; Text analysis; Classifier; Desirable performance; Module; Modest effort; Extraction system; CTAKES; Smoking module; Machine; System; Task; Classification
Clinical entity recognition using structural support vector machines with rich features
Tang B, Cao H, Wu Y, Jiang M, Xu H. Clinical entity recognition using structural support vector machines with rich features. 2012, 13-20. DOI: 10.1145/2390068.2390073. Peer-Reviewed Original Research.
Structural support vector machine; Clinical entity recognition; Support vector machine; Conditional Random Fields; Natural language processing; Entity recognition; Vector machine; Rich features; NLP challenge; Sequential labeling algorithm; Large margin theory; Unsupervised word representations; Clinical text processing; Concept extraction task; Less training time; Highest F-measure; Test set; I2b2 NLP challenge; Extraction task; Typical machine; NER task; Clinical text; Training time; F-measure; Language processing
Extracting epidemiologic exposure and outcome terms from literature using machine learning approaches.
Lu Y, Xu H, Peterson N, Dai Q, Jiang M, Denny J, Liu M. Extracting epidemiologic exposure and outcome terms from literature using machine learning approaches. International Journal Of Data Mining And Bioinformatics 2012, 6: 447-59. PMID: 23155773, DOI: 10.1504/ijdmb.2012.049284. Peer-Reviewed Original Research.
2011
Detecting abbreviations in discharge summaries using machine learning methods.
Wu Y, Rosenbloom S, Denny J, Miller R, Mani S, Giuse D, Xu H. Detecting abbreviations in discharge summaries using machine learning methods. AMIA Annual Symposium Proceedings 2011, 2011: 1541-9. PMID: 22195219, PMCID: PMC3243185. Peer-Reviewed Original Research.
Concepts: Natural language processing; Machine learning methods; Highest F-measure; F-measure; Clinical natural language processing; Lexical resources; Clinical abbreviations; Training set; Pre-defined features; Random forest classifier; Domain experts; ML algorithms; ML classifiers; Language processing; Voting scheme; Learning methods; Discharge summaries; Forest classifier; Test set; Classifier; Corpus-based method; Set; Resources; Algorithm; Abbreviations
Extracting and integrating data from entire electronic health records for detecting colorectal cancer cases.
Xu H, Fu Z, Shah A, Chen Y, Peterson N, Chen Q, Mani S, Levy M, Dai Q, Denny J. Extracting and integrating data from entire electronic health records for detecting colorectal cancer cases. AMIA Annual Symposium Proceedings 2011, 2011: 1564-72. PMID: 22195222, PMCID: PMC3243156. Peer-Reviewed Original Research.
Concepts: Entire electronic health record; Electronic health records; Natural language processing; Health records; Structured EHR data; Machine learning; Text data; Narrative text data; F-measure; Language processing; Clinical narratives; EHR data; Such tasks; Colorectal cancer; Detection method; Concept identification; Cohort of patients; Colorectal cancer cases; Vanderbilt University Hospital; Case detection methods; Clinical notes; CRC patients; CRC cases; University Hospital; Cancer cases
A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries
Jiang M, Chen Y, Liu M, Rosenbloom S, Mani S, Denny J, Xu H. A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries. Journal Of The American Medical Informatics Association 2011, 18: 601-606. PMID: 21508414, PMCID: PMC3168315, DOI: 10.1136/amiajnl-2011-000163. Peer-Reviewed Original Research.
Concepts: Entity extraction system; Center of Informatics; Concept extraction; Integrating Biology; Entity recognition module; Entity recognition system; Conditional Random Fields; Overall F-score; Support vector machine; Rule-based module; Assertion classification; Classification task; Recognition module; Recognition system; ML algorithms; Semantic information; Training data; Clinical text; Natural language; F-measure; Challenge organizers; F-score; Vector machine; Evaluation scripts; Training corpus
2010
MedEx: a medication information extraction system for clinical narratives
Xu H, Stenner S, Doan S, Johnson K, Waitman L, Denny J. MedEx: a medication information extraction system for clinical narratives. Journal Of The American Medical Informatics Association 2010, 17: 19-24. PMID: 20064797, PMCID: PMC2995636, DOI: 10.1197/jamia.m3378. Peer-Reviewed Original Research.
Concepts: Clinic visit notes; Visit notes; Medication information; Clinical notes; Discharge summaries; Electronic medical record data; Medical record data; Electronic medical records; Medication data; Medical records; Clinical data; Clinical research; Record data; Healthcare safety; Drug names; Medex; F-measure; Clinical narratives; Natural language processing systems; Information extraction system