2024
Development of Clinical NLP Systems
Xu H, Demner-Fushman D. Development of Clinical NLP Systems. Cognitive Informatics In Biomedicine And Healthcare 2024, 301-324. DOI: 10.1007/978-3-031-55865-8_11. Peer-Reviewed Original Research
2019
Recognizing software names in biomedical literature using machine learning
Wei Q, Zhang Y, Amith M, Lin R, Lapeyrolerie J, Tao C, Xu H. Recognizing software names in biomedical literature using machine learning. Health Informatics Journal 2019, 26: 21-33. PMID: 31566474, PMCID: PMC7334865, DOI: 10.1177/1460458219869490. Peer-Reviewed Original Research
Concepts: Software names; F-measure; Natural language processing methods; Biomedical literature; Word representation features; Language processing methods; Entity recognition system; Software catalog; Software repositories; Feature engineering; Biomedical software; Recognition system; Software tools; Biomedical domain; Representation features; MEDLINE abstracts; Word embeddings; Knowledge features; Manual curation; Software; Machine; Processing methods; Best system; Repository; System
Developing Customizable Cancer Information Extraction Modules for Pathology Reports Using CLAMP
Soysal E, Warner J, Wang J, Jiang M, Harvey K, Jain S, Dong X, Song H, Siddhanamatha H, Wang L, Dai Q, Chen Q, Du X, Tao C, Yang P, Denny J, Liu H, Xu H. Developing Customizable Cancer Information Extraction Modules for Pathology Reports Using CLAMP. 2019, 264: 1041-1045. PMID: 31438083, PMCID: PMC7359882, DOI: 10.3233/shti190383. Peer-Reviewed Original Research
Concepts: Electronic health records; NLP solution; Natural language processing technology; Information extraction module; Language processing technology; Information extraction tasks; User-friendly interface; Best F-measure; Information extraction; Extraction module; Extraction task; Customizable modules; NLP systems; F-measure; Academic use; Health records; Comparable performance; Processing technology; Vanderbilt University Medical Center; Module; Diverse types; Information; NLP; Substantial effort; System
2017
CNN-based ranking for biomedical entity normalization
Li H, Chen Q, Tang B, Wang X, Xu H, Wang B, Huang D. CNN-based ranking for biomedical entity normalization. BMC Bioinformatics 2017, 18: 385. PMID: 28984180, PMCID: PMC5629610, DOI: 10.1186/s12859-017-1805-7. Peer-Reviewed Original Research
Concepts: Biomedical entity normalization; Entity normalization; Semantic information; CNN architecture; Novel convolutional neural network architecture; Convolutional neural network architecture; Traditional rule-based methods; Neural network architecture; Rule-based system; Ranking method; Rule-based method; Network architecture; Biomedical entities; Benchmark datasets; Art performance; Entity mentions; Ranking problem; CNN; Normalization system; Architecture; Morphological information; Comparison results; Information; Dataset; System
An active learning-enabled annotation system for clinical named entity recognition
Chen Y, Lask T, Mei Q, Chen Q, Moon S, Wang J, Nguyen K, Dawodu T, Cohen T, Denny J, Xu H. An active learning-enabled annotation system for clinical named entity recognition. BMC Medical Informatics And Decision Making 2017, 17: 82. PMID: 28699546, PMCID: PMC5506567, DOI: 10.1186/s12911-017-0466-9. Peer-Reviewed Original Research
Concepts: Novel AL algorithm; AL algorithm; Annotation time; User study; Entity recognition; Annotation system; Natural language processing models; Language processing models; Annotation cost; Medical domain; Annotation process; Different users; NER model; Processing model; Algorithm; AL methods; The simulation results; Users; Simulation results; Information content; Future work; Recognition; Large number; System; Real-life setting
A hybrid approach to automatic de-identification of psychiatric notes
Lee H, Wu Y, Zhang Y, Xu J, Xu H, Roberts K. A hybrid approach to automatic de-identification of psychiatric notes. Journal Of Biomedical Informatics 2017, 75: s19-s27. PMID: 28602904, PMCID: PMC5705430, DOI: 10.1016/j.jbi.2017.06.006. Peer-Reviewed Original Research
Concepts: Psychiatric notes; CEGS N-GRID; Natural language processing systems; Rule-based component; Task Track 1; Language processing system; Rule-based approach; De-identification; Domain adaptation; Rich features; Processing system; Hybrid approach; N grid; Track 1; Clinical data; Test set; System performance; Machine; Health information; Hybrid system; System; Clinical application; Task; Information; Data
Information retrieval for biomedical datasets: the 2016 bioCADDIE dataset retrieval challenge
Roberts K, Gururaj A, Chen X, Pournejati S, Hersh W, Demner-Fushman D, Ohno-Machado L, Cohen T, Xu H. Information retrieval for biomedical datasets: the 2016 bioCADDIE dataset retrieval challenge. Database 2017, 2017: bax068. DOI: 10.1093/database/bax068. Peer-Reviewed Original Research
Concepts: Biomedical datasets; Retrieval challenges; Information retrieval techniques; Advanced query processing; Biomedical data repositories; Advanced retrieval methods; Query processing; Information retrieval; Test queries; Retrieval system; Rank framework; Retrieval approach; Retrieval techniques; Data repository; Retrieval method; Top precision; Dataset; Queries; Repository; Challenges; Retrieval; Task; Learning; System; Corpus
2015
Recognizing Disjoint Clinical Concepts in Clinical Text Using Machine Learning-based Methods.
Tang B, Chen Q, Wang X, Wu Y, Zhang Y, Jiang M, Wang J, Xu H. Recognizing Disjoint Clinical Concepts in Clinical Text Using Machine Learning-based Methods. AMIA Annual Symposium Proceedings 2015, 2015: 1184-93. PMID: 26958258, PMCID: PMC4765674. Peer-Reviewed Original Research
2014
Open Source Clinical NLP - More than Any Single System.
Masanz J, Pakhomov S, Xu H, Wu S, Chute C, Liu H. Open Source Clinical NLP - More than Any Single System. AMIA Joint Summits On Translational Science Proceedings 2014, 2014: 76-82. PMID: 25954581, PMCID: PMC4419764. Peer-Reviewed Original Research
Concepts: Clinical NLP; NLP systems; Natural language processing tools; Open source software; Language processing tools; Information technology practices; Pluggable components; Source software; NLP software; Processing capabilities; Processing tools; Research community; NLP; Collaborative community; Single system; Software; Technology practices; NLP activity; UIMA; Annotators; System; Apache; Ongoing activity; Framework; Capability
PhenDisco: phenotype discovery system for the database of genotypes and phenotypes
Doan S, Lin K, Conway M, Ohno-Machado L, Hsieh A, Feupe S, Garland A, Ross M, Jiang X, Farzaneh S, Walker R, Alipanah N, Zhang J, Xu H, Kim H. PhenDisco: phenotype discovery system for the database of genotypes and phenotypes. Journal Of The American Medical Informatics Association 2014, 21: 31-36. PMID: 23989082, PMCID: PMC3912702, DOI: 10.1136/amiajnl-2013-001882. Peer-Reviewed Original Research
Concepts: New information retrieval system; Information retrieval systems; Information retrieval tools; Database of Genotypes; Text processing tools; Retrieval system; Search scenarios; Discovery system; Retrieval tools; Authorized users; Non-standardized way; Cross-study validation; Search comparison; Processing tools; Promising performance; Users; Phenotype information; Database; Information; Biotechnology Information; Queries; Metadata; Entrez; Resources; System
2012
A study of transportability of an existing smoking status detection module across institutions.
Liu M, Shah A, Jiang M, Peterson N, Dai Q, Aldrich M, Chen Q, Bowton E, Liu H, Denny J, Xu H. A study of transportability of an existing smoking status detection module across institutions. AMIA Annual Symposium Proceedings 2012, 2012: 577-86. PMID: 23304330, PMCID: PMC3540509. Peer-Reviewed Original Research
Concepts: Detection module; Natural language processing systems; Knowledge Extraction System; EMR data; Rule-based classifier; Clinical Text Analysis; Highest F-measure; Language processing system; Electronic medical records; F-measure; Levels of classification; Processing system; Specific tasks; Text analysis; Classifier; Desirable performance; Module; Modest effort; Extraction system; CTAKES; Smoking module; Machine; System; Task; Classification
Extracting epidemiologic exposure and outcome terms from literature using machine learning approaches.
Lu Y, Xu H, Peterson N, Dai Q, Jiang M, Denny J, Liu M. Extracting epidemiologic exposure and outcome terms from literature using machine learning approaches. International Journal Of Data Mining And Bioinformatics 2012, 6: 447-59. PMID: 23155773, DOI: 10.1504/ijdmb.2012.049284. Peer-Reviewed Original Research
2011
Data from clinical notes: a perspective on the tension between structure and flexible documentation
Rosenbloom S, Denny J, Xu H, Lorenzi N, Stead W, Johnson K. Data from clinical notes: a perspective on the tension between structure and flexible documentation. Journal Of The American Medical Informatics Association 2011, 18: 181-186. PMID: 21233086, PMCID: PMC3116264, DOI: 10.1136/jamia.2010.007237. Peer-Reviewed Original Research
Concepts: Reusable data; Electronic health record system adoption; Structured documentation; Computer-based documentation systems; Clinical notes; Clinical documentation; Structured data; Text processing; System adoption; Record system; Such systems; Documentation system; Workflow; Content needs; Providers; Usability; Documentation; Expressivity; System; Healthcare providers; Patient care; Data; Processing; Major goal; Adoption
2009
Development of a natural language processing system to identify timing and status of colonoscopy testing in electronic medical records.
Denny J, Peterson J, Choma N, Xu H, Miller R, Bastarache L, Peterson N. Development of a natural language processing system to identify timing and status of colonoscopy testing in electronic medical records. AMIA Annual Symposium Proceedings 2009, 2009: 141. PMID: 20351837, PMCID: PMC2815478. Peer-Reviewed Original Research
Concepts: Natural language processing; Natural language processing systems; Electronic medical records; Language processing system; NLP systems; Identifier system; Language processing; Medical records; Processing system; Electronic texts; Colorectal cancer screening rates; Cancer screening rates; Primary care population; Colonoscopy testing; Screening rates; Care population; Billing codes; Queries; Colonoscopy; System; Status indicators; Algorithm; Code; Processing; Status
2006
Natural language processing and visualization in the molecular imaging domain
Tulipano P, Tao Y, Millar W, Zanzonico P, Kolbert K, Xu H, Yu H, Chen L, Lussier Y, Friedman C. Natural language processing and visualization in the molecular imaging domain. Journal Of Biomedical Informatics 2006, 40: 270-281. PMID: 17084109, DOI: 10.1016/j.jbi.2006.08.002. Peer-Reviewed Original Research
MeSH Keywords: Animals; Cell Line; Computational Biology; Databases, Bibliographic; Databases, Genetic; Diagnostic Imaging; Genomics; Humans; Information Storage and Retrieval; Natural Language Processing; Phenotype; Programming Languages; Software; Systems Integration; Terminology as Topic; User-Computer Interface; Vocabulary, Controlled
Concepts: Imaging domain; Natural language processing systems; Natural language processing; Language processing system; Java viewer; NLP systems; Formal evaluation studies; Language processing; Information resources; Processing system; Medical imaging; Index images; System performance; Biological information; Information; Images; Visualization; BioMedLEE; Performance; NLP; Evaluation study; Domain; Genomics literature; System; Simultaneous visualization