2021
The application of artificial intelligence and data integration in COVID-19 studies: a scoping review
Guo Y, Zhang Y, Lyu T, Prosperi M, Wang F, Xu H, Bian J. The application of artificial intelligence and data integration in COVID-19 studies: a scoping review. Journal Of The American Medical Informatics Association 2021, 28: 2050-2067. PMID: 34151987, PMCID: PMC8344463, DOI: 10.1093/jamia/ocab098.
Peer-Reviewed Original Research
Concepts: AI applications; Artificial intelligence; Data integration; Heterogeneous data; Social media data analysis; Most AI applications; Heterogeneous data sources; Media data analysis; Proteomics data analysis; AI algorithms; AI framework; Electronic health records; Heterogenous data; Biased algorithms; Health records; COVID-19 research; Data analysis; Single-source approach; Research topic; Data sources; Research area; Intelligence; Surveillance system; Different sources; Algorithm
2020
A study of entity-linking methods for normalizing Chinese diagnosis and procedure terms to ICD codes
Wang Q, Ji Z, Wang J, Wu S, Lin W, Li W, Ke L, Xiao G, Jiang Q, Xu H, Zhou Y. A study of entity-linking methods for normalizing Chinese diagnosis and procedure terms to ICD codes. Journal Of Biomedical Informatics 2020, 105: 103418. PMID: 32298846, DOI: 10.1016/j.jbi.2020.103418.
Peer-Reviewed Original Research
Concepts: BM25 algorithm; Concept ranking; Concept generation; Convolutional neural network approach; Neural network approach; Ranking-based method; Ranking method; Support vector machine; Procedure terms; Better performance; Vector machine; Different algorithms; Medical coding; Network approach; Algorithm; ICD codes; BERT; Extended version; Good accuracy; Knowledgebase; Disease terms; Clinical terms; Match criteria; Code; Chinese diagnosis
2019
Cost-aware active learning for named entity recognition in clinical text
Wei Q, Chen Y, Salimi M, Denny J, Mei Q, Lasko T, Chen Q, Wu S, Franklin A, Cohen T, Xu H. Cost-aware active learning for named entity recognition in clinical text. Journal Of The American Medical Informatics Association 2019, 26: 1314-1322. PMID: 31294792, PMCID: PMC6798575, DOI: 10.1093/jamia/ocz102.
Peer-Reviewed Original Research
Concepts: Annotation cost; User study; Active learning; AL methods; AL algorithm; Cost-CAUSE; Real-world environments; Annotation task; Annotation time; Annotation accuracy; Entity recognition; Clinical text; Annotation data; Passive learning; Informative examples; Curve score; Most approaches; Simulation area; Users; Syntactic features; Learning; Cost measures; Algorithm; Cost; Annotation
Cost-sensitive Active Learning for Phenotyping of Electronic Health Records.
Ji Z, Wei Q, Franklin A, Cohen T, Xu H. Cost-sensitive Active Learning for Phenotyping of Electronic Health Records. AMIA Joint Summits On Translational Science Proceedings 2019, 2019: 829-838. PMID: 31259040, PMCID: PMC6568101.
Peer-Reviewed Original Research
Concepts: Annotation time; Electronic health records; Active learning; Machine learning-based methods; Cost-sensitive active learning; Large annotated dataset; Learning-based methods; Health records; Use cases; Annotated dataset; User 1; AL algorithm; User 2; Phenotyping algorithm; AL approach; Secondary use; Algorithm; Better performance; Actual time; Learning; Experimental results; Breast cancer patients; Dataset; Model performance; Passive learning
2017
Accurate Identification of Fatty Liver Disease in Data Warehouse Utilizing Natural Language Processing
Redman J, Natarajan Y, Hou J, Wang J, Hanif M, Feng H, Kramer J, Desiderio R, Xu H, El-Serag H, Kanwal F. Accurate Identification of Fatty Liver Disease in Data Warehouse Utilizing Natural Language Processing. Digestive Diseases And Sciences 2017, 62: 2713-2718. PMID: 28861720, DOI: 10.1007/s10620-017-4721-9.
Peer-Reviewed Original Research
Concepts: Data warehouse; Fatty liver disease; Language processing; Natural language processing; Liver disease; F-measure; Algorithm development; Veterans Affairs Corporate Data Warehouse; Magnetic resonance imaging reports; Outcomes of patients; Algorithm; Expert radiologists; Validation method; Electronic medical records; Corporate Data Warehouse; Warehouse; Abdominal ultrasound; Manual review; Hepatic steatosis; Medical records; Random national sample; Clinical studies; Large cohort; Computerized tomography; Imaging reports
An active learning-enabled annotation system for clinical named entity recognition
Chen Y, Lask T, Mei Q, Chen Q, Moon S, Wang J, Nguyen K, Dawodu T, Cohen T, Denny J, Xu H. An active learning-enabled annotation system for clinical named entity recognition. BMC Medical Informatics And Decision Making 2017, 17: 82. PMID: 28699546, PMCID: PMC5506567, DOI: 10.1186/s12911-017-0466-9.
Peer-Reviewed Original Research
Concepts: Novel AL algorithm; AL algorithm; Annotation time; User study; Entity recognition; Annotation system; Natural language processing models; Language processing models; Annotation cost; Medical domain; Annotation process; Different users; NER model; Processing model; Algorithm; AL methods; ResultsThe simulation results; Users; Simulation results; Information content; Future work; Recognition; Large number; System; Real-life setting
Interweaving Domain Knowledge and Unsupervised Learning for Psychiatric Stressor Extraction from Clinical Notes
Zhang O, Zhang Y, Xu J, Roberts K, Zhang X, Xu H. Interweaving Domain Knowledge and Unsupervised Learning for Psychiatric Stressor Extraction from Clinical Notes. Lecture Notes In Computer Science 2017, 10351: 396-406. DOI: 10.1007/978-3-319-60045-1_41.
Peer-Reviewed Original Research
Concepts: Natural language processing systems; Word representation features; Psychiatric stressors; Language processing system; Deep learning; Domain knowledge; Electronic health records; Unsupervised learning; Inexact matching; Clinical notes; F-measure; Representation features; Processing system; Health records; Psychiatric notes; Important problem; Multiple sources; Experimental results; Learning; Algorithm; Challenges; Matching; Narrative text; Stressor data; Recall
2014
Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks
Tang B, Cao H, Wang X, Chen Q, Xu H. Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks. BioMed Research International 2014, 2014: 240403. PMID: 24729964, PMCID: PMC3963372, DOI: 10.1155/2014/240403.
Peer-Reviewed Original Research
Concepts: Biomedical Named Entity Recognition; Word representations; Named Entity Recognition (NER) task; Machine learning-based approach; Word representation features; Natural language processing; Learning-based approach; Entity recognition task; Named Entity Recognition; Cluster-based representation; JNLPBA corpus; Entity recognition; Biomedical domain; F-measure; Language processing; Representation features; Word embeddings; Recognition task; WR algorithm; Distributional representations; Task; Better performance; Algorithm; Representation; Different types
Chapter 12 Linking Genomic and Clinical Data for Discovery and Personalized Care
Denny J, Xu H. Chapter 12 Linking Genomic and Clinical Data for Discovery and Personalized Care. 2014, 395-424. DOI: 10.1016/b978-0-12-401678-1.00012-9.
Peer-Reviewed Original Research
Concepts: Electronic health records; EHR data; Natural language processing; Such algorithms; Language processing; Decision support; Phenotype algorithms; Ideal repository; Health records; Number of challenges; Repository; Algorithm; Clinical notes; Clinical care; Clinical documentation; Genomic data; Result data; Accurate case; DNA biobanks; Early demonstration projects; Health care quality; Clinical records; Medication records; Clinical data; Tool
2013
Applying active learning to high-throughput phenotyping algorithms for electronic health records data
Chen Y, Carroll R, Hinz E, Shah A, Eyler A, Denny J, Xu H. Applying active learning to high-throughput phenotyping algorithms for electronic health records data. Journal Of The American Medical Informatics Association 2013, 20: e253-e259. PMID: 23851443, PMCID: PMC3861916, DOI: 10.1136/amiajnl-2013-001945.
Peer-Reviewed Original Research
Concepts: Active learning; Unrefined features; Supervised Machine Learning Algorithms; Refined features; Phenotyping algorithm; Electronic health record data; Machine Learning Algorithms; Health record data; Venous thromboembolism; Rheumatoid arthritis; Feature engineering; Domain experts; Domain knowledge; Phenotyping tasks; Learning algorithm; Feature sets; Learning approach; Colorectal cancer; AL approach; Curve score; Passive learning approach; High-throughput phenotyping methods; Algorithm; Small set; Record data
Word Sense Disambiguation of clinical abbreviations with hyperdimensional computing.
Moon S, Berster B, Xu H, Cohen T. Word Sense Disambiguation of clinical abbreviations with hyperdimensional computing. AMIA Annual Symposium Proceedings 2013, 2013: 1007-16. PMID: 24551390, PMCID: PMC3900125.
Peer-Reviewed Original Research
Concepts: Word sense disambiguation; Average accuracy; Sense disambiguation; Word sense disambiguation algorithm; Support vector machine; Hyperdimensional Computing; Naïve Bayes; Common machine; Clinical documents; Vector machine; Disambiguation algorithm; Clinical abbreviations; Medical information; Accurate extraction; Algorithm; Disambiguation; Machine; Such approaches; Clinical notes; Present new approach; Vector transformation; New approach; Ambiguous terms; Computing; Accuracy
2012
Advances in systems biology: computational algorithms and applications
Huang Y, Zhao Z, Xu H, Shyr Y, Zhang B. Advances in systems biology: computational algorithms and applications. BMC Systems Biology 2012, 6: s1. PMID: 23281622, PMCID: PMC3524016, DOI: 10.1186/1752-0509-6-s3-s1.
Peer-Reviewed Original Research
Portability of an algorithm to identify rheumatoid arthritis in electronic health records
Carroll R, Thompson W, Eyler A, Mandelin A, Cai T, Zink R, Pacheco J, Boomershine C, Lasko T, Xu H, Karlson E, Perez R, Gainer V, Murphy S, Ruderman E, Pope R, Plenge R, Kho A, Liao K, Denny J. Portability of an algorithm to identify rheumatoid arthritis in electronic health records. Journal Of The American Medical Informatics Association 2012, 19: e162-e169. PMID: 22374935, PMCID: PMC3392871, DOI: 10.1136/amiajnl-2011-000583.
Peer-Reviewed Original Research
Extracting epidemiologic exposure and outcome terms from literature using machine learning approaches.
Lu Y, Xu H, Peterson N, Dai Q, Jiang M, Denny J, Liu M. Extracting epidemiologic exposure and outcome terms from literature using machine learning approaches. International Journal Of Data Mining And Bioinformatics 2012, 6: 447-59. PMID: 23155773, DOI: 10.1504/ijdmb.2012.049284.
Peer-Reviewed Original Research
2011
Detecting abbreviations in discharge summaries using machine learning methods.
Wu Y, Rosenbloom S, Denny J, Miller R, Mani S, Giuse D, Xu H. Detecting abbreviations in discharge summaries using machine learning methods. AMIA Annual Symposium Proceedings 2011, 2011: 1541-9. PMID: 22195219, PMCID: PMC3243185.
Peer-Reviewed Original Research
Concepts: Natural language processing; Machine learning methods; Highest F-measure; F-measure; Clinical natural language processing; Lexical resources; Clinical abbreviations; Training set; Pre-defined features; Random forest classifier; Domain experts; ML algorithms; ML classifiers; Language processing; Voting scheme; Learning methods; Discharge summaries; Forest classifier; Test set; Classifier; Corpus-based method; Set; Resources; Algorithm; Abbreviations
2009
Development of a natural language processing system to identify timing and status of colonoscopy testing in electronic medical records.
Denny J, Peterson J, Choma N, Xu H, Miller R, Bastarache L, Peterson N. Development of a natural language processing system to identify timing and status of colonoscopy testing in electronic medical records. AMIA Annual Symposium Proceedings 2009, 2009: 141. PMID: 20351837, PMCID: PMC2815478.
Peer-Reviewed Original Research
Concepts: Natural language processing; Natural language processing systems; Electronic medical records; Language processing system; NLP systems; Identifier system; Language processing; Medical records; Processing system; Electronic texts; Colorectal cancer screening rates; Cancer screening rates; Primary care population; Colonoscopy testing; Screening rates; Care population; Billing codes; Queries; Colonoscopy; System; Status indicators; Algorithm; Code; Processing; Status