2020
COVID-19 TestNorm: A tool to normalize COVID-19 testing names to LOINC codes
Dong X, Li J, Soysal E, Bian J, DuVall S, Hanchrow E, Liu H, Lynch K, Matheny M, Natarajan K, Ohno-Machado L, Pakhomov S, Reeves R, Sitapati A, Abhyankar S, Cullen T, Deckard J, Jiang X, Murphy R, Xu H. COVID-19 TestNorm: A tool to normalize COVID-19 testing names to LOINC codes. Journal of the American Medical Informatics Association 2020, 27: 1437-1442. PMID: 32569358, PMCID: PMC7337837, DOI: 10.1093/jamia/ocaa145.
Peer-Reviewed Original Research
Concepts: Electronic health records; LOINC codes; Secondary use; Rule-based tool; Online web application; Open-source package; Critical data elements; Web application; Data networks; End users; Data elements; Independent test set; Health records; Test set; Key challenges; Data normalization; Critical resources; Test names; Routine clinical practice data; Code; Clinical practice data; Coronavirus disease 2019; COVID-19 diagnostic tests; Tool; Developers
2015
A Study of Neural Word Embeddings for Named Entity Recognition in Clinical Text.
Wu Y, Xu J, Jiang M, Zhang Y, Xu H. A Study of Neural Word Embeddings for Named Entity Recognition in Clinical Text. AMIA Annual Symposium Proceedings 2015, 2015: 1326-33. PMID: 26958273, PMCID: PMC4765694.
Peer-Reviewed Original Research
MeSH Keywords: Algorithms; Data Curation; Humans; Natural Language Processing; Pattern Recognition, Automated; Semantics; Terminology as Topic
Concepts: Named Entity Recognition; Clinical NER system; Neural word embeddings; Clinical Named Entity Recognition; Word embeddings; NER system; Word representations; I2b2 data; Entity recognition; Embedding features; Clinical text; Natural language processing research; Conditional Random Fields; Language processing research; Word embedding features; Large unlabeled corpus; Brown clusters; Neural word; Important patient information; Feature representation; F1 score; Intelligent monitoring; Critical task; Unlabeled corpus; Semantic relations
Named Entity Recognition in Chinese Clinical Text Using Deep Neural Network.
Wu Y, Jiang M, Lei J, Xu H. Named Entity Recognition in Chinese Clinical Text Using Deep Neural Network. 2015, 216: 624-8. PMID: 26262126, PMCID: PMC4624324.
Peer-Reviewed Original Research
Concepts: Deep neural networks; Large unlabeled corpus; Named Entity Recognition; Word embeddings; Unlabeled corpus; Unsupervised learning; Entity recognition; Neural network; Natural language processing technology; Novel deep learning method; Language processing technology; Deep learning methods; Unsupervised feature learning; Feature engineering approach; Important healthcare information; Chinese clinical text; Types of entities; Feature learning; NER task; Clinical text; Learning methods; Clinical documents; CRF model; Healthcare information; Free text
2014
Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks
Tang B, Cao H, Wang X, Chen Q, Xu H. Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks. BioMed Research International 2014, 2014: 240403. PMID: 24729964, PMCID: PMC3963372, DOI: 10.1155/2014/240403.
Peer-Reviewed Original Research
Concepts: Biomedical Named Entity Recognition; Word representations; Named Entity Recognition (NER) task; Machine learning-based approach; Word representation features; Natural language processing; Learning-based approach; Entity recognition task; Named Entity Recognition; Cluster-based representation; JNLPBA corpus; Entity recognition; Biomedical domain; F-measure; Language processing; Representation features; Word embeddings; Recognition task; WR algorithm; Distributional representations; Task; Better performance; Algorithm; Representation; Different types
2012
Extracting semantic lexicons from discharge summaries using machine learning and the C-Value method.
Jiang M, Denny J, Tang B, Cao H, Xu H. Extracting semantic lexicons from discharge summaries using machine learning and the C-Value method. AMIA Annual Symposium Proceedings 2012, 2012: 409-16. PMID: 23304311, PMCID: PMC3540581.
Peer-Reviewed Original Research
2011
Applying semantic-based probabilistic context-free grammar to medical language processing – A preliminary study on parsing medication sentences
Xu H, AbdelRahman S, Lu Y, Denny J, Doan S. Applying semantic-based probabilistic context-free grammar to medical language processing – A preliminary study on parsing medication sentences. Journal of Biomedical Informatics 2011, 44: 1068-1075. PMID: 21856440, PMCID: PMC3226929, DOI: 10.1016/j.jbi.2011.08.009.
Peer-Reviewed Original Research
2007
Using contextual and lexical features to restructure and validate the classification of biomedical concepts
Fan J, Xu H, Friedman C. Using contextual and lexical features to restructure and validate the classification of biomedical concepts. BMC Bioinformatics 2007, 8: 264. PMID: 17650333, PMCID: PMC2014782, DOI: 10.1186/1471-2105-8-264.
Peer-Reviewed Original Research
MeSH Keywords: Biomedical Research; Medical Informatics; Semantics; Software; Terminology as Topic; Unified Medical Language System
Concepts: Unified Medical Language System; String-based approaches; Mean reciprocal rank; Reciprocal rank; Natural language processing; Error rate; Contextual features; Lexical features; Integration of data; Low error rate; Reasoning system; Automatic approach; Complementary classifiers; Language processing; Classification approach; Biomedical terminologies; Classification error; Ontological concepts; Biomedical concepts; Ontological terms; Syntactic approach; Language system; Classifier; Syntactic features; Ontology
Gene symbol disambiguation using knowledge-based profiles
Xu H, Fan J, Hripcsak G, Mendonça E, Markatou M, Friedman C. Gene symbol disambiguation using knowledge-based profiles. Bioinformatics 2007, 23: 1015-1022. PMID: 17314123, DOI: 10.1093/bioinformatics/btm056.
Peer-Reviewed Original Research
Concepts: Knowledge sources; Similarity scores; Information retrieval methods; Gene symbol disambiguation; Text mining system; Knowledge-based profiles; Testing data sets; Biomedical entities; Biomedical domain; MEDLINE abstracts; High similarity scores; Retrieval method; Ambiguous genes; Entrez Gene; Gene symbols; Disambiguation task; Testing set
2006
Natural language processing and visualization in the molecular imaging domain
Tulipano P, Tao Y, Millar W, Zanzonico P, Kolbert K, Xu H, Yu H, Chen L, Lussier Y, Friedman C. Natural language processing and visualization in the molecular imaging domain. Journal of Biomedical Informatics 2006, 40: 270-281. PMID: 17084109, DOI: 10.1016/j.jbi.2006.08.002.
Peer-Reviewed Original Research
MeSH Keywords: Animals; Cell Line; Computational Biology; Databases, Bibliographic; Databases, Genetic; Diagnostic Imaging; Genomics; Humans; Information Storage and Retrieval; Natural Language Processing; Phenotype; Programming Languages; Software; Systems Integration; Terminology as Topic; User-Computer Interface; Vocabulary, Controlled
Concepts: Imaging domain; Natural language processing systems; Natural language processing; Language processing system; Java viewer; NLP systems; Formal evaluation studies; Language processing; Information resources; Processing system; Medical imaging; Index images; System performance; Biological information; Information; Images; Visualization; BioMedLEE; Performance; NLP; Evaluation study; Domain; Genomics literature; System; Simultaneous visualization
Machine learning and word sense disambiguation in the biomedical domain: design and evaluation issues
Xu H, Markatou M, Dimova R, Liu H, Friedman C. Machine learning and word sense disambiguation in the biomedical domain: design and evaluation issues. BMC Bioinformatics 2006, 7: 334. PMID: 16822321, PMCID: PMC1550263, DOI: 10.1186/1471-2105-7-334.
Peer-Reviewed Original Research
Concepts: Natural language processing; Biomedical domain; Information retrieval systems; ML methods; WSD classifier; Sense disambiguation; Machine learning methods; Vector machine classifier; Error rate; Word sense disambiguation; Retrieval system; Machine learning; ML techniques; Text mining; Biomedical abbreviations; Language processing; Learning methods; Cross-validation method; WSD problem; Machine classifier; Accurate access; Sense distribution; Classifier; Biomolecular entities; WSD task