2023
Automated Identification of Missing IS-A Relations in the Human Phenotype Ontology.
Mohtashamian M, Hu R, Abeysinghe R, Hao X, Xu H, Cui L. Automated Identification of Missing IS-A Relations in the Human Phenotype Ontology. AMIA Annual Symposium Proceedings 2023, 2022: 785-794. PMID: 37128366, PMCID: PMC10148310. Peer-Reviewed Original Research
2022
Discovering novel drug-supplement interactions using SuppKG generated from the biomedical literature
Schutte D, Vasilakes J, Bompelli A, Zhou Y, Fiszman M, Xu H, Kilicoglu H, Bishop J, Adam T, Zhang R. Discovering novel drug-supplement interactions using SuppKG generated from the biomedical literature. Journal Of Biomedical Informatics 2022, 131: 104120. PMID: 35709900, PMCID: PMC9335448, DOI: 10.1016/j.jbi.2022.104120. Peer-Reviewed Original Research
MeSH Keywords: Dietary Supplements; Natural Language Processing; PubMed; Semantics; Unified Medical Language System
Concepts: Unified Medical Language System; Comprehensive knowledge graph; Domain terminology; Knowledge graph; Semantic relations; Natural language processing technology; Language processing technology; NLP tools; Downstream tasks; F1 score; Semantic relationships; Discovery patterns; PubMed abstracts; Limited coverage; Biomedical literature; Processing technology; Language system; SemRep; Dietary supplement information; Manual review; Novel methodology; Graph; Nodes; Domain; Task
2020
The UMLS knowledge sources at 30: indispensable to current research and applications in biomedical informatics
Humphreys B, Del Fiol G, Xu H. The UMLS knowledge sources at 30: indispensable to current research and applications in biomedical informatics. Journal Of The American Medical Informatics Association 2020, 27: 1499-1501. PMID: 33059366, PMCID: PMC7647371, DOI: 10.1093/jamia/ocaa208. Peer-Reviewed Original Research
Representation of EHR data for predictive modeling: a comparison between UMLS and other terminologies
Rasmy L, Tiryaki F, Zhou Y, Xiang Y, Tao C, Xu H, Zhi D. Representation of EHR data for predictive modeling: a comparison between UMLS and other terminologies. Journal Of The American Medical Informatics Association 2020, 27: 1593-1599. PMID: 32930711, PMCID: PMC7647355, DOI: 10.1093/jamia/ocaa180. Peer-Reviewed Original Research
MeSH Keywords: Aged; Databases, Factual; Electronic Health Records; Female; Humans; Male; Middle Aged; ROC Curve; Unified Medical Language System; Vocabulary, Controlled
Concepts: Unified Medical Language System; Recurrent neural network; Neural network; Prediction performance; Logistic regression; Predictive modeling; Deep learning; Data aggregation; Electronic health record data; Machine learning; Risk prediction; Better prediction performance; Dengue hemorrhagic fever; Health record data; EHR data; Cancer prediction; Large vocabulary; Different tasks; Predictive model; Heart failure; Diabetes patients; Pancreatic cancer; Clinical data; Hemorrhagic fever; ICD-9
2019
Ontological representation–oriented term normalization and standardization of the Research Domain Criteria
Li F, Rao G, Du J, Xiang Y, Zhang Y, Selek S, Hamilton J, Xu H, Tao C. Ontological representation–oriented term normalization and standardization of the Research Domain Criteria. Health Informatics Journal 2019, 26: 726-737. PMID: 30843449, PMCID: PMC7863676, DOI: 10.1177/1460458219832059. Peer-Reviewed Original Research
2018
Combine Factual Medical Knowledge and Distributed Word Representation to Improve Clinical Named Entity Recognition.
Wu Y, Yang X, Bian J, Guo Y, Xu H, Hogan W. Combine Factual Medical Knowledge and Distributed Word Representation to Improve Clinical Named Entity Recognition. AMIA Annual Symposium Proceedings 2018, 2018: 1110-1117. PMID: 30815153, PMCID: PMC6371322. Peer-Reviewed Original Research
MeSH Keywords: Deep Learning; Humans; Natural Language Processing; Neural Networks, Computer; Unified Medical Language System
Concepts: Recurrent neural network; Word embeddings; One-hot vectors; Word representations; Low-frequency words; Only word embeddings; Clinical Named Entity Recognition; Clinical NER tasks; Word embedding methods; Conditional Random Fields; Statistical language model; Named Entity Recognition; Unlabeled corpus; Language model; Language system; NER task; Decent representation; Factual medical knowledge; Important words; Deep learning models; Entity recognition; Clinical corpus; Named Entity Recognition System; Art performance; Feature representation
2016
Automated identification of molecular effects of drugs (AIMED)
Fathiamini S, Johnson A, Zeng J, Araya A, Holla V, Bailey A, Litzenburger B, Sanchez N, Khotskaya Y, Xu H, Meric-Bernstam F, Bernstam E, Cohen T. Automated identification of molecular effects of drugs (AIMED). Journal Of The American Medical Informatics Association 2016, 23: 758-765. PMID: 27107438, PMCID: PMC4926748, DOI: 10.1093/jamia/ocw030. Peer-Reviewed Original Research
2012
A new clustering method for detecting rare senses of abbreviations in clinical notes
Xu H, Wu Y, Elhadad N, Stetson P, Friedman C. A new clustering method for detecting rare senses of abbreviations in clinical notes. Journal Of Biomedical Informatics 2012, 45: 1075-1083. PMID: 22742938, PMCID: PMC3729222, DOI: 10.1016/j.jbi.2012.06.003. Peer-Reviewed Original Research
MeSH Keywords: Abbreviations as Topic; Algorithms; Cluster Analysis; Medical Records; Natural Language Processing; Unified Medical Language System
2010
Mining Biomedical Literature for Terms related to Epidemiologic Exposures.
Xu H, Lu Y, Jiang M, Liu M, Denny J, Dai Q, Peterson N. Mining Biomedical Literature for Terms related to Epidemiologic Exposures. AMIA Annual Symposium Proceedings 2010, 2010: 897-901. PMID: 21347108, PMCID: PMC3041399. Peer-Reviewed Original Research
2007
Automated Acquisition of Disease–Drug Knowledge from Biomedical and Clinical Documents: An Initial Study
Chen E, Hripcsak G, Xu H, Markatou M, Friedman C. Automated Acquisition of Disease–Drug Knowledge from Biomedical and Clinical Documents: An Initial Study. Journal Of The American Medical Informatics Association 2007, 15: 87-98. PMID: 17947625, PMCID: PMC2274872, DOI: 10.1197/jamia.m2401. Peer-Reviewed Original Research
A study of abbreviations in clinical notes.
Xu H, Stetson P, Friedman C. A study of abbreviations in clinical notes. AMIA Annual Symposium Proceedings 2007, 2007: 821-5. PMID: 18693951, PMCID: PMC2655910. Peer-Reviewed Original Research
MeSH Keywords: Abbreviations as Topic; Decision Trees; Humans; MEDLINE; Natural Language Processing; Unified Medical Language System
Concepts: Unified Medical Language System; Natural language processing systems; Language processing system; Narrative clinical notes; Detection method; Clinical notes; Different knowledge sources; Sense inventory; Domain experts; NLP systems; Correct senses; Decision support; Text corpora; Knowledge sources; Error detection; Processing system; Biomedical literature; Study of abbreviations; Language system; Patient information; Ambiguity rate; Better detection methods; Database; Annotation; Abbreviations
Using contextual and lexical features to restructure and validate the classification of biomedical concepts
Fan J, Xu H, Friedman C. Using contextual and lexical features to restructure and validate the classification of biomedical concepts. BMC Bioinformatics 2007, 8: 264. PMID: 17650333, PMCID: PMC2014782, DOI: 10.1186/1471-2105-8-264. Peer-Reviewed Original Research
MeSH Keywords: Biomedical Research; Medical Informatics; Semantics; Software; Terminology as Topic; Unified Medical Language System
Concepts: Unified Medical Language System; String-based approaches; Mean reciprocal rank; Reciprocal rank; Natural language processing; Error rate; Contextual features; Lexical features; Integration of data; Low error rate; Reasoning system; Automatic approach; Complementary classifiers; Language processing; Classification approach; Biomedical terminologies; Classification error; Ontological concepts; Biomedical concepts; Ontological terms; Syntactic approach; Language system; Classifier; Syntactic features; Ontology
Using distributional analysis to semantically classify UMLS concepts.
Fan J, Xu H, Friedman C. Using distributional analysis to semantically classify UMLS concepts. 2007, 129: 519-23. PMID: 17911771. Peer-Reviewed Original Research
2006
Machine learning and word sense disambiguation in the biomedical domain: design and evaluation issues
Xu H, Markatou M, Dimova R, Liu H, Friedman C. Machine learning and word sense disambiguation in the biomedical domain: design and evaluation issues. BMC Bioinformatics 2006, 7: 334. PMID: 16822321, PMCID: PMC1550263, DOI: 10.1186/1471-2105-7-334. Peer-Reviewed Original Research
Concepts: Natural language processing; Biomedical domain; Information retrieval systems; ML methods; WSD classifier; Sense disambiguation; Machine learning methods; Vector machine classifier; Error rate; Word sense disambiguation; Retrieval system; Machine learning; ML techniques; Text mining; Biomedical abbreviations; Language processing; Learning methods; Cross-validation method; WSD problem; Machine classifier; Accurate access; Sense distribution; Classifier; Biomolecular entities; WSD task