2022
ClinicalLayoutLM: A Pre-trained Multi-modal Model for Understanding Scanned Document in Electronic Health Records
Wei Q, Zuo X, Anjum O, Hu Y, Denlinger R, Bernstam E, Citardi M, Xu H. ClinicalLayoutLM: A Pre-trained Multi-modal Model for Understanding Scanned Document in Electronic Health Records. 2022, 00: 2821-2827. DOI: 10.1109/bigdata55660.2022.10020569.
Peer-Reviewed Original Research
Concepts: Optical character recognition; Multi-modal model; Electronic health records; Clinical documents; Natural language processing tasks; Information extraction technology; Pre-trained models; Health records; Language processing tasks; Information extraction; Image information; F1 score; Character recognition; Layout analysis; Processing tasks; Multi-modal approach; Clinical corpus; Baseline model; Documents; Open domain; Task; Extraction technology; Clinical operations; Different categories; Text
2019
Enhancing clinical concept extraction with contextual embeddings
Si Y, Wang J, Xu H, Roberts K. Enhancing clinical concept extraction with contextual embeddings. Journal Of The American Medical Informatics Association 2019, 26: 1297-1304. PMID: 31265066, PMCID: PMC6798561, DOI: 10.1093/jamia/ocz096.
Peer-Reviewed Original Research
Concepts: Clinical concept extraction; Contextual embeddings; Natural language processing tasks; Traditional word embeddings; Traditional word representations; Clinical NLP tasks; Language processing tasks; Semantic information; Word embedding methods; Large language models; Art performance; Concept extraction task; SemEval 2014; Word representations; NLP tasks; Language model; Word embeddings; Processing tasks; Neural network-based representation; I2b2 2010; Concept extraction; Task; Large clinical corpus; Clinical corpus; Network-based representation
A study of deep learning approaches for medication and adverse drug event extraction from clinical text
Wei Q, Ji Z, Li Z, Du J, Wang J, Xu J, Xiang Y, Tiryaki F, Wu S, Zhang Y, Tao C, Xu H. A study of deep learning approaches for medication and adverse drug event extraction from clinical text. Journal Of The American Medical Informatics Association 2019, 27: 13-21. PMID: 31135882, PMCID: PMC6913210, DOI: 10.1093/jamia/ocz063.
Peer-Reviewed Original Research
Concepts: Deep learning-based approach; Deep learning approach; Learning-based approach; Traditional machine; Learning approach; National NLP Clinical Challenges; Adverse drug event extraction; Outperform traditional machine; Different ensemble approaches; Conditional Random Fields; Sequence labeling approach; MIMIC-III database; Event extraction; Medical domain; Entity recognition; Classification component; F1 score; Clinical text; Relation extraction; Clinical documents; Vector machine; End evaluation; Ensemble approach; Clinical corpus; Machine
2018
Combine Factual Medical Knowledge and Distributed Word Representation to Improve Clinical Named Entity Recognition.
Wu Y, Yang X, Bian J, Guo Y, Xu H, Hogan W. Combine Factual Medical Knowledge and Distributed Word Representation to Improve Clinical Named Entity Recognition. AMIA Annual Symposium Proceedings 2018, 2018: 1110-1117. PMID: 30815153, PMCID: PMC6371322.
Peer-Reviewed Original Research
Concepts: Recurrent neural network; Word embeddings; One-hot vectors; Word representations; Low-frequency words; Only word embeddings; Clinical Named Entity Recognition; Clinical NER tasks; Word embedding methods; Conditional Random Fields; Statistical language model; Named Entity Recognition; Unlabeled corpus; Language model; Language system; NER task; Decent representation; Factual medical knowledge; Important words; Deep learning models; Entity recognition; Clinical corpus; Named Entity Recognition System; Art performance; Feature representation
Clinical text annotation - what factors are associated with the cost of time?
Wei Q, Franklin A, Cohen T, Xu H. Clinical text annotation - what factors are associated with the cost of time? AMIA Annual Symposium Proceedings 2018, 2018: 1552-1560. PMID: 30815201, PMCID: PMC6371268.
Peer-Reviewed Original Research
Concepts: Annotation time; Clinical text; Natural language processing models; Clinical corpus; Individual user behavior; Entity recognition task; Language processing models; Practice of annotation; Characteristics of sentences; Clinical Text Annotation; Text annotations; User behavior; Individual users; Cost of time; Active learning research; Recognition task; Learning research; Processing model; Cost model; Annotation; Users; Limited work; Corpus; Text; Task
2016
A long journey to short abbreviations: developing an open-source framework for clinical abbreviation recognition and disambiguation (CARD)
Wu Y, Denny J, Rosenbloom S, Miller R, Giuse D, Wang L, Blanquicett C, Soysal E, Xu J, Xu H. A long journey to short abbreviations: developing an open-source framework for clinical abbreviation recognition and disambiguation (CARD). Journal Of The American Medical Informatics Association 2016, 24: e79-e86. PMID: 27539197, PMCID: PMC7651947, DOI: 10.1093/jamia/ocw109.
Peer-Reviewed Original Research
Concepts: Clinical NLP systems; Open-source framework; NLP systems; Clinical corpus; Clinical abbreviations; Clinic visit notes; Sense inventory; Knowledge Extraction System; Abbreviation recognition; Word sense disambiguation method; Discharge summaries; F1 score; External corpus; Clinical narratives; Sense disambiguation method; System capabilities; Vanderbilt University Medical Center; Wrapper; Frequent abbreviations; Disambiguation method; MetaMap; Abbreviation identification; Cards; Visit notes; Disambiguation
2015
Domain adaptation for semantic role labeling of clinical text
Zhang Y, Tang B, Jiang M, Wang J, Xu H. Domain adaptation for semantic role labeling of clinical text. Journal Of The American Medical Informatics Association 2015, 22: 967-979. PMID: 26063745, PMCID: PMC4986662, DOI: 10.1093/jamia/ocu048.
Peer-Reviewed Original Research