2022
A comparative study of pre-trained language models for named entity recognition in clinical trial eligibility criteria from multiple corpora
Li J, Wei Q, Ghiasvand O, Chen M, Lobanov V, Weng C, Xu H. A comparative study of pre-trained language models for named entity recognition in clinical trial eligibility criteria from multiple corpora. BMC Medical Informatics And Decision Making 2022, 22: 235. PMID: 36068551, PMCID: PMC9450226, DOI: 10.1186/s12911-022-01967-7.
Peer-Reviewed Original Research
Concepts: Pre-trained language models, NER task, Unstructured text, Entity recognition, Language model, Natural language processing techniques, Clinical trial eligibility criteria, Language processing techniques, Data augmentation results, Data augmentation approach, Domain-specific corpus, Better performance, Transformer model, Cross-validation show, Multiple data sources, Eligibility criteria text, Biomedical domain, Embedding models, NER performance, Augmentation approach, Contextual embeddings, Meaningful information, Evaluation results, Such documents, Processing techniques
2020
Named Entity Recognition from Table Headers in Randomized Controlled Trial Articles
Wei Q, Zhou Y, Zhao B, Hu X, Mei Q, Tao C, Xu H. Named Entity Recognition from Table Headers in Randomized Controlled Trial Articles. 2020, 00: 1-2. DOI: 10.1109/ichi48887.2020.9374323.
Peer-Reviewed Original Research
Concepts: Table headers, Entity recognition, Deep learning-based approach, Biomedical text mining, Learning-based approach, Named Entity Recognition, Information extraction, Biomedical entities, F1 score, Text mining, Unstructured nature, Biomedical articles, Contextual information, Computational applications, Header, Semantic complexity, Better performance, Corpus, Recognition, Information, Mining, Applications, Important information, Complexity, Biomedical research
A study of entity-linking methods for normalizing Chinese diagnosis and procedure terms to ICD codes
Wang Q, Ji Z, Wang J, Wu S, Lin W, Li W, Ke L, Xiao G, Jiang Q, Xu H, Zhou Y. A study of entity-linking methods for normalizing Chinese diagnosis and procedure terms to ICD codes. Journal Of Biomedical Informatics 2020, 105: 103418. PMID: 32298846, DOI: 10.1016/j.jbi.2020.103418.
Peer-Reviewed Original Research
Concepts: BM25 algorithm, Concept ranking, Concept generation, Convolutional neural network approach, Neural network approach, Ranking-based method, Ranking method, Support vector machine, Procedure terms, Better performance, Vector machine, Different algorithms, Medical coding, Network approach, Algorithm, ICD codes, BERT, Extended version, Good accuracy, Knowledgebase, Disease terms, Clinical terms, Match criteria, Code, Chinese diagnosis
2019
Cost-sensitive Active Learning for Phenotyping of Electronic Health Records.
Ji Z, Wei Q, Franklin A, Cohen T, Xu H. Cost-sensitive Active Learning for Phenotyping of Electronic Health Records. AMIA Joint Summits On Translational Science Proceedings 2019, 2019: 829-838. PMID: 31259040, PMCID: PMC6568101.
Peer-Reviewed Original Research
Concepts: Annotation time, Electronic health records, Active learning, Machine learning-based methods, Cost-sensitive active learning, Large annotated dataset, Learning-based methods, Health records, Use cases, Annotated dataset, User 1, AL algorithm, User 2, Phenotyping algorithm, AL approach, Secondary use, Algorithm, Better performance, Actual time, Learning, Experimental results, Breast cancer patients, Dataset, Model performance, Passive learning
2017
CLAMP – a toolkit for efficiently building customized clinical natural language processing pipelines
Soysal E, Wang J, Jiang M, Wu Y, Pakhomov S, Liu H, Xu H. CLAMP – a toolkit for efficiently building customized clinical natural language processing pipelines. Journal Of The American Medical Informatics Association 2017, 25: 331-336. PMID: 29186491, PMCID: PMC7378877, DOI: 10.1093/jamia/ocx132.
Peer-Reviewed Original Research
Concepts: Graphic user interface, User interface, User-friendly graphic user interface, Natural language processing systems, Clinical natural language processing (NLP) systems, Natural language processing pipeline, Knowledge Extraction System, Language processing pipeline, Clinical Text Analysis, Language processing system, NLP components, NLP toolkit, Information extraction, NLP pipeline, Use cases, Entity recognition, Clinical text, End users, NLP community, Processing pipeline, Processing system, Individual tasks, Individual applications, Text analysis, Better performance
2014
Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks
Tang B, Cao H, Wang X, Chen Q, Xu H. Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks. BioMed Research International 2014, 2014: 240403. PMID: 24729964, PMCID: PMC3963372, DOI: 10.1155/2014/240403.
Peer-Reviewed Original Research
Concepts: Biomedical Named Entity Recognition, Word representations, Named Entity Recognition (NER) task, Machine learning-based approach, Word representation features, Natural language processing, Learning-based approach, Entity recognition task, Named Entity Recognition, Cluster-based representation, JNLPBA corpus, Entity recognition, Biomedical domain, F-measure, Language processing, Representation features, Word embeddings, Recognition task, WR algorithm, Distributional representations, Task, Better performance, Algorithm, Representation, Different types
2013
Recognizing clinical entities in hospital discharge summaries using Structural Support Vector Machines with word representation features
Tang B, Cao H, Wu Y, Jiang M, Xu H. Recognizing clinical entities in hospital discharge summaries using Structural Support Vector Machines with word representation features. BMC Medical Informatics And Decision Making 2013, 13: s1. PMID: 23566040, PMCID: PMC3618243, DOI: 10.1186/1472-6947-13-s1-s1.
Peer-Reviewed Original Research
Concepts: Structural support vector machine, Word representation features, Clinical NER tasks, Conditional Random Fields, Support vector machine, Performance of ML, Clinical NER system, Machine learning, Representation features, NER system, NER task, Vector machine, Entity recognition, Natural language processing research, Sequential labeling algorithm, Clinical entity recognition, Large margin theory, Clinical text processing, Language processing research, Performance of CRFs, Highest F-measure, Clinical NLP research, I2b2 NLP challenge, Same feature sets, Better performance
2010
Recognizing Medication related Entities in Hospital Discharge Summaries using Support Vector Machine.
Doan S, Xu H. Recognizing Medication related Entities in Hospital Discharge Summaries using Support Vector Machine. Proceedings - International Conference On Computational Linguistics 2010, 2010: 259-266. PMID: 26848286, PMCID: PMC4736747.
Peer-Reviewed Original Research
Concepts: Support vector machine, Hospital discharge summaries, Conditional Random Fields, Discharge summaries, Medication names, Related entities, Clinical text, Vector machine, Type of medication, Named Entity Recognition (NER) task, Entity recognition task, Rule-based system, Best F-score, I2b2 NLP challenge, Types of features, F-score, I2b2 challenge, NLP challenge, NER system, Semantic features, Recognition task, Machine, Data sets, Random fields, Better performance