2024
Ascle—A Python Natural Language Processing Toolkit for Medical Text Generation: Development and Evaluation Study
Yang R, Zeng Q, You K, Qiao Y, Huang L, Hsieh C, Rosand B, Goldwasser J, Dave A, Keenan T, Ke Y, Hong C, Liu N, Chew E, Radev D, Lu Z, Xu H, Chen Q, Li I. Ascle—A Python Natural Language Processing Toolkit for Medical Text Generation: Development and Evaluation Study. Journal of Medical Internet Research 2024, 26: e60601. PMID: 39361955, PMCID: PMC11487205, DOI: 10.2196/60601.
Peer-Reviewed Original Research
Concepts: Natural language processing; Natural language processing toolkit; Question-answering task; Language model; Text generation; Text processing; Domain-specific language models; Natural language processing functions; Minimal programming expertise; Text generation tasks; Medical knowledge graph; Machine translation tasks; ROUGE-L score; Domain-specific challenges; All-in-one solution; ROUGE-L; Text summarization; BLEU score; Knowledge graph; Machine translation; Unstructured text; Question-answering; Hugging Face; Processing toolkit; Language processing
Advancing entity recognition in biomedicine via instruction tuning of large language models
Keloth V, Hu Y, Xie Q, Peng X, Wang Y, Zheng A, Selek M, Raja K, Wei C, Jin Q, Lu Z, Chen Q, Xu H. Advancing entity recognition in biomedicine via instruction tuning of large language models. Bioinformatics 2024, 40: btae163. PMID: 38514400, PMCID: PMC11001490, DOI: 10.1093/bioinformatics/btae163.
Peer-Reviewed Original Research
Named Entity Recognition; Sequence labeling task; Natural language processing; Biomedical NER datasets; Language model; NER datasets; Entity recognition; Labeling task; Text generation; Field of natural language processing; Biomedical NER; Few-shot learning capability; Reasoning tasks; Multi-domain scenarios; Domain-specific models; End-to-end; Minimal fine-tuning; SOTA performance; F1 score; Healthcare applications; Biomedical entities; Biomedical domain; Language processing; Multi-tasking; PubMedBERT model
2021
Benchmarking Effectiveness and Efficiency of Deep Learning Models for Semantic Textual Similarity in the Clinical Domain: Validation Study
Chen Q, Rankine A, Peng Y, Aghaarabi E, Lu Z. Benchmarking Effectiveness and Efficiency of Deep Learning Models for Semantic Textual Similarity in the Clinical Domain: Validation Study. JMIR Medical Informatics 2021, 9: e27386. PMID: 34967748, PMCID: PMC8759018, DOI: 10.2196/27386.
Peer-Reviewed Original Research
Semantic textual similarity; Convolutional neural network; Deep learning models; Real-time applications; DL models; Sentence pairs; Neural network; Textual similarity; BERT model; National Natural Language Processing Clinical Challenges; Learning model; Natural language processing; Average Pearson correlation; Data sets; Different similarity levels; Inference time; Generalization capability; Manual annotation; Language processing; Pearson correlation; Ensemble model; Word order; Time efficiency; Negation terms; Training set
Artificial Intelligence in Action: Addressing the COVID-19 Pandemic with Natural Language Processing
Chen Q, Leaman R, Allot A, Luo L, Wei C, Yan S, Lu Z. Artificial Intelligence in Action: Addressing the COVID-19 Pandemic with Natural Language Processing. Annual Review of Biomedical Data Science 2021, 4: 1-27. PMID: 34465169, DOI: 10.1146/annurev-biodatasci-021821-061045.
Peer-Reviewed Original Research
Concepts: Natural language processing; Artificial intelligence; Language processing; Information needs; Literature-based discovery; Information retrieval; Entity recognition; Misinformation detection; Information overload; NLP studies; NLP tasks; Emotion analysis; Topic modeling; COVID-19 pandemic; Intelligence; Additional tasks; Human language; Public health measures; Task; Health measures; Processing; Serious health effects; Health effects; Retrieval; Dataset
LitSuggest: a web-based system for literature recommendation and curation using machine learning
Allot A, Lee K, Chen Q, Luo L, Lu Z. LitSuggest: a web-based system for literature recommendation and curation using machine learning. Nucleic Acids Research 2021, 49: w352-w358. PMID: 33950204, PMCID: PMC8262723, DOI: 10.1093/nar/gkab326.
Peer-Reviewed Original Research
Concepts: Natural language processing; Web-based system; Query method; Search system; Search queries; Machine learning; Web server; Curation services; Advanced machine; User projects; Language processing; Classification results; Training corpus; Single interface; Users; Biomedical researchers; Collaborative analysis; High accuracy; Literature recommendations; PubMed articles; Machine; Curation; Computational methods; Specialized knowledge; Keywords
2019
BioWordVec, improving biomedical word embeddings with subword information and MeSH
Zhang Y, Chen Q, Yang Z, Lin H, Lu Z. BioWordVec, improving biomedical word embeddings with subword information and MeSH. Scientific Data 2019, 6: 52. PMID: 31076572, PMCID: PMC6510737, DOI: 10.1038/s41597-019-0055-0.
Peer-Reviewed Original Research
Concepts: Word embeddings; Subword information; Word representations; Biomedical natural language processing; Natural language processing; Multiple NLP tasks; Biomedical word embeddings; Information retrieval; Unlabeled text; Biomedical text; Text mining; Biomedical domain; Language processing; NLP tasks; Structured resources; Challenging task; Previous state; Benchmarking results; Large corpus; Embedding; Word level; BioWordVec; Such information; Task; Information
2018
Sentence Similarity Measures Revisited
Chen Q, Kim S, Wilbur W, Lu Z. Sentence Similarity Measures Revisited. 2018, 531-532. DOI: 10.1145/3233547.3233640.
Peer-Reviewed Original Research
Sentence similarity; Similarity measure; Natural language processing; Multiple similarity measures; Sentence similarity measure; NDCG scores; Text summarization; Biomedical domain; Language processing; Large-scale benchmark set; PubMed abstracts; Computational biology; Semantic measures; Benchmark set; Experimental results; Summarization; Sentences; Dataset; Crucial component; Documents; Processing; Similarity; Set