2024
Ascle—A Python Natural Language Processing Toolkit for Medical Text Generation: Development and Evaluation Study
Yang R, Zeng Q, You K, Qiao Y, Huang L, Hsieh C, Rosand B, Goldwasser J, Dave A, Keenan T, Ke Y, Hong C, Liu N, Chew E, Radev D, Lu Z, Xu H, Chen Q, Li I. Ascle—A Python Natural Language Processing Toolkit for Medical Text Generation: Development and Evaluation Study. Journal of Medical Internet Research 2024, 26: e60601. PMID: 39361955, PMCID: PMC11487205, DOI: 10.2196/60601.
Peer-Reviewed Original Research
Concepts: Natural language processing, Natural language processing toolkit, Question-answering task, Language model, Text generation, Text processing, Domain-specific language models, Natural language processing functions, Minimal programming expertise, Text generation tasks, Medical knowledge graph, Machine translation tasks, ROUGE-L score, Domain-specific challenges, All-in-one solution, ROUGE-L, Text summarization, BLEU score, Knowledge graph, Machine translation, Unstructured text, Question-answering, Hugging Face, Processing toolkit, Language processing
2023
AIONER: all-in-one scheme-based biomedical named entity recognition using deep learning
Luo L, Wei C, Lai P, Leaman R, Chen Q, Lu Z. AIONER: all-in-one scheme-based biomedical named entity recognition using deep learning. Bioinformatics 2023, 39: btad310. PMID: 37171899, PMCID: PMC10212279, DOI: 10.1093/bioinformatics/btad310.
Peer-Reviewed Original Research
Concepts: Deep learning, Entity recognition, Training data, Entity types, Labeling training data, Natural language text, Text mining tasks, Significant domain expertise, Multi-task learning, Mining tasks, Information extraction, BioNER task, Domain expertise, Biomedical entities, Independent tasks, Source code, Benchmark tasks, Language text, Biomedical text, Art approaches, Accurate annotation, External data, Data scarcity, Task, Learning
2022
Assigning species information to corresponding genes by a sequence labeling framework
Luo L, Wei C, Lai P, Chen Q, Islamaj R, Lu Z. Assigning species information to corresponding genes by a sequence labeling framework. Database 2022, 2022: baac090. PMID: 36227127, PMCID: PMC9558450, DOI: 10.1093/database/baac090.
Peer-Reviewed Original Research
Concepts: Novel deep learning-based framework, Deep learning-based framework, Learning-based framework, Text mining algorithms, Sequence labeling task, Gene normalization task, Sequence labeling framework, Binary classification framework, Source code, Baseline methods, Normalization task, Classification framework, Labeling task, Labeling framework, Automatic assignment, High-performance method, Heuristic rules, Gene mentions, Benchmarking results, Database URL, Database records, Assignment task
2021
Artificial Intelligence in Action: Addressing the COVID-19 Pandemic with Natural Language Processing
Chen Q, Leaman R, Allot A, Luo L, Wei C, Yan S, Lu Z. Artificial Intelligence in Action: Addressing the COVID-19 Pandemic with Natural Language Processing. Annual Review of Biomedical Data Science 2021, 4: 1-27. PMID: 34465169, DOI: 10.1146/annurev-biodatasci-021821-061045.
Peer-Reviewed Original Research
Concepts: Natural language processing, Artificial intelligence, Language processing, Information needs, Literature-based discovery, Information retrieval, Entity recognition, Misinformation detection, Information overload, NLP studies, NLP tasks, Emotion analysis, Topic modeling, COVID-19 pandemic, Intelligence, Additional tasks, Human language, Public health measures, Task, Health measures, Processing, Serious health effects, Health effects, Retrieval, Dataset
LitSuggest: a web-based system for literature recommendation and curation using machine learning
Allot A, Lee K, Chen Q, Luo L, Lu Z. LitSuggest: a web-based system for literature recommendation and curation using machine learning. Nucleic Acids Research 2021, 49: W352-W358. PMID: 33950204, PMCID: PMC8262723, DOI: 10.1093/nar/gkab326.
Peer-Reviewed Original Research
Concepts: Natural language processing, Web-based system, Query method, Search system, Search queries, Machine learning, Web server, Curation services, Advanced machine, User projects, Language processing, Classification results, Training corpus, Single interface, Users, Biomedical researchers, Collaborative analysis, High accuracy, Literature recommendations, PubMed articles, Machine, Curation, Computational methods, Specialized knowledge, Keywords
2019
LitSense: making sense of biomedical literature at sentence level
Allot A, Chen Q, Kim S, Alvarez R, Comeau D, Wilbur W, Lu Z. LitSense: making sense of biomedical literature at sentence level. Nucleic Acids Research 2019, 47: W594-W599. PMID: 31020319, PMCID: PMC6602490, DOI: 10.1093/nar/gkz289.
Peer-Reviewed Original Research
Concepts: First web-based system, Filter search results, Neural embedding approach, Biomedical literature, User-friendly interface, Web-based system, Term-weighting approach, User queries, Query formulation, Unified access, Keyword matches, Biomedical entities, Sentence retrieval, Results visualization, Search results, Embedding approach, Current tools, Queries, Retrieval, Sentence level, Rare terms, Relevant results, Significant efforts, Previous knowledge, PubTator
Overview of the BioCreative VI Precision Medicine Track: mining protein interactions and mutations for precision medicine
Doğan R, Kim S, Chatr-aryamontri A, Wei C, Comeau D, Antunes R, Matos S, Chen Q, Elangovan A, Panyam N, Verspoor K, Liu H, Wang Y, Liu Z, Altınel B, Hüsünbeyi Z, Özgür A, Fergadis A, Wang C, Dai H, Tran T, Kavuluru R, Luo L, Steppi A, Zhang J, Qu J, Lu Z. Overview of the BioCreative VI Precision Medicine Track: mining protein interactions and mutations for precision medicine. Database 2019, 2019: bay147. PMID: 30689846, PMCID: PMC6348314, DOI: 10.1093/database/bay147.
Peer-Reviewed Original Research
Concepts: Relation extraction task, Document triage task, Best F-score, Extraction task, Triage task, Knowledge bases, F-score, PubMed documents, Art deep learning methods, Text-mining research community, Large knowledge bases, Deep learning methods, Text mining system, Text mining model, Text mining tools, Best average precision, Data sets, Large-scale corpus, Human annotations, Electronic health records, System developers, Better recall, Text mining, Average precision, Learning methods