2024
Ascle—A Python Natural Language Processing Toolkit for Medical Text Generation: Development and Evaluation Study
Yang R, Zeng Q, You K, Qiao Y, Huang L, Hsieh C, Rosand B, Goldwasser J, Dave A, Keenan T, Ke Y, Hong C, Liu N, Chew E, Radev D, Lu Z, Xu H, Chen Q, Li I. Ascle—A Python Natural Language Processing Toolkit for Medical Text Generation: Development and Evaluation Study. Journal of Medical Internet Research 2024, 26: e60601. PMID: 39361955, PMCID: PMC11487205, DOI: 10.2196/60601.
Peer-Reviewed Original Research
Concepts: Natural language processing, Natural language processing toolkit, Question-answering task, Language model, Text generation, Text processing, Domain-specific language models, Natural language processing functions, Minimal programming expertise, Text generation tasks, Medical knowledge graph, Machine translation tasks, ROUGE-L score, Domain-specific challenges, All-in-one solution, ROUGE-L, Text summarization, BLEU score, Knowledge graph, Machine translation, Unstructured text, Question-answering, Hugging Face, Processing toolkit, Language processing
2022
Assigning species information to corresponding genes by a sequence labeling framework
Luo L, Wei C, Lai P, Chen Q, Islamaj R, Lu Z. Assigning species information to corresponding genes by a sequence labeling framework. Database 2022, 2022: baac090. PMID: 36227127, PMCID: PMC9558450, DOI: 10.1093/database/baac090.
Peer-Reviewed Original Research
Concepts: Novel deep learning-based framework, Deep learning-based framework, Learning-based framework, Text mining algorithms, Sequence labeling task, Gene normalization task, Sequence labeling framework, Binary classification framework, Source code, Baseline methods, Normalization task, Classification framework, Labeling task, Labeling framework, Automatic assignment, High-performance method, Heuristic rules, Gene mentions, Benchmarking results, Database URL, Database records, Assignment task
Detecting visually significant cataract using retinal photograph-based deep learning
Tham Y, Goh J, Anees A, Lei X, Rim T, Chee M, Wang Y, Jonas J, Thakur S, Teo Z, Cheung N, Hamzah H, Tan G, Husain R, Sabanayagam C, Wang J, Chen Q, Lu Z, Keenan T, Chew E, Tan A, Mitchell P, Goh R, Xu X, Liu Y, Wong T, Cheng C. Detecting visually significant cataract using retinal photograph-based deep learning. Nature Aging 2022, 2: 264-271. PMID: 37118370, PMCID: PMC10154193, DOI: 10.1038/s43587-022-00171-6.
Peer-Reviewed Original Research
2020
Better synonyms for enriching biomedical search
Yeganova L, Kim S, Chen Q, Balasanov G, Wilbur W, Lu Z. Better synonyms for enriching biomedical search. Journal of the American Medical Informatics Association 2020, 27: 1894-1902. PMID: 33083825, PMCID: PMC7727334, DOI: 10.1093/jamia/ocaa151.
Peer-Reviewed Original Research
BioConceptVec: Creating and evaluating literature-based biomedical concept embeddings on a large scale
Chen Q, Lee K, Yan S, Kim S, Wei C, Lu Z. BioConceptVec: Creating and evaluating literature-based biomedical concept embeddings on a large scale. PLOS Computational Biology 2020, 16: e1007617. PMID: 32324731, PMCID: PMC7237030, DOI: 10.1371/journal.pcbi.1007617.
Peer-Reviewed Original Research
Concepts: Concept embeddings, NER tools, Learning model, Biomedical text mining applications, Advanced deep learning models, Different machine learning models, Evaluation results, Text mining applications, Deep learning models, Semantics of concepts, Machine learning models, Literature-based discovery, Concept recognition, Different machine, Protein-protein interaction prediction, PubMed abstracts, Recognition tools, Massive number, Vector representation, Biomedical concepts, Large margin, Extrinsic evaluation, Biomedical literature, Intrinsic evaluation, Semantic relatedness
2019
BioWordVec, improving biomedical word embeddings with subword information and MeSH
Zhang Y, Chen Q, Yang Z, Lin H, Lu Z. BioWordVec, improving biomedical word embeddings with subword information and MeSH. Scientific Data 2019, 6: 52. PMID: 31076572, PMCID: PMC6510737, DOI: 10.1038/s41597-019-0055-0.
Peer-Reviewed Original Research
Concepts: Word embeddings, Subword information, Word representations, Biomedical natural language processing, Natural language processing, Multiple NLP tasks, Biomedical word embeddings, Information retrieval, Unlabeled text, Biomedical text, Text mining, Biomedical domain, Language processing, NLP tasks, Structured resources, Challenging task, Previous state, Benchmarking results, Large corpus, Embedding, Word level, BioWordVec, Such information, Task, Information