2024
PubMed Computed Authors in 2024: an open resource of disambiguated author names in biomedical literature
Tian S, Chen Q, Comeau D, Wilbur W, Lu Z. PubMed Computed Authors in 2024: an open resource of disambiguated author names in biomedical literature. Bioinformatics 2024, 40: btae672. PMID: 39520405, PMCID: PMC11588201, DOI: 10.1093/bioinformatics/btae672.
Peer-Reviewed Original Research
Concepts: Author name disambiguation; Author names; Biomedical literature search; Web APIs; Enhancement algorithm; Authority datasets; Query; Biomedical literature; Dataset; Authors' algorithm; Improved accuracy; Individual researchers; Algorithm; PubMed articles; Supplementary data; Disambiguation; ORCID; Literature retrieval; Download; Retrieval; Web; API; Comprehensive dataset; Bioinformatics
Augmenting biomedical named entity recognition with general-domain resources
Yin Y, Kim H, Xiao X, Wei C, Kang J, Lu Z, Xu H, Fang M, Chen Q. Augmenting biomedical named entity recognition with general-domain resources. Journal Of Biomedical Informatics 2024, 159: 104731. PMID: 39368529, DOI: 10.1016/j.jbi.2024.104731.
Peer-Reviewed Original Research
Concepts: BioNER datasets; Multi-task learning; NER datasets; Entity types; Biomedical datasets; Baseline model; General domain datasets; Biomedical language model; Neural network-based; Yield performance improvements; BioNER models; Entity recognition; Biomedical corpora; Human annotators; Label ambiguity; Language model; Transfer learning; F1 score; BioNER; Human effort; Network-based; Biomedical resources; Performance improvement; Dataset; Superior performance
GeneGPT: augmenting large language models with domain tools for improved access to biomedical information
Jin Q, Yang Y, Chen Q, Lu Z. GeneGPT: augmenting large language models with domain tools for improved access to biomedical information. Bioinformatics 2024, 40: btae075. PMID: 38341654, PMCID: PMC10904143, DOI: 10.1093/bioinformatics/btae075.
Peer-Reviewed Original Research
Concepts: API calls; Web APIs; Language model; State-of-the-art performance; Multi-hop questions; State-of-the-art; Domain-specific tools; Decoding algorithm; National Center for Biotechnology Information; GPT-3; Biomedical information; Database utilization; Experimental results; API; Task; Domain tools; Learning; ChatGPT; Specialized knowledge; Information; Language; Genomic questions; Algorithm; Dataset; Biotechnology Information
2023
BioREx: Improving biomedical relation extraction by leveraging heterogeneous datasets
Lai P, Wei C, Luo L, Chen Q, Lu Z. BioREx: Improving biomedical relation extraction by leveraging heterogeneous datasets. Journal Of Biomedical Informatics 2023, 146: 104487. PMID: 37673376, DOI: 10.1016/j.jbi.2023.104487.
Peer-Reviewed Original Research
Concepts: Biomedical relation extraction; Relation extraction; RE tasks; Natural language processing research; Data-centric approach; Knowledge graph construction; Multi-task learning; Language processing research; Individual datasets; Literature-based discovery; Chemical-induced disease relations; Dataset annotation; Domain knowledge; Transfer learning; Training data; Heterogeneous datasets; Art methods; Novel framework; Graph construction; Free text; Data heterogeneity; Large datasets; Biomedical concepts; Processing research; Dataset
2021
Artificial Intelligence in Action: Addressing the COVID-19 Pandemic with Natural Language Processing
Chen Q, Leaman R, Allot A, Luo L, Wei C, Yan S, Lu Z. Artificial Intelligence in Action: Addressing the COVID-19 Pandemic with Natural Language Processing. Annual Review Of Biomedical Data Science 2021, 4: 1-27. PMID: 34465169, DOI: 10.1146/annurev-biodatasci-021821-061045.
Peer-Reviewed Original Research
Concepts: Natural language processing; Artificial intelligence; Language processing; Information needs; Literature-based discovery; Information retrieval; Entity recognition; Misinformation detection; Information overload; NLP studies; NLP tasks; Emotion analysis; Topic modeling; COVID-19 pandemic; Intelligence; Additional tasks; Human language; Public health measures; Task; Health measures; Processing; Serious health effects; Health effects; Retrieval; Dataset
2019
Evaluation of Five Sentence Similarity Models on Electronic Medical Records
Chen Q, Du J, Kim S, Wilbur W, Lu Z. Evaluation of Five Sentence Similarity Models on Electronic Medical Records. 2019, 533-533. DOI: 10.1145/3307339.3343239.
Peer-Reviewed Original Research
Concepts: Sentence similarity model; Similarity model; Large biomedical corpora; Large public datasets; Traditional machine; Clinical domains; Biomedical corpora; Text summarization; Bidirectional transformers; Public datasets; Semantic similarity; Small datasets; Sentence similarity; Dataset consisting; Sentence pairs; Dataset; Electronic medical records; Primary application; CNN; Summarization; BERT; Vital role; Machine; Domain; Embedding
A multi-task deep learning model for the classification of Age-related Macular Degeneration.
Chen Q, Peng Y, Keenan T, Dharssi S, Agro N E, Wong W, Chew E, Lu Z. A multi-task deep learning model for the classification of Age-related Macular Degeneration. AMIA Joint Summits On Translational Science Proceedings 2019, 2019: 505-514. PMID: 31259005, PMCID: PMC6568069.
Peer-Reviewed Original Research
Concepts: Deep learning models; Learning model; Multi-task deep learning model; Novel deep learning model; Multi-task learning techniques; Color fundus images; Image datasets; Learning techniques; Automated classification; Manual classification; Art models; Manual grading; Fundus images; Grading process; Classification; Images; Age-related macular degeneration; Current state; Eye Disease Study Group; AMD severity scale; Overfitting; Dataset; Macular degeneration; Model; Accuracy
2018
Sentence Similarity Measures Revisited
Chen Q, Kim S, Wilbur W, Lu Z. Sentence Similarity Measures Revisited. 2018, 531-532. DOI: 10.1145/3233547.3233640.
Peer-Reviewed Original Research
Concepts: Sentence similarity; Similarity measure; Natural language processing; Multiple similarity measures; Sentence similarity measure; NDCG scores; Text summarization; Biomedical domain; Language processing; Large-scale benchmark set; PubMed abstracts; Computational biology; Semantic measures; Benchmark set; Experimental results; Summarization; Sentences; Dataset; Crucial component; Documents; Processing; Similarity; Set