2021
Benchmarking Effectiveness and Efficiency of Deep Learning Models for Semantic Textual Similarity in the Clinical Domain: Validation Study
Chen Q, Rankine A, Peng Y, Aghaarabi E, Lu Z. Benchmarking Effectiveness and Efficiency of Deep Learning Models for Semantic Textual Similarity in the Clinical Domain: Validation Study. JMIR Medical Informatics 2021, 9: e27386. PMID: 34967748, PMCID: PMC8759018, DOI: 10.2196/27386.
Peer-Reviewed Original Research
Concepts: Semantic textual similarity; Convolutional neural network; Deep learning models; Real-time applications; DL models; Sentence pairs; Neural network; Textual similarity; BERT model; National Natural Language Processing Clinical Challenges; Learning model; Natural language processing; Average Pearson correlation; Data sets; Different similarity levels; Inference time; Generalization capability; Manual annotation; Language processing; Pearson correlation; Ensemble model; Word order; Time efficiency; Negation terms; Training set
2020
Deep learning with sentence embeddings pre-trained on biomedical corpora improves the performance of finding similar sentences in electronic medical records
Chen Q, Du J, Kim S, Wilbur W, Lu Z. Deep learning with sentence embeddings pre-trained on biomedical corpora improves the performance of finding similar sentences in electronic medical records. BMC Medical Informatics And Decision Making 2020, 20: 73. PMID: 32349758, PMCID: PMC7191680, DOI: 10.1186/s12911-020-1044-0.
Peer-Reviewed Original Research
Concepts: End deep learning model; Encoder network; Deep learning models; Sentence embeddings; Biomedical corpora; Learning model; Random forest; Traditional machine; Text mining applications; Deep learning approach; Similar sentences; Machine learning models; High performance; Mining applications; Related datasets; Clinical notes; Learning approach; Sentence semantics; PubMed abstracts; Challenge task; Ensembled model; Best submission; Sentence pairs; Network; Test set
2019
Evaluation of Five Sentence Similarity Models on Electronic Medical Records
Chen Q, Du J, Kim S, Wilbur W, Lu Z. Evaluation of Five Sentence Similarity Models on Electronic Medical Records. 2019, 533-533. DOI: 10.1145/3307339.3343239.
Peer-Reviewed Original Research
Concepts: Sentence similarity model; Similarity model; Large biomedical corpora; Large public datasets; Traditional machine; Clinical domains; Biomedical corpora; Text summarization; Bidirectional transformers; Public datasets; Semantic similarity; Small datasets; Sentence similarity; Dataset consisting; Sentence pairs; Dataset; Electronic medical records; Primary application; CNN; Summarization; BERT; Vital role; Machine; Domain; Embedding