Hyunjae Kim
Postdoctoral Associate in Biomedical Informatics and Data Science
Publications
Featured Publications
Small language models learn enhanced reasoning skills from medical textbooks
Kim H, Hwang H, Lee J, Park S, Kim D, Lee T, Yoon C, Sohn J, Park J, Reykhart O, Fetherston T, Choi D, Kwak S, Chen Q, Kang J. Small language models learn enhanced reasoning skills from medical textbooks. npj Digital Medicine 2025, 8: 240. PMID: 40316765, PMCID: PMC12048634, DOI: 10.1038/s41746-025-01653-8.
Peer-Reviewed Original Research
Concepts: Language model, Multi-step reasoning, Efficient training method, Reasoning capabilities, Medical domain, Hardware constraints, Complex medical tasks, Exam datasets, Fine-tuning, Medical tasks, Human scores, Curated dataset, Training methods, Dataset, Expert evaluation, Medical applications, Reasoning ability, Privacy, Limited parameters, Hardware, Medical textbooks, Task, Reasoning skills, Reasons, Model
2025
Ophthalmological Question Answering and Reasoning Using OpenAI o1 vs Other Large Language Models
Srinivasan S, Ai X, Zou M, Zou K, Kim H, Lo T, Pushpanathan K, Yang G, Goh J, Kong Y, Li A, Singer M, Jin K, Antaki F, Chen D, Liu D, Adelman R, Chen Q, Tham Y. Ophthalmological Question Answering and Reasoning Using OpenAI o1 vs Other Large Language Models. JAMA Ophthalmology 2025, 143: 740-748. PMID: 40742581, PMCID: PMC12314776, DOI: 10.1001/jamaophthalmol.2025.2413.
Peer-Reviewed Original Research
Author Correction: Small language models learn enhanced reasoning skills from medical textbooks
Kim H, Hwang H, Lee J, Park S, Kim D, Lee T, Yoon C, Sohn J, Park J, Reykhart O, Fetherston T, Choi D, Kwak S, Chen Q, Kang J. Author Correction: Small language models learn enhanced reasoning skills from medical textbooks. npj Digital Medicine 2025, 8: 339. PMID: 40481271, PMCID: PMC12144237, DOI: 10.1038/s41746-025-01745-5.
Commentaries, Editorials and Letters
ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
Lee T, Yoon C, Jang K, Lee D, Song M, Kim H, Kang J. ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage. 2025, 5497-5512. DOI: 10.18653/v1/2025.naacl-long.283.
Peer-Reviewed Original Research
Rationale-Guided Retrieval Augmented Generation for Medical Question Answering
Sohn J, Park Y, Yoon C, Park S, Hwang H, Sung M, Kim H, Kang J. Rationale-Guided Retrieval Augmented Generation for Medical Question Answering. 2025, 12739-12753. DOI: 10.18653/v1/2025.naacl-long.635.
Peer-Reviewed Original Research
2024
Augmenting biomedical named entity recognition with general-domain resources
Yin Y, Kim H, Xiao X, Wei C, Kang J, Lu Z, Xu H, Fang M, Chen Q. Augmenting biomedical named entity recognition with general-domain resources. Journal of Biomedical Informatics 2024, 159: 104731. PMID: 39368529, DOI: 10.1016/j.jbi.2024.104731.
Peer-Reviewed Original Research
Concepts: BioNER datasets, Multi-task learning, NER datasets, Entity types, Biomedical datasets, Baseline model, General domain datasets, Biomedical language model, Neural network-based, Yield performance improvements, BioNER models, Entity recognition, Biomedical corpora, Human annotators, Label ambiguity, Language model, Transfer learning, F1 score, BioNER, Human effort, Network-based, Biomedical resources, Performance improvement, Dataset, Superior performance
2023
Biomedical relation extraction with knowledge base–refined weak supervision
Yoon W, Yi S, Jackson R, Kim H, Kim S, Kang J. Biomedical relation extraction with knowledge base–refined weak supervision. Database 2023, 2023: baad054. PMID: 37551911, PMCID: PMC10407973, DOI: 10.1093/database/baad054.
Peer-Reviewed Original Research
Concepts: Biomedical relation extraction, Weak supervision, Relation extraction, Supervised learning, Human-labeled data, Human-annotated data, Human-labeled dataset, Language model, Performance gains, Annotation process, Biomedical entities, Original dataset, Knowledge base, Dataset, Biomedical literature, BioCreative, BioR, Learning, Language, Performance, Model structure, System, Annotation, External literature, Supervision
Chemical identification and indexing in full-text articles: an overview of the NLM-Chem track at BioCreative VII
Leaman R, Islamaj R, Adams V, Alliheedi M, Almeida J, Antunes R, Bevan R, Chang Y, Erdengasileng A, Hodgskiss M, Ida R, Kim H, Li K, Mercer R, Mertová L, Mobasher G, Shin H, Sung M, Tsujimura T, Yeh W, Lu Z. Chemical identification and indexing in full-text articles: an overview of the NLM-Chem track at BioCreative VII. Database 2023, 2023: baad005. PMID: 36882099, PMCID: PMC9991492, DOI: 10.1093/database/baad005.
Peer-Reviewed Original Research
Concepts: Indexing task, Automated recognition, Chemical entity recognition, Deep learning technology, Biomedical literature, Identification task, National Library of Medicine, NER performance, Entity recognition, Biomedical entities, Tracking dataset, Article indexes, High performance, Learning technology, Biomedical subfields, Chemical identification, Text-mining method, BioCreative, Task, Prediction accuracy, Medical Subject Headings, Subject headings, Performance, Tracking, Community challenges
Automatic Creation of Named Entity Recognition Datasets by Querying Phrase Representations
Kim H, Yoo J, Yoon S, Kang J. Automatic Creation of Named Entity Recognition Datasets by Querying Phrase Representations. 2023, 7148-7163. DOI: 10.18653/v1/2023.acl-long.394.
Peer-Reviewed Original Research
2022
Full-text chemical identification with improved generalizability and tagging consistency
Kim H, Sung M, Yoon W, Park S, Kang J. Full-text chemical identification with improved generalizability and tagging consistency. Database 2022, 2022: baac074. PMID: 36170114, PMCID: PMC9518746, DOI: 10.1093/database/baac074.
Peer-Reviewed Original Research
Concepts: F1 score, Entity recognition, Normal tasks, Dictionary model, Transfer learning, Tag consistency, Tagging inconsistency, Tracking challenges, Neural model, Post-processing method, Majority voting, Unique identifiers, Low generalizability, Hybrid model, Article titles, Task, Recognition, BioCreative, Dictionary, Limitations of models, Entities, Learning, Identifiers, Model, Tags