Huan He, PhD
Research Scientist in Biomedical Informatics and Data Science
Publications
Featured Publications
VUSphere: Visual Analysis of Video Utilization in Online Distance Education
He H, Zheng O, Dong B. VUSphere: Visual Analysis of Video Utilization in Online Distance Education. 2018, 00: 25-35. DOI: 10.1109/vast.2018.8802383.
Peer-Reviewed Original Research
Concepts: Online distance education, Distance education, Learning process, Video utilization, Multiple perspectives, User log data, Log data, Learning Center, Visual analytics system, Program curriculum, Course videos, Student population, Professional knowledge, Quality of service, Students, Massive videos, Coordinated views, Domain experts, Analytics system, Education, Real datasets, Comparable indicators, Video, Utilization statistics, Comparison view
Towards User-centered Corpus Development: Lessons Learnt from Designing and Developing MedTator.
He H, Fu S, Wang L, Wen A, Liu S, Moon S, Miller K, Liu H. Towards User-centered Corpus Development: Lessons Learnt from Designing and Developing MedTator. AMIA Annual Symposium Proceedings 2023, 2022: 532-541. PMID: 37128369, PMCID: PMC10148300.
Peer-Reviewed Original Research
Concepts: Annotation tool, Web-based annotation tool, Natural language processing systems, Corpus development, User-centered interfaces, Data security concerns, Language processing system, Target end users, Annotation task, Clinical NLP, Annotation process, End users, Security concerns, Lightweight solution, Powerful features, Advanced features, Processing system, Core task, Tool development, Task, Corpus, Annotators, Tool, NLP, Users
MedTator: A Serverless Web-based Tool for Corpus Annotation
He H, Fu S, Wang L, Wen A, Liu S, Liu H. MedTator: A Serverless Web-based Tool for Corpus Annotation. 2022, 00: 530-531. DOI: 10.1109/ichi54592.2022.00099.
Peer-Reviewed Original Research
2026
Systemic Therapy in Patients With Metastatic Castration-Resistant Prostate Cancer: ASCO Living Guideline, Version 2026.1
Taplin M, Riaz I, Rumble R, Naqvi S, Hope T, Dias M, He H, Hotte S, Emamekhoo H, Murad M, Celano P, Kungel T, Hentzen S, Serzan M, Parikh R. Systemic Therapy in Patients With Metastatic Castration-Resistant Prostate Cancer: ASCO Living Guideline, Version 2026.1. Journal Of Clinical Oncology 2026, 44: e1-e14. PMID: 41557978, DOI: 10.1200/jco-25-02693.
Peer-Reviewed Original Research
Concepts: Metastatic castration-resistant prostate cancer, Castration-resistant prostate cancer, Systemic therapy, Prostate cancer, Living guidelines, Evolving evidence, Clinical practice, ASCO, Patients, Expert panel, Guidelines, Health Literature, Therapy, Metastatization, Cancer, Individual variation, Independent professional judgment, Clinic
2025
TopicForest: embedding-driven hierarchical clustering and labeling for biomedical literature
Chang C, Ondov B, Choi B, Peng X, He H, Xu H. TopicForest: embedding-driven hierarchical clustering and labeling for biomedical literature. Journal Of Biomedical Informatics 2025, 172: 104958. PMID: 41242669, PMCID: PMC12902918, DOI: 10.1016/j.jbi.2025.104958.
Peer-Reviewed Original Research
Concepts: Adjusted Mutual Information, Biomedical abstracts, Biomedical literature, Expansion of biomedical literature, Hierarchical topic model, Hierarchical clustering, Hierarchical clustering technique, Topic hierarchy, Labeling diversity, Topic discovery, Topic summarization, Clustering quality, Clustering performance, Manifold learning, Embedding model, Topic models, Label quality, Labeling framework, Semantic space, Clustering technique, Multi-scale exploration, Mutual information, Coherent labeling, Clustering method, Dimension reduction
SemNovel – A new approach to detecting semantic novelty of biomedical publications using embeddings of large language models
Peng X, Xie Y, He H, Ondov B, Raja K, Liu Q, Mei Q, Xu H. SemNovel – A new approach to detecting semantic novelty of biomedical publications using embeddings of large language models. Journal Of Biomedical Informatics 2025, 172: 104952. PMID: 41242670, DOI: 10.1016/j.jbi.2025.104952.
Peer-Reviewed Original Research
Concepts: Journal impact factor, Language model, Growth of scientific literature, Novelty detection framework, Research impact, Semantic content, Evolution of scientific research, Measures of novelty, Citation counts, Detection framework, Embedding model, Biomedical publications, T-SNE, Author count, Novel papers, Robust method, Novel contributions, Semantic novelty, Interaction interface, Biomedical literature, Embedding, Impact factor, Novelty indicator, Publication year, Article features
An Information Extraction Approach to Detecting Novelty of Biomedical Publications.
Peng X, Ondov B, He H, Hu Y, Xu H. An Information Extraction Approach to Detecting Novelty of Biomedical Publications. AMIA Annual Symposium Proceedings 2025, 2024: 1013-1022. PMID: 41726468, PMCID: PMC12919535.
Peer-Reviewed Original Research
Concepts: Journal impact factor, Entity-relation, Semantic measures, Higher citation impact, Information extraction approach, Measures of novelty, Entity recognition, Citation impact, Citation counts, Biomedical entities, Biomedical publications, Extraction approach, High-impact journals, Research evaluation, Research impact, Impact factor, Research influence, Sections of research articles, Conclusion section, Novelty, Research articles, NER, Citations, Journals, Entities
Facilitating Clinical Information Extraction with Synthetic Data and Ontology using Large Language Models.
Hu Y, He H, Chen Q, Jiang X, Roberts K, Xu H. Facilitating Clinical Information Extraction with Synthetic Data and Ontology using Large Language Models. AMIA Annual Symposium Proceedings 2025, 2024: 500-505. PMID: 41726483, PMCID: PMC12919538.
Peer-Reviewed Original Research
Concepts: Language model, Semantic map, Synthetic data, Clinical information extraction, Information extraction system, Synthetic data generation, Synthetic data creation, Entity recognition, Anomaly detection, Human annotators, Information extraction, Clinical NLP, Data utility, Clinical text, Electronic health records, Performance gains, Iterative verification, Annotation challenges, Data creation, SNOMED-CT, Model generalizability, Model performance, Data generation, Health records, Experimental results
Social determinants of health extraction from clinical notes across institutions using large language models
Keloth V, Selek S, Chen Q, Gilman C, Fu S, Dang Y, Chen X, Hu X, Zhou Y, He H, Fan J, Wang K, Brandt C, Tao C, Liu H, Xu H. Social determinants of health extraction from clinical notes across institutions using large language models. Npj Digital Medicine 2025, 8: 287. PMID: 40379919, PMCID: PMC12084648, DOI: 10.1038/s41746-025-01645-8.
Peer-Reviewed Original Research
CDEMapper: enhancing National Institutes of Health common data element use with large language models
Wang Y, Huang J, He H, Zhang V, Zhou Y, Hao X, Ram P, Qian L, Xie Q, Weng R, Lin F, Hu Y, Cui L, Jiang X, Xu H, Hong N. CDEMapper: enhancing National Institutes of Health common data element use with large language models. Journal Of The American Medical Informatics Association 2025, 32: 1130-1139. PMID: 40332956, PMCID: PMC12202029, DOI: 10.1093/jamia/ocaf064.
Peer-Reviewed Original Research
Concepts: Data elements, Recommendation accuracy, Semantic search, Language model, Usability testing, Manual annotation, Data interoperability, Human review, Evaluation results, BM25, Map services, Research reproducibility, Mapping tool, National Institute, Core module, Usability, Data, Embedding, Streamlined pipeline, CDE recommendation, Elasticsearch, Interoperability, Rankers, Users, Value sets
Get In Touch
Contacts
Locations
100 College Street
Academic Office
New Haven, CT 06510