Huan He, PhD
Research Scientist in Biomedical Informatics and Data Science
Publications
Featured Publications
VUSphere: Visual Analysis of Video Utilization in Online Distance Education
He H, Zheng O, Dong B. VUSphere: Visual Analysis of Video Utilization in Online Distance Education. 2018, 00: 25-35. DOI: 10.1109/vast.2018.8802383.
Peer-Reviewed Original Research
Concepts: Online distance education; Distance education; Learning process; Video utilization; Multiple perspectives; User log data; Log data; Learning Center; Visual analytics system; Program curriculum; Course videos; Student population; Professional knowledge; Quality of service; Students; Massive videos; Coordinated views; Domain experts; Analytics system; Education; Real datasets; Comparable indicators; Video; Utilization statistics; Comparison view
Towards User-centered Corpus Development: Lessons Learnt from Designing and Developing MedTator.
He H, Fu S, Wang L, Wen A, Liu S, Moon S, Miller K, Liu H. Towards User-centered Corpus Development: Lessons Learnt from Designing and Developing MedTator. AMIA Annual Symposium Proceedings 2023, 2022: 532-541. PMID: 37128369, PMCID: PMC10148300.
Peer-Reviewed Original Research
Concepts: Annotation tool; Web-based annotation tool; Natural language processing systems; Corpus development; User-centered interfaces; Data security concerns; Language processing system; Target end users; Annotation task; Clinical NLP; Annotation process; End users; Security concerns; Lightweight solution; Powerful features; Advanced features; Processing system; Core task; Tool development; Task; Corpus; Annotators; Tool; NLP; Users
MedTator: A Serverless Web-based Tool for Corpus Annotation
He H, Fu S, Wang L, Wen A, Liu S, Liu H. MedTator: A Serverless Web-based Tool for Corpus Annotation. 2022, 00: 530-531. DOI: 10.1109/ichi54592.2022.00099.
Peer-Reviewed Original Research
2025
Collaborative large language models for automated data extraction in living systematic reviews
Khan M, Ayub U, Naqvi S, Khakwani K, Sipra Z, Raina A, Zhou S, He H, Saeidi A, Hasan B, Rumble R, Bitterman D, Warner J, Zou J, Tevaarwerk A, Leventakos K, Kehl K, Palmer J, Murad M, Baral C, Riaz I. Collaborative large language models for automated data extraction in living systematic reviews. Journal of the American Medical Informatics Association 2025, ocae325. PMID: 39836495, DOI: 10.1093/jamia/ocae325.
Peer-Reviewed Original Research
2024
Visual Explanation of the Assessment of Certainty of Evidence for Systematic Review and Meta-analysis
Naqvi S, Faisal K, Imran M, Khan M, Khakwani K, Murad M, He H, Bin Riaz I. Visual Explanation of the Assessment of Certainty of Evidence for Systematic Review and Meta-analysis. 2024, 00: 10-14. DOI: 10.1109/vahc65315.2024.00010.
Peer-Reviewed Original Research
A taxonomy for advancing systematic error analysis in multi-site electronic health record-based clinical concept extraction
Fu S, Wang L, He H, Wen A, Zong N, Kumari A, Liu F, Zhou S, Zhang R, Li C, Wang Y, St Sauver J, Liu H, Sohn S. A taxonomy for advancing systematic error analysis in multi-site electronic health record-based clinical concept extraction. Journal of the American Medical Informatics Association 2024, 31: 1493-1502. PMID: 38742455, PMCID: PMC11187420, DOI: 10.1093/jamia/ocae101.
Peer-Reviewed Original Research
Concepts: Natural language processing; Clinical concept extraction; Electronic health records; Concept extraction; Clinical natural language processing; Concept extraction task; Natural language processing models; Electronic health record settings; Domain-specific knowledge; Error taxonomy; Heterogeneity of electronic health records; Real-world data; Analysis process; Extraction task; OWL format; Annotated examples; Error analysis; NLP models; Community feedback; Language processing; Multi-site setting; Conduct error analysis; Error classes; Error types; Error analysis process
2023
An open natural language processing (NLP) framework for EHR-based clinical research: a case demonstration using the National COVID Cohort Collaborative (N3C)
Liu S, Wen A, Wang L, He H, Fu S, Miller R, Williams A, Harris D, Kavuluru R, Liu M, Abu-el-Rub N, Schutte D, Zhang R, Rouhizadeh M, Osborne J, He Y, Topaloglu U, Hong S, Saltz J, Schaffter T, Pfaff E, Chute C, Duong T, Haendel M, Fuentes R, Szolovits P, Xu H, Liu H. An open natural language processing (NLP) framework for EHR-based clinical research: a case demonstration using the National COVID Cohort Collaborative (N3C). Journal of the American Medical Informatics Association 2023, 30: 2036-2040. PMID: 37555837, PMCID: PMC10654844, DOI: 10.1093/jamia/ocad134.
Peer-Reviewed Original Research
Concepts: Natural language processing; NLP models; Clinical natural language processing; Natural language processing framework; EHR-based clinical research; Multi-site setting; Symptom extraction; Processing framework; NLP framework; Language processing; NLP solution; Multi-site data; Algorithm robustness; Methodology advancements; Research community; Translational research community; National COVID Cohort Collaborative; Case demonstration; Process heterogeneity; Framework; Annotation; COVID cohort
The IMPACT framework and implementation for accessible in silico clinical phenotyping in the digital era
Wen A, He H, Fu S, Liu S, Miller K, Wang L, Roberts K, Bedrick S, Hersh W, Liu H. The IMPACT framework and implementation for accessible in silico clinical phenotyping in the digital era. npj Digital Medicine 2023, 6: 132. PMID: 37479735, PMCID: PMC10362064, DOI: 10.1038/s41746-023-00878-9.
Peer-Reviewed Original Research
Concepts: Algorithm development process; Ease of adoption; Digital health applications; Software applications; Phenotyping tasks; Example implementation; Interested users; Reusable tools; Manual abstraction; Such tasks; Development process; Health applications; Digital era; Foundational requirement; Cost requirements; Task; Implementation; Silico means; Framework; Potential solutions; Requirements; Users; Dataset; Abstraction; Applications
Acquisition of a Lexicon for Family History Information: Bidirectional Encoder Representations From Transformers–Assisted Sublanguage Analysis
Wang L, He H, Wen A, Moon S, Fu S, Peterson K, Ai X, Liu S, Kavuluru R, Liu H. Acquisition of a Lexicon for Family History Information: Bidirectional Encoder Representations From Transformers–Assisted Sublanguage Analysis. JMIR Medical Informatics 2023, 11: e48072. PMID: 37368483, PMCID: PMC10337517, DOI: 10.2196/48072.
Peer-Reviewed Original Research
Concepts: Challenge data set; Information extraction; Clinical decision support applications; Natural language processing systems; Unified Medical Language System; Lexical resources; Decision support applications; Bidirectional Encoder Representations; Language processing system; Concept unique identifiers; Data sets; Transformer-based method; Data analytics; Electronic health records; F1 score; Clinical notes; Medicine Clinical Terms; Encoder Representations; Support applications; Unique identifiers; Processing system; Sublanguage analysis; History information; Reasonable performance; Systematized Nomenclature
Extractive Clinical Question-Answering With Multianswer and Multifocus Questions: Data Set Development and Evaluation Study
Moon S, He H, Jia H, Liu H, Fan J. Extractive Clinical Question-Answering With Multianswer and Multifocus Questions: Data Set Development and Evaluation Study. JMIR AI 2023, 2: e41818. PMID: 38875580, PMCID: PMC11041481, DOI: 10.2196/41818.
Peer-Reviewed Original Research
Concepts: Data sets; National NLP Clinical Challenges; Natural language processing applications; Artificial intelligence solutions; Language processing applications; Intelligence solutions; Question Answering; Multiple answers; Processing applications; Baseline solution; Realistic scenarios; Entire data set; Patient-specific questions; Adjacent sentences; Clinical notes; Set Development; Set; Research efforts; Evaluation study; Focus point