- April 29, 2026
BIDS Graduates First CB&B Master's Cohort
- April 23, 2026
Grants Awarded to the Department of Biomedical Informatics & Data Science (Spring 2026)
- April 06, 2026
2026 Yale Medical AI Symposium
Clinical Natural Language Processing (NLP) Lab
Our lab is dedicated to advancing natural language processing (NLP) through the development of novel methods, robust software, and real-world applications across a range of biomedical texts, including clinical notes, scientific literature, and social media. These three areas are closely interconnected: innovative methods inform the creation of widely used software; that software supports clinical applications; and insights from those applications highlight new challenges, guiding the development of future methods. Together, they form a dynamic and collaborative ecosystem that drives our research in clinical NLP.
Upcoming Events
Everyone | Younjoon Chung | NLP/LLM Interest Group
Energy-based UDA for Medical Imaging
Abstract: Deep learning models have achieved expert-level performance in diagnosing various ophthalmic conditions using imaging modalities like color fundus photography (CFP). However, these models operate under the assumption that the training and test data are drawn from an identical distribution. When this assumption is violated by covariate shifts (e.g., varying imaging protocols, camera hardware, field-of-view differences, patient demographics), performance degrades substantially. Unsupervised Domain Adaptation (UDA) addresses this problem by adapting models using unlabeled target data. Existing UDA approaches typically align feature distributions using adversarial learning or entropy-based objectives driven by softmax probabilities. However, softmax normalizes logit magnitudes, which may obscure distributional shifts and cause falsely overconfident predictions. In this study, we propose Class-Conditional Energy Alignment, which adapts source-trained classifiers by matching energy computed directly from unnormalized logits across source and target domains.
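As a rough illustration of the point about softmax and logit magnitudes, the sketch below compares softmax probabilities with a commonly used free-energy score, E = -T * logsumexp(logits / T). This is only an assumption about what "energy computed from unnormalized logits" could look like; the abstract does not specify the Class-Conditional Energy Alignment objective, and it is not reproduced here. The function names, the temperature parameter, and the logit values are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def energy(logits, temperature=1.0):
    """Free-energy score E = -T * logsumexp(logits / T) (assumed form)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(z - m) for z in scaled))
    return -temperature * lse

# Hypothetical logits: the "target" sample has the same shape as the
# "source" sample, but every logit is shifted down by a constant.
source_like = [6.0, 1.0, 0.0]
target_like = [z - 4.0 for z in source_like]

p_src, p_tgt = softmax(source_like), softmax(target_like)
e_src, e_tgt = energy(source_like), energy(target_like)

print("softmax (source):", [round(p, 4) for p in p_src])
print("softmax (target):", [round(p, 4) for p in p_tgt])
print("energy gap (target - source):", round(e_tgt - e_src, 4))
```

In this toy case the two softmax vectors are identical because softmax is invariant to adding a constant to every logit, while the energy score differs by exactly the size of the shift. That is one way a magnitude change can be invisible to softmax-based objectives yet visible to an energy-based one.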
Younjoon Chung is a Ph.D. student in Computational Biology and Biomedical Informatics (CBB) at Yale University, advised by Prof. Qingyu Chen and Prof. Lucila Ohno-Machado. His research interests lie at the intersection of machine learning, computer vision, and healthcare. Specifically, he focuses on developing robust domain adaptation techniques so that medical AI models can generalize across diverse clinical environments, including variations in patient populations and imaging hardware.
Virtual - Join our mailing list to receive Zoom Passcode: https://mailman.yale.edu/mailman/listinfo/nlp-llm-ig
- Everyone | TENTATIVE | Speakers to be announced.
Everyone | Mihaela Aslan, PhD | BIDS Grand Rounds
Learning Health Systems: Bridging Clinical Research to Practice through Medical Informatics
Abstract: Medical informatics is the critical bridge connecting clinical research to practice within a Learning Healthcare System (LHS). This presentation highlights the core bioinformatics capabilities and activities of the VA CSP-CERC center in the context of establishing a quality management system using large observational databases and causal inference research to evaluate the effectiveness and safety of treatment strategies, aiding medical professionals in patient care decisions. This involves leveraging high-quality VA data with scalable advanced statistical methods, especially when randomized clinical trials are impractical.
Speaker Bio: Mihaela Aslan, PhD, is a Senior Research Scientist at the Yale School of Medicine and the Director of the Veterans Affairs Cooperative Studies Program Clinical Epidemiology Research Center (CSP-CERC). As a mathematical statistician, she conducts methodological research using high-dimensional data on a wide range of clinical topics, such as causal inference methods leveraging electronic medical records to emulate randomized clinical trials.
CME accredited seminar. Information for claiming credit will be provided at the start of the session.
- Everyone | TENTATIVE | Speakers to be announced.
Past Events
Everyone | Brian Ondov, PhD | NLP/LLM Interest Group
Illuminating Literature Trends with Time-Structured Manifold Projections
Abstract: With a million scientific papers yearly, methods for organizing and exploring research are more important than ever. Though semantic embeddings combined with manifold learning algorithms like t-SNE or UMAP produce useful maps, these ignore publication date, which is crucial for understanding research progression. As a remedy, we develop a time-aware dimension reduction algorithm that groups similar papers while directly encoding date using radius in polar plots, revealing trends that would not be visible with standard t-SNE- or UMAP-based maps.
Join our mailing list to receive Zoom Passcode: https://mailman.yale.edu/mailman/listinfo/nlp-llm-ig
Everyone | Qingxia "Cindy" Chen, PhD | BIDS Special Seminar
"Making 'All of Us' More Like All of Us: RAILS Weights and Beyond"
Abstract: National biobanks are transforming medical research, but because participants volunteer, the enrolled cohort may not reflect the broader population, potentially biasing estimates of disease burden and risk. We present a survey-informed weighting approach (RAILS) that rebalances volunteer-based biobank data to better represent the target U.S. population. Applied to the All of Us Research Program using national survey benchmarks, the method improves agreement with population prevalence estimates and supports more reliable phenome-wide profiling and cardiovascular risk assessment. This work offers a practical tool for reducing selection bias and strengthening the population relevance of biobank-based findings.
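To make the idea of rebalancing a volunteer cohort concrete, here is a minimal sketch of plain post-stratification weighting. It is a much simpler technique than RAILS, whose details the abstract does not give, and it is not the speaker's method. The strata, counts, and population shares are invented toy numbers, and the function names are hypothetical.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per stratum = population share / sample share."""
    n = sum(sample_counts.values())
    return {
        g: population_shares[g] / (sample_counts[g] / n)
        for g in sample_counts
    }

def weighted_prevalence(cases, sample_counts, weights):
    """Prevalence estimate after applying per-stratum weights."""
    num = sum(weights[g] * cases[g] for g in cases)
    den = sum(weights[g] * sample_counts[g] for g in sample_counts)
    return num / den

# Toy cohort (invented): older participants are over-represented
# relative to the target population.
sample_counts = {"under_50": 200, "50_plus": 800}
population_shares = {"under_50": 0.6, "50_plus": 0.4}
cases = {"under_50": 10, "50_plus": 160}

weights = poststratification_weights(sample_counts, population_shares)
raw = sum(cases.values()) / sum(sample_counts.values())
adjusted = weighted_prevalence(cases, sample_counts, weights)

print("unweighted prevalence:", round(raw, 3))
print("weighted prevalence:  ", round(adjusted, 3))
```

With these toy numbers the unweighted estimate is pulled toward the over-sampled older stratum, and the weighted estimate equals the population-share average of the per-stratum prevalences. Real survey-informed weighting typically adjusts on many variables at once and must handle sparse strata, which this sketch ignores.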
Qingxia ‘Cindy’ Chen, PhD, is a Professor of Biostatistics, Biomedical Informatics, and Ophthalmology & Visual Sciences at Vanderbilt University Medical Center. She also serves as the Vice Chair of Education in the Department of Biostatistics. Her research spans both novel statistical methodology and impactful biomedical research, with a focus on missing data, survival analysis, Bayesian analysis, and the use of electronic health records and multimodal data resources for precision medicine. Additionally, she co-led statistical efforts for the All of Us Research Program at the Data and Research Center and has contributed to initiatives focused on enhancing data quality and developing educational resources.
Everyone | Michael Krauthammer, MD, PhD | NLP/LLM Interest Group
"LLMs in Medical Decision-Making"
In this presentation, Michael Krauthammer, MD, PhD, investigates the transition of Large Language Models (LLMs) and their underlying technologies into clinical reasoning systems, drawing on research and implementation examples from the University of Zurich and University Hospital of Zurich.
Dr. Krauthammer will first demonstrate the efficacy of Vision Transformers (ViT) and Vision-Language Models (VLM) for image assessment and reporting in rheumatology and radiology. He will then explore the use of LLMs in the Zurich AI tumor board for clinical guideline mapping and counterfactual treatment planning. Finally, he will address the key hurdles to clinical adoption, including stakeholder management, infrastructure requirements, and certification.
Join our mailing list to receive Zoom Passcode: https://mailman.yale.edu/mailman/listinfo/nlp-llm-ig
Everyone | Jessilyn Dunn, PhD | Research in Progress | Rising Star Seminar
The Digital Physiome: Wearables for Disease Detection and Monitoring
Abstract: Digital health is rapidly expanding due to surging healthcare costs, deteriorating health outcomes, and the growing prevalence and accessibility of mobile health and wearable technologies. Recent technological advancements make it possible to closely and continuously monitor individuals using multiple measurement modalities in real time. We are collecting and integrating such wearables data with clinical information to gain a more precise understanding of health and disease and develop actionable, predictive health models for improving outcomes. We are simultaneously developing open source data science and machine learning tools for the digital health community, including the Digital Biomarker Discovery Pipeline (DBDP), to facilitate the use of mobile device data in healthcare.
Jessilyn Dunn, PhD, is an Associate Professor of Biomedical Engineering and Biostatistics & Bioinformatics at Duke University. She directs the BIG IDEAs Lab, which is focused on digital health innovation, wearable sensors, and the development and validation of AI-driven digital biomarkers. Dr. Dunn is the Principal Investigator of research initiatives funded by the NIH, NSF, and FDA which are developing digital biomarkers of conditions ranging from pre- and type 2 diabetes to influenza-like illness to Opioid Use Disorder. Dr. Dunn was an NIH Big Data to Knowledge (BD2K) Postdoctoral Fellow at Stanford and an NSF Graduate Research Fellow at Georgia Tech and Emory, as well as a visiting scholar at the US Centers for Disease Control and Prevention and the National Cardiovascular Research Institute in Madrid, Spain. Her work has been internationally recognized with media coverage from the NIH Director’s Blog to Wired, Time, and US News and World Report. Dr. Dunn sits on the Google Consumer Health Advisory Panel and is a recipient of the NSF CAREER Award and the IEEE EMBS Early Career Achievement Award for her leadership and innovation across engineering and medicine.
This week's seminar will be held virtually on Zoom.
Principal Investigator