
Clinical Natural Language Processing (NLP) Lab

Our lab is dedicated to advancing natural language processing (NLP) through the development of novel methods, robust software, and real-world applications across a range of biomedical texts, including clinical notes, scientific literature, and social media. These three areas are closely interconnected: innovative methods inform the creation of widely used software; that software supports clinical applications; and insights from those applications highlight new challenges, guiding the development of future methods. Together, they form a dynamic and collaborative ecosystem that drives our research in clinical NLP.

Upcoming Events

Dec 29, 2025 (Monday)
Jan 5, 2026 (Monday)
Jan 12, 2026 (Monday)
Jan 19, 2026 (Monday)
Jan 26, 2026 (Monday)
Feb 2, 2026 (Monday)

Past Events

Dec 22, 2025 (Monday)
    Lingfei Qian - Xueqing Peng, PhD

    NLP/LLM Interest Group

    This session will feature two exciting talks:

    1. Accelerating Cohort Identification from EHRs with Biomedical Knowledge and LLMs by Lingfei Qian, PhD

    Abstract: Identifying eligible patients from electronic health records (EHRs) is a key challenge in clinical research. We present a framework that combines large language models (LLMs), Text-to-SQL, and retrieval-augmented generation (RAG) to streamline cohort identification. Eligibility criteria are first decomposed and partially translated into structured queries via Text-to-SQL, providing a preliminary selection from OMOP-formatted EHR data. The core innovation is a RAG/QA component that retrieves and assesses patient-level evidence from both clinical notes and structured tables, emphasizing nuanced evaluation of complex criteria such as disease chronicity, lab thresholds, and clinical stability, while supporting interactive cohort exploration and detailed patient-level evidence review. This workflow reduces manual effort, improves accuracy, and offers a scalable, clinically grounded solution for EHR-based cohort identification.
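The structured half of such a pipeline can be illustrated with a toy Text-to-SQL step. This is only a sketch: a real system would delegate the mapping to an LLM, and the table and column names below (`person`, `person_id`, `age`) are invented for illustration.

```python
import re

def criterion_to_sql(criterion):
    """Toy Text-to-SQL step: map a simple eligibility criterion to a
    query over an OMOP-style person table. Pattern coverage and schema
    names here are illustrative only."""
    m = re.match(r"age\s*(>=|<=|>|<)\s*(\d+)", criterion.strip().lower())
    if m:
        op, value = m.groups()
        return f"SELECT person_id FROM person WHERE age {op} {value}"
    raise ValueError(f"unsupported criterion: {criterion!r}")
```

Criteria that resist structured translation (e.g., "clinically stable for 6 months") would fall through to the RAG/QA stage over clinical notes.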

    2. An Information Extraction Approach to Detecting Novelty of Biomedical Publications by Xueqing Peng, PhD

    Abstract: Scientific novelty plays a critical role in shaping research impact, yet it remains inconsistently defined and difficult to quantify. Existing approaches often reduce novelty to a single measure, failing to distinguish the specific types of contributions (such as new concepts or relationships) that drive influence. In this study, we introduce a semantic measure of novelty based on the emergence of new biomedical entities and relationships within the conclusion sections of research articles. Leveraging transformer-based named entity recognition (NER) and relation extraction (RE) tools, we identify novel findings and classify articles into four categories: No Novelty, Entity-only Novelty, Relation-only Novelty, and Entity-Relation Novelty. We evaluate this framework using citation counts and Journal Impact Factors (JIF) as proxies for research influence. Our results show that Entity-Relation Novelty articles receive the highest citation impact, with relation novelty more closely aligned with high-impact journals. These findings offer a scalable framework for assessing novelty and guiding future research evaluation.
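The four-way categorization described in the abstract reduces to simple set logic once NER/RE output is available. A minimal sketch, assuming novel items are those absent from the prior literature (function and variable names are illustrative, not the authors' code):

```python
def classify_novelty(entities, relations, known_entities, known_relations):
    """Assign one of the four novelty categories from the abstract.

    entities/relations: items extracted from an article's conclusion
    section (e.g., by NER/RE); known_*: everything already reported in
    prior literature. All names here are illustrative."""
    new_entities = set(entities) - set(known_entities)
    new_relations = set(relations) - set(known_relations)
    if new_entities and new_relations:
        return "Entity-Relation Novelty"
    if new_entities:
        return "Entity-only Novelty"
    if new_relations:
        return "Relation-only Novelty"
    return "No Novelty"
```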

Dec 15, 2025 (Monday)
    Yang Ren

    NLP/LLM Interest Group


    A Prompt Library for Efficient Clinical Entity Recognition Using Large Language Models by Yang Ren, PhD

    Abstract: Large Language Models (LLMs) hold strong potential for clinical information extraction (IE), but their evaluation is often limited by manually crafted prompts and the need for annotated data. We developed an automated framework that extracts entity-level schema information from published clinical IE studies to construct structured prompts. Using literature covering 44 diseases and over 100 entities, we generated prompts to evaluate multiple LLMs under few-shot and fine-tuned settings. Compared to baselines using generic prompts, models prompted with schema-derived information consistently outperformed across tasks. Our results demonstrate the value of structured prompting for robust and reproducible LLM evaluation in diverse clinical IE applications.
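As a sketch of how entity-level schema information might be assembled into a structured prompt (the function signature and prompt wording below are hypothetical, not the authors' implementation):

```python
def build_prompt(disease, entity_schema, text):
    """Construct a structured extraction prompt from entity names and
    definitions mined from prior clinical IE studies. Illustrative
    sketch; the real framework derives the schema automatically."""
    lines = [f"Extract the following entities for {disease}:"]
    for name, definition in entity_schema.items():
        lines.append(f"- {name}: {definition}")
    lines.append("Return one JSON object with these entity names as keys.")
    lines.append(f"Text: {text}")
    return "\n".join(lines)
```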

Dec 8, 2025 (Monday)
    Fan Ma - Anran Li

    NLP/LLM Interest Group

    This session will feature two exciting talks:

    1. A Collaborative Reasoning Agent-based Framework with Built-in Verification for Safe Medical Decision-Making by Fan Ma, PhD
      Abstract:
      Large language models (LLMs) have demonstrated expert-level capabilities on medical benchmarks, yet translating these achievements into clinical practice is impeded by persistent risks of hallucination and a lack of verifiable reasoning. While emerging agentic frameworks have begun to address these limitations through multi-step planning, existing systems often prioritize performance optimization over rigorous safety checks and fail to emulate the collective decision-making of multidisciplinary teams. To address these critical gaps, we introduce OpenDx, a multi-agent framework designed to bridge the divide between experimental prototypes and reliable clinical decision support. OpenDx is built upon three core principles: collaboration among specialized agents that simulate distinct clinical roles, integrated verification modules that strictly cross-check outputs for safety and consistency, and an architectural alignment with clinical auditability standards. We present the design and evaluation of OpenDx, demonstrating how structured collaboration significantly enhances reliability compared to baseline models. Our work advocates for a new paradigm of trustworthy medical AI, where performance gains are inseparable from the interpretability and safety assurances required for frontline healthcare deployment.
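The collaborate-then-verify pattern described in the abstract can be sketched at a high level. Everything below is hypothetical: agent interfaces, the majority-vote pooling, and the boolean verifier are placeholders for OpenDx's richer role simulation and verification modules.

```python
def run_panel(case, specialists, verifier):
    """Sketch of a multi-agent decision loop with built-in verification:
    role-specific agents each propose an assessment, the proposals are
    pooled (here, by simple majority), and a verification step must
    pass before any decision is released."""
    proposals = [agent(case) for agent in specialists]
    consensus = max(set(proposals), key=proposals.count)  # majority vote
    if not verifier(case, consensus):
        # Fail closed: surface the failure instead of an unsafe answer.
        return {"decision": None, "flag": "failed verification"}
    return {"decision": consensus, "flag": "ok"}
```

The design point the abstract emphasizes is that the verification gate is integral, not an afterthought: an output that fails cross-checking is withheld rather than reported.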

    2. A Federated and Parameter-Efficient Framework for Large Language Model Training in Medicine: Applications to Clinical Information Extraction by Anran Li, PhD
      Abstract:
      Large language models (LLMs) are advancing medical applications such as patient question answering and diagnosis. Yet extracting structured information from unstructured clinical narratives across healthcare systems remains challenging. Current LLMs struggle with such clinical information extraction (IE) due to complex language, limited annotations, and data silos. We present a federated, model-agnostic framework for training LLMs in medicine, applied to clinical IE. The proposed Fed-MedLoRA enables parameter-efficient federated fine-tuning by transmitting only low-rank adapter parameters, substantially reducing communication and computation costs. Accuracy was assessed across five patient cohorts through comparisons with baselines for LLMs under (1) in-domain training and testing, (2) external patient cohorts, and (3) a case study on new-site adaptation using real-world clinical notes.
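The communication-saving idea behind such a scheme can be sketched with a FedAvg-style server step, assuming the proposed Fed-MedLoRA aggregates adapters by sample-count-weighted averaging (an assumption; the paper's exact aggregation rule may differ). Only the low-rank adapter tensors are exchanged, never the full model:

```python
def aggregate_lora(client_adapters, client_sizes):
    """Server-side aggregation of LoRA adapters in a federated round.

    client_adapters: one dict per site mapping adapter names to flat
    weight lists; client_sizes: per-site sample counts. Weighted
    averaging is an assumption; shapes and names are illustrative."""
    total = sum(client_sizes)
    aggregated = {}
    for name in client_adapters[0]:
        aggregated[name] = [
            sum(adapter[name][i] * n / total
                for adapter, n in zip(client_adapters, client_sizes))
            for i in range(len(client_adapters[0][name]))
        ]
    return aggregated
```

Because the low-rank adapters are a small fraction of the full parameter count, each round's upload shrinks accordingly, which is the source of the communication savings the abstract describes.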


Dec 1, 2025 (Monday)
    Hyunjae Kim - Chia-Hsuan Chang

    NLP/LLM Interest Group

    This session will feature two exciting talks:

    1. Rethinking Retrieval-Augmented Generation for Medicine: A Large-Scale, Systematic Expert Evaluation and Practical Insights by Hyunjae Kim, PhD

    Abstract: Retrieval-augmented generation (RAG) is widely adopted to keep medical LLMs current and verifiable, yet its effectiveness remains unclear. We present the first end-to-end, expert-annotated evaluation of RAG in medicine, systematically assessing the full pipeline across three stages: evidence retrieval, evidence selection, and response generation. Eighteen medical experts provided 80,502 annotations across 800 model outputs on 200 clinical queries.

    Contrary to expectations, conventional RAG often degraded performance: only 22% of retrieved passages were relevant, evidence selection was weak, and factuality dropped by up to 6%. However, simple strategies like evidence filtering and query reformulation improved performance by up to 12%.

    Our findings challenge current RAG assumptions and highlight the need for deliberate system design in medical AI applications.

    2. TopicForest: Embedding-Driven Hierarchical Clustering and Labeling for Biomedical Literature by Chia-Hsuan Chang, PhD

    Abstract: The vast and complex landscape of biomedical literature presents significant challenges for organization and interpretation. Current embedding-based topic models like BERTopic are limited to flat, single-granularity clusters, failing to capture the inherently nested, hierarchical structure of scientific subjects. We introduce TopicForest, a novel framework that captures this natural hierarchy by building a "forest of topic trees" directly from text embeddings.

    TopicForest delivers high-quality topic clustering comparable to state-of-the-art flat models while providing the essential multi-scale resolution they lack. Through recursive topic labeling, the framework achieves efficient token usage and practical scalability for large corpora. This design provides researchers with an effective tool for exploring and visualizing hierarchical biomedical knowledge landscapes.
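The recursive, multi-granularity idea can be illustrated with a toy tree-building routine. This is not TopicForest's algorithm: the real framework clusters embeddings and labels each node with an LLM, whereas the sketch below simply bisects along the axis of highest variance to show how nested topic nodes arise from flat embeddings.

```python
import numpy as np

def topic_tree(embeddings, ids, min_size=2):
    """Recursively split document embeddings into a nested topic tree,
    illustrating the multi-scale structure a topic forest exposes.
    Toy splitting rule; all details are illustrative."""
    if len(ids) <= min_size:
        return {"docs": list(ids), "children": []}
    axis = int(np.argmax(embeddings.var(axis=0)))  # most spread-out dimension
    order = np.argsort(embeddings[:, axis])
    mid = len(ids) // 2
    left, right = order[:mid], order[mid:]
    return {
        "docs": list(ids),
        "children": [
            topic_tree(embeddings[left], [ids[i] for i in left], min_size),
            topic_tree(embeddings[right], [ids[i] for i in right], min_size),
        ],
    }
```

Each node groups the documents of all its descendants, so coarse topics sit near the root and fine-grained subtopics near the leaves, which is the multi-scale resolution flat models cannot provide.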