Everyone (Public)

NLP/LLM Interest Group: RDMA

"Cost Effective Agent-Driven Rare Disease Discovery within Electronic Health Record Systems"

Talk Title: RDMA: Cost Effective Agent-Driven Rare Disease Discovery within Electronic Health Record Systems

Abstract: Rare diseases affect 1 in 10 Americans, yet standard ICD coding systems fail to capture these conditions in electronic health records (EHR), leaving crucial information about rare diseases, their clinical presentations, and phenotypic patterns buried in unstructured clinical notes. Current automated extraction approaches struggle with medical abbreviations, miss implicit phenotype mentions, raise privacy concerns through cloud processing, and lack the clinical reasoning abilities needed for accurate identification of rare disease presentations in human patients. We present Rare Disease Mining Agents (RDMA), a framework that mirrors how clinical experts approach rare disease identification by systematically connecting clinical observations directly to standardized ontologies like Orphanet and Human Phenotype Ontology. RDMA handles clinical abbreviations, recognizes implicit phenotype patterns, and applies contextual reasoning locally on standard hardware to extract and code rare disease information with supporting textual evidence. This approach reduces privacy risks while improving F1 performance by over 30% and decreasing inference costs 10-fold, achieving high precision (89%) in rare disease mining and coding. By enabling clinicians to access systematically coded rare disease information with explicit evidence from EHR systems without cloud-based privacy risks, RDMA supports identification and documentation of rare conditions. Available at https://github.com/jhnwu3/RDMA.


John Wu is a Ph.D. student in the CS department at the University of Illinois, currently advised by Professor Jimeng Sun. His main focus is on building agentic systems for healthcare settings, spanning low-resource domains (e.g., rare diseases), interpretability, and clinical predictive modeling. He actively maintains PyHealth and leads a community of open-source researchers working to build more reproducible healthcare solutions. His work is currently supported by the NSF GRFP.

Speaker

  • University of Illinois

    John Wu
    PhD Student in CS Department

Admission

Free

Event Type

Lectures and Seminars

Next upcoming occurrences of this event

Monday, Nov 10, 2025