Lin, Chang Place 5th in International Natural Language Processing Competition

November 25, 2019

Eric Lin, MD, a fourth-year resident in the Yale Department of Psychiatry, and David Chang, a student in Yale's Computational Biology & Bioinformatics program, placed fifth in the National Natural Language Processing (NLP) Clinical Challenges (n2c2) Clinical Semantic Textual Similarity (STS) Task.

The competition was designed to further the development of automated methods to better process and aggregate data from increasingly bloated electronic health records for clinical use and research. Of the 78 international university and industry teams that registered, 33 teams completed the challenge. Notable participants included teams from IBM Watson Health Research (which placed first) and NCBI (last year's champion, which placed second this year).

Working under Professors Cynthia Brandt, MD, MPH, of the Yale Center for Medical Informatics and Andrew Taylor, MD, MHS, of Emergency Medicine, the Yale team drew on clinical domain expertise to design a deep learning NLP model that estimates the semantic similarity between annotated electronic health record (EHR) sentence pairs. The dataset was developed by Mayo Clinic and Harvard's Department of Biomedical Informatics.

Like many of the other top performers, the Yale team's NLP model drew on several recent AI research techniques, including transfer learning for deep learning-based language models (clinicalBERT), graph convolutional networks for structured medical knowledge, knowledge distillation, and data augmentation.

Lin and Chang also presented a poster describing the machine learning model architecture and methods at the n2c2 workshop at the American Medical Informatics Association 2019 Annual Symposium in November in Washington, DC. Teams from more than five countries convened at the workshop to discuss the cutting-edge techniques used across n2c2's four challenges.

Submitted by Christopher Gardner on November 26, 2019

