Several researchers at Biomedical Informatics & Data Science are interested in exploring natural language processing (NLP) in biomedicine. In this article, four of these scientists explain what NLP means for their research and share perspectives on the opportunities of this fast-growing field.
Kei-Hoi Cheung: Harnessing the Power of Big Healthcare Data
Dr. Kei-Hoi Cheung is a renowned researcher and educator, and Professor of Biomedical Informatics & Data Science at Yale. Dr. Cheung has also co-edited two books about the “semantic web” (a framework to make Internet data machine-readable). Recently, Dr. Cheung has begun NLP projects on annotating, extracting, and retrieving information from clinical text that is part of the Veterans Administration’s electronic medical records.
In the digital world of healthcare, electronic health records (EHRs) have given rise to big patient data, covering everything from patients’ diagnoses and social determinants of health to lab test results, drug prescriptions, and medical histories. However, the bulk of patient data is not structured; it exists as clinical notes in free-text form. Because traditional healthcare analytics has relied predominantly on structured data, a wealth of clinical data remains buried and unused as free text. We call this buried data “dark data”. Mining large volumes of clinical notes to bring this “dark data” to light is a major challenge in data science. Excitingly, natural language processing (NLP) has emerged as a tool that can help us overcome this challenge. NLP is a field of artificial intelligence, built largely on machine learning, that enables computers to automatically interpret, manipulate, and comprehend human language.