Current Research Areas

What are we working on? Privacy-protecting Data Sharing, Distributed Analytics and AI Model Evaluation, Biomedical Data Index, and more.

  • Privacy-protecting Data Sharing

    Patients have different preferences about sharing their electronic health records (EHRs) for research. We study ways to share these records with researchers while protecting patient privacy. Several solutions exist, ranging from transforming data to make them harder to re-identify, to policies that regulate how data can be used for various purposes.

  • Distributed Analytics and AI Model Evaluation

    Repositories holding large amounts of data from patients or study participants may require researchers to compute within the repositories' enclaves. To build and/or evaluate models across several repositories, we develop new approaches for distributed data analysis. From model construction and testing to internal and external validation, our methods produce results that are, in most cases, indistinguishable from those that would have been obtained by centralizing all data in one place. Our distributed model evaluation methods address both classification performance and model calibration. We are also particularly interested in bias that can be introduced by certain models.

  • Biomedical Data Index

    To promote scientific discovery, BIDS faculty members are developing technologies, standards, and policies that align with the principles of making biomedical data findable, accessible, interoperable, and reusable (FAIR). Our team has created DataMed, a biomedical data discovery index that simplifies searching for and accessing diverse types of biomedical datasets from a single platform. These initiatives aim to make it easy for researchers and machines to locate, understand, and use biomedical data efficiently, regardless of origin. Please feel free to contact Dr. Lucila Ohno-Machado and Dr. Hua Xu if you are interested in our work.

  • NLP

    Within the biomedical domain, a significant amount of detailed information is embedded in narrative documents, such as clinical notes of patients and biomedical articles. Natural language processing (NLP) technology provides an efficient way to automatically extract and standardize information from these texts, and has been widely used to support a range of applications, such as real-world studies using electronic health records, clinical decision support systems to improve care quality and safety, and literature mining to accelerate scientific discovery. BIDS faculty members are leading experts in biomedical NLP, having developed novel algorithms and useful tools in this area, which have been successfully applied to diverse applications in clinical research and practice. For more information on this topic, please contact Dr. Hua Xu.

  • Semantic web/ontology

    As biomedical research becomes increasingly data-driven, vast amounts of structured and unstructured data have been generated and stored in numerous databases. However, these databases are disconnected silos with heterogeneous formats and interfaces, which makes data integration and interoperability challenging. Ontologies and the semantic web are key technologies for transforming these unconnected databases into connected knowledgebases, because they enable systems-level data integration and knowledge-driven querying. This transformation will also increase data FAIRness. BIDS faculty members are leading experts in the exploration and application of advanced database technologies, semantic technologies, and ontologies in the biomedical domain. Please contact Dr. Kei-Hoi Cheung if you are interested in this topic.
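
As an illustration of the Distributed Analytics and AI Model Evaluation area described above, here is a minimal sketch of how classification performance and calibration might be evaluated across sites without moving patient-level data. It is not our published method: the names `SiteSummary`, `summarize_site`, `pool`, and `metrics`, the 0.5 threshold, the five equal-width bins, and the toy data are all assumptions made for this example. Each site shares only aggregate counts, and the pooled metrics match what a centralized computation on the same records would give.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class SiteSummary:
    """Aggregate counts a site could share instead of patient-level rows (hypothetical format)."""
    tp: int
    fp: int
    tn: int
    fn: int
    bin_n: List[int]             # number of records per probability bin
    bin_pred_sum: List[float]    # sum of predicted probabilities per bin
    bin_event_sum: List[int]     # number of observed events per bin


def summarize_site(probs: Sequence[float], labels: Sequence[int],
                   threshold: float = 0.5, n_bins: int = 5) -> SiteSummary:
    """Runs locally at one site; only the returned summary leaves the site."""
    s = SiteSummary(0, 0, 0, 0, [0] * n_bins, [0.0] * n_bins, [0] * n_bins)
    for p, y in zip(probs, labels):
        if p >= threshold:
            if y == 1:
                s.tp += 1
            else:
                s.fp += 1
        else:
            if y == 1:
                s.fn += 1
            else:
                s.tn += 1
        b = min(int(p * n_bins), n_bins - 1)  # equal-width bin index
        s.bin_n[b] += 1
        s.bin_pred_sum[b] += p
        s.bin_event_sum[b] += y
    return s


def pool(summaries: Sequence[SiteSummary]) -> SiteSummary:
    """Runs at the coordinator: element-wise sum of the site summaries."""
    n_bins = len(summaries[0].bin_n)
    out = SiteSummary(0, 0, 0, 0, [0] * n_bins, [0.0] * n_bins, [0] * n_bins)
    for s in summaries:
        out.tp += s.tp
        out.fp += s.fp
        out.tn += s.tn
        out.fn += s.fn
        for b in range(n_bins):
            out.bin_n[b] += s.bin_n[b]
            out.bin_pred_sum[b] += s.bin_pred_sum[b]
            out.bin_event_sum[b] += s.bin_event_sum[b]
    return out


def _ratio(num: float, den: float) -> float:
    return num / den if den else float("nan")


def metrics(s: SiteSummary) -> Dict[str, float]:
    """Sensitivity, specificity, and expected calibration error (ECE) from counts."""
    total = sum(s.bin_n)
    ece = 0.0
    for n, pred_sum, event_sum in zip(s.bin_n, s.bin_pred_sum, s.bin_event_sum):
        if n:
            # |mean predicted probability - observed event rate|, weighted by bin size
            ece += abs(pred_sum / n - event_sum / n) * n / total
    return {
        "sensitivity": _ratio(s.tp, s.tp + s.fn),
        "specificity": _ratio(s.tn, s.tn + s.fp),
        "ece": ece,
    }


# Toy data for two hypothetical sites (made up for illustration only).
site_a = ([0.1, 0.4, 0.8, 0.9], [0, 0, 1, 1])
site_b = ([0.2, 0.6, 0.7, 0.3], [0, 1, 0, 1])

distributed = metrics(pool([summarize_site(*site_a), summarize_site(*site_b)]))
centralized = metrics(summarize_site(site_a[0] + site_b[0], site_a[1] + site_b[1]))
print(distributed)
print(centralized)
```

Threshold-based counts and binned calibration sums are additive across sites, which is why pooling them reproduces the centralized result. Rank-based measures such as the area under the ROC curve do not decompose this simply and would need a different protocol.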