Research Departments & Organizations
As the number of life science databases and analytic tools increases, the interoperation of these databases and tools has become both important and challenging. We tackle this challenge by exploring efficient and innovative approaches that use ontologies, the Semantic Web, metadata, natural language processing, and high-performance computing. Dr. Cheung has collaborated with many faculty members in different departments and core facilities, including Genetics, Biology, Computer Science, Biostatistics, and the Yale Keck Protein Profiling Facility. His research has been carried out in the context of integrating and analyzing a) genomics data, b) proteomics data, including mass spectrometry (MS) data, c) immunology data, and d) neuroscience data.
Specialized Terms: Genetic database; Tool interoperation
Extensive Research Description
- Yale Protein Expression Database (YPED). YPED is an institution-wide database for use by proteomics researchers at Yale and elsewhere.
- Human Immunology Project Consortium (HIPC). HIPC, established by NIAID, generates a wide variety of phenotypic and molecular data from well-characterized patient cohorts, including genome-wide expression profiling, high-dimensional flow cytometry, and serum cytokine concentrations. Adoption of and adherence to data standards are critical to enable data integration across HIPC centers and to facilitate data reuse by the wider scientific community. A key component of HIPC is its data standardization effort, along with the infrastructure that has been developed to support it.
- Center for Expanded Data Annotation and Retrieval (CEDAR). CEDAR is part of the Big Data to Knowledge (BD2K) initiative funded by NIH. It studies the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse.
- Clinical Natural Language Processing (NLP). A variety of NLP techniques, including the incorporation of ontologies, have been explored to extract and retrieve information from large volumes of clinical notes (unstructured data) in support of clinical research. The domains studied include lung and colon cancer, post-traumatic stress disorder, psychogenic nonepileptic seizures, and chronic pain.
SenseLab Project Austria; China; South Korea (2006 - 2006)
This neuroscience informatics project explores Semantic Web technologies for neuroscience data integration. In 2006, a visiting scholar from Zhejiang University (Hangzhou, Zhejiang, China) came to YCMI to collaborate with Dr. Cheung on the project. A PhD...
Bioinformatics Summer Exchange Program Hong Kong (2005 - 2006)
This summer program was established between the undergraduate Bioinformatics Program at the University of Hong Kong (HKU) and the Yale Center for Medical Informatics (YCMI), which is affiliated with Yale's Computational Biology and Bioinformatics (CBB) PhD program. Prof. Cheung...
Ryslik GA, Cheng Y, Cheung KH, Modis Y, Zhao H. Utilizing protein structure to identify non-random somatic mutations. BMC Bioinformatics 2013, 14:190.
Holford ME, McCusker JP, Cheung KH, Krauthammer M. A semantic web framework to integrate cancer omics data with biological knowledge. BMC Bioinformatics 2012, 13 Suppl 1:S10.
Cheung KH, Samwald M, Auerbach RK, Gerstein MB. Structured digital tables on the Semantic Web: toward a structured digital literature. Molecular Systems Biology 2010, 6:403.
Lam HY, Marenco L, Clark T, Gao Y, Kinoshita J, Shepherd G, Miller P, Wu E, Wong GT, Liu N, Crasto C, Morse T, Stephens S, Cheung KH. AlzPharm: integration of neurodegeneration data using RDF. BMC Bioinformatics 2007, 8 Suppl 3:S4.
Cheung KH, Yip KY, Smith A, Deknikker R, Masiar A, Gerstein M. YeastHub: a semantic web use case for integrating data in the life sciences domain. Bioinformatics (Oxford, England) 2005, 21 Suppl 1:i85-96.