Biostatistics and Bioinformatics

PURPOSE and SCOPE: The bioinformatics service is a function of the Yale Center for Analytical Sciences (YCAS), which serves as the Biostatistics Shared Resource (BSR) for Yale Cancer Center (YCC). The BSR is a highly interactive team of cancer biostatisticians who work collaboratively with basic, clinical, translational and population science researchers to advance the frontiers of cancer medicine and public health. YCC, in conjunction with YCAS, provides for the biostatistical needs of the entire center.

The services encompass bioinformatics support to YCC members on study design, analysis, and grant and manuscript preparation. We aim to provide high-quality, cutting-edge and custom data analyses, as well as consultation and training/education for high-throughput genomics, transcriptomics, proteomics, and other high-throughput data sets.

The best mechanism for engaging the BSR is through percent effort inclusion on grants. Absent this, there are limitations on the number of hours that can be dedicated to any individual project. The following guidelines will be used to prioritize the utilization of the BSR and to ensure fair utilization by the entire research community at YCC.
 
PRIORITIES: The priorities listed below will be used to triage studies and determine which will be eligible for BSR utilization. BSR resources are available only to members and associate members of the YCC for cancer-related projects.
These priorities reflect the maximum number of hours that the biostatisticians may be able to dedicate to any particular project without compensation. 

Bioinformatics

Services

  • Array-based platforms, including genotyping arrays, gene expression arrays, copy number arrays, methylation arrays, and ChIP-on-chip arrays.
  • Analysis of next-generation sequencing data, including:
      1. DNA-sequencing analysis, such as single nucleotide variations, indels, copy number variations, translocations and inversions from whole-exome and whole-genome DNA sequencing (DNA-seq)
      2. Gene expression, splice variants, gene-set, and pathway analysis from RNA sequencing (RNA-seq)
      3. microRNA expression from microRNA sequencing (miRNA-seq)
      4. DNA methylation and differential methylation identification from methylation sequencing (methyl-seq)
      5. Transcription factor binding sites and chromatin modifications from ChIP sequencing (ChIP-seq)
      6. RNA binding sites identification from CLIP sequencing (CLIP-seq)
  • Proteomics data analysis
  • Data annotation, visualization, and database integration
  • User training in bioinformatics software and tools

Basic Science Support

Support of basic science (with priorities TBD) will include the following. Please contact Tory at xiangyu.cong@yale.edu for more information.

  • Experimental Design
  • Power calculations
  • Development of randomization schemes
  • Basic analysis (e.g., chi-square, t-test, regression, ANOVA)
  • Graph generation
  • Statistical protocol development
  • Office hours
  • Analytic and research design clinics (to workshop ideas)

Clinical Studies Support 

The following represents the priorities identified by YCC leadership:

  • Investigator-initiated clinical research trials with translational component
  • Investigator-initiated clinical trials
  • Prospective IRB-approved translational trials in a protocol setting
  • Cooperative group multi-center trials
  • Industry-sponsored trials
  • Retrospective studies (laboratory based or otherwise) with clinical correlations

Population Science Support 

The following represents the level of biostatistical support for population science:

  • Study design, including sample size/power calculations and analytic plan
  • Development of randomization schemes
  • Analysis
  • Grant and manuscript preparation 
     

Utilization Limits


Efforts will be made to ensure parity of access for basic, clinical, translational and population science projects. Nonetheless, limits on staffing may dictate that the scale of large jobs be capped. The size of such a cap will be determined by the BSR Oversight Committee and is likely to change over time based on usage and staffing.

Currently, the guidelines for resources provided gratis by YCC are as follows. Additional resource hours will require funding by the investigator or special approval by the BSR Committee through a special application. All investigators are entitled to receive a minimum of 16 hours per year. Limits are as follows:

  • Grant preparation for NIH funding – as many hours as reasonably required to prepare the grant. More than 40 hours requires committee approval.
  • Unfunded analyses, including those for manuscript preparation – 16 hours total per investigator per year.
  • Investigator-initiated trials – 16 hours for study design; as many hours as reasonably required (to be reviewed by the Oversight Committee) to monitor, analyze and publish the study.
  • IRB-approved translational projects – 8 hours for design and 16 hours of analysis per project.
  • Grant preparation for internal (e.g., TTARE) and external (non-NIH) funding – 8 hours.
  • Analyses and manuscript preparation for peer-reviewed and funded grants – 8 hours (or percent effort), as per grant budget.
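
As a rough illustration of how the fixed caps above might be applied, the following Python sketch computes how many requested hours fall beyond the gratis limit for a category. The category labels, the dictionary, and the function name are hypothetical and chosen only for this example; categories described as "as many hours as reasonably required" are omitted because the text gives no fixed cap for them.

```python
# Fixed gratis caps (in hours) copied from the list above.
# The category labels are hypothetical names used only in this sketch.
GRATIS_HOUR_LIMITS = {
    "unfunded analysis (per investigator per year)": 16,
    "investigator-initiated trial design": 16,
    "IRB-approved translational project design": 8,
    "IRB-approved translational project analysis": 16,
    "internal or non-NIH grant preparation": 8,
}


def hours_beyond_gratis_limit(category, requested_hours):
    """Return the requested hours that exceed the gratis cap for a category.

    Per the text, such hours would need investigator funding or special
    approval. This function only does the arithmetic.
    """
    if requested_hours < 0:
        raise ValueError("requested_hours must be non-negative")
    limit = GRATIS_HOUR_LIMITS[category]
    return max(0, requested_hours - limit)


if __name__ == "__main__":
    category = "IRB-approved translational project analysis"
    print(hours_beyond_gratis_limit(category, 20))
```

Under these assumptions, a request for 20 hours of analysis on an IRB-approved translational project exceeds the 16-hour cap by 4 hours.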
        

Data Management Support

In addition to the support outlined above, support for data management for basic, clinical, translational and population science will be available through the BSR on a fee-for-service basis.

 
 
Publication Credit and External Funding

Standards for academic credit vary among research disciplines. For small-scale routine consultations, BSR does not expect authorship.

Reasonable expectations for larger projects must be guided by a sense of fairness and proportionality and by community standards. The International Committee of Medical Journal Editors provides consensus guidelines on publication credit (http://www.icmje.org/icmje-recommendations.pdf) that can be used as a framework for publication credit discussions.

It is recommended that users of the resource have a direct discussion of these issues in the context of job specification discussions in early phases of the project so that a consensus can be achieved.

Fees

We provide support for cancer-related projects. All Yale Cancer Center (YCC) members are entitled to receive free support for all non-billable services, and a maximum of 4 hours of free support per year. Non-YCC members receive free support for non-billable activities but must pay for all billable activities. YCC members receive priority over non-members.

Non-billable activities: work on grant preparation, including analysis of pilot studies, provided that a member of the YCAS Bioinformatics Core is written into the grant as percent FTE.

Billable activities: all other services.

Fees: $121/hour for FY15.
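
As a rough illustration of the fee arithmetic, the following Python sketch estimates a charge from a number of billable hours at the FY15 rate. It assumes that a YCC member's 4 free hours per year are deducted from billable hours before the rate is applied; the text does not spell this out, so treat that reading, along with the function and parameter names, as hypothetical.

```python
FY15_RATE_PER_HOUR = 121        # dollars per hour, from the fee stated above
MEMBER_FREE_HOURS_PER_YEAR = 4  # maximum free support per year for YCC members


def estimate_fee(billable_hours, is_ycc_member, free_hours_already_used=0):
    """Estimate the charge in dollars for billable work.

    Assumption (not stated in the policy text): a member's remaining free
    hours for the year are subtracted from billable hours before charging.
    """
    if billable_hours < 0 or free_hours_already_used < 0:
        raise ValueError("hours must be non-negative")
    free_remaining = 0
    if is_ycc_member:
        free_remaining = max(0, MEMBER_FREE_HOURS_PER_YEAR - free_hours_already_used)
    chargeable_hours = max(0, billable_hours - free_remaining)
    return chargeable_hours * FY15_RATE_PER_HOUR


if __name__ == "__main__":
    print(estimate_fee(10, is_ycc_member=True))
    print(estimate_fee(10, is_ycc_member=False))
```

Under this reading, 10 billable hours come to 6 x $121 = $726 for a member with no free hours used, and 10 x $121 = $1,210 for a non-member.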

Time Estimation

  • The following estimation assumes no complications (e.g., quality issues) with the data. The actual analysis time heavily depends on data complexity.
  • Turnaround time will depend on the workload.
  • For sequencing data, the service is divided into two parts: hands-on time and computer time. The computer time is estimated with moderate coverage data based on an 8-CPU server. Actual computing time will vary from days to weeks depending on sequencing depth and server condition.

 

Array-based

              gene/miRNA expression   ChIP-on-chip     Methylation array
Array-based   4 hours/comparison      4 hours/sample   4 hours/sample

DNA-seq

                               Mapping            Indel/SNP calling   CNV
Whole-exome    Hands-on time   2 hours/sample     2 hours/sample      2 hours/sample
               Computer time   6-8 hours/sample   3-4 hours/sample    4 hours/sample
Whole-genome   Hands-on time   3 hours/sample     3 hours/sample      3 hours/sample
               Computer time   200 hours/sample   100 hours/sample    100 hours/sample



Methyl-seq

                                  Mapping            Methylation calling   Differential methylation
Targeted (RRBS)   Hands-on time   2 hours/sample     2 hours/sample        2 hours/sample
                  Computer time   6-8 hours/sample   3-4 hours/sample      4 hours/sample
Whole-genome      Hands-on time   3 hours/sample     3 hours/sample        3 hours/sample
                  Computer time   200 hours/sample   100 hours/sample      20 hours/sample

RNA-seq

                Mapping               Gene expression       Pathway/gene-set    Splice analysis
Hands-on time   4 hours/comparison    2 hours/comparison    1 hour/comparison   2 hours/sample
Computer time   1-2 days/comparison   10 hours/comparison   1 hour/comparison   4 hours/sample

ChIP-seq

                Mapping          Peak calling     Differential binding analysis   Motif analysis
Hands-on time   2 hours/sample   2 hours/sample   2 hours/comparison              2 hours/sample
Computer time   5 hours/sample   2 hours/sample   4 hours/comparison              2 hours/sample

CLIP-seq

                Mapping          Differential binding analysis   Motif analysis
Hands-on time   2 hours/sample   2 hours/comparison              2 hours/sample
Computer time   5 hours/sample   4 hours/comparison              2 hours/sample
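
To show how the per-sample and per-comparison figures combine, the following Python sketch totals hands-on hours for a set of analysis steps. The hour values are copied from the RNA-seq and ChIP-seq estimates above; the dictionary layout, step labels, and function name are hypothetical, and the sketch assumes no data complications, as the estimates themselves do.

```python
# Hands-on hours copied from the RNA-seq and ChIP-seq estimates above.
# Each entry is (hours, unit), where unit is "sample" or "comparison".
HANDS_ON_HOURS = {
    "RNA-seq": {
        "mapping": (4, "comparison"),
        "gene expression": (2, "comparison"),
        "pathway/gene-set": (1, "comparison"),
        "splice analysis": (2, "sample"),
    },
    "ChIP-seq": {
        "mapping": (2, "sample"),
        "peak calling": (2, "sample"),
        "differential binding analysis": (2, "comparison"),
        "motif analysis": (2, "sample"),
    },
}


def total_hands_on_hours(assay, steps, n_samples, n_comparisons):
    """Sum hands-on hours for the chosen steps of one assay."""
    total = 0
    for step in steps:
        hours, unit = HANDS_ON_HOURS[assay][step]
        # Scale by samples or comparisons, depending on how the step is quoted.
        count = n_samples if unit == "sample" else n_comparisons
        total += hours * count
    return total


if __name__ == "__main__":
    steps = ["mapping", "gene expression", "pathway/gene-set"]
    print(total_hands_on_hours("RNA-seq", steps, n_samples=6, n_comparisons=2))
```

For example, RNA-seq mapping, gene expression, and pathway analysis for 2 comparisons come to (4 + 2 + 1) x 2 = 14 hands-on hours; computer time is separate and depends on sequencing depth and server condition.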

Oversight Committee 

A committee headed by the Associate Cancer Center Director for Clinical Sciences will be constituted to help judge the suitability of projects, monitor the amount of time spent, and receive requests for additional utilization of the BSR.

The utilization review committee will meet as needed, and at least quarterly, to review utilization. The committee will consist of the following:

David Stern, MD, Director of Shared Resources
Howard Hochster, MD, Associate Cancer Center Director for Clinical Sciences 
William Casey King, PhD, Executive Director YCAS
 
Affiliated YCAS Faculty and Staff:
Peter Peduzzi, PhD, Director of YCAS, peter.peduzzi@yale.edu
William Casey King, PhD, Executive Director, casey.king@yale.edu
Xiaopan Yao, PhD, Associate Director of Biostatistics Shared Resource, xiaopan.yao@yale.edu

Contacts:

Biostatistics – xiaopan.yao@yale.edu

Bioinformatics – xiaoqing.yu@yale.edu

Administrative Support:
Alicia Lakomski (203) 737-5946 or alicia.lakomski@yale.edu