This website provides tools to detect and quantify selection from mutated B cell immunoglobulin (Ig) sequences. It implements a statistical framework for Bayesian estimation of Antigen-driven SELectIoN (BASELINe) based on the analysis of somatic mutation patterns. A complete description of the method is available in (Yaari et al., Nucleic Acids Res, 2012). Our previous method, the Focused Z test, developed in (Uduman et al., 2011) and (Hershberg et al., 2008), is available here.

This website provides models of somatic hypermutation (SHM) targeting and nucleotide substitution constructed from high-throughput B cell immunoglobulin (Ig) sequencing data. Source code to construct and visualize these models is also available. The S5F model is constructed using Synonymous mutations in 5-mer motifs of Functional Ig sequences. Version 07312013.1 is based on >800,000 mutations.
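
To illustrate what tallying mutations by 5-mer motif means, the following is a minimal sketch and not the S5F source code. It assumes a germline sequence and an observed sequence that are already aligned to the same length, and it counts every mismatch by the germline 5-mer centered on the mutated position. The function name `fivemer_mutation_counts` and the toy sequences are hypothetical. The sketch does not filter for synonymous mutations or for functional sequences, both of which the S5F model requires.

```python
from collections import Counter


def fivemer_mutation_counts(germline, observed):
    """Count mutations keyed by (germline 5-mer, substituted base).

    Hypothetical helper for illustration only: it omits the synonymous
    and functional-sequence filters that the S5F model applies.
    """
    if len(germline) != len(observed):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    # Positions without two flanking bases on each side have no full 5-mer.
    for i in range(2, len(germline) - 2):
        if germline[i] != observed[i]:
            motif = germline[i - 2:i + 3]  # 5-mer centered on position i
            counts[(motif, observed[i])] += 1
    return counts


# Toy sequences (made up): a single C>T change at index 6.
germline = "AGCTAGCTAACG"
observed = "AGCTAGTTAACG"
counts = fivemer_mutation_counts(germline, observed)
```

Normalizing such counts by how often each 5-mer occurs in the germline sequences would give a relative mutability per motif; that step is left out here.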

pRESTO (REpertoire Sequencing TOolkit) is an integrated collection of platform-independent Python modules for processing raw reads from high-throughput (next-generation) sequencing of lymphocyte repertoires. pRESTO processes raw sequences to produce error-corrected, sorted and annotated sequence sets, along with a wealth of metrics at each step. Example workflows for Roche 454 and Illumina (MiSeq) platforms are available.

This R/Bioconductor package implements the Quantitative Set Analysis for Gene Expression (QuSAGE) method described in (Yaari et al., Nucleic Acids Res, 2013). QuSAGE is an alternative to existing gene set methods, such as GSEA, and provides a faster, more accurate, and easier-to-understand test for gene expression studies. QuSAGE accounts for inter-gene correlations and quantifies gene set activity with a complete probability density function (PDF). P values and confidence intervals can be extracted directly from this PDF. Preserving the PDF also allows for post-hoc analysis (e.g., pair-wise comparisons of gene set activity) while maintaining statistical traceability.
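
As a generic numerical illustration of how a P value and a confidence interval can be read off a PDF, the sketch below works on a density sampled on an evenly spaced grid. It is written in Python rather than R, it does not use or reproduce the QuSAGE package, and the function `summarize_pdf`, the two-sided P value definition, and the example normal density are all assumptions made for illustration.

```python
import math


def summarize_pdf(xs, density, null_value=0.0, level=0.95):
    """Return (two-sided P value, (lower, upper)) from a gridded PDF.

    Hypothetical helper: xs must be evenly spaced, density holds the PDF
    height at each grid point.
    """
    dx = xs[1] - xs[0]
    total = sum(density) * dx
    probs = [d * dx / total for d in density]  # normalized probability mass

    # Mass strictly below the null value, then doubled for a two-sided test.
    below = sum(p for x, p in zip(xs, probs) if x < null_value)
    p_value = min(1.0, 2.0 * min(below, 1.0 - below))

    # Equal-tailed interval from the cumulative distribution.
    alpha = 1.0 - level
    lower = upper = None
    running = 0.0
    for x, p in zip(xs, probs):
        running += p
        if lower is None and running >= alpha / 2:
            lower = x
        if upper is None and running >= 1.0 - alpha / 2:
            upper = x
    return p_value, (lower, upper)


# Example density (assumed): normal with mean 1.0 and standard deviation 0.5.
xs = [-2.0 + 0.01 * i for i in range(601)]
density = [math.exp(-0.5 * ((x - 1.0) / 0.5) ** 2) for x in xs]
p_value, (ci_low, ci_high) = summarize_pdf(xs, density)
```

Because the whole density is kept, the same grid can be reused for other summaries, such as comparing two gene sets, without refitting.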

Cell subset prediction for blood genomic studies (SPEC) is a computational method to predict the cellular source for a pre-defined list of genes (i.e., a gene signature) using gene expression data from total PBMCs. Details of the method are described in (Bolen et al., BMC Bioinformatics, 2011).

The TIme-Dependent Activity Linker (TIDAL) generates a transcription factor regulatory network from time-series gene expression data. It identifies the transcription factors that are active at each time point in your data and links them in a coherent cascade that can be visualized. Details of the method are described in (Zaslavsky et al., BMC Bioinformatics, 2013) and (Zaslavsky et al., Journal of Immunology, 2010).

The PRIME Database (PrimeD) is a database of the human dendritic cell (DC) transcriptional response to viral infection.