Computational Methods & Tools Developed in the Kleinstein Lab

The Immcantation framework provides a start-to-finish analytical ecosystem for high-throughput adaptive immune receptor repertoire sequencing (AIRR-seq) datasets, with a focus on B cell receptor (BCR) repertoire profiling. Beginning from raw reads, Python and R packages are provided for pre-processing, population structure determination, and repertoire analysis. An overview of AIRR-seq analysis can be found in (Yaari and Kleinstein, 2015).

The Immcantation framework includes:


LogMiNeR (Logistic Multiple Network-constrained Regression) is a method for analyzing high-throughput transcriptional profiling data (e.g., microarray or RNA-seq) in which multiple networks encoding prior knowledge are incorporated within a logistic modeling framework to improve model interpretability. A complete description of the method is available in (Avey et al., 2017).


The qusage R/Bioconductor package implements the Quantitative Set Analysis for Gene Expression (QuSAGE) method described in (Yaari et al., Nucleic Acids Res, 2013). QuSAGE is a substitute for existing gene set methods, such as GSEA, and provides a faster, more accurate, and easier-to-understand test for gene expression studies. QuSAGE accounts for inter-gene correlations and quantifies gene set activity with a complete probability density function (PDF). P values and confidence intervals can be extracted directly from this PDF. Preserving the PDF also allows for post-hoc analysis (e.g., pairwise comparisons of gene set activity) while maintaining statistical traceability.
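The idea of reading a P value and a confidence interval off a density can be illustrated with a minimal Python sketch. This is not the qusage package API and not the published algorithm: the normal-shaped density, the grid, the two-sided test against a null activity of zero, and every function name below are assumptions made purely for illustration.

```python
import math

def normal_pdf_grid(mean, sd, lo, hi, n):
    """Sample a normal density on an even grid (hypothetical stand-in for a gene set activity PDF)."""
    step = (hi - lo) / (n - 1)
    xs = [lo + i * step for i in range(n)]
    ps = [math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi)) for x in xs]
    return xs, ps

def cdf_from_pdf(xs, ps):
    """Cumulative trapezoid integral of the sampled PDF, rescaled to end at 1."""
    cdf = [0.0]
    for i in range(1, len(xs)):
        cdf.append(cdf[-1] + 0.5 * (ps[i] + ps[i - 1]) * (xs[i] - xs[i - 1]))
    total = cdf[-1]
    return [c / total for c in cdf]

def cdf_at(xs, cdf, x):
    """Linearly interpolate the CDF at an arbitrary point x."""
    if x <= xs[0]:
        return 0.0
    if x >= xs[-1]:
        return 1.0
    for i in range(1, len(xs)):
        if xs[i] >= x:
            frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return cdf[i - 1] + frac * (cdf[i] - cdf[i - 1])
    return 1.0

def quantile(xs, cdf, q):
    """Linearly interpolate the inverse CDF at probability q."""
    for i in range(1, len(xs)):
        if cdf[i] >= q:
            span = cdf[i] - cdf[i - 1]
            frac = 0.0 if span == 0 else (q - cdf[i - 1]) / span
            return xs[i - 1] + frac * (xs[i] - xs[i - 1])
    return xs[-1]

def two_sided_p(xs, cdf, null_value=0.0):
    """Two-sided P value: twice the smaller probability mass on either side of the null."""
    below = cdf_at(xs, cdf, null_value)
    return min(1.0, 2.0 * min(below, 1.0 - below))

# Hypothetical activity PDF centered at 0.5 (log2 fold change) with spread 0.2.
xs, ps = normal_pdf_grid(mean=0.5, sd=0.2, lo=-2.0, hi=3.0, n=5001)
cdf = cdf_from_pdf(xs, ps)

ci_low = quantile(xs, cdf, 0.025)    # lower bound of the 95% interval
ci_high = quantile(xs, cdf, 0.975)   # upper bound of the 95% interval
p_value = two_sided_p(xs, cdf)       # evidence against zero activity
```

Because the whole density is retained, the same helper functions could be reused on any other sampled PDF, such as one describing the difference in activity between two comparisons, which is the kind of post-hoc analysis the paragraph above refers to.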


Cell subset prediction for blood genomic studies (SPEC) is a computational method to predict the cellular source for a pre-defined list of genes (i.e., a gene signature) using gene expression data from total PBMCs. Details of the method are described in (Bolen et al., BMC Bioinformatics, 2011).


The TIme-Dependent Activity Linker (TIDAL) generates a transcription factor regulatory network from time-series gene expression data. It identifies the transcription factors that are active at each time point in the data and links these factors into a coherent cascade that can be visualized. Details of the method are described in (Zaslavsky et al., BMC Bioinformatics, 2013) and (Zaslavsky et al., Journal of Immunology, 2010).