
Computational Methods & Tools Developed in the Kleinstein Lab

The Immcantation framework provides a start-to-finish analytical ecosystem for high-throughput adaptive immune receptor repertoire sequencing (AIRR-seq) datasets, with a focus on B cell receptor (BCR) repertoire profiling. Beginning from raw reads, Python and R packages are provided for pre-processing, population structure determination, and repertoire analysis. An overview of AIRR-seq analysis can be found in (Yaari and Kleinstein, 2015).

nf-core/airrflow is a best-practice pipeline for analyzing adaptive immune receptor repertoire sequencing data from start to finish using the Immcantation framework tools. It supports analysis of bulk and single-cell targeted AIRR-seq/VDJ libraries, starting from either raw reads or assembled sequences, as well as extraction of BCR and TCR sequences from untargeted RNA-seq and single-cell RNA-seq data. The pipeline is described in (Gabernet, Marquez et al., 2024).

The Immcantation framework includes:


LogMiNeR

LogMiNeR (Logistic Multiple Network-constrained Regression) is a method for analyzing high-throughput transcriptional profiling data (e.g., microarray or RNA-seq) in which multiple networks encoding prior knowledge are incorporated within a logistic modeling framework to improve model interpretability. A complete description of the method is available in (Avey et al., 2017).

QuSAGE

This R/Bioconductor package implements the Quantitative Set Analysis for Gene Expression (QuSAGE) method described in (Yaari et al., Nucleic Acids Res, 2013). QuSAGE is an alternative to existing gene set methods, such as GSEA, and provides a faster, more accurate, and easier-to-understand test for gene expression studies. QuSAGE accounts for inter-gene correlations and quantifies gene set activity with a complete probability density function (PDF). P-values and confidence intervals can be extracted directly from this PDF. Preserving the PDF also allows for post-hoc analysis (e.g., pair-wise comparisons of gene set activity) while maintaining statistical traceability.
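As a rough illustration of how a p-value and a confidence interval can be read off a probability density function, the following Python sketch works on a PDF sampled on an evenly spaced grid. It is not the QuSAGE package or its R API: the function name summarize_pdf, the Gaussian stand-in for a gene set activity PDF, and the two-sided test against zero are all assumptions made for the example.

```python
import bisect
import math


def summarize_pdf(xs, dens, null=0.0, level=0.95):
    """Two-sided p-value against `null` and a central interval at `level`.

    xs   : evenly spaced, increasing grid of activity values
    dens : density values on that grid (need not be normalized)
    """
    dx = xs[1] - xs[0]
    total = sum(dens) * dx
    # Cumulative distribution by a simple Riemann sum.
    cdf, acc = [], 0.0
    for d in dens:
        acc += d * dx / total
        cdf.append(acc)
    # Probability mass at or below the null value.
    i = bisect.bisect_right(xs, null) - 1
    p_below = cdf[i] if i >= 0 else 0.0
    p_value = min(1.0, 2.0 * min(p_below, 1.0 - p_below))
    # Central interval from the lower and upper quantiles.
    alpha = 1.0 - level
    lower = xs[bisect.bisect_left(cdf, alpha / 2)]
    upper = xs[bisect.bisect_left(cdf, 1.0 - alpha / 2)]
    return p_value, (lower, upper)


# Hypothetical activity PDF: a Gaussian with mean 1.0 and sd 0.5.
xs = [-3.0 + 0.0005 * k for k in range(16001)]
dens = [math.exp(-0.5 * ((x - 1.0) / 0.5) ** 2) for x in xs]
p_value, (ci_low, ci_high) = summarize_pdf(xs, dens)
print(p_value, ci_low, ci_high)
```

Because the whole density is kept, the same grid could be reused for other summaries, such as a different null value or interval level, without recomputing anything upstream.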

SPEC

Cell subset prediction for blood genomic studies (SPEC) is a computational method to predict the cellular source for a pre-defined list of genes (i.e., a gene signature) using gene expression data from total PBMCs. Details of the method are described in (Bolen et al., BMC Bioinformatics, 2011).

nipalsMCIA

nipalsMCIA is an R/Bioconductor package that uses Nonlinear Iterative Partial Least Squares (NIPALS) to perform joint dimensionality reduction on multi-omic data using Multiple Co-Inertia Analysis (MCIA). The iterative approach allows for fast low-dimensional embedding and visualization of high-dimensional multi-omic datasets, such as those arising from single-cell studies. Details of the method are described in (Mattessich et al., 2025).
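To give a sense of the iterative idea behind NIPALS, here is a minimal Python sketch that extracts the first principal component of a single, column-centered data matrix. It is a simplified single-block illustration only: it does not implement MCIA or the multi-block procedure, it is not the nipalsMCIA API, and the function name nipals_first_component and the toy matrix are assumptions made for the example.

```python
import math


def nipals_first_component(X, max_iter=500, tol=1e-12):
    """First principal component of X (list of rows, columns already centered).

    Returns (scores, loadings). Assumes X has at least one non-zero column.
    """
    n, p = len(X), len(X[0])
    # Initialize the score vector with the column of largest sum of squares.
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    j0 = col_ss.index(max(col_ss))
    t = [X[i][j0] for i in range(n)]
    loadings = [0.0] * p
    for _ in range(max_iter):
        tt = sum(v * v for v in t)
        # Regress the columns of X on the scores, then normalize to unit length.
        loadings = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        norm = math.sqrt(sum(v * v for v in loadings))
        loadings = [v / norm for v in loadings]
        # Project the rows of X onto the loadings to update the scores.
        t_new = [sum(X[i][j] * loadings[j] for j in range(p)) for i in range(n)]
        change = sum((a - b) ** 2 for a, b in zip(t_new, t))
        t = t_new
        if change <= tol * sum(v * v for v in t):
            break
    return t, loadings


# Toy centered matrix whose rows lie on the line y = 2x.
X = [[-1.0, -2.0], [0.0, 0.0], [1.0, 2.0]]
scores, loadings = nipals_first_component(X)
print(scores, loadings)
```

Each pass alternates between estimating loadings from the current scores and scores from the current loadings, so no full matrix decomposition is needed; further components would be obtained by subtracting the fitted component from X and repeating.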