What is Cell Ranger ATAC?

Cell Ranger ATAC is a set of analysis pipelines that process Chromium Epi ATAC data. Cell Ranger ATAC includes four pipelines:

  • cellranger-atac mkfastq demultiplexes raw base call (BCL) files generated by Illumina® sequencers into FASTQ files. It is a wrapper around bcl2fastq from Illumina®, with additional useful features that are specific to 10x Genomics libraries and a simplified sample sheet format.

  • cellranger-atac count takes FASTQ files from cellranger-atac mkfastq as input and performs Epi ATAC analysis, including:

    • Read filtering and alignment
    • Barcode counting
    • Identification of transposase cut sites
    • Detection of accessible chromatin peaks
    • Cell calling
    • Count matrix generation for peaks and transcription factors
    • Dimensionality reduction
    • Cell clustering
    • Cluster differential accessibility
  • cellranger-atac aggr aggregates and analyzes the outputs from multiple runs of cellranger-atac count (such as from multiple samples from one experiment) by performing the following steps:

    • Normalization of input runs to same median fragments per cell (sensitivity)
    • Detection of accessible chromatin peaks
    • Count matrix generation for peaks and transcription factors for the aggregate data
    • Dimensionality reduction
    • Cell clustering
    • Cluster differential accessibility
    • Chemistry batch correction
  • cellranger-atac reanalyze takes the analysis files produced by cellranger-atac count or cellranger-atac aggr and reruns secondary analysis with tunable parameter settings:

    • Cell calling
    • Dimensionality reduction
    • Cell clustering
    • Cluster differential accessibility
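
The depth normalization step listed under cellranger-atac aggr (bringing input runs to the same median fragments per cell) can be pictured with a small sketch. This is only an illustration of the idea, not the pipeline's actual implementation: the function name, the library IDs, and the per-cell counts below are hypothetical, and it assumes normalization works by keeping a fixed fraction of fragments from each deeper library.

```python
from statistics import median


def depth_keep_fractions(fragments_per_cell):
    """Return, per library, the fraction of fragments to keep so that every
    library ends up near the lowest median fragments per cell.

    fragments_per_cell maps a library ID to a list of per-cell fragment counts.
    """
    medians = {lib: median(counts) for lib, counts in fragments_per_cell.items()}
    if any(m <= 0 for m in medians.values()):
        raise ValueError("every library needs a positive median fragments per cell")
    target = min(medians.values())  # the shallowest library sets the target depth
    return {lib: target / m for lib, m in medians.items()}


# Hypothetical per-cell fragment counts for two libraries.
example = {
    "lib_A": [8000, 10000, 12000],  # median 10000
    "lib_B": [4000, 5000, 6000],    # median 5000
}
fractions = depth_keep_fractions(example)
```

In this toy input, the deeper library would keep half of its fragments and the shallower one would keep all of them, so both end up at a comparable sensitivity before peaks are called on the aggregate.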

Output is delivered in standard BAM, MEX, CSV, TSV, HDF5 and HTML formats that are augmented with cellular information.

The cellranger-atac count pipeline can take input from multiple sequencing runs on the same library.

Cell Ranger ATAC version 2.1 supports libraries generated by the Chromium Epi ATAC v1, v1.1 Next GEM, and v2 reagent kits.

10x Genomics recommends using the pipeline analysis programs in order, starting with cellranger-atac mkfastq for demultiplexing the raw base call (BCL) files for each flow cell directory, and continuing with cellranger-atac count for single library analysis. If compatible FASTQ files are available from another source, a user can skip cellranger-atac mkfastq and use those FASTQ files as direct input to cellranger-atac count. Compatible FASTQ files can be found in reputable public datasets, or can be built by using Illumina software directly. See the Specifying Input FASTQs page for more details.

The subsequent steps vary depending on how many samples, GEM wells, and flow cells you have (see the Glossary). A few common scenarios are described here in order of increasing complexity, but this list is not intended to be exhaustive.

The simplest scenario is a single sample processed through one GEM well. The resulting library is then sequenced on a single flow cell. Assuming the FASTQs have been generated with cellranger-atac mkfastq, run cellranger-atac count as described in Single-Library Analysis.

For libraries generated from a single GEM well but sequenced across multiple flow cells (e.g. to increase sequencing saturation), you can pool the reads from all of the sequencing runs. Follow the steps in Specifying Input FASTQs to combine them in a single cellranger-atac count run.
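
As a rough sketch of the multi-flow-cell case, the snippet below assembles one cellranger-atac count command that points at FASTQ folders from two sequencing runs. It assumes the --fastqs option accepts a comma-separated list of paths, as described on the Specifying Input FASTQs page; the helper name, run ID, sample name, and all paths are placeholders made up for illustration. The command is only built and printed, not executed.

```python
import shlex


def build_count_command(run_id, reference, fastq_dirs, sample):
    """Assemble the argument list for one cellranger-atac count run.

    fastq_dirs may hold several folders (one per flow cell); they are
    joined with commas into a single --fastqs argument.
    """
    if not fastq_dirs:
        raise ValueError("at least one FASTQ directory is required")
    return [
        "cellranger-atac", "count",
        f"--id={run_id}",
        f"--reference={reference}",
        f"--fastqs={','.join(fastq_dirs)}",
        f"--sample={sample}",
    ]


# Placeholder values: one library sequenced on two flow cells.
cmd = build_count_command(
    run_id="sample1_pooled",
    reference="/path/to/reference",
    fastq_dirs=["/path/to/flowcell1/fastqs", "/path/to/flowcell2/fastqs"],
    sample="sample1",
)
print(shlex.join(cmd))  # shell-ready string for review before running
```

Building the argument list first makes it easy to inspect the command, and the same list could be handed to subprocess.run once the paths are real.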

In the next scenario, one sample is processed through multiple GEM wells. This is often done when conducting technical replicate experiments, or to increase the number of cells analyzed for a sample. The libraries from the GEM wells are then pooled onto one flow cell and sequenced. In this case, demultiplex the data from the sequencing run, then run the libraries from each GEM well through a separate instance of cellranger-atac count. You can then perform a combined analysis using cellranger-atac aggr, as described in Multi-Library Aggregation.

In the final scenario, multiple samples are processed through multiple GEM wells to generate multiple libraries, which are then pooled onto one flow cell. After demultiplexing, run cellranger-atac count separately for each GEM well to get sample-specific data. For example, if your experimental design involves two samples, each processed through its own GEM well, run cellranger-atac count twice. You can then aggregate the results with a single instance of cellranger-atac aggr, as described in Multi-Library Aggregation.
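
The aggregation step can be sketched as follows, assuming two finished cellranger-atac count runs. The snippet writes the text of an aggregation CSV with the library_id, fragments, and cells columns and assembles a matching cellranger-atac aggr command. The column names, output file names (fragments.tsv.gz, singlecell.csv), and options reflect our understanding of the Multi-Library Aggregation page and should be checked against it; the helper name, IDs, and paths are hypothetical.

```python
import csv
import io


def aggr_csv_text(count_outs):
    """Build the aggregation CSV text.

    count_outs maps a library ID to the `outs` directory of the
    corresponding cellranger-atac count run.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["library_id", "fragments", "cells"])
    for library_id, outs_dir in count_outs.items():
        writer.writerow([
            library_id,
            f"{outs_dir}/fragments.tsv.gz",  # per-library fragments file
            f"{outs_dir}/singlecell.csv",    # per-barcode cell calls
        ])
    return buffer.getvalue()


# Placeholder paths: one count run per GEM well.
csv_text = aggr_csv_text({
    "sample1": "/path/to/sample1/outs",
    "sample2": "/path/to/sample2/outs",
})

# Command that would consume the CSV once it is saved as aggr_libraries.csv.
aggr_cmd = [
    "cellranger-atac", "aggr",
    "--id=combined",
    "--csv=aggr_libraries.csv",
    "--reference=/path/to/reference",
    "--normalize=depth",
]
```

Each row corresponds to one cellranger-atac count run, so the two-sample design above yields a header plus two rows.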