This page describes the output file structure produced by the cellranger multi subcommand, specifically for Flex data.
Upon completion, the cellranger multi subcommand produces an outs/ directory. For a multiplex Flex experiment, the file structure, as shown by the Linux tree command, looks like this:
├── config.csv
├── multi
│ ├── count
│ └── multiplexing_analysis
└── per_sample_outs
  ├── Sample1
  └── Sample2
The first section of the outputs contains the config.csv file, a duplicate of the input config CSV file. The files in the multi/ folder are generic to the entire Flex experiment, while the files in the per_sample_outs/ directory have been demultiplexed to single samples.
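The layout described above can be walked programmatically. The following is a minimal Python sketch, assuming only the directory names shown on this page; the function names `list_sample_dirs` and `is_multiplex_run` are hypothetical helpers, not part of Cell Ranger.

```python
from pathlib import Path


def list_sample_dirs(outs_dir):
    """Return the per-sample output directories under outs/, sorted by name."""
    per_sample = Path(outs_dir) / "per_sample_outs"
    return sorted(p for p in per_sample.iterdir() if p.is_dir())


def is_multiplex_run(outs_dir):
    """Heuristic check: multi/multiplexing_analysis/ is present only for
    multiplex Flex inputs, according to this page."""
    return (Path(outs_dir) / "multi" / "multiplexing_analysis").is_dir()
```

For example, `[p.name for p in list_sample_dirs("outs")]` would list the sample names for a run whose outs/ directory sits in the current working directory.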
Within the multi/ directory, there are count and multiplexing_analysis directories. The multiplexing_analysis subdirectory is only generated for multiplex Flex inputs.
└─ multi
  ├── count
  │ ├── raw_cloupe.cloupe
  │ ├── raw_feature_bc_matrix
  │ │ ├── barcodes.tsv.gz
  │ │ ├── features.tsv.gz
  │ │ └── matrix.mtx.gz
  │ ├── raw_feature_bc_matrix.h5
  │ ├── raw_molecule_info.h5
  │ ├── raw_probe_bc_matrix.h5
  │ ├── unassigned_alignments.bam
  │ └── unassigned_alignments.bam.bai
  └── multiplexing_analysis
    ├── cells_per_tag.json
    └── frp_gem_barcode_overlap.csv
The count directory contains raw files that include cell-associated and background data from all samples within an experiment:
Output File | Description |
---|---|
raw_cloupe.cloupe | A Loupe-readable file containing all cell-associated barcodes in the experiment. |
raw_feature_bc_matrix | A matrix of UMI counts per (feature, barcode) pair, in MEX format. This matrix contains every barcode from the full fixed list of known good barcode sequences that has at least one read. This includes background and cell-associated barcodes from all samples, as well as valid barcodes that were not assigned to a sample in the case of a multiplex Flex experiment. For Flex, distinct ligation events are counted rather than distinct transcripts. |
raw_feature_bc_matrix.h5 | Same information as raw_feature_bc_matrix in HDF5 format. |
raw_molecule_info.h5 | Information about all molecules in the experiment. This file includes background and cell-associated barcodes from all samples as well as valid barcodes that were not assigned to a sample. Starting in Cell Ranger v7.1, this file also includes UMI counts per probe. This file cannot be used as input for the cellranger aggr pipeline. |
raw_probe_bc_matrix.h5 | Contains UMI counts of each probe for all detected barcodes in HDF5 format. |
unassigned_alignments.bam | Reads with either valid barcodes not assigned to a sample or invalid barcodes. If a transcriptome reference is not provided, an unaligned BAM file is generated. Only generated if create-bam,true is set in the config CSV. |
unassigned_alignments.bam.bai | Index file for unassigned_alignments.bam. Only generated if create-bam,true is set in the config CSV. |
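As an illustration of the MEX layout listed above (barcodes.tsv.gz, features.tsv.gz, matrix.mtx.gz), the following minimal Python sketch loads such a directory using only the standard library. It assumes the Matrix Market coordinate format with 1-based indices, features as rows and barcodes as columns; `read_mex` is a hypothetical helper name, not a Cell Ranger function.

```python
import gzip
from pathlib import Path


def read_mex(matrix_dir):
    """Load a MEX directory into (features, barcodes, counts).

    counts maps (feature_index, barcode_index), both 0-based, to a UMI count.
    """
    matrix_dir = Path(matrix_dir)
    with gzip.open(matrix_dir / "features.tsv.gz", "rt") as fh:
        # Each row is tab-separated; the first column is the feature ID.
        features = [line.rstrip("\n").split("\t") for line in fh if line.strip()]
    with gzip.open(matrix_dir / "barcodes.tsv.gz", "rt") as fh:
        barcodes = [line.strip() for line in fh if line.strip()]

    counts = {}
    with gzip.open(matrix_dir / "matrix.mtx.gz", "rt") as fh:
        # Skip the Matrix Market banner and comment lines (they start with '%').
        data_lines = (line for line in fh if not line.startswith("%"))
        # The first non-comment line holds: rows, columns, number of entries.
        n_rows, n_cols, _n_entries = map(int, next(data_lines).split())
        if (n_rows, n_cols) != (len(features), len(barcodes)):
            raise ValueError("matrix dimensions do not match features/barcodes")
        for line in data_lines:
            if not line.strip():
                continue
            row, col, value = line.split()
            # Matrix Market indices are 1-based; convert to 0-based.
            counts[(int(row) - 1, int(col) - 1)] = int(value)
    return features, barcodes, counts
```

A plain dictionary keeps the sketch dependency-free but is memory-hungry for a full raw matrix; a sparse-matrix library is likely a better fit for real data.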
The multiplexing_analysis directory contains:
Output File | Description |
---|---|
cells_per_tag.json | Lists the cell-associated barcodes that were assigned a given Probe Barcode tag, for each tag, in JSON format. |
frp_gem_barcode_overlap.csv | Contains the number of shared 10x GEM Barcodes for all pairs of observed Probe Barcode (BC) tags or Probe Barcode and Antibody Multiplexing Barcode (AB) tags assigned to a sample. More details below. |
The cells_per_tag.json file looks like this. For each Probe Barcode (e.g., BC001), the cell-associated barcodes (e.g., "AACAAGCTCCCTCAAAACTTTAGG-1") are listed below it:
{
  "BC001":[
    "AACAAGCTCCCTCAAAACTTTAGG-1",
    "AACATAGTCCCATAGCACTTTAGG-1",
    "AACCAGGTCATGGTCCACTTTAGG-1",
    ...
  ],
  "BC002":[
    "AAACTGTCAGGAGCAAAACGGGAA-1",
    "AAAGGGATCTAATCGTAACGGGAA-1",
    "AACCAAATCGGTCAAGAACGGGAA-1",
    ...
  ]
}
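A quick way to summarize this file is to count the cell-associated barcodes listed under each tag. The sketch below assumes the structure shown above (a JSON object mapping each Probe Barcode ID to a list of barcode strings); `count_cells_per_tag` is a hypothetical helper name.

```python
import json


def count_cells_per_tag(path):
    """Return {tag: number of cell-associated barcodes} from cells_per_tag.json."""
    with open(path) as fh:
        cells_per_tag = json.load(fh)
    return {tag: len(barcodes) for tag, barcodes in cells_per_tag.items()}
```

A tag with an unexpectedly low or zero count may be worth checking against the sample definitions in the config CSV.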
The frp_gem_barcode_overlap.csv file can be used to troubleshoot scenarios where, for example, two different Probe Barcodes were accidentally added to the same hybridization reaction. For multiplexed GEX + Antibody Capture experiments, overlap between incorrect BC+AB barcode pairs may result from contamination or from using the same Probe Barcode for two antibody panels. An alert in the web summary will be triggered if the overlap coefficient is ≥ 60%. Please contact 10x Genomics support if you have questions.
barcode1_id,barcode2_id,barcode1_gems,barcode2_gems,common_gems,overlap
BC009,BC010,2333,2608,81,0.034719245606515216
[ … ]
BC012,AB011,2461,4419,188,0.07639171068671272
Column descriptions:
barcode1_id: First barcode identifier
barcode2_id: Second barcode identifier
barcode1_gems: Number of 10x GEM Barcodes for barcode1
barcode2_gems: Number of 10x GEM Barcodes for barcode2
common_gems: Number of 10x GEM Barcodes in common
overlap: The overlap coefficient of these two barcodes (a pair of Probe Barcodes (BC-BC), a Probe Barcode and an Antibody Multiplexing Barcode (BC-AB), or AB-AB for Antibody Capture only analysis).
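Overlap coefficient = common_gems / min(barcode1_gems, barcode2_gems). For example, the BC009/BC010 row above gives 81 / 2333 ≈ 0.0347.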
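The check described above can be sketched in Python. This assumes the standard overlap coefficient, common_gems / min(barcode1_gems, barcode2_gems), which reproduces the two example rows shown above, and uses the 60% alert level stated on this page. The names `overlap_coefficient` and `flag_high_overlap` are hypothetical helpers, not part of Cell Ranger.

```python
import csv
import io

ALERT_THRESHOLD = 0.6  # this page states an alert is triggered at >= 60%


def overlap_coefficient(gems1, gems2, common):
    """Overlap coefficient: shared GEM barcodes over the smaller set size."""
    smaller = min(gems1, gems2)
    return common / smaller if smaller else 0.0


def flag_high_overlap(csv_text, threshold=ALERT_THRESHOLD):
    """Return (barcode1_id, barcode2_id, overlap) for rows at or above threshold."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        overlap = float(row["overlap"])
        if overlap >= threshold:
            flagged.append((row["barcode1_id"], row["barcode2_id"], overlap))
    return flagged
```

Passing the contents of frp_gem_barcode_overlap.csv as `csv_text` returns the barcode pairs that would be expected to trigger the alert under these assumptions.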
The per_sample_outs directory contains sample-level files with any data associated with a valid Probe Barcode that could be assigned to a sample, including both cells and background.
├── count
│ ├── analysis
│ │ ├── clustering
│ │ ├── diffexp
│ │ ├── pca
│ │ ├── tsne
│ │ └── umap
│ ├── probe_set.csv
│ ├── sample_cloupe.cloupe
│ ├── sample_alignments.bam
│ ├── sample_alignments.bam.bai
│ ├── sample_filtered_barcodes.csv
│ ├── sample_filtered_feature_bc_matrix
│ │ ├── barcodes.tsv.gz
│ │ ├── features.tsv.gz
│ │ └── matrix.mtx.gz
│ ├── sample_filtered_feature_bc_matrix.h5
│ ├── sample_raw_feature_bc_matrix
│ │ ├── barcodes.tsv.gz
│ │ ├── features.tsv.gz
│ │ └── matrix.mtx.gz
│ ├── sample_raw_feature_bc_matrix.h5
│ ├── sample_molecule_info.h5
│ └── sample_raw_probe_bc_matrix.h5
├── metrics_summary.csv
└── web_summary.html
The per_sample_outs directory contains:
Output File | Description |
---|---|
count/ | Folder containing the results of any gene expression and Feature Barcode analysis, see table below. |
metrics_summary.csv | Run summary metrics file in CSV format. Metric definitions are available in the web summary's ? help text. |
web_summary.html | Run summary metrics and charts in HTML format, described in the multi web summary page. |
The count directory contains:
Output File | Description |
---|---|
analysis/ | Folder containing the results of graph-based clustering and K-means clustering (K = 2 to 10); differential gene expression analysis between clusters; and PCA, t-SNE, and UMAP dimensionality reduction. |
probe_set.csv | A duplicate of the input probe_set.csv file. |
sample_cloupe.cloupe | A Loupe Browser visualization and analysis file with cell-associated barcodes for the specific sample. |
sample_filtered_feature_bc_matrix/ | Contains only detected cell-associated barcodes and only genes in the filtered probe set in MEX format. Each element of the matrix is the number of UMIs associated with a feature (row) and a barcode (column), as described in the feature-barcode matrix page. Distinct ligation events are counted for Flex rather than distinct transcripts. This file can be input into third-party packages and allows users to wrangle the barcode-feature matrix (e.g. to filter outlier cells, run dimensionality reduction, normalize gene expression). |
sample_filtered_feature_bc_matrix.h5 | Same information as sample_filtered_feature_bc_matrix in HDF5 format. |
sample_molecule_info.h5 | Contains per-molecule information for all molecules counted in the sample_raw_feature_bc_matrix for this sample that contain a valid barcode and a valid UMI and were assigned with high confidence to a gene or Feature Barcode. Starting in Cell Ranger v7.1, this file also includes UMI counts per probe. This file is a required input to run cellranger aggr. |
sample_alignments.bam | Indexed BAM file containing position-sorted reads aligned to the genome and transcriptome, as well as unaligned reads, annotated with barcode information. If a transcriptome reference is not provided, an unaligned BAM file is generated. Only generated if create-bam,true is set in the multi config CSV. |
sample_alignments.bam.bai | Index file for sample_alignments.bam. Only generated if create-bam,true is set in the config CSV. |
sample_filtered_barcodes.csv | File containing a list of only cell-associated barcodes. |
sample_raw_feature_bc_matrix/ | Contains all barcodes assigned to this sample, including cell-associated and background barcodes in MEX format. Genes in this matrix include ones with DEPRECATED probes and probes with predicted off-target activities, as well as the ones in the filtered probe set. Each element of the matrix is the number of UMIs associated with a feature (row) and a barcode (column), as described in the feature-barcode matrix page. Distinct ligation events are counted for Flex rather than distinct transcripts. This file is only generated in the context of multiplex Flex configuration. |
sample_raw_feature_bc_matrix.h5 | Same information as sample_raw_feature_bc_matrix in HDF5 format. This file is only generated in the context of multiplex Flex configuration. |
sample_raw_probe_bc_matrix.h5 | Contains columns that indicate the probes in the filtered probe reference, the probes that passed gDNA filtering, and the probe barcodes that are in cells. It is similar to the feature-barcode matrix, but is organized at the probe level rather than the gene level. Probes are flagged as not passing the gDNA filter if their aggregate UMI count across all cells is lower than the metric "Estimated UMIs from genomic DNA per unspliced probe". |
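The gDNA flagging rule in the last table row can be illustrated with a small Python sketch. It is a literal reading of that sentence (a probe is flagged when its total UMI count across cells is below the threshold metric), not Cell Ranger's actual implementation, which may apply additional conditions. The function name and the input layout (a dictionary of per-cell counts per probe) are assumptions made for illustration.

```python
def probes_failing_gdna_filter(probe_counts, gdna_threshold):
    """Return probe IDs whose total UMI count across cells is below the threshold.

    probe_counts: {probe_id: iterable of per-cell UMI counts} (hypothetical layout)
    gdna_threshold: value of the metric
        "Estimated UMIs from genomic DNA per unspliced probe"
    """
    return sorted(
        probe_id
        for probe_id, counts in probe_counts.items()
        if sum(counts) < gdna_threshold
    )
```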
For a singleplex Flex experiment, the cellranger multi subcommand produces an outs/ directory with the following structure upon completion:
├── config.csv
├── multi
│ └── count
└── per_sample_outs
  └── Sample1
Within the multi/ directory, there is a count/ directory. This directory contains the same files as described above for the multiplex Flex count/ directory. There is no multiplexing_analysis/ directory for singleplex experiments.
└─ multi
  └── count
    ├── raw_cloupe.cloupe
    ├── raw_feature_bc_matrix
    │ ├── barcodes.tsv.gz
    │ ├── features.tsv.gz
    │ └── matrix.mtx.gz
    ├── raw_feature_bc_matrix.h5
    ├── raw_molecule_info.h5
    ├── raw_probe_bc_matrix.h5
    ├── unassigned_alignments.bam
    └── unassigned_alignments.bam.bai
The per_sample_outs directory contains data associated with all valid barcodes in the singleplex Flex library. This directory contains the same files as described above for the multiplex Flex per_sample_outs directory, except for the sample_raw_feature_bc_matrix/ and sample_raw_feature_bc_matrix.h5 files. The web_summary.html is described on the multi web summary page.
├── count
│ ├── analysis
│ │ ├── clustering
│ │ ├── diffexp
│ │ ├── pca
│ │ ├── tsne
│ │ └── umap
│ ├── probe_set.csv
│ ├── sample_cloupe.cloupe
│ ├── sample_alignments.bam
│ ├── sample_alignments.bam.bai
│ ├── sample_filtered_barcodes.csv
│ ├── sample_filtered_feature_bc_matrix
│ │ ├── barcodes.tsv.gz
│ │ ├── features.tsv.gz
│ │ └── matrix.mtx.gz
│ ├── sample_filtered_feature_bc_matrix.h5
│ ├── sample_molecule_info.h5
│ └── sample_raw_probe_bc_matrix.h5
├── metrics_summary.csv
└── web_summary.html
Flex analyses with an Antibody Capture library and/or a CRISPR Guide Capture library have all the same outputs described for the per_sample_outs/count directory of singleplex and multiplex GEX analyses. In addition, they contain a feature_reference.csv file, which is a duplicate of the input Feature Reference CSV file.
├── count
│ ├── analysis
│ │ ├── clustering
│ │ ├── diffexp
│ │ ├── pca
│ │ ├── tsne
│ │ └── umap
│ ├── aggregate_barcodes.csv
│ ├── feature_reference.csv
│ ├── ...
The multi/count directory also contains a copy of the feature_reference.csv file:
└─ multi
  ├── count
  │ ├── raw_cloupe.cloupe
  │ ├── raw_feature_bc_matrix
  │ │ ├── barcodes.tsv.gz
  │ │ ├── features.tsv.gz
  │ │ └── matrix.mtx.gz
  │ ├── feature_reference.csv
  │ ├── raw_feature_bc_matrix.h5
  │ ├── raw_molecule_info.h5
  │ ├── unassigned_alignments.bam
  │ └── unassigned_alignments.bam.bai
The web_summary.html is described on the multi web summary page.