The cellranger multi pipeline uses a configuration CSV file to specify input file paths and analysis options. The general layout for all analyses includes the [gene-expression] and [libraries] sections. The other sections may be included depending on the analysis.
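
As a rough illustration of that layout, the Python sketch below renders a minimal config containing only the two required sections. It assumes the convention seen in 10x Genomics example configs, where [gene-expression] holds one option,value pair per line and [libraries] holds a header row followed by one row per FASTQ set. The helper name `render_multi_config`, the paths, and the sample name are hypothetical.

```python
import csv
import io


def render_multi_config(sections):
    """Render {section_name: rows} as config CSV text.

    Each section is written as a "[name]" line followed by its rows.
    This is an illustrative helper, not part of Cell Ranger.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for name, rows in sections.items():
        writer.writerow([f"[{name}]"])
        writer.writerows(rows)
        writer.writerow([])  # blank line between sections
    return buf.getvalue()


# Hypothetical paths and sample name, for illustration only.
config_text = render_multi_config({
    "gene-expression": [
        ["reference", "/path/to/refdata"],
    ],
    "libraries": [
        ["fastq_id", "fastqs", "feature_types"],
        ["my_gex", "/path/to/fastqs", "Gene Expression"],
    ],
})
print(config_text)
```

Writing the file through the `csv` module rather than by string concatenation keeps quoting correct if a path ever contains a comma.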
The information below is organized by config section, with notes where an option applies only to certain assays or has specific recommendations for particular analyses. Example config CSV layouts for specific assays are shown on these pages:
The [gene-expression] section specifies information about the Gene Expression library.
Field | Description |
---|---|
reference | Required. Absolute path to folder containing 10x Genomics-compatible genome reference. |
r1-length | Optional. Limit the length of the input Read 1 sequence of Gene Expression libraries to the first N bases, where N is a user-supplied value. Note that the length includes the 10x Barcode and UMI sequences so do not set this below 26. This and r2-length are useful options for determining the optimal read length for sequencing. Default: do not trim Read 1. |
r2-length | Optional. Limit the length of the input Read 2 sequence of Gene Expression libraries to the first N bases, where N is a user-supplied value. Trimming occurs before sequencing metrics are computed and therefore, limiting the length of Read 2 may affect Q30 scores. Default: do not trim Read 2. |
chemistry | Optional. Assay configuration. By default, the assay configuration is detected automatically (recommended), so users typically do not need to specify a chemistry. The available options are listed below. Default: auto |
expect-cells | Optional. Override the pipeline’s auto-estimation of cells. See cell calling algorithm overview for details on how this parameter is used. If used, enter the expected number of recovered cells. expect-cells in the [gene-expression] section is only valid for the singleplex FRP configuration; note this option name has a dash (-). |
force-cells | Optional. Force pipeline to use this number of cells, bypassing cell detection. Default: detect cells using Cell Ranger's cell calling algorithm. For FRP, specifying library-level force-cells in the [gene-expression] section is only valid for the singleplex Fixed RNA Profiling configuration; note this option name has a dash (-). |
include-introns | Optional. Set to false to exclude intronic reads from counting. Including introns in the analysis is recommended to maximize sensitivity. Default: true. This option does not apply to Fixed RNA Profiling analysis. |
no-secondary | Optional. Disable secondary analysis, e.g. clustering. Default: false |
no-bam | Optional. Set to true to skip BAM file generation. This reduces the total computation time for the pipestance and the size of the output directory. If unsure, we recommend not using this option, as BAM files can be useful for troubleshooting and downstream analysis. For Fixed RNA Profiling analysis, however, we recommend setting this option to true to obtain those time and size savings. Default: false |
check-library-compatibility | Optional. Controls the check that evaluates 10x Barcode overlap between libraries when multiple libraries are specified (e.g., Gene Expression + Antibody Capture). Setting this option to false disables the check across all library combinations. We recommend running this check (the default); however, if the pipeline errors out, users can bypass the check to generate outputs for troubleshooting. Default: true |
These are the options for 3', Fixed RNA Profiling (FRP), and 5' chemistries.
- auto: Chemistry autodetection (default)
- threeprime: Single Cell 3'
- SC3Pv1, SC3Pv2, SC3Pv3: Single Cell 3' v1, v2, or v3
- SC3Pv3HT: Single Cell 3' v3.1 HT
- SC-FB: Single Cell Antibody-only 3' v2 or 5'
- fiveprime: Single Cell 5'
- SC5-PE: Paired-end Single Cell 5'
- SC5P-R2: R2-only Single Cell 5'
- SC5PHT: Single Cell 5' v2 HT
- SFRP: Singleplex FRP
- MFRP: Multiplex FRP (Probe Barcode on R2)
- MFRP-R1: Multiplex FRP (Probe Barcode on R1)
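
A pre-flight check of the [gene-expression] options can catch simple mistakes before a run is launched. The sketch below is one possible approach, not Cell Ranger's own validation: it uses the chemistry names exactly as listed above and the stated minimum of 26 for r1-length. The function name `check_gene_expression` and the dict-based input are assumptions made for illustration.

```python
# Chemistry names copied from the list in this section.
VALID_CHEMISTRIES = {
    "auto", "threeprime", "SC3Pv1", "SC3Pv2", "SC3Pv3", "SC3Pv3HT", "SC-FB",
    "fiveprime", "SC5-PE", "SC5P-R2", "SC5PHT", "SFRP", "MFRP", "MFRP-R1",
}
MIN_R1_LENGTH = 26  # Read 1 must still cover the 10x Barcode and UMI.


def check_gene_expression(options):
    """Return a list of problems found in a dict of [gene-expression] options.

    Hypothetical helper: it checks only the rules stated in this section.
    """
    problems = []
    if "reference" not in options:
        problems.append("reference is required")
    chemistry = options.get("chemistry", "auto")
    if chemistry not in VALID_CHEMISTRIES:
        problems.append(f"unknown chemistry: {chemistry}")
    r1 = options.get("r1-length")
    if r1 is not None and int(r1) < MIN_R1_LENGTH:
        problems.append(f"r1-length must be at least {MIN_R1_LENGTH}")
    return problems


# Hypothetical option values, for illustration only.
print(check_gene_expression({"reference": "/path/to/refdata", "r1-length": "28"}))
print(check_gene_expression({"chemistry": "bogus", "r1-length": "20"}))
```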
These [gene-expression] options only apply to 3' Cell Multiplexing data analysis.
Field | Description |
---|---|
min-assignment-confidence | Optional. The minimum estimated likelihood to call a sample as tagged with a Cell Multiplexing Oligo (CMO) instead of "Unassigned". Users may wish to tolerate a higher rate of mis-assignment in order to obtain more singlets to include in their analysis, or a lower rate of mis-assignment at the cost of obtaining fewer singlets. Default: 0.9. Contact 10x Genomics support for further advice. |
cmo-set | Optional. The default CMO reference IDs are built into the Cell Ranger software and do not need to be specified. However, this option can be used to specify the path to a custom CMO set CSV file, declaring CMO constructs and associated barcodes. See CMO Reference section for details. |
barcode-sample-assignment | Optional. Absolute path to a barcode-sample assignment CSV file that specifies the barcodes that belong to each sample. See details below to set up this file. |
These [gene-expression] options only apply to Fixed RNA Profiling data analysis.
Field | Description |
---|---|
probe-set | Required. Absolute path to the probe set reference CSV file. This file is included with the Cell Ranger package v7.0 and later (i.e., cellranger-x.y.z/probe_sets/ ) and on the Downloads page. |
filter-probes | Optional. Set to false to include all non-deprecated probes listed in the probe set reference CSV file. Probes that are predicted to have off-target activity to homologous genes are excluded from analysis by default. Setting filter-probes to false causes UMI counts from all non-deprecated probes, including those with predicted off-target activity, to be used in the analysis. Probes whose ID is prefixed with DEPRECATED are always excluded from the analysis. Default: true |
The [feature] section specifies information about the Feature Barcode library.
Field | Description |
---|---|
reference | Required only for Antibody Capture, Antigen Capture, or CRISPR Guide Capture libraries. Absolute path to the Feature reference CSV file, declaring Feature Barcode constructs and associated barcodes. |
r1-length | Optional. Limit the length of the input Read 1 sequence of Feature Barcode libraries to the first N bases, where N is a user-supplied value. Note that the length includes the 10x Barcode and UMI sequences so do not set this below 26. This and r2-length are useful options for determining the optimal read length for sequencing. Default: do not trim Read 1. |
r2-length | Optional. Limit the length of the input Read 2 sequence of Feature Barcode libraries to the first N bases, where N is a user-supplied value. Trimming occurs before sequencing metrics are computed and therefore, limiting the length of Read 2 may affect Q30 scores. Default: do not trim Read 2. |
The [libraries] section specifies all the input library data (see also Specifying Input FASTQ Files).
Field | Description |
---|---|
fastq_id | Required. The Illumina sample name to analyze. This will be as specified in the sample sheet supplied to the demultiplexing software. |
fastqs | Required. Absolute path to the folder containing the FASTQ files to be analyzed. Generally, this will be the fastq_path folder generated by the demultiplexing software. If the same library was sequenced on multiple flow cells, the FASTQs folder from each flow cell must be specified on a separate line in the CSV (see 5' example here). Doing this will treat all reads from the library, across flow cells, as one sample. If you have multiple libraries for the sample, you will need to run cellranger multi on them individually, and then combine them with cellranger aggr. |
feature_types | Required. The underlying feature type of the library (listed below). |
lanes | Optional. The lanes associated with this sample, separated with a pipe (e.g., 1|2 ). Default: uses all lanes |
physical_library_id | Optional. Library type. Note: by default, the library type is detected automatically based on specified feature_types (recommended). Users typically do not need to include the physical_library_id column in the CSV file. |
subsample_rate | Optional. The rate at which reads from the provided FASTQ files are sampled. Must be strictly greater than 0 and less than or equal to 1. |
These are the options for 3', Fixed RNA Profiling (FRP), and 5' feature types.
- Gene Expression
- Antibody Capture
- CRISPR Guide Capture
- Multiplexing Capture (for 3' Cell Multiplexing)
- VDJ
- VDJ-T
- VDJ-T-GD
- VDJ-B
- Antigen Capture

Antigen Capture should be used only for BEAM libraries. For other (non-BEAM) antigen libraries (TotalSeq™-C, Immudex's dMHC Dextramer® libraries with dCODE Dextramers), set feature_types to Antibody Capture. Setting this option to VDJ will autodetect the chain type.
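
The [libraries] rules above (allowed feature types, pipe-separated lanes, and a subsample_rate greater than 0 and at most 1) can be sketched as a small row builder. This is a hypothetical helper for preparing rows before writing the CSV; the function name, sample name, and paths are assumptions, and the checks cover only what this section states.

```python
# Feature type names copied from the list in this section.
FEATURE_TYPES = {
    "Gene Expression", "Antibody Capture", "CRISPR Guide Capture",
    "Multiplexing Capture", "VDJ", "VDJ-T", "VDJ-T-GD", "VDJ-B",
    "Antigen Capture",
}


def library_row(fastq_id, fastqs, feature_types, lanes=None, subsample_rate=None):
    """Build one [libraries] row as a dict, validating the documented constraints."""
    if feature_types not in FEATURE_TYPES:
        raise ValueError(f"unknown feature type: {feature_types}")
    row = {"fastq_id": fastq_id, "fastqs": fastqs, "feature_types": feature_types}
    if lanes is not None:
        # Lanes are separated with a pipe, e.g. 1|2.
        row["lanes"] = "|".join(str(lane) for lane in lanes)
    if subsample_rate is not None:
        if not 0 < subsample_rate <= 1:
            raise ValueError("subsample_rate must be > 0 and <= 1")
        row["subsample_rate"] = str(subsample_rate)
    return row


# One library sequenced on two flow cells: same fastq_id, one row per FASTQ folder.
rows = [
    library_row("my_gex", "/path/to/flowcell1/fastqs", "Gene Expression", lanes=[1, 2]),
    library_row("my_gex", "/path/to/flowcell2/fastqs", "Gene Expression"),
]
for row in rows:
    print(row)
```

Omitting the lanes key when no lanes are given mirrors the documented default of using all lanes.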
The [samples] section is used to specify sample-level options for multiplexed experiments.
Field | Description |
---|---|
sample_id | Required. A name to identify a multiplexed sample. Must be alphanumeric with hyphens and/or underscores, and less than 64 characters. |
expect_cells | Optional. Override the pipeline’s auto-estimation of cells. See Gene Expression algorithm overview for details. If used, enter the expected number of recovered cells. For FRP, specifying sample-level expect_cells in the [samples] section is only valid for the multiplex Fixed RNA Profiling configuration; note this column name has an underscore (_). |
force_cells | Optional. Force pipeline to use this number of cells, bypassing cell detection. Default: detect cells using EmptyDrops. For FRP, specifying sample-level force_cells in the [samples] section is only valid for the multiplex Fixed RNA Profiling configuration; note this column name has an underscore (_). |
description | Optional. A description for the sample. |
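
The sample_id rule (alphanumeric characters, hyphens, and underscores, fewer than 64 characters) maps naturally onto a regular expression. The sketch below is one way to check it ahead of time; the pattern and function name are assumptions for illustration, and Cell Ranger's own validation may differ in detail.

```python
import re

# 1 to 63 characters drawn from letters, digits, underscore, and hyphen.
SAMPLE_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,63}")


def is_valid_sample_id(sample_id):
    """Return True if sample_id satisfies the rule stated for the [samples] section."""
    return SAMPLE_ID_RE.fullmatch(sample_id) is not None


print(is_valid_sample_id("sample_1-A"))
print(is_valid_sample_id("bad id"))
```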
This [samples] option only applies to 3' Cell Multiplexing data analysis.
Field | Description |
---|---|
cmo_ids | Required. The Cell Multiplexing oligo IDs used to multiplex this sample. Only input CMOs used in the experiment. If multiple CMOs were used for a sample, separate IDs with a pipe (e.g., CMO301|CMO302 ). |
This [samples] option only applies to Fixed RNA Profiling data analysis.
Field | Description |
---|---|
probe_barcode_ids | Required. The Fixed RNA Probe Barcode IDs used for this sample, and for multiplex GEX + Antibody Capture libraries, the corresponding Antibody Multiplexing Barcode IDs. We recommend specifying both barcodes in the config CSV (e.g., BC001+AB001 ) when an Antibody Capture library is present. The barcode pair order is BC+AB and they are separated with a "+" (no spaces). Alternatively, you can specify the Probe Barcode ID alone and Cell Ranger’s barcode pairing auto-detection algorithm will automatically match to the corresponding Antibody Multiplexing Barcode. If multiple Probe Barcodes were used for a sample, separate IDs with a pipe (e.g., BC001|BC002 ). |
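
To make the probe_barcode_ids syntax concrete, the following sketch splits a cell value on the pipe and then on the "+" that joins a Probe Barcode ID to its Antibody Multiplexing Barcode ID. The function name is hypothetical, and combining both separators in one value (as in the first call) is an assumption extrapolated from the two examples above.

```python
def parse_probe_barcode_ids(value):
    """Split a probe_barcode_ids cell into (probe_barcode, antibody_barcode) pairs.

    The antibody barcode is None when only the Probe Barcode ID is given.
    """
    pairs = []
    for entry in value.split("|"):
        probe, plus, antibody = entry.strip().partition("+")
        pairs.append((probe, antibody if plus else None))
    return pairs


print(parse_probe_barcode_ids("BC001+AB001|BC002+AB002"))
print(parse_probe_barcode_ids("BC001|BC002"))
```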
The [vdj] section specifies information about the V(D)J library.
Field | Description |
---|---|
reference | Required for V(D)J Immune Profiling libraries. Absolute path of folder containing 10x Genomics-compatible V(D)J reference. |
inner-enrichment-primers | Optional. If inner enrichment primers other than those provided in the 10x Genomics kits are used, they need to be specified here as a text file with one primer per line. |
r1-length | Optional. Limit the length of the input Read 1 sequence of V(D)J libraries to the first N bases, where N is a user-supplied value. Note that the length includes the Barcode and UMI sequences so do not set this below 26. This and r2-length are useful options for determining the optimal read length for sequencing. Default: do not trim Read 1. |
r2-length | Optional. Limit the length of the input Read 2 sequence of V(D)J libraries to the first N bases, where N is a user-supplied value. Trimming occurs before sequencing metrics are computed and therefore, limiting the length of Read 2 may affect Q30 scores. Default: do not trim Read 2. |
This [antigen-specificity] section is recommended if an Antigen Capture (BEAM) library is present. It is needed to calculate the antigen specificity score.
Field | Description |
---|---|
control_id | Required. A user-defined ID for any negative controls used in the T/BCR Antigen Capture assay. Must match id specified in the Feature Reference CSV. May only include ASCII characters and must not use whitespace, slash, quote, or comma characters. Each ID must be unique and must not collide with a gene identifier from the transcriptome. |
mhc_allele | The MHC allele for TCR Antigen Capture libraries. Must match the mhc_allele name specified in the Feature Reference CSV. For BCR Antigen Capture libraries, the analysis runs with or without this column. If you keep the header, leave its rows blank. |
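
The character rules for control_id can be checked with a few lines of Python. This is a sketch under stated assumptions: "slash" is taken to mean the forward slash, and "quote" is taken to cover both single and double quotes. It does not check uniqueness or collisions with gene identifiers, which require the Feature Reference CSV and the transcriptome.

```python
def is_valid_control_id(control_id):
    """Check the character rules stated for control_id.

    Assumption: forbidden characters are whitespace, "/", single and
    double quotes, and commas; non-ASCII characters are also rejected.
    """
    if not control_id or not control_id.isascii():
        return False
    return not any(ch.isspace() or ch in "/'\"," for ch in control_id)


print(is_valid_control_id("negative_control"))
print(is_valid_control_id("neg,ctrl"))
```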