To follow along, you must:
- Have basic UNIX command line experience
- Fulfill these system requirements
- Download and install the Cell Ranger software
- Choose a compute platform
- Have access to a UNIX command prompt
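Before starting, it can help to confirm that the command line tools used in this tutorial are available. The following is a minimal sketch, not part of Cell Ranger: `missing_tools` is a hypothetical helper name, and the check only assumes that `curl`, `tar`, and `cellranger` should be on your PATH.

```shell
# Print the name of every listed command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# curl and tar are used for the downloads; cellranger runs the pipeline.
missing=$(missing_tools curl tar cellranger)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All required tools found."
fi
```

If `cellranger` is reported as missing, revisit the installation step and make sure its directory has been added to your PATH.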
We will work with the Human B cells dataset from a Healthy Donor (1k cells).
Watch this short video tutorial or follow text instructions to download example FASTQs.
Open up a terminal window. You may log in to a remote server or choose to perform the compute on your local machine. Refer to the System Requirements page for details.
In the working directory, create a new folder called dataset-multi-practice/ and cd into that folder:
mkdir dataset-multi-practice
cd dataset-multi-practice
Download the input FASTQ files:
curl -LO https://cf.10xgenomics.com/samples/cell-vdj/6.0.0/sc5p_v2_hs_B_1k_multi_5gex_b_Multiplex/sc5p_v2_hs_B_1k_multi_5gex_b_Multiplex_fastqs.tar
A file named sc5p_v2_hs_B_1k_multi_5gex_b_Multiplex_fastqs.tar should appear in your directory when you list files with the ls command.
Uncompress the FASTQs:
tar -xf sc5p_v2_hs_B_1k_multi_5gex_b_Multiplex_fastqs.tar
You should now see a folder called sc5p_v2_hs_B_1k_multi_5gex_b_fastqs that contains two subfolders, sc5p_v2_hs_B_1k_5gex_fastqs and sc5p_v2_hs_B_1k_b_fastqs.
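As an optional sanity check on the extraction, you can count the FASTQ files in each subfolder. This sketch assumes the files end in `.fastq.gz` and that you are still inside dataset-multi-practice/; `count_fastqs` is a hypothetical helper name.

```shell
# Count the FASTQ files (*.fastq.gz) anywhere below a directory.
count_fastqs() {
    find "$1" -type f -name '*.fastq.gz' | wc -l | tr -d ' '
}

fastq_root=sc5p_v2_hs_B_1k_multi_5gex_b_fastqs
for sub in sc5p_v2_hs_B_1k_5gex_fastqs sc5p_v2_hs_B_1k_b_fastqs; do
    if [ -d "$fastq_root/$sub" ]; then
        echo "$sub: $(count_fastqs "$fastq_root/$sub") FASTQ files"
    else
        echo "$sub: directory not found"
    fi
done
```

A count of zero, or a "directory not found" message, suggests the download or the tar step did not finish.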
Navigate back to the working directory:
cd ..
Confirm that you are in the correct directory by running the ls command; the working directory should contain the dataset-multi-practice folder.
Download the pre-built human reference transcriptome to the working directory and uncompress it:
curl -O https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-GRCh38-2020-A.tar.gz
tar -xf refdata-gex-GRCh38-2020-A.tar.gz
Next, download the pre-built V(D)J reference to the working directory and uncompress it:
curl -O https://cf.10xgenomics.com/supp/cell-vdj/refdata-cellranger-vdj-GRCh38-alts-ensembl-5.0.0.tar.gz
tar -xf refdata-cellranger-vdj-GRCh38-alts-ensembl-5.0.0.tar.gz
In your working directory, create a new CSV file called multi_config.csv using your text editor of choice:
nano multi_config.csv
Copy and paste this text into the newly created file, and customize file paths:
[gene-expression]
reference,/jane.doe/working-directory/refdata-gex-GRCh38-2020-A
expect-cells,1000
[vdj]
reference,/jane.doe/working-directory/refdata-cellranger-vdj-GRCh38-alts-ensembl-5.0.0
[libraries]
fastq_id,fastqs,lanes,feature_types,subsample_rate
sc5p_v2_hs_B_1k_5gex,/jane.doe/working-directory/dataset-multi-practice/sc5p_v2_hs_B_1k_multi_5gex_b_fastqs/sc5p_v2_hs_B_1k_5gex_fastqs,1|2,gene expression,
sc5p_v2_hs_B_1k_b,/jane.doe/working-directory/dataset-multi-practice/sc5p_v2_hs_B_1k_multi_5gex_b_fastqs/sc5p_v2_hs_B_1k_b_fastqs,1|2,vdj,
Use your text editor's save command to save the file. In nano, save by typing CTRL+X → y → ENTER.
A customizable multi config CSV template is available for download on the example dataset page, under the Input Files tab.
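If you prefer not to edit the paths by hand, the config shown above can also be generated from the shell. This is a sketch under the assumption that your directory layout matches this tutorial exactly; `write_multi_config` is a hypothetical helper, not a Cell Ranger command.

```shell
# Write the tutorial's multi config CSV.
#   $1: absolute path of the working directory (default: current directory)
#   $2: output file (default: multi_config.csv)
write_multi_config() {
    workdir=${1:-$PWD}
    outfile=${2:-multi_config.csv}
    fastqs="$workdir/dataset-multi-practice/sc5p_v2_hs_B_1k_multi_5gex_b_fastqs"
    cat > "$outfile" <<EOF
[gene-expression]
reference,$workdir/refdata-gex-GRCh38-2020-A
expect-cells,1000
[vdj]
reference,$workdir/refdata-cellranger-vdj-GRCh38-alts-ensembl-5.0.0
[libraries]
fastq_id,fastqs,lanes,feature_types,subsample_rate
sc5p_v2_hs_B_1k_5gex,$fastqs/sc5p_v2_hs_B_1k_5gex_fastqs,1|2,gene expression,
sc5p_v2_hs_B_1k_b,$fastqs/sc5p_v2_hs_B_1k_b_fastqs,1|2,vdj,
EOF
}

# Example, run from the working directory:
#   write_multi_config "$PWD" multi_config.csv
```

Using `$PWD` keeps the paths absolute, so the config stays valid when cellranger multi is later started from a different directory.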
Once you have all the necessary files, make a new directory called runs/ in your working directory:
mkdir runs/
cd runs/
You will run cellranger multi in the runs/ directory.
After downloading the FASTQ files, the reference transcriptome, and a V(D)J reference, you are ready to run cellranger multi.
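A mistyped path in the config is a common reason for a run to stop early, so it may be worth checking the paths first. The sketch below is an illustration only: `missing_config_paths` is a hypothetical helper that reads the `reference` lines and the `fastqs` column of the `[libraries]` section, and it assumes the simple comma-separated layout used in this tutorial.

```shell
# Print every path listed in a multi config CSV that does not exist on disk.
# Reads "reference,<path>" lines and the second column of the rows in the
# [libraries] section (the header row starting with fastq_id is skipped).
missing_config_paths() {
    awk -F, '
        /^\[/                                                  { section = $1; next }
        $1 == "reference"                                      { print $2; next }
        section == "[libraries]" && $1 != "fastq_id" && NF > 1 { print $2 }
    ' "$1" | while IFS= read -r entry; do
        [ -e "$entry" ] || printf '%s\n' "$entry"
    done
}

# Example, run from the runs/ directory:
#   missing_config_paths ../multi_config.csv
```

No output means every reference and FASTQ path in the config was found.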
Print the usage statement to get a list of all the options:
cellranger multi --help
The output should look similar to:
user_prompt$ cellranger multi --help
cellranger-multi
Analyze multiplexed data or combined gene expression/immune profiling/feature
barcode data
USAGE:
cellranger multi [FLAGS] [OPTIONS] --id --csv
FLAGS:
--dry Do not execute the pipeline. Generate a pipeline
invocation (.mro) file and stop
--disable-ui Do not serve the web UI
--noexit Keep web UI running after pipestance completes or fails
--nopreflight Skip preflight checks
-h, --help Prints help information
OPTIONS:
--id A unique run id and output folder name [a-zA-Z0-
9_-]+
--description Sample description to embed in output files
[default: ]
--csv Path of CSV file enumerating input libraries and
analysis parameters
--jobmode Job manager to use. Valid options: local
(default), sge, lsf, slurm or path to a
.template file. Search for help on "Cluster
Mode" at support.10xgenomics.com for more
details on configuring the pipeline to use a
compute cluster [default: local]
--localcores Set max cores the pipeline may request at one
time. Only applies to local jobs
....
Options used in this tutorial
Option | Description |
---|---|
--id | The id argument must be a unique run ID. We will call this run HumanB_Cell_multi based on the sample type in the example dataset. |
--csv | Path to the multi config CSV file enumerating input libraries and analysis parameters. Your multi_config.csv file is in the working directory. When executing cellranger multi from the runs directory, the relative path should be: ../multi_config.csv |
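The usage statement lists the allowed characters for a run ID as `[a-zA-Z0-9_-]+`. As a small sketch of what that pattern permits, the hypothetical helper `valid_run_id` below accepts only non-empty strings made of letters, digits, underscores, and hyphens.

```shell
# Succeed only if the argument is non-empty and matches [a-zA-Z0-9_-]+.
valid_run_id() {
    case $1 in
        ''|*[!a-zA-Z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}

if valid_run_id "HumanB_Cell_multi"; then
    echo "HumanB_Cell_multi is an acceptable run ID"
fi
```

An ID containing spaces, such as "Human B cells", would be rejected by this check.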
From within the working-directory/runs/ directory, run cellranger multi:
cellranger multi --id=HumanB_Cell_multi --csv=../multi_config.csv
The run begins similar to this:
user_prompt$ cellranger multi --id=HumanB_Cell_multi --csv=/jane.doe/working-directory/multi_config.csv
Martian Runtime - v4.0.6
Serving UI at http://bespin1.fuzzplex.com:43129?auth=tIgY0u8ax70yeWhWKF61SkSgJDKvOIgZ-yjxYNJXXtY
Running preflight checks (please wait)...
2022-01-06 16:36:56 [runtime] (ready) ID.HumanB_Cell_multi.SC_MULTI_CS.PARSE_MULTI_CONFIG
2022-01-06 16:36:56 [runtime] (run:hydra) ID.HumanB_Cell_multi.SC_MULTI_CS.PARSE_MULTI_CONFIG.fork0.chnk0.main
2022-01-06 16:37:26 [runtime] (chunks_complete) ID.HumanB_Cell_multi.SC_MULTI_CS.PARSE_MULTI_CONFIG
2022-01-06 16:37:26 [runtime] (ready) ID.HumanB_Cell_multi.SC_MULTI_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.DETECT_COUNT_CHEMISTRY
2022-01-06 16:37:26 [runtime] (run:hydra) ID.HumanB_Cell_multi.SC_MULTI_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.DETECT_COUNT_CHEMISTRY.fork0.chnk0.main
....
When the output of the cellranger multi command says, “Pipestance completed successfully!”, the job is done:
web_summary: /jane.doe/working-directory/runs/HumanB_Cell_multi/outs/per_sample_outs/HumanB_Cell_multi/web_summary.html
metrics_summary: /jane.doe/working-directory/runs/HumanB_Cell_multi/outs/per_sample_outs/HumanB_Cell_multi/metrics_summary.csv
}
Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!
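For long or unattended runs, one option is to save the terminal output to a file and search it for the success message afterwards. This is a sketch, not a Cell Ranger feature: the log file name multi_run.log and the helper `pipestance_succeeded` are assumptions made for illustration.

```shell
# Succeed if the saved pipeline output contains the success message.
pipestance_succeeded() {
    grep -q 'Pipestance completed successfully!' "$1"
}

# Example: capture the output while it is also shown on screen, then check it.
#   cellranger multi --id=HumanB_Cell_multi --csv=../multi_config.csv 2>&1 | tee multi_run.log
if [ -f multi_run.log ]; then
    if pipestance_succeeded multi_run.log; then
        echo "Run finished successfully."
    else
        echo "Success message not found; inspect multi_run.log."
    fi
fi
```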
A successful cellranger multi run produces a new directory called HumanB_Cell_multi/ (based on the --id flag specified during the run). The contents of the HumanB_Cell_multi/ directory:
── runs
   └── HumanB_Cell_multi
       ├── _cmdline
       ├── _filelist
       ├── _finalstate
       ├── HumanB_Cell_multi.mri.tgz
       ├── _invocation
       ├── _jobmode
       ├── _log
       ├── _mrosource
       ├── outs/
       ├── _perf
       ├── SC_MULTI_CS/
       ├── _sitecheck
       ├── _tags
       ├── _timestamp
       ├── _uuid
       ├── _vdrkill
       └── _versions
The outs/ directory contains all important output files generated by the cellranger multi pipeline:
── runs
   └── HumanB_Cell_multi
       └── outs
           ├── config.csv
           ├── multi
           │   ├── count
           │   │   ├── raw_cloupe.cloupe
           │   │   ├── raw_feature_bc_matrix
           │   │   ├── raw_feature_bc_matrix.h5
           │   │   ├── raw_molecule_info.h5
           │   │   ├── unassigned_alignments.bam
           │   │   └── unassigned_alignments.bam.bai
           │   └── vdj_b
           │       ├── all_contig_annotations.bed
           │       ├── all_contig_annotations.csv
           │       ├── all_contig_annotations.json
           │       ├── all_contig.bam
           │       ├── all_contig.bam.bai
           │       ├── all_contig.fasta
           │       ├── all_contig.fasta.fai
           │       └── all_contig.fastq
           ├── per_sample_outs
           │   └── HumanB_Cell_multi
           │       ├── count
           │       ├── metrics_summary.csv
           │       ├── vdj_b
           │       └── web_summary.html
           └── vdj_reference
               ├── fasta
               │   ├── donor_regions.fa
               │   └── regions.fa
               └── reference.json
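To confirm that the two per-sample summary files from the listing above are present, a short check like the following can be used. It is a sketch that assumes the directory layout shown in this tutorial; `missing_outputs` is a hypothetical helper name.

```shell
# Print each expected per-sample summary file that is missing.
#   $1: run directory (named after the --id value)
#   $2: sample name under outs/per_sample_outs/
missing_outputs() {
    run_dir=$1
    sample=$2
    for name in web_summary.html metrics_summary.csv; do
        target="$run_dir/outs/per_sample_outs/$sample/$name"
        [ -f "$target" ] || printf '%s\n' "$target"
    done
}

# Example, run from the runs/ directory:
#   missing_outputs HumanB_Cell_multi HumanB_Cell_multi
```

No output means both web_summary.html and metrics_summary.csv were found.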