Generating FASTQs with bcl2fastq (Illumina Software)

The bcl2fastq software is available for download and installation on the Illumina support website as an RPM package. An Illumina account is required for download. Please contact Illumina Support at techsupport@illumina.com if you have questions about bcl2fastq versions, or for help troubleshooting its download and installation.

You must create a sample sheet for bcl2fastq to correctly embed the names of samples into output FASTQ files. When you plan an experiment, you should know the name of the sample index set used for each sample, which comes from the reagent kit (such as "SI-TT-A1" for dual index or "SI-GA-A1" for single index).

The Illumina Experiment Manager can also be used to create sample sheets for use with bcl2fastq.

Important

Do not trim adapters during demultiplexing; leave the adapter settings blank. Trimming adapters from reads can damage the 10x barcodes and the UMIs, resulting in pipeline failure or data loss. If you are using an Illumina sample sheet for demultiplexing with bcl2fastq, BCL Convert, or our mkfastq pipeline, remove any Adapter, AdapterRead1, or AdapterRead2 lines from the [Settings] section.
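If your sample sheet was exported from another tool and already contains adapter settings, a small filter can remove them. The following is a minimal sketch, not a 10x or Illumina utility: the function name strip_adapter_settings and the file names in the usage comment are hypothetical, and it assumes each setting is written as a key followed by a comma.

```shell
# Print a sample sheet with any Adapter, AdapterRead1, or AdapterRead2
# lines removed from its [Settings] section. Other sections are untouched.
strip_adapter_settings() {
  awk '
    /^\[/ { in_settings = ($0 ~ /^\[Settings\]/) }   # track the current section
    in_settings && /^Adapter(Read[12])?,/ { next }   # drop adapter settings
    { print }
  ' "$1"
}

# Usage (hypothetical file names):
#   strip_adapter_settings SampleSheet.csv > SampleSheet.noadapters.csv
```

Review the filtered file before using it, since sample sheets produced by different tools can vary in layout.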

For each sample, enter its lane, sample name, and sample index sequences into the Illumina bcl2fastq sample sheet. Here is an example using the "SI-TT-A1" i7 index (index column) and its index2_workflow_b i5 index (index2 column):


[Data]
Lane,Sample_ID,index,index2
1,s1,GTAACATGCG,AGGTAACACT

For each sample, enter its lane, sample name, and set of four sample indices into the Illumina bcl2fastq sample sheet. Here is an example using "SI-GA-A1" indices:


[Data]
Lane,Sample_ID,index
1,sample1,GGTTTACT
1,sample1,CTAAACGG
1,sample1,TCGGCGTC
1,sample1,AACCGTAA

If you are running only a single sample in a lane, you can enter a single line with a blank index. In that case, bcl2fastq assigns reads with any sample index to that sample.
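When a run has many single index samples, writing four rows per sample by hand is error-prone. As a rough sketch (the function name single_index_data_section is hypothetical), the four-row layout shown earlier can be generated from a lane, a sample name, and the four sequences of the index set:

```shell
# Print a [Data] section for one sample that uses a single index set.
# Arguments: lane, sample name, then the four index sequences of the set.
single_index_data_section() {
  lane=$1
  sample=$2
  shift 2
  printf '[Data]\nLane,Sample_ID,index\n'
  for index in "$@"; do
    printf '%s,%s,%s\n' "$lane" "$sample" "$index"
  done
}

# Example: the SI-GA-A1 sequences from the sample sheet above.
single_index_data_section 1 sample1 GGTTTACT CTAAACGG TCGGCGTC AACCGTAA
```

Redirect the output to a file to use it as a sample sheet. For several samples, print the header once and append only the data rows.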

Illumina bcl2fastq must be called with the correct --use-bases-mask argument, and other arguments, in order to properly demultiplex and output FASTQs for all the reads in a Chromium library.

In the examples below, ${FLOWCELL_DIR} is the directory that contains a flow cell's Data/ folder, ${OUTPUT_DIR} is the directory that you want to output FASTQs to, and ${SAMPLE_SHEET_PATH} is the path to the sample sheet CSV you created.
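Before launching a run, it can help to confirm that these paths point where you expect. The following is a minimal sketch and not part of bcl2fastq; the function name check_bcl2fastq_inputs is hypothetical. It only checks that the flow cell directory contains a Data/ folder and that the sample sheet file exists.

```shell
# Return 0 if the inputs look usable; otherwise report each problem and return 1.
# Arguments: flow cell directory, sample sheet path.
check_bcl2fastq_inputs() {
  flowcell_dir=$1
  sample_sheet=$2
  status=0
  if [ ! -d "$flowcell_dir/Data" ]; then
    echo "no Data/ folder in $flowcell_dir" >&2
    status=1
  fi
  if [ ! -f "$sample_sheet" ]; then
    echo "sample sheet not found: $sample_sheet" >&2
    status=1
  fi
  return "$status"
}

# Usage, once the variables are set:
#   check_bcl2fastq_inputs "${FLOWCELL_DIR}" "${SAMPLE_SHEET_PATH}" || exit 1
```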

For bcl2fastq2 v2.20, these are the most common command line formats for sequencers running RTA 1.18.54 and higher for either dual or single index kits:

Dual index demultiplexing (edit the file paths for your data):


SAMPLE_SHEET_PATH=/path/to/sample/sheet/csv
OUTPUT_DIR=/path/to/save/outputs
FLOWCELL_DIR=/path/to/input/BCL/files
INTEROP_DIR=./stats

bcl2fastq --use-bases-mask=Y28,I10,I10,Y91 \
--create-fastq-for-index-reads \
--minimum-trimmed-read-length=8 \
--mask-short-adapter-reads=8 \
--ignore-missing-positions \
--ignore-missing-controls \
--ignore-missing-filter \
--ignore-missing-bcls \
-r 6 -w 6 \
-R ${FLOWCELL_DIR} \
--output-dir=${OUTPUT_DIR} \
--interop-dir=${INTEROP_DIR} \
--sample-sheet=${SAMPLE_SHEET_PATH}

Single index demultiplexing (edit the file paths for your data):


SAMPLE_SHEET_PATH=/path/to/sample/sheet/csv
OUTPUT_DIR=/path/to/save/outputs
FLOWCELL_DIR=/path/to/input/BCL/files
INTEROP_DIR=./stats

bcl2fastq --use-bases-mask=Y26,I8,Y98 \
--create-fastq-for-index-reads \
--minimum-trimmed-read-length=8 \
--mask-short-adapter-reads=8 \
--ignore-missing-positions \
--ignore-missing-controls \
--ignore-missing-filter \
--ignore-missing-bcls \
-r 6 -w 6 \
-R ${FLOWCELL_DIR} \
--output-dir=${OUTPUT_DIR} \
--interop-dir=${INTEROP_DIR} \
--sample-sheet=${SAMPLE_SHEET_PATH}

To limit bcl2fastq to a subset of lanes, supply values to the --tiles argument.
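As an illustration, the helper below builds a --tiles value from a list of lane numbers. It is a sketch that assumes the s_<lane> pattern form described in Illumina's bcl2fastq documentation, where --tiles takes a comma-separated list of regular expressions; the function name tiles_for_lanes is hypothetical.

```shell
# Build a --tiles value that selects whole lanes, e.g. lanes 1 and 2 -> s_1,s_2.
tiles_for_lanes() {
  tiles=""
  for lane in "$@"; do
    tiles="${tiles:+$tiles,}s_${lane}"   # join patterns with commas
  done
  printf '%s\n' "$tiles"
}

# Usage: add the result to either bcl2fastq command above, for example
#   --tiles "$(tiles_for_lanes 1 2)"
```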

Omitting extra bases from reads:

If you add extra bases to a sample index read, you will need to account for this in the --use-bases-mask argument. For example, if you ran a sample index read with nine bases, you will need to truncate the last base in order for Cell Ranger to run correctly.

You can exclude a single base by adding a single n character to the read argument, or exclude all bases after a certain position by adding n*. See below:

Read                  Desired length   Actual length   Argument
i7 Index Read (I1)    8                9               I8n
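The rule in the table can be written as a small helper that builds one segment of the --use-bases-mask value. This is a sketch only: the function name mask_segment is hypothetical, and it assumes the bases you want to keep are always at the start of the read. It appends n when one cycle is dropped and n* when more than one is dropped.

```shell
# Build one --use-bases-mask segment.
# Arguments: read kind (Y or I), cycles to keep, cycles actually sequenced.
mask_segment() {
  kind=$1
  keep=$2
  sequenced=$3
  extra=$((sequenced - keep))
  if [ "$extra" -lt 0 ]; then
    echo "cannot keep $keep cycles from a read of $sequenced cycles" >&2
    return 1
  elif [ "$extra" -eq 0 ]; then
    printf '%s%s\n' "$kind" "$keep"      # nothing to mask
  elif [ "$extra" -eq 1 ]; then
    printf '%s%sn\n' "$kind" "$keep"     # mask the single trailing cycle
  else
    printf '%s%sn*\n' "$kind" "$keep"    # mask all remaining cycles
  fi
}

mask_segment I 8 9   # the table row above
```

Join the segments with commas, in read order, to form the full --use-bases-mask value.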

A new folder is created (name specified by the --output-dir flag). This folder contains the FASTQ file sets, statistics, and reports.

A convenient way to test bcl2fastq is by downloading the tiny-BCL-data example dataset. This dual-indexed iSeq dataset has been selected for its small size (541 MB). The example below is applicable to 3' Single Cell Gene Expression, 5' Immune Profiling, Fixed RNA Profiling, and Visium libraries processed with the TT Set A dual index kit. The dataset should not be used to run downstream pipelines (e.g. cellranger count).

To follow along:

Download the iseq-DI.tar.gz file
Extract the archive by running:


tar -xf /working-directory/iseq-DI.tar.gz

Download the bcl2fastq_samplesheet.csv

Run bcl2fastq (remember to customize the /working-directory/ path with the path to your input/output directory):


SAMPLE_SHEET_PATH=/working-directory/bcl2fastq_samplesheet.csv
OUTPUT_DIR=/working-directory/tiny-FASTQs
FLOWCELL_DIR=/working-directory/iseq-DI
INTEROP_DIR=/working-directory/stats

bcl2fastq --use-bases-mask=Y28,I10,I10,Y91 \
  --create-fastq-for-index-reads \
  --minimum-trimmed-read-length=8 \
  --mask-short-adapter-reads=8 \
  --ignore-missing-positions \
  --ignore-missing-controls \
  --ignore-missing-filter \
  --ignore-missing-bcls \
  -r 6 -w 6 \
  -R ${FLOWCELL_DIR} \
  --output-dir=${OUTPUT_DIR} \
  --interop-dir=${INTEROP_DIR} \
  --sample-sheet=${SAMPLE_SHEET_PATH}

A folder called tiny-FASTQs is created in the working directory. This folder contains your newly created FASTQ files.


tiny-FASTQs/
    ├── Reports
    │   └── html
    ├── Stats
    │   ├── AdapterTrimming.txt
    │   ├── ConversionStats.xml
    │   ├── DemultiplexingStats.xml
    │   ├── DemuxSummaryF1L1.txt
    │   ├── FastqSummaryF1L1.txt
    │   └── Stats.json
    ├── iseq-DI_S1_L001_I1_001.fastq.gz
    ├── iseq-DI_S1_L001_I2_001.fastq.gz
    ├── iseq-DI_S1_L001_R1_001.fastq.gz
    ├── iseq-DI_S1_L001_R2_001.fastq.gz
    ├── Undetermined_S0_L001_I1_001.fastq.gz
    ├── Undetermined_S0_L001_I2_001.fastq.gz
    ├── Undetermined_S0_L001_R1_001.fastq.gz
    └── Undetermined_S0_L001_R2_001.fastq.gz
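To confirm that a sample has the full set of files shown above, you can check for each read type by name. The helper below is a minimal sketch under two assumptions: the run is dual-indexed, and index FASTQs were requested with --create-fastq-for-index-reads, so I1, I2, R1, and R2 are all expected. The function name missing_read_types is hypothetical.

```shell
# Print each read type (I1, I2, R1, R2) that has no FASTQ file for a sample.
# Arguments: output directory, sample name as it appears in the file names.
missing_read_types() {
  dir=$1
  sample=$2
  for read in I1 I2 R1 R2; do
    found=no
    for f in "$dir/${sample}"_S*_L*_"${read}"_001.fastq.gz; do
      # An unmatched glob stays literal, so test that the file really exists.
      if [ -e "$f" ]; then
        found=yes
      fi
    done
    if [ "$found" = no ]; then
      printf '%s\n' "$read"
    fi
  done
}

# Usage: prints nothing when all four read types are present.
#   missing_read_types /working-directory/tiny-FASTQs iseq-DI
```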
