Release Notes for Cell Ranger v1-v3.1 (Archived)

  • Feature Barcoding Only Analysis - It is now possible to run cellranger count using Cell Surface Protein (antibody captured) libraries without a Gene Expression (GEX) library. The previous version of Cell Ranger required a Gene Expression library along with a library generated by Feature Barcoding technology. The new version gives customers the flexibility to sequence either one of the libraries, or both. In particular, cell calling now works with antibody counts only, and all secondary analyses (PCA, t-SNE, UMAP, clusterings) work with an antibody-only count matrix as well. More details are available on the Feature Barcoding analysis page (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/3.1/using/feature-bc-analysis).

  • UMAP-based lower-dimensional projections of datasets analyzed by cellranger count are now produced in addition to the previously produced t-SNE projections. The projections are made available both as CSV files and as data that can be directly viewed in Loupe Browser. The parameters for the projection can also be modified and experimented with using cellranger reanalyze. This alternate visualization method has become increasingly popular for visualizing single cell data since the earliest report that used it. For more details, see the description in the algorithms overview section.

  • New Web Summary Look - The Cell Ranger web_summary.html file has been updated to match the styles and formats of other 10x products. Compared to the old version users will notice new fonts and some aesthetic changes in the new version.

  • Bug Fix: If equal numbers of reads with a given Barcode / UMI combination map to two genes, the assignment of the Barcode / UMI is now considered ambiguous and is not reported in molecule_info.h5 or the count matrix. Previously it was reported twice, once for each gene.

  • Other minor bug fixes
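
The tie-handling rule from the bug fix above can be illustrated with a minimal sketch. The function name `assign_gene` and the input layout (a mapping from gene ID to read count for one Barcode / UMI combination) are assumptions made for illustration and are not taken from Cell Ranger's code.

```python
from typing import Dict, Optional


def assign_gene(reads_per_gene: Dict[str, int]) -> Optional[str]:
    """Return the gene supported by the most reads, or None on a tie.

    reads_per_gene maps gene ID -> number of reads for a single
    Barcode / UMI combination (hypothetical input layout).
    """
    if not reads_per_gene:
        return None
    best = max(reads_per_gene.values())
    winners = [gene for gene, n in reads_per_gene.items() if n == best]
    # A tie for the top read count is ambiguous, so nothing is reported.
    return winners[0] if len(winners) == 1 else None
```

With this rule, a combination with 2 reads on each of two genes yields no assignment instead of two entries.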

Release Notes for Martian 3.2.3: Job Scheduling

  • Fix a crash in cases where the mrp binary becomes unavailable on disk during a pipestance run.

  • In addition to logging the type of filesystem for the pipestance directory, mrp will also log the type of filesystem for the martian bin directory (which is often different from the pipestance directory), and also the mount options for both directories.

  • Regardless of --jobinterval setting, mrp will now never attempt to submit more than one job at a time to the queue in cluster mode.

  • mrp will now shut down if the pipestance log file has been deleted, even if a new one has been created in its place. This prevents problems in the case where the pipestance directory (including the log and lock files) has been deleted.

  • Memory cgroups limits are now detected, reported, and used as default limits where applicable. This should be especially helpful for users submitting mrp to a cluster such as SLURM which uses memory cgroups to prevent jobs from using too much memory, by preventing mrp from trying to use more than the job's allowance.

  • Other small bug fixes and performance improvements.
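
The memory cgroup behavior described above can be sketched roughly as follows. This is not mrp's implementation: the file locations are common Linux defaults that vary with the cgroup version and the cgroup the process belongs to, and all function names are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Common locations for the memory limit; the actual path depends on the
# cgroup version and on which cgroup the process belongs to (assumption).
CANDIDATE_FILES = (
    "/sys/fs/cgroup/memory.max",                    # cgroup v2
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",  # cgroup v1
)


def parse_limit(text: str) -> Optional[int]:
    """Parse the contents of a limit file; 'max' means no limit is set."""
    value = text.strip()
    if value == "max" or not value:
        return None
    return int(value)


def read_cgroup_limit() -> Optional[int]:
    """Return the first limit found, or None if none is readable."""
    for name in CANDIDATE_FILES:
        try:
            limit = parse_limit(Path(name).read_text())
        except (OSError, ValueError):
            continue
        if limit is not None:
            return limit
    return None


def default_memory_bytes(system_total: int, cgroup_limit: Optional[int]) -> int:
    """Use the cgroup limit as the default when it is tighter than the system total."""
    if cgroup_limit is None:
        return system_total
    return min(system_total, cgroup_limit)
```

On a 64 GB node where a scheduler grants the job 8 GB, `default_memory_bytes` would return the 8 GB allowance rather than the node total.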

V(D)J Release Notes

Major algorithm changes and effects on performance

  • The assembly, annotation and cell calling algorithms have all been replaced, as have the reference sequences. However, with noted exceptions, the interface is unchanged.

  • Many changes were made to the assembly algorithm that allow it to achieve the same sensitivity using less data. After these changes, the recommended sequencing configuration was changed to 26 x 91 (from 150 x 150), while leaving the number of read pairs per cell fixed at 5000. This enables V(D)J, Gene Expression and Feature Barcoding libraries to be sequenced in a single run, thereby simplifying the workflow.

  • The effect of the new changes varies considerably from sample to sample and we have added a discussion on Experimental Design that explains some of this. In some instances the number of productive pairs increases markedly if the same dataset is rerun with the new code.

  • The old read configuration 150 x 150 is still supported and may be preferable for some users, because of pricing or availability, particularly for users who are running only V(D)J data. For 150 x 150, the recommended depth is proportionally lower, 2000 read pairs per cell.

  • Many corrections were made to the Prebuilt reference sequences.

  • Contig annotation has been improved in several ways. This includes more accurate detection of CDR3 regions, a more stringent full-length requirement, and a requirement that V segments begin with a start codon (coupled to reference sequence corrections). This could affect annotation for species other than human or mouse that have incomplete reference sequences.

  • Some large clones that were previously missed are now reported. They were missed for a variety of reasons, including failure to align J segments having high somatic hypermutation.

  • A productive pair is no longer declared in cases where three or more contigs share the same chain type (e.g. TRB, TRB, TRB). In such cases the GEM may contain two or more cells. In addition, certain clonal expansions of plasma cells are now contracted because the expansion represents mRNA leakage during processing, rather than a true biological expansion. Finally, requirements for small clones sharing a chain with a large clone have been tightened to reduce the likelihood of false clones arising from ambient mRNA or doublets. All of these changes correctly reduce the number of reported productive pairs (usually by a small fraction).

  • Because of these changes, we recommend that customers rerun existing datasets using Cell Ranger 3.1 if possible.

  • Because cell calling is changed, the denominator used for computing the Cells With Productive V-J Spanning Pair metric may have changed. For this reason, differences in performance between Cell Ranger 3.0 and 3.1 are better assessed using the Number of Cells With Productive V-J Spanning Pair metric.

  • Cell Ranger 3.1 is significantly faster. There are five fewer stages in the pipeline.
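
The two sequencing configurations recommended above can be compared by total sequenced bases per cell. The short calculation below uses only the read lengths and read-pair depths stated in these notes; the function name is illustrative.

```python
def bases_per_cell(read1_len: int, read2_len: int, read_pairs_per_cell: int) -> int:
    """Total sequenced bases per cell for a paired-end configuration."""
    return (read1_len + read2_len) * read_pairs_per_cell


short_config = bases_per_cell(26, 91, 5000)    # 26 x 91 at 5000 read pairs per cell
long_config = bases_per_cell(150, 150, 2000)   # 150 x 150 at 2000 read pairs per cell
print(short_config, long_config)
```

The two recommendations come to 585,000 and 600,000 bases per cell, which is one way to read the statement that the 150 x 150 depth is proportionally lower.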

Interface Changes:

  • Cell Count Confidence is no longer reported because we found that in some cases incorrect counts were reported with high confidence. Cell counting from V(D)J data alone is limited in accuracy because targeted cells having sufficiently low expression cannot be detected.

  • Contigs Unannotated is no longer reported because all contigs are now annotated. The justification for this is that since enrichment uses primers binding to constant regions, bona fide contigs would be expected to have at least a C annotation.

  • For species other than human or mouse, for which custom primers are needed, the sequences of the inner enrichment primers must now be supplied as a command-line argument.

Job Scheduling Changes

  • Add support for SGE and LSF clusters that track virtual memory use.

Enable Analysis of CITE-seq Experiments

  • Cell Ranger can now process data from experiments where the antibodies were conjugated to oligonucleotides that were captured by oligo-dT primers. Previously, only experiments which used the Chromium Single Cell 3' Feature Barcode Library Kit, which utilizes a different capture sequence for Gene Expression and Feature Barcoding data, could be analyzed.

  • Please note that while Cell Ranger is now compatible with CITE-seq data, CITE-seq is not a supported application. To ensure full support for your 10x data analysis please visit the Feature Barcode Analysis page to see the supported Feature Barcoding technology.

Bug fixes

  • Fix an issue where STAR would crash on CPUs without AVX support.

  • Fix a determinism issue when aggregating 3' v2 and v3 data.

  • Increase the memory reservation for the SORT_BY_POS stage.

General

  • Cell Ranger has been overhauled to support user-defined Feature Barcoding reagents, and to quantify these features alongside standard gene-expression reads. See Feature Barcoding for details. Users who have already run their data through earlier versions do not need to rerun it with this new version.

Cell Calling Changes

  • Cell Ranger 3.0 implements a version of the EmptyDrops cell calling algorithm that will call more low RNA content cells, especially when they are mixed with a population of high RNA content cells. See Cell Calling Algorithms for details.

  • The cell calling 'knee-plot' in the web summary now indicates what fraction of barcodes in each segment of the curve were called as cells, since the new cell calling algorithm no longer makes a hard threshold on UMI counts.

Output File Format Changes

  • The file formats of the gene-barcode matrix (now called the feature-barcode matrix) have changed to accommodate Feature Barcoding results.

  • The mtx and barcodes.tsv files are now gzipped to save disk space. The genes.tsv file has been renamed features.tsv.gz, and contains extra columns indicating the feature_type of each gene / feature.

  • See Feature-Barcode Matrices for details.

  • As part of this change, cellranger-rkit is deprecated. We recommend Seurat for analysis in R.

  • The Molecule info file format has been substantially changed to enable output from the new Feature Barcoding technology and remove rarely used mapping metrics.
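
A minimal sketch of reading the gzipped features file mentioned above, using only the Python standard library. The three-column layout (feature ID, name, feature_type) is an assumption here; see the Feature-Barcode Matrices page for the authoritative format. The function names are hypothetical.

```python
import csv
import gzip
from collections import Counter
from typing import List, Tuple


def read_features(path: str) -> List[Tuple[str, str, str]]:
    """Read (feature_id, name, feature_type) rows from a gzipped TSV file."""
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        # Rows with fewer than three columns are skipped (assumed layout).
        return [(row[0], row[1], row[2]) for row in reader if len(row) >= 3]


def count_feature_types(rows: List[Tuple[str, str, str]]) -> Counter:
    """Count how many features there are of each feature_type."""
    return Counter(feature_type for _, _, feature_type in rows)
```

Counting feature types is a quick way to check whether a matrix contains Feature Barcoding features in addition to genes.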

Cell Ranger 2.2.0 will require CentOS/RedHat 6 or Ubuntu 12 or later. See the 10x OS Support page for further information.

  • Fix Martian UI display in Firefox

  • Fix non-integral resource requests (memory/threads)

  • Fix SUBSAMPLE_READS producing wrong metric names. Newer versions of Martian no longer cast zero-fractional floats to ints, which this code was relying on to produce metric names with integral subsampling rates in them.

  • Fix failure to detect whitelist with demux when a single Sample Index is bad

  • Fix always-on multi-chromosome transcript warning in mkref

  • Fix stall in ALIGN_READS on filesystems that don't support named pipes

  • Fix python error when autodetect of chemistry fails with multiple FASTQ paths

  • Fix handling of sample names with multiple underscores in mkfastq pipeline

  • Fix suppression of process limit errors in the mkfastq QC stage
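
The SUBSAMPLE_READS fix above concerns whole-number rates arriving as floats (for example 20000.0 instead of 20000). A sketch of building a metric name that does not depend on an upstream float-to-int cast follows; the function name and the name pattern are hypothetical and are not Cell Ranger's actual metric names.

```python
def subsample_metric_name(rate: float, suffix: str = "subsampled_reads") -> str:
    """Build a metric name, rendering whole-number rates without '.0'."""
    value = float(rate)
    # Convert explicitly instead of relying on an upstream float-to-int cast.
    label = str(int(value)) if value.is_integer() else str(value)
    return f"{label}_{suffix}"
```

Without the explicit conversion, `str(20000.0)` gives `"20000.0"`, which would change the metric name.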

Changes to mkfastq

  • Barcode-aware QC stage is now opt-in via the --qc flag.

  • Limit total CPU usage across stages to 12 cores unless --localcores is specified. This should improve reliability on machines with high numbers of cores.

Cell Ranger 2.1.1 Gene Expression

Note: This is expected to be the last version of Cell Ranger to support CentOS/RedHat 5 and Ubuntu 10. If you are using one of those operating systems, Cell Ranger will now warn you. Future versions of Cell Ranger will require CentOS/RedHat 6 or Ubuntu 12 or later. See the 10x OS Support page for further information.

Bug Fixes

  • Fix library ID labels being out of order in the matrix HDF5 file produced by cellranger aggr when 10 or more libraries are aggregated. This manifests as Loupe Cell Browser showing the library ID labels out of order after running cellranger reanalyze.

  • Fix an out-of-memory error occurring when generating the kmer index on a reference with very long transcripts, e.g. on a pre-mRNA reference used when analyzing nuclei samples.

  • Fix crash when analyzing FASTQs produced by SRA's fastq-dump.

  • Fix the Differential Expression table in the web summary disappearing when gene IDs are equal to gene names in the reference GTF.

  • Fix a few web summary metrics becoming negative when more than 2.1 billion reads are analyzed at once.

  • Fix incorrect parsing of the --localcores argument, causing --localmem to be ignored when specified immediately after --localcores.

  • Fix crash in mkfastq on NovaSeq when RunParameters.xml is named runParameters.xml.

  • Fix hang when running sitecheck on some systems.

  • Fix error reporting in python stage code imports.

  • Fix estimation of stage virtual memory usage.
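
The threshold of 2.1 billion reads in the negative-metrics fix above is close to the largest value a signed 32-bit integer can hold (2^31 - 1 = 2,147,483,647). These notes do not state the cause, so the sketch below only illustrates one plausible explanation: a counter stored in 32 bits wraps to a negative value. The helper names are hypothetical.

```python
import ctypes

INT32_MAX = 2**31 - 1  # 2,147,483,647, roughly 2.1 billion


def as_int32(count: int) -> int:
    """Return the value a signed 32-bit integer would hold for this count."""
    return ctypes.c_int32(count).value


def as_int64(count: int) -> int:
    """Return the value a signed 64-bit integer would hold for this count."""
    return ctypes.c_int64(count).value
```

A count of 2,200,000,000 reads is negative when stored in 32 bits and unchanged when stored in 64 bits.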

Improvements

  • Truncate large metadata files when generating a tarball for upload to 10x, rather than omitting them.

  • Remove the requirement that the reference FASTA file modification time precede the STAR index file modification times.

  • The default --localmem in cluster mode will no longer exceed the free memory available when cellranger starts.

New Features

  • Add support for and autodetection of Single Cell 5' gene expression libraries, with support for both paired-end alignment (150x150) and R2-only alignment (26x98).

  • Add --r1-length and --r2-length options to cellranger count which enable hard trimming of input FASTQs.

  • Add --exclude-genes option to cellranger reanalyze which, analogously to --genes, allows for the exclusion of some genes from the secondary analysis (PCA, clustering, etc.).

  • Add --chemistry to cellranger count to override the automatic chemistry detection.

Performance Improvements

  • Reduce the run time by 30%.

  • Reduce the disk storage high-water-mark by 60%.

Algorithm Improvements

  • Change the Antisense Reads Metric to only count a read as antisense if it has no sense alignments, effectively prioritizing sense alignments over antisense for this computation.
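
The revised antisense rule can be expressed as a small classification function. The strand labels and function names below are assumptions for illustration; the sketch only encodes the rule stated above, that a read with any sense alignment is not counted as antisense.

```python
from typing import Iterable


def is_antisense_read(alignment_strands: Iterable[str]) -> bool:
    """Classify a read from the strand labels of its alignments.

    Each label is "sense" or "antisense" (hypothetical encoding).
    The read counts as antisense only if it has at least one antisense
    alignment and no sense alignments.
    """
    strands = set(alignment_strands)
    return "antisense" in strands and "sense" not in strands


def antisense_fraction(reads: Iterable[Iterable[str]]) -> float:
    """Fraction of reads classified as antisense."""
    reads = list(reads)
    if not reads:
        return 0.0
    return sum(is_antisense_read(r) for r in reads) / len(reads)
```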

Output File Changes

  • Stop generating the TR and TQ BAM tags because these tags were retaining trimmed sequences that Cell Ranger would ignore anyway after converting the BAM back to FASTQ.

  • Add more mapping metrics (Reads Mapped to Genome, Reads Mapped Confidently to Genome), and reorder the mapping metrics to be consistent with their order of computation.

Bug Fixes

  • Fix mkfastq allowing max bcl2fastq threads to exceed --localcores.

  • Fix mkfastq crashing when reading NovaSeq quality data from RTA 3.3 and later.

  • Fix excessive memory requests in SC_RNA_ANALYZER.

  • Fix nondetection of louvain binary failure in RUN_GRAPH_CLUSTERING.

  • Fix crash in RUN_GRAPH_CLUSTERING when /dev/stdin doesn't exist.

  • Fix the barcode rank plot concatenating instead of unioning barcodes in multi-genome datasets.

System Requirements Changes

  • Cell Ranger no longer supports Ubuntu 8 or CentOS 5.2 Linux distributions. Ubuntu 10.04 LTS or CentOS 5.5 or greater are now required.

Job Scheduling

  • The pipeline management system, mrp, is now open source on GitHub.

  • The monitoring port for the user interface is now always on by default, with an OS-selected port if none is specified.

    • This behavior can be disabled with --disable-ui.
    • Access to the user interface port, if no port was specified explicitly, now requires a randomly-generated authentication token. This token is visible in the pipeline standard output and in the _uiport file.
  • A new tool, mrstat is now available.

    • Given the path to the directory with a running pipeline, mrstat will return basic information about the progress of the pipeline.
    • With the --stop flag, it will cause the pipeline to fail and exit.
  • Two new variables are available for use in cluster-mode templates:

    • __MRO_JOB_WORKDIR__ can be used to specify the absolute path to the directory where the job should execute. This should alleviate issues on clusters such as PBS which sometimes do not set the working directory correctly.
    • __MRO_ACCOUNT__ passes the MRO_ACCOUNT environment variable from mrp's environment. This is intended for cluster managers which support charging resources to specific accounts.
  • The pipeline standard output and log will now periodically provide progress updates for in-progress stages.

  • mrp will now provide clearer and more useful error reporting when the pipeline directory runs out of disk space.

  • Several enhancements to the reliability of pipeline restart.

  • Fixes for several cases where a pipeline could "hang" indefinitely without making further progress.

  • Pipelines should now do a better job of staying within their CPU usage allocation.

Bug fixes

  • Properly ignore SIGHUP when a pipeline is run using nohup.

Pipeline Argument Changes

  • Add --override option to all pipelines, allowing for stage-level overrides for cores and memory.

  • Reanalyze no longer requires --agg to persist library ID; it is only required for persisting user-defined fields.

Bug fixes

  • Fix CHUNK_READS using more cores and using them less efficiently than intended.

  • Fix aggr using incorrect downsampling rates when more than 10 libraries are aggregated.

  • Fix mkfastq proceeding even after bcl2fastq is killed.

  • Fix lack of robustness to rare events where NFS latency induces double file deletion or double directory creation events.

  • Fix ALIGN_READS proceeding after the STAR subprocess fails, causing crashes in ATTACH_BCS_AND_UMIS.

  • Improve error messages when STAR or samtools fail in ALIGN_READS.

  • Fix spaces in transcript IDs causing ATTACH_BCS_AND_UMIS to crash. mkref no longer allows spaces in transcript IDs.

  • Fix crash when reads are adapter-trimmed by bcl2fastq and some reads end up empty.

  • Fix out-of-memory condition in ATTACH_BCS_AND_UMIS for some libraries with >800M reads.

  • Fix question marks replacing axis titles of barcode rank plot in web summary.

  • Fix excessive memory consumption and runtime of mkfastq on large sample sheets.

Job Scheduling

  • Fix several cases where, after mrp (which is invoked by cellranger) gets killed, it was not able to restart correctly.

  • On SGE clusters, cellranger/mrp now periodically runs qstat to verify that the jobs it queued have not been killed or canceled.

  • If the run fails, the contents of the relevant _errors file are now printed, instead of just a message pointing the user to that file.

  • On automatic retry of failed stages, the reason for the original failure is logged.

  • mrp is now more resilient against certain kinds of filesystem errors.

  • In the event of certain types of filesystem problems (such as permissions errors or disk quota), mrp/cellranger should now sometimes be able to provide more useful and immediate error messages.

  • Additional information about the environment cellranger runs in is now logged and included in mri.tgz.

  • mrp now correctly handles the signals sent by SGE and LSF when a soft time limit is reached (e.g. for SGE, -l s_rt 23:00:00).

  • Now supports --overrides method to dynamically change additional CPU and memory per stage.

Pipeline argument changes

  • Add --barcodes and --genes options to reanalyze, which allow selection of a specific subset of barcodes and/or genes to use in the secondary analysis.

  • Add --force-cells option to count and reanalyze to explicitly set the cell count. If specified, Cell Ranger will take the top N barcodes (by UMI count) as cells instead of doing dynamic cell count estimation.

  • Rename the estimated cells option from --cells to --expect-cells for clarity.

  • Add --nosecondary flag to count, which skips the secondary analysis.

  • Disallow slashes in the --genome argument in mkref.

  • Add --id option to mkfastq, which allows you to name the output directory.
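
The effect of --force-cells described above (take the top N barcodes by UMI count) can be sketched as follows. The function name, input layout, and tie-breaking rule are assumptions; the notes do not say how ties are handled.

```python
from typing import Dict, List


def force_cells(umi_counts: Dict[str, int], n_cells: int) -> List[str]:
    """Return the n_cells barcodes with the highest UMI counts.

    Ties are broken by barcode sequence so the result is deterministic
    (an assumption; the notes do not describe tie handling).
    """
    ranked = sorted(umi_counts.items(), key=lambda item: (-item[1], item[0]))
    return [barcode for barcode, _ in ranked[:n_cells]]
```

If N exceeds the number of observed barcodes, this sketch simply returns all of them.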

New subcommands

  • Add cellranger mat2csv command, which converts a Cell Ranger sparse gene-barcode matrix to a dense CSV format. Note that the resulting file will be very large, even for a few hundred cells.
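
To see why a dense CSV grows so quickly, consider the sketch below, which writes every gene-by-barcode value, zeros included. It is not the mat2csv implementation; the CSV layout, the function names, and the sparse input format are assumptions.

```python
import csv
import io
from typing import Dict, List, Tuple


def sparse_to_dense_csv(
    genes: List[str],
    barcodes: List[str],
    entries: Dict[Tuple[int, int], int],
) -> str:
    """Write a dense gene-by-barcode CSV from sparse (row, col) -> count entries."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([""] + barcodes)
    for i, gene in enumerate(genes):
        # Every cell of the matrix is written, including the zeros.
        writer.writerow([gene] + [entries.get((i, j), 0) for j in range(len(barcodes))])
    return buffer.getvalue()


def dense_cell_count(n_genes: int, n_barcodes: int) -> int:
    """Number of values a dense matrix holds, regardless of sparsity."""
    return n_genes * n_barcodes
```

For example, a hypothetical reference with 30,000 genes and 500 cells gives 15 million values in the dense file, however few of them are nonzero.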

Web summary changes

  • Add "Reads Mapped Antisense to Gene" metric, which quantifies reads that are mapped to the non-coding strand of a gene. High values can indicate the use of an unsupported chemistry type, e.g. passing a Single Cell V(D)J library to cellranger count.

  • Add "Fraction GEMs with >1 Cell (Lower / Upper Bound)" metrics, which define a confidence interval for the multiplet rate estimate in multi-genome samples.

  • Add more details to various metric descriptions.

Algorithm improvements

  • Add the requirement that reads overlap annotated exons by at least 50% in order to be considered exonic. As a result, "Reads Mapped Confidently to Exonic Regions" may differ slightly from previous versions.

  • Reduce EXTRACT_READS per-read runtime by 50% by avoiding OrderedDict and caching metric calculations.

  • Reduce SUBSAMPLE_READS runtime by reducing the number of fixed target values for subsampling (to just 25k and 50k reads per cell).
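
The 50% exon-overlap requirement above can be sketched with simple interval arithmetic. This is an illustration under stated assumptions (half-open coordinates, non-overlapping exons, an unspliced read treated as one interval), not the pipeline's code, and the names are hypothetical.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates


def exonic_fraction(read: Interval, exons: Iterable[Interval]) -> float:
    """Fraction of the read's bases covered by the given exons.

    Exons are assumed to be non-overlapping (for example, already merged),
    so per-exon overlaps can simply be summed.
    """
    start, end = read
    length = end - start
    if length <= 0:
        return 0.0
    covered = sum(
        max(0, min(end, e_end) - max(start, e_start)) for e_start, e_end in exons
    )
    return covered / length


def is_exonic(read: Interval, exons: Iterable[Interval], min_fraction: float = 0.5) -> bool:
    """Apply the 'at least 50% overlap' rule."""
    return exonic_fraction(read, exons) >= min_fraction
```

A 100-base read with 50 bases inside an exon passes the threshold, while one with 40 bases inside does not.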

File format improvements

  • Due to a format change (removal of the IntervalTree object), references produced with cellranger mkref using Cell Ranger v2.0 are not compatible with pipelines from Cell Ranger v1.x.

  • Modify the TX, GX, and GN tags to have more granular transcript/gene annotations. Each BAM record is only annotated with transcripts/genes specific to that alignment, instead of combining annotations from all alignments of the corresponding read.

  • Add RE tag, which indicates whether an alignment is exonic, intronic or intergenic.

Bug fixes

  • Fix rare bug in interval arithmetic, leading to exonic reads being falsely annotated as intronic or intergenic. As a result of this bugfix, "Reads Mapped Confidently to Exonic Regions" may differ slightly from previous versions.

  • Fix excessive EXTRACT_READS runtime (10+ hours) on very large FASTQs such as those produced by mkfastq.

  • Fix a crash in RUN_GRAPH_CLUSTERING on filesystems that do not support named pipes.

  • Fix SUBSAMPLE_READS using more VMEM than expected, causing it to be killed by SGE when exceeding the h_vmem limit on certain clusters.

  • Fix mkfastq not merging output files properly due to sample numbering issues.

  • Fix mkfastq crash due to the -d (demultiplexing-threads) argument being deprecated in bcl2fastq 2.19.

  • Fix the components.csv file produced by PCA, which did not contain the correct matrix.

  • Fix a crash in RUN_PCA when the number of nonzero genes is smaller than the number of principal components.

  • Fix a crash in mkref with very large genomes; use the limitGenomeGenerateRAM option in STAR to overcome its default reference size limit.

  • Fix certain special characters (like dashes) in reference names breaking the subsampled genes detected plot.

  • Fix mkloupe displaying an unhelpful error message when run on mixed-species runs and those from Cell Ranger v1.1 or earlier.

  • Fix the open-file-handle-limit check using the submit host rather than the execution machine.

  • Fix cellranger aggr allowing duplicate library_ids.

  • Fix CLOUPE_PREPROCESS taking the full matrix even after reanalyze subselects barcodes.

  • Fix a crash in mkfastq on RunInfo.xml files produced by the NovaSeq.

  • Fix a crash in mkfastq when bcl2fastq 2.19 is used in cluster mode or with the --demultiplexing-threads argument.

  • Fix mkfastq sometimes not properly merging samples in bcl2fastq 2.18 and 2.19 due to a change in the order in which lanes are processed by bcl2fastq.

Martian Runtime Changes

  • Add caching for deserialized JSON metadata. This improves performance for stages with many chunks.

Miscellaneous

  • Update samtools from 0.1.19 to 1.4.

  • Rename RUN_PREPROCESS to PREPROCESS_MATRIX in the SC_RNA_ANALYZER pipeline.

  • Add alerts.json as an output of the SUMMARIZE_REPORTS stage. This file is a machine-readable list of any abnormal metric values that raised alarms in the web summary.

  • For multi-genome samples, display the full reference name rather than a comma delimited list of genomes in the web summary ("hg19, mm10" becomes "hg19_and_mm10").

  • Fixes issue preventing mkfastq from demultiplexing data from recent sequencer software versions.

Analysis Improvements

  • Confidently align more reads to the transcriptome, greatly improving alignment rates with shorter reads. Reads Confidently Mapped to Transcriptome increases from 55% to 62% with 98bp reads and from 34% to 54% with 32bp reads (Human PBMCs vs GRCh38).

  • Add a graph-based clustering algorithm: Louvain Modularity Optimization, which, unlike K-Means, does not require pre-specifying K.

Visualization

  • Automatically produce Loupe Cell Browser (.cloupe) files in the count, aggregate, and reanalyze pipelines.

  • Output a web summary HTML file in the reanalyze pipeline.

  • Be explicit about pre- and post- depth normalization metric values in the aggr web summary.

  • When the web summary subselects 10e3 cells for display, show the original cluster sizes and not the subselected sizes.

  • Make the web summary HTML slightly smaller by rounding t-SNE coordinates.

  • Update plotly to enable scrollable legends.

File format improvements

  • Add Read Group (RG) headers and tags to the output BAM file for better data provenance.

Bug fixes

  • Preserve trimmed bases via the TR/TQ BAM tags for much longer read lengths without crashing.

  • Fix crash when copying files on certain types of network shares that do not support file permissions.

  • Omit no-call bases from Q30 metrics to be consistent with Illumina's Q30 calculation.

  • Allow generation of 3-d (alongside 2-d) t-SNE projections without crashing.

  • Do a better job of hiding dynamic elements while the web summary HTML is loading.
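
The Q30 change above (omitting no-call bases) can be sketched as follows. The input layout and function name are assumptions, and Phred+33 quality encoding is assumed because it is the usual FASTQ convention.

```python
from typing import Iterable, Tuple


def q30_fraction(reads: Iterable[Tuple[str, str]], phred_offset: int = 33) -> float:
    """Fraction of called bases with quality >= 30.

    Each read is a (sequence, quality_string) pair as found in FASTQ.
    Bases called as 'N' are left out of both numerator and denominator.
    """
    called = 0
    q30 = 0
    for sequence, qualities in reads:
        for base, qual_char in zip(sequence, qualities):
            if base.upper() == "N":
                continue  # no-call: skip entirely
            called += 1
            if ord(qual_char) - phred_offset >= 30:
                q30 += 1
    return q30 / called if called else 0.0
```

Excluding no-calls from the denominator means a read such as `ACGN` is scored on three bases, not four.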

General

  • Make the --params argument to reanalyze optional to enable re-runs with the default parameters.

  • Check for mismatches between the library IDs given in the aggr CSV and those in the matrix file.

  • Limit max_clusters for K-Means to 50 to ensure sane memory consumption.

  • Fix incorrect results being produced when aggr processes a count output that contains multiple libraries (gem groups).

  • Exclude untested genes from p-value adjustment.

  • Don't crash when extra commas are present in an IEM samplesheet for mkfastq.

  • Don't crash when no project folders are present for mkfastq.

  • Correctly handle the second index when mkfastq receives a dual-indexed IEM samplesheet.

  • Allow matrices to have more than 2^31-1 nonzero entries in the matrix HDF5 format.

  • Don't display alerts until the web summary page fully loads.
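
Excluding untested genes from p-value adjustment matters because the number of tests enters the correction. The sketch below uses the Benjamini-Hochberg procedure as an assumed adjustment method (these notes do not name the method) and marks untested genes with None; the function name is hypothetical.

```python
from typing import List, Optional


def adjust_pvalues(pvalues: List[Optional[float]]) -> List[Optional[float]]:
    """Benjamini-Hochberg adjustment over tested genes only.

    None marks an untested gene; it stays None and does not count
    toward the number of tests.
    """
    tested = [(p, i) for i, p in enumerate(pvalues) if p is not None]
    m = len(tested)
    adjusted: List[Optional[float]] = [None] * len(pvalues)
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank, (p, i) in reversed(list(enumerate(sorted(tested), start=1))):
        running_min = min(running_min, p * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With three tested genes and one untested gene, the smallest p-value of 0.01 adjusts to 0.03; counting the untested gene as a fourth test would inflate it to 0.04.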

General

  • Rename main pipeline to cellranger count, which produces a gene-barcode matrix for one library sequenced one or more times.

  • Add support for and autodetection of Chromium Single Cell 3' v2 chemistry; still compatible with v1 chemistry.

  • Fix incorrect default cell count being used when "expected recovered cells" not specified.

New aggr aggregation pipeline

  • New pipeline cellranger aggr which aggregates data from multiple libraries into one dataset.

  • Supports combining libraries totalling up to 1,000,000 cells and secondary analysis of the combined data.

  • Automatically performs sequencing depth-normalization for all combined libraries.
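
One simple way to picture depth normalization across libraries is to subsample each library down to the depth of the shallowest one. Whether cellranger aggr does exactly this is not stated in these notes, so the sketch below is an assumption, and the function name and the reads-per-cell input are hypothetical.

```python
from typing import Dict


def downsample_fractions(reads_per_cell: Dict[str, float]) -> Dict[str, float]:
    """Fraction of reads to keep per library so all match the shallowest one."""
    positive = [depth for depth in reads_per_cell.values() if depth > 0]
    if not positive:
        return {lib: 0.0 for lib in reads_per_cell}
    target = min(positive)
    return {
        lib: (target / depth if depth > 0 else 0.0)
        for lib, depth in reads_per_cell.items()
    }
```

For two libraries at 50,000 and 25,000 reads per cell, the deeper library would keep half of its reads and the shallower one would keep all of them.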

New reanalyze custom reanalysis pipeline

  • Reruns secondary analysis (dimensionality reduction, clustering, and differential expression) with fully customizable parameters.

New mkfastq demultiplexing pipeline

  • Easier to integrate with existing bcl2fastq-based workflows.

  • Now the preferred demultiplexing method; demux still available but deprecated.

  • mkfastq is a thin wrapper around bcl2fastq with the same basic interface.

  • Accepts Illumina Experiment Manager-compatible sample sheets with support for 10x sample index sets.

  • Produces FASTQ files and folders in the same structure as bcl2fastq.

  • Generates InterOp output for SAV.

  • Also generates 10x-specific run QC metrics in JSON format.

Scalability enhancements

  • Support combined secondary analysis (dimensionality reduction, clustering, differential expression, and visualization) of up to 1,000,000 cells in under 12 hours with 64 GB of RAM.

  • Change PCA implementation to the Netflix-scale memory-efficient method IRLBA.

  • Decrease runtime of t-SNE implementation.

Analysis Improvements

  • Change differential expression algorithm to the negative-binomial based method sSeq.

  • Report log2 fold-change and p-value for all genes in all clusters.

Sample and genome support

  • Add pre-built GRCh38 reference package

Web summary enhancements

  • Add plots that show Sequencing Saturation and Median Genes Detected as a function of downsampled reads per cell.

  • Add Total Genes Detected.

  • Rename "cDNA PCR Duplication" to "Sequencing Saturation."

  • Add chemistry field.

  • Order clusters by size.

  • Add help bubbles to charts.
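
The renaming from "cDNA PCR Duplication" to "Sequencing Saturation" reflects that the metric measures the share of reads that duplicate an already observed molecule. A common way to express this is one minus the ratio of unique molecules to total reads; the sketch below uses that form as an assumption, since these notes do not give the formula, and the function name is hypothetical.

```python
def sequencing_saturation(total_reads: int, unique_molecules: int) -> float:
    """Fraction of reads that are duplicates of an already observed molecule."""
    if total_reads <= 0:
        return 0.0
    return 1.0 - unique_molecules / total_reads
```

For example, 1,000 reads that collapse to 250 unique molecules give a saturation of 0.75.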

File format improvements

  • Generate BAM index files with the same basename as the main file.

  • Change cell-barcode and UMI quality tags to CY and UY for better compatibility with the SAM specification.

  • Add TR, TQ tags to BAM to enable lossless BAM to FASTQ conversion.

  • Output HDF5-based sparse matrices in addition to the Matrix Exchange format files for better scalability to high cell counts.

  • Report proportion of variance explained for each principal component.

Martian runtime

  • Pipestance output files (outs) are no longer symlinks.

  • Partial stage restart.

  • Add output filename override, which supports two output files having the same basename.

  • Add --onfinish handler support.

  • Add support for units of KB and B for memory reservation in cluster job templates.

  • Pipestances now generate a UUID in _uuid.

  • Add auto-retry mechanism when pipeline stages fail due to causes that appear to be transient.

  • --maxjobs now defaults to 64 in local jobmode.

  • --jobinterval now defaults to 100ms in local jobmode.

  • Fix for rare race condition in some Python components

  • Enabled STAR multithreading

  • Added more detailed reference metadata

  • Fixed chromosome name mismatches in 10x reference data

  • Fixed t-SNE algorithm not converging for samples with high cell counts

  • Fixed cell-barcode correction not correcting as many sequences as it should

  • Fixed out-of-memory crash in COUNT_GENES for high-depth samples

  • Fixed occasional loss of the last few reads per chunk in ATTACH_BCS_AND_UMI

  • Added "Reads Mapped Confidently to Exonic Regions" metric to the summary.

  • Changed alert for "Reads Mapped Confidently to Transcriptome" to reflect shorter read lengths and non-human references.

  • Fixed problem where differential expression table sorts incorrectly on click.

  • Fixed problem where very high depth samples would cause an out-of-memory error.

  • Fixed problem where mkgtf would produce incorrectly formatted GTF files.

  • Fixed problem where debug tar.gz file would be very large if the pipestance halted mid-stage.

  • Fixed problem with copying files on certain CIFS volumes.

  • Initial release.