Outputs for Boosted Genes on Xenium Prime 5K Custom Panels

Starting in XOA v3.2, the gene_panel.json, analysis_summary.html, and transcripts.parquet files will display additional information when Xenium Prime 5K datasets contain pre-designed genes with additional probe sets per gene ("gene boosting"). Learn more about gene boosting in the panel design process on the Xenium Panel Designer website.

While no impact on assay performance is anticipated, 10x Genomics does not support or experimentally validate the use of custom probes, and thus cannot guarantee that custom probes will successfully detect their targets.

The gene_panel.json format is unchanged, however boosted genes will be listed twice: 1) with the 10x Genomics pre-designed gene_coverage and 2) with the boosted gene_coverage and source information.

Below is an example where the original coverage for ACKR3 is 3 probe sets and the boosted coverage is 5 probe sets, for a total of 8 probe sets for this gene:


{
  "codewords": [
    5226,
    6589,
    17780
  ],
  "info": {
    "gene_coverage": 3
  },
  "source": {
    "category": "base",
    "identity": {
      "design_id": "hAtlas_v1.1",
      "name": "Xenium Human 5K Pan Tissue & Pathways Panel",
      "version": "1.1.0"
    }
  },
  "type": {
    "data": {
      "id": "ENSG00000144476",
      "name": "ACKR3"
    },
    "descriptor": "gene"
  }
},
{
  "codewords": [
    9397,
    17171,
    4676,
    2240,
    11249
  ],
  "info": {
    "gene_coverage": 5
  },
  "source": {
    "category": "current",
    "identity": {
      "design_id": "XXXXXX",
      "name": "hMulti_100g",
      "version": "1.0.0"
    }
  },
  "type": {
    "data": {
      "id": "ENSG00000144476",
      "name": "ACKR3"
    },
    "descriptor": "gene"
  }
}

The analysis_summary.html file provides information about boosted genes on the Summary, Decoding, and Cell segmentation tabs.

On the Summary tab, the Panel Specification table provides the number of boosted genes as part of "Total custom targets (RNA)" metric.

Both plots on the Decoding tab include a category called "Boosted Genes". In particular, the Gene-Specific Transcript Quality plot shows the breakdown of mean Q-Score and total transcript information, which help to assess added probe performance, when hovering over the boosted gene point on the plot. For the example below, the mean Q-Score and total transcripts metrics are shown per gene for:

Total: the combined result from add-on + pre-designed probe sets
Add-on probe sets: the added probe sets result alone (light blue diamond icon)
Pre-designed probe sets: the original pre-designed probe sets result alone (purple circle icon)

In this example, gene boosting added 78.1k transcripts for this gene:

The Counts per Gene plot also indicates the number of transcripts from add-on probe sets and pre-designed probe sets. In this example, gene boosting added 283 transcripts for this gene:

The metrics in the Decoding tab Decoding Yield and Cell Segmentation tab Segmentation Metrics include separate metrics for pre-designed genes and custom genes. If boosted genes are present, transcripts from boosted genes are included in the pre-designed gene metrics. For example:

The median predesigned transcripts per cell metric is calculated as the median number of high-quality (Q >= 20) transcripts per cell from pre-designed genes (10x-designed) and boosted genes (pre-designed genes for which extra probe sets were added).
The median custom transcripts per cell metric does not include counts from boosted genes.

To assess add-on probe performance further, use the codeword_category column (2nd to last column) in the transcripts.parquet file. You can examine information for transcripts (predesigned_gene) targeted by the pre-designed probes vs. transcripts (custom_gene) targeted by the added probes.

Add-on probe performance can be assessed with the plots and metrics described above in the analysis_summary.html and the transcripts.parquet file for XOA v3.2 outputs. For XOA v3.0 - 3.1 data, you can examine the transcripts.parquet file to distinguish the contributions from pre-designed vs. added probes.


import dask.dataframe as dd

# Read in the parquet file, edit path to where parquet file saved
df = dd.read_parquet('transcripts.parquet')

# Print information about the data frame
df.info()

# Print first 5 rows of the dask data frame
df.head()

# Subset by the boosted gene, example is IL1B
boosted_genes = df[df['feature_name'] == 'IL1B']
print(boosted_genes.compute())

# save the subset as a CSV
boosted_genes.to_csv('subset_IL1B_transcript_info', index=False)

The subset CSV file looks something like this (scroll right to see the codeword_category distinction):


transcript_id,cell_id,overlaps_nucleus,feature_name,x_location,y_location,z_location,qv,fov_name,nucleus_distance,codeword_index,codeword_category,is_gene
281474976719863,fajfmnbb-1,0,IL1B,37.453125,64.5625,19.09375,40.0,Y12,3.046875,383,predesigned_gene,True
281474976719903,fbifjjcc-1,0,IL1B,64.46875,110.28125,15.3125,30.25,Y12,1.140625,383,predesigned_gene,True
281474976719934,fcknkoij-1,0,IL1B,83.65625,125.328125,15.4375,40.0,Y12,0.28125,383,predesigned_gene,True
281474976719949,fclbhomd-1,1,IL1B,92.5,122.328125,17.171875,40.0,Y12,0.0,383,predesigned_gene,True
281474976719954,fclbhomd-1,0,IL1B,93.328125,122.734375,18.46875,40.0,Y12,0.265625,383,predesigned_gene,True
...
281474976710687,fchcbdda-1,0,IL1B,7.078125,227.125,14.75,40.0,Y12,1.78125,416,custom_gene,True
281474976710696,faljlgoo-1,0,IL1B,10.140625,39.546875,15.0,40.0,Y12,1.421875,416,custom_gene,True
281474976710741,fahehfcf-1,0,IL1B,15.890625,53.890625,18.765625,40.0,Y12,1.703125,416,custom_gene,True
281474976710771,faflhkgm-1,0,IL1B,20.3125,23.359375,17.0,40.0,Y12,1.578125,416,custom_gene,True
281474976710773,UNASSIGNED,0,IL1B,18.171875,246.78125,18.03125,40.0,Y12,5.765625,416,custom_gene,True
281474976710783,fahklcna-1,0,IL1B,20.453125,102.78125,18.796875,40.0,Y12,3.453125,416,custom_gene,True
...

Outputs for Boosted Genes on Xenium Prime 5K Custom Panels

Overview

Gene panel JSON

Analysis summary file

Transcripts parquet file