Xenium Panel Design Concepts and Terms

The factors below may contribute to the overall success of add-on custom and standalone custom panel designs and impact downstream data analysis.

The Xenium Onboard Analysis pipeline uses the Xenium codebook to decode transcripts. See the Overview of Xenium Algorithms page to learn about the decoding algorithm.

The Xenium codebook contains a collection of codewords that are assigned to genes in a given gene panel. The pipeline uses the gene_panel.json file to map each gene name to an indexed codeword. Each codeword is defined by a pattern of fluorescent signal intensities recorded across channels and cycles (see diagram below). Some codewords are reserved for negative controls.

The Xenium codebook can either have a 1 codeword : 1 gene relationship ("no splitting") or a configuration where there are multiple codewords per gene ("gene splitting"). These modes are illustrated below for an example gene (the probe colors below match those in the probe sets description).

  • The top section illustrates gene splitting, where Gene 1 has 3 codewords, with 1 probe set per codeword and 3 probe sets per gene.
  • The bottom section illustrates no splitting, where Gene 1 has 1 codeword, with multiple probe sets per codeword (for example, 3 probe sets shown below).
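
As a rough sketch of the two modes described above, a codebook can be pictured as a mapping from codeword to gene and probe sets. The codeword indices, probe set labels, and the summarize helper below are hypothetical and chosen only for illustration; they do not reflect the actual layout of gene_panel.json.

```python
from collections import defaultdict

# Hypothetical codebooks: codeword index -> (gene, probe sets carrying that codeword).
# The indices and labels are illustrative only.
GENE_SPLITTING = {
    101: ("Gene 1", ["probe set 1"]),
    102: ("Gene 1", ["probe set 2"]),
    103: ("Gene 1", ["probe set 3"]),
}
NO_SPLITTING = {
    101: ("Gene 1", ["probe set 1", "probe set 2", "probe set 3"]),
}


def summarize(codebook):
    """Return {gene: (number of codewords, number of probe sets)}."""
    codewords = defaultdict(int)
    probe_sets = defaultdict(int)
    for gene, sets in codebook.values():
        codewords[gene] += 1
        probe_sets[gene] += len(sets)
    return {gene: (codewords[gene], probe_sets[gene]) for gene in codewords}


print(summarize(GENE_SPLITTING))  # {'Gene 1': (3, 3)}
print(summarize(NO_SPLITTING))    # {'Gene 1': (1, 3)}
```

In both layouts the gene is covered by three probe sets; only the number of codewords they are spread across differs.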

Xenium v1 panels are designed to have 1 codeword per gene, with up to 8 probe sets per gene. There is no gene splitting in Xenium v1 panel designs.

Xenium Prime panels have a gene-split design in which multiple codewords are assigned to each gene, with 1 probe set per codeword and typically ~3 probe sets per gene (as few as 1 and up to 6).

Xenium In Situ uses fluorescence microscopy to detect individual transcripts in cells. As with all imaging-based technologies, only a finite number of fluorescent signals can be distinguished within a given area or volume, such as a cell. There is therefore a finite number of transcripts that can be detected in a given tissue volume, which is referred to as the detection budget. As a consequence, panel design involves allocating the available budget in each cell type to a set of carefully selected panel genes.

Xenium In Situ is designed to compensate for optical crowding and to enable robust transcript detection in most cases. In severe cases, however, detection sensitivity for a subset of genes on a panel can be reduced significantly. The custom panel design process helps researchers minimize the risk of severe optical crowding. You can learn more about optical crowding and how Xenium compensates for it in the Xenium Add-on Panel Design Technical Note.


The Xenium chemistry is probe-based. Multiple probes are used per gene in order to successfully capture biological variation (i.e., multiple isoforms). A probe set is a collection of probes designed to detect as many isoforms of a gene as possible based on GENCODE BASIC annotations.


In the example illustrated above, the four isoforms of the gene are targeted by three probe sets. "Probe set 1" and "Probe set 2" each consist of one probe that targets all four isoforms. "Probe set 3" has two probes because it is not possible to target all four isoforms with a single probe at that location.

Probes should unambiguously bind to the intended target transcript and not to other transcripts. The transcript-complementary regions of the probe must bind specifically to the target sequence within the limits of sample preparation conditions and must enable ligation at the padlock junction and efficient rolling-circle amplification.

Target probe sets per gene
For Xenium v1 pre-designed and custom panels, the panel designer targets eight probe sets per gene. Xenium Prime panels have improved per-probe set sensitivity, so the recommended default has been reduced to three probe sets per gene for pre-designed panels and five probe sets per gene for custom panels. These can be further refined in the panel designer tool as needed to optimize panel design.

It is not always possible to design a single probe that covers every isoform for a gene. This is because some genes have isoforms that do not share sequence, or more often, the sequence they do share is not a good target for probe design. Consequently, some genes in the panel will have probe sets consisting of multiple probes (a probe set is not necessarily a single oligonucleotide). We have pre-designed probe sets for all genes, except for a small fraction of genes for which we were unable to design effective probes. These include:

  • Genes with high sequence homology to paralogs (e.g., genes with zinc-finger motifs), which are not amenable to specific probe design because they lack a unique target sequence.
  • Genes with extreme GC content or similar sequence properties, which can also limit our ability to design effective probes.

Because of these limitations, some genes may have as few as one probe set. Generally, genes with at least three probe sets have robust detection, while those with one or two probe sets run the risk of not being detected. In the panel design process, we may also include fewer than the default number of probe sets for a very highly expressed gene in order to use the detection budget more efficiently. A reduction in probe sets for a given gene results in an approximately linear decrease in sensitivity. For example, the detection efficiency for a given gene decreases by ~50% if probe sets are reduced from eight to four.
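
The approximately linear relationship can be sketched as a simple ratio. This is a back-of-the-envelope approximation based only on the statement above, not a model used by the panel designer; the function name and arguments are hypothetical.

```python
def relative_sensitivity(probe_sets, reference_probe_sets):
    """Approximate sensitivity relative to a reference design.

    Assumes sensitivity scales linearly with the number of probe sets,
    which the text describes as only approximately true.
    """
    if probe_sets < 0 or reference_probe_sets <= 0:
        raise ValueError("probe set counts must be positive")
    return probe_sets / reference_probe_sets


# Reducing a gene from eight probe sets to four roughly halves its sensitivity.
print(relative_sensitivity(4, 8))  # 0.5
```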

In Xenium panel designs, the FASTA file contains the panel probe sequences as represented on the probes, that is, reverse-complement to the targeted transcript. The sequence ID consists of the Ensembl gene ID, gene name, and a unique identifying string, e.g. ENSG00000134440|NARS|1f419e5.
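
A minimal sketch of splitting such a sequence ID into its three fields is shown below. It assumes the fields are always separated by the "|" character, as in the example; the ProbeId type and parse_probe_id function are hypothetical names, not part of any Xenium tool.

```python
from typing import NamedTuple


class ProbeId(NamedTuple):
    ensembl_id: str  # first field, e.g. ENSG00000134440
    gene_name: str   # second field, e.g. NARS
    unique_id: str   # third field, the unique identifying string


def parse_probe_id(header: str) -> ProbeId:
    """Split a FASTA sequence ID of the form ID|GENE|UNIQUE into its parts."""
    fields = header.strip().lstrip(">").split("|")
    if len(fields) != 3:
        raise ValueError(f"expected 3 '|'-separated fields, got {len(fields)}")
    return ProbeId(*fields)


probe = parse_probe_id(">ENSG00000134440|NARS|1f419e5")
print(probe.gene_name)  # NARS
```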

In Xenium panel designs, the 13-column BED detail file contains the sequences and genomic coordinates of the probes. For a description of each column in this file, please see BED format specs. The entry name (column 4) is formatted as {GENE SYMBOL} ({ENSEMBL ID}) | {CODEWORD} | {INDEX}. The codeword and the probe set group that the entry belongs to are also listed in column 13. The underlying genomic references are GRCh38 for human and GRCm38 for mouse.

This file may be used to visualize the probe locations with genome browsers like IGV (Integrative Genomics Viewer) and the UCSC Genome Browser, or to intersect the probe locations with other genomic features of interest using tools like bedtools.

The BED file information is useful for determining whether probes span exon-exon junctions. For example, probe ENSG00000134440|NARS|1f419e5 has the following sequence:

>ENSG00000134440|NARS|1f419e5
TCCTTCTAACGCACCAATCTTCACACATTTTGGCTCTGGG

Its BED record shows the coordinates:

chr18 57613656 57615656 NARS (ENSG00000134440) | 168 | 7 0 + 57613656 57615656 35,23,27 2 24,16 0,1984 Codeword: 168, Probeset: 1
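
The record above can be checked for a junction-spanning probe by reading the standard BED block columns (blockCount, blockSizes, blockStarts). The sketch below assumes the file is tab-delimited, since the name column contains spaces, and uses the standard 0-based, half-open BED coordinates; the function name and returned keys are hypothetical.

```python
def parse_probe_bed_line(line):
    """Parse one 13-column probe BED record into a small dictionary."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 13:
        raise ValueError(f"expected 13 tab-separated columns, got {len(cols)}")

    chrom_start = int(cols[1])
    # Column 4 has the form "{GENE SYMBOL} ({ENSEMBL ID}) | {CODEWORD} | {INDEX}".
    gene, codeword, index = (part.strip() for part in cols[3].split("|"))
    # Columns 11 and 12: block sizes and block starts relative to chromStart.
    sizes = [int(x) for x in cols[10].split(",") if x]
    starts = [int(x) for x in cols[11].split(",") if x]
    blocks = [(chrom_start + s, chrom_start + s + n) for s, n in zip(starts, sizes)]

    return {
        "chrom": cols[0],
        "gene": gene,
        "codeword": codeword,
        "index": index,
        "strand": cols[5],
        "blocks": blocks,                    # genomic intervals covered by the probe
        "probe_length": sum(sizes),          # total aligned bases
        "spans_junction": len(blocks) > 1,   # more than one block implies a gap
    }


record = "\t".join([
    "chr18", "57613656", "57615656", "NARS (ENSG00000134440) | 168 | 7",
    "0", "+", "57613656", "57615656", "35,23,27", "2", "24,16", "0,1984",
    "Codeword: 168, Probeset: 1",
])
probe = parse_probe_bed_line(record)
print(probe["blocks"])          # [(57613656, 57613680), (57615640, 57615656)]
print(probe["spans_junction"])  # True
```

The two blocks add up to 40 bases, which matches the length of the probe sequence in the FASTA record.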

Loading the BED file in IGV shows that this probe spans two exons, consistent with the two blocks (24 and 16 bases) in the BED record.

Zooming in shows that the probe sequence as represented in the FASTA file is the reverse complement of the transcript sequence.
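
To compare a probe with the transcript sequence outside of a genome browser, the FASTA sequence can be reverse-complemented. The helper below is a generic sketch that handles only the unambiguous bases A, C, G, and T; the function name is hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


probe_seq = "TCCTTCTAACGCACCAATCTTCACACATTTTGGCTCTGGG"
target_seq = reverse_complement(probe_seq)
print(target_seq)  # CCCAGAGCCAAAATGTGTGAAGATTGGTGCGTTAGAAGGA
```

The result is the transcript-sense sequence that the probe is expected to bind.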

Sensitivity refers to the fraction of transcripts detected per cell, i.e. the ratio between the number of transcripts observed and the number of transcripts actually present in the cell. Since it is usually not possible to know the true number of transcripts present, it is typically difficult to quantify sensitivity in absolute terms.

Like other single cell technologies (e.g., Chromium Single Cell 3’ Gene Expression and Chromium Single Cell Gene Expression Flex), Xenium In Situ does not detect every transcript of a given gene present in a cell. Cell-by-cell variation in the fraction of transcripts detected manifests as technical variation (noise) in the number of observed transcript counts per cell.

For a given cell type, utilization describes the fraction of a cell's transcriptome that a Xenium Gene Expression panel is targeting. It is measured in units of "TP10K" (transcripts per 10,000).

A higher utilization value indicates a higher predicted transcript density for cells of the given cell type.

To calculate utilization per gene (measured in units of TP10K) from single cell data, we first calculate the mean expression of each gene in all cells labeled with a given cell type and then scale the resulting profile so it sums to 10,000. This is done by dividing all values of the profile by the total sum and multiplying by 10,000. To calculate utilization of the panel, we sum the TP10K values for all genes on the panel.
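
The calculation described above can be sketched as follows. The gene names and mean expression values are made up for illustration, and the function names are hypothetical; the input is assumed to be the per-gene mean expression already computed across all cells of one cell type.

```python
def tp10k_profile(mean_expression):
    """Scale a per-gene mean expression profile so that it sums to 10,000."""
    total = sum(mean_expression.values())
    if total <= 0:
        raise ValueError("expression profile must have a positive total")
    return {gene: value / total * 10_000 for gene, value in mean_expression.items()}


def panel_utilization(mean_expression, panel_genes):
    """Sum the TP10K values of the genes on the panel for one cell type."""
    profile = tp10k_profile(mean_expression)
    # Panel genes absent from the reference profile contribute zero.
    return sum(profile.get(gene, 0.0) for gene in panel_genes)


# Made-up mean counts per gene for a single cell type (total = 8.0).
mean_expr = {"GENE_A": 2.0, "GENE_B": 0.5, "GENE_C": 5.5}
print(tp10k_profile(mean_expr))                           # GENE_A: 2500.0, GENE_B: 625.0, GENE_C: 6875.0
print(panel_utilization(mean_expr, ["GENE_A", "GENE_B"]))  # 3125.0
```

In this toy example, a panel containing GENE_A and GENE_B targets 3,125 of every 10,000 transcripts in the cell type.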

This method is similar to how bulk RNA-seq expression profiles are calculated, which are often reported in units of "TPM" (transcripts per million). Here, we scale to 10,000 in order to obtain relative values that are closer to the absolute number of transcripts observed.