Figures and data

Schematic of potential off-target binding in 10x Xenium.
In this illustration, the arms of the padlock probes were designed to bind an RNA sequence intended to correspond to a target gene (green). However, these probes exhibit off-target binding and bind to an RNA sequence in a different, off-target gene (red). The probe is circularized and subsequently amplified via rolling circle amplification (RCA). Hybridization of fluorescent probes to the RCA product generates a fluorescent signal that is used to quantify RNA expression within cells.

OPT output of genes with predicted off-target binding based on perfect sequence homology in GENCODE v47.
This table shows the 45 genes whose probes in the 10x Genomics Xenium v1 Human Breast Gene Expression Panel exhibit predicted off-target probe binding, where each off-target alignment involves a perfect 40bp match to the probe sequence. Although OPT predicted off-target binding of CCPG1 probe sequences to the DNAAF1-CCPG1 gene, we manually excluded it from our list because DNAAF1-CCPG1 is a read-through gene containing portions of both DNAAF1 and CCPG1. The final column shows the gene types, in order, of each of the off-target genes shown in column 3. Abbreviations: PC = protein-coding; PG = pseudogene; NMD = nonsense-mediated decay; lncRNA = long non-coding RNA.
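As an illustration of what a "perfect 40bp match" means here, the following is a minimal sketch rather than OPT's actual implementation. It assumes probe and transcript sequences are given in the same orientation, so an exact substring search suffices; the function name, the gene names, and the sequences are hypothetical toy values. The `exclude` argument mirrors the manual removal of a read-through gene described above.

```python
def perfect_off_targets(probe, target_gene, transcripts, exclude=frozenset()):
    """Return sorted genes, other than target_gene, with a transcript
    that contains the full probe sequence exactly (no mismatches)."""
    hits = set()
    for gene, seq in transcripts:
        # Skip the intended target and any manually excluded genes.
        if gene == target_gene or gene in exclude:
            continue
        if probe in seq:  # exact match over the whole probe length
            hits.add(gene)
    return sorted(hits)


# Toy data (hypothetical): a 40-nt probe and four transcripts.
probe = "ACGTACGTAC" * 4
transcripts = [
    ("GENE_A", "TTT" + probe + "GGG"),             # intended target
    ("GENE_B", "CC" + probe + "AA"),               # perfect off-target match
    ("GENE_C", "CC" + probe[:39] + "T" + "AA"),    # one mismatch, not reported
    ("GENE_A-GENE_B", "GG" + probe + "TT"),        # read-through gene
]

all_hits = perfect_off_targets(probe, "GENE_A", transcripts)
filtered = perfect_off_targets(
    probe, "GENE_A", transcripts, exclude={"GENE_A-GENE_B"}
)
print(all_hits)
print(filtered)
```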

Comparison of spatial gene expression patterns between Xenium and Visium.
(A) Spatial gene expression of MS4A1 overlaid on the corresponding histological images for Xenium and Visium. (B) Gene expression patterns for TUBB2B: Xenium expression, Visium expression, the aggregated Visium expression of TUBB2B and its predicted off-target gene TUBB2A, and Visium expression of TUBB2A alone. (C) Scatterplot of log-transformed total expression counts (with a pseudocount) for 307 genes, comparing Visium and Xenium data. The dotted line represents X = Y, and points (genes) are colored by probe information.
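The two quantities named in this caption, aggregated expression and log-transformed total counts, can be sketched as follows. This is a minimal illustration under stated assumptions: the count matrix is spots by genes, the logarithm is base 10, and the pseudocount is 1. None of these choices is specified in the caption, and the counts below are toy values, not data from the study.

```python
import numpy as np


def aggregate_expression(counts, gene_names, target, off_targets):
    """Per-spot sum of counts for a target gene and its predicted off-targets."""
    genes = [g for g in [target, *off_targets] if g in gene_names]
    idx = [gene_names.index(g) for g in genes]
    return counts[:, idx].sum(axis=1)


def log_total_counts(counts, pseudocount=1.0):
    """Per-gene log10 of total counts across all spots, with a pseudocount."""
    totals = counts.sum(axis=0)
    return np.log10(totals + pseudocount)


# Toy matrix (hypothetical): 3 spots x 3 genes.
gene_names = ["MS4A1", "TUBB2B", "TUBB2A"]
counts = np.array([
    [0, 1, 4],
    [2, 0, 5],
    [7, 0, 0],
])

aggregated = aggregate_expression(counts, gene_names, "TUBB2B", ["TUBB2A"])
log_totals = log_total_counts(counts)
print(aggregated)
print(log_totals)
```

Computing `log_totals` once per platform gives the x and y coordinates of one point per gene in a scatterplot of this kind.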

Comparison of single-cell gene expression patterns between Xenium and scRNA-seq.
(A) Harmonized UMAP visualization of MS4A1 expression for Xenium and scRNA-seq data. (B) Comparison of TUBB2B expression patterns on harmonized UMAP: Xenium expression, scRNA-seq expression, the aggregated scRNA-seq expression of TUBB2B and its predicted off-target gene TUBB2A, and scRNA-seq expression of TUBB2A alone. Note that TUBB2B also has two additional off-targets present in the scRNA-seq data (Supplementary Figure 4). (C) Scatterplot of log-transformed total expression counts (with a pseudocount) for 313 genes, comparing Xenium and scRNA-seq data. The dotted line represents X = Y, and points (genes) are colored by probe information.

(A) Screenshot from the Integrative Genomics Viewer (IGV) showing the 40bp probe sequence (ID: ENSG00000196154|S100A4|ab4e3dc) that matches both S100A5 and S100A4. Shown are 6 isoforms from the CHESS v3.1 annotation, 4 from GENCODE Basic v47, and 3 from RefSeq v110 for S100A4, as well as 2 RefSeq isoforms for the neighboring S100A5. The probe sequence aligns to the overlapping region between the S100A5 and S100A4 gene loci. The matching probe is shown in a zoomed-in view below. The forward- and reverse-strand sequences of the probe are shown, and the highlighted area indicates approximately where the probe falls within the gene. (B) Screenshot from IGV showing the 40bp probe sequence (ID: ENSG00000137285|TUBB2B|1dec8c0) that matches both TUBB2A and TUBB2B. Shown are the 2 isoforms of TUBB2A from the RefSeq v110 annotation, 3 isoforms from the CHESS v3.1 annotation, and 9 from GENCODE Basic v47. The matching probe is shown in a zoomed-in view below. The forward- and reverse-strand sequences of the probe are shown, and the highlighted area indicates approximately where the probe falls within the gene. (A) and (B) share a figure legend.

(A) Overlap regions between Visium (orange outline) and Xenium (blue outline) data, shown on the Xenium histological image and the Visium histological image, respectively. (B) Log-transformed aggregated total gene counts for spots (∼55 μm × 55 μm) in both Xenium and Visium datasets, overlaid on their corresponding histological images.

UMAP visualization of integrated scRNA-seq and Xenium datasets: (A) before Harmony batch correction and (B) after Harmony batch correction.

Harmonized UMAP visualization of TUBB3 and TUBB4A expression in the scRNA-seq dataset, as well as the aggregated scRNA-seq expression of TUBB2B and all predicted off-targets (TUBB2A, TUBB3, TUBB4A).

(A) Schematic illustrating that hybridization may still occur even when there is a sequence mismatch at the non-ligated ends of the probe sequence. (B) Schematic depicting how probes could bind to each other instead of to their intended target.

(A) Spatial gene expression patterns for CEACAM8: Xenium expression, Visium expression, the aggregated Visium expression combining CEACAM8 with all predicted off-target binding gene expression, and Visium expression of CEACAM8’s potential off-targets (CEACAM5, CEACAM6, CEACAM7, and PSG6). (B) Comparison of CEACAM8 expression patterns on harmonized UMAP: Xenium expression, scRNA-seq expression, an aggregated scRNA-seq profile combining CEACAM8 with all predicted off-target binding gene expression, and scRNA-seq expression of CEACAM8’s potential off-targets (CEACAM5, CEACAM6, CEACAM7, and PSG6).

(A) Spatial gene expression patterns for CEACAM6: Xenium expression, Visium expression, the aggregated Visium expression combining CEACAM6 with all predicted off-target binding gene expression, and Visium expression of CEACAM6’s potential off-target (CEACAM3). (B) Comparison of CEACAM6 expression patterns on harmonized UMAP: Xenium expression, scRNA-seq expression, an aggregated scRNA-seq profile combining CEACAM6 with all predicted off-target binding gene expression, and scRNA-seq expression of CEACAM6’s potential off-target (CEACAM3).

(A) Spatial gene expression of HDC overlaid on the corresponding histological images for Xenium and Visium. (B) Harmonized UMAP visualization of HDC expression for Xenium and scRNA-seq data.

(A) Spatial gene expression patterns of: APOBEC3B Xenium expression, APOBEC3B Visium expression, APOBEC3F Visium expression, the aggregated Visium expression combining APOBEC3B with all predicted off-target binding gene expression (APOBEC3F, APOBEC3A, APOBEC3D), APOBEC3A Visium expression, and APOBEC3D Visium expression. (B) Harmonized UMAP visualization of: APOBEC3B Xenium expression, APOBEC3B scRNA-seq expression, APOBEC3F scRNA-seq expression, the aggregated scRNA-seq expression combining APOBEC3B with all predicted off-target binding gene expression (APOBEC3F, APOBEC3A, APOBEC3D), APOBEC3A scRNA-seq expression, and APOBEC3D scRNA-seq expression.

OPT output of genes with predicted off-target binding based on perfect sequence homology in RefSeq.
This table shows the 22 genes whose probes in the Xenium breast cancer panel exhibit predicted off-target probe binding, where each off-target alignment involves a perfect 40bp match to the probe sequence. The final column shows the gene types, in order, of each of the off-target genes shown in column 3. Abbreviations: PC = protein-coding; PG = pseudogene; precursor_RNA = precursor RNA; misc_RNA = miscellaneous RNA; ncRNA = non-coding RNA.


OPT output of genes with predicted off-target binding based on perfect sequence homology in CHESS.
This table shows the 33 genes whose probes in the Xenium breast cancer panel exhibit predicted off-target probe binding, where each off-target alignment involves a perfect 40bp match to the probe sequence. The final column shows the gene types, in order, of each of the off-target genes shown in column 3. Abbreviations: PC = protein-coding; PG = pseudogene; miRNA = microRNA.

The number of off-target probes and affected genes (from the set of 280 genes in the Xenium panel) found when looking for perfect matches between probe sequences and transcripts in four different reference annotations: GENCODE basic, GENCODE comprehensive, RefSeq, and CHESS.
Off-target alignments between CCPG1 probes and DNAAF1-CCPG1 were excluded.

Union set of protein-coding genes that OPT predicts to be affected by off-target binding, across four different reference annotations: GENCODE basic, GENCODE comprehensive, RefSeq, and CHESS.
Genes not shared across all four annotations are colored in red.
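The union set and the red highlighting rule can be expressed with plain set operations. The sketch below is illustrative only: the function name is hypothetical and the gene names are placeholders, not the genes reported in the table.

```python
def union_and_unshared(per_annotation):
    """Return (union of all gene sets, genes missing from at least one set),
    each as a sorted list. Expects at least one annotation."""
    sets = [set(genes) for genes in per_annotation.values()]
    union = set().union(*sets)
    shared = set.intersection(*sets)
    return sorted(union), sorted(union - shared)


# Placeholder gene sets (hypothetical), one per reference annotation.
per_annotation = {
    "GENCODE basic": {"GENE_1", "GENE_2"},
    "GENCODE comprehensive": {"GENE_1", "GENE_2", "GENE_3"},
    "RefSeq": {"GENE_1", "GENE_3"},
    "CHESS": {"GENE_1", "GENE_2"},
}

union_genes, unshared_genes = union_and_unshared(per_annotation)
print(union_genes)     # every gene flagged in any annotation
print(unshared_genes)  # genes that would be colored red
```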

Nine additional genes identified as exhibiting potential off-target probe binding when allowing for a 6bp error margin on either side of the binding site.
Abbreviations: PC = protein-coding; PG = pseudogene; lncRNA = long non-coding RNA.

Eighteen genes that were previously predicted to be affected by off-target binding based on perfect matching now show additional predicted off-target interactions when a 6bp error margin is allowed on either side of the binding site.
This effect is observed either through the accumulation of new probes with predicted off-target binding or via the identification of additional predicted off-target genes per probe. Abbreviations: PC = protein-coding; PG = pseudogene; lncRNA = long non-coding RNA.