Figures and data

ambisim, a multiplexed joint snRNA/snATAC read simulator that controls ambient proportions.
A) Example of a scenario in which ambient RNA can confound demultiplexing methods. B) Design of ambisim. Cell attributes, including donor identity and ambient RNA fraction, are specified, and then reads are sampled from a reference genome. For any read that overlaps a SNP, an allele is sampled from the donor’s genotype. If the read is an ambient read, the allele is instead sampled based on all donors’ genotypes. C) Simulation designs. The heterotypic doublet proportion and the number of multiplexed donors are varied along with the ambient RNA/DNA fraction. Demultiplexing methods are evaluated on two metrics: droplet-type accuracy (percentage of droplets assigned to the correct droplet type) and singleton-donor accuracy (percentage of singlets assigned to the correct donor).
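The allele-sampling step in panel B can be illustrated with a minimal sketch. This is not the ambisim implementation: the function name `sample_allele`, the dosage encoding of genotypes, and the assumption that every donor contributes equally to the ambient pool are all hypothetical choices made for illustration.

```python
import random


def sample_allele(genotypes, donor, ambient, rng):
    """Return 0 (reference) or 1 (alternate) for a read overlapping one SNP.

    genotypes: dict mapping donor ID -> alternate-allele dosage (0, 1, or 2).
    donor:     donor ID of the nucleus the read is assigned to.
    ambient:   True if the read is an ambient read.
    rng:       a random.Random instance.
    """
    if ambient:
        # Assumption: the ambient pool mixes all donors equally, so the
        # alternate-allele probability is the pooled allele frequency.
        p_alt = sum(genotypes.values()) / (2 * len(genotypes))
    else:
        # Non-ambient read: sample from the donor's own genotype.
        p_alt = genotypes[donor] / 2
    return 1 if rng.random() < p_alt else 0


# Hypothetical genotypes at a single SNP for three donors.
genotypes = {"donor1": 0, "donor2": 1, "donor3": 2}
rng = random.Random(0)
native = [sample_allele(genotypes, "donor1", False, rng) for _ in range(100)]
ambient = [sample_allele(genotypes, "donor1", True, rng) for _ in range(100)]
```

In this toy example, reads native to donor1 (homozygous reference) never carry the alternate allele, whereas ambient reads in donor1's droplet can, which is the kind of signal that can confound demultiplexing.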

Accuracy comparisons in simulations.
A) Comparison of droplet-type accuracy as a function of ambient RNA/DNA, summarized across experiments with number of multiplexed donors = 4 and variable doublet rates. B) Same as A), but for ATAC. C) Comparison of droplet-type accuracy as a function of ambient RNA/DNA, summarized across experiments with doublet rate = 10% and variable numbers of multiplexed donors. D) Same as C), but for ATAC. E) Same as A), but for singleton-donor accuracy. F) Same as B), but for singleton-donor accuracy. G) Same as C), but for singleton-donor accuracy. H) Same as D), but for singleton-donor accuracy.
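The two accuracy metrics compared in these panels can be sketched as follows. The data layout (a dict from barcode to a `(droplet_type, donor)` pair) and the function names are assumptions made for illustration. The sketch also assumes that singleton-donor accuracy uses true singlets as the denominator and counts a singlet as correct only when it is called a singlet with the right donor; the evaluation in the paper may differ in these details.

```python
def droplet_type_accuracy(truth, calls):
    """Fraction of droplets whose called type matches the true type."""
    correct = sum(
        1
        for barcode, (true_type, _) in truth.items()
        if calls.get(barcode, (None, None))[0] == true_type
    )
    return correct / len(truth)


def singleton_donor_accuracy(truth, calls):
    """Fraction of true singlets called as singlets with the correct donor."""
    singlets = {bc: donor for bc, (t, donor) in truth.items() if t == "singlet"}
    correct = sum(
        1
        for barcode, donor in singlets.items()
        if calls.get(barcode, (None, None)) == ("singlet", donor)
    )
    return correct / len(singlets)


# Hypothetical ground truth and calls for four droplets.
truth = {
    "bc1": ("singlet", "d1"),
    "bc2": ("singlet", "d2"),
    "bc3": ("doublet", None),
    "bc4": ("singlet", "d1"),
}
calls = {
    "bc1": ("singlet", "d1"),  # correct type and donor
    "bc2": ("singlet", "d1"),  # correct type, wrong donor
    "bc3": ("singlet", "d2"),  # wrong type
    "bc4": ("doublet", None),  # wrong type
}

type_acc = droplet_type_accuracy(truth, calls)      # 2 of 4 droplets
donor_acc = singleton_donor_accuracy(truth, calls)  # 1 of 3 true singlets
```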

Accuracy comparisons for lower-coverage versions of simulations.
A) Comparison of droplet-type accuracy as a function of ambient RNA/DNA, summarized across experiments with number of multiplexed donors = 4 and variable doublet rates. B) Same as A), but for ATAC. C) Comparison of droplet-type accuracy as a function of ambient RNA/DNA, summarized across experiments with doublet rate = 10% and variable numbers of multiplexed donors. D) Same as C), but for ATAC. E) Same as A), but for singleton-donor accuracy. F) Same as B), but for singleton-donor accuracy. G) Same as C), but for singleton-donor accuracy. H) Same as D), but for singleton-donor accuracy.

Comparing demultiplexing within modalities in real data.
A) Distributions of singlet and doublet calls in the stem cell dataset. B) Same as A), but for the aorta dataset. C) Droplet-type correlations between methods in RNA (top) and ATAC (bottom) in the stem cell dataset. D) Same as C), but for the aorta dataset.

Comparison of demultiplexing methods within and across modalities.
A) Mean droplet-type overlap between methods, including across modalities. B) Proportion of nuclei that are assigned the same droplet type and individual by all methods, both within and across modalities. C) UpSet visualization of all methods with both RNA- and ATAC-based demultiplexing in the stem cell dataset. D) Same as C), but for the aorta dataset.
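The proportion in panel B can be sketched as below. The function name, the dict-based layout, and the choice of denominator (barcodes present in every method's output) are assumptions for illustration and not necessarily how the figure was computed.

```python
def full_agreement_proportion(calls_by_method):
    """Proportion of shared nuclei given an identical call by every method.

    calls_by_method: dict mapping method name -> dict of
                     barcode -> (droplet_type, donor).
    """
    per_method = list(calls_by_method.values())
    # Assumption: only barcodes reported by every method are compared.
    shared = set(per_method[0]).intersection(*per_method[1:])
    if not shared:
        return 0.0
    agree = sum(
        1 for barcode in shared if len({m[barcode] for m in per_method}) == 1
    )
    return agree / len(shared)


# Hypothetical calls from two methods on two nuclei.
example = {
    "method_a": {"bc1": ("singlet", "d1"), "bc2": ("doublet", None)},
    "method_b": {"bc1": ("singlet", "d1"), "bc2": ("singlet", "d2")},
}
agreement = full_agreement_proportion(example)  # 1 of 2 nuclei agree
```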

Concept of variant consistency metric and application to simulations.
A) Schematic of the variant consistency metric. Per-nucleus allele counts are classified into four categories of consistency based on the uniqueness of each variant. B) Variant consistency ratios as a function of ambient RNA rate and number of multiplexed donors in simulated experiments. C) I1 rates as a function of how many times a SNP is covered experiment-wide in lower-depth simulations. D) The rate of allele counts that are inconsistent but non-unique (I1) correlates highly with the true ambient RNA/DNA fraction across simulations, showcasing our ability to quantify ambient contamination in real datasets.

Applying variant consistency to real data reveals differences between methods in the quality of demultiplexed singlets in both RNA and ATAC.
A) Number of droplets that each method detects outside of the within-modality intersection. B) C2 and I1 rates in the droplets called uniquely by each method. C) Correlation of I1 rates between modalities.