Sibling chimerism analysis at single-cell resolution.

(A) Numbers of variant sites at which marmoset sibling genomes differ, for 96 sibling-pairs. Each dot is a sibling-pair; birth siblings are colored in pink. (B) Numbers of transcribed SNPs (per nucleus) visible in various cell types from blood and brain snRNA-seq datasets of marmoset CJ028. x-axis: snRNA-seq library; y-axis: number of ascertained, transcribed SNPs for which host and sibling have different genotypes; black horizontal lines: median values per cell type. (C) Donor-of-origin assignment of each nucleus in blood snRNA-seq of a marmoset. The host marmoset (CJ028) was born with two siblings; each nucleus was assigned to either the host or to one of the two birth siblings. x-axis: number of unique molecular identifiers (UMIs; a measure of transcript abundance) that contain SNPs for which the host’s and sibling’s genomes differ, in log scale; y-axis: inferred likelihood that the cell has the host genome minus likelihood that the cell has the sibling genome (log10). (D) Two-dimensional visualization (tSNE plot) of snRNA-seq data from the blood, kidney and liver of a marmoset (CJ026). (E) Donor-of-origin assignment in marmoset CJ026’s blood, kidney and liver. Axes: same as in (C). (F) Levels of chimerism in each cell type ascertained in blood, kidney and liver snRNA-seq of marmoset CJ026. y-axis: fraction of sibling cells in each cell type; numbers in fraction: number of sibling nuclei over total nuclei in the cell type; percentage in x-axis labels: cell type representation in the tissue; vertical bars: binomial confidence interval (95%); P-values: test of heterogeneity (chi-square) across immune cell types of a tissue (to test for differences in the sibling’s contribution across immune cell types).
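The vertical bars in panel (F) are 95% binomial confidence intervals on a fraction of sibling nuclei. The legend does not state which interval method was used, so the following is only a minimal sketch that assumes the Wilson score interval; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math

def wilson_interval(sibling_nuclei, total_nuclei, z=1.959963984540054):
    """Wilson score interval for a binomial proportion.

    ASSUMPTION: the source only says "binomial confidence interval (95%)";
    the Wilson method is chosen here for illustration.
    z is the standard normal quantile for a two-sided 95% interval.
    """
    if total_nuclei <= 0:
        raise ValueError("total_nuclei must be positive")
    p = sibling_nuclei / total_nuclei
    z2 = z * z
    denom = 1.0 + z2 / total_nuclei
    center = (p + z2 / (2.0 * total_nuclei)) / denom
    half_width = (
        z * math.sqrt(p * (1.0 - p) / total_nuclei + z2 / (4.0 * total_nuclei ** 2))
        / denom
    )
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)

# Hypothetical example: 10 sibling nuclei out of 100 nuclei in a cell type.
low, high = wilson_interval(10, 100)
print(f"fraction = {10 / 100:.2f}, 95% CI = ({low:.4f}, {high:.4f})")
```

With small numbers of nuclei per cell type the interval is wide, which is why the legend reports counts alongside each fraction.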

Microglia and macrophages, but not neurons and glia, are chimeric in the marmoset brain.

(A) Two-dimensional visualization (tSNE plot) of marmoset CJ028’s snRNA-seq profiles from 8 brain regions. (B) Expression of markers of microglia and macrophages in microglia, macrophages, neurons, glia (astrocytes, oligodendrocytes, polydendrocytes, ependymal cells) and endothelial cells. y-axis: expression levels, measured as numbers of detected transcripts (unique molecular identifiers (UMIs), a measure of transcript abundance) from that gene per 100,000 total transcripts. Microglia markers: TREM2, LAPTM5, C3; macrophage markers: F13A1, LYVE1. Data from all of CJ028’s brain regions. (C) Donor assignment of each nucleus to one of three possible donors (host, sibling1 or sibling2) in marmoset CJ028’s brain snRNA-seq data. x-axis: number of UMIs that contain SNPs for which the host’s and sibling’s genomes differ, in log10 scale; y-axis: inferred likelihood that the cell has the host genome minus likelihood that the cell has the sibling genome (log10). Data from all of CJ028’s brain regions. (D) Fractions of cells with a sibling’s genome among microglia, macrophages, and other brain cell types (neurons, astrocytes, oligodendrocytes, polydendrocytes, ependyma, endothelial) from 11 marmosets. CJ001, CJ023 and CJ028 were part of a triplet litter, and the chimerism fraction of each birth sibling is shown in separate panels. y-axis: fraction of sibling cells in the cell type. Vertical bars: binomial confidence interval (95%). P-values: chi-square test comparing chimerism fractions in microglia and macrophages; *: P-value<0.05. (E) Sibling contributions to microglial (x-axis) and macrophage (y-axis) populations in the same animals. Dots and bars are from (D). Pearson correlation R=0.32, 95% confidence interval (−0.26 to 0.73), P-value=0.27.

Sibling contributions to hematopoiesis-derived cells diverge between blood and brain.

(A)-(E) Chimerism fractions in brain and blood of three animals. CJ028 was part of a triplet litter, and the contribution of each sibling is shown in a separate panel (D,E). Red horizontal lines: twin contribution to blood cells as ascertained from whole-genome sequencing of whole-blood-derived genomic DNA (Census-seq). Blue horizontal lines: total twin contribution to blood cells as estimated from PBMC snRNA-seq (all cells). Vertical bars: binomial confidence interval (95%). Numbers in fraction: number of sibling cells over total cells in the cell type. glia+endothelia: astrocytes, oligodendrocytes, polydendrocytes, ependymal cells, endothelial cells.

Sibling contributions to brain microglial populations vary across an animal’s brain areas.

Contributions of sibling(s) to the microglial populations ascertained in principal brain areas (A), and finer-scale brain substructures (B). CJ001, CJ023 and CJ028 were part of a triplet litter and the chimerism contribution of each twin is shown in a separate panel. CJ102 was profiled in only one brain region and hence was not included in the analysis. y-axis: fraction of twin cells. Vertical bars: binomial confidence interval (95%); P-values: test of heterogeneity across an animal’s brain regions.

Utilizing natural chimerism to distinguish cell-autonomous from non-cell-autonomous effects on gene expression, and to compare the effects of context and genetic variation in shaping gene expression.

(A-C) Comparisons of RNA expression between microglial populations within host animal CJ028, who had two birth siblings. Comparisons of gene expression between microglia with the genomes of (A) the female host and male sibling, (B) the female host and female sibling, and (C) the two siblings (male and female). Each point represents a gene; its location on the plot represents the level of expression of that gene among microglia with two different genomes in the same animal. x- and y-axes: normalized gene expression levels (number of transcripts per 100,000 transcripts). FC: fold-change of gene expression, female/male for XIST. Fold-change and P-values were calculated using edgeR. Differentially expressed genes (black dots) were defined as genes with FDR Q-value<0.05 and fold-change>1.5 or <1/1.5 that are expressed in at least 10% of one of the microglia sets. (D-I) Greater effect of context than of genetic differences in shaping gene expression. (D) In the brain of marmoset CJ027, the neocortex and striatum are two contexts in which two sets of microglia with different genomes reside. (E) Effect of genetic differences. x-axis: log2-fold-change of cortical microglia gene expression between host and sibling cells; y-axis: log2-fold-change of striatal microglia gene expression between host and sibling cells. (F) Effect of context. x-axis: log2-fold-change of the sibling’s gene expression between the two brain regions; y-axis: log2-fold-change of the host’s gene expression between the two brain regions. (G-I) The brains (cortex, striatum and hippocampus) of two birth siblings provide biological contexts in which populations of microglia with two sibling genomes reside. The effect of genetic differences (H) and the effect of animal context (I) are compared for the same brain areas (combined data from cortex, striatum and hippocampus). x- and y-axes: log2-fold-changes of the gene expression between two sets of microglia (the sets being compared are indicated in the axis labels). R: Spearman correlation.
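The three criteria for calling a gene differentially expressed (panels A-C) can be written as a single predicate. This is a minimal sketch, not the authors' code: the function and argument names are hypothetical, the fold-change and FDR Q-value are assumed to have been computed beforehand (for example by edgeR), and "expressed in at least 10% of one of the microglia sets" is read here as at least 10% of cells in at least one of the two sets.

```python
def is_differentially_expressed(
    fdr_q,
    fold_change,
    frac_cells_expressing_set1,
    frac_cells_expressing_set2,
    q_cutoff=0.05,
    fc_cutoff=1.5,
    min_fraction=0.10,
):
    """Apply the thresholds stated in the legend to one gene.

    fold_change is a ratio (not log2), so a decrease is fold_change < 1/1.5.
    ASSUMPTION: the 10% rule refers to the fraction of cells with detected
    expression in at least one of the two microglia sets being compared.
    """
    passes_fdr = fdr_q < q_cutoff
    passes_fold_change = fold_change > fc_cutoff or fold_change < 1.0 / fc_cutoff
    passes_expression = (
        max(frac_cells_expressing_set1, frac_cells_expressing_set2) >= min_fraction
    )
    return passes_fdr and passes_fold_change and passes_expression

# Hypothetical genes: (FDR Q-value, fold-change, fraction set 1, fraction set 2)
examples = {
    "geneA": (0.001, 2.0, 0.30, 0.05),  # passes all three criteria
    "geneB": (0.001, 1.2, 0.30, 0.30),  # fold-change too small
    "geneC": (0.001, 0.5, 0.02, 0.03),  # expressed in too few cells
}
for name, args in examples.items():
    print(name, is_differentially_expressed(*args))
```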

Marmosets analyzed with snRNA-seq in this study. Colonies: NEPRC - New England Primate Research Colony; CLEA - Central Institute for Experimental Animals, Japan; Company A: marmosets obtained from a non-clinical contract research organization.

Number of microglia and macrophage cells identified in brain datasets and number of nuclei profiled in blood, liver and kidney.

Comparison of chimerism between CJ028’s two birth siblings, in blood and in brain myeloid cells (microglia and macrophages). P-values are from a two-sided test of proportions between the chimerism fractions of sibling 1 and sibling 2, using the prop.test function in R.
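For readers without R, a two-sided two-sample test of proportions can be sketched in plain Python. The sketch below assumes the default behavior of R's prop.test for two groups (a chi-square statistic with Yates' continuity correction and one degree of freedom); whether the authors used the default settings is not stated. The function name and the counts are hypothetical.

```python
import math

def two_proportion_test(x1, n1, x2, n2, correct=True):
    """Chi-square test that two binomial proportions are equal.

    x1, x2: counts of sibling-derived cells; n1, n2: total cells.
    ASSUMPTION: mirrors prop.test defaults (continuity correction on).
    Returns (chi-square statistic, two-sided P-value).
    """
    pooled = (x1 + x2) / (n1 + n2)
    observed = [x1, n1 - x1, x2, n2 - x2]
    expected = [n1 * pooled, n1 * (1 - pooled), n2 * pooled, n2 * (1 - pooled)]
    if any(e == 0 for e in expected):
        raise ValueError("pooled proportion is 0 or 1; test is undefined")
    # In a 2x2 table every |observed - expected| has the same magnitude.
    deviation = abs(x1 - n1 * pooled)
    yates = min(0.5, deviation) if correct else 0.0
    statistic = sum(
        (abs(o - e) - yates) ** 2 / e for o, e in zip(observed, expected)
    )
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical counts: sibling 1 contributes 30/100 cells, sibling 2 10/100.
stat, p = two_proportion_test(30, 100, 10, 100)
print(f"chi-square = {stat:.3f}, P = {p:.2g}")
```

The closed-form tail probability works only because the test has one degree of freedom; comparisons across more than two groups would need a general chi-square distribution function.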

Summary of the context versus genetic effects analysis, with two brain regions of an animal as two contexts. The analysis described in Fig. 5D-F was repeated across all animals and brain regions with at least 60 cells available for analysis in each context, and a summary of the correlations is tabulated here. The correlations are plotted in Supplementary Fig. 5. Abbreviations: STR: striatum; Thal: thalamus; Hippo: hippocampus; Hyp: hypothalamus; BF: basal forebrain.

Clustering parameters used to identify microglia and macrophage cell types, and thresholds for identifying host-sibling doublets. The final numbers of microglia and macrophages after the second round of clustering are in Supplementary Table 2. A cell is called a doublet if the Dropulation tool DetectDoublets assigned its highest likelihood to the doublet state and if the log10 of the best likelihood minus the log10 of the second-best likelihood (lrt_test_stat) is greater than the doublet detection threshold (last column).
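The two-part doublet rule can be expressed as a short function. This is a sketch of the decision logic only: the input dictionary, its keys and the example values are hypothetical and do not reproduce the actual output format of DetectDoublets; only the name lrt_test_stat comes from the caption.

```python
def is_doublet(log10_likelihoods, threshold, doublet_label="doublet"):
    """Decide whether a cell is called a host-sibling doublet.

    log10_likelihoods: hypothetical mapping from candidate state
    (e.g. "host", "sibling", "doublet") to its log10 likelihood.
    A cell is a doublet only if (1) the doublet state has the highest
    likelihood and (2) best minus second-best log10 likelihood
    (lrt_test_stat) exceeds the per-dataset threshold.
    """
    if len(log10_likelihoods) < 2:
        raise ValueError("need at least two candidate states")
    ranked = sorted(log10_likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    best_label, best_value = ranked[0]
    _, second_value = ranked[1]
    lrt_test_stat = best_value - second_value
    return best_label == doublet_label and lrt_test_stat > threshold

# Hypothetical cells and threshold.
confident_doublet = {"host": -40.0, "sibling": -55.0, "doublet": -30.0}
weak_doublet = {"host": -31.0, "sibling": -55.0, "doublet": -30.0}
singlet = {"host": -20.0, "sibling": -60.0, "doublet": -35.0}
for cell in (confident_doublet, weak_doublet, singlet):
    print(is_doublet(cell, threshold=5.0))
```

Requiring a margin over the second-best state keeps ambiguous cells, such as the second example, from being discarded as doublets.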

Whole genome sequencing datasets used in (1) donor-of-origin assignment from snRNA-seq (Dropulation), and (2) estimating chimerism from blood whole genome sequencing (Census-seq).

Microglia and brain macrophages can be identified in all animals.

Expression of microglia and macrophage markers in microglia (left sub-panels), macrophages (middle sub-panels) and all other cell types in the brain (right sub-panels; neurons, glia (astrocytes, oligodendrocytes, polydendrocytes, ependymal cells), and endothelial cells) of each animal. Marmosets CJ022 and CJ102 were profiled using two technologies (DS: Drop-seq, 10X: 10X Chromium). y-axis: unique molecular identifiers (UMIs, a measure of transcript abundance) of each gene across cells, summed and normalized to 100,000 transcripts. Microglia markers: TREM2, LAPTM5, C3; macrophage markers: F13A1, MSTN, LYVE1.
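The y-axis normalization (UMIs of a gene summed across cells and scaled to 100,000 transcripts) can be illustrated as follows. This is a minimal sketch under the assumption that counts are pooled across all nuclei of a cell type before scaling; the data structure, function name and counts are hypothetical.

```python
def pooled_umis_per_100k(cells, gene):
    """Pooled expression of one gene, in UMIs per 100,000 UMIs.

    cells: list of dicts, one per nucleus, mapping gene name -> UMI count
    (a hypothetical in-memory representation of a count matrix).
    ASSUMPTION: UMIs are summed over all nuclei first, then normalized.
    """
    gene_umis = sum(cell.get(gene, 0) for cell in cells)
    total_umis = sum(sum(cell.values()) for cell in cells)
    if total_umis == 0:
        raise ValueError("no UMIs detected in this group of cells")
    return 100_000 * gene_umis / total_umis

# Two hypothetical nuclei with made-up counts.
microglia = [
    {"TREM2": 2, "OTHER": 98},
    {"TREM2": 0, "OTHER": 100},
]
print(pooled_umis_per_100k(microglia, "TREM2"))
```

Pooling before normalizing gives nuclei with more detected transcripts more weight than averaging per-nucleus normalized values would.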

Donor-of-origin assignments from brain snRNA-seq reveal that only microglia and macrophages are chimeric.

(A-L) Donor-of-origin (Dropulation) assignments of each nucleus from brain snRNA-seq of 10 animals. Marmosets CJ022 and CJ102 were profiled using two technologies (DS: Drop-seq, 10X: 10X Chromium). For each marmoset, the snRNA-seq data are grouped into microglia, macrophages, and all other cell types (others: neurons, astrocytes, oligodendrocytes, polydendrocytes, ependymal cells, endothelial cells). x-axis: number of UMIs that contain SNPs for which the host’s and sibling’s genomes differ, in log scale; y-axis: inferred likelihood that the cell has the host genome minus likelihood that the cell has the sibling genome (log10). Nuclei with positive y-values are assigned to the host and those with negative y-values are assigned to the sibling.
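To make the y-axis concrete, the sketch below computes a log10 likelihood difference between host and sibling for one nucleus and assigns the nucleus by its sign. It uses a deliberately simplified allele model that is an assumption for illustration, not the Dropulation implementation: each informative UMI reports a reference or alternate allele, and the probability of seeing the alternate allele depends only on the donor genotype and a fixed error rate. All names and values are hypothetical.

```python
import math

def alt_allele_probability(alt_allele_count, error_rate=0.001):
    """P(UMI shows ALT allele) given a genotype with 0, 1 or 2 ALT alleles.

    ASSUMPTION: simplified model with a fixed sequencing error rate.
    """
    return {0: error_rate, 1: 0.5, 2: 1.0 - error_rate}[alt_allele_count]

def log10_likelihood(observations, genotypes, error_rate=0.001):
    """Sum of log10 probabilities of the observed alleles under one donor.

    observations: list of (snp_id, saw_alt) pairs, one per informative UMI.
    genotypes: dict snp_id -> number of ALT alleles (0, 1 or 2).
    """
    total = 0.0
    for snp_id, saw_alt in observations:
        p_alt = alt_allele_probability(genotypes[snp_id], error_rate)
        total += math.log10(p_alt if saw_alt else 1.0 - p_alt)
    return total

def assign_donor(observations, host_genotypes, sibling_genotypes):
    """Return (label, host-minus-sibling log10 likelihood difference)."""
    difference = log10_likelihood(observations, host_genotypes) - log10_likelihood(
        observations, sibling_genotypes
    )
    if difference > 0:
        return "host", difference
    if difference < 0:
        return "sibling", difference
    return "unassigned", difference

# Hypothetical SNPs at which host and sibling genotypes differ.
host = {"snp1": 0, "snp2": 2}
sibling = {"snp1": 2, "snp2": 0}
umis = [("snp1", False), ("snp2", True)]  # alleles seen in one nucleus
print(assign_donor(umis, host, sibling))
```

Each additional informative UMI adds a term to the sum, so the magnitude of the difference tends to grow with the x-axis value.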

Summary of microglia (A) and macrophage (B) chimerism across animals.

y-axis: fraction of twin cells. Vertical bars: binomial confidence interval (95%). P-values: test of heterogeneity across animals.

Comparison of gene expression between microglia with different genomes in each host animal’s brain.

Each point represents a gene; its location on the plot represents the level of expression of that gene among microglia with two different genomes in the same animal. x- and y-axes: normalized gene expression levels (number of transcripts per 100,000 transcripts). Fold-change and P-values were calculated using edgeR. Differentially expressed genes (black dots) were defined as genes with FDR Q-value<0.05 and fold-change>1.5 or <1/1.5 that are expressed in at least 10% of one of the microglia sets.

Summary of genetic versus context effects.

This is a plot of the correlation values from Supplementary Table 5. Abbreviations: STR: striatum; Thal: thalamus; Hippo: hippocampus; Hyp: hypothalamus; BF: basal forebrain.

Doublet detection using host and sibling genotypes.

The axes are the same as in Supplementary Fig. 2, and each dot is a nucleus. Nuclei that were identified as doublets and discarded from analyses are indicated (black dots). Marmosets CJ022 and CJ102 were profiled using two technologies (DS: Drop-seq, 10X: 10X Chromium).

Second round clustering of microglia to discard mis-classified cells.

(A-M) Gene expression comparison between host and sibling cells. Red dots: nuclei identified as microglia from the first round of clustering; black dots: nuclei that were retained after second-round clustering. For triplets, only the first sibling is included in the plots. Pearson correlation was calculated for each set (before and after second-round clustering) and improves after mis-classified cells are discarded. Marmosets CJ022 and CJ102 were profiled using two technologies (Drop-seq and 10X Chromium). (N) Summary (box plot) of the fraction of microglia discarded during the second round of clustering, for host and birth sibling.

Analysis of genotyping and chimerism if the genotypes of the sibling are contaminated by the hosts’ DNA.

(A)-(F) Sensitivity and false positives in genotyping; HomRef: homozygous reference, HomAlt: homozygous alternate allele, Het: heterozygous. (G)-(L) Sensitivity and false positives in donor-of-origin assignment. (M)-(Q) Microglia chimerism estimates when sibling WGS data are contaminated by the hosts’ DNA, for 5 brain regions; red horizontal line: chimerism estimates when there is no error in sibling genotypes. (R)-(U) Macrophage chimerism estimates when sibling WGS data are contaminated by the hosts’ DNA; red horizontal line: chimerism estimates when there is no error in sibling genotypes.