Generation of a large RNAseq dataset of midbrain dopamine neurons. A) Schematic of single nucleus isolation pipeline using microfluidic chip-based sorting from n=8 mice. B) UMAP representation of clusters. Of note, putative clusters 16 and 21 are shown in grey, and are removed from analyses due to co-expression of glial markers. C) Expression of the defining midbrain dopaminergic neuron markers vesicular monoamine transporter 2 (Vmat2; Slc18a2), DOPA decarboxylase (Ddc), tyrosine hydroxylase (Th), and dopamine transporter (Dat; Slc6a3). Robust expression is observed throughout all clusters. D) Dot plot showing expression of several markers of previously described dopamine neuron subtypes, consistent with finding additional heterogeneity within established subtypes. E) Expression patterns of midbrain floorplate markers. All clusters show robust expression with exception of 0 and 17, which indicate these populations are not midbrain DA neurons but might be located in nearby hypothalamic nuclei.

Mapping subtype identities, marker genes, and relationships to previously described populations. A) Dendrogram of DA neuron subtypes in our dataset. Distance between branch points represent approximations of relatedness between subsequent nodes, and genes labeled at each branch point represent the top differentially expressed genes distinguishing the downstream groups. Subtypes fell largely into three “families,” with expression patterns generally defined by Gad2, Sox6, or Calb1. Notably, clusters 1, 11, and 13 are shown with an asterisk as these do not display differential expression in line with the defining genes of their cluster families (i.e. are not significantly enriched for their eponymous family genes). B) Heatmap of top differentially expression genes for each cluster. The top 3 genes for each cluster are shown. C-F) Coexpression of genes at specified branch points denoted in 2A. G) Table describing putative clusters and relation to previous literature. Top row: cluster number. Second row: Equivalent cluster in Azcorra & Gaertner et al25. Third row: stepwise genetic signatures based on following sequential branch points of cluster dendrogram. Bottom row: Proposed shorthand name for subtypes, based on cluster family and top defining genes for that cluster within each family.

Identification of dopamine neurons in MERFISH. A) Schematic showing the processing and imaging of brain tissue with MERFISH. B) Clustering and annotation of all cells identified by MERFISH. C) Relative expression of Th, Slc6a3 and Slc18a2 shows the presence of a single cluster (blue cluster in B) that expresses all three genes. D) Spatial location of neuronal clusters for whole brain MERFISH along three rostral-caudal delineations associated with -3.0, -3.2 and -3.6mm Bregma. E) Cellular expression of Th in the sections shown in D.

Mapping of MERFISH data with snRNA-seq clustering: A) Grouping and recoloring of snRNA-seq clusters into distinct families represented by Gad2, Sox6 and Calb1 (Left). Relative proportion of dopamine neurons comprising each cluster from the snRNA-seq (SN, orange) and MERFISH label transfer (MER, grey) data (Right). B) Spatial representation of predicted clustering across the rostral-caudal axis, zoom of one representative hemisphere shown below. C-E) Imputed cellular expression of Gad2, Calb1 and Sox6 across the rostral-caudal axis.

Spatial localization of Sox6 family of DA neurons. A) Distribution of Sox6 family DA neuron subtypes relative to all other DA neurons along the rostral-caudal axis. DA cells belonging to the Calb1 and Gad2 family are colored in light grey. B) Location of individual subtype within the Sox6 family (left two panels) with other cells in the Sox6 family shown in dark grey. Relative cellular expression of genetic markers associated with each subfamily shown in a UMAP and a midbrain section.

Spatial localization of Calb1 family DA neurons. A) Calb1 family of neurons were divided into two distinct branches based on hierarchical clustering. Distribution of Branch 1 of Calb1 family DA neurons along the rostral-caudal axis with all other DA cells colored in grey. B) Location of individual Calb1-Branch1 clusters within the Calb1-Branch1 family. Only cells in the Calb1-Branch1 family are shown. Relative cellular expression of genetic markers associated with each subtype shown in a UMAP and brain section. C) and D) Same as A and B, but for Branch 2 of Calb1 family DA neurons. Relative cellular expression of genetic markers associated with each subfamily shown in a UMAP and a midbrain section.

Spatial location of Gad2 family DA neurons. A) Distribution of Gad2 family DA neurons along the rostral-caudal axis with other DA cells colored in grey. B) Location of individual subtypes within the Gad2 family with only cells of the Gad2 family shown in grey.

Global effects of Lrrk2G2019S mutation are observed across all clusters. A) Expression pattern of Lrrk2 RNA, which is present in all clusters, though notably low in Gad2Egfr (cluster 15). B) Cluster proportions in control and mutant samples. Relative proportions of each subtype within the dataset are relatively unchanged. C) Cluster similarity heatmap shows that each cluster from control samples is more similar to its corresponding cluster in the mutant samples, suggesting there is no large-scale change in cell type across samples. D) Volcano plot of differentially expressed genes across conditions. Each point is colored according to its significance and magnitude of fold change. E) Gene set enrichment analysis (GSEA) results comparing all clusters across conditions. Several top enriched pathways in Lrrk2 mutants are related to mitochondrial energy production/oxidative phosphorylation. F) Gene ontology (GO) results comparing all clusters across conditions. Top enriched pathways are related to synapse organization and function, consistent with previously described synaptic dysfunction in Lrrk2G2019S mutant mice.

Changes in gene expression are observed in PD-implicated DA neuron subtypes in Lrrk2G2019S mice. A) GO results comparing Sox6 cluster family across conditions. Several pathways are highly relevant to locomotor behavior and previously described dysfunction (such as synaptic organization pathways) in Lrrk2G2019S mutant mice. B) GSEA results comparing Sox6 cluster family across conditions. While twenty pathways were positively enriched in mutant Sox6 family clusters (BH-corrected p-values < 0.05), several additional pathways were enriched in either direction using a less conservative cutoff of p < 0.1. The top five positively and negatively enriched pathways with this less conservative cutoff are shown. Pathway enrichment was similar to global changes, but more pronounced in the Sox6 family of clusters. C) GO results comparing the Sox6Tafa1 cluster, which showed the highest expression of Anxa1, across conditions. This population has previously been shown to selectively signal during motor acceleration and shows enrichment for motor pathways as a function of Lrrk2 genotype, which could suggest DA signaling deficits in acceleration-correlated circuits and a potential substrate for subsequent motor deficits in Lrrk2G2019S DA neurons. D) Results of SynGO analyses for Sox6 family, Calb1 family, or Sox6Tafa1 clusters. SynGO-annotated (i.e. synaptic) genes were enriched among DEGs for ventral SNc populations, particularly Sox6Tafa1. Of note, among these synaptic genes, Sox6Tafa1 cluster showed much heavier enrichment for those localized to the presynapse, which may suggest subtype-specific presynaptic function in acceleration-corrected subtypes. G) Feature plot of each cell’s association with PD risk loci proposed by GWAS as calculated by scDRS scores. Clear differences are observed across clusters. H) Mean and 95% confidence intervals for PD GWAS risk (scDRS scores) for each cluster. Sox6 family of clusters are particularly elevated. I) Average risk scores with 95% confidence intervals among cluster families, as well as clusters Sox6Tafa1 and Sox6Vcan, the two Anxa1-expressing SNc clusters, which showed the highest average risk scores.

Quality control for dataset generation. A) Violin plots for each of our 4 samples (post-filtering) for each of the four QC metrics used for filtering our datasets. B) Expression patterns of sex-specific genes. All clusters are represented by both sexes. C) Histogram of number of cells plotted by male to female gene ratios (scored continuously along the x-axis). Three discrete peaks emerge, representing cells likely originating from either sex or those with indeterminate ratios due to technical drop-off in RNAseq reads. Similar numbers of male and female cells were observed, as represented by the sums of adjacent bins in each of the three marked regions on the x-axis. D) Cluster representation from each individual RNAseq library. All clusters were represented in all samples. E) Cluster representation from pooled control and Lrrk2 mutant samples. Distributions are roughly equivalent between conditions, suggesting no overt change in subtype composition as a function of genotype. F) Dotplot of expression for glial marker genes Mbp and Atp1a2, showing high expression in clusters 16 and 21, respectively, indicating likely doublets of DA neurons and glia.

Representations of cluster heterogeneity. A) Cluster stability metrics shown as a box plot for each cluster. Clusters with lower stability contain cells that more easily collapse into other clusters. Outliers (defined as more than 1.5 times the interquartile range (IQR) above the third quartile or below the first quartile) are shown as circles. B) Cluster tree displaying evolution of clusters when calculated at different resolutions. As resolution increases, new levels of heterogeneity emerge, but ultimately become largely stable at higher resolutions. Resolution used for our clustering scheme is highlighted by dotted line.

Validating snRNA-seq clusters with Python and published DA clusters. A) Comparison of UMAP from Azcorra*,Gaertner* et al., 202325 and UMAP from Figure 1. A Sankey diagram was created by mapping nuclei collected in Fig 1 onto the clusters from Azcorra*,Gaertner* et al., 202325. The sankey diagrams only include cell transfers that make up at least 5% of the cluster. B) UMAP representation of clusters using Scanpy package on Python and expression of Anxa1 gene. C) Sankey diagram comparing the clustering of dataset from Figure 1 using Scanpy and Seurat. Note, cells in cluster 19 of the Scanpy plot did not transfer to one cluster in Seurat with >5% of its total cell count.

Quality control metrics for MERFISH experiments. A-C) Volume, number of features (nFeatures) and number of total transcripts counts (nCounts) metrics for each cell identified in MERFISH experiments after quality control. D) Example correlation coefficient between total cellular transcripts detected for each gene across two experiments and a matrix showing experimental similarity. Spatial location of cortical excitatory (E) and inhibitory (F) neurons show distinct layering patterns by MERFISH imaging. G) Relative expression levels of groups of genes associated with synthesis and secretion of glutamate (Slc17a6, Slc17a7, Slc17a8), GABA (Slc32a1, Gad1, Gad2), dopamine (Slc6a3, Th) and serotonin (Tph2, Slc6a4) showcase the multilingual nature of the dopaminergic and serotonergic systems.

Clustering of DA neurons identified by MERFISH. A) Schematic showing the manual outlining of anatomical regions of the ventral midbrain. B) Relative proportions of total DA neurons identified by MERFISH distributed throughout the ventral midbrain. C) Subclustering of DA neurons from MERFISH dataset reveals 12 distinct subclusters. D) Heatmap showing relative expression of genes associated with dopaminergic character and the top three differentially expressed genes within each cluster identified in c). E) Relative expression of genes associated with dopaminergic character (Slc6a3, Th, Slc18a2) distributed across DA clusters. F) Midbrain location of clusters identified by MERFISH using the manual outlining described in A).

Integrating MERFISH with snRNA-seq: A) Overview of MERFISH integration with snRNA-seq dataset (top). snRNA-seq UMAP space and cluster identification were transferred to MERFISH cells by identifying mutual nearest neighbors (MNN) and assigning a score based on the overlap of neighbor cells that share the same labelling (middle). High confidence MERFISH cells (bottom) were identified as those that showed a cell similarity score >0.5 (2,298 of 4,399 original DA neurons). B) Sankey plot showing the proportion of cells from MERFISH clustering identities that make up the snRNA-seq clustering identities. Flow chart only includes cell transfers that make up at least 10% of the snRNA-Seq cluster. C-E) comparison of the transcript detection in snRNA-seq dataset, MERFISH dataset and imputed gene expression (Gad2, Calb1 and Aldh1a1 - left). Spatial location of normalized transcript counts and imputed data show similar distributions (middle). Example representations of individual transcripts detected by MERFISH in the VTA for each gene (right and inset).

A) Anatomical distribution of individual DA neurons by cluster family. B) Relative proportions of DA cluster subtypes by anatomical region.

Analysis of synaptic localizations and functional roles of differentially expressed genes using SynGO. A) Venn diagram of differentially expressed genes (DEGs) between Sox6 and Calb1 families. B) Mapping of pre- and post-synaptic genes found among significant DEGs in Lrrk2G2019S mutants compared to controls for Sox6 family, Sox6Tafa1, and Calb1 family of clusters. Bars are colored according to relative association with each cellular compartment. C) Functional roles of synaptic DEGs in each population in Lrrk2G2019S mutants compared to controls.