Introduction

The locus coeruleus (LC) is a small bilateral nucleus located in the dorsal pons of the brainstem, which serves as the brain’s primary site for production of the neuromodulator norepinephrine (NE). NE-producing neurons in the LC project widely to many regions of the central nervous system to modulate a variety of highly divergent functions including attention, arousal, and mood [14]. The LC, translated as “blue spot”, comprises merely 3,000 NE neurons in the rodent (∼1500-1600 each side of the brainstem) [1], and estimates in the human LC range from 19,000-46,000 total NE neurons [5]. Despite its prominent involvement in a number of critical brain functions and its unique capacity to synthesize NE, the LC’s small size and deep positioning within the brainstem has rendered it relatively intractable to a comprehensive cellular, molecular, and physiological characterization.

The LC plays important roles in core behavioral and physiological brain function across the lifespan and in disease. For example, there is strong evidence for age-related cell loss in the LC [6,7], and the LC-NE system is implicated in multiple neuropsychiatric and neurological disorders [4,8]. The LC is one of the earliest sites of degeneration in Alzheimer’s disease (AD) and Parkinson’s disease (PD), and profound loss of LC-NE neurons is evident with disease progression [810]. Moreover, maintaining the neural density of LC-NE neurons prevents cognitive decline during aging [11]. In addition, primary neuropathologies for AD (hyperphosphorylated tau) and PD (alpha-synuclein) can be detected in the LC prior to other brain regions [1215]. However, the molecular mechanisms rendering LC-NE neurons particularly vulnerable to age-related decline and neurodegeneration are not well-understood. In addition to its role in aging, the LC-NE system plays a critical role in mediating sustained attention, and its dysregulation is associated with attention-deficit hyperactivity disorder (ADHD) [16,17]. Of note, the NE reuptake inhibitor atomoxetine is the first non-stimulant medication that is FDA-approved for ADHD [1820]. Better understanding the gene expression landscape of the LC and surrounding region and delineating the molecular profile of LC-NE neurons in the human brain could facilitate the ability to target these neurons for disease prevention or manipulate their function for treatment.

The recent development of single-nucleus RNA-sequencing (snRNA-seq) and spatially-resolved transcriptomics (SRT) technological platforms provides an opportunity to investigate transcriptome-wide gene expression scale with cellular and spatial resolution [21,22]. SRT has recently been used to characterize transcriptome-wide gene expression within defined neuroanatomy of cortical regions in the postmortem human brain [22], while snRNA-seq has been used to investigate specialized cell types in a number of postmortem human brain regions including medium spiny neurons in the nucleus accumbens and dopaminergic neurons in the midbrain [21,23]. Importantly, snRNA-seq and SRT provide complementary views: snRNA-seq identifies transcriptome-wide gene expression within individual nuclei, while SRT captures transcriptome-wide gene expression in all cellular compartments (including the nucleus, cytoplasm, and cell processes) while retaining the spatial coordinates of these measurements. While not all SRT platforms achieve single-cell resolution, depending on the technological platform and tissue cell density, spatial gene expression has been resolved at, for example, ∼1-10 cells per spatial measurement location with a diameter of 55 μm in the human brain [22]. These platforms have been successfully used in tandem to spatially map single-nucleus gene expression in several regions of both neurotypical and pathological tissues in the human brain including the dorsolateral prefrontal cortex [22] and the dopaminergic substantia nigra [21].

In this report, we characterize the gene expression signature of the LC and surrounding region at spatial resolution, and identify and characterize a population of NE neurons at single-nucleus resolution in the neurotypical adult human brain. In addition to NE neurons, we identify a population of 5-hydroxytryptamine (5-HT, serotonin) neurons, which have not previously been characterized at the molecular level in human brain samples [24]. We observe expression of cholinergic marker genes within NE neurons, and confirm that this is due to co-expression within individual cells using fluorescent labeling with high-resolution imaging. We compare our findings from the human LC and adjacent region to molecular profiles of LC and peri-LC neurons that were previously characterized in rodents using alternative technological platforms [2527], and observe partial conservation of LC-associated genes across these species.

Results

Experimental design and study overview of postmortem human LC

We selected 5 neurotypical adult human brain donors to characterize transcriptome-wide gene expression within the LC at spatial and single-nucleus resolution using the 10x Genomics Visium SRT [28] and 10x Genomics Chromium snRNA-seq [29] platforms (see Supplementary Table 1 for donor demographic details). In each tissue sample, the LC was first visually identified by neuroanatomical landmarks and presence of neurons containing the pigment neuromelanin on transverse slabs of the pons (Figure 1A). Prior to SRT and snRNA-seq assays, we ensured that the tissue blocks encompassed the LC by probing for known LC marker genes [30]. Specifically, we cut 10 μm cryosections from tissue blocks from each donor and probed for the presence of a pan-neuronal marker gene (SNAP25) and two NE neuron-specific marker genes (TH and SLC6A2) by multiplexed single-molecule fluorescence in situ hybridization (smFISH) using RNAscope [31,32] (Figure 1B). Robust mRNA signal from these markers, visualized as puncta on imaged tissue sections, was used as a quality control measure in all tissue blocks prior to proceeding with inclusion in the study and performing SRT and snRNA-seq assays.

Summary of data resources providing access to datasets described in this manuscript.

All datasets described in this manuscript are freely accessible in the form of interactive web apps and downloadable R/Bioconductor objects.

Experimental design to measure the landscape of gene expression in the postmortem human locus coeruleus (LC) using spatially-resolved transcriptomics (SRT) and single-nucleus RNA-sequencing (snRNA-seq).

(A) Brainstem dissections at the level of the LC were conducted to collect tissue blocks from 5 neurotypical adult human brain donors. (B) Inclusion of the LC within the tissue sample block was validated using RNAscope [31,32] for a pan-neuronal marker gene (SNAP25) and two NE neuron-specific marker genes (TH and SLC6A2). High-resolution H&E stained histology images were acquired prior to SRT and snRNA-seq assays (scale bars: 2 mm in H&E stained image; 20 μm in RNAscope images). (C) Prior to collecting tissue sections for SRT and snRNA-seq assays, tissue blocks were scored to enrich for the NE neuron-containing regions. For each sample, the LC region was manually annotated by visually identifying NE neurons in the H&E stained tissue sections. 100 μm tissue sections from 3 of the same donors were used for snRNA-seq assays, which included FANS-based neuronal enrichment prior to library preparation to enrich for neuronal populations.

For tissue blocks included in the study, we cut additional 10 μm tissue sections, which were used for gene expression profiling at spatial resolution using the 10x Genomics Visium SRT platform [28] (Figure 1C). Fresh-frozen tissue sections were placed onto each of four capture areas per Visium slide, where each capture area contains approximately 5,000 expression spots (spatial measurement locations with diameter 55 μm and 100 μm center-to-center, where transcripts are captured) laid out in a honeycomb pattern. Spatial barcodes unique to each spot are incorporated during reverse transcription, thus allowing the spatial coordinates of the gene expression measurements to be identified [28]. Visium slides were stained with hematoxylin and eosin (H&E), followed by high-resolution acquisition of histology images prior to on-slide cDNA synthesis, completion of the Visium assay, and sequencing. For our study, 10 μm tissue sections from the LC-containing tissue blocks were collected from the 5 brain donors, with assays completed on 2-4 tissue sections per donor. Given the small size of the LC compared to the area of the array, tissue blocks were scored to fit 2-3 tissue sections from the same donor onto a single capture area to maximize the use of the Visium slides, resulting in a total of N=9 Visium capture areas (hereafter referred to as samples).

For 3 of the 5 donors, we cut additional 100 μm sections from the same tissue blocks to profile transcriptome-wide gene expression at single-nucleus resolution with the 10x Genomics Chromium single cell 3’ gene expression platform [29] (Figure 1C). Prior to collecting tissue sections, the tissue blocks were scored to enrich for NE neuron-containing regions. Neuronal enrichment was employed with fluorescence-activated nuclear sorting (FANS) prior to library preparation to enhance capture of neuronal population diversity, and snRNA-seq assays were subsequently completed. Supplementary Table 1 provides a summary of SRT and snRNA-seq sample information and demographic characteristics of the donors.

Spatial gene expression in the human LC

After applying the 10x Genomics Visium SRT platform [28], we executed several analyses to characterize transcriptome-wide gene expression at spatial resolution within the human LC. First, we manually annotated spots within regions identified as containing LC-NE neurons, based on pigmentation, cell size, and morphology from the H&E stained histology images (Figure 2A and Supplementary Figure 1). Next, we performed additional sample-level quality control (QC) on the initial N=9 Visium capture areas (hereafter referred to as samples) by visualizing expression of two NE neuron-specific marker genes (TH and SLC6A2) (Figure 2B), which identified one sample (Br5459_LC_round2) without clear expression of these markers (Supplementary Figure 2A-B). This sample was excluded from subsequent analyses, leaving N=8 samples from 4 out of the 5 donors. For the N=8 Visium samples, the annotated regions were highly enriched in the expression of the NE neuron marker genes (TH and SLC6A2) (Figure 2C and Supplementary Figure 2C), confirming that these samples captured dense regions of LC-NE neurons within the annotated regions. We performed spot-level QC to remove low-quality spots based on QC metrics previously applied to SRT data [22,33,34] (Methods). Due to the large differences in read depth between samples (Supplementary Figure 3A, Supplementary Table 1, Methods), we performed spot-level QC independently within each sample. After filtering low-expressed genes (Methods), this resulted in a total of 12,827 genes and 20,380 spots across the N=8 samples used for downstream analyses (Supplementary Figure 3B).

Spatial gene expression in the human LC using SRT.

(A) Spots within manually annotated LC regions containing NE neurons (red) and non-LC regions (gray), which were identified based on pigmentation, cell size, and morphology from the H&E stained histology images, from donors Br2701 (top row) and Br8079 (bottom row). (B) Expression of two NE neuron-specific marker genes (TH and SLC6A2). Color scale indicates unique molecular identifier (UMI) counts per spot. Additional samples corresponding to A and B are shown in Supplementary Figures 1, 2A-B. (C) Boxplots illustrating the enrichment in expression of two NE neuron-specific marker genes (TH and SLC6A2) in manually annotated LC regions compared to non-LC regions in the N=8 Visium samples. Values show mean log-transformed normalized counts (logcounts) per spot within the regions per sample. Additional details are shown in Supplementary Figure 2C. (D) Volcano plot resulting from differential expression (DE) testing between the pseudobulked manually annotated LC and non-LC regions, which identified 32 highly significant genes (red) at a false discovery rate (FDR) significance threshold of 10−3 and expression fold-change (FC) threshold of 3 (dashed blue lines). Horizontal axis is shown on log2 scale and vertical axis on log10 scale. Additional details and results for 437 statistically significant genes identified at an FDR threshold of 0.05 and an FC threshold of 2 are shown in Supplementary Figure 7 and Supplementary Table 2. (E) Average expression in manually annotated LC and non-LC regions for the 32 genes from D. Color scale shows logcounts in the pseudobulked LC and non-LC regions averaged across N=8 Visium samples. Genes are ordered in descending order by FDR (Supplementary Table 2). (F-G) Cross-species comparison showing expression of human ortholog genes for LC-associated genes identified in the rodent LC [25,26] using alternative experimental technologies. Boxplots show mean logcounts per spot in the manually annotated LC and non-LC regions per sample in the human data.

To investigate whether the LC regions could be annotated in a data-driven manner, we applied a spatially-aware unsupervised clustering algorithm (BayesSpace [35]) after applying a batch integration tool (Harmony [36]) to remove sample-specific technical variation in the molecular measurements (Supplementary Figure 4). The spatially-aware clustering using k=5 clusters identified one cluster that overlapped with the manually annotated LC regions in several samples. However, the proportion of overlapping spots between the manually annotated LC region and this data-driven cluster (cluster 4, colored red in Supplementary Figure 5A) was relatively low and varied across samples. We quantitatively evaluated the clustering performance by calculating the precision, recall, F1 score, and adjusted Rand index (ARI) for this cluster in each sample (see Methods for definitions). We found that while precision was >0.8 in 3 out of 8 samples, recall was <0.4 in all samples, the F1 score was <0.6 in all samples, and the ARI was <0.5 in all samples (Supplementary Figure 5B). Therefore, we judged that the data-driven spatial domains identified from BayesSpace were not sufficiently reliable to use for the downstream analyses, and instead proceeded with the histology-driven manual annotations for all further analyses. In addition, we note that using the manual annotations avoids potential issues due to inflated false discoveries resulting from circularity when performing differential gene expression testing between sets of cells or spots defined by unsupervised clustering, when the same genes are used for both clustering and differential testing [37]. Next, in addition to the manually annotated LC regions, we also manually annotated a set of individual spots that overlapped with NE neuron cell bodies identified within the LC regions, based on pigmentation, cell size, and morphology from the H&E histology images (Supplementary Figure 6A). However, we observed relatively low overlap between spots with expression of NE neuron marker genes and this second set of annotated individual spots. For example, out of 706 annotated spots, only 331 spots had >=2 observed UMI counts of TH (Supplementary Figure 6B). We hypothesize that this may be due to technical factors including sampling variability in the gene expression measurements, partial overlap between spots and cell bodies, potential diffusion of mRNA molecules between spots, as well as biological variability in the expression of these marker genes. Therefore, we instead used the LC region-level manual annotations for all further analyses.

Next, to identify expressed genes associated with the LC regions, we performed differential expression (DE) testing between the manually annotated LC and non-LC regions by pseudobulking spots, defined as aggregating UMI counts from the combined spots within the annotated LC and non-LC regions in each sample [22]. This analysis identified 32 highly significant genes at a false discovery rate (FDR) threshold of 10−3 and expression fold-change (FC) threshold of 3 (Figure 2D and Supplementary Figure 7A). This includes known NE neuron marker genes including DBH (the top-ranked gene by FDR within this set), SLC6A2 (ranked 6th), TH (ranked 7th), and SLC18A2 (ranked 14th). Out of the 32 genes, 31 were elevated in expression within the LC regions, while one (MCM5) was depleted. The set includes one long noncoding RNA (LINC00682), while the remaining 31 genes are protein-coding genes (Figure 2E and Supplementary Table 2). Alternatively, using standard significance thresholds of FDR < 0.05 and expression FC > 2, we identified a total of 437 statistically significant genes (Supplementary Figure 7B and Supplementary Table 2).

As a second approach to identify genes associated with LC-NE neurons in an unsupervised manner, we applied a method to identify spatially variable genes (SVGs), nnSVG [38]. This method ranks genes in terms of the strength in the spatial correlation in their expression patterns across the tissue areas. We ran nnSVG within each contiguous tissue area containing an annotated LC region for the N=8 Visium samples (total 13 tissue areas, where each Visium sample contains 1-3 tissue areas) and combined the lists of top-ranked SVGs for the multiple tissue areas by averaging the ranks per gene. In this analysis, we found that a subset of the top-ranked SVGs (11 out of the top 50) were highly-ranked in samples from only one donor (Br8079), which we determined was due to the inclusion of a section of the choroid plexus adjacent to the LC in these samples (based on expression of choroid plexus marker genes including CAPS and CRLF1) (Supplementary Figure 8A-C). In order to focus on LC-associated SVGs that were replicated across samples, we excluded the choroid plexus-associated genes by calculating an overall average ranking of SVGs that were each included within the top 100 SVGs in at least 10 out of the 13 tissue areas, which identified a list of 32 highly-ranked, replicated LC-associated SVGs. These genes included known NE neuron marker genes (DBH, TH, SLC6A2, and SLC18A2) as well as mitochondrial genes (Supplementary Figure 8D).

We also compared the expression of LC-associated genes previously identified in the rodent LC from two separate studies. The first study used translating ribosomal affinity purification sequencing (TRAP-seq) using an SLC6A2 bacTRAP mouse line to identify gene expression profiles of the translatome of LC neurons [25]. The second study used microarrays to assess gene expression patterns from laser-capture microdissection of individual cells in tissue sections of the rat LC [26]. We converted the lists of rodent LC-associated genes from these studies to human orthologs and calculated average expression for each gene within the manually annotated LC and non-LC regions. A small number of genes from both studies were highly associated with the manually annotated LC regions in the human data, including DBH, TH, and SLC6A2 from [25], and DBH and GNAS from [26]. However, the majority of the genes from both studies were expressed at low levels in the human data, which may reflect species-specific differences in biological function of these genes as well as differences due to the experimental technologies employed (Figure 2F-G).

Single-nucleus gene expression of NE neurons in the human LC

To add cellular resolution to our spatial analyses, we characterized gene expression in the human LC and surrounding region at single-nucleus resolution using the 10x Genomics Chromium single cell 3’ gene expression platform [29] in 3 of the same neurotypical adult donors from the SRT analyses.

Samples were enriched for NE neurons by scoring tissue blocks for the LC region and performing FANS to enhance capture of neurons. After raw data processing, doublet removal using scDblFinder [39], and standard QC and filtering, we obtained a total of 20,191 nuclei across the 3 samples (7,957, 3,015, and 9,219 nuclei respectively from donors Br2701, Br6522, and Br8079) (see Supplementary Table 1 for additional details). For nucleus-level QC processing, we used standard QC metrics including the sum of UMI counts and detected genes [33] (see Methods for additional details). We observed an unexpectedly high proportion of mitochondrial reads in nuclei with expression of NE neuron marker genes (DBH, TH, and SLC6A2), which represented our rare population of interest, and hence we did not remove nuclei based on proportion of mitochondrial reads (Supplementary Figures 9-10).

We identified NE neuron nuclei in the snRNA-seq data by applying an unsupervised clustering workflow adapted from workflows used for snRNA-seq data in the human brain [23], using a two-stage clustering algorithm consisting of high-resolution k-means and graph-based clustering that provides sensitivity to identify rare cell populations [33]. The unsupervised clustering workflow identified 30 clusters, including clusters representing major neuronal and non-neuronal cell populations, which we labeled based on expression of known marker genes (Figure 3A-B). This included a cluster of NE neurons consisting of 295 nuclei (168, 4, and 123 nuclei from donors Br2701, Br6522, and Br8079, respectively), which we identified based on expression of NE neuron marker genes (DBH, TH, and SLC6A2). In addition to the NE neuron cluster, we identified clusters representing excitatory neurons, inhibitory neurons, astrocytes, endothelial and mural cells, macrophages and microglia, oligodendrocytes, and oligodendrocyte precursor cells (OPCs), as well as several clusters with ambiguous expression profiles including pan-neuronal marker genes (SNAP25 and SYT1), which may represent damaged neuronal nuclei (Figure 3A-B and Supplementary Figure 9).

Single-nucleus gene expression in the human LC using snRNA-seq.

We applied an unsupervised clustering workflow to identify cell populations in the snRNA-seq data. (A) Unsupervised clustering identified 30 clusters representing populations including NE neurons (red), 5-HT neurons (purple), and other major neuronal and non-neuronal cell populations (additional colors). Marker genes (columns) were used to identify clusters (rows). Cluster IDs are shown in labels on the right, and numbers of nuclei per cluster are shown in horizontal bars on the right. Heatmap values represent mean logcounts per cluster. (B) UMAP representation of nuclei, with colors matching cell populations from heatmap. (C) DE testing between neuronal clusters identified a total of 327 statistically significant genes with elevated expression in the NE neuron cluster, at an FDR threshold of 0.05 and FC threshold of 2. Heatmap displays the top 70 genes, ranked in descending order by FDR, excluding mitochondrial genes, with NE neuron marker genes described in text highlighted in red. The full list of 327 genes including mitochondrial genes is provided in Supplementary Table 4. Heatmap values represent mean logcounts in the NE neuron cluster and mean logcounts per cluster averaged across all other neuronal clusters. (D-E) Cross-species comparison showing expression of human ortholog genes for LC-associated genes identified in the rodent LC [25,26] using alternative experimental technologies. Boxplots show logcounts per nucleus in the NE neuron cluster and all other neuronal clusters. Boxplot whiskers extend to 1.5 times interquartile range, and outliers are not shown. (F) DE testing between neuronal clusters identified a total of 361 statistically significant genes with elevated expression in the 5-HT neuron cluster, at an FDR threshold of 0.05 and FC threshold of 2. Heatmap displays the top 70 genes, ranked in descending order by FDR, with 5-HT neuron marker genes described in text highlighted in red. The full list of 361 genes is provided in Supplementary Table 5.

To validate the unsupervised clustering, we also applied a supervised strategy to identify NE neuron nuclei by simply thresholding on expression of NE neuron marker genes (selecting nuclei with >=1 UMI counts of both DBH and TH). As described above, we noted a higher than expected proportion of mitochondrial reads in nuclei with expression of DBH and TH, and did not filter on this parameter during QC processing, in order to retain these nuclei (Supplementary Figure 10A-B). This supervised approach identified 332 NE neuron nuclei (173, 4, and 155 nuclei from donors Br2701, Br6522, and Br8079, respectively), including 188 out of the 295 NE neuron nuclei identified by unsupervised clustering (Supplementary Figure 10C). We hypothesized that the differences for nuclei that did not agree between the two approaches were due to sampling variability in the snRNA-seq measurements for these two marker genes. To confirm this, we used an alternative method (smFISH RNAscope [32]) to assess co-localization of three NE neuron marker genes (DBH, TH, and SLC6A2) within individual cells on additional tissue sections from one additional independent donor (Br8689). Visualization of high-magnification confocal images demonstrated clear co-localization of these three marker genes within individual cells (Supplementary Figure 11). Since the unsupervised clustering is based on expression of a large number of genes and is therefore less sensitive to sampling variability for individual genes, we used the unsupervised clustering results for all further downstream analyses.

We performed DE testing between the neuronal clusters and identified 327 statistically significant genes with elevated expression in the NE neuron cluster, compared to all other neuronal clusters captured in this region, at an FDR threshold of 0.05 and FC threshold of 2. These genes include known NE neuron marker genes (DBH, TH, SLC6A2, and SLC18A2) as well as the 13 protein-coding mitochondrial genes, which are highly expressed in large, metabolically active NE neurons (Figure 3C, Supplementary Figure 12A, and Supplementary Table 4). Compared to the LC-associated genes identified in the SRT samples, differences are expected since the snRNA-seq data contains measurements from nuclei at single-nucleus resolution, while the SRT samples contain reads from nuclei, cytoplasm, and cell processes from multiple cell populations within the annotated LC regions.

To compare with previous results in rodents, we evaluated the expression of the rodent LC marker genes from [25,26] in the NE neuron cluster compared to all other neuronal clusters in the human snRNA-seq data (Figure 3D-E). Consistent with the SRT samples, we observed that several genes were conserved across species. However, compared to the SRT samples, we observed relatively higher expression of the conserved genes within the NE neuron cluster, which is expected since the NE neuron cluster contains reads from individual nuclei from this population only.

We note that a recent publication using snRNA-seq in mice found that LC-NE neurons were highly enriched for Calca, Cartpt, Gal, and Calcr in addition to canonical NE neuron marker genes [27]. In the human data, we noted significant enrichment of GAL and CARTPT in DE testing between the manually annotated LC and non-LC regions in the SRT samples (Supplementary Table 2). While visualization of the snRNA-seq clustering suggests that CARTPT is expressed in the NE neuron cluster in the snRNA-seq data (Supplementary Figure 13), it was not identified as statistically significant in the DE testing between the NE cluster compared to all other neuronal clusters (Supplementary Table 4). For CALCA and CALCR, we observed no enrichment in the annotated LC regions in the SRT samples, nor in the NE neuron cluster in the snRNA-seq data (Supplementary Tables 2, 4 and Supplementary Figure 13).

Identification of 5-HT neurons and diversity of inhibitory neuron subpopulations in single-nucleus data

In addition to NE neurons, we identified a cluster of likely 5-hydroxytryptamine (5-HT, serotonin) neurons in the unsupervised clustering analyses of the snRNA-seq data (Figure 3A-B) based on expression of 5-HT neuron marker genes (TPH2 and SLC6A4) [40]. This cluster consisted of 186 nuclei (145, 28, and 13 nuclei from donors Br2701, Br6522, and Br8079, respectively). DE testing between the neuronal clusters identified 361 statistically significant genes with elevated expression in the 5-HT neuron cluster, compared to all other neuronal clusters captured in this region, at an FDR threshold of 0.05 and FC threshold of 2. These genes included the 5-HT neuron marker genes TPH2 and SLC6A4 (Figure 3F, Supplementary Figure 12B, and Supplementary Table 5). To investigate the spatial distribution of this population, we visualized the spatial expression of the 5-HT neuron marker genes TPH2 and SLC6A4 in the N=9 initial Visium samples, which showed that this population was distributed across both the LC and non-LC regions (Supplementary Figure 14A-B). Similarly, we did not observe significant spatial enrichment of TPH2 and SLC6A4 expression within the manually annotated LC regions (Supplementary Figure 14C). To further confirm this finding, we applied RNAscope [32] to visualize expression of an NE neuron marker gene (TH) and 5-HT neuron marker genes (TPH2 and SLC6A4) within additional tissue sections from donor Br6522, which demonstrated that the NE and 5-HT marker genes were expressed within distinct cells and these neuronal populations were not localized within the same regions (Supplementary Figure 15).

We also investigated the diversity of inhibitory neuronal subpopulations within the snRNA-seq data from the human LC and surrounding region by applying a secondary round of unsupervised clustering to the inhibitory neuron nuclei identified in the first round of clustering. This identified 14 clusters representing inhibitory neuronal subpopulations, which varied in their expression of several inhibitory neuronal marker genes including CALB1, CALB2, TAC1, CNR1, and VIP (additional marker genes shown in Supplementary Figures 13, 16). In addition, similar to recently published results from mice [27], we found that expression of neuropeptides PNOC, TAC1, and PENK varied across the individual inhibitory neuronal populations (Supplementary Figure 13).

In order to integrate the single-nucleus and spatial data, we also applied a spot-level deconvolution algorithm, cell2location [41], to map the spatial coordinates of NE and 5-HT neurons within the Visium samples. This algorithm integrates the snRNA-seq and SRT data by estimating the cell abundance of the snRNA-seq populations, which are used as reference populations at cellular resolution, at each spatial location (spot) in the Visium samples. This successfully mapped the NE neuron population from the snRNA-seq data to the manually annotated LC regions in the Visium samples (Supplementary Figure 17A). Similarly, for the 5-HT neurons, this population was mapped to the regions where this population was previously identified based on expression of marker genes (Supplementary Figure 17B). However, the estimated absolute cell abundance of these neuronal populations per spot was higher than expected, which may be due to their relatively large size and high transcriptional activity, especially for NE neurons, compared to other neuronal and non-neuronal cell populations.

Co-expression of cholinergic marker genes within NE neurons

We observed expression of cholinergic marker genes, including SLC5A7, which encodes the high-affinity choline transporter, and ACHE, within the NE neuron cluster in the snRNA-seq data (Supplementary Figure 13). Because this result was unexpected, we experimentally confirmed co-expression of SLC5A7 transcripts with transcripts for NE neuron marker genes in individual cells using RNAscope [32] on independent tissue sections from donors Br6522 and Br8079. We used RNAscope probes for SLC5A7 and TH (NE neuron marker), and imaged stained sections at 63x magnification to generate high-resolution images, which allowed us to definitively localize expression of individual transcripts within cell bodies. This confirmed co-expression of SLC5A7 and TH in individual cells in a tissue section from donor Br8079 (Supplementary Figure 18), validating that these transcripts are expressed within the same cells. To further investigate the spatial distribution of the cholinergic marker genes, we visualized expression of SLC5A7 and ACHE in the Visium samples, which showed that these genes were expressed both within and outside the annotated LC regions (Supplementary Figure 19).

Interactive and accessible data resources

We provide freely accessible data resources containing all datasets described in this manuscript, in the form of both interactive web-accessible and downloadable resources (Table 1). The interactive resources can be explored in a web browser via a Shiny [42] app for the Visium SRT data and an Isee [43] app for the snRNA-seq data (Supplementary Figure 20). The data resources are also available from an R/Bioconductor ExperimentHub data package as downloadable objects in the SpatialExperiment [44] and SingleCellExperiment [33] formats, which can be loaded in an R session and contain meta-data including manual annotation labels and cluster labels.

Discussion

Due to the small size and inaccessibility of the LC within the brainstem, this region has been relatively understudied in the human brain, despite its involvement in numerous functions and disease mechanisms. Our dataset provides the first transcriptome-wide characterization of the gene expression landscape of the human locus coeruleus (LC) using spatially-resolved transcriptomics (SRT) and single-nucleus RNA-sequencing (snRNA-seq). Analysis of these data identified a population of norepinephrine (NE) neurons as well as a population of 5-hydroxytryptamine (5-HT, serotonin) neurons, and spatially localized them within the LC and surrounding region. We evaluated expression of previously known marker genes for these populations and identified novel sets of significant differentially expressed (DE) genes, and assessed how their expression varies in space across the neuroanatomy of the region. We compared our findings from the human LC to molecular profiles of LC and peri-LC neurons that were previously characterized in rodents using alternative technological platforms, which confirmed partial conservation of LC-associated genes across these species. Finally, we validated our results using smFISH RNAscope to assess co-localization of marker genes on independent tissue sections.

Identifying genes whose expression is enriched in NE neurons is important because the LC-NE system is implicated in multiple neuropsychiatric and neurological disorders [4,8], and prominent loss of NE cells in the LC occurs in neurodegenerative disorders [810]. Unbiased analysis of the snRNA-seq and SRT data identified a number of genes that are enriched in the human LC region and in LC-NE neurons themselves. As expected, these analyses validated enrichment of genes involved in NE synthesis and packaging (TH, SLC18A2, DBH) as well as NE reuptake (SLC6A2). We also noted expression selectively in LC-NE neurons of a number of genes whose expression is altered in animal models or in human disease in the LC (SSTR2, PHOX2A, PHOX2A) [45,46]. We identified LC enrichment of a number of genes that have been associated at the cellular level with apoptosis, cell loss, or pathology in the context of neurodegeneration (RND3, P2RY1) [4750]. We also noted enrichment of NT5DC2 in LC, a gene which has been associated with attention-deficit hyperactivity disorder (ADHD) and regulates catecholamine synthesis in vitro [51,52]. Localization of these genes to the LC in humans, and LC-NE neurons in particular, may provide important biological insights about physiological function of these neurons and provide context about underlying mechanistic links between these genes and disease risk. Future work using the transcriptome-wide molecular expression profiles of NE neurons at single-nucleus resolution and the LC region at spatial resolution generated here could investigate associations with individual genes and gene sets from genome-wide association studies (GWAS) for these disorders as well as genes more generally associated with aging-related processes.

Our study has several limitations. The SRT data using the 10x Genomics Visium platform captures around 1-10 cells per measurement location in the human brain, and future studies could apply a higher-resolution platform to characterize expression at single-cell or sub-cellular spatial resolution. In the single-nucleus data, we identified a relatively small number of NE neurons, which may be related to technical factors that affect the recovery of this population due to their relatively large size and fragility. These technical factors may have also contributed to the unexpectedly high proportion of mitochondrial reads that we observed in NE neurons in the snRNA-seq data. While mitochondrial reads are not expected in the nuclear compartment, recent studies reported contamination of nuclear preparations in snRNA-seq data with ambient mRNAs from abundant cellular transcripts [53]. Given the relatively elevated energy demand and increased metabolic activity of NE neurons, higher than expected mitochondrial contamination in the nuclear preparation of LC tissue may be plausible. Because NE neurons were the population of highest interest to profile in this dataset, we opted not to perform QC filtering on the proportion of mitochondrial reads, in order to retain this population. Further optimizing technical procedures for cell sorting and cryosectioning to avoid cellular damage, as well as for straining for large cells, could enhance recovery of this population for future, larger-scale snRNA-seq studies in this brain region.

Since the identification of 5-HT neurons in the single-nucleus data was an unexpected finding, the experimental procedures were not designed to optimally recover this population, and the precise anatomical origin of the 5-HT neurons recovered in this dataset is not entirely clear. It is possible that these cells were close to the borders of the LC dissections, residing within the dorsal raphe nucleus, which is neuroanatomically adjacent to the LC. Supporting this hypothesis, RNAscope data in Supplementary Figure 15 from an independent tissue section shows that TPH2 and SLC6A4 expression appears to be distinct from the LC region containing a high density of NE cells. However, there is some evidence for expression of serotonergic markers within the LC region in rodents [26,54,55], and our SRT data does support this possibility in the human brain, although further characterization is needed. Comprehensively understanding the full molecular diversity of 5-HT neurons in the human brain would require dissections that systematically sample across the extent of the dorsal raphe nucleus. Similarly, the identification of cholinergic marker gene expression, particularly the robust expression of SLC5A7 within NE neurons, was unexpected. While previous studies have identified small populations of cholinergic interneurons within or adjacent to the LC in rodents [27], analysis of our data did not classify any of the other neuronal populations as cholinergic per se.

However, both the SRT and RNAscope data (Supplementary Figure 18) supports the hypothesis that expression of cholinergic markers occurs in NE cells themselves, as well as in sparse populations of cholinergic neurons adjacent to the LC region that do not express NE marker genes. We note that our snRNA-seq data may be underpowered to fully identify and classify these sparse populations, and future experiments designed to specifically investigate this finding in more detail could lead to a better understanding of cholinergic signaling within the LC of the human brain. Similarly, our snRNA-seq may be underpowered to perform gene set enrichment studies [56] to identify whether the NE and 5-HT neurons harbor aggregated genetic risk for psychiatric disorders, but these data could be aggregated with future snRNA-seq data generated from the LC to address this question. To facilitate further exploration of these data, we provide a freely accessible data resource that includes the snRNA-seq and SRT data in both web-based and R-based formats, as well as reproducible code for the computational analysis workflows for the snRNA-seq and SRT data.

Materials and Methods

Postmortem human brain tissue samples for RNAscope, SRT, and snRNA-seq assays

Brain donations in the Lieber Institute for Brain Development (LIBD) Human Brain Repository were collected from the Office of the Chief Medical Examiner of the State of Maryland under the Maryland Department of Health’s IRB protocol #12-24, and from the Western Michigan University Homer Stryker MD School of Medicine, Department of Pathology, and the Department of Pathology, University of North Dakota School of Medicine and Health Sciences, both under WCG IRB protocol #20111080. Clinical characterization, diagnoses, and macro- and microscopic neuropathological examinations were performed on all samples using a standardized paradigm, and subjects with evidence of macro- or microscopic neuropathology were excluded. Details of tissue acquisition, handling, processing, dissection, clinical characterization, diagnoses, neuropathological examinations, RNA extraction, and quality control measures have been described previously [57,58]. We obtained tissue blocks from 5 male neurotypical brain donors of European ancestry. To select tissue blocks for study inclusion, we identified the LC in transverse slabs of the pons from fresh-frozen human brain. The LC was identified through visual inspection of the slab, based on neuroanatomical landmarks and the presence of neuromelanin pigmentation. For each donor, a tissue block was dissected from the dorsal aspect of the pons, centered around the LC, using a dental drill. The tissue block was taken at the level of the motor trigeminal nucleus and superior cerebellar peduncle. Tissue blocks were kept at -80 °C until sectioning for experiments. We cut 10 μm tissue sections for performing SRT assays using the 10x Genomics Visium SRT platform [28]. High-resolution images of the H&E stained histology were acquired prior to on-slide cDNA synthesis and completing the Visium assays. Assays were performed on 2-4 tissue sections collected from each of the 5 donors, and the tissue blocks were scored to fit 2-3 tissue sections from the same donor onto a single Visium capture area to maximize the use of the Visium slides. This resulted in a total of N=9 Visium capture areas (hereafter referred to as samples) in the SRT dataset. For 3 of the 5 donors, we cut additional 100 μm cryosections for snRNA-seq assays using the 10x Genomics Chromium snRNA-seq platform [29]. Supplementary Table 1 provides information on brain donor demographics as well as sample information for the SRT and snRNA-seq datasets.

Multiplexed smFISH using RNAscope

For RNAscope experiments, tissue blocks were sectioned at 10 μm and single-molecule fluorescent in situ hybridization assays were performed with RNAscope technology [32] using the Fluorescent Multiplex Kit v.2 and 4-plex Ancillary Kit (catalog no. 323100, 323120 ACD) according to the manufacturer’s instructions. Briefly, 10 μm tissue sections (2-4 sections per donor) were fixed with 10% neutral buffered formalin solution (catalog no. HT501128, Sigma-Aldrich) for 30 min at room temperature, series dehydrated in increasing concentrations of ethanol (50%, 70%, 100%, and 100%), pretreated with hydrogen peroxide for 10 min at room temperature and treated with protease IV for 30 min. For QC experiments to confirm LC inclusion in the tissue block (Figure 1B showing example for additional independent donor Br8689), tissue sections were incubated with 3 different probes (2 LC-NE neuron markers and one pan-neuronal marker): SLC6A2 (catalog no. 526801-C1, Advanced Cell Diagnostics) encoding the norepinephrine transporter, TH (catalog no. 441651-C2, Advanced Cell Diagnostics) encoding tyrosine hydroxylase, and SNAP25 (catalog no. 518851-C3, Advanced Cell Diagnostics). To confirm co-expression of LC-NE marker genes within individual cells (Supplementary Figure 11), we used SLC6A2 (catalog no. 526801-C1, Advanced Cell Diagnostics), TH (catalog no. 441651-C2, Advanced Cell Diagnostics), and DBH (catalog no. 545791-C3, Advanced Cell Diagnostics) encoding dopamine beta-hydroxylase. To localize serotonergic and cholinergic markers within the LC (Supplementary Figure 15), we used TH (catalog no. 441651-C2, Advanced Cell Diagnostics) encoding tyrosine hydroxylase, TPH2 (catalog no. 471451-C1, Advanced Cell Diagnostics) encoding tryptophan hydroxylase 2, SLC6A4 (catalog no. 604741-C3, Advanced Cell Diagnostics) encoding the serotonin transporter, and SLC5A7 (catalog no. 564671-C4, Advanced Cell Diagnostics) encoding the high-affinity choline transporter. After probe labeling, sections were stored overnight in 4x saline-sodium citrate buffer (catalog no. 10128-690, VWR). After amplification steps (AMP1-3), probes were fluorescently labeled with Opal Dyes 520, 570, and 690 (catalog no. FP1487001KT, FP1488001KT, and FP1497001KT, Akoya Biosciences; 1:500 dilutions for all the dyes) and counter-stained with DAPI (4′,6-diamidino-2-phenylindole) to label cell nuclei. Lambda stacks were acquired in z-series using a Zeiss LSM780 confocal microscope equipped with 20x, 0.8 numerical aperture (NA) and 63x, 1.4 NA objectives, a GaAsP spectral detector, and 405-, 488-, 561- and 633-nm lasers. All lambda stacks were acquired with the same imaging settings and laser power intensities.

After image acquisition, lambda stacks in z-series were linearly unmixed using Zen Black (weighted; no autoscale) using reference emission spectral profiles previously created in Zen for the dotdotdot software (git hash v.4e1350b) [31], stitched, maximum intensity projected, and saved as Carl Zeiss Image (.czi) files.

Visium SRT with H&E staining data generation and sequencing

Tissue blocks were embedded in OCT medium and cryosectioned at 10 μm on a cryostat (Leica Biosystems). Briefly, Visium Gene Expression Slides were cooled inside the cryostat and tissue sections were then adhered to the slides. Tissue sections were fixed with methanol and then stained with hematoxylin and eosin (H&E) according to manufacturer’s staining and imaging instructions (User guide CG000160 Rev C). Images of the H&E stained slides were acquired using a CS2 slide scanner (Leica Biosystems) equipped with a color camera and a 20x, 0.75 NA objective and saved as a Tagged Image File (.tif). Following H&E staining and acquisition of images, slides were processed for the Visium assay according to manufacturer’s reagent instructions (Visium Gene Expression protocol User guide CG000239, Rev D) as previously described [22]. In brief, the workflow includes permeabilization of the tissue to allow access to mRNA, followed by reverse transcription, removal of cDNA from the slide, and library construction. Tissue permeabilization experiments were conducted on a single LC sample used in the study and an optimal permeabilization time of 18 minutes was identified and used for all sections across donors. Sequencing libraries were quality controlled and then sequenced on the MiSeq, NextSeq, or NovaSeq Illumina platforms. Sequencing details and summary statistics for each sample are reported in Supplementary Table 1.

snRNA-seq data generation and sequencing

Following SRT data collection, three tissue blocks (Br6522, Br8079, and Br2701) were used for snRNA-seq. Prior to tissue collection for snRNA-seq assays, tissue blocks were further scored based on the RNAscope and SRT data to enrich tissue collection to the localized site of LC-NE neurons. After scoring the tissue blocks, samples were sectioned at 100 μm and collected in cryotubes. The sample collected for Br6522 contained 10 sections, weighing 51 mg, and the samples from Br8079 and Br2701 each contained 15 sections, weighing 60.9 mg and 78.9 mg, respectively. These samples were processed following a modified version of the ‘Frankenstein’ nuclei isolation protocol as previously described [23]. Specifically, chilled EZ lysis buffer (MilliporeSigma) was added to the LoBind microcentrifuge tube (Eppendorf) containing cryosections, and the tissue was gently broken up, on ice, via pipetting. This lysate was transferred to a chilled dounce, rinsing the tube with additional EZ lysis buffer. The tissue was then homogenized using part A and B pestles, respectively, for ∼10 strokes, each, and the homogenate was strained through a 70 μm cell strainer. After lysis, the samples were centrifuged at 500 g at 4 °C for 5 min, supernatant was removed, then the sample was resuspended in EZ lysis buffer. Following a 5 min incubation, samples were centrifuged again. After supernatant removal, wash/resuspension buffer (PBS, 1% BSA, and 0.2 U/uL RNasin), was gently added to the pellet for buffer interchange. After a 5 min incubation, each pellet was again washed, resuspended and centrifuged three times.

For staining, each pellet was resuspended in a wash/resuspension buffer with 3% BSA, and stained with AF488-conjugated anti-NeuN antibody (MiliporeSigma, catalog no. MAB377X), for 30 min on ice with frequent, gentle mixing. After incubation, these samples were washed with 1 mL wash/resuspension buffer, then centrifuged, and after the supernatant was aspirated, the pellet was resuspended in wash/resuspension buffer with propidium iodide (PI), then strained through a 35 μm cell filter attached to a FACS tube. Each sample was then sorted on a Bio-Rad S3e Cell Sorter on ‘Purity’ mode into a 10x Genomics reverse transcription mix, without enzyme. A range of 5637-9000 nuclei were sorted for each sample, aiming for an enrichment of ∼60% singlet, NeuN+ nuclei. Then the 10x Chromium reaction was completed following the Chromium Next GEM Single Cell 3’ Reagent Kits v3.1 (Dual Index) revision A protocol, provided by the manufacturer (10x Genomics) to generate libraries for sequencing. The number of sequencing reads and platform used per sample are shown in Supplementary Table 1.

Analysis of Visium SRT data

This section describes additional details on the computational analyses of the Visium SRT data that are not included in the main text.

Manual alignment of the H&E stained histology images to the expression spot grid was performed using the 10x Genomics Loupe Browser software (v. 5.1.0). The raw sequencing data files (FASTQ files) for the sequenced library samples were processed using the 10x Genomics Space Ranger software (v. 1.3.0) [59] using the human genome reference transcriptome version GRCh38 2020-A (July 7, 2020) provided by 10x Genomics. Sequencing summary statistics for each sample are provided in Supplementary Table 1.

For spot-level QC filtering, we removed outlier spots that were more than 3 median absolute deviations (MADs) above or below the median sum of UMI counts or the median number of detected genes per spot [33]. We did not use the proportion of mitochondrial reads per spot for spot-level QC filtering, since we observed a high proportion of mitochondrial reads in the NE nuclei in the snRNA-seq data (for more details, see Results and Supplementary Figure 9), so using this QC metric would risk removing the rare population of NE nuclei of interest in the snRNA-seq data. Therefore, for consistency with the snRNA-seq analyses, we did not use the proportion of mitochondrial reads for spot-level QC filtering in the SRT data. Due to the large differences in read depth between samples (e.g. median sum of UMI counts ranged from 118 for sample Br6522_LC_2_round1 to 2,252 for sample Br6522_LC_round3; see Supplementary Figure 3A and Supplementary Table 1 for additional details), we performed spot-level QC independently within each sample. The spot-level QC filtering identified a total of 287 low-quality spots (1.4% out of 20,667 total spots) from the N=8 Visium samples that passed sample-level QC. These spots were removed from subsequent analyses (Supplementary Figure 3B). For gene-level QC filtering, we removed low-expressed genes with a total of less than 80 UMI counts summed across the N=8 Visium samples.

For the evaluation of the spatially-aware clustering with BayesSpace [35] to identify the LC regions in a data-driven manner, precision is defined as the proportion of spots in the selected cluster that are from the true annotated LC region, recall is defined as the proportion of spots in the true annotated LC region that are in the selected cluster, F1 score is defined as the harmonic mean of precision and recall (values ranging from 0 to 1, with 1 representing perfect accuracy), and the adjusted Rand index is defined as the percentage of correct assignments, adjusted for chance (values ranging from 0 for random assignment to 1 for perfect assignment).

For the pseudobulked DE testing, we aggregated the reads within the LC and non-LC regions using the scater package [60], and then used the limma package [61] to calculate empirical Bayes moderated DE tests.

Analysis of snRNA-seq data

This section describes additional details on the computational analyses of the snRNA-seq data that are not included in the main text.

We aligned sequencing reads using the 10x Genomics Cell Ranger software [62] (version 6.1.1, cellranger count, with option --include-introns), using the human genome reference transcriptome version GRCh38 2020-A (July 7, 2020) provided by 10x Genomics. We called nuclei (distinguishing barcodes containing nuclei from empty droplets) using Cell Ranger (“filtered” outputs), which recovered 8,979, 3,220, and 10,585 barcodes for donors Br2701, Br6522, and Br8079, respectively. We applied scDblFinder [39] using default parameters to computationally identify and remove doublets, which removed 1,022, 205, and 1,366 barcodes identified as doublets for donors Br2701, Br6522, and Br8079, respectively.

We performed nucleus-level QC processing by defining low-quality nuclei as nuclei with outlier values more than 3 median absolute deviations (MADs) above or below the median sum of UMI counts or the median number of detected genes [33], which did not identify any low-quality nuclei, so all nuclei were retained. We did not use the proportion of mitochondrial reads for QC processing, since we observed a high proportion of mitochondrial reads in nuclei with expression of NE neuron markers (DBH and TH) (for more details, see Results and Supplementary Figure 10). Therefore, QC filtering on the proportion of mitochondrial reads would risk removing the rare population of NE neuron nuclei of interest. We performed gene-level QC filtering by removing low-expressed genes with less than 30 UMI counts summed across all nuclei. After doublet removal and QC processing, we obtained a total of 7,957, 3,015, and 9,219 nuclei from donors Br2701, Br6522, and Br8079, respectively.

For the unsupervised clustering, we used a two-stage clustering algorithm consisting of high-resolution k-means and graph-based clustering that provides sensitivity to identify rare cell populations [33]. For the first round of clustering (results in Figure 3A-B, Supplementary Figure 13), we used 2,000 clusters for the k-means step, and 10 nearest neighbors and Walktrap clustering for the graph-based step. For the secondary clustering of inhibitory neurons (Supplementary Figure 16), we used 1,000 clusters for the k-means step, and 10 nearest neighbors and Walktrap clustering for the graph-based step. We did not perform any batch integration prior to clustering, since batch integration algorithms may strongly affect rare populations but these algorithms have not yet been independently evaluated on datasets with rare populations (<1% of cells). We calculated highly variable genes, log-transformed normalized counts (logcounts), and performed dimensionality reduction using the scater and scran packages [34,60].

For the DE testing, we performed pairwise DE testing between all neuronal clusters, using the findMarkers() function from the scran package [34]. We tested for genes with log2-fold-changes (log2FC) significantly greater than 1 (lfc = 1, direction = “up”) to identify genes with elevated expression in any of the neuronal clusters compared to all other neuronal clusters.

For the spot-level deconvolution using cell2location [41], we used the following parameters for human brain data from the Visium platform: detection_alpha = 20, N_cells_per_location = 3.

Supplementary Tables

Supplementary Table 1: Summary of experimental design, sample information, and donor demographic details. Information includes the types of assays performed, donor demographic details, sample IDs, number of Visium tissue areas per sample, and sequencing summary statistics for each sample. The table is provided as an .xlsx file.

Supplementary Table 2: Differential expression (DE) testing results in pseudobulked Visium SRT data. Columns include gene ID, gene name, mean log-transformed normalized counts (logcounts) in manually annotated LC and non-LC regions (“mean_logcounts_LC” and “mean_logcounts_nonLC”), log2 fold change (log2FC), p-value, false discovery rate (FDR), and columns identifying significant (FDR < 0.05 and FC > 2) and highly significant (FDR < 10−3 and FC > 3) genes. The table is provided as a .csv file.

Supplementary Table 3: Spatially variable genes (SVGs) in Visium SRT data. Results for SVGs identified using nnSVG in Visium SRT data. Columns include gene ID, gene name, overall rank of SVGs identified in replicated tissue areas (“replicated_overall_rank”, i.e. top LC-associated SVGs; see Results), overall rank of identified SVGs according to average rank across tissue areas (“overall_rank”, i.e. including choroid plexus-associated SVGs from one donor; see Results), average rank of identified SVGs across individual tissue areas (“average_rank”), number of times (tissue areas) identified within top 100 SVGs (“n_withinTop100”), and ranks within each individual tissue area. The table is provided as a .csv file.

Supplementary Table 4: Differential expression (DE) testing results for NE neuron cluster in snRNA-seq data. DE testing results comparing NE neuron cluster against all other neuronal clusters in snRNA-seq data. Columns include gene ID, gene name, sum of UMI counts across all nuclei (“sum_gene”), average log-transformed normalized counts (logcounts) within the NE neuron cluster (“self_average”), average of average logcounts within all other neuronal clusters (“other_average”), combined p-value, false discovery rate (FDR), summary log2 fold change in the pairwise comparison with the lowest p-value (“summary_logFC”), and column identifying significant (FDR < 0.05 and FC > 2) genes. The table is provided as a .csv file.

Supplementary Table 5: Differential expression (DE) testing results for 5-HT neuron cluster in snRNA-seq data. DE testing results comparing 5-HT neuron cluster against all other neuronal clusters in snRNA-seq data. Columns include gene ID, gene name, sum of UMI counts across all nuclei (“sum_gene”), average log-transformed normalized counts (logcounts) within the 5-HT neuron cluster (“self_average”), average of average logcounts within all other neuronal clusters (“other_average”), combined p-value, false discovery rate (FDR), summary log2 fold change in the pairwise comparison with the lowest p-value (“summary_logFC”), and column identifying significant (FDR < 0.05 and FC > 2) genes. The table is provided as a .csv file.

Supplementary Figures

Spot-plot visualizations of manually annotated Visium spots within regions identified as containing LC-NE neurons in SRT data.

For each of the N=9 Visium capture areas (hereafter referred to as samples), the spots were manually annotated as being within the LC regions (red) or within the non-LC regions (gray) based on spots containing NE neurons, which were identified by pigmentation, cell size, and morphology on the H&E stained histology images.

Spatial expression of two NE neuron-specific marker genes in Visium samples for quality control (QC) in SRT data.

(A-B) Spot-plot visualizations of NE neuron marker gene expression (TH and SLC6A2, A and B, respectively) in the N=9 Visium samples. Color scale shows UMI counts per spot. One sample (Br5459_LC_round2) did not show clear expression of the NE neuron marker genes. This sample was excluded from subsequent analyses, leaving N=8 Visium capture areas (samples) from 4 out of the 5 donors. (C) Enrichment of NE neuron marker gene expression (TH and SLC6A2) within manually annotated LC regions compared to non-LC regions in the N=8 Visium samples. Boxplots show values as mean log-transformed normalized counts (logcounts) per spot within each region per sample, with samples represented by shapes.

Spot-level quality control (QC) data visualizations for Visium samples in SRT data.

(A) QC metrics, medians per sample (from left to right: sum of UMI counts per spot, number of detected genes per spot, and proportion of mitochondrial reads per spot). Boxplots show median for each QC metric per sample, with samples represented by shapes. (B) Applying thresholds of 3 median absolute deviations (MADs) to the sum of UMI counts and number of detected genes for each sample identified a total of 287 low-quality spots (red) (1.4% out of 20,667 total spots), which were removed from subsequent analyses. We did not use the proportion of mitochondrial reads for spot-level QC filtering (see Methods for more details).

Dimensionality reduction embeddings before and after batch integration across Visium samples in SRT data.

We applied a batch integration tool (Harmony [36]) to remove technical variation in the molecular measurements between the N=8 Visium samples from 4 donors. The integrated measurements were subsequently used as the input for spatially-aware clustering using BayesSpace [35]. (A) Principal component analysis (PCA) (top 2 PCs) calculated on molecular expression measurements, with spots labeled (left to right) by donor ID, round ID, and sample ID, without applying any batch integration. (B) Harmony embeddings (top 2 Harmony embedding dimensions) after applying Harmony batch integration on sample IDs, with spots labeled (left to right) by donor ID, round ID, and sample ID, demonstrating that the technical variation has been reduced.

Identifying LC and non-LC regions in a data-driven manner by spatially-aware unsupervised clustering in SRT data.

We applied a spatially-aware unsupervised clustering algorithm (BayesSpace [35]) to investigate whether the LC and non-LC regions in each Visium sample could be annotated in a data-driven manner. (A) Using BayesSpace with k=5 clusters, we clustered spots from the N=8 Visium samples using the Harmony batch-integrated molecular measurements. Cluster 4 (red) corresponds most closely to the manually annotated LC regions. (B) BayesSpace clustering performance evaluated in terms of concordance between cluster 4 (red) and the manually annotated LC region in each sample. Clustering performance was evaluated in terms of precision, recall, F1 score, and adjusted Rand index (ARI) (see Methods for definitions).

Comparison of spot-level and region-level manual annotations in SRT data.

(A) We manually annotated individual Visium spots (black) overlapping with NE neuron cell bodies within the previously manually annotated LC regions (red), based on pigmentation, cell size, and morphology from the H&E stained histology images, in the N=8 Visium samples. (B) We observed relatively low overlap between spots with expression of the NE neuron marker gene TH (>=2 observed UMI counts per spot) and the set of annotated individual spots. The differences included both false positives (annotated spots that were not TH+) and false negatives (TH+ spots that were not annotated). Therefore, we did not use the spot-level annotations for subsequent analyses, and instead used the LC region-level annotations for all further analyses.

Results from differential expression (DE) analysis to identify expressed genes associated with LC regions in SRT data.

We performed DE testing between the manually annotated LC and non-LC regions by pseudobulking spots, defined as aggregating UMI counts from the combined set of spots, within the annotated LC and non-LC regions in each sample. (A) Using a false discovery rate (FDR) significance threshold of 10−3 and an expression fold-change (FC) threshold of 3 (dashed blue lines), we identified 32 highly significant genes (red points). (B) Using standard significance thresholds of FDR < 0.05 and expression FC > 2, we identified 437 significant genes (red). Vertical axes are on reversed log10 scale, and horizontal axes are on log2 scale. Additional details are provided in Supplementary Table 2.

Results from applying nnSVG to identify spatially variable genes (SVGs) in SRT data.

We applied nnSVG [38], a method to identify spatially variable genes (SVGs), in the Visium SRT samples. We ran nnSVG within each contiguous tissue area containing a manually annotated LC region (13 tissue areas in the N=8 Visium samples) and calculated an overall ranking of top SVGs by averaging the ranks per gene from each tissue area. (A) The top 50 ranked SVGs from this analysis included a subset (11 out of 50) of genes that were highly ranked in samples from only one donor (Br8079, genes highlighted in maroon). We determined that this was due to the inclusion of a section of the choroid plexus adjacent to the LC for this donor. Bars show the number of times (out of 13 tissue areas) each gene was included within the top 100 SVGs. Rows are ordered by overall average ranking in descending order. (B) Spatial expression of CAPS, a choroid plexus marker gene, in the N=8 Visium samples. (C) Histology image showing the two tissue areas for sample Br8079_LC_round3. (D) In order to focus on LC-associated SVGs, we calculated an overall average ranking of SVGs that were each included within the top 100 SVGs in at least 10 out of the 13 tissue areas, which identified 32 highly-ranked, replicated LC-associated SVGs. Boxplots show the ranks in each tissue area. Rows are ordered by the overall average ranking in descending order.

Distribution of nucleus-level quality control (QC) metrics across unsupervised clusters in snRNA-seq data.

(A) Sum of UMI counts per nucleus and cluster, (B) total number of detected genes per nucleus and cluster, (C) percentage of mitochondrial reads per nucleus and cluster, and (D) number of nuclei per cluster. We observed an unexpectedly high percentage of mitochondrial reads in the NE neuron cluster (cluster 6, red, C). Since NE neurons were of particular interest for analysis, we did not remove nuclei with a high percentage of mitochondrial reads during QC filtering.

Supervised identification of NE neuron nuclei by thresholding on expression of NE neuron marker genes in snRNA-seq data.

We applied a supervised strategy to identify NE neuron nuclei by simply thresholding on expression of NE neuron marker genes (selecting nuclei with >=1 UMI counts of both DBH and TH). We observed a higher than expected proportion of mitochondrial reads within this set of nuclei, and did not filter on this parameter during QC processing, in order to retain these nuclei. (A) Percentage of mitochondrial reads within the supervised set of nuclei by donor (Br2701, Br6522, and Br8079). (B) Histogram showing percentage of mitochondrial reads within the supervised set of nuclei across all donors. (C) Venn diagram showing overlap between NE neuron cluster identified by unsupervised clustering (left) and NE neuron population identified by supervised thresholding (right). Values display number of nuclei.

Expression of NE neuron marker genes in individual cells using RNAscope and high-magnification confocal imaging.

We applied RNAscope [32] and high-magnification confocal imaging to visualize expression of NE neuron marker genes (DBH in yellow, TH in green, and SLC6A2 in pink, with white representing all three colors overlapping) and DAPI stain for nuclei (blue) on additional tissue sections from an additional independent donor, Br8689. The figure displays a region from a single tissue section, demonstrating clear co-localization of expression of the three NE neuron marker genes (white points) within individual cells. Scale bar: 20 μm.

DE testing results between neuronal clusters in the LC and surrounding region in snRNA-seq data.

(A) Volcano plot showing 327 statistically significant DE genes (FDR < 0.05 and FC > 2) elevated in expression within the NE neuron cluster compared to all other neuronal clusters captured in this region. The significant DE genes include known NE neuron marker genes (DBH, TH, SLC6A2, and SLC18A2) and mitochondrial genes. (B) Volcano plot showing 361 statistically significant DE genes (FDR < 0.05 and FC > 2) elevated in expression within the 5-HT neuron cluster compared to all other neuronal clusters captured in this region. The significant DE genes include known 5-HT neuron marker genes (TPH2 and SLC6A4). Vertical axes are on reversed log10 scale, and horizontal axes are on log2 scale. Additional details are provided in Supplementary Tables 4, 5.

Unsupervised clustering results showing additional inhibitory neuronal, miscellaneous, and cholinergic marker genes in snRNA-seq data.

Extended form of heatmap displayed in Figure 3A, showing additional inhibitory neuronal marker genes (light blue), miscellaneous marker genes including neuropeptides and receptors included for comparison with [27] (dark blue-purple), and cholinergic marker genes (yellow). We observed diversity in expression of inhibitory neuronal marker genes across inhibitory neuronal subpopulations (additional results in Supplementary Figure 16), and we observed expression of cholinergic marker genes within NE neurons (additional results in Supplementary Figure 18).

Spatial expression and enrichment analysis of 5-HT neuron marker genes in Visium SRT samples.

(A-B) We visualized the spatial expression of 5-HT (5-hydroxytryptamine or serotonin) neuron marker genes (TPH2 and SLC6A4) in the N=9 initial Visium SRT samples within the Visium SRT samples, which showed that the population of 5-HT neurons was distributed across both the LC and non-LC regions. (C) Enrichment of 5-HT neuron marker gene expression (TPH2 and SLC6A4) within manually annotated LC regions compared to non-LC regions in the N=8 Visium SRT samples. Boxplots show values as mean log-transformed normalized counts (logcounts) per spot within each region per sample, with samples represented by shapes.

Expression of NE neuron and 5-HT neuron marker genes using RNAscope.

We applied RNAscope [32] to visualize expression of an NE neuron marker gene (TH) as well as 5-HT neuron marker genes (TPH2 and SLC6A4) within an additional tissue section from donor Br6522, demonstrating that the NE and 5-HT marker genes were expressed within distinct cells and that the NE and 5-HT neuron populations were not localized within the same regions. Scale bar: 500 μm.

Inhibitory neuronal subpopulations identified by secondary unsupervised clustering on inhibitory neurons in snRNA-seq data.

We applied a secondary round of unsupervised clustering to the inhibitory neuron nuclei identified in the first round of clustering. This identified 14 clusters representing inhibitory neuronal subpopulations. Heatmap displays expression of neuronal marker genes (black) and inhibitory neuron marker genes (light blue) (columns) in the 14 clusters (rows). Cluster IDs are shown in labels on the right, and numbers of nuclei per cluster are shown in horizontal bars on the right. Heatmap values represent mean log-transformed normalized counts (logcounts) per cluster.

Spot-level deconvolution to map the spatial coordinates of snRNA-seq populations within the Visium SRT samples.

We applied a spot-level deconvolution algorithm (cell2location [41]) to integrate the snRNA-seq and SRT data by estimating the cell abundance of the snRNA-seq populations, which are used as reference populations, at each spatial location (spot) in the Visium SRT samples. This correctly mapped (A) NE neurons (cluster 6) and (B) 5-HT neurons (cluster 21) to the spatial regions where these populations were previously identified based on expression of marker genes (Supplementary Figures 2 and 14). However, the estimated absolute cell abundance of these populations per spot was higher than expected.

High-resolution images demonstrating co-expression of cholinergic marker gene within NE neurons.

We applied RNAscope [32] and high-resolution imaging at 63x magnification to visualize expression of SLC5A7 (cholinergic marker gene encoding the high affinity choline transporter, shown in pink) and TH (NE neuron marker gene encoding tyrosine hydroxylase, shown in green), and DAPI stain for nuclei (blue), in a tissue section from donor Br8079. This confirmed co-expression of SLC5A7 and TH within individual cells. Scale bar: 25 μm.

Spatial expression of cholinergic marker genes in Visium SRT samples.

We visualized the spatial expression of cholinergic marker genes (A) SLC5A7 and (B) ACHE in the N=9 initial Visium SRT samples, which showed that these genes were expressed both within and outside the annotated LC regions. Color scale shows UMI counts per spot.

Interactive web-accessible data resources.

All datasets described in this manuscript are freely accessible via interactive web apps and downloadable R/Bioconductor objects (see Table 1 for details). (A) Screenshot of Shiny [42] web app providing interactive access to Visium SRT data. (B) Screenshot of iSEE [43] web app providing interactive access to snRNA-seq data.

Back Matter

Acknowledgements

The authors would like to extend their gratitude to the families and next of kin of the donors for their generosity in supporting and expanding our knowledge of the human brain and neuropsychiatric disease. We would also like to thank the physicians and staff of the Office of the Chief Medical Examiner of the State of Maryland, the Western Michigan University Homer Stryker MD School of Medicine, Department of Pathology, and the Department of Pathology, University of North Dakota School of Medicine and Health Sciences Medical Examiners’ office. We would also like to extend our appreciation to Drs. Lewellyn Bigelow and Fernando Goes, and Amy Deep-Soboslay for their provision of the detailed diagnostic evaluation of each case used in this study, James Tooke for his assistance with coordinating dissections within the Lieber Institute for Brain Development (LIBD) Human Brain Repository and Daniel Weinberger for suggestions and advice on the manuscript.

Author Contributions

Conceptualization: KRM, KM, SCH

Data Curation: LMW, HRD, MNT, MT

Formal Analysis: LMW, MNT

Funding acquisition: LMW, KRM, KM, SCH

Investigation: HRD, MNT, SHK, AS, KDM

Methodology: HRD, SHK, MNT, KRM

Project Administration: KRM, KM, SCH

Resources: RB, JEK, TMH

Software: LMW, HRD, LCT

Supervision: SCP, KRM, KM, SCH

Validation: HRD

Visualization: LMW, HRD, MT

Writing – original draft: LMW, HRD, MNT, KM, SCH

Writing – review & editing: LMW, HRD, MNT, KRM, KM, SCH

Funding

Research reported in this publication was supported by the Lieber Institute for Brain Development, National Institutes of Health awards U01MH122849 (KM, SCH), R01DA053581 (KM, SCH), K99HG012229 (LMW), and awards CZF2019-002443 and CZF2018-183446 (SCH) from the Chan Zuckerberg Initiative DAF, an advised fund of Silicon Valley Community Foundation.

Competing Interests

The authors declare that they have no competing interests. Matthew N. Tran (MNT) is now a full-time employee at 23andMe and whose current work is unrelated to the contents of this manuscript. His contributions to this manuscript were made while previously employed at the Lieber Institute for Brain Development (LIBD).

Code Availability

Code scripts to reproduce all analyses and figures in this manuscript, including the computational analysis workflows for the snRNA-seq and SRT data, are available from GitHub at https://github.com/lmweber/locus-c. We used R version 4.2 and Bioconductor version 3.15 packages for analyses in R.

Data Availability

The datasets described in this manuscript are freely accessible in web-based formats from https://libd.shinyapps.io/locus-c_Visium/ (Shiny [42] app containing Visium SRT data) and https://libd.shinyapps.io/locus-c_snRNA-seq/ (iSEE [43] app containing snRNA-seq data), and in R/Bioconductor formats from https://bioconductor.org/packages/WeberDivechaLCdata (R/Bioconductor ExperimentHub data package containing Visium SRT data in SpatialExperiment [44] format and snRNA-seq data in SingleCellExperiment [33] format). The R/Bioconductor data package is available in Bioconductor release version 3.16 from Nov 2, 2022 onwards. Instructions to install and access the R/Bioconductor data package are also available from GitHub at https://github.com/lmweber/WeberDivechaLCdata. Raw data including FASTQ sequence data files and raw image files will be made available from a Globus endpoint.