The landscape of regulatory genes in brain-wide neuronal phenotypes of a vertebrate brain

  1. Hui Zhang
  2. Haifang Wang
  3. Xiaoyu Shen
  4. Xinling Jia
  5. Shuguang Yu
  6. Xiaoying Qiu
  7. Yufan Wang
  8. Jiulin Du  Is a corresponding author
  9. Jun Yan  Is a corresponding author
  10. Jie He  Is a corresponding author
  1. Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, China
  2. University of Chinese Academy of Sciences, China
  3. Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, China
  4. School of Future Technology, University of Chinese Academy of Sciences, China

Abstract

Multidimensional landscapes of regulatory genes in neuronal phenotypes at whole-brain levels in the vertebrate remain elusive. We generated single-cell transcriptomes of ~67,000 region- and neurotransmitter/neuromodulator-identifiable cells from larval zebrafish brains. Hierarchical clustering based on effector gene profiles (‘terminal features’) distinguished major brain cell types. Sister clusters at hierarchical termini displayed similar terminal features. It was further verified by a population-level statistical method. Intriguingly, glutamatergic/GABAergic sister clusters mostly expressed distinct transcription factor (TF) profiles (‘convergent pattern’), whereas neuromodulator-type sister clusters predominantly expressed the same TF profiles (‘matched pattern’). Interestingly, glutamatergic/GABAergic clusters with similar TF profiles could also display different terminal features (‘divergent pattern’). It led us to identify a library of RNA-binding proteins that differentially marked divergent pair clusters, suggesting the post-transcriptional regulation of neuron diversification. Thus, our findings reveal multidimensional landscapes of transcriptional and post-transcriptional regulators in whole-brain neuronal phenotypes in the zebrafish brain.

eLife digest

The brain harbors an astounding diversity of interconnected cells. Each cell contains the same basic set of genetic instructions, but only a fraction of the genome is used in each cell to assemble proteins. This selective gene expression gives rise to each cell’s characteristic properties, such as their shape and location, or whether they can activate or inhibit neighbouring cells.

How these defining features are encoded on a genetic level and selectively activated in cells to produce such diversity in the brain is not fully understood. One way to study gene expression in single cells involves profiling the transcriptome, the full range of intermediary RNA molecules a cell produces from its genes to make proteins.

Zhang et al. used transcriptome profiling to better understand how thousands of regulatory genes encoding regulatory proteins called transcription factors create different types of neurons in the zebrafish brain, which is similar to but much simpler than the human brain. To do so, they analysed transcriptome data extracted from cell populations located in specific brain regions and displaying different properties.

Zhang et al. identified distinct clusters of neurons in the larval zebrafish brain. Mathematical models then analysed the transcriptome profiles of these neuronal clusters with characteristic features. They revealed that neurons with similar characteristics did not necessarily share the same transcription factors. In other words, distinct sets of transcription factors gave rise to the same types of cells. Zhang et al. described this observation as a ‘convergent’ pattern. On the contrary, some neurons with dissimilar features expressed the same sorts of transcription factors, suggesting a ‘divergent’ developmental pattern also exists.

In summary, this work sheds light on variable gene expression patterns akin to design principles that shape neuronal diversity in the brain. It gives a new appreciation of how neuronal subtypes result from a complex set of regulatory factors controlling gene expression.

Introduction

The vertebrate brain harbors highly diverse neuronal types that are specifically interconnected to form functional circuits (Kepecs and Fishell, 2014; Armañanzas and Ascoli, 2015; Moffitt et al., 2018). The brain comprises conserved major cell types (neurons, progenitors, glia, endothelial cells, etc.) (Marques et al., 2016; Saunders et al., 2018; Hodge et al., 2019). Previous studies have indicated that neuronal types could be well characterized by their features, including electrophysiological properties, neurotransmitter/modulator identity, synaptic connectivity, brain region identity, and cellular morphology (Zeisel et al., 2015; Pandey et al., 2018; Zeisel et al., 2018). How these neuronal features are molecularly encoded at the whole-brain level is an important issue in neuroscience.

With the advent of single-cell RNA-sequencing (scRNA-seq) technology, recent studies have begun to characterize neuronal diversity by analyzing single-cell transcriptomes of large populations in the whole brain or specific brain regions of mice and humans (Darmanis et al., 2015; Lake et al., 2016; Poulin et al., 2016; Tasic et al., 2016; Chen et al., 2017; Tasic et al., 2018). These studies have provided compelling evidence that individual neuronal types could be well characterized by their transcriptomes (Poulin et al., 2016; Tasic et al., 2018; Sugino et al., 2019). For instance, scRNA-seq analyses of half a million cells from the mouse brain showed that neurons could be classified by genes related to neuronal connectivity, synaptic transmission, and membrane conductance (Zeisel et al., 2018). Besides, the scRNA-seq analysis of nearly 200 genetically marked mouse neuronal populations also showed that neurons could be classified by the expression level of various transcription factors (TFs), ion channels, synaptic proteins, and cell adhesion molecules (Sugino et al., 2019). These studies provided extensive information on the molecules that could be used to define neuronal types.

Generally, these molecules could be sub-divided into two primary categories: regulatory genes and effector genes. Regulatory genes encode proteins involved in gene transcription and translation (e.g., TFs and post-transcriptional regulators), while effector genes encode proteins serving specific neuronal terminal features (e.g., synaptic proteins, ion channels, transporters, and receptors) (Kratsios et al., 2015; Zeisel et al., 2015; Paul et al., 2017, Zeisel et al., 2018; Reilly et al., 2020). Regulatory genes are critical for establishing and maintaining effector gene profiles that resulted in diverse neuronal types. Interestingly, distinct TFs can determine neuronal types with similar neurotransmitter identities in the Caenorhabditis elegans, arguing for phenotypic convergence (Serrano-Saiz et al., 2013; Gendrel et al., 2016; Hobert and Kratsios, 2019). Remarkably, this phenotypic convergence has also been reported in the Drosophila’s optic lobe (Konstantinides et al., 2018). However, the multidimensional landscapes of regulatory genes in vertebrate whole-brain neuronal phenotypes remain elusive.

The larval zebrafish brain comprises only about 100,000 cells, thus providing an outstanding vertebrate model for studying cell diversity within the entire brain, using single-cell transcriptome analysis with full cell coverage by the 10× Genomics Platform. Previous scRNA analysis of the zebrafish nervous system has elegantly demonstrated the temporal dynamics of brain cell development (Raj et al., 2018; Raj et al., 2020). In this study, we generated the multidimensional landscape of regulator genes in effector gene-based neuronal phenotypes (terminal features) at the whole-brain level by combining single-cell transcriptome data obtained from the whole brain, specific brain regions, as well as neurotransmitter- and neuromodulator-defined neuronal populations. We found that, at the transcriptional and post-transcriptional levels, glutamatergic/GABAergic neurons with the same terminal features could express different TF profiles, while those with different terminal features could express the same TF but different RNA-binding protein (RBP) profiles. In contrast, neuromodulator-type neurons that display particular terminal features expressed unique TF profiles. Thus, our findings reveal multidimensional landscapes of transcriptional and post-transcriptional regulators in the whole zebrafish brain.

Results

Molecular classification of whole-brain cells in zebrafish

To uncover the transcriptomic profiles of diverse cell types with regional identity in the larval zebrafish whole brain at single-cell resolution, we dissected and dissociated cells from the whole brain (n = 4), four specific brain regions (n = 2 each), including the forebrain (Fore), optic tectum (OT), hindbrain (Hind), and the region underneath the optic tectum (sub-OT) in the 8 days post fertilization (dpf) zebrafish (Figure 1—figure supplement 1A). We performed scRNA-seq of these cells using the 10× Genomics Chromium 3’ v2 platform. The libraries were sequenced to a mean depth of 126,651 reads per library, with a median of 1891 UMI and 866 genes per cell (Figure 1—source data 1). Reproducibility of transcription analysis was shown by the finding that replicates of whole-brain samples and individual brain regions were primarily overlapped in the t-distributed stochastic neighbor embedding (t-SNE) plots (Figure 1—figure supplement 1B-C). We obtained the transcriptomes of 65,253 cells and selected 45,746 cells for further analysis after filtering out low-quality data (Materials and methods). Then aggregated all cells into 68 clusters using high-variance genes (n = 1402) in t-SNE plot (Figure 1A). The Jaccard index-based analysis showed that most clusters had a high Jaccard index greater than 0.6, indicating the robustness of these clusters (details in Materials and methods) (Tang et al., 2020) except for a few neuronal clusters from sub-OT (Clusters 52, 66), and non-neuronal clusters (neuroprogenitors: Clusters 37, 56; radial astrocytes: 65, Figure 1—figure supplement 1E).

Figure 1 with 2 supplements see all
Molecular classification of whole-brain cells in larval zebrafish.

(A) The t-distributed stochastic neighbor embedding (t-SNE) plot of 45,746 single-cell transcriptomes pooled from whole brains (n = 4) and four different individual brain regions (n = 2 each). The pooled cells were aggregated into 68 clusters, marked by a number. Each color-coded the major cell type as F. (B) The schematic showing different samples separately examined by single-cell RNA-sequencing on 10× Genomics Drop-seq platform: whole brain (n = 4), forebrain (Fore, n = 2), optic tectum (OT, n = 2), hindbrain (Hind, n = 2), and the region underneath the optic tectum (sub-OT, n = 2). OB: olfactory bulb; Tel: telencephalon; OT: optic tectum; Th: thalamus; H: hypothalamus; Pit: pituitary; Ce: cerebellum; MO: medulla oblongata. (C) Venn plots showing the differentially expressed genes in four major cell types identified by cell-type marker genes (vglut+, glutamatergic neurons, Glu; gad1b+, GABAergic neurons, Gaba; pcna+, neuroprogenitors, P; cx43+, radial astrocytes, R) in three brain regions (Fore, Hind, and OT). Commonly expressed genes in all cell types for a given brain region were identified as region-specific genes: six for forebrain (Fore), one for optic tectum (OT), and one for hindbrain (Hind), with genes listed below. (D) Dot plot showing the expression levels of region-specific marker genes in four major cell types (colored circles as C) in three brain regions. The gray level represents the average expression; dot size represents the percentage of cells expressing the marker genes. (E) Lawson-Hanson algorithm for non-negative least squares (NNLS) analysis showed cell clusters of Fore, OT, and Hind exhibited a high correlation with their counterparts of the juvenile zebrafish. Degree of correlation in marker genes is coded by the gray level and size of circle. (F) The dendrogram for the taxonomy of 68 identified clusters based on effector gene profiles (n = 1099). Main branches of neuronal and non-neuronal cells were classified into six branches (red dashed line) that include: I, cerebellum and habenula (hb); IIa, glutamatergic neurons (Glu); IIb, inhibitory neurons (Gaba); III, neuroprogenitors (P); IV, radial astrocytes (R); V, others, including microglia, endothelial cells, and oligodendrocytes. The colored dots and squares below indicate their regional origins and neurotransmitter-type, respectively.

Figure 1—source data 1

Bioinformatics processing of raw reads of single-cell samples.

https://cdn.elifesciences.org/articles/68224/elife-68224-fig1-data1-v2.xlsx
Figure 1—source data 2

The annotation of 68 clusters of whole-brain sample.

https://cdn.elifesciences.org/articles/68224/elife-68224-fig1-data2-v2.xlsx
Figure 1—source data 3

The regional origins and neurotransmitter-type annotation of each whole-brain cluster with well-known markers.

https://cdn.elifesciences.org/articles/68224/elife-68224-fig1-data3-v2.xlsx
Figure 1—source data 4

Top 20 marker genes of whole-brain larval zebrafish 68 clusters.

https://cdn.elifesciences.org/articles/68224/elife-68224-fig1-data4-v2.xlsx
Figure 1—source data 5

Marker genes of major six cell type in whole brain.

https://cdn.elifesciences.org/articles/68224/elife-68224-fig1-data5-v2.xlsx

Each cluster was annotated according to cell-type-specific marker genes from the literature or ZFIN (Zebrafish Information Network) database (Figure 1—source data 2). To assign the brain region identity for each cluster, cells from each of the four specific regions (Fore, OT, sub-OT, Hind) were found to cover multiple but non-overlapping clusters and all 68 clusters could be assigned with their brain region origins (Figure 1—figure supplement 2A-B, and Figure 1—source data 3). Furthermore, to identify potential brain region-specific markers that exist in all cell types, we identified all genes that were differentially expressed in each region for all major cell types (vglut+, glutamatergic neurons; gad+, inhibitory neurons; pcna+, neuroprogenitors; cx43+, radial astrocytes). The differentially expressed genes shared by all cell types were considered to represent the targeted region-specific markers (Figure 1C). We indeed identified a small set of genes specific to each brain region independent of cell type. For instance, foxg1a, en2a, and hoxb3a were explicitly expressed in all cell types of the Fore, OT, and Hind, respectively (Figure 1D). Interestingly, these brain region-specific genes also exhibited a conserved region-specific expression pattern in the mouse brain (Hanks et al., 1995, Manzanares et al., 2001, Kumamoto and Hanashima, 2017). We found no specific gene for the sub-OT, probably due to the diverse brain structures in this region. These region-specific genes, which may be involved in forming regional identity during brain development, could be used to study region-specific neuronal connectivity and function.

To assign the neurotransmitter/modulator identity for each cluster, we used the marker genes that were specific to primary neurotransmitter/modulator phenotypes, including slc17a6b (glutamatergic), gad1b (GABAergic), slc6a5 (glycinergic), th (dopaminergic [DA]), tph2 (serotonergic [5-HT]), and chata (cholinergic [ChAT]) (Figure 1—figure supplement 2C, Figure 1—source data 3). The ratio of glutamatergic to GABAergic neurons was the highest in the forebrain and lowest in the hindbrain, indicating that glutamatergic neurons predominantly belonged to the forebrain, whereas glycinergic neurons mainly resided in the hindbrain (Figure 1—figure supplement 2D). These regional patterns of neurons expressing different neurotransmitter types were validated using the transgenic fishlines: Tg (vglut2a:loxp-DsRed-loxp-GFP), Tg (gad1b:EGFP), and Tg (glyT2:GFP), each exhibiting distinct labeling of glutamatergic, GABAergic, and glycinergic neurons, respectively (Figure 1—figure supplement 2E).

Moreover, the Lawson-Hanson algorithm for non-negative least squares (NNLS) analysis using cluster-specific marker genes (top 20, Figure 1—source data 4) showed that these 68 clusters exhibited a high overlap with their counterparts in the juvenile zebrafish brain recently reported (Raj et al., 2018). Meanwhile, clusters with different regional origins or cell types also exhibited a high correlation with their counterparts in the juvenile zebrafish (Figure 1E and Figure 1—figure supplement 2F). Thus, our analysis indicated that the brain at 8 dpf mostly represented cellular diversity in the juvenile brain.

Furthermore, Gene Ontology (GO) analysis of 1402 variable genes used for the classification of all 68 clusters showed that the majority of these genes (78.4%, n = 1099) were effector genes, which could be classified as neuropeptides, receptors, transporters, ion channels, synaptic proteins, and cell adhesion molecules (Figure 1—figure supplement 2G). This result suggests the importance of effector gene profiles in brain cell classification, which has also been appreciated by previous studies in different species (Paul et al., 2017, Hodge et al., 2019).

We thus generated the hierarchical classification of all 68 cell clusters using the profiles of these 1099 effector genes. All 68 clusters were first segregated into two groups, neuronal cells (48 clusters, 37,880 cells) and non-neuronal cells (20 clusters, 7866 cells) (Figure 1F). Among non-neuronal cells, oligodendrocytes (Clusters 40, 42; olig2+, sox10+), microglia (Cluster 55; apoeb+, mpeg1.1+), endothelial cells (Clusters 48, 57; fxyd1+, rbp4+), erythrocytes (Cluster 67; hbbe1.2+, hbae5+), radial astrocytes (Clusters 21, 46, 65; cx43+, glua+), and neuroprogenitors (Clusters 8, 14, 18, 28, 29, 33, 34, 37, 53, 56; pcna+, cdk1+) were identified according to putative marker genes (Figure 1—source data 5). On the other hand, among neuronal cells, the first segregation defined three classes of neurons (branch I, cerebellum and habenula; branch IIa, glutamatergic neurons, branch IIb, inhibitory neurons). Branch I consisted of granule cells (Clusters 19), torus longitudinals (Cluster 45), cranial ganglions (Clusters 19, 50, 63), dorsal and ventral habenula neurons (Clusters 35 and 59). Branch IIa included 22 subclasses of excitatory glutamatergic neurons (vgluta2a+-gad1b--glyt2-), whereas Branch IIb inhibitory neurons included 17 subclasses of GABAergic neurons (vgluta2a-/gad1b+) and 1 subclass of glycinergic neurons (vgluta2a--gad1b+-glyt2+, Cluster17, Figure 1—source data 5).

Molecular classification of neuromodulator-type neurons

To further examine the expression of neuromodulators at the whole-brain level, we performed the scRNA-seq analysis of neuromodulator neurons sorted from the whole brain of Tg (ETvmat2:GFP) transgenic fish (Figure 2A). Using this fishline, we could examine transcriptomes of DA neurons, 5-hydroxytryptamine (5-HT) neurons, and norepinephrinergic (NE) neurons (Wen et al., 2008). After the filtering procedures described above, we obtained a total of 5368 vmat2-expressing cells (Materials and methods). The analysis aggregated these cells into 22 clusters in t-SNE plots (Figure 2B), more than those found using whole-brain samples (two DA, one 5-HT, and three ChAT; Figure 1—source data 2). To further validate the stability of clusters, we used the Jaccard index, which showed that the majority of clusters (20/22) were stable using mean/median Jaccard index >0.6 as cutoff (Figure 2—figure supplement 1A).

Figure 2 with 1 supplement see all
Molecular classification of neuromodulator-type neurons.

(A) The schematic showing the procedure of collecting single-cell transcriptomes of neuromodulator neurons with fluorescence-activated cell sorting (FACS). Using Tg (ETvmat2:GFP) fishline, we could isolate dopaminergic (DA), serotonergic (5-HT), and norepinephrinergic (NE) neurons. (B) The t-distributed stochastic neighbor embedding (t-SNE) plot of 5368 cells obtained from Tg (ETvmat2:GFP) fishline expressing monoaminergic neuromodulators, showing 22 clusters, each marked by a number, color-coding brain regions and cell type of each cluster. (C) Dot plot showing the expression of glutamatergic marker (vglut2a/vglut2b/vglu1/vglu3) and GABAergic marker (gad1b/gad2) in each of the 22 vmat2+ clusters in B. The neurotransmitter phenotypes were color-coded. Empty squares depict the ones with undefined neurotransmitter phenotypes. The average expression levels of these genes for all cells in each cluster were coded by the gray level. The percentage of cells expressing each gene within each cluster was coded by dot size.

Figure 2—source data 1

The annotation of neuromodulator-type neuronal types with well-known markers.

https://cdn.elifesciences.org/articles/68224/elife-68224-fig2-data1-v2.xlsx

According to region-specific marker genes identified above and known marker genes for neuromodulator neurons, we assigned seven clusters (Fore: Clusters 18–19; sub-OT: Clusters 16, 20; OT: Cluster 22; and Clusters 17, 21 without specific regional identity) as DA neurons, 14 clusters (Hind: Cluster 10; sub-OT: Clusters 1, 2, 6, 7; and Clusters 3–5, 8–9, 11–14 without regional identity) as 5-HT neurons, one cluster (Hind: Cluster 15) as NE neurons (Figure 2—figure supplement 1C-D, Figure 2—source data 1).

Further examination of these neuromodulator clusters for their expression of specific neurotransmitters showed that the majority of 5-HT (8/14) and DA clusters (5/7) expressed GABAergic markers gad1b/gad2 (Figure 2C). This result was further validated by Tg (ETvmat2:GFP::gad1b:gal4::uas:mCherry) (Figure 2—figure supplement 1E). Only three 5-HT clusters (Clusters 11–13) expressed glutamatergic marker vglut3, two DA clusters (Clusters 16, 17), and one NE cluster (Cluster 15) expressed glutamatergic markers, vglut2a and vglut2b (Figure 2C). These results are consistent with previous studies (Filippi et al., 2014). Besides, we also found three clusters (Clusters 24, 41, and 64) in whole brain showed choline and glutamate preferential co-expression (Figure 1—source data 2). Thus, our analysis provided a whole-brain characterization of the co-expression patterns of neurotransmitters and neuromodulators. In sum, DA and 5-HT neurons preferentially expressed GABAergic markers, whereas NE and ChAT neurons mostly expressed glutamatergic markers.

The TF regulatory landscape of whole-brain glutamatergic/GABAergic neuron clusters

In the hierarchical classification based on effector gene profiles (Figure 1F), glutamatergic/GABAergic clusters (n = 39) at the terminus pairs represented the ones with the most similar terminal features, termed ‘sister clusters’. In addition, the certainty of this hierarchical classification was verified by the bootstrap re-sampling analysis using pvclust v.2.0 (Figure 3—figure supplement 1A; Suzuki and Shimodaira, 2006). We identified 11 pairs of sister clusters, neurons in each pair exhibited the same neurotransmitter types (Figure 3—figure supplement 2A). To our surprise, neurons of each sister clusters could be from either the same (n = 6) or different (n = 5) brain regions (Figure 3—figure supplement 2A), which did not reflect the strong brain region preference.

To further examine the TF profiles of these effector gene-based sister clusters, we classified glutamatergic (IIa) and inhibitory (IIb) neurotransmitter-type neurons using TF profiles. Notably, TF-based and effector gene-based trees were distinct in terms of matching node (only one matching node: Clusters 9/61) and tree distances (tree distance = 0.71, Figure 3—figure supplement 1B), suggesting that the effector gene-based sister clusters (Figure 3—figure supplement 2A) might express different TF profiles.

We found out of 11 effector genes-based sister clusters, only one pairs could be found in TF-based sister clusters (Clusters 9/61, ‘matched pattern’, Figure 3—figure supplement 2B). And other 10 effector gene-based sister clusters were separated in TF-based classification, suggesting that neurons with similar terminal features mostly expressed different TF profiles (‘convergent pattern’; Figure 3—figure supplement 2C). Also, neurons in each of these 10 sister clusters could come from either the same (n = 5) or different (n = 5) brain regions, exhibiting no brain region preference (Figure 3—figure supplement 2C).

Alternatively, we performed the population-level statistical analysis to compare the landscape of TF and effector gene expression accounting for the full spectrum of cell types rather than just the most similar sister clusters. For all glutamatergic/GABAergic neuronal clusters (n = 39), we calculated the distances between every two clusters (C392) based on either effector gene profiles or TF profiles, and then defined the pairs, which had the lowest 10% distances after ranking, as similar pair clusters (Figure 3A). Then, we defined paired clusters that were similar in both TF and effector gene profiles as ‘matched pattern’, those paired clusters that were similar in effector gene profiles but not in TF profiles as ‘convergent pattern’ (Figure 3B). The population-level analysis identified 19 pairs of effector gene-based similar pair clusters, 5 with matched pattern and 14 with convergent pattern (Figure 3—figure supplement 2D-G). Overall, similar pair clusters with either matched or convergent pattern identified by the population-level analysis showed an overlapping but distinct pattern with those identified by the hierarchical sister cluster analysis (Figure 3B). This discrepancy was likely due to the following facts: (1) In the population-level statistical analysis, we arbitrarily set the lowest threshold as a criterion to identify similar pair clusters. Thus, the levels of this threshold could influence the production of similar pair clusters. (2) In population-level statistical analysis, each cluster could use for multiple times, whereas in hierarchical sister cluster analysis, once a cluster was selected as a pair with another cluster, it could not be re-used again. To overcome this discrepancy, we intersected the results from hierarchical sister cluster analysis and population-level statistical analysis, and identified eight pairs of effector gene-based glutamatergic/GABAergic similar pair clusters, one with matched pattern and seven with convergent pattern (Figure 3B).

Figure 3 with 4 supplements see all
The transcription factor (TF) regulatory landscape in whole-brain neuronal clusters.

(A) Schematic showing the strategies to assess the cluster similarity based on effector gene and TF profiles. We focused on clusters of whole-brain glutamatergic/GABAergic neurons and neuromodulator neurons. Similar pair clusters were identified by two strategies: the first strategy was based on hierarchical sister clusters. The second strategy was based on population-level statistical analysis, in which we calculated and ranked the distances of every two clusters from 39 neurotransmitter-type clusters (C392) and chose the ones with the lowest 10% distance as similar pair clusters. (B) Left: schematic showing the criteria of three patterns: pair clusters that were similar in both TF and effector gene profiles as ‘matched pattern’, those pair clusters that were similar in effector gene profiles but not in TF profiles as ‘convergent pattern’, and those pair clusters that were similar in TF profiles but not in effector gene profiles as ‘divergent pattern’. Right: the plot showing the number of each pattern using two strategies in A. The red dashed circle showing the number of cluster pairs with given pattern based on hierarchical sister cluster analysis; the black solid circle showing the number of cluster pairs with given pattern based on population-level statistical analysis. (C–D) Violin plots showing the expression of TFs (yellow) or effector genes (black) in glutamatergic and GABAergic similar pair clusters of convergent pattern. (E) The bar plot showing the proportions of different patterns for neuronal clusters with neurotransmitter or neuromodulator types. The numbers of each pattern were indicated. Fisher′s exact test was used to test the significant association of different patterns, p = 0.02564, *p < 0.05. (F–G) Violin plots showing the expression of TF profiles (yellow) or effector gene profiles (black) in neuronal clusters of divergent pattern.

We further validated these patterns of paired neuronal clusters by subsampling of genes (80% of total either TFs or effector genes) and average statistics over 20 times to re-identify similar paired clusters based on either TFs or effector genes (Materials and methods). Notably, re-identified paired clusters of different patterns completely recapitulated those pairs identified using the population-level statistical analysis above (Figure 3—figure supplement 2L), indicating the robustness of these patterns.

By combining sister cluster analysis and the population-level statistical analysis, we identified one paired clusters of ‘matched pattern’, seven paired clusters of ‘convergent pattern’ (Figure 3B). Here were some representative cases with the convergent pattern. The first case was a glutamatergic pair cluster from different brain regions: tectal glutamatergic Cluster 1 and hindbrain glutamatergic Cluster 31 shared effector gene profiles including camk2n1a/stmn4/cbln2b/olfm3a/cd63, but differentially expressed TF profiles, atf5b/bhlhe41/lhx1a and ddit3/cebpb/lef1, respectively (Figure 3C). The second case was GABAergic pair cluster from different brain regions: GABAergic clusters in the forebrain (Cluster 36) and the sub-OT (Cluster 16) shared effector gene profiles mcl1a/cbln1/tspan18a/dusp5/ncaldb, but differentially expressed TF profiles, tbr1b/foxg1a/barhl2 and otpa/otpb/six6a, respectively (Figure 3D). Consistently, each of the above TF profiles has previously been characterized to participate in the specification of either glutamatergic or GABAergic neurons (Li et al., 2007; Kala et al., 2009; Talbot et al., 2010; Waite et al., 2011; Achim et al., 2013).

The TF regulatory landscape of neuromodulatory neuron clusters

We further examined neuromodulator-type sister clusters based on either TF or effector gene profiles (Figure 3—figure supplement 3A). Using hierarchical sister cluster analysis, 5/8 of neuromodulator-type sister clusters matched with those defined by TF profiles (‘matched pattern’), 3/8 sister cluster were found to be separated in TF-based classification (‘convergent pattern’, Figure 3—figure supplement 3D-G).

Then, we also performed the population-level statistical analysis using all genes or 80% genes (subsampling) to compare the landscape of TF and effector gene expression. We found eight pairs with matched pattern and one pair with convergent pattern (Figure 3—figure supplement 3H-I). Thus, we intersected hierarchical sister cluster analysis and population-level statistical analysis, and identified six pairs of similar neuromodulator pair clusters, five with matched pattern, and one with convergent pattern (Figure 3—figure supplement 3J-L).

Together with the above results of neuromodulator-type and neurotransmitter types, our analysis showed that neuromodulator pairs with similar effector gene profiles predominantly were ‘matched’ pattern, whereas neurotransmitter pairs with similar effector gene profiles mainly were ‘convergent’ pattern (Figure 3E). Moreover, we performed global tree measurement using the R package ‘TreeDist’, and found that the distance between TF-based and effector-based hierarchical tree of neuromodulator neurons (Tree distance = 0.38, Figure 3—figure supplement 3A-B) was lower than that of glutamatergic/GABAergic neurons (tree distance = 0.71, Figure 3—figure supplement 1B), suggesting distinct TF regulatory logic of effector-based phenotypes between neurotransmitter-type and neuromodulator-type at the global level. The smaller distance for neuromodulator neurons was consistent with our conclusion that neuromodulator-type clusters predominantly expressed the same TF profiles (‘matched’). The ‘matched’ pattern suggested the generation of similar neuromodulator pairs shared specific TF programs (‘stereotyped programming’), and the ‘convergent’ pattern suggested the generation of similar neurotransmitter pairs could be generated by different TF programs (‘flexible programming’). These different strategies may account for the fact that across species, neuromodulator types are more conserved, whereas neurotransmitter types are much diverse and variable (La Manno et al., 2016; Saunders et al., 2018; Tiklová et al., 2019; Poulin et al., 2020).

Different neuronal clusters exhibit similar TF profiles

On the other hand, in glutamatergic/GABAergic neuronal classification, we surprisingly found that 13 pairs of sister clusters with similar TF profiles were separated in the effector gene-based classification. In other words, different from matched and convergent pairs mentioned above (Figure 3B–D), these 13 glutamatergic/GABAergic clusters exhibited different effector gene profiles but similar TF profiles, here terms as ‘divergent’ pattern (Figure 3—figure supplement 4A). Also, the population-level statistical analysis identified divergent pairs (n = 15, Figure 3—figure supplement 4B) that were largely overlapped with those identified by hierarchical sister analysis (n = 10, Figure 3B).

Neurons in each of these 10 divergent paired clusters could be from the same (n = 6) or different (n = 4) brain regions (Figure 3—figure supplement 4A). More strikingly, two pairs of neuronal clusters with divergent pattern expressed different neurotransmitters. For examples, sub-tectal glutamatergic neurons (cluster 24) and hindbrain GABAergic neurons (cluster 17) shared similar TF profiles id2a/pax2a/zfhx4/meis1b, but expressed different effector gene profiles, cpne4a/c1ql4b/slc5a7a and gad1b/slc6a1b/slc6a5, respectively (Figure 3F); forebrain glutamatergic neurons (Cluster 30) and tectal GABAergic neurons (Cluster 11) shared TF profiles, bhlhe41/bcl11ba/mef2cb, but had differentially expressed effector gene profiles, adcyap1b/syt5b/rprml/gadd45bb, and kcnip1b/cxcl4/plxdc1, respectively (Figure 3G). This result illustrated that neurons with different terminal features could also express the same TF profiles. Together, our analysis demonstrated the landscape of TFs in whole-brain-wide neuronal phenotypes in the larval zebrafish brain.

Combinatorial TFs in marking tectal morphological subclasses

Neuronal morphology and effector gene profile are two critical criteria of neuron diversity classification (Sugino et al., 2019; Peng et al., 2021). Also, many previous studies have provided the apparent links between effector genes and neuron morphology (Whitford et al., 2002, Marcette et al., 2014; Delandre et al., 2016; Noblett et al., 2019; Peng et al., 2021). Thus, after we found three patterns (‘matched’, ‘convergent’, and ‘divergent’), we wondered if similar patterns present between morphological subclasses and TFs.

To experimentally verify the relationship between TFs and neuronal phenotypes, we focused on morphological subtypes of tectal glutamatergic neurons. Higher-coverage transcriptomes of tectal glutamatergic neurons were further achieved by scRNA-seq of sorted tectal glutamatergic neurons in 8 dpf using Tg (vglut2a:loxp-DsRed-loxp-GFP) fishline (Figure 4A). Canonical correlation analysis (CCA) of cells from two independent experiments indicated the reproducibility of the analysis (Figure 4—figure supplement 1A). Analysis of 3, 883 sorted tectal glutamatergic neurons yielded 11 clusters (n = 11, Figure 4A), more than those identified from the whole-brain samples (n = 5, Figure 4—figure supplement 1B). The majority of clusters (10/11) were stable using mean/median Jaccard index >0.6 as cutoff (Figure 4—figure supplement 1D). Each cluster of tectal glutamatergic neurons was identified by TF marker genes (Figure 4—figure supplement 1E-F, Figure 4—source data 1).

Figure 4 with 1 supplement see all
Combinatorial transcription factors (TFs) in marking tectal morphological subclasses.

(A) Left: the schematic showing the procedure of collecting single-cell transcriptomes of tectal glutamatergic neurons with fluorescence-activated cell sorting (FACS) using Tg (vglut2a:loxp-DsRed-loxp-GFP) fishline. Right: the t-distributed stochastic neighbor embedding (t-SNE) plot of 3, 883 cells obtained from Tg (vglut2a:loxp-DsRed-loxp-GFP) fishline labelling glutamatergic neurons in the optic tectum showing 11 clusters, each color-coded and marked by a number. (B) Schematic showing the designing of the Gal4FF BAC plasmids for six TF marker genes, covering all 11 clusters of tectal glutamatergic neurons shown in A. (C) Schematic showing the method for single TF-based labeling of tectal glutamatergic neurons. CRE expressed in glutamatergic neurons was used to excise loxp and drive fluorescent tdTomatocaax expression. (D) Representative images of seven subclasses of tectal glutamatergic neurons with distinct morphological subclasses using the method described in (B–C). The number of neurons collected for each subclass was shown. Insets depicted the morphological characteristics. Neurons with ascending and descending projections were those projecting to the forebrain and hindbrain, respectively. Solid lines marked the boundaries between brain regions. Dashed lines marked the boundary of the tectal neuropile. Scale bars: 20 μm. (E) The matrix showing the expressions of six TFs in each of seven morphological subclasses. The black squares represented TFs could label particular morphological subclasses, and no expression was indicated by white squares.

Figure 4—source data 1

The annotation of transcription factors (TFs) expression in each tectal glutamatergic clusters.

https://cdn.elifesciences.org/articles/68224/elife-68224-fig4-data1-v2.xlsx
Figure 4—source data 2

Gene labeled morphological analysis of tectal glutamatergic neurons.

https://cdn.elifesciences.org/articles/68224/elife-68224-fig4-data2-v2.xlsx

We developed a labeling strategy using three different plasmids to analyze morphological subclasses of tectal glutamatergic neurons expressing identified TFs. First, we placed Gal4FF cassette at the start-codon of each TF by bacterial artificial chromosome (BAC) recombination technique (Suster et al., 2011) and constructed the Gal4FF plasmids for 15 marker TF genes (Figure 4B, Materials and methods). Second, we generated vglut2a:CRE BAC plasmid using a similar method. Third, plasmid uas:loxp-stop-loxp-tdTomatocaax was used with the first two plasmids to intersectionally label single tectal glutamatergic neurons, each expressing one specific TF (Figure 4C). These labeled neurons were then used for morphological analysis. We labeled and defined morphological subclasses of tectal neurons by injecting all three plasmids into zebrafish embryos at 4–16 cells stage. We observed the enormous variability in the morphology of tectal glutamatergic neurons in their more refined structures. By the limited number of neurons analyzed (n = 574), we were unlikely to define morphological subclasses using a global morphological description. Instead, we used the criterion based on major morphological features, including stratification, soma position, and projection patterns, to define the morphological subclasses in the current study. In addition, similar morphological classification has also been used previously for tectal neurons (Nevin et al., 2010, Robles et al., 2011; DeMarco et al., 2020). We selected six TF marker genes (en2b, foxb1a, zic1, bhlhe22, zbtb18, and irx1a) for further analysis based on two criteria: (1) These TFs are highly expressed and specific to individual tectal glutamatergic clusters based on scRNA-seq analysis. (2) Their BAC plasmids could reliably mark particular morphological subclasses (at least in four animals).

Furthermore, using confocal imaging, we reconstructed a total of 574 tectal neurons (from 263 zebrafish, Figure 4—source data 2) that expressed one of these six TFs (Figure 4—figure supplement 1G). These neurons were categorized into seven morphological subclasses: bi-stratified (I, n = 36), mono-stratified (II, n = 71), non-stratified (III, n = 318), necklace-like (IV, n = 76), cross-hemispheric (V, n = 15), ascending (VI, n = 18), and descending (VII, n = 40) (Figure 4D).

Whether a specific TF could serve as a marker for a particular morphological subclass was determined by the criterion that the TF appeared in a given subclass at least four times (from at least four fish). Remarkably, all six TFs marked multiple morphological subclasses in a combinatorial manner, while each of seven morphological subclasses was marked by numerous TFs (Figure 4E). For example, zic1 highly marked non-stratified and bi-stratified subclasses, and all six TFs marked the non-stratified subclass.

In summary, we found that single TF could mark multiple morphological subtypes in the optic tectum, and multiple TFs could mark a single morphological subtype. This observation could be inferred from ‘convergent pattern’ (same TFs were expressed in different effector-based subtypes) and ‘divergent pattern’ (different TFs were expressed in similar effector-based subtypes), respectively. However, we could not exclude unknown indirect regulations of morphological subtypes by TFs.

The landscape of post-transcriptional regulators in neuronal clusters with different terminal features but similar TF profiles

The above finding that neuronal clusters with different terminal features expressed similar TF profiles were likely resulted from non-TF determinants (Figure 3F–G). We then performed GO analysis of differentially expressed genes between sister clusters of 10 pairs with different terminal features but similar TF profiles (‘divergent pattern’). We surprisingly found that many differentially expressed genes encoded RBPs (Figure 5A). We thus yielded a library of RBP-encoded genes that could differentially mark neurons with different terminal features (Figure 5—source data 1). Here were two examples: two forebrain glutamatergic neuronal clusters (Clusters 7 and 38) shared the similar expression pattern of TF profiles eomesa/neurod1/tbr1b/foxg1a, but had different effector gene profiles and RBPs, rprma/stxbp1b/egfl6/khdrbs2/msi2b/rbm5 and nrgna/susp5/fam43b/qkia/larp6a, respectively; GABAergic neurons in hindbrain (Cluster 12) and sub-tectal (Cluster 16) shared TF profiles, otpa/bhlhe41/id2a, but differentially expressed effector genes and RBPs, kctd4/rgs5b/cpne2/eif5a2/celf3a and slc6a1b/cplx2l /ndrg2/msi2b/rbfox1/msi2a, respectively (Figure 5B). These RBP-encoded genes exhibited wide RNA functions, including capping, splicing, polyadenylation, transport, stability, and translation (Figure 5C, Figure 5—source data 3).

Figure 5 with 1 supplement see all
The post-transcriptional regulatory landscape in neurons with different terminal features but similar transcription factor (TF) profiles.

(A) Dot plot showing Gene Ontology (GO) analysis of differentially expressed genes between each pair clusters with divergent pattern. Red highlighted the category of post-transcriptional regulators. The number of genes in each category was coded by dot size, p.adjust of each category was color-coded. (B) Violin plots showing the expression of genes encoding TFs (yellow), RNA-binding proteins (RBPs) (blue), and effector genes (black) in two pairs of neuronal clusters with divergent pattern (7/38 were glutamatergic neurons, 12/16 were GABAergic neurons). (C) 3D pie plot showing the functional annotation of differentially expressed RBP genes identified in A. (D) Bar plot showing the number of pattern-specific RBPs per pair in different patterns. Each pattern was color-coded. (E) Bar plot showing the effector binding target proportion of pattern-specific RBPs. Each pattern was color-coded. (F) Bar plot showing the percentage of one RBP used as differentially expressed genes between pair clusters with different patterns.

Figure 5—source data 1

Differentially gene expression between sister pair clusters with matched, convergent, and divergent patterns.

https://cdn.elifesciences.org/articles/68224/elife-68224-fig5-data1-v2.xlsx
Figure 5—source data 2

The p-value of differential gene expression between sister pair clusters.

https://cdn.elifesciences.org/articles/68224/elife-68224-fig5-data2-v2.xlsx
Figure 5—source data 3

Annotation the function of RNA-binding proteins (RBPs).

https://cdn.elifesciences.org/articles/68224/elife-68224-fig5-data3-v2.xlsx
Figure 5—source data 4

The binding sites of pattern specific RNA-binding proteins (RBPs).

https://cdn.elifesciences.org/articles/68224/elife-68224-fig5-data4-v2.xlsx
Figure 5—source data 5

The combinatorial usage of RNA-binding proteins (RBPs).

https://cdn.elifesciences.org/articles/68224/elife-68224-fig5-data5-v2.xlsx

Besides, our analysis also yielded a comprehensive list of genes encoding RBPs marking other patterns (Figure 5—figure supplement 1A-D, Figure 5—source data 1, Figure 5—source data 2). We found that some RBPs exhibited differential expression between paired clusters specific to one of the above three patterns (Figure 5—figure supplement 1). Also, we found that the number of RBPs that were specific to divergent pattern (n = 52 from 10 pairs) was much more than those specific to either matched pattern (n = 7 from one pair) or convergent pattern (n = 10 from seven pairs, Figure 5—figure supplement 1E-F). Thus, in terms of the number of pattern-specific RBPs per pair, RBPs showed the preferential expressions in divergent pairs compared to convergent ones (Figure 5D). Note that since there was only one pair of matched pattern, we did not include it in this analysis. These pattern-specific RBPs were known to be involved in a wide range of RNA functions, including capping, splicing, polyadenylation, transport, stability, and translation (Figure 5—figure supplement 1F, Figure 5—source data 3). Thus, in addition to TFs, post-transcriptional regulatory factors also play significant roles in determining transcriptome profiles of neuronal classification.

Furthermore, we examined effector gene that were targeted by pattern-specific RBPs using the oRNAment database (http://rnabiology.ircm.qc.ca/oRNAment/) to explore a potential causality between the divergence of RBPs and the divergence of effector genes (Benoit Bouvrette et al., 2020). GO analysis showed that pattern-specific RBPs targeted various molecular categories including effector genes, TFs, mRNA processing, metabolism (Figure 5—figure supplement 1H-J, Figure 5—source data 4). More importantly, divergent pattern-specific RBPs targeted significantly higher proportions of effector genes than matched and convergent pattern-specific RBPs (Figure 5E). These results suggested the potential causality between RBPs and effector gene profiles in divergent pairs.

Interestingly, genes encoding well-known sequence-specific RBP Rbfox1-encoded gene was differentially expressed in multiple neuronal clusters, like 23/25, 27/54, 11/30, 13/24, 12/16, 2/3, and 7/38 (Figure 5—source data 5). Rbfox1 specifically recognizes UGCAUG motifs, which are often found at 5′- and 3′-regions of introns (Yui Jin, 2003; Vuong et al., 2018). Previous studies have elegantly revealed the importance of activity-dependent splicing regulator Rbfox1 in the transition from neuroprogenitors to neurons as well as in the interneuron subtype-specific splicing in the mouse cortex (Zhang et al., 2016). Moreover, we calculated the frequency of individual RBPs in marking all sister pair clusters with each of three patterns and found that single RBPs were frequently present in multiple pair clusters (42.5%, 51/120, Figure 5F), indicating that clusters mostly used RBPs in a combinatorial manner.

Thus, our analysis revealed the importance of RBP-encoded genes in specifying diverse neuronal phenotypes, and the identification of a comprehensive list of RBP-encoded genes in our study provides a valuable resource for future investigation.

Discussion

This study analyzed ~65,000 single cells from the whole-brain, specific brain regions, neurotransmitter/neuromodulator-defined neuronal populations of the 8 dpf larval zebrafish (Figures 1A, 2B ,, 4A). Notably, our transcriptome analysis offered a close-to-full coverage of all cells (~100,000) in the zebrafish brain, providing multidimensional landscapes of TFs and post-transcriptional regulators in vertebrate whole-brain neuron classification (Figure 6).

Organization of transcriptional and post-transcriptional regulators in the specification of brain-wide neuronal clusters.

(A) Graphic summary describes the general organization of transcription factors (TFs) and post-transcriptional regulators in the specification of neuronal clusters at the whole-brain level. Effector genes describe the terminal features of whole-brain neuronal classification. Neuronal clusters with similar terminal features (named ‘sister clusters’) can share TF profiles (‘matched’, left), or exhibit distinct TF profiles (‘convergent’, middle). Besides, neuronal clusters with distinct effector gene profiles (terminal features) can share TFs profiles (‘divergent’, right), and in each pairs, two neuronal clusters are differentially marked by the expression of RNA-binding proteins (RBPs) at the post-transcriptional levels. Moreover, sister clusters with ‘convergent’ and ‘divergent’ patterns are glutamatergic/GABAergic neurons, whereas sister clusters with ‘matched’ pattern are neuromodulator-type neurons.

A significant focus of our analysis is the analysis of sister clusters at the termini of effector gene-based hierarchical classification (Figure 3A). These sister pair clusters represent the finest molecularly defined neuronal subclasses based on effector gene profiles. Because sister clusters of the terminus pair with highly similar terminal features, it offers us an opportunity to study the determinants of their terminal feature similarity and factors that lead to their slight diversification. Our analysis of the expression profiles of regulatory genes in sister clusters including TFs and post-transcriptional regulators (Figure 6) may thus help to elucidate the molecular logic underlying neuronal diversification.

Specifically, our analysis identified three patterns of TFs in specifying neuronal phenotypes, ‘matched pattern’, ‘convergent pattern’, and ‘divergent pattern’ (Figures 3B and 6A). Glutamatergic/GABAergic sister clusters with similar terminal features mostly expressed distinct TF profiles (‘convergent pattern’), whereas neuromodulator-type sister clusters largely expressed the same TF profiles (‘matched pattern’, Figure 3E). The results indicated that the ‘matched’ pattern supports the notion that the same set of TFs plays a crucial role in determining a given profile of neuromodulator-type terminal features, whereas the ‘convergent’ pattern suggests that even though TFs are important for subclass determination, different TFs could still converge onto the same transcriptome phenotype. Combined with earlier findings of extensive phenotypic convergence of distinct neuron types in worms and flies (Gendrel et al., 2016; Konstantinides et al., 2018; Hobert and Kratsios, 2019), our findings in zebrafish strongly suggest that phenotypic convergence is highly conserved from invertebrates to vertebrates.

More interestingly, we identified 10 terminus clusters exhibiting different transcriptome patterns but similar TF profiles (‘divergent pattern’). Thus, these clusters are likely determined by factors other than TFs (Figure 3F–G). Our further analyses revealed the potential importance of the expression pattern of post-transcriptional regulators, particularly RBPs, in marking these neuron clusters (Figure 5A). For instance, upf3a (the marker distinguishing subclasses of the pairs 23/25, 12/16, and 60/62, Figure 5—source data 5) is known to regulate mRNA stability via nonsense-mediated RNA decay (Shum et al., 2016), rbfox1/rbfox3a (the marker distinguishing subclasses of the pair 27/54, 13/24, 11/30, 23/25, 12/16, 2/3, and 7/38, Figure 5—source data 5) could mediate cell-type-specific splicing in cortical interneurons, assembly of axon initial segment, and synaptic transmission (Jacko et al., 2018; Vuong et al., 2018; Wamsley et al., 2018). In addition to post-transcriptional regulators, other factors could also be expected to involve neuron diversification, such as epigenetic regulators, translational regulators, protein stability, which are interesting to explore in future studies. Together, our findings suggest that combinatorial TFs and post-transcriptional regulators could work in concert to determine neuronal types in the larval zebrafish brain (Figure 6).

Regional identity is an essential factor for classifying brain cells. In the larval zebrafish brain, our results showed that neurons, qRG, and neuronal progenitors exhibited prominent regional characteristics (Figure 1C). In the hierarchical clustering and population-level statistical analysis, similar pair clusters from the same region or different regions occur in nearly equal probabilities (Figure 3—figure supplement 2A). Considering the fact that neuronal phenotypes from the same region could exhibit common regional identities, it was surprising to observe that such a higher proportion of sister subclasses with similar transcriptomic phenotypes could arise from two different brain regions. Putting this finding into the context of neuron-type evolution, it raises the possibility that different brain regions independently give rise to similar neurotransmitter phenotypes through different TF programs. Alternatively, as brain regions are functionally diversified during the evolution, these highly similar neuronal clusters of different brain regions possibly derive from ancient building blocks, which become divergent through the evolutionary acquisition of different combinatorial TF codes.

scRNA-seq resolves the transcriptomes of brain cells at a given time point (Erhard et al., 2019). On the other hand, gene expression is highly dynamic and sensitive to spontaneous or stimulus-evoked neuronal activities (West et al., 2002; Kim et al., 2010; Yap and Greenberg, 2018), circadian rhythm (Panda et al., 2002, Takahashi, 2017), as well as other extrinsic and intrinsic factors. Thus, gene profiles of individual neuron subclasses could represent dynamic cellular states at a given time rather than a static subclass (Wu et al., 2016, Ofengeim et al., 2017). This issue could be resolved by comparing transcriptome profiles of cell samples from different animals or from different developmental stages. In this study, the 68 cell subclasses that we have identified for the larval zebrafish (8 dpf) were confirmed in another fish brain of the same age and were largely recapitulated by cell subclasses identified from the juvenile zebrafish (23–25 dpf, Figure 1E and Figure 1—figure supplement 2F; Raj et al., 2018). Thus, transcriptome-defined subclasses in our study indeed largely represent stable neuronal subclasses. An alternative demonstration of stable neuronal subclasses is to perform genetic labeling of the brain using promoters of marker genes for transcriptome-defined subclasses. Reproducible labeling of the same morphological subclasses, as demonstrated in our morphological studies of tectal neurons expressing specific TFs, also supports that we have identified stable neuronal subclasses (Figure 4D–E). Identification of neuronal subclasses in the present study paves the way for understanding neuronal diversity. Future studies of temporal changes of single-cell transcriptome profiles in the whole brain could provide important insights into the dynamic changes of the brain.

Materials and methods

Key resources table
Reagent type (species) or resourceDesignationSource or referenceIdentifiers Additional information
Strain, strain background (Danio rerio)Wild typeDr WilliamAB
Strain, strain background (Danio rerio)gad1b:EGFPWang et al., 2020ZDB-TGCONSTRCT-210507–9Tg (gad1b:EGFP)
Strain, strain background (Danio rerio)vglut2a:loxp-DsRed:loxp-GFPSatou et al., 2012ZDB-FISH-150901–9050Tg (vglut2a:loxp-DsRed:loxp-GFP)
Strain, strain background (Danio rerio)glyT2:GFPMcLean et al., 2007ZDB-ALT-070514–1Tg (glyT2:GFP)
Strain, strain background (Danio rerio)vmat2:GFPWen et al., 2008ZDB-PUB-080102–11Tg (ETvmat2:GFP)
Strain, strain background (Danio rerio)elavl3: H2B-GCaMP6sFreeman et al., 2014ZDB-TGCONSTRCT-190827–1Tg (elavl3: H2B-GCaMP6s)
Recombinant DNA reagentpTol2-10xUAS:loxp-stop-loxp-tdtomatoThis paperWe made this plasmid by ligated three PCR fragments: loxp-stop-loxp, tdTomatocaax and 10× uas backbone
Recombinant DNA reagentvglut2a:creThis paperBAC plasmid use BAC (CH211-111D5)
Recombinant DNA reagentzic1:gal4FFThis paperBAC plasmid use BAC (CH211-95F4)
Recombinant DNA reagentbhlhe22:gal4FFThis paperBAC plasmid use BAC (CH211-277b21)
Recombinant DNA reagenten2b:gal4FFThis paperBAC plasmid use BAC (DKEY-265A7)
Recombinant DNA reagentfoxb1a:gal4FFThis paperBAC plasmid use BAC (CH211-2C17R)
Recombinant DNA reagentzbtb18:gal4FFThis paperBAC plasmid use BAC (CH211-221N23)
Recombinant DNA reagentirx1a:gal4FFThis paperBAC plasmid use BAC (CH73-211K12)
Commercial assay or kitSingle Cell 3' Library and Gel Bead kit v2 Chip kit10× Genomics120237scRNA-seq
Commercial assay or kitdsDNA High Sensitivity Assay KitAATIDNF-474–0500scRNA-seq
Commercial assay or kitClonExpressMultiS One Step Cloning KitVazymeCat#C112-01/02Prepare recombinant plasmid
Chemical compound, drugpapainWorthington Biochemical CorporationLS003126Prepare papain solution to dissociation cells
Chemical compound, drugDNase ISigmaCat#DN25Prepare papain solution to dissociate cells
Chemical compound, drugL-cysteineSigmaCat#C6852Prepare papain solution to dissociate cells
Chemical compound, drugDMEM/F12InvitrogenCat#11330032Prepare papain solution to dissociate cells
Chemical compound, drug45% glucoseGibcoCat#04196545SBPrepare wash buffer during dissociation
Chemical compound, drugHEPESSigmaCat#H4034Prepare wash buffer during dissociation
Chemical compound, drugFBSGibcoCat#10270106Prepare wash buffer during dissociation
Chemical compound, drugDPBSInvitrogenCat#14190–144Prepare wash buffer during dissociation
Chemical compound, drugMS222SigmaCat#A5040Anaesthesia
Chemical compound, drugLow melting agaroseSigmaCat#A0701Embedded fish
Software, algorithmR 3.5.1R-projecthttps://www.r-project.org/Data analysis
Software, algorithmCell Ranger Single Cell Software Suite (v2.1.0)10× Genomicshttps://support.10xgenomics.comscRNA-seq data analysis
Software, algorithmSeurathttps://satijalab.org/seurat/http://satijalab.org/seurat/scRNA-seq data analysis
Software, algorithmbioDisthttps://www.bioconductor.orghttps://www.bioconductor.org/packages/release/bioc/html/bioDist.html
Software, algorithmscclustevalR-package, Jaccard indexhttps://github.com/crazyhottommy/scclusteval; Tang et al., 2020Jaccard index could be used to evaluate the robustness of clusters
Software, algorithmTreeDistR-package, calculate two tree similarityhttps://github.com/ms609/TreeDist; Smith, 2021Calculate tree distance
Software, algorithmFIJIPMID:22743772http://fiji.sc/Analysis image
Software, algorithmGraphPad PrismGraphPad Softwarehttps://www.graphpad.comData analysis
Software, algorithmFV10-ASW 4.0 ViewerOlympushttps://www.olympus-global.comAnalysis image
Software, algorithmclusterProfilerR-packagehttps://bioconductor.org/packages/release/bioc/html/clusterProfiler.htmlGO analysis
Software, algorithmNNLSR-package, non-negative least squares solver from Lawson and Hansonhttps://github.com/rdeits/NNLS.jl; Deits, 2021Compare correlation of clusters

Animal maintenance and transgenic lines

Request a detailed protocol

For all experiments in this study, zebrafish were maintained, mated, and raised at 28°C according to standard protocols. Animals were staged according to dpf. Animal procedures performed in this study were approved by the Animal Use Committee of Institute of Neuroscience, Chinese Academy of Sciences (NA-045–2019).

Transgenic lines used in this study include: Tg (glyT2:GFP) (McLean et al., 2007), Tg (vglut2a: loxp-DsRed-loxp-GFP) (Satou et al., 2012), Tg (Etvmat2:GFP) (Wen et al., 2008), Tg (gad1b:EGFP) (Wang et al., 2020).

Single-cell sample preparation

Whole-brain and brain region identified cell populations

Request a detailed protocol

Wild-type fish were processed for 10× Genomics single-cell transcriptome sequencing. Whole brains or different brain regions (Fore, OT, Hind, and sub-OT) were dissected as anatomical structures (Figure 1B, Figure 1—figure supplement 1A). Tissues were dissociated with 300 μL papain (28 units/mL, Worthington) in papain solution (1% DNase, 12 mg/mL L-cysteine in DMEM/F12), incubated at 37°C for 15 min with proper mix methods (Yu and He, 2019). Dissociated cells were washed twice with washing buffer (100 mL: 650 μL 45% glucose, 500 μL 1 M HEPES and 5 mL FBS into 93.85 mL 1× DPBS, sterilized with a 0.22 μm pore size filter), and sterilized with 40 μm cell strainers (BD Falcon) into 300–400 μL washing buffer.

Whole-brain neuromodulator or tectal glutamatergic neurons

Request a detailed protocol

For the transgenic fishlines, brains were dissected and dissociated as above; the cell suspension was used to sort cells with high fluorescence intensity with flow cytometry (Moflo XDP, Beckman Coulter). The interesting cells were transferred to washing buffer. In this paper, we used Tg (ETvmat2:GFP) transgenic fishlines to acquire neuromodulator-defined cell populations; Tg (vglut2a: loxp-DsRed-loxp-GFP) was used to collect glutamatergic neurons. Tg (ETvmat2:GFP) could label most monoaminergic neurons, including DA neurons, 5-HT neurons, and NE neurons (Figure 2A, Figure 2—figure supplement 1B; Wen et al., 2008).

scRNA-seq on 10× Genomics Chromium Platform

Request a detailed protocol

All sampling was carried out with 10× Genomics Chromium Single Cell Kit (Version 2): suspensions prepared as described above were diluted to concentrations 300–1000 cells/mL with washing buffer, then added 10× Chromium RT mix to achieve loading target cell numbers 13,000. Downstream cDNA synthesis (14 PCR cycles), library preparation, and sequencing were carried out according to manufacturer’s instructions (https://www.10xgenomics.com/solutions/single-cell/).

Data analysis of scRNA-seq

Bioinformatics processing of raw reads

Request a detailed protocol

Sequencing data (FASTQ files) were converted to matrices of expression counts using the Cell Ranger software (Version 2.0.1) provided by 10× Genomics. Transcriptome libraries were mapped to a zebrafish reference built from a custom GTF file and the zebrafish GRCz11 (Ensemble release-96) genome assembly. For each sample, cells with less than 600 detected molecules (UMIs), or less than 1.2-fold molecule to gene ratio, and genes detected in fewer than 20 cells or more than 60% of all cells were excluded for the following analysis.

Cell clustering analysis of cells from the whole brain

Request a detailed protocol

After filtering, the combined gene expression matrix of whole brain and four major brain region samples were combined (using Cellranger aggr pipeline) and loaded into Seurat package (Version 2.3.4) in R (Version 3.4.3) for the following analysis as described in tutorials (http://satijalab.org/seurat/). In brief, digital gene expression matrices were column-normalized and log-transformed. Cells with fewer than 500 expressed genes, greater than 5% mitochondrial content, very high numbers of UMIs and gene counts that were outliers of a normal distribution (likely doublets/multiplets) were removed from further analysis. Combined all samples together, we got 45,746 cells, which included four whole-brain samples, four specific brain regions (n = 2 each) including the Fore, OT, Hind, and the sub-OT; 1402 variable genes were selected for principal component analysis (PCA) by binning the average expression of all genes into 300 evenly sized groups, and calculating the median dispersion in each bin (parameters for MeanVarPlot function: x.low.cutoff = 0.0125, x.high.cutoff = 8, y.cutoff = 0.5). The top 100 principal components (PCs) were used for the first round of clustering with the Louvain modularity algorithm (FindClusters function, resolution = 2.5), which generated 68 distinct clusters (Figure 1A). Marker genes for each cluster were calculated using FindAllMarkers function in Seurat (parameters: min.pct = 0.1, min.diff.pct = 0.25).

Clustering analysis of neuromodulator-defined cell populations

Request a detailed protocol

After filtering as above and selected all cells with neuromodulators marker gene expression (vmat2+ monoaminergic neurons: th+/tph2+/th2+/dbh+, Figure 2—figure supplement 1C, and Figure 2—source data 1), variable genes (n = 2,120) were selected for PCA. The top 46 PCs were used for the first round of clustering with the Louvain modularity algorithm (FindClusters function, resolution = 0.6). In total, 5398 cells obtained from Tg (ETvmat2:GFP) fishline expressed monoaminergic neuromodulators, showing 22 clusters (Figure 2B). Marker genes for each cluster were calculated using FindAllMarkers function in Seurat (parameters: min.pct = 0.1, min.diff.pct = 0.25).

Clustering analysis of glutamatergic neurotransmitter-type cell populations

Request a detailed protocol

After filtering as above and selected all cells with tectal glutamatergic marker gene expression (vglut2a, mab21l2, tubb5, and tfap2a, Figure 4—figure supplement 1C), we obtained 3883 cells from optic tectum with Tg (vglut2a:loxp-DsRed-loxp-GFP) fishline expression tectal glutamatergic neurons, showing 11 clusters using 1276 variable genes and 37 PCs (Figure 4A). Marker genes for each cluster were calculated using FindAllMarkers function in Seurat (parameters: min.pct = 0.1, min.diff.pct = 0.25).

Jaccard similarity index

Request a detailed protocol

To evaluate the stability of each cluster, we performed R package ‘scclusteval’, in which we re-sampled a subset of the cells (80%) from the population and repeated clustering, and then used the Jaccard index to evaluate cluster similarity before and after re-clustering. If a cluster is robust and stable, random subsampling and re-clustering will keep the cell identities within the same cluster. After repeating the re-clustering 20 times, we used the mean/median of the Jaccard index as the metric to evaluate the stability of the clusters (Tang et al., 2020). The distribution of the Jaccard index across subsamples measures the robustness of the cluster. Clusters with a mean/median stability score less than 0.6 should be considered unstable (Zumel, 2014). Here, we used JaccardRainCloudPlot to visualize the Jaccard index and found most clusters of whole brain (Figure 1—figure supplement 1E), vmat2+ neuromodulator (Figure 2—figure supplement 1A), and tectal glutamatergic (Figure 4—figure supplement 1D) were stable.

Identification of brain region markers

Request a detailed protocol

To identify potential brain region-specific markers that exist in all cell types, clusters were assigned when most of the cells in a cluster intermingle with cells from one particular brain region (Figure 1—figure supplement 2B). Then, we defined differentially expressed genes shared by all cell types (vglut+, glutamatergic neurons; gad+, GABAergic neurons; pcna+, neuroprogenitors; cx43+, radial astrocytes) to represent the targeted brain region-specific markers. We separated each cell type in four different brain regions as individual samples, FindAllMarkers function in Seurat was used to identify the differentially expressed markers of a given cell type. We listed these brain-region specific markers in Figure 1C, and these genes exhibited specific expression pattern in all cell types of particular brain region (Figure 1D). Furthermore, these brain region-specific genes were used to identify the regional origins of each cell clusters. Only clusters with over 5% of cells expressing the marker genes or with averaged UMI of genes-expressing cells over 2 were assigned regional identities (Figure 1—source data 3).

The regional distribution of whole-brain neurotransmitter-type neurons

Request a detailed protocol

According to the marker genes that are specific to primary neurotransmitter phenotypes, including slc17a6b (glutamatergic neurons), gad1b (GABAergic neurons), and slc6a5 (glycinergic neurons), we identified subsets of neurons with different neurotransmitter type as clusters with over 5% of cells expressing the marker genes or with averaged UMI of genes-expressing cells over 2 (Figure 1—source data 3). We found that the forebrain was predominantly populated by glutamatergic neurons, whereas glycinergic neurons mainly resided in the hindbrain (Figure 1—figure supplement 2C-D). Live image with fishline Tg (glyT2:GFP), Tg (vglut2a: loxp-DsRed-loxp-GFP), and Tg (gad1b:EGFP) support the results above (Figure 1—figure supplement 2E).

Annotation of each cluster in neuromodulator-type neurons

Request a detailed protocol

To assign brain region identity for each neuromodulator-type neurons, we used markers dlx5a, otpa, dlx2a, barhl2 for sub-OT, en2a for OT, mab21l2 for midbrain and hindbrain, and phox2bb, hoxb3a, mafba, and efnb2a for hindbrain (Figure 2—figure supplement 1D). As for the subtype of neuromodulator-type neurons, we used markers th for DA neurons, tph2 for 5-HT neurons, dbh for NE neurons, and slc5a7a, chata, slc18a3a as markers of ChAT neurons (Figure 2—figure supplement 1C). And defined clusters with over 5% of cells expressing the marker genes or with averaged UMI of genes-expressing cells over 2 as a threshold for neurons with their identities (Figure 2—source data 1). In addition, we used markers gad1b, gad2 for GABAergic and vglut1, vglut2a, vglut2b, vglut3 for glutamatergic (Figure 2C) as a method to assign co-expression pattern of neurotransmitter for each neuromodulator-type neurons.

Comparison of the single-cell transcriptome between larval and juvenile zebrafish

Request a detailed protocol

The NNLS was applied to compare our data (8 dpf) with juvenile zebrafish (23–25 dpf, GSE105010). Top 20 (listed in Figure 1—source data 4) marker genes defined in our data that are also highly variable genes in juvenile data were used for decomposition. Decomposition was performed on mean expression values by averaging our dataset or by averaging other single-cell datasets using cluster assignments provided by the authors. NNLS was implemented using the ‘nnls’ package in R (Version 3.4.3). The 68 clusters we identified exhibited a high overlap with their counterparts in the juvenile zebrafish brain. Meanwhile, we also found clusters with specific regional origins and cell types exhibited high correlation with their juvenile counterparts (Figure 1E and Figure 1—figure supplement 2F).

Hierarchical clustering analysis based on effector gene and TF profiles

Request a detailed protocol

The effector genes (n = 1099) in highly variable genes (n = 1402) were selected for the clustering analysis of whole-brain clusters, and mean expression value of each gene was calculated for each cluster. The distance matrix was defined as (1 − Pearson correlation coefficient between clusters)/2. Hclust function (clustering method: ward.D) implemented in R (Version 3.4.3) was applied for the hierarchically clustering. Six major brain cell types were identified using height = 0.42 as cutoff (shown with different color in Figure 1A and F). For each cell type, pan-enriched genes were identified using FindAllMarkers function in Seurat (parameter: min.pct = 0.1, min.diff.pct = 0.25). We identified these six major cell type as: branch Ⅰ (cerebellum cells and habenula cells), branch Ⅱa (glutamatergic neurons), branch Ⅱb (inhibitory neurons), branch Ⅲ (neuroprogenitors), branch Ⅳ (radial astrocytes), and branch Ⅴ (others: oligodendrocytes, microglia, and endothelia cells) according to marker genes of each branch (Figure 1—source data 5, Figure 1F).

Besides, we also selected 283 TFs in variable genes (n = 1402) to do hierarchical clustering analysis as above (Figure 3—figure supplements 1B and 2A). Meanwhile, we performed hierarchical analysis for vmat2+ neuromodulator neurons based on effectors (n = 1783) and TFs (n = 319) in variable genes (Figure 3—figure supplement 3A-B).

Moreover, R package ‘pvclust’ were used to assess the uncertainty in hierarchical clustering by performing bootstrap analysis of clustering (Figure 3—figure supplement 1A). Groups strongly supported with AU (approximately unbiased) p-value > 0.9 were highlighted.

Identification the similarity of pair clusters using two strategies

Hierarchical sister clusters analysis

Request a detailed protocol

Before analysis, we compare the similarity of tree based on effector gene and TF profiles with R package ‘TreeDist’. In glutamatergic/GABAergic neurons, these two tree shown different organization, with only one matching node (node 13, Clusters 9 and 61, tree distance = 0.7127991, Figure 3—figure supplement 1B); while in neuromodulator neurons, these two tree shown much similar organization, with eight matching nodes (tree distance = 0.3766735, Figure 3—figure supplement 3A-B).

Sister clusters in the terminus of tree indicate pair cluster with most similar gene expression, we focus on 39 glutamatergic (IIa) and inhibitory (IIb) neurons and identified 11 pairs of sister clusters that exhibited high similarity in effector gene profiles and 14 sister clusters with similar TF expression (Figure 3—figure supplement 2A). And for vmat2+ neuromodulator neurons, we identified eight sister clusters in the terminus branches of both effector-based hierarchy tree and TF-based tree (Figure 3—figure supplement 3C).

Population-level statistical analysis

Request a detailed protocol

To compare the landscape of TF and effector gene expression accounting for the full spectrum of cell types rather than just the most similar sister clusters, we also performed population-level statistical analysis. First, we calculate any distance of two clusters that were randomly picked from 39 glutamatergic/GABAergic neuronal clusters (C392). If the distance is within the lowest 10% population of all distances containing either of two randomly selected clusters, we defined it a similar pair cluster; while the distance is within the top 80% population of all distances containing either of two randomly selected clusters, we defined it as a dissimilar pair (Figure 3A). Then we performed these analyses on both effector gene profiles and TFs, and defined those pair clusters similar in both effector-based and TF-based tree as ‘matched pattern’; those pair clusters similar in effector-gene profiles but not TFs as ‘convergent pattern’, and those pair clusters similar in TF profiles but not effector-gene profiles as ‘divergent pattern’ (Figure 3B). In matched pattern, the distance of pair cluster was the lowest distance in both effector gene-based (white) and TFs-based (gray) distances (Figure 3—figure supplement 2D). In convergent pattern, the distance of pair cluster was the lowest distance in effector gene-based (white) distances, but not the lowest distance in TFs-based (gray) distances (Figure 3—figure supplement 2F). In divergent pattern, the distance of pair cluster was the lowest distance in TFs-based (gray) distance, but not the lowest distance in effector gene-based (white) distances (Figure 3—figure supplement 4B).

Moreover, we performed population-level statistical analysis with vmat2+ neuromodulator neurons, and found nine pairs with matched pattern which showing the lowest distance in both TF and effector gene-based distances; two pairs with convergent pattern which showing the lowest distance in effector gene-based distance, but not in TF-based distance (Figure 3—figure supplement 3H).

However, this analysis is subject to the intrinsic statistics of TF and effector gene expression, as well as the sensitivity of scRNA-seq method. Thus, we intersected the results from hierarchical sister cluster and population-level statistical analysis to identify pairs with each patterns (Figure 3B and Figure 3—figure supplement 3J).

Subsampling of genes in population-level statistical analysis

Request a detailed protocol

To test the robustness of three patterns based on distance measures, we subsampling 80% gene inputs in population-level statistical analysis. First, we random subsampled 80% effector genes/TFs, calculated their distance defined as (1 − Pearson correlation coefficient between clusters)/2. Then average statistics over 20 times subsampling calculations and performed population-level statistical analysis as above. These re-identified pairs completely recapitulated those identified using the population-level statistical analysis based on total genes, suggesting the robustness of three patterns in glutamatergic/GABAergic neurons (Figure 3—figure supplement 2H-K). All pairs re-identified are consistent with pairs with all genes (Figure 3—figure supplement 2L). Besides, 8 out of 12 re-identified neuromodulator-type pairs with matched pattern, 1 out of 3 re-identified pairs with convergent pattern were consistent with those identified using the population-level statistical analysis based on total genes (Figure 3—figure supplement 3I).

Moreover, we quantified the distance of two similar trees that was exemplified by TF-based or effector gene-based clusters with or without subsampling of 80% genes, and the result showed that the tree distances of TF-based or effector gene-based clusters before and after subsampling were only 0.20 (TF-based) and 0.14 (effector gene-based), respectively (Figure 3—figure supplement 2H-K). These distances are smaller than the distance between TF and effector-gene profiles (0.7127991, Figure 3—figure supplement 1B). All above results indicated the overall distinction of the landscape of TF-based and effector gene-based clusters.

Comparison of two rounds of tectal glutamatergic transcriptome data

Request a detailed protocol

CCA in Seurat was used to test the reproducibility of two rounds of tectal glutamatergic transcriptome data. Two rounds of transcriptome data were intermingled with each other in tSNE plot showing a better performance of two parallel transcriptome data (Figure 4—figure supplement 1A).

Sorted tectal glutamatergic transcriptome data yielded more clusters

Request a detailed protocol

The NNLS was applied to compare the transcriptome data from sorted tectal glutamatergic neurons and the subset of these neurons from whole brain. The top 20 marker genes defined in sorted sample which shared in whole-brain sample were used for decomposition. Decomposition was performed on mean expression values by averaging our dataset. NNLS was implemented using the ‘nnls’ package in R (Version 3.4.3). Sorted samples yield more clusters than whole-brain samples. Besides, sorted sample clusters exhibited diversity with their counterparts in the whole-brain sample (Figure 4—figure supplement 1B).

Differentiation expression of genes encoding RBP between paired clusters

Request a detailed protocol

For each paired clusters, differential expression genes were identified using FindMarkers function in Seurat (Parameter: min.pct = 0.1, min.diff.pct = 0.25). Then we mapped the expression of genes encoding RBP within each paired clusters in different patterns (matched, convergent, and divergent patterns, Figure 5—source data 1, Figure 5—figure supplement 1B-D). Comparing the RBPs used in each pattern, we identified pattern-specific RBPs (Figure 1—figure supplement 1E-F).

Searching the binding sites of pattern-specific RBPs

Request a detailed protocol

We take advantage of the oRNAment database (Benoit Bouvrette et al., 2020) to search the putative target sites of RBPs. We only found three matched pattern-specific RBPs (celf1, nova1, pum1), three convergent pattern-specific RBPs (msi2a, taf15, tia1), 18 divergent pattern-specific RBPs (eif4a2, elavl1, hnrnpa1b, igf2bp3, hnrnpl2, mbnl2, mbnl3, pcbp4, raly, rbm25b, rbm28, sart3, sf3b4, sfpq, snrnp70, snrpa1, taf13, u2af2b) in the database. Then, we searched the putative binding sites of these RBPs use score > 0.5 as a cutoff (matched: 2987 binding sites; convergent: 2755 binding sites; divergent: 2856 binding sites, Figure 5—source data 4).

Gene ontology

Request a detailed protocol

R package ‘clusterProfiler’ was used to analyze and visualize function profile of gene and gene clusters (Yu et al., 2012). Bioconductor annotation packages org.Dr.eg.db were imported for genome-wide annotation of mapping Entrez gene identifiers or ORF identifiers for zebrafish. We performed GO analysis on 1402 variable genes of whole-brain and 2120 variable genes of vmat2+ neuromodulator-type neurons, and found most genes belong to effector genes, including receptors, neuropeptide, ion channel, and so on (Figure 1—figure supplement 2G).

We performed GO analysis on differential expression genes between sister pair clusters with different patterns (convergent and divergent), and found these genes belong to TFs, effectors, and genes encoding RBP (Figure 5A, Figure 5—figure supplement 1A).

Besides, we performed GO analysis of binding target sites of pattern-specific RBPs, showing the TFs, effectors, mRNA processing are the major targets in these RBPs (Figure 5—figure supplement 1H-J). And found the percentage of effectors in all binding sites of divergent pattern were higher than matched and convergent pattern (matched: 5%; convergent: 3.8%, divergent 7.6%, Figure 5E).

Morphologies analysis of tectal glutamatergic neurons

Plasmids

To trace the morphology of tectal glutamatergic neurons, we designed three plasmids (Figure 4B–C). In the first plasmid, the Gal4FF cassette was placed at the start-codon site using a BAC recombination technique (Suster et al., 2011). Using this method, we created the Gal4FF BAC plasmids for 15 TF marker genes covering all 11 clusters of tectal glutamatergic neurons. Considering the promoter and other regulatory elements of higher expression genes may not be powerful to drive BAC plasmid expression, we selected six of them for their better performance in labeling tectal glutamatergic neuron morphologies based on two criterions: (1) These TFs are highly expressed and specific to individual tectal glutamatergic clusters based on scRNA-seq analysis. (2) Their BAC plasmids could reliably mark particular morphological subclasses (at least in four animals). The second BAC plasmid vglut2a:CRE was constructed using similar method with BAC (CH211-111D5). The CRE cassette was placed at the start-codon site of vglut2a (CH211-111D5) using BAC recombination technique. The third plasmid uas:loxp-stop-loxp-tdTomatocaax was constructed with the ClonExpressMultiS One Step Cloning Kit (Vazyme, C112-01/02) according to the standard protocol. We ligated three PCR fragments: loxp-stop-loxp, tdTomatocaax, and 10× uas backbone (Figure 4C). With these tree plasmids, vglut2a promoter-driven Cre-mediated excision, uas-Gal4 system activate tdTomatocaax expression as Figure 4D shown.

Imaging

Request a detailed protocol

Embryos at 8 dpf were used for imaging. The fish were anesthetized with 0.04% MS222 (Sigma, A5040) and embedded in 1% agarose (Sigma, A0701) on imaging dish. Imaging were performed on Olympus FV1200 confocal using 30× oil-immersion objectives (Olympus, NA = 1.2). The image data were analysis with Image J (https://imagej.net/Simple_Neurite_Tracer). In total, we collected 574 single neuronal morphologies from 263 distinct fish (Figure 4—source data 1 and Figure 4—figure supplement 1G). All neurons were categorized into seven morphological subclasses: bi-stratified (I, n = 36), mono-stratified (II, n = 71), non-stratified (III, n = 318), necklace-like (IV, n = 76), cross-hemispheric (V, n = 15), ascending (VI, n = 18), and descending (VII, n = 40) (Figure 4D).

The analysis of TFs and tectal morphological subclasses

Request a detailed protocol

To calculate the expression of six TFs (en2b, irx1a, zbtb18, foxb1a, bhlhe22, and zic1) with respect to 11 tectal glutamatergic clusters, we created a binary expression matrix consisting of the gene expression (ON: with expression; OFF: without expression) and cluster identity (Li et al., 2017). We marked TFs that have expression in each of 11 clusters with cutoff of 5% cells or have averaged UMI of each TF-expressed cell UMI = 2 as ON, shown with black box. The rests were marked as OFF, shown with white box (Figure 4—source data 1 and Figure 4—figure supplement 1E-F). All six TFs marked multiple transcriptome-based subclasses in a combinatorial manner (Figure 4—figure supplement 1F).

To calculate the expression of six TFs with respect to seven tectal glutamatergic morphological subclasses, we also created a binary expression matrix consisting of the gene expression (ON: with expression; OFF: without expression) and morphological identity. We marked genes labeling one morphological subclass at least four times in individual four fish as ON, shown with black box. The rests were marked as OFF, shown with white box. All six TFs marked multiple morphological subclasses in a combinatorial manner (Figure 4E).

These two binary matrices were used to test the correlation between transcriptome-based clusters and morphological subclasses of tectal glutamatergic neurons with NNLS. In total, 6 out of 11 transcriptomic clusters exhibited strong correlation with at least one morphological subclass (Figure 4—figure supplement 1H).

Data availability

Request a detailed protocol

scRNA-seq data has been deposited on BIG (CRA002361): https://ngdc.cncb.ac.cn/gsa/browse/CRA002361.

Code availability

Request a detailed protocol

All the code for the data analysis is available.

Data availability

Single cell RNA-seq data has been deposited on BIG (Genome Sequence Archive - CRA002361).

The following data sets were generated
    1. Zhang H
    (2021) Genome Sequence Archive
    ID CRA002361. Single-cell RNA sequencing of zebrafish whole brain.
The following previously published data sets were used

References

  1. Book
    1. Zumel N
    (2014)
    Practical Data Science with R
    Manning Publications Co.

Decision letter

  1. Koichi Kawakami
    Reviewing Editor; National Institute of Genetics, Japan
  2. Didier YR Stainier
    Senior Editor; Max Planck Institute for Heart and Lung Research, Germany
  3. Koichi Kawakami
    Reviewer; National Institute of Genetics, Japan

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

Thank you for submitting your article "The landscape of regulatory genes in brain-wide neuronal phenotypes of a vertebrate brain" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, including Koichi Kawakami as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by a Reviewing Editor and Didier Stainier as the Senior Editor.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

The reviewers recognized the authors performed enormous amounts of works and the data presented in the manuscript should be useful resources for neurobiologists. However the reviewers indicated the following major concerns which should be addressed by the authors.

1) The validity of clustering. The reviewers think the authors need control analysis for this.

2) For "divergent" and "convergent" classes, the hypothesis presented by the authors were not proven. To experimentally prove this is a bit too much for the present paper. Instead, the reviewers request more statistical data to support these ideas.

Reviewer #1:

In this manuscript, the authors performed single cell RNA-seq analysis of ~46,000 cells from the zebrafish larval brain and identified 68 clusters and mapped them on the brain regions. They found (1) region-specific markers, (2) correlation with cell types in juvenile fish, (3) 1099 effector genes out of 1402 genes used to define the clusters, (4) and 48 neuronal clusters and 20 non-neurons including glia. Then, they analyzed scRNA-seq using vmat (monoaminergic) and vachte (cholinergic) transgenic fish and (5) identified 22 and 14 neuromodulator clusters. (6) The neuromodulator clusters were further analyzed for neurotransmitter types and they revealed coexpression patterns.

In the hierarchical classification, they identified (7) 11 "sister clusters" that have similar expression profiles for effector genes, that had the same neurotransmitter types but did not show brain region preference. Then they examined TF profiles in neurotransmitter type neurons and identified 14 TF sister clusters. They found (8) one effector gene cluster that match with a TF cluster (matched) and 10 effector gene clusters that did not match with TF clusters (convergent) (these expressed the same neurotransmitters), and also found (9) neuromodulator clusters were well matched with TF profiles (matched). (10) They found the same TF clusters can express different effector genes also (divergent).

Then they aimed to see relationship between TF and morphology. They sorted tectal glut cells and identified 11 TF clusters. Then they made Bac-gal4 for 15 TFs and performed intersection using Cre-glut. (11) They analyzed 574 tectal neurons and found that TF could mark multiple morphology.

Since (in the divergent class) TF did not correlate with effector, they analyzed RNA binding proteins. (12) RNA-binding proteins were expressed differently in different neuron clusters.

Strength: The authors performed a comprehensive analysis of brain cells by single-cell RNA-seq. The scRNA-seq data presented here will be a good reference when the neuroscientists study a certain neuronal types, for instance searching for marker genes expressed in the neuronal clusters of their interest. Also, the amount of work is enormous.

Weakness: The contents of this paper are rather descriptive and poor in mechanistic aspects (causal relationship). Also I felt difficulty in following the manuscript since experiments were not described as hypothesis-driven.

1) The authors prepared cells for sc-RNA seq in different ways and this made the manuscript a bit confusing. Please clarify the purpose and reason for the to do so.

2) I think only "positive" data is the identification of the "matched" class. For "convergent" and "divergent" classes, many other explanations will be possible since there are no mechanistic analysis. For "convergent" class, examine if any TFs with weak expression had been overlooked. For "divergent" class, examine if the authors can find specific (classes) of RBP as real candidates.

3) As for the section describing the morphology, 11 clusters identified, and the 7 morphological classes seemed irrelevant. Also, relationship between 6TFs and the 7 morphological classes did not sound although the work for construction of 15 BACs should have to be enormous. Rewrite the section for readers to be understandable.

Reviewer #2:

In this study, the authors investigated how diverse neuronal types develop in the brain by using single-cell RNA sequencing methods.

The authors performed rigorous data collections for different brain regions, monoaminergic neurons, catecholaminergic neurons, and glutamatergic neurons. Such careful data collection for different types of neurons and brain regions has not been done for larval zebrafish. Their data will be a valuable resource for the neuroscience field.

They further found that the expression patterns of transcription factors are not necessarily predictive of the expression patterns effector genes that constitute the "terminal features" of neuronal cell types. Such predictiveness depends on the cell types. In neuromodulatory neurons, TF expression is predictive of effector gene expressions. On the contrary, in "neurotransmitter" neurons, which represent glutamatergic and GABAergic neurons, this relationship is loose and diverse. This finding, if confirmed, will advance our understanding of how diverse neuronal types develop in vertebrate brains.

The major weakness of this study is the lack of statistical controls in their analyses. I raise two examples here. First, the classification of "convergent types" and "divergent types" is solely based on hierarchical clustering analysis and its distance measures which are not validated by sub-sampling or by other statistical methods. Therefore, this analysis cannot rule out the possibility that such classification arises from large variations of gene expressions within the analyzed populations and that there are no such distinct populations.

The second example is their analysis of the expression of RNA-binding proteins. They show that the RBPs have differential expression in "divergent" cluster pairs. However, they do not show whether such differential expression is more prevalent among "divergent" cluster pairs than in other neuronal cluster pairs. If this is not the case, the differential expression of RBPs may not be the reason for the differential expression of effector genes.

Although the claim of this paper may be of broad interest in the neuroscience field, the above weaknesses significantly affect the reliability of the conclusion of this paper. Therefore, I recommend a revision of this manuscript for its publication in eLife.

1) The classification of "convergent" and "divergent" types in Figure 3 needs quantitative validation to exclude the possibility that such classification arises from large variations of gene expressions. This problem is unavoidable for analyses that solely rely on hierarchical clustering methods and their distance measures. Showing expression patterns of several example genes does not reinforce the conclusion, as it is always possible to find genes that show differential expression between any given cluster pairs. The result of neuromodulatory neurons only works as a partial control, as they are different neuronal types. This study needs to reinforce the validity of the classification by cross-validation of cluster distances among samples or by using unbiased statistical methods other than clustering.

2) The analysis of differential expression of RNA-binding proteins among "divergent" neuronal clusters needs statistical control. The authors need to show that RBPs in "divergent" pairs have more divergent expression patterns than in "convergent" pairs in an unbiased, quantitative way. Again, showing the example of few genes is not enough. Otherwise, RBPs cannot explain the divergence of effector gene expression from similar TF expression profiles.

3) The coloring schemes in figures are confusing. For example, I can see many color schemes in Figure 1. We only need two types of classification: (1) brain regions and (2) cell types. Figure 1a may not need colors/numberings and only need names for some of the clusters. Also, the classification presented in Figure 1c (Glu-GABA-P-R) and the one presented in Figure 1g (I- II, III, IV, V) are redundant. I understand that these different schemes serve different purposes, but I recommend unifying these classification schemes for clarity. This unification may need reordering of figure panels.

Reviewer #3:

In this manuscript, the authors performed single-cell RNA sequencing of >60,000 cells across the whole zebrafish brain with region- and molecular- identity. Using the acquired transcriptomes, the authors tried to deduce the regulation logic of neuronal diversification by comparing sister clusters in hierarchical clustering based on effector genes and regulatory TFs. The author showed that while TF similarity in modulatory neurons largely predict the similarity of their neuromodulator types, neurotransmitter types and their TF profiles usually do not agree. Further analysis of cell-type divergence from common TF regulators revealed an interesting differential enrichment of RNA binding proteins, which are potentially involved in post-transcriptional regulation of neuronal identity.

The transcriptomic data is comprehensive and of high quality, with cells covering the entire zebrafish brain, and with close-up analysis generated by new experiments to give higher discriminatory power. The data offers a valuable resource to the developmental neurobiology community. Some patterns (e.g. phenotypic convergence) echoes the findings in invertebrate nervous systems. Difference in regulatory logic of neurotransmitter vs. neuromodulator types, as well as the identification of post-transcriptional regulator genes are novel and interesting.

However, some caveats in data analysis may affect the reliability of the conclusions. The key claims in the study heavily relied on analysis of "sister clusters", i.e. clusters of cells with most similar TF or effector gene profiles. Yet not enough justification was given to the selection of such clusters or the focus only on the direct sibling clusters, and the fact that neurotransmitter and neuromodulator data were acquired differently adds complication to the interpretation of the result. Meanwhile, although the descriptive data in this study gives a detailed account of neuronal diversity, the lack of causal evidence and/or concrete mechanistic explanation between regulatory genes and terminal effectors rendered the conclusions a bit elusive -they tend to fell into providing interesting insight while failing to account for alternative explanations.

Overall, I think the claims are supported by the data for the most part, and with the addition of certain control and additional analyses, it enhances our understanding of vertebrate neuronal diversification as a thought-provoking descriptive study: Genes involved in convergent or divergent cell types identified in this study serve as curious candidate for follow-up investigation. Regulatory logic and mechanisms, when compared with similar studies in invertebrates, can help us to gain insights on the origin and evolution of the nervous system.

1. In the first and second section of Result, clusters of the transcriptome data were treated as the "smallest unit" for subsequent analysis. However, there's a lack of justification for the clustering criteria: i.e. how distinct the clusters are, and how robust the subsequent analysis result is if cell-type clustering is performed slightly differently. This is especially problematic because even the authors themselves have shown that given cleaner quality data (e.g. modulatory neurons in FAC sorted cells vs. whole brain), clustering partition could be different. Additional control analysis would be necessary to show clustering makes sense and robust to noise level in the data.

2. The authors compared the 8dpf brain transcriptomes with juvenile brain in Raj et al., 2018 and claimed that "are likely to represent the full cellular diversity of the mature zebrafish brain". The stretch is a bit far because they partitioned the data only using variable genes expressed in both datasets, i.e. differentially expressed genes involved in subsequent diversification were not accounted for. I think at best this analysis serves as a similarity claim rather than "full" cellular diversity.

3. Result Section 3: a major weakness in the design of analysis is the focus only on "sister clusters", which can be sensitive to your cluster partition and does not necessarily represent the real diversification event. To compare the landscape of TF and effector gene expression, there are many alternative methods accounting for the full spectrum of cell types rather than just the most similar sibling clusters. I would suggest two supplementary analyses: (1) A population-level statistical analysis to show the difference in regulatory logic across all cell types regardless of clustering, even in the less similar cells. (2) A test to ensure that the disagreement between TF and effectors does not disappear as we break clusters in neurotransmitter types into smaller sub types. As stated in point #1, it is necessary for the authors to demonstrate the difference in regulatory logic between transmitter and modulator types is not a result of different noise level in the data.

4. Result Section 4: the criteria for assigning TF expression to morphology class is again relying too much on binary classification. In particular, the authors considered a TF to be a marker for the morphology class "appeared for at least 4 times". The morphology classes are very different in sizes and this is an unfair comparison. Rather more quantitative metrics and vigorous statistical tests should be used to support this conclusion.

5. Result Section 5: in addition to transcriptome data, it would be nice if the authors can demonstrate some causal links between the RBP genes and the cell type regulation, or verify experimentally that they indeed encode neurons with different identity, morphology or functions. If additional experiments to demonstrate causal links are not possible, the authors should sufficiently account for alternative explanations for cell type divergence. Besides post-transcriptional regulation, there are a lot of other factors that could affect gene diversification, including early regulators that are transiently expressed in the embryo that primed fate selection, and external signaling factors in the neuron's environment. Such information was lost as only mature neurons are sequenced with very little spatial context.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "The landscape of regulatory genes in brain-wide neuronal phenotypes of a vertebrate brain" for further consideration by eLife. Your revised article has been evaluated by Didier Stainier (Senior Editor) and a Reviewing Editor.

The manuscript has been corrected but there are some remaining issues that need to be addressed, as outlined below:

The reviewers still think the data is a good resource, and the differentially expressed genes can be good candidates for future investigation. However, the reviewers think that cross-validation of their statistical measures that used to identify "matched", "convergent" and "divergent" types is necessary. They used two measures in parallel, hierarchical clustering, and population-level similarity, to identify these pairs. These measures can be made robust by repeating the sub-sampling of genes and average statistics over many sub-sampling calculations. The reviewers wonder how much of it remain true with sub-sampling test added. Or alternatively, the reviewers would suggest moving away from hierarchical clustering and instead re-identifying matched, convergent and divergent populations based on TF-based and effector-based distance measures, which is bootstrapped to ensure robustness. This will provide more interpretable, consistent results.

Reviewer #1:

I found that the authors performed Jaccard similarity index-based analysis to validate clustering and rewrote the text. However, these revealed some weakness of the manuscript. I have the following comments.

1. The paper lacks biological aspects. The authors identified sister clusters from different brain regions (p11, line 234). The authors should at least discuss the significance of the finding.

2. The authors separated IIa and IIb before classification using TF profiles. I am wondering why?

3. A paragraph (p12, lines 249-258), it is not clear how they performed the population-level statistical analysis. Explain more precisely.

4. Why did the hierarchical and population analyses make the difference? (p12) Explain. Also, I do not understand the meaning of making overlaps of these.

5. p13, line 277. I think the authors examined both the TF and effector profiles, correct?

6. p13, Again the paper lacks biological aspects. Why did the neurotransmitter-type and neuromodulator-type make difference? Explain the idea in the discussion.

7. p14, lines 287-297, on the other hand… this part seems redundant with the previous parts and, to me, was difficult to read. Please clarify.

8. p14-15. I admire their efforts to make many BACs. They mentioned this may support "divergent pattern"(p17, line 358), but the reason why they thought so is very weak. They need to show relevance between morphology and effector type somehow experimentally.

9. p16, line 340. They did not describe how they picked up 6 TFs.

10. Figure3—figure supplement 2D and 2F. Figure3—figure supplement 4B. I could not tell how to read these data. In addition, in all cases, the effector-based distances are small (between 0-0.1) and TF-based distances are large (between 0.1-0.4). I don't see much difference between these populations.

Reviewer #2:

I have to say the authors did not address my concerns at all.

My first concern was about the validity of "matched", "convergent" and "divergent" classifications from hierarchical clusterings that are based on effector genes and TF genes in Figures 3 and 5. As an answer, the authors calculated Jaccard similarity indices for clusterings using all genes in Figure 1 and Figure 2. These added analyses were completely unrelated to the clustering of "matched", "convergent" and "divergent." Therefore, I have to say the authors did not address the issue.

My second concern was about the validity of their claim about the differential expression of RBPs in "divergent cell population". They claim that the RBPs have more divergent expressions in "divergent cell pairs". However, it is hard to interpret the statistics presented in Figure 5E. The indication of the index [(number of genes that are solely divergent in group A) / (number of all genes that are divergent in group A)] is unclear.

Moreover, this analysis does not indicate any causality between the divergence of RBPs and the divergence of effector genes. In principle, the more heterologous the population is, we would see more divergence of RBPs and effector genes. These factors are not independent of each other. I was expecting to see a more careful statistical approach that takes these considerations into account. Therefore, I have to say the authors did not address the issue.

Reviewer #3:

In the revised manuscript, the authors mainly added three additional control analyses that increased the credibility of the conclusion, including:

1) Robustness analysis of the clustering result using Jaccard similarity of subsampled data.

2) Analysis of the matched/convergent/divergent patterns of the glutamatergic/GABAergic neurons based on relaxed criteria that include not only the terminal sister clusters, but also non-sibling clusters with relatively close relationships.

3) Control analysis to show the enrichment of differential RBP expression in divergent patterns.

The control analyses largely addressed my concerns to a large extent, although they made the dramatic contrast between the regulatory logic of neurotransmitter and neuromodulator data weaker. The identified convergent/divergent events are now more convincing and allow for further exploration of the phenotype in the future.

However, I cannot help but noticed many more changes have been made without being pointed out in the rebuttal letter, including (1) exclusion of the data from the cholinergic neurons altogether, and (2) slightly different analysis result with seemingly identical input data (e.g. comparing Figure 3—figure supplement 2 in the new version vs. Figure 3 in the original manuscript). The authors should explain why such changes were made and ensure there is no selective report of the data.

Finally, some suggestions/questions to help achieve a better presentation of the data.

1) The current "population-level" analysis is rather just a relaxed definition of the terminal sister clusters, but there are many other metrics to quantify the similarity between the TF-based and effector-based clustering result over the entire dataset. For hierarchical clustering this can be editing distance between the two trees, or more generally, similarity between organization of TF and effector gene distribution across all the clusters. This will not identify specific matched, convergent or divergent patterns, but would be an unbiased measurement of how TF and effector landscape agree with each other.

2) Figure 3: similar color regime was used to represent forebrain/OT/sub OT and matched/convergent/divergent patterns. This is very confusing. I would suggest weakening brain region visualization (e.g. changing it to a less pronounced representation or getting rid of it altogether -see comment #3) and focus on the regulatory logic. It is also recommended to color the pattern names in 3B accordingly.

3) Figure 3C: The number of matched patterns on the Venn diagram does not match the number shown in Figure 3F. Also, Venn diagram is not the best way to show that the conclusion in sister cluster- and population-statistics are similar. [Minor correction: the number of selection should be Combination C(39, 2) rather than Permutation A(39, 2)].

4) Throughout the manuscript, the brain region identities are shown next to the clusters in every figure, yet very little conclusion was made about the relationship between transcriptome and brain region identity. The authors are advised to summarize the major findings of data feature related to brain region identities (no correlation can also be a finding), and simplify the color use of brain region when the color does not convey meaningful information.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "The landscape of regulatory genes in brain-wide neuronal phenotypes of a vertebrate brain" for further consideration by eLife. Your revised article has been evaluated by Didier Stainier (Senior Editor), a Reviewing Editor, and the original reviewers.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

1) Although for neurotransmitter-type, "sister cluster analysis and the population-level statistical analysis " are carried out. However, for neuromodulator-type, only sister cluster analysis was done. It is fair to perform population-level analysis for neuromodulator-type.

2) Line 265: largely recapitulated. I do not agree with this. Rather, explain why the results were not consistent.

3) Line 226: The TF Regulatory Landscape in Whole-brain Neuronal Phenotypes

This section is a bit too long. The section should be separated, for example, for neurotransmitter-type and neuromodulator-type.

4) Line 395: results supported "divergent" and "convergent" patterns, because divergent pattern indicated the link of same TFs to different neuron phenotypes, while convergent pattern indicated the link of different TFs to similar neuron phenotypes.

This is too much speculation. The relationship between the morphology and effector gene profiles are unclear. I think it is fair to say that morphology does not directly correlate with TF profiles.

Reviewer #2:

I can say the authors addressed my concerns on statistical analyses after this round of revision, shrugging off unnecessary parts and exposing the most interesting part of this study. In addition, they included a new analysis in Figure 5 which indicates potential causality between the differential expression of RBPs and differential expression of effector genes.

There are still concerns about the validity of classification (lower 10% of gene distance compared to shuffled pairs, how do you justify the threshold?). However, this justification will not be so straightforward because it will be prone to the intrinsic statistics of TF expression, effector gene expression, and the sensitivity of the sequencing technique. It is obvious this is the best authors can do. I advise authors to acknowledge such limitations in the discussion.

Reviewer #3:

Unfortunately, the authors misunderstood my request for adding Tree Distance analysis. An important claim of the paper was that glutamatergic/GABAergic clusters show different TF and terminal profile patterns ("convergent/divergent"), whereas neuromodulator type clusters predominantly expressed the same TF profiles ("matched"). However, supplementary analysis was only shown for the glutamatergic/GABAergic types but not the modulator types, and the main figure that supports this conclusion (Figure 3E) still uses sister clusters from hierarchical clustering for classification of neuromodulator types, which I found to be an unfair comparison. Also, the gene subsampling test of the robustness of matched/divergent/convergent pairs was only performed on similar pairs defined by population-statistics. It was not clear to me why the authors still adhered to the "sister cluster" definition (or an INTERSECTION of this strategy with the population-statistics), for all subsequent conclusions. I suggested tree distance as a global measurement of whether the glutamatergic/GABAergic types and modulator types truly have different regulatory logic at the population level, rather only at the terminal sisters.

https://doi.org/10.7554/eLife.68224.sa1

Author response

Essential revisions:

The reviewers recognized the authors performed enormous amounts of works and the data presented in the manuscript should be useful resources for neurobiologists. However the reviewers indicated the following major concerns which should be addressed by the authors.

1) The validity of clustering. The reviewers think the authors need control analysis for this.

We have introduced Jaccard similarity index-based analysis by subsampling to validate our clusters (Figure 1—figure supplement 1, whole-brain clusters; Figure 2—figure supplement 1A, vmat+ neuromodulator clusters; Figure 4—figure supplement 1D, tectal glutamatergic clusters) (Tang et al., 2020).

Specifically, to validate 68 clusters derived from whole-brain samples, we introduced R package “scclusteval”. 80 % of total cells were subsampled and re-clustered. The Jaccard index was then used to measure the similarity of clusters derived from subsampled cells and total cells. The mean/median of Jaccard index after 20-times repeats was used to evaluate the robustness of original 68 clusters. One cluster with a mean/median of Jaccard index above 0.6 is considered as a stable cluster as described previously (Zumel N. , 2014).

2) For "divergent" and "convergent" classes, the hypothesis presented by the authors were not proven. To experimentally prove this is a bit too much for the present paper. Instead, the reviewers request more statistical data to support these ideas.

Thanks for the constructive suggestion. In the revised manuscript, to compare the landscape of TF and effector gene expression accounting for the full spectrum of cell types rather than just the most similar sister clusters, we also performed population-level statistical analysis to support the idea of “divergent” and “convergent” patterns (Figure 3A-B, Figure 3—figure supplement 2D-G, and Figure 3—figure supplement 4B-C).

Reviewer #1:

In this manuscript, the authors performed single cell RNA-seq analysis of ~46,000 cells from the zebrafish larval brain and identified 68 clusters and mapped them on the brain regions. They found (1) region-specific markers, (2) correlation with cell types in juvenile fish, (3) 1099 effector genes out of 1402 genes used to define the clusters, (4) and 48 neuronal clusters and 20 non-neurons including glia. Then, they analyzed scRNA-seq using vmat (monoaminergic) and vachte (cholinergic) transgenic fish and (5) identified 22 and 14 neuromodulator clusters. (6) The neuromodulator clusters were further analyzed for neurotransmitter types and they revealed coexpression patterns.

In the hierarchical classification, they identified (7) 11 "sister clusters" that have similar expression profiles for effector genes, that had the same neurotransmitter types but did not show brain region preference. Then they examined TF profiles in neurotransmitter type neurons and identified 14 TF sister clusters. They found (8) one effector gene cluster that match with a TF cluster (matched) and 10 effector gene clusters that did not match with TF clusters (convergent) (these expressed the same neurotransmitters), and also found (9) neuromodulator clusters were well matched with TF profiles (matched). (10) They found the same TF clusters can express different effector genes also (divergent).

Then they aimed to see relationship between TF and morphology. They sorted tectal glut cells and identified 11 TF clusters. Then they made Bac-gal4 for 15 TFs and performed intersection using Cre-glut. (11) They analyzed 574 tectal neurons and found that TF could mark multiple morphology.

Since (in the divergent class) TF did not correlate with effector, they analyzed RNA binding proteins. (12) RNA-binding proteins were expressed differently in different neuron clusters.

Strength: The authors performed a comprehensive analysis of brain cells by single-cell RNA-seq. The scRNA-seq data presented here will be a good reference when the neuroscientists study a certain neuronal types, for instance searching for marker genes expressed in the neuronal clusters of their interest. Also, the amount of work is enormous.

Weakness: The contents of this paper are rather descriptive and poor in mechanistic aspects (causal relationship). Also, I felt difficulty in following the manuscript since experiments were not described as hypothesis-driven.

1) The authors prepared cells for sc-RNA seq in different ways and this made the manuscript a bit confusing. Please clarify the purpose and reason for the to do so.

Sorry for the confusion. We have clarified the reasons of preparing cells for scRNAseq using different methods in the revision.

Specifically, we used three methods to isolate cells for scRNAseq: a. from whole-brain samples, b. from samples of particular brain regions, and c. from transgenic fishlines with labeling of neurons expressing given neurotransmitters or neuromodulators.

To validate our data at whole-brain level, we took advantage of whole-brain samples to get a whole picture of transcriptome in larval zebrafish brain. Besides, for quantifying the ratio of different neurotransmitter/modulator-type neurons at whole-brain level (Figure 1—figure supplement 2C-D), we used cells isolated from whole-brain samples.

For studying regional identities, we dissected anatomically distinct brain samples (Figure 1—figure supplement 2B).

Since we found few neuromodulator clusters (n=3) in ~46,000 cells from samples of whole brains and different brain regions, we sorted cells from transgenic fishlines that specifically marked neuromodulators to analyze enriched neuromodulator populations (Figure 2B).

For studying the relationship between transcriptomic clusters and morphological subclasses for tectal glutamatergic neurons, we need finer-defined transcriptomic glutamatergic clusters. Thus, we further verify tectal glutamatergic clusters by performing additional scRNAseq analysis on cells isolated from Tg (vgluta2a:loxP-DsRed-loxp-GFP, Figure 4A)

2) I think only "positive" data is the identification of the "matched" class. For "convergent" and "divergent" classes, many other explanations will be possible since there are no mechanistic analysis. For "convergent" class, examine if any TFs with weak expression had been overlooked. For "divergent" class, examine if the authors can find specific (classes) of RBP as real candidates.

For identifying “convergent” classes, we used variable TFs (n = 283) based on all 68 clusters, that is some were expressed at higher levels and others were expressed at lower levels for a given cluster. Thus, our analysis does not particularly overlooked TFs with weak expression.

For two clusters with “divergent pattern”, we found that RBPs could be specifically expressed in one but not the other (Figure 5B). Within 121 RBPs that distinguish paired clusters of three patterns, 38 % RBPs are specific to divergent pattern, while 6.6 % RBPs are specific to either convergent or matched patterns (Figure 5D-E).

3) As for the section describing the morphology, 11 clusters identified, and the 7 morphological classes seemed irrelevant. Also, relationship between 6TFs and the 7 morphological classes did not sound although the work for construction of 15 BACs should have to be enormous. Rewrite the section for readers to be understandable.

Sorry for the confusion. We have further clarified the purpose and logic of this part and a newly-analyzed correlation data in the revised manuscript.

Since we described “convergent pattern” (different TFs could mark neuron clusters with similar effector-gene profiles) and “divergent pattern” (single TFs could mark neuron clusters with different effector-gene profiles) in Figure 3, we then decided to provide the experimental evidence to support the presence of both phenotypes by analyzing the relationship between TFs and different morphological neuron subclasses in the tectum.

Specifically, we identified 11 morphological subclasses of tectal glutamatergic neurons, and 6 TFs that showed highly variable expressions among these 11 clusters. Then by creating BACs for each of these 6 TFs, we showed that each of the 6 TFs were mostly used by multiple morphological neuron subclasses in a combinatorial manner, which supported “divergent pattern”; meanwhile, each of 7 morphological subclasses could be marked by multiple TFs, providing the evidence for “convergent pattern” as Figure 4E.

Reviewer #2:

In this study, the authors investigated how diverse neuronal types develop in the brain by using single-cell RNA sequencing methods.

The authors performed rigorous data collections for different brain regions, monoaminergic neurons, catecholaminergic neurons, and glutamatergic neurons. Such careful data collection for different types of neurons and brain regions has not been done for larval zebrafish. Their data will be a valuable resource for the neuroscience field.

They further found that the expression patterns of transcription factors are not necessarily predictive of the expression patterns effector genes that constitute the "terminal features" of neuronal cell types. Such predictiveness depends on the cell types. In neuromodulatory neurons, TF expression is predictive of effector gene expressions. On the contrary, in "neurotransmitter" neurons, which represent glutamatergic and GABAergic neurons, this relationship is loose and diverse. This finding, if confirmed, will advance our understanding of how diverse neuronal types develop in vertebrate brains.

The major weakness of this study is the lack of statistical controls in their analyses. I raise two examples here. First, the classification of "convergent types" and "divergent types" is solely based on hierarchical clustering analysis and its distance measures which are not validated by sub-sampling or by other statistical methods. Therefore, this analysis cannot rule out the possibility that such classification arises from large variations of gene expressions within the analyzed populations and that there are no such distinct populations.

Thanks for the question and great suggestions. In the revised manuscript, we have verified the neuron classification by sub-sampling, and the Jaccard similarity index (Tang et al., 2020) were introduced to further validate our “convergence pattern” and “divergent pattern”.

The second example is their analysis of the expression of RNA-binding proteins. They show that the RBPs have differential expression in "divergent" cluster pairs. However, they do not show whether such differential expression is more prevalent among "divergent" cluster pairs than in other neuronal cluster pairs. If this is not the case, the differential expression of RBPs may not be the reason for the differential expression of effector genes.

Thanks for the question and great suggestions. We have included a new analysis in the revision, which shows that such differential expression of RBPs are more prevalent among “divergent” cluster pairs than in other pairs in terms of gene numbers and differential expression levels (Figure 5D-E, Figure 5—figure supplement 1C-E).

Although the claim of this paper may be of broad interest in the neuroscience field, the above weaknesses significantly affect the reliability of the conclusion of this paper. Therefore, I recommend a revision of this manuscript for its publication in eLife.

1) The classification of "convergent" and "divergent" types in Figure 3 needs quantitative validation to exclude the possibility that such classification arises from large variations of gene expressions. This problem is unavoidable for analyses that solely rely on hierarchical clustering methods and their distance measures. Showing expression patterns of several example genes does not reinforce the conclusion, as it is always possible to find genes that show differential expression between any given cluster pairs. The result of neuromodulatory neurons only works as a partial control, as they are different neuronal types. This study needs to reinforce the validity of the classification by cross-validation of cluster distances among samples or by using unbiased statistical methods other than clustering.

Thanks for the question and great suggestions. We have performed Jaccard similarity index-based analysis by sub-sampling (Tang et al., 2020) to further validate the classification (as shown in Figure 1—figure supplement 1B-C, Figure 2—figure supplement 1A).

2) The analysis of differential expression of RNA-binding proteins among "divergent" neuronal clusters needs statistical control. The authors need to show that RBPs in "divergent" pairs have more divergent expression patterns than in "convergent" pairs in an unbiased, quantitative way. Again, showing the example of few genes is not enough. Otherwise, RBPs cannot explain the divergence of effector gene expression from similar TF expression profiles.

Thanks for the question and great suggestions. We have included a new analysis in the revision, which shows that such differential expression of RBPs are more prevalent among pairs with “divergent” cluster pairs than in other patterns in terms of gene numbers and differential expression levels (as shown in Figure 5D-E, and Figure 5—figure supplement 1C-E).

3) The coloring schemes in figures are confusing. For example, I can see many color schemes in Figure 1. We only need two types of classification: (1) brain regions and (2) cell types. Figure 1a may not need colors/numberings and only need names for some of the clusters. Also, the classification presented in Figure 1c (Glu-GABA-P-R) and the one presented in Figure 1g (I- II, III, IV , V ) are redundant. I understand that these different schemes serve different purposes, but I recommend unifying these classification schemes for clarity. This unification may need reordering of figure panels.

Thanks for the great suggestions. We have updated the color schemes in Figure 1A in the revision using major cell type in brain. Also, we have updated figure panels in the Figure 1F.

Reviewer #3:

In this manuscript, the authors performed single-cell RNA sequencing of >60,000 cells across the whole zebrafish brain with region- and molecular- identity. Using the acquired transcriptomes, the authors tried to deduce the regulation logic of neuronal diversification by comparing sister clusters in hierarchical clustering based on effector genes and regulatory TFs. The author showed that while TF similarity in modulatory neurons largely predict the similarity of their neuromodulator types, neurotransmitter types and their TF profiles usually do not agree. Further analysis of cell-type divergence from common TF regulators revealed an interesting differential enrichment of RNA binding proteins, which are potentially involved in post-transcriptional regulation of neuronal identity.

The transcriptomic data is comprehensive and of high quality, with cells covering the entire zebrafish brain, and with close-up analysis generated by new experiments to give higher discriminatory power. The data offers a valuable resource to the developmental neurobiology community. Some patterns (e.g. phenotypic convergence) echoes the findings in invertebrate nervous systems. Difference in regulatory logic of neurotransmitter vs. neuromodulator types, as well as the identification of post-transcriptional regulator genes are novel and interesting.

However, some caveats in data analysis may affect the reliability of the conclusions. The key claims in the study heavily relied on analysis of "sister clusters", i.e. clusters of cells with most similar TF or effector gene profiles. Yet not enough justification was given to the selection of such clusters or the focus only on the direct sibling clusters, and the fact that neurotransmitter and neuromodulator data were acquired differently adds complication to the interpretation of the result. Meanwhile, although the descriptive data in this study gives a detailed account of neuronal diversity, the lack of causal evidence and/or concrete mechanistic explanation between regulatory genes and terminal effectors rendered the conclusions a bit elusive -they tend to fell into providing interesting insight while failing to account for alternative explanations.

Overall, I think the claims are supported by the data for the most part, and with the addition of certain control and additional analyses, it enhances our understanding of vertebrate neuronal diversification as a thought-provoking descriptive study: Genes involved in convergent or divergent cell types identified in this study serve as curious candidate for follow-up investigation. Regulatory logic and mechanisms, when compared with similar studies in invertebrates, can help us to gain insights on the origin and evolution of the nervous system.

1. In the first and second section of Result, clusters of the transcriptome data were treated as the "smallest unit" for subsequent analysis. However, there's a lack of justification for the clustering criteria: i.e. how distinct the clusters are, and how robust the subsequent analysis result is if cell-type clustering is performed slightly differently. This is especially problematic because even the authors themselves have shown that given cleaner quality data (e.g. modulatory neurons in FAC sorted cells vs. whole brain), clustering partition could be different. Additional control analysis would be necessary to show clustering makes sense and robust to noise level in the data.

Thanks for the questions and great suggestions. In the revision, we further performed Jaccard similarity index-based analysis by sub-sampling, showing that sub-sampling of cells could give rise to the similar clustering partition, supporting the robustness of our clustering (Figure 1—figure supplement 1E, whole-brain clusters; Figure 2—figure supplement 1A, vmat+ neuromodulator clusters; Figure 4—figure supplement 1D, tectal glutamatergic clusters)(Tang et al., 2020).

Specifically, to validate 68 clusters derived from whole-brain samples, we introduced R package “scclusteval”. 80 % of total cells were subsampled and re-clustered. The jaccard index was then used to measure the similarity of clusters derived from subsampled cells and total cells. The mean/median of Jaccard index after 20-times repeats was used to evaluate the robustness of original 68 clusters. One cluster with a mean/median Jaccard index above 0.6 is considered as a stable cluster as described previously (Zumel N. , 2014).

2. The authors compared the 8dpf brain transcriptomes with juvenile brain in Raj et al., 2018 and claimed that "are likely to represent the full cellular diversity of the mature zebrafish brain". The stretch is a bit far because they partitioned the data only using variable genes expressed in both datasets, i.e. differentially expressed genes involved in subsequent diversification were not accounted for. I think at best this analysis serves as a similarity claim rather than "full" cellular diversity.

Thanks for this constructive suggestion. We have updated our conclusion to “our analysis indicated that the brain at 8 dpf mostly represented cellular diversity in the juvenile brain.”

3. Result Section 3: a major weakness in the design of analysis is the focus only on "sister clusters", which can be sensitive to your cluster partition and does not necessarily represent the real diversification event. To compare the landscape of TF and effector gene expression, there are many alternative methods accounting for the full spectrum of cell types rather than just the most similar sibling clusters. I would suggest two supplementary analyses: 1) A population-level statistical analysis to show the difference in regulatory logic across all cell types regardless of clustering, even in the less similar cells. 2) A test to ensure that the disagreement between TF and effectors does not disappear as we break clusters in neurotransmitter types into smaller sub types. As stated in point #1, it is necessary for the authors to demonstrate the difference in regulatory logic between transmitter and modulator types is not a result of different noise level in the data.

Thanks for raising this concern and the constructive suggestions. Thanks for the constructive suggestion. In the revised manuscript, we performed Jaccard similarity index-based analysis by sub-sampling, showing that sub-sampling of cells could give rise to the similar clustering partition, supporting the robustness of our clustering (Tang et al., 2020). Then to compare the landscape of TF and effector gene expression accounting for the full spectrum of cell types rather than just the most similar sister clusters, we also performed population-level statistical analysis to support the idea of “divergent” and “convergent” classes (Figure 3A, Figure 3—figure supplement 2D-G, Figure 3—figure supplement 4B-C).

4. Result Section 4: the criteria for assigning TF expression to morphology class is again relying too much on binary classification. In particular, the authors considered a TF to be a marker for the morphology class "appeared for at least 4 times". The morphology classes are very different in sizes and this is an unfair comparison. Rather more quantitative metrics and vigorous statistical tests should be used to support this conclusion.

Thanks for raising this concern. We have made a further clarification on this issue in the revision. During the analysis, we appreciated the enormous variability in the morphology of tectal glutamatergic neurons in the finer structures. By the limited number of neurons we analyzed (n=574), we were unlikely to define morphological subclasses using a full morphological description. Instead, we used the criterion based on major morphological features, including stratification, soma position, and projection patterns, to define the morphological subclasses in the current study. In addition, non-stratified, mono-stratified, and bi-stratified subclasses we defined were also used previously for tectal neurons (Robles et al., 2011). We have discussed these caveats in morphology classification in the revision. Furthermore, we have made the further clarification in the revision that our conclusion on the relationship between the TFs and morphology was based on the criteria using major morphological features mentioned above.

5. Result Section 5: in addition to transcriptome data, it would be nice if the authors can demonstrate some causal links between the RBP genes and the cell type regulation, or verify experimentally that they indeed encode neurons with different identity, morphology or functions. If additional experiments to demonstrate causal links are not possible, the authors should sufficiently account for alternative explanations for cell type divergence. Besides post-transcriptional regulation, there are a lot of other factors that could affect gene diversification, including early regulators that are transiently expressed in the embryo that primed fate selection, and external signaling factors in the neuron's environment. Such information was lost as only mature neurons are sequenced with very little spatial context.

Thanks for the question and great suggestions. We have included a new analysis in the revision, which further shows that such differential expression of RBPs in “divergent” clusters are more prevalent than those from other pairs in terms of gene numbers and differential expression levels in Figure 5D-E and Figure 5—figure supplement 1C-E. Furthermore, we included an additional discussion on the alternative explanations for neuron type divergence as suggested.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

The reviewers still think the data is a good resource, and the differentially expressed genes can be good candidates for future investigation. However, the reviewers think that cross-validation of their statistical measures that used to identify "matched", "convergent" and "divergent" types is necessary. They used two measures in parallel, hierarchical clustering, and population-level similarity, to identify these pairs. These measures can be made robust by repeating the sub-sampling of genes and average statistics over many sub-sampling calculations. The reviewers wonder how much of it remain true with sub-sampling test added. Or alternatively, the reviewers would suggest moving away from hierarchical clustering and instead re-identifying matched, convergent and divergent populations based on TF-based and effector-based distance measures, which is bootstrapped to ensure robustness. This will provide more interpretable, consistent results.

Thank you for the suggestions.

Besides hierarchical clustering, we performed the population-level statistical analysis in our last revision based on TF-based and effector-based distance measures to re-identify “matched”, “convergent”, and “divergent” pairs. These re-identified pairs mostly recapitulate those identified using hierarchical clustering (Figure 3B), indicating the robustness of three patterns.

In the current revision, we have further performed the population-level statistical analysis by sub-sampling of genes (80% of total either TFs or effector genes) and average statistics over 20 times to re-identify “matched”, “convergent”, and “divergent” pairs. These re-identified pairs completely recapitulated those identified using the statistical analysis based on total genes (Figure 3—figure supplemental 4E), indicating the robustness of three patterns.

Reviewer #1:

I found that the authors performed Jaccard similarity index-based analysis to validate clustering and rewrote the text. However, these revealed some weakness of the manuscript. I have the following comments.

1. The paper lacks biological aspects. The authors identified sister clusters from different brain regions (p11, line 234). The authors should at least discuss the significance of the finding.

Thanks for your suggestions. We have discussed the significance of different brain-region origins of sister clusters in the revision (p24, line 513-527).

First, since sister clusters in the same brain region may result from over-clustering, sister clusters from different brain regions showed the robustness of sister clusters.

Second, because neuronal phenotypes from the same region could exhibit common regional identities, it was surprising to observe that such a higher proportion of sister clusters with similar transcriptomic phenotypes could arise from two different brain regions (Figure 3—figure supplementary 2A). Putting this finding into the context of neuron type evolution raises the possibility that different brain regions independently give rise to similar neurotransmitter phenotypes through different TF programs. Alternatively, as brain regions are functionally diversified during the evolution, these highly similar neuronal clusters of different brain regions possibly derive from ancient building blocks, which become divergent through the evolutionary acquisition of different combinatorial TF codes.

2. The authors separated IIa and IIb before classification using TF profiles. I am wondering why?

Thanks for the question.

Our initial transcriptome-based classification showed that highly variable genes among all single cells were primarily effector genes (Figure 1—figure supplementary 2G), suggesting its critical role in brain cell classification. Previous studies also reported that effector genes better described the phenotype of neurons (Paul, Crow et al. 2017, Hodge, Bakken et al. 2019). Thus, to better classify neuron phenotypes, we first separated clusters using effector gene-based classification and identified IIa and IIb branch as glutamatergic and GABAergic neurons according to their marker genes (Figure 1-Source Data 5). To further address how regulatory genes determine effector gene-based neuron paired clusters, we introduced the classification using TFs and compared TF-based and effector gene-based classification to identify three patterns (Figure 3).

3. A paragraph (p12, lines 249-258), it is not clear how they performed the population-level statistical analysis. Explain more precisely.

Sorry for the confusion. We further clarified the population-level statistical analysis in the revision (p12-13, line 255-266).

For all glutamatergic/GABAergic neuronal clusters (n=39), we calculated the distances between every two clusters (C392) based on either effector gene profiles or TF profiles, then defined the pairs, which had the lowest 10% distances after ranking, as similar pair clusters (Figure 3A). Then, we defined pair clusters that were similar in both TF and effector gene profiles as “matched pattern”, those pair clusters that were similar in effector gene profiles but not in TF profiles as “convergent pattern”, and those pair clusters that were similar in TF profiles but not in effector gene profiles as “divergent pattern” (Figure 3B).

More detailed description has been enlisted in Materials and methods.

4. Why did the hierarchical and population analyses make the difference? (p12) Explain. Also, I do not understand the meaning of making overlaps of these.

Sorry for the confusion. We have clarified this issue in the revision (p13, line 275-277).

We were also aware of the weakness of hierarchical clustering: initial seeds, the order of data, and outlier data points have influences on the final hierarchical tree structure. Specifically, once a neuronal cluster has been assigned as a sister cluster with another cluster, it could no longer be paired with others. Therefore, we introduced a population-level statistical analysis to re-identify similar pair clusters, in which we calculated the distances between every two clusters (C392) based on either effector gene profiles or TF profiles, and defined the pairs, which have the lowest 10% distances after ranking, as similar pair clusters (Figure 3A; Details in Materials and methods). This analysis could identify similar pair clusters in a more unbiased manner.

Thus, we combined two methods to consolidate “matched”, “convergent”, and “divergent” types independently. The results showed that neuron pairs of three patterns were mostly similar using two strategies (Figure 3B). We overlap these paired clusters to make a better understanding of each cluster.

5. p13, line 277. I think the authors examined both the TF and effector profiles, correct?

Sorry for the confusion. Yes, we examined both TFs and effector profiles in Figure 3—figure supplementary 3A-B. We update this part in the revision (p14, line 291-292).

6. p13, Again the paper lacks biological aspects. Why did the neurotransmitter-type and neuromodulator-type make difference? Explain the idea in the discussion.

Thank you for the suggestions. We have discussed this idea in the revised manuscript (p14, line 296-307).

Our analysis showed that neuromodulator pairs with similar effector gene profiles predominantly were “matched” pattern, whereas neurotransmitter pairs with similar effector gene profiles mainly were a “convergent” pattern (Figure 3E). The former suggests the generation of similar neuromodulator pairs shared the specific TF program (“stereotyped programming”), and the latter suggests the generation of similar neurotransmitter pairs could be generated by different TF programs (“flexible programming”). These different programming strategies may account for the fact that across species, neuromodulator types are more conserved, whereas neurotransmitter types are much diverse and variable (La Manno, Gyllborg et al. 2016, Saunders, Macosko et al. 2018, Tiklova, Bjorklund et al. 2019, Poulin, Gaertner et al. 2020).

7. p14, lines 287-297, on the other hand… this part seems redundant with the previous parts and, to me, was difficult to read. Please clarify.

Sorry for the confusion. We re-write this part in the revision (p15, lines 309-314).

On the other hand, in glutamatergic/GABAergic neuronal classification, we also found that 13 pairs of sister clusters with similar TF profiles were separated in the effector gene-based classification. In other words, different from matched and convergent pairs mentioned above (Figure 3B-D), these 13 paired clusters exhibited different effector gene profiles but similar TFs, here terms as “divergent” pattern (Figure 3—figure supplement 4A). Also, the population-level statistical analysis identified divergent pairs (n = 15, Figure 3—figure supplement 4B) that were largely overlapped with those identified by hierarchical sister analysis (n = 10, Figure 3B). Neurons in each of these 10 paired neuronal clusters could be from the same (n=6) or different (n=4) brain regions (Figure 3—figure supplement 4A).

8. p14-15. I admire their efforts to make many BACs. They mentioned this may support "divergent pattern"(p17, line 358), but the reason why they thought so is very weak. They need to show relevance between morphology and effector type somehow experimentally.

9. p16, line 340. They did not describe how they picked up 6 TFs.

Thanks for your suggestions. We have clarified this issue in the revision (p16, lines 332-338 and p18, line 391-395).

Neuronal morphology and effector gene profile are two critical criteria of neuron diversity classification (Sugino, Clark et al. 2019, Peng, Xie et al. 2021). Also, many previous studies have provided the apparent links between effector genes and neuron morphology (Kristin L. Whitford 2002, Marcette, Chen et al. 2014, Delandre, Amikura et al. 2016, Noblett, Wu et al. 2019, Peng, Xie et al. 2021). Thus, after we found three patterns (“matched”, “convergent”, and “divergent”), we then wondered if similar patterns present between morphological subtypes and TFs. Indeed, we found in the optic tectum that single TFs could mark multiple morphological subtypes, and single morphological subtype could be marked by multiple TFs, which was in agreement with “divergent” and “convergent” patterns, respectively, considering that divergent pattern indicated the link of same TFs to different neuron phenotypes while convergent pattern indicated the link of different TFs to similar neuron phenotypes.

10. Figure3—figure supplement 2D and 2F. Figure3—figure supplement 4B. I could not tell how to read these data. In addition, in all cases, the effector-based distances are small (between 0-0.1) and TF-based distances are large (between 0.1-0.4). I don't see much difference between these populations.

Thanks for the question. We have clarified this issue in the revision (p17-18, line 372-375).

We selected 6 TFs (en2b, foxb1a, zic1, bhlhe22, zbtb18, and irx1a) for further analysis based on two criterions: 1. These TFs are highly expressed and specific to individual tectal glutamatergic clusters based on single-cell RNAseq analysis; 2. Their BAC plasmids could reliably mark particular morphological subclasses (at least in four animals).

Reviewer #2:

I have to say the authors did not address my concerns at all.

My first concern was about the validity of "matched", "convergent" and "divergent" classifications from hierarchical clusterings that are based on effector genes and TF genes in Figures 3 and 5. As an answer, the authors calculated Jaccard similarity indices for clusterings using all genes in Figure 1 and Figure 2. These added analyses were completely unrelated to the clustering of "matched", "convergent" and "divergent." Therefore, I have to say the authors did not address the issue.

Sorry for the confusion.

We first performed the Jaccard similarity index to address the concerns raised by reviewers in the first review regarding the robustness of 68 transcriptome-based clusters.

Then, to verify "matched", "convergent", and "divergent" patterns, we have enlisted the following analyses:

1. Considering that matched, convergent, and divergent patterns are derived from hierarchical clustering or distance measure, which could be sensitive to the tree structure. We have performed population-level statistical analysis in the last revision based on TF-based and effector-based distance measures to re-identify "matched", "convergent", and "divergent" cluster pairs. These re-identified pairs mostly recapitulate those identified using hierarchical clustering (Figure 3B), indicating the robustness of three patterns.

2. Furthermore, we performed the population-level statistical analysis by sub-sampling of genes (80% of total either TF or effector genes) and average statistics over 20 times in this revision to re-identify "matched", "convergent", and "divergent" cluster pairs in Figure 3—figure supplemental 4C-E. These re-identified pairs completely recapitulated those identified using the population-level statistical analysis based on total genes (Figure 3-supplemental 4E), again indicating the robustness of three patterns.

My second concern was about the validity of their claim about the differential expression of RBPs in "divergent cell population". They claim that the RBPs have more divergent expressions in "divergent cell pairs". However, it is hard to interpret the statistics presented in Figure 5E. The indication of the index [(number of genes that are solely divergent in group A) / (number of all genes that are divergent in group A)] is unclear.

Sorry for the confusion. In the revision, we have updated the result in p20, lines 420-427.

The number of RBPs that were specific to divergent pattern (n=52 from 10 pairs) was much more than those specific to convergent pattern (n=10 from 7 pairs, Figure 5—figure supplementary 1E-F). Thus, in terms of the number of pattern-specific RBPs per pair, RBPs showed the preferential expressions in divergent pairs compared to convergent ones (Figure 5D). Note that since there was only one pair of matched pattern, we did not include it in this analysis.

Moreover, this analysis does not indicate any causality between the divergence of RBPs and the divergence of effector genes. In principle, the more heterologous the population is, we would see more divergence of RBPs and effector genes. These factors are not independent of each other. I was expecting to see a more careful statistical approach that takes these considerations into account. Therefore, I have to say the authors did not address the issue.

Thanks for your suggestion. In the revision, we have updated the result in p20-21, lines 433-442.

We have enlisted a new analysis on the probability of effector gene which are the binding sites of pattern-specific RBPs using oRNAment database (http://rnabiology.ircm.qc.ca/oRNAment/) to address a potential causality between the divergence of RBPs and the divergence of effector genes (Benoit Bouvrette, Bovaird et al. 2020). GO analysis showed that pattern-specific RBPs targeted various molecular categories including effector genes, TFs, mRNA processing, metabolism (Figure 5—figure supplement 1H-J, Figure 5-Source Data 4). More importantly, significantly higher proportions of effector genes (in all targeted genes) are bound by divergent pattern-specific RBPs than by other two pattern-specific RBPs (Figure 5E). These results suggested the causality between RBPs and effector gene profiles in divergent pairs.

Unfortunately, due to the low gene coverage of 10x Genome-based scRNAseq and the limited number of targeted genes available in the database, we were unable to directly analyze the causality between the divergence of RBPs and the divergence of effector genes at the resolution of single divergent pairs. It is necessary in the future to directly examine the binding of RBPs and effector marker genes in divergent pairs.

Reviewer #3:

In the revised manuscript, the authors mainly added three additional control analyses that increased the credibility of the conclusion, including:

1) Robustness analysis of the clustering result using Jaccard similarity of subsampled data.

2) Analysis of the matched/convergent/divergent patterns of the glutamatergic/GABAergic neurons based on relaxed criteria that include not only the terminal sister clusters, but also non-sibling clusters with relatively close relationships.

3) Control analysis to show the enrichment of differential RBP expression in divergent patterns.

The control analyses largely addressed my concerns to a large extent, although they made the dramatic contrast between the regulatory logic of neurotransmitter and neuromodulator data weaker. The identified convergent/divergent events are now more convincing and allow for further exploration of the phenotype in the future.

However, I cannot help but noticed many more changes have been made without being pointed out in the rebuttal letter, including (1) exclusion of the data from the cholinergic neurons altogether,

Thanks for the question.

We did remove the data from the cholinergic neurons due to the following reasons:

1. Recently, we have noticed that the newly-generated transgenic line Tg(chata:EGFP) showed some levels of variability in marking cholinergic neurons at the whole-brain scale. Currently, we continue verifying this line and screening for new stable lines.

2. Also, considering that whole-brain scRNAseq data also showed the preferential co-expression of choline and glutamate (Figure 1-Source Data 2), the exclusion of the data from cholinergic neurons does not influence any conclusion in the paper, we then decided to exclude this data set from the revised version to ensure the quality of scRNA data in this Resource paper. Meanwhile, we described the above results in the revision (p10-11, line 218-220).

Sorry for this ignorance in the last response letter and we hope the reviewers can accept this request.

and (2) slightly different analysis result with seemingly identical input data (e.g. comparing Figure 3—figure supplement 2 in the new version vs. Figure 3 in the original manuscript). The authors should explain why such changes were made and ensure there is no selective report of the data.

Sorry for the confusion. In the revision, we have introduced new analysis in p11, lines 240-242.

We did include three new matched sister pairs (30-23, 4-32, 4-22) in the last revision because they were the pairs with the lowest 10% of distance measures using population-level statistical analysis, even though they were not immediate sister clusters at hierarchical termini (Figure 3-figure supplementary 2C and 4A).

However, in this revision, we identify matched pairs in hierarchical clustering by applying more rigorous criteria: 1. They are terminus sister clusters; 2. They were identified as matching modes based on R package “TreeDist”. The result showed that Cluster 9-61 was the only matched pair (Figure 3-figure supplementary 1B and 2B).

Finally, some suggestions/questions to help achieve a better presentation of the data.

1) The current "population-level" analysis is rather just a relaxed definition of the terminal sister clusters, but there are many other metrics to quantify the similarity between the TF-based and effector-based clustering result over the entire dataset. For hierarchical clustering this can be editing distance between the two trees, or more generally, similarity between organization of TF and effector gene distribution across all the clusters. This will not identify specific matched, convergent or divergent patterns, but would be an unbiased measurement of how TF and effector landscape agree with each other.

Thank you for your suggestions. We introduce R package “treeDist” to calculate tree distance in the revision (p11-12, line 241-242 and Materials and methods p41, line 794-801).

In this revision, we first tried “editing distance”. Unfortunately, since there are a total of 36 nodes (for 39 neuronal clusters) in both TF-based and effector gene-based hierarchical clusters, we were unable to run “editing distance” on such big trees successfully. Alternatively, we quantified the similarity between TF-based and effector gene-based hierarchical clusters by tree distance using R package “TreeDist”.

Remarkably, the analysis identified only one matching node (Cluster 9-61) after comparing TF-based and effector gene-based clusters (Figure 3—figure supplementary 1B). This paired cluster (Cluster 9-61) was consistent with the matched pair identified by hierarchical clustering (Figure 3—figure supplementary 2B) and the population-level statistical analysis (Figure 3—figure supplementary 2D).

Moreover, the tree distance between TF-based and effector gene-based clusters was 0.71 (Figure 3—figure supplementary 1B, p11-12, line 241-242). Meanwhile, we quantified the distance of two similar trees that was exemplified by TF-based or effector gene-based clusters with or without subsampling of 80% genes, and the result showed that the tree distances of TF-based or effector gene-based clusters before and after subsampling were only 0.20 (TF-based) and 0.14 (effector gene-based), respectively (Figure 3—figure supplementary 4C-D).

All above results indicated the overall distinction of the landscape of TF-based and effector gene-based clusters.

2) Figure 3: similar color regime was used to represent forebrain/OT/sub OT and matched/convergent/divergent patterns. This is very confusing. I would suggest weakening brain region visualization (e.g. changing it to a less pronounced representation or getting rid of it altogether -see comment #3) and focus on the regulatory logic. It is also recommended to color the pattern names in 3B accordingly.

Thanks for the suggestion. We have updated the color regime in the revision as follows:

1. We removed the brain region visualization and focused on the regulatory logic in Figure 3C-D, and 3F-G.

2. We have colored the pattern names in Figure 3B, 3C-D, and 3F-G and Figure 5B.

3) Figure 3C: The number of matched patterns on the Venn diagram does not match the number shown in Figure 3F. Also, Venn diagram is not the best way to show that the conclusion in sister cluster- and population-statistics are similar. [Minor correction: the number of selection should be Combination C(39, 2) rather than Permutation A(39, 2)].

Thank you for your suggestions.

1. We have updated the Venn diagram in Figure 3B in the revision.

2. We have updated the number of selection using the combination C(39, 2) in Figure 3A in the revision.

4) Throughout the manuscript, the brain region identities are shown next to the clusters in every figure, yet very little conclusion was made about the relationship between transcriptome and brain region identity. The authors are advised to summarize the major findings of data feature related to brain region identities (no correlation can also be a finding), and simplify the color use of brain region when the color does not convey meaningful information.

Thanks for your suggestions. We have discussed the significance of brain region identity in the revision (p24, line 513-527).

Regional identity is an essential factor for classifying brain cells. Our results showed that neurons, qRG, and neuronal progenitors exhibited prominent regional characteristics (Figure 1C). In the hierarchical clustering and population-level statistical analysis, similar pair clusters from the same region or different regions occur in nearly equal probabilities (Figure 3—figure supplement 2A). Because neuronal phenotypes from the same region could exhibit common regional identities, it was surprising to observe that such a higher proportion of sister subclasses with similar transcriptomic phenotypes could arise from two different brain regions. Putting this finding into the context of neuron type evolution raises the possibility that different brain regions independently give rise to similar neurotransmitter phenotypes through different TF programs. Alternatively, as brain regions are functionally diversified during the evolution, these highly similar neuronal clusters of different brain regions possibly derive from ancient building blocks, which become divergent through the evolutionary acquisition of different combinatorial TF codes.

Reference

Benoit Bouvrette, L. P., S. Bovaird, M. Blanchette and E. Lecuyer (2020). "oRNAment: a database of putative RNA binding protein target sites in the transcriptomes of model species." Nucleic Acids Res 48(D1): D166-D173.

Delandre, C., R. Amikura and A. W. Moore (2016). "Microtubule nucleation and organization in dendrites." Cell Cycle 15(13): 1685-1692.

Hodge, R. D., T. E. Bakken, J. A. Miller, K. A. Smith, E. R. Barkan, L. T. Graybuck, J. L. Close, B. Long, N. Johansen, O. Penn, Z. Yao, J. Eggermont, T. Hollt, B. P. Levi, S. I. Shehata, B. Aevermann, A. Beller, D. Bertagnolli, K. Brouner, T. Casper, C. Cobbs, R. Dalley, N. Dee, S. L. Ding, R. G. Ellenbogen, O. Fong, E. Garren, J. Goldy, R. P. Gwinn, D. Hirschstein, C. D. Keene, M. Keshk, A. L. Ko, K. Lathia, A. Mahfouz, Z. Maltzer, M. McGraw, T. N. Nguyen, J. Nyhus, J. G. Ojemann, A. Oldre, S. Parry, S. Reynolds, C. Rimorin, N. V. Shapovalova, S. Somasundaram, A. Szafer, E. R. Thomsen, M. Tieu, G. Quon, R. H. Scheuermann, R. Yuste, S. M. Sunkin, B. Lelieveldt, D. Feng, L. Ng, A. Bernard, M. Hawrylycz, J. W. Phillips, B. Tasic, H. Zeng, A. R. Jones, C. Koch and E. S. Lein (2019). "Conserved cell types with divergent features in human versus mouse cortex." Nature 573(7772): 61-68.

Kristin L. Whitford, V. r. M., Elke Stein, Corey S. Goodman, Marc Tessier-Lavigne, Alain Che´ dotal, and Anirvan Ghosh (2002). "Regulation of Cortical Dendrite Development by Slit-Robo Interactions." Neuron 33: 47-61.

La Manno, G., D. Gyllborg, S. Codeluppi, K. Nishimura, C. Salto, A. Zeisel, L. E. Borm, S. R. W. Stott, E. M. Toledo, J. C. Villaescusa, P. Lonnerberg, J. Ryge, R. A. Barker, E. Arenas and S. Linnarsson (2016). "Molecular Diversity of Midbrain Development in Mouse, Human, and Stem Cells." Cell 167(2): 566-580 e519.

Marcette, J. D., J. J. Chen and M. L. Nonet (2014). "The Caenorhabditis elegans microtubule minus-end binding homolog PTRN-1 stabilizes synapses and neurites." ELife 3: e01637.

Noblett, N., Z. Wu, Z. H. Ding, S. Park, T. Roenspies, S. Flibotte, A. D. Chisholm, Y. Jin and A. Colavita (2019). "DIP-2 suppresses ectopic neurite sprouting and axonal regeneration in mature neurons." J Cell Biol 218(1): 125-133.

Paul, A., M. Crow, R. Raudales, M. He, J. Gillis and Z. J. Huang (2017). "Transcriptional Architecture of Synaptic Communication Delineates GABAergic Neuron Identity." Cell 171(3): 522-539 e520.

Peng, H., P. Xie, L. Liu, X. Kuang, Y. Wang, L. Qu, H. Gong, S. Jiang, A. Li, Z. Ruan, L. Ding, Z. Yao, C. Chen, M. Chen, T. L. Daigle, R. Dalley, Z. Ding, Y. Duan, A. Feiner, P. He, C. Hill, K. E. Hirokawa, G. Hong, L. Huang, S. Kebede, H. C. Kuo, R. Larsen, P. Lesnar, L. Li, Q. Li, X. Li, Y. Li, Y. Li, A. Liu, D. Lu, S. Mok, L. Ng, T. N. Nguyen, Q. Ouyang, J. Pan, E. Shen, Y. Song, S. M. Sunkin, B. Tasic, M. B. Veldman, W. Wakeman, W. Wan, P. Wang, Q. Wang, T. Wang, Y. Wang, F. Xiong, W. Xiong, W. Xu, M. Ye, L. Yin, Y. Yu, J. Yuan, J. Yuan, Z. Yun, S. Zeng, S. Zhang, S. Zhao, Z. Zhao, Z. Zhou, Z. J. Huang, L. Esposito, M. J. Hawrylycz, S. A. Sorensen, X. W. Yang, Y. Zheng, Z. Gu, W. Xie, C. Koch, Q. Luo, J. A. Harris, Y. Wang and H. Zeng (2021). "Morphological diversity of single neurons in molecularly defined cell types." Nature 598(7879): 174-181.

Poulin, J. F., Z. Gaertner, O. A. Moreno-Ramos and R. Awatramani (2020). "Classification of Midbrain Dopamine Neurons Using Single-Cell Gene Expression Profiling Approaches." Trends Neurosci 43(3): 155-169.

Saunders, A., E. Z. Macosko, A. Wysoker, M. Goldman, F. M. Krienen, H. de Rivera, E. Bien, M. Baum, L. Bortolin, S. Wang, A. Goeva, J. Nemesh, N. Kamitaki, S. Brumbaugh, D. Kulp and S. A. McCarroll (2018). "Molecular Diversity and Specializations among the Cells of the Adult Mouse Brain." Cell 174(4): 1015-1030 e1016.

Sugino, K., E. Clark, A. Schulmann, Y. Shima, L. Wang, D. L. Hunt, B. M. Hooks, D. Trankner, J. Chandrashekar, S. Picard, A. L. Lemire, N. Spruston, A. W. Hantman and S. B. Nelson (2019). "Mapping the transcriptional diversity of genetically and anatomically defined cell populations in the mouse brain." ELife 8.

Tiklova, K., A. K. Bjorklund, L. Lahti, A. Fiorenzano, S. Nolbrant, L. Gillberg, N. Volakakis, C. Yokota, M. M. Hilscher, T. Hauling, F. Holmstrom, E. Joodmardi, M. Nilsson, M. Parmar and T. Perlmann (2019). "Single-cell RNA sequencing reveals midbrain dopamine neuron diversity emerging during mouse brain development." Nat Commun 10(1): 581.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

1) Although for neurotransmitter-type, "sister cluster analysis and the population-level statistical analysis " are carried out. However, for neuromodulator-type, only sister cluster analysis was done. It is fair to perform population-level analysis for neuromodulator-type.

Thanks for the suggestion. We have performed the population-level statistical analysis on neuromodulator-type neurons in this revision (line 305-318; Figure 3—figure supplementary 3E-G).

In this revision, we first updated the analysis based on hierarchical sister clusters by applying more rigorous criterions: 1. They are terminus sister clusters; 2. They were identified as matching nodes based on R package “TreeDist”. The analysis showed that 5 out 8 sister neuromodulator clusters with matched pattern (Figure 3—figure supplementary 3D). In the previous analysis, 7 out of 8 sister neuromodulator clusters were identified as matched pattern, but 2 pairs of similar clusters (4/14; 6/21) were not immediate sister clusters at the hierarchical termini (Figure 3—figure supplementary 3D′).

Meanwhile, we performed the population-level statistical analysis using all genes or subsampling of 80% genes to identify similar neuromodulator pair clusters. We identified 9 similar neuromodulator pair clusters, 8 with matched pattern and 1 with convergent pattern (Figure 3—figure supplementary 3F).

Thus, we intersected hierarchical sister cluster analysis and population-level statistical analysis, and identified 6 pairs of similar neuromodulator clusters, 5 with matched pattern and 1 with convergent pattern (Figure 3—figure supplementary 3G-I).

2) Line 265: largely recapitulated. I do not agree with this. Rather, explain why the results were not consistent.

Thanks for the suggestion. We re-write this part and explain the results in the revision (Line 265-279) as followings:

“The population-level analysis identified 19 pairs of effector gene-based similar pair clusters, 5 with matched pattern and 14 with convergent pattern (Figure 3—figure supplement 2D-E). Overall, similar pair clusters with either matched or convergent pattern identified by the population-level analysis showed an overlapping but distinct pattern with those identified by the hierarchical sister cluster analysis (Figure 3B). This discrepancy was likely due to the following facts: 1. In the population-level statistical analysis, we arbitrarily set the lowest threshold as a criterion to identify similar pair clusters. Thus, the levels of this threshold could influence the production of similar pair clusters. 2. In population-level statistical analysis, each cluster could use for multiple times, whereas in hierarchical sister cluster analysis, once a cluster was selected as a pair with another cluster, it could not be re-used again. To overcome this discrepancy, we intersected the results from hierarchical sister cluster analysis and population-level statistical analysis, and identified 8 pairs of effector gene-based glutamatergic/GABAergic similar pair clusters, 1 with matched pattern and 7 with convergent pattern (Figure 3B)”.

3) Line 226: The TF Regulatory Landscape in Whole-brain Neuronal Phenotypes

This section is a bit too long. The section should be separated, for example, for neurotransmitter-type and neuromodulator-type.

Thanks for the suggestion.

We have separated this section into three parts: 1. The TF Regulatory Landscape of Whole-brain Glutamatergic/GABAergic Neuron Clusters (Line 226-227); 2. The TF Regulatory Landscape of Neuromodulatory Neuron Clusters (Line 304); 3. Different Neuron Clusters Exhibit Similar TF Profiles (Line 332).

4) Line 395: results supported "divergent" and "convergent" patterns, because divergent pattern indicated the link of same TFs to different neuron phenotypes, while convergent pattern indicated the link of different TFs to similar neuron phenotypes.

This is too much speculation. The relationship between the morphology and effector gene profiles are unclear. I think it is fair to say that morphology does not directly correlate with TF profiles.

Thanks for the suggestion.

We have updated our statement in the revision as “This observation could be inferred from “convergent pattern” (same TFs were expressed in different effector-based subtypes) and “divergent pattern” (different TFs were expressed in similar effector-based subtypes), respectively. However, we could not exclude unknown indirect regulations of morphological subtypes by TFs” (Line 427-431).

Reviewer #2:

I can say the authors addressed my concerns on statistical analyses after this round of revision, shrugging off unnecessary parts and exposing the most interesting part of this study. In addition, they included a new analysis in Figure 5 which indicates potential causality between the differential expression of RBPs and differential expression of effector genes.

There are still concerns about the validity of classification (lower 10% of gene distance compared to shuffled pairs, how do you justify the threshold?). However, this justification will not be so straightforward because it will be prone to the intrinsic statistics of TF expression, effector gene expression, and the sensitivity of the sequencing technique. It is obvious this is the best authors can do. I advise authors to acknowledge such limitations in the discussion.

Thanks for the suggestion. We have included the discussion on this limitation in the revision in both Result and Materials and methods.

In Results (line 270-279): “In the population-level analysis, we arbitrarily set the lowest threshold as a criterion to identify similar pair clusters. Thus, the levels of this threshold could influence the production of similar pair clusters.”

In Materials and methods (line 828-829): “This analysis is subject to the intrinsic statistics of TF and effector gene expression, as well as the sensitivity of scRNAseq method”.

In addition, the reason why we applied the lowest 10% threshold is this intrinsic difference in TF-based (0.1-0.4) and effector-gene-based (0-0.1) distances between clusters (Figure 3—figure supplementary 2D-E). One possible explanation is due to the difference in the numbers of TF genes (less, n = 283) and effector genes (more, n = 1099). However, considering that subsampling of genes has little influence on identifying matched, convergent, and divergent types (Figure 3-supplemental 2H), it ruled out the contribution of the difference in the number of TFs and effector genes to the difference in distance measures. Thus, this distance difference more likely resulted from the hierarchical regulation of effector gene expression by TFs.

Reviewer #3:

Unfortunately, the authors misunderstood my request for adding Tree Distance analysis. An important claim of the paper was that glutamatergic/GABAergic clusters show different TF and terminal profile patterns ("convergent/divergent"), whereas neuromodulator type clusters predominantly expressed the same TF profiles ("matched"). However, supplementary analysis was only shown for the glutamatergic/GABAergic types but not the modulator types, and the main figure that supports this conclusion (Figure 3E) still uses sister clusters from hierarchical clustering for classification of neuromodulator types, which I found to be an unfair comparison.

Thanks for the great suggestion. We have performed the population-level analysis on neuromodulator-type neurons to strengthen the conclusion in this revision (line 312-318; Figure 3—figure supplementary 3E-G). Meanwhile, we also updated similar neuromodulator pairs using the hierarchical sister cluster analysis by applying more rigorous criterions (line 306-310; Figure 3—figure supplementary 3C-D).

We first updated the analysis based on hierarchical sister clusters by applying more rigorous criterions: 1. They are terminus sister clusters; 2. They were identified as matching nodes based on R package “TreeDist”. The analysis showed that 5 out 8 sister neuromodulator clusters with matched pattern (Figure 3—figure supplementary 3C-D). In the previous analysis, 7 out of 8 sister neuromodulator clusters were identified as matched pattern, but 2 pairs of similar clusters (4/14; 6/21) were not immediate sister clusters at the hierarchical termini (Figure 3—figure supplementary 3D′).

Meanwhile, we performed the population-level statistical analysis using all genes or subsampling of 80% genes to identify similar neuromodulator pairs. We identified 9 similar neuromodulator clusters, 8 with matched pattern and 1 with convergent pattern (Figure 3—figure supplementary 3 E-F).

Thus, we intersected hierarchical sister cluster analysis and population-level statistical analysis, and identified 6 pairs of similar neuromodulator clusters, 5 with matched pattern and 1 with convergent pattern (line 315-318; Figure 3—figure supplementary 3G-I).

Also, the gene subsampling test of the robustness of matched/divergent/convergent pairs was only performed on similar pairs defined by population-statistics. It was not clear to me why the authors still adhered to the "sister cluster" definition (or an INTERSECTION of this strategy with the population-statistics), for all subsequent conclusions.

Thanks for the question.

We have addressed this concern in the revised manuscript (Line 267-279)

“Overall, similar neuron clusters with either matched or convergent pattern identified by the population-level analysis showed an overlapping but distinct pattern with those identified by the hierarchical sister cluster analysis (Figure 3B). This discrepancy was likely due to the following facts: 1. In the population-level analysis, we arbitrarily set the lowest threshold as a criterion to identify similar pair clusters. Thus, the levels of this threshold could influence the production of similar pair clusters. 2. In population-level statistical analysis, each cluster could use for multiple times, whereas in hierarchical sister cluster analysis, once a cluster was selected as a pair with another cluster, it could not be re-used again. To overcome this discrepancy, we intersected the results from hierarchical sister cluster analysis and population-level statistical analysis, and identified 8 pairs of similar glutamatergic/GABAergic clusters, 1 with matched pattern and 7 with convergent pattern (Figure 3B)”.

We think it is better to keep all these analyses in this manuscript to strengthen our conclusions. We also greatly appreciate all reviewers’ suggestions to make the conclusion more solid.

I suggested tree distance as a global measurement of whether the glutamatergic/GABAergic types and modulator types truly have different regulatory logic at the population level, rather only at the terminal sisters.

Thanks for the suggestion.

In the revision, we discussed different regulatory logic at the population level between neuromodulator-type and neurotransmitter-type neurons in Line 323-332:

“we have performed this global tree measurement using the R package “TreeDist”, and found that the distance between TF-based and effector-based hierarchical tree of glutamatergic/GABAergic neurons (Tree distance = 0.71) was higher than that of neuromodulator neurons (Tree distance = 0.38), suggesting distinct TF regulatory logic of effector-based phenotypes between neurotransmitter-type and modulator-type at the global level. The smaller distance for neuromodulator neurons was consistent with our conclusion that neuromodulator type clusters predominantly expressed the same TF profiles ("matched")”.

https://doi.org/10.7554/eLife.68224.sa2

Article and author information

Author details

  1. Hui Zhang

    1. Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
    2. University of Chinese Academy of Sciences, Beijing, China
    3. Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – original draft
    Contributed equally with
    Haifang Wang and Xiaoyu Shen
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-5300-5310
  2. Haifang Wang

    1. Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
    2. Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
    Contribution
    Formal analysis, Methodology
    Contributed equally with
    Hui Zhang and Xiaoyu Shen
    Competing interests
    No competing interests declared
  3. Xiaoyu Shen

    1. Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
    2. Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
    Contribution
    Data curation, Formal analysis, Investigation, Methodology, Validation
    Contributed equally with
    Hui Zhang and Haifang Wang
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-8868-7035
  4. Xinling Jia

    1. Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
    2. Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
    Contribution
    Data curation
    Competing interests
    No competing interests declared
  5. Shuguang Yu

    1. Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
    2. University of Chinese Academy of Sciences, Beijing, China
    3. Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
    Contribution
    Data curation
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-6640-5420
  6. Xiaoying Qiu

    1. Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
    2. Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
    Contribution
    Data curation
    Competing interests
    No competing interests declared
  7. Yufan Wang

    1. Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
    2. University of Chinese Academy of Sciences, Beijing, China
    3. Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
    Contribution
    Data curation
    Competing interests
    No competing interests declared
  8. Jiulin Du

    1. Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
    2. Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
    3. School of Future Technology, University of Chinese Academy of Sciences, Beijing, China
    Contribution
    Funding acquisition, Project administration, Supervision, Writing – review and editing
    For correspondence
    forestdu@ion.ac.cn
    Competing interests
    No competing interests declared
  9. Jun Yan

    1. Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
    2. Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
    3. School of Future Technology, University of Chinese Academy of Sciences, Beijing, China
    Contribution
    Funding acquisition, Project administration, Supervision, Writing – review and editing
    For correspondence
    junyan@ion.ac.cn
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-0405-0502
  10. Jie He

    1. Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
    2. Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
    Contribution
    Conceptualization, Funding acquisition, Project administration, Supervision, Writing – original draft, Writing – review and editing
    For correspondence
    jiehe@ion.ac.cn
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-2539-2616

Funding

Shanghai basic research field Project (18JC1410100)

  • Jiulin Du
  • Jun Yan
  • Jie He

Shanghai Municipal Science and Technology Major Project (2018SHZDZX05)

  • Jiulin Du
  • Jun Yan
  • Jie He

The National Key Research and Development Program of China (2020YFA0112700)

  • Jie He

Strategic Priority Research Program of Chinese Academy of Science (XDB32000000)

  • Jiulin Du
  • Jun Yan
  • Jie He

National Natural Science Foundation of China (3187060071)

  • Jie He

State Key Laboratory of Neuroscience

  • Jiulin Du
  • Jun Yan
  • Jie He

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Dr Muming Poo for the discussion and manuscript editing. We thank facilities of the Institute of Neuroscience: Dr Min Zhang (MZ) and Zhenning Zhou (ZZ) from Molecular and Cellular Biology Core Facility for the assistance of scRNA-seq; FACS facility Haiyan Wu (HW); the optical imaging facility Qian Hu (QH), Yumei Zhang (YZ), and Yonghong Wang (YW); the bioinformatics core facility and the facility of mapping brain-wide mesoscale connectome. This research was funded by grants from the Shanghai basic research field Project (Grant No.18JC1410100), Shanghai Municipal Science and Technology Major Project (Grant No. 2018SHZDZX05), The National Key Research and Development Program of China (Grant No. 2020YFA0112700), Strategic Priority Research Program of Chinese Academy of Science(Grant No. XDB32000000), National Natural Science Foundation of China (grant No. 3187060071), State Key Laboratory of Neuroscience.

Ethics

Animal procedures performed in this study were approved by the Animal Use Committee of Institute of Neuroscience, Chinese Academy of Sciences (NA-045-2019).

Senior Editor

  1. Didier YR Stainier, Max Planck Institute for Heart and Lung Research, Germany

Reviewing Editor

  1. Koichi Kawakami, National Institute of Genetics, Japan

Reviewer

  1. Koichi Kawakami, National Institute of Genetics, Japan

Publication history

  1. Received: March 10, 2021
  2. Accepted: December 5, 2021
  3. Accepted Manuscript published: December 13, 2021 (version 1)
  4. Version of Record published: January 19, 2022 (version 2)

Copyright

© 2021, Zhang et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,323
    Page views
  • 279
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Hui Zhang
  2. Haifang Wang
  3. Xiaoyu Shen
  4. Xinling Jia
  5. Shuguang Yu
  6. Xiaoying Qiu
  7. Yufan Wang
  8. Jiulin Du
  9. Jun Yan
  10. Jie He
(2021)
The landscape of regulatory genes in brain-wide neuronal phenotypes of a vertebrate brain
eLife 10:e68224.
https://doi.org/10.7554/eLife.68224

Further reading

    1. Neuroscience
    2. Stem Cells and Regenerative Medicine
    Xin-Yao Sun et al.
    Research Article Updated

    Brain organoids have been used to recapitulate the processes of brain development and related diseases. However, the lack of vasculatures, which regulate neurogenesis and brain disorders, limits the utility of brain organoids. In this study, we induced vessel and brain organoids, respectively, and then fused two types of organoids together to obtain vascularized brain organoids. The fused brain organoids were engrafted with robust vascular network-like structures and exhibited increased number of neural progenitors, in line with the possibility that vessels regulate neural development. Fusion organoids also contained functional blood–brain barrier-like structures, as well as microglial cells, a specific population of immune cells in the brain. The incorporated microglia responded actively to immune stimuli to the fused brain organoids and showed ability of engulfing synapses. Thus, the fusion organoids established in this study allow modeling interactions between the neuronal and non-neuronal components in vitro, particularly the vasculature and microglia niche.

    1. Neuroscience
    2. Stem Cells and Regenerative Medicine
    Bilal Cakir, In-Hyun Park
    Insight

    Fusing brain organoids with blood vessel organoids leads to the incorporation of non-neural endothelial cells and microglia into the brain organoids.