Single-cell transcriptome profiles of Drosophila fruitless-expressing neurons from both sexes

  1. Colleen M Palmateer
  2. Catherina Artikis
  3. Savannah G Brovero
  4. Benjamin Friedman
  5. Alexis Gresham
  6. Michelle N Arbeitman  Is a corresponding author
  1. Department of Biomedical Sciences, Florida State University, College of Medicine, United States
  2. Program of Neuroscience, Florida State University, United States

Abstract

Drosophila melanogaster reproductive behaviors are orchestrated by fruitless neurons. We performed single-cell RNA-sequencing on pupal neurons that produce sex-specifically spliced fru transcripts, the fru P1-expressing neurons. Uniform Manifold Approximation and Projection (UMAP) with clustering generates an atlas containing 113 clusters. While the male and female neurons overlap in UMAP space, more than half the clusters have sex differences in neuron number, and nearly all clusters display sex-differential expression. Based on an examination of enriched marker genes, we annotate clusters as circadian clock neurons, mushroom body Kenyon cell neurons, neurotransmitter- and/or neuropeptide-producing, and those that express doublesex. Marker gene analyses also show that genes that encode members of the immunoglobulin superfamily of cell adhesion molecules, transcription factors, neuropeptides, neuropeptide receptors, and Wnts have unique patterns of enriched expression across the clusters. In vivo spatial gene expression links to the clusters are examined. A functional analysis of fru P1 circadian neurons shows they have dimorphic roles in activity and period length. Given that most clusters are comprised of male and female neurons indicates that the sexes have fru P1 neurons with common gene expression programs. Sex-specific expression is overlaid on this program, to build the potential for vastly different sex-specific behaviors.

Editor's evaluation

This study presents a valuable single-cell sequencing dataset of fruitless-expressing neurons in the male and female Drosophila nervous system. The quality data and convincing analyses allowed the authors to conclude that most neuronal types are present in both Drosophila sexes, suggesting that sex-specific versions of the transcription factor Fruitless can modify neural function in a sex-specific way without completely altering core neural identity. This work will be of interest to developmental biologists and neuroscientists with a focus on sex-specific differences.

https://doi.org/10.7554/eLife.78511.sa0

Introduction

A current goal of neuroscience research is to understand molecular differences at the single cell/neuron level to better understand the diverse cellular components of the nervous system, with the ultimate goal of understanding how diverse cells work together to generate behavior (Ngai, 2022). Drosophila is an excellent model for this approach, given there are defined and experimentally tractable sets of neurons that generate the potential for complex behaviors. Indeed, a Drosophila single-cell RNA-seq (scRNA-seq) atlas has been generated to understand neurons underlying adult circadian biology, with the characterization of the adult core circadian clock neurons (Ma et al., 2021). In addition, there are atlases that have been generated to understand the cellular components of the brain and ventral nerve cord (VNC) during development and adult stages, using single-cell, genome-wide approaches (Allen et al., 2020; Croset et al., 2018; Davie et al., 2018; Konstantinides et al., 2018; Kurmangaliyev et al., 2020; Li et al., 2017; Li et al., 2022; McLaughlin et al., 2021; Özel et al., 2021; Simon and Konstantinides, 2021; Xie et al., 2021). Here, we present a scRNA-seq study to understand how the potential for sexually dimorphic adult reproductive behaviors are specified in the developing nervous system in males and females. We gain insight into how neurons that arise from a sex-shared developmental trajectory can underlie vastly different behaviors in males and females (Ren et al., 2016).

In Drosophila, the neuronal substrates that direct reproductive behaviors are specified by the sex-specific transcription factors (TFs) encoded by fruitless (fru; fru P1 transcript isoforms) and doublesex (dsx) (Figure 1A), produced as an outcome of alternative pre-mRNA splicing by the sex determination hierarchy (reviewed in Andrew et al., 2019; Cline and Meyer, 1996). Sex-specific splicing at the 5′ end of fru transcripts produced under control of the most distal promoter (P1 promoter; fru P1 transcripts) results in the production of male-specific (FruM) isoforms. FruM isoforms have an additional 101 amino acid region on the amino terminus that is not present in common Fru isoforms. In females, fru P1 transcripts encode for a short peptide that is predicted to be nonfunctional. fru transcript isoforms are also alternatively spliced at the 3′ end, resulting in products with different DNA-binding domains (Gramates et al., 2022; Ito et al., 1996; Ryner et al., 1996). The identification of master regulatory TFs has provided an unprecedented molecular inroad into a behavioral question, allowing for high-resolution genomic and genetic interrogation, microscopic visualization, and the physiological manipulation of neurons directing behavior. dsx and fru P1-expressing neurons (fru P1 neurons hereafter) are present in both sexes and each set arises from sex-shared developmental lineages (Ito et al., 1996; Lee et al., 2002; Manoli et al., 2005; Ren et al., 2016; Robinett et al., 2010; Ryner et al., 1996; Sanders and Arbeitman, 2008; Stockinger et al., 2005). However, these neurons direct dramatically different innate behaviors in the sexes due to differences in morphology, connectivity, physiology, and number. Males display an intricate courtship display that includes chasing the female, tapping her with his leg, and singing a song by wing vibration. The female will either accept or reject the male’s courtship advances. Once the female has mated, she shows a broad range of post-mating changes including changes in her receptivity to subsequent courtship displays (reviewed in Anholt et al., 2020; Auer and Benton, 2016; Dauwalder, 2011; Greenspan and Ferveur, 2000; Manoli et al., 2006; Peng et al., 2021; Villella and Hall, 2008; Yamamoto et al., 2014).

Figure 1 with 8 supplements see all
scRNA-sequencing of fru P1 neurons from males and females at 48-hr after puparium formation (APF).

(A) The Drosophila sex determination hierarchy is an alternative pre-mRNA splicing cascade that generates sex-specific transcription factors that regulate somatic sexual differentiation. The pre-mRNA splicing regulators are encoded by sxl, tra, and tra-2. Early production of Sxl in females results in continued production of functional Sxl and Tra in females. Tra and Tra-2 regulate the sex-specific splicing of doublesex (dsx) and fruitless transcripts from the P1 promoter (fru P1). In females, female-specific Dsx (DsxF) is produced. In males, male-specific (DsxM) and male-specific Fru (FruM) are produced due to default pre-mRNA splicing. Sxl also regulates dosage compensation (DC) (reviewed in Andrew et al., 2019; Cline and Meyer, 1996). (B) Confocal maximum projections of 48-hr APF male (top) and female (bottom) brains and ventral nerve cords (VNCs) expressing membrane-bound GFP (mcd8::GFP) in fru P1-expressing neurons (green). Male tissues are immunostained for male-specific FruM (magenta). Magnification of the boxed regions in male brain and VNC are below the male tissues (×40). Scale bars = 50 μm. (C) Uniform Manifold Approximation and Projection (UMAP) plot of 7988 fru P1 cells, from male central nervous system (CNS) at 48-hr APF, grouped into 77 clusters (male data analysis). (D) UMAP plot of 17,530 fru P1 cells, from female CNS at 48-hr APF, grouped into 88 clusters (female data analysis). (E) UMAP plot of 25,518 total fru P1 neurons, from both sexes (full data set), grouped into 113 clusters. For all UMAP plots, the annotations shown were determined using marker gene expression (Source data 3). Clusters with numbers have additional annotations listed in Source data 3.

Here, we focus on fru P1 neurons that are found in both males and females. Earlier studies of fru P1 neurons showed that there are not large sex differences in their number or cell body positions (Manoli et al., 2005; Stockinger et al., 2005). fru P1 was initially thought to be important for male courtship based on analyses of different fru P1 mutant allele combinations (Anand et al., 2001; Gailey and Hall, 1988; Ryner et al., 1996; Villella et al., 1997). Additional studies showed that fru P1 neurons are both necessary and largely sufficient for male courtship behaviors, based on experiments where fru P1 neurons were either activated or silenced and behavior was examined, and by studies where FruM was produced in females in fru P1 neurons (Clyne and Miesenböck, 2008; Demir and Dickson, 2005; Manoli et al., 2005; Stockinger et al., 2005). Female receptivity was later shown to be directed by fru P1 neurons using neuronal silencing approaches (Kvitsiani and Dickson, 2006). fru P1 neurons make up ~2–5% of all neurons, in both sexes, with expression found in the periphery, brain, and VNC (Lee et al., 2000; Manoli et al., 2005; Stockinger et al., 2005). fru P1 neurons in peripheral structures are important for detecting con-specific mates. In the central nervous system (CNS), fru P1 is in regions important for higher-order processing of experience/sensation and regions involved in motor outputs for reproductive behaviors (reviewed in Auer and Benton, 2016).

The spatial positions of fru P1 neurons in the brain and VNC have been assigned to named spatial clusters, with the idea that neurons in close proximity may share functions (Billeter and Goodwin, 2004; Lee et al., 2000; Manoli et al., 2005; Stockinger et al., 2005). However, it is not clear if all neurons within a spatial cluster are functionally and/or molecularly similar, which we address here. Furthermore, while the position and number of neurons are similar between the sexes, there is dimorphism in cell number within several spatial clusters, male-specific clusters, sex differences in neuronal projections, and differences in their physiology (reviewed in Auer and Benton, 2016; Billeter and Goodwin, 2004; Cachero et al., 2010; Clowney et al., 2015; Datta et al., 2008; Kallman et al., 2015; Kimura, 2008; Kimura et al., 2005; Ruta et al., 2010; Yu et al., 2010). Sex differences in fru P1 neurons begin to be established during metamorphosis, coinciding with when FruM is at its peak expression in males, during the mid-pupal phase (Arthur et al., 1998; Belote and Baker, 1987; Chen et al., 2021; Kimura et al., 2005; Lee et al., 2000; Stockinger et al., 2005). Therefore, a major remaining question is how does fru establish these sex differences and ultimately the potential for sex-specific behavior during development. This scRNA-seq study addresses what are the different molecular classes of fru P1 neurons that arise during development in each sex.

Work from our laboratory and others have examined gene expression in fru P1 neurons, FruM regulated expression, FruM binding, chromatin modifications in fru P1 neurons, and open chromatin regions in fru P1 neurons (Brovkina et al., 2021; Dalton et al., 2013; Goldman and Arbeitman, 2007; Neville et al., 2014; Newell et al., 2016; Palmateer et al., 2021; Vernes, 2014). Among these studies, only two have evaluated the mid-pupal stage (Neville et al., 2014; Palmateer et al., 2021). However, these studies used bulk sequencing approaches, thereby lacking the resolution to examine differences in gene expression across individual neurons. To determine the gene expression signatures for individual fru P1 cells, we performed single-cell RNA-seq during a critical developmental stage for establishing the potential for sex-specific behavior. This produced three cell atlases containing: male fru P1 neurons, female fru P1 neurons, and a full data set of male and female fru P1 neurons co-analyzed. We focus our analyses on the full data set of 25,518 neurons from both males and females that formed 113 molecular clusters after dimensionality reduction using Uniform Manifold Approximation and Projection (UMAP) and clustering. We examine sex differences in gene expression and neuron number within these clusters. We also manually annotated several clusters using the enriched gene expression signatures. We annotate clusters based on gene expression consistent with producing fast-acting neurotransmitters, neuropeptides, receptors, TFs, and several previously characterized fru P1 neuronal populations. We compare our cluster annotations to those in recent Drosophila neuronal scRNA-seq studies, by examining marker gene overlap. Additionally, we examine the in vivo spatial patterns of overlapping expression with fru P1 and several marker genes to provide anatomical links to our analyses. Our identification of a population of neurons that express genes that regulate circadian behavior led to our discovery of a set of fru P1 neurons that regulate activity and period length in a sex-specific manner. Altogether, our data sets show the molecular and functional heterogeneity of fru P1 neurons, revealing the diversity of neurons required to build reproductive behaviors in males and females. While the data reveal sex differences in gene expression within molecularly defined clusters, the observation that nearly all clusters are comprised of male and female neurons indicates that male and female fru P1 neurons also share common gene expression repertoires, with sex-specific information overlaid on these core patterns.

Results and discussion

scRNA-seq of fru P1-expressing cells in the pupal CNS reveals cell type diversity

To gain insight into the molecular and functional differences across fru P1-expressing neurons (fru P1 neurons) in males and females, we performed scRNA-seq, using the 10× Genomics platform. This approach allows us to examine the transcriptomes of individual fru P1 neurons. To enrich for fru P1 neurons, fluorescence-activated cell sorting (FACS) was performed on dissociated neurons from pupae CNS tissue, at 48 hr after puparium formation (APF), that expressed membrane-bound GFP in fru P1 neurons (fru P1>mCD8::GFP) (Manoli et al., 2005; Figure 1B). This is the pupal stage where FruM is at peak expression in males, based on immunostaining, and a critical period for when fru P1 establishes the potential for behavior (Arthur et al., 1998; Belote and Baker, 1987; Chen et al., 2021; Lee et al., 2000). After 10× library construction, Illumina sequencing, read mapping, and processing with the 10× Genomics CellRanger pipeline, we obtained gene-barcode matrices for further analyses.

These matrices were filtered to retain high-quality cells: cells were removed that had >5% mitochondrial transcripts (dying and/or stressed cells),<200 genes detected (empty droplets), those expressing more than 4000 genes (nfeatures), and/or possessing more than 20,000 unique molecular identifiers (UMIs; a metric for transcript counts) (potential multiplets) (Figure 1—figure supplement 1). After filtering, there were 7988 male cells and 17,530 female cells in the pooled replicates (Figure 1—figure supplement 1). The pooled replicates from males have 1653 genes and 4712 UMIs per cell (median values). The pooled replicates from females have 1729 genes and 5186 UMIs per cell (median values) (Source data 1). The combined filtered data from both sexes contain 25,518 fru P1-expressing cells (hereafter called the full data set). In the full data set, the cells have a median of 1706 genes and 5046 UMIs detected per cell, which is on par with or better coverage than other single-cell atlas studies that examine the Drosophila CNS (Allen et al., 2020; Avalos et al., 2019; Croset et al., 2018; Davie et al., 2018; Li et al., 2021; Source data 1). As expected, given that fru P1 is expressed in neurons, 99% of the cells in the full data set are classified as neurons (25,509 cells). This is based on detecting expression of at least one of three genes with known neuronal expression (embryonic lethal abnormal vision, elav; neuronal Synaptobrevin, nSyb; long non-coding RNA, noe; Source data 2; DiAntonio et al., 1993; Kim et al., 1998; Robinow et al., 1988). Given this, the cells in the scRNA-seq data set will be called neurons, hereafter. There is minimal cell-cycle impacts on gene expression in neurons in this study, given that during the pupal stage we examined, the majority of neurons are post-mitotic, with only four neuroblasts continuing to divide into the late pupal stages (Ito and Hotta, 1992). Furthermore, we observe that a large percentage of neurons express the post-mitotic markers genes elav (77%) and nSyb (93%) (Source data 2). We also searched the data set for expression of string (stg) that encodes a tyrosine protein phosphatase required for cell-cycle progression, with elevated expression in mitotic neuroblasts. stg expression was only found in 0.3% of neurons in our data set (Source data 2), which also indicates that nearly all the neurons examined are post-mitotic.

We next scaled and normalized the expression data from the full data set, and also the male and female data sets separately, using a Seurat workflow (Stuart et al., 2019). To generate UMAP plots for visualization and analysis, we selected the number of significant principle components from the expression data, based on Jackstraw analyses (Stuart et al., 2019) (p < 0.05 for PCs; Figure 1—figure supplement 2). The male and female replicates show large overlap in UMAP space, indicating that there are minimal batch effects between replicates within sex (Figure 1—figure supplement 1G–I). To identify neurons that exhibit similarities in their gene expression profiles, we performed graph-based clustering analysis on each data set. To examine gene expression differences between the clusters in each data set, we identified marker genes, using the Seurat ‘FindAllMarkers’ function and the Wilcox rank sum test (min.pct = 0.25, logfc.threshold = 0.25; which resulted in all called marker genes having p < 0.05). Marker genes are those that have significantly higher expression in one cluster compared to all other clusters within each analysis (Source data 3).

Here, we present an in-depth examination of the UMAP and clustering from the full data set and provide links to the sex-specific data sets, in some sections. The male data set has 77 clusters (Figure 1C, Figure 1—figure supplement 3A), the female data set has 88 clusters (Figure 1D, Figure 1—figure supplement 3C), and the full data set has 113 clusters (Figure 1E, Figure 1—figure supplement 4). We observed a high correlation in the number of neurons in each cluster between the replicates from each sex (male replicates r = 0.93; female replicates r = 0.88) (Source data 2), indicating small technical variation in the number of neurons per cluster. Given that fru P1 is expressed in ~2000 neurons (Lee et al., 2000), it is possible we are examining the full repertoire of fru P1 neurons (~3.9× coverage for male cells and ~8.8× coverage for female cells), assuming there are not biases in the ability to capture different fru P1 neurons in the experimental workflow. In addition to the full data set analysis on the merged male and female data, we also performed an analysis on integrated male and female data, using the Seurat integration workflow. These results were highly concordant in terms of overlap (Figure 1—figure supplement 5C, D, Source data 2), providing evidence that the overall conclusions are independent of this step in the data processing (see methods for additional comparative details). We also include the integrated analysis for comparison (Figure 1—figure supplement 5, Source data 2 and 3).

Next, we evaluate the molecular differences across the clusters by examining marker genes. There is a range in the number of marker genes for each cluster: 48–501 in the male data set, 74–555 in the female data set, and 61–610 in the full data set (Source data 3). In total, we find 3582 marker genes in the male data set, 3203 in the female data set, and 3662 in the full data set (Source data 3). Gene expression heatmaps show the top five marker genes, based on log fold-change, for each cluster (Figure 1—figure supplement 4C–E). The robust expression differences of the marker genes across clusters further confirm that the clusters are different at the molecular level and that the resolution chosen for the clusters results in biologically meaningful separations. To understand the functions of neurons in the different clusters, we provide detailed annotation of the clusters, with a primary focus on the full data set (113 clusters, Figure 1E; Source data 3). To gain insight into which genes to focus on for annotation, we determined which gene functional groups are enriched in the marker genes lists, by performing gene ontology (GO) analyses. An examination of the top 10 overrepresented terms for three GO categories: ‘molecular function’, ‘biological process’, and ‘cellular component’ (Figure 1—figure supplement 7A–C, Source data 4), shows that they are shared across the three data set analyses (Figure 1—figure supplement 7A–C), and are characteristic of developing neurons. In addition, we also performed protein domain and pathway enrichment analyses for these marker gene lists and find that the top enriched terms are also shared across the three data sets (Source data 4). We annotate each cluster by indicating the distribution of the marker genes from some of the enriched categories (GO, protein domain, and pathways) and from a curated selection of genes that are biologically relevant (Source data 3). Additional annotation of the clusters is based on published expression data, GO enrichments of marker genes from each cluster, and overlap with other single-cell studies in the CNS (Source data 3; Allen et al., 2020; Avalos et al., 2019; Croset et al., 2018; Davie et al., 2018; Ma et al., 2021). We provide a name for 46 clusters and information at the level of marker gene expression for the remaining clusters.

Examination of fru P1 scRNA-seq clusters with sex differences in neuron number

An examination of the full data set UMAP plot shows separation of male and female neurons within each cluster, with some clusters having more neurons from one sex (Figure 2A, Figure 1—figure supplement 1I). This separation of the male and female cells is not due to the highly expressed male-specific long non-coding RNA on the X 1 and/or 2 (roX1 and roX2), based on a full Seurat analysis performed when these genes are removed (Figure 1—figure supplement 6I, J). We note that some of the separation may be due to developmental differences between males and females, given that females have a shorter pupal phase than males (Bakker and Nelissen, 1963). The results indicate that male and female neurons have shared gene expression, with some expression differences that drive the separation in UMAP space. This is consistent with our previous studies that showed male and female fru P1 neurons have shared gene expression profiles, using a cell-type-specific RNA-sequencing approach called Translating Ribosome Affinity Profiling (TRAP) (Newell et al., 2016; Palmateer et al., 2021; Thomas et al., 2012). When we examine the overlap in marker genes from the three scRNA-seq data sets with gene lists from our previous TRAP study that examined gene expression enriched in fru P1 neurons at 48-hr APF, we find significant overlap between the gene lists (Source data 3; Palmateer et al., 2021), providing further validation of the scRNA-seq approach. The idea that fru P1 neurons from males and females would share gene expression repertoires also fits well with the observation that homologously positioned fru P1 neurons in males and females arise from a shared developmental trajectory (Ren et al., 2016), with cell bodies that reside in similar anatomical positions (Manoli et al., 2005; Stockinger et al., 2005). Additionally, studies using genetic intersectional strategies to examine subpopulations of fru P1 neurons find homologously positioned neurons in both sexes, across a large range of molecular expression tools (e.g., see Brovero et al., 2021; Cachero et al., 2010; Palmateer et al., 2021; Yu et al., 2010).

Figure 2 with 2 supplements see all
Sex differences in single-cell clustering.

(A–B) Uniform Manifold Approximation and Projection (UMAP) plots of the 25,518 fru P1 neurons from males and females (full data set) with all cells labeled by sex (A) and by sex bias in cell number (B) (Source data 5). (B) Sex-biased classification is based on normalized number of female cells (legend on right, Source data 5). Cluster numbers and black arrows indicate female-specific clusters (red) and the male-specific clusters (blue). Female- (light pink) and male biased (light blue) have two-fold more neurons compared to the other sex. Female- (purple) and male-biased, strong (dark blue) have four-fold more neurons compared to the other sex. (C) Dot plot of top 5 marker genes in each sex-specific cluster based on log fold-change in expression (Source data 3). (D) Brain and ventral nerve cord (VNC) confocal maximum projections from 0- to 24-hr adults for CCHa2fru P1-expressing neurons, with intersecting expression shown in green for males (left) and females (right). Scale bars = 50 μm. (E, F) Subclustering analyses of neurons from male-specific and male-biased clusters that express dsx defines five subclusters (clusters are indicated by color and number). (F) The five subclusters are shown with sex indicated by color, with male cells in blue and female cells in pink. Female cells are present only in subclusters 0 and 3. (G) Dot plot of top 5 marker genes in each dsx subcluster based on log fold-change in expression (Source data 3). (H) Dot plot of Hox genes with known VNC expression in dsx subcluster 0, for both sexes. For all dot plot (C, G, H) panels the dot size indicates the percentage of neurons expressing each gene per cluster (Percent Expressed). Average normalized expression level is shown by color intensity (Average Expression).

We next evaluated the clusters that have differences in neuron number between the sexes in our full data set. Given there are 2.19× more female neurons in the full data set we did several tests to determine the impact of this on clustering. After normalizing the number of neurons in each female cluster by dividing by 2.19×, we find that 52% of the clusters show a sex bias in their neuron numbers. 24% show a >twofold difference (sex bias), 28% show >four-fold difference (strong sex bias), and 3.5% of the clusters are sex specific (Figure 2B, Source data 5). All references to sex-biased clusters in the full data set analysis are on normalized data values (Source data 5). The workflow to obtain fru P1 neurons may have biases in the neurons captured, though our data comparing replicates indicate the purification protocol is highly reproducible given the correlation of cell numbers per cluster between the two replicates (Source data 2).

We next determined if this difference is maintained when equal numbers of neurons between the sexes are present in the UMAP, by removing random subsets of female neurons to equal the number of male neurons, which was performed three independent times. We find that six clusters change their status after random removal across some of the independent cell removal analyses (Figure 2—figure supplement 1A, Source data 5). Further, we repeated the full Seurat workflow by first removing female cells at random (downsampling) and generating new UMAP plots and clustering analysis for three independent downsampled analyses, with equal numbers of male and female cells. To examine the similarity of gene expression between the clusters, we use Pearson correlations in clustifyr (Fu et al., 2020). We match 80–85% of the clusters in the downsampled analyses to those found in the full data set, and of those there is high concordance of sex-specific and sex-biased clusters (Figure 2—figure supplement 1B, Source data 5). However, fewer female-specific clusters are present in the downsampled analyses and three additional male-specific clusters are resolved (Figure 2—figure supplement 1B, Source data 5). Though differences are revealed in the analyses when female neurons are randomly removed to equal the number of male cells, most sex-biased clusters are preserved. Taken together, we proceed with the analysis of the clusters in the full data set, as the additional female cells provide information.

Sex-specific clusters

In the full data set there are four sex-specific clusters (three female-specific and one male-specific), with two maintained after the downsampling analysis (female-specific cluster 107 and male-specific cluster 68) (Figure 2B, arrows). One of the top marker genes for the female-specific cluster 107 is CCHamide-2 (CCHa2), which encodes a neuropeptide hormone that stimulates feeding (Ren et al., 2015). Cluster 107 neurons may be from the VNC, given that cluster 107 marker genes include Hox segment identity genes Antennapedia (Antp) and Ultrabithorax (Ubx). Antp and Ubx define VNC T1–T3 and T2–T3 thoracic segments, respectively (Baek et al., 2013; Jarvis et al., 2012). To determine if there are female-specific CCHa2 neurons that also express fru P1 (CCHa2fru P1) in the VNC, we used a genetic intersectional approach to visualize CCHa2fru P1-expressing neurons. The approach relies on the two-component Gal4/UAS system that drives expression of a Myc-tagged membrane-bound reporter protein (Nern et al., 2015). Expression of the UAS-reporter requires Flippase (FLP)-mediated removal of a stop cassette. The expression of FLP is directed by fru P1 regulatory elements (Yu et al., 2010). Therefore, reporter gene expression is limited to fru P1 neurons that also express Gal4. Throughout this study, we examine expression at 48-hr APF and 0- to 24-hr adults, given that the intersectional approach might be slow to report on expression given the requirement of a genomic, FLP-mediated, recombination event and production of several proteins. Additionally, fru P1 expression is most thoroughly characterized in adults (Billeter and Goodwin, 2004; Cachero et al., 2010; Kimura et al., 2005; Lee et al., 2000; Manoli et al., 2005; Stockinger et al., 2005; Yu et al., 2010). CCHa2 is also a marker gene for additional clusters, so we expect the expression pattern to include neurons not in cluster 107. There is no detectable reporter gene expression at 48-hr APF. In 1-day-old adults, we find female-specific CCHa2fru P1-expressing neurons in both the brain and VNC, and CCHa2fru P1-expressing neurons in the median bundle in both sexes (Figure 2D). There are no male CCHa2fru P1-expressing neurons in the VNC. This observation indicates that there are female-specific CCHa2fru P1 neurons in the VNC, as suggested by the cluster analysis.

The top marker gene for the male-specific cluster 68 is the sex hierarchy gene dsx. This is consistent with the observation that there are male-specific populations of dsxfru P1 neurons in the brain (pC1, pC2, SN, and SLN) and VNC (TN1 and abdominal ganglion) (Billeter et al., 2006; Ishii et al., 2020; Lee et al., 2002; Rideout et al., 2007; Sanders and Arbeitman, 2008). Given that cluster 68 does not express Hox segment identity genes that are expressed in the VNC, or other genes with known VNC expression, these neurons are likely from the brain (Figure 2—figure supplement 2B). There are two additional clusters in the full data set that have dsx as a marker gene: one strongly male-biased (4× more male cells after normalization) and one male-biased cluster (2× more male cells after normalization) (Figure 2—figure supplement 2C, Source data 3 and 4). To analyze these clusters more deeply, we subclustered these neurons, selecting for those with detectable dsx expression and identified marker genes for each subcluster (Figure 2B, Figure 2—figure supplement 2C). This reanalysis resolved five subclusters, retaining the original male-biased cluster as one cluster (subcluster 0), and generating two clusters from the original male-specific cluster (subclusters 2 and 4) and two clusters from the original strongly male-biased cluster (subclusters 1 and 3; Figure 2E, F, Figure 2—figure supplement 2D).

We propose that the male-biased subcluster 0 contains neurons from the VNC, given these neurons also express Antp, Ubx, abdominal A (abd-A), and Abdominal B (Abd-B) (Figure 2E, F, Figure 2—figure supplement 2E–H, Source data 3), and the most significant Berkeley Drosophila Genome Project (BDGP) enrichment term for the marker genes is ‘ventral nerve cord’, based on analysis in Flymine portal (Lyne et al., 2007). In the VNC, both males and females have dsxfru P1 neurons in the abdominal ganglion, and there are male-specific dsxfru P1 neurons in the metathoracic ganglion called TN1 and TN2 neurons (Billeter et al., 2006; Lee et al., 2002; Rideout et al., 2007; Sanders and Arbeitman, 2008). When we examine the expression of Hox genes in subcluster 0, many of the male and female neurons show expression of abd-A and Abd-B (Figure 2H). Given this, we propose these neurons are the dsxfru P1 neurons in the abdominal ganglion. There are also neurons in subcluster 0 only in males, with expression of Antp and Ubx (Figure 2H) that we propose are the male-specific metathoracic TN1 or TN2 neurons. Consistent with this, we previously showed TNI and TN2 neurons underwent cell death in females by 48-hr APF (Sanders and Arbeitman, 2008).

We propose that the strongly male-biased and male-specific dsx and fru P1-expressing neuron clusters contain neurons from the brain, based on expression of Hox genes (Figure 2—figure supplement 2E–H). Previous studies have shown dsxfru P1 neurons in the brain in PC1 and PC2 regions, as well as small sets of neurons called SN and SLN (Lee et al., 2002; Rideout et al., 2007; Sanders and Arbeitman, 2008). The strongly male-biased cluster split into two subclusters, one that has two female neurons (subcluster 3), and the other is now male specific (subcluster 1). The male-specific cluster (Figure 2—figure supplement 2C) split into subclusters 2 and 4 that are positioned closely together in the UMAP space (Figure 2—figure supplement 2D). A recent study using genetic intersectional approaches found 1–2 pC1/pC2 female dsxfru P1 neurons (Ishii et al., 2020), whereas we did not detect any (Sanders and Arbeitman, 2008), leaving open that possibility that neurons in subcluster 3 are from either pC1 or pC2 brain regions, given the detection of female cells. Also, it has been proposed that male pC2 dsx-expressing neurons are comprised of distinct populations, based on fasciculation patterns from either medial and lateral positions (pC2 renamed pC2m and pC2l) (Robinett et al., 2010). Thus, subclusters 1, 2, 3, and/or 4 may contain neurons that are from the pC1 or pC2, SN, SLN dsx-expressing regions in brain, given there are no known molecular markers that would distinguish these populations and there are predicted to be several distinct populations based on morphological and spatial features. An examination of marker genes unique to each subclusters suggests there may be functional differences (pathway enrichment, Benjamini–Hochberg p < 0.1; Source data 6).

Hox gene expression identifies fru P1 neurons from the VNC

To annotate clusters from the VNC, we examined expression of Hox genes Antp, Ubx, abd-A, and Abd-B that are known to be expressed in an anterior to posterior pattern in the VNC (Baek et al., 2013). These genes show restricted expression in a subset of clusters (Figure 3A, Figure 3—figure supplement 1A–H). Additionally, we find several have more than one of these Hox genes as a marker gene (Source data 3). For example, male data set cluster 1 has all four genes as markers, which indicates that neurons from several thoracic and abdominal VNC segments may be present in some clusters. This also indicates that neurons from different segments may have shared gene expression profiles. Therefore, any cluster with one or multiple Hox marker genes was annotated as a ‘VNC’ cluster (Source data 3), resulting in seven VNC clusters in both the male and female data set, and 11 in the full data set (Figure 1D, F, G, Figure 3B, Figure 3—figure supplement 1I–J, Source data 3). In addition, we performed a correlation analysis to examine co-expression of Antp, Ubx, abd-A, and Abd-B within single neurons. All the data sets have significant positive correlations for expression of these genes, with the highest correlations found between abd-A and Abd-B, consistent with these genes having the strongest overlap in their spatial expression (Baek et al., 2013; Figure 3C, Figure 3—figure supplement 1K, L). This finding differs from scRNA-seq on the adult VNC, which showed significant anti-correlations in expression between all pairings, with the exception of abd-A and Abd-B (Allen et al., 2020). This is consistent with the observation that during development there is a refinement of Hox gene expression that limits co-expression in VNC neurons in adults (Allen et al., 2020; Baek et al., 2013).

Figure 3 with 1 supplement see all
Ventral nerve cord (VNC) cluster annotations based on Hox gene expression.

(A) Gene expression feature plots showing neurons that express Hox genes with known expression in the VNC (Antp, Ubx, abd-A, and Abd-B), in the full data set Uniform Manifold Approximation and Projection (UMAP). The gene expression feature plots show gene expression levels in purple in the UMAP, with color intensity proportional to the log-normalized expression levels. (B) UMAP plot showing annotated VNC clusters in the full data set analysis. The VNC clusters are numbered and highlighted in green (Source data 3). (C) Heatmap of correlation of Hox gene expression in single neurons in the full data set. Pearson’s r values denoted on heatmap with color, according to legend (right).

Identification of fru P1 mushroom body Kenyon cell neuron populations

The mushroom body is a structure in the brain that is critical for Drosophila learning and memory (reviewed in Davis, 1993; Heisenberg, 2003). The mushroom body is comprised of intrinsic neurons called the Kenyon cells (KCs), as well as extrinsic input and output neurons (Aso et al., 2014; Li et al., 2020). fru P1 is expressed in the mushroom body γ and αβ KCs, whereas there are no reports of expression in the α′β′ KCs. Furthermore, previous work has demonstrated that when fru P1 function is reduced by RNAi in γ or αβ KCs there are deficits in courtship learning (Manoli et al., 2005). To gain insight into fru P1 MB KCs, we annotated clusters that are KC subtypes, based on marker gene analysis, using criteria similar to a previous scRNA-seq study of the Drosophila midbrain (Croset et al., 2018; Figure 4A, Source data 3). The marker gene criteria are schematically summarized (Figure 4—figure supplement 1A). Clusters that contain KCs can be identified based on having eyeless (ey) and/or Dopamine 1-like receptor 2 (Dop1R2, also known as damb) as marker genes (Han et al., 1996; Kurusu et al., 2000; Source data 3). To identify clusters that contain αβ KCs subtypes, we additionally required that short neuropeptide precursor F (sNPF), and/or Fasciclin 2 (Fas2) are marker genes (Cheng et al., 2001; Crittenden et al., 1998; Johard et al., 2008). This identified three in the full data set analyses (Source data 3). The clusters that contain γ KCs were identified as those that have trio as a marker gene, in addition to the marker genes described for identifying αβ KCs. trio encodes a Rho guanine nucleotide exchange factor that has high expression restricted to the γ KCs, at 48-hr APF (Awasaki et al., 2000). This allowed us to confidently annotate two γ KC clusters in each data set analysis (Source data 3). We also annotate potential γ KCs and αβ KC clusters based on having marker gene combinations consistent with containing KCs (indicated by *; Source data 3). For example, we find one potential γ KC cluster in the female data set analysis that has only Dop1R2 and trio marker genes. We also find several potential αβ KC clusters where sNPF and Fas2 are marker genes, and there is expression of ey and/or Dop1R2, but these are not marker genes (see methods, Figure 4—figure supplement 1). We could not confidently identify clusters containing fru P1-expressing α′β′ KCs, if they exist, as there are no known marker gene patterns that would distinguish this population at 48-hr APF.

Figure 4 with 1 supplement see all
Annotation of mushroom body Kenyon cells (KCs).

(A) Uniform Manifold Approximation and Projection (UMAP) plot highlighting KC cluster annotations in the full data set analysis (Source data 3). N (gray) indicates clusters not meeting marker gene criteria used to annotate KC clusters (see methods and Figure 4—figure supplement 1A). (B) Subclustering analysis of neurons from mushroom body KC clusters (colored clusters from panel A), creating 13 subclusters. (C) The 13 subclusters are shown with sex indicated by color, with male cells in blue and female cells in pink. (D) Dot plot of top 5 marker genes in each mushroom body KC subcluster based on log fold-change in expression (Source data 3). Dot size indicates the percentage of neurons expressing each gene per cluster (Percent Expressed). Average normalized expression level is shown by color intensity (Average Expression).

As several studies have implicated the αβ and γ lobes as being important for courtship memory in males, we next wanted to further examine the six high confidence KC clusters identified in the full data set analysis by subclustering and assessing marker genes. This resulted in 13 subclusters (Figure 4B), where the original three αβ KC clusters separate into five subclusters and the two γ KC clusters separate into seven subclusters (Figure 4—figure supplement 1J). All subclusters identified contain cells from each replicate (Figure 4—figure supplement 1K). The GO enrichment analyses for these subclusters are consistent with learning and memory functions (Source data 4 and 6). We also find marker genes in subclusters that have previously been shown to be expressed in the mushroom body in a sex-differential manner (Obp99a, Obp44a in females) (Crocker et al., 2016), other genes known to be highly expressed in KC neurons (Dop1R1) (Croset et al., 2018), and genes indicative of serotonin production (Vmat and SerT) (Figure 4C, D, Source data 6). Given the large number of KC neurons and subtypes previously defined using Gal4 driver tools (Aso et al., 2014), these results point to molecular differences that need be further validated in vivo to gain insight into the different morphological categories. Altogether, this analysis suggests the existence of fru P1 KC subtypes that might have a role in learning during reproductive behaviors.

Identification of fru P1 neurons that produce fast-acting neurotransmitters

We next identify the fru P1 neuronal clusters that produce the fast-acting neurotransmitters (FANs) acetylcholine, glutamate, and GABA. To annotate clusters, we used marker genes that encode products in an FAN biosynthetic pathway or encode transporters for the neurotransmitters, as previously done (Allen et al., 2020; Avalos et al., 2019; Croset et al., 2018; Davie et al., 2018). Vesicular acetylcholine transporter (VAChT) and Choline acetyltransferase (ChAT) were used to identify cholinergic neurons. Vesicular glutamate transporter (VGlut) was used to identify glutamatergic neurons. Glutamic acid decarboxylase 1 (Gad1) and Vesicular GABA Transporter (VGAT) were used to identify GABAergic neurons (Figure 5A, Figure 5—figure supplement 1A, H). The majority of clusters that have neurons that produce FANs are annotated as cholinergic. For example, in the full data set analysis there are 32 cholinergic clusters, 18 GABAergic, and 16 glutamatergic (Figure 5A, Source data 3). Only one full data set cluster has marker gene expression for more than one FAN (cholinergic and glutamatergic), and the male and female data set each have one cluster that has marker gene expression for more than one FAN (GABAergic and glutamatergic) (Figure 5A, Figure 5—figure supplement 1A, H, Source data 3).

Figure 5 with 1 supplement see all
Annotation of neurons that produce fast-acting neurotransmitters (FANs).

(A) Uniform Manifold Approximation and Projection (UMAP) plot showing annotated clusters with neurons that produce FANs, in the full data set analysis (Source data 3). In the color legend on right, N (gray) indicates clusters with no FAN marker gene expression. (B) Size proportional Euler diagram showing the percentage of cells with overlapping expression of genes indicative of cholinergic (VAChT), GABAergic (Gad1), and glutamatergic (VGlut) neurons, in the full data set. (C–G) Gene expression feature plots showing neurons that express genes indicative of FAN production: acetylcholine (ChAT and VAChT), GABA (Gad1 and VGAT), and glutamate (VGlut), in the full data set UMAP. (H–S) Brain and ventral nerve cord (VNC) confocal maximum projections for FAN-producing ∩ fru P1-expressing neurons, with intersecting expression shown in green for males and females, as indicated. ChATfru P1 neurons, VGATfru P1 neurons, and VGlutfru P1 neurons in 48-hr after puparium formation (APF) and 0- to 24-hr adult are shown. Arrows point to regions with sexually dimorphic projections or cell bodies. Scale bars = 50μm.

To determine if individual neurons express genes that would be indicative of a neuron producing multiple FANs, we examined expression of VAChT (cholinergic), Gad1 (GABAergic), and VGlut (glutamatergic) at the neuron level, in each data set (Figure 5B, Figure 5—figure supplement 1B, I). Similar to other single-cell studies in the Drosophila CNS, this analysis revealed that fru P1 neuronal populations largely produce only one FAN and therefore are mostly exclusive (Figure 5B, Figure 5—figure supplement 1B, I; Allen et al., 2020; Avalos et al., 2019; Croset et al., 2018). In contrast to these other scRNA-seq studies, no neurons in our data set express genes indicative of production of three FANs, and ≤8% of neurons co-express two genes indicative of production of two FANs (Allen et al., 2020; Avalos et al., 2019; Croset et al., 2018). The expression of these FAN biosynthetic pathway or transporter genes at the neuron level are also shown (Figure 5C–G, Figure 5—figure supplement 1C–G, J–N), indicating that 62% are cholinergic (VAChT and/or ChAT), 19% are GABAergic (Gad1 and/or VGAT), and 14% are glutamatergic (VGlut) (Source data 2).

To visualize the spatial patterns of fru P1 neurons that produce different FANs we use a genetic intersectional strategy that relies on Gal4 transgenes that report on ChAT (cholinergic), VGAT (GABAergic), and VGlut (glutamatergic) (Deng et al., 2019; Nern et al., 2015; Yu et al., 2010). We can detect the three different fru P1-expressing FAN classes at 48-hr APF in both sexes, based on reporter gene expression that labels the neuronal membrane of each FAN ∩ fru P1-expressing neuronal population. However, we do not find broad expression of ChAT at 48-hr APF, which is not consistent with our single-cell data that shows extensive expression of ChAT (Figure 5C, Figure 5—figure supplement 1C, J). Examination of ChATfru P1-expressing neuronal populations in 0- to 24-hr adults does reveal broader expression of ChAT compared to VGAT and VGlut (Figure 5N–S).

We find sexual dimorphisms in the morphology of fru P1 neurons in the FAN ∩ fru P1-expressing neuronal populations, some that have been previously described (Figure 5H–S; Cachero et al., 2010; Kimura et al., 2005; Yu et al., 2010). For example, in CHATfru P1 neurons, we observe male-specific expression in the tritocerebral loop (Yu et al., 2010). In VGATfru P1 neurons, we observe the sexually dimorphic mAL neurons, which have previously been shown to be GABAergic and suppress male courtship behavior in adults (Figure 5J, P; Kallman et al., 2015; Kimura et al., 2005; Koganezawa et al., 2010). In addition, we find that the VGlutfru P1 expression patterns exhibit sex differences in cell body positioning (Figure 5L, M, R, S). In the male brain, the majority of VGlutfru P1 cell bodies are exclusively positioned near the Pars intercerebralis (PI), whereas in females, the cell bodies are additionally positioned throughout the midbrain (Figure 5L, M, R, S, white arrows).

There are 17 VGlutfru P1 clusters, without a corresponding number of spatially distinct VGlutfru P1 in the CNS in males and females (Figure 5A, L, M, R, S, Source data 3). This suggests that neurons that reside in similar spatial position may have different molecular identities, as has previously been found in other single-cell studies of the Drosophila CNS (Ma et al., 2021). To interrogate gene expression differences, we further examined the top marker genes for these clusters. We found that prospero (pros), a gene which encodes a homeobox TF that promotes neural differentiation (Doe et al., 1991), is a marker gene in three clusters (Source data 3). When we examined pros expression across all neurons in each glutamatergic cluster we find five additional clusters where pros is not a marker gene but is expressed in a high percentage of cells (Figure 5—figure supplement 1O). This analysis also showed that eight clusters exhibit little to no pros expression (Figure 5—figure supplement 1O).

Next, we determined if neurons that are spatially close together are different at the molecular level, by examining Pros protein. In males, VGlutfru P1 brain neurons are close together spatially, with the majority of the cell bodies positioned together near the Pars intercerebralis region of the brain. These neurons are different molecularly, as only some have nuclear Pros staining, based on immunofluorescence analyses (Figure 5—figure supplement 1P–P', white arrows, Figure 5—figure supplement 1Q–U). Earlier in development, pros has a role in neuroblast differentiation (Doe et al., 1991). However, by 48-hr APF only mushroom body KC neuroblasts are thought to be dividing (Ito and Hotta, 1992). The VGlutfru P1 neurons are not KC neurons, leaving open the role of pros in the fru P1 neurons evaluated here. Overall, the cluster analyses reveals the molecular heterogeneity in closely positioned neurons in the brain, especially in the well-defined spatial clusters that the field has defined (Lee et al., 2000; Manoli et al., 2005; Stockinger et al., 2005).

Aminergic populations of fru P1 neurons

Subsets of fru P1 neurons produce the biogenic amines dopamine, serotonin, tyramine, and octopamine (Billeter and Goodwin, 2004; Certel et al., 2010; Certel et al., 2007; Jois et al., 2018; Lee and Hall, 2001; Lee et al., 2001; Yu et al., 2010; Zhang et al., 2016). These aminergic fru P1 neurons have roles regulating male mating drive (dopamine, Zhang et al., 2016), mate discrimination (octopamine/tyramine, Certel et al., 2010; Certel et al., 2007), and copulation duration (serotonin, Jois et al., 2018; Lee and Hall, 2001). To identify clusters with aminergic-releasing properties, we use marker gene expression of vesicular monoamine transporter (Vmat), that encodes a transporter for vesicular packaging of biogenic amines (Greer et al., 2005). Vmat is a marker gene in three clusters in the full data set, two clusters in the female data set, and is not detected as a marker gene in the male data set (Figure 6A, Source data 3), though it is expressed in restricted clusters of neurons in the male data set analysis (Figure 6—figure supplement 1A). This restricted expression pattern of Vmat is consistent with other scRNA-seq analyses (Allen et al., 2020; Croset et al., 2018; Davie et al., 2018).

Figure 6 with 1 supplement see all
Annotation of neurons that produce aminergic neurotransmitters.

(A) Uniform Manifold Approximation and Projection (UMAP) plot showing clusters where Vmat is a marker gene. (B) UMAP plot showing clusters of neurons that produce aminergic neurotransmitters are identified based on marker gene expression: Tyramine and Octopamine (Tdc2), Dopamine (Dat and/or ple), Serotonin (SerT), in the full data set (Source data 3). The N (gray) indicates clusters with no aminergic marker genes expression. (C–D) Dot plots of top 5 genes for the three dopaminergic clusters (C) and three serotonergic clusters (D), based on log fold-change in gene expression. Dot size indicates the percentage of neurons expressing each gene per cluster (Percent Expressed). Average normalized expression level is shown by color intensity (Average Expression). (E) Violin plots showing expression of abd-A, Abd-B, and dsx in male and female serotonergic cells predicted to be from the abdominal ganglion ventral nerve cord (VNC) neurons. These cells were identified by SerT, abd-A, and/or Abd-B expression, resulting in 41 neurons (17 males and 24 females). (F–H) Proportion of fru P1 dopaminergic neurons (DAT and/or ple-expressing) (F) serotonergic neurons (SerT-expressing) (G), and octopaminergic neurons (Tdc2-expressing) (H) that co-express genes indicative of fast-acting neurotransmitter (FAN) expression in males and females. The proportion that are aminergic and cholinergic (ChAT and VAChT) are shown with green bars, GABAergic (Gad1 and VGAT) are shown with blue bars, and glutamatergic (VGlut) are shown by orange bars. Legend to right of each plot.

Next, we examined the spatial expression patterns of Vmatfru P1 neurons (Figure 6—figure supplement 1D–G). At 48-hr APF in both sexes, we find limited expression in the brain and no expression in the VNC (Figure 6—figure supplement 1D–E). In 0- to 24-hr adults, we also observe limited expression in the brains of both sexes, with projections around the mushroom body γ lobes (Figure 6—figure supplement 1F, G). Further, in 0- to 24-hr adults, there is expression in the VNC of both sexes, consistent with previous reports of Vmat expression in the 5-day adult VNC (Figure 6—figure supplement 1F, G; Allen et al., 2020). The restricted Vmatfru P1 expression is consistent with the limited expression in the clusters.

We additionally annotate the aminergic clusters based on marker gene expression for genes that encode biosynthetic enzymes or transporters of specific aminergic transmitters, described below (Figure 6B, Figure 6—figure supplement 1H, I, Source data 3). Dopaminergic neurons were classified by marker gene expression of Dopamine transporter (DAT) and/or pale (ple). Serotonergic neurons were defined by maker gene expression of Serotonin transporter (SerT). Tyraminergic and octopaminergic were defined by marker gene expression of Tyrosine decarboxylase 2 (Tdc2). Expression of these genes was also visualized at the single neuron level (Figure 6—figure supplement 1A–C), revealing that these genes are also expressed beyond where they are considered cluster marker genes.

Dopaminergic populations of fru P1 neurons

The three Vmat clusters from the full data set analysis have DAT and ple among their top enriched marker genes. Dat and ple are expressed >96% of neurons in these clusters, so these three clusters are annotated as dopaminergic neurons (clusters 58, 74, and 89, Figure 6B, Figure 6—figure supplement 1C, Source data 3). All three dopaminergic clusters annotated here have marker genes previously shown to be in adult midbrain and VNC dopaminergic neurons: PDGP- and VEGF-related growth factor (Pvf3), kekkon 1 (kek1), and homothorax (hth) (Allen et al., 2020; Croset et al., 2018). The three clusters do not have Hox marker genes that are expressed in the VNC, however a small population of neurons in clusters 74 and 89 have Abd-B expression (Figure 6—figure supplement 1J). The top marker genes are largely overlapping between the three clusters, with Vmat, DAT, and ple among the top 5 (Figure 6C). Clusters 58 and 74 also share the IgSF gene DIP-delta as a top non-unique marker gene (Figure 6C, Source data 3). We showed that DIP-deltafru P1 neurons in the adult brain have restricted spatial expression pattern, with broader expression in the VNC (Brovero et al., 2021). This indicates that some DIP-deltafru P1 neurons are likely dopaminergic and may be those previously visualized in the brain (Brovero et al., 2021). Further, when we examine Vmatfru P1 expression patterns in the brain of both sexes (Figure 6—figure supplement 1D–G), their expression pattern is reminiscent of DIP-deltafru P1 neurons, with projections around the mushroom body γ lobes (Brovero et al., 2021).

It is known that male dopaminergic fru P1 neurons have a role in mating drive and copulation duration (Jois et al., 2018; Zhang et al., 2016), so we next evaluated male dopaminergic clusters identified in the male data set analysis (male clusters 56 and 67, Figure 6—figure supplement 1H, Source data 3). A small population of neurons in both clusters express Abd-B and Ubx, suggesting that they are from the VNC (Figure 6—figure supplement 1K), potentially capturing previously identified neurons involved in male copulation duration (Jois et al., 2018). Unlike the dopaminergic clusters identified in the full data set analysis, these two male clusters show little overlap in their marker genes with only five shared marker genes (Source data 3). To compare these clusters, we visualized expression of their top marker genes, revealing that cluster 56 has DIP-beta (Figure 6—figure supplement 1L, Source data 3). In addition, we find cluster 67 has VGlut as a top marker gene, suggesting that populations of male glutamatergic fru P1 neurons are also dopaminergic (Figure 6—figure supplement 1L).

Serotonergic populations of fru P1 neurons

Next, to identify clusters with serotonergic neurons, we examined the clusters that have SerT as a marker gene in the full data set. We identify three serotonergic clusters (clusters 73, 86, and 92). However, all three clusters also contain a substantial number of neurons without SerT expression, with all clusters having less than 70% of their neurons expressing SerT (cluster 73: 26%, cluster 86: 26%, and cluster 92: 70%) (Figure 6—figure supplement 1A–C, Source data 3). These clusters have additional marker genes that have been previously identified in serotonergic neurons in the adult midbrain and VNC (Allen et al., 2020; Croset et al., 2018), which provides further evidence that they should be considered serotonergic clusters. All three serotonergic clusters have IGF-II mRNA-binding protein (Imp), and two have Jim Lovell (lov), as marker genes (Croset et al., 2018). We also find marker genes encoding TFs ventral veins lacking (vvl) and Lim3 for clusters 86 and 92, as well as juvenile hormone inducible 21 (JhI-21) as a marker gene in all three clusters. These three genes have previously been shown to be marker genes in adult VNC serotonergic neurons (Allen et al., 2020). Further, our data suggest that neurons from two serotonergic clusters have autoregulatory control, as one cluster has marker gene expression of 5HT receptors 5-hydroxytryptamine (serotonin) receptor 1A and B (5-HT1A and 5-HT1B, cluster 73) and another has 5-hydroxytryptamine (serotonin) receptor 2B (5-HT2B, cluster 92) (Source data 3), as previously shown in adults (Allen et al., 2020).

To examine expression across these serotonergic clusters, we evaluate the top five marker genes for each serotonergic cluster. We find that all three have SerT, Imp, Dopa decarboxylase (Ddc), and CG2269 as marker genes that are expressed in a high percentage of neurons (Figure 6D, Source data 3). The serotonergic clusters 86 and 92 share enriched expression of Neuropeptide-like precursor 1 (Nplp1) and VGlut, suggesting that these serotonergic neurons are also glutamatergic and peptidergic (Figure 6D, Source data 3). This analysis also reveals specific and highly expressed markers for these neuron populations: cluster 86 expresses CG15082 and cluster 92 expresses the neuropeptide-encoding gene Allatostatin C (AstC) (Figure 6D, Source data 3), indicating molecular differences.

Serotonergic fru P1 neurons in the abdominal ganglion of the VNC are sexually dimorphic, with ~8–10 male-specific neurons in this region that have a role in sperm transfer (Billeter and Goodwin, 2004; Lee and Hall, 2001; Lee et al., 2001). To identify these neurons in our data set we searched for SerT-expressing neurons that also expresses abd-A and/or Abd-B, from the full data set analysis, resulting in 41 neurons (17 male and 24 female). To investigate sex-specific molecular differences, we examined sex-differentially expressed genes in these 41 neurons (Source data 3). dsx is a top marker gene in male neurons (Source data 3). 11 of the 17 male serotonergic abdominal ganglion neurons identified here express dsx, whereas the 24 female cells exhibit no dsx expression (Figure 6E). dsx expression is required for the formation of male-specific fru P1 serotonergic neurons in the abdominal ganglion (Billeter et al., 2006), some of which are involved in sperm transfer (Tayler et al., 2012), suggesting that we may have identified these male-specific neurons in our data set.

Tyraminergic and octopaminergic populations of fru P1 neurons

Tyraminergic and/or octopaminergic fru P1 neurons have been shown to have a role in mate discrimination. Artificially activating octopaminergic fru P1 neurons or using RNA-mediated interference against fru P1 transcripts in octopaminergic neurons increases male–male courtship (Certel et al., 2010; Certel et al., 2007). The gene-encoding tyrosine decarboxylase 2 (Tdc2), an enzyme involved in synthesizing the amino acid tyrosine to tyramine, is expressed in both tyraminergic and octopaminergic neurons (Roeder, 2005). We identify tyraminergic/octopaminergic neurons using the Tdc2 marker gene. This results in two tyraminergic/octopaminergic clusters in the full data set and one in the female data set (Source data 3). Tdc2 is not detected as a marker gene in the male data set but is expressed in a small population of male cells (Figure 6—figure supplement 1A). Interestingly, in the full data set, one of the tyraminergic/octopaminergic clusters, cluster 51, is also identified as a mushroom body KC cluster (Source data 3). This suggests that some previously identified Tdc2-expressing mushroom body neurons may also include fru P1-expressing neurons (Aso et al., 2014). However, to our knowledge, no reports of fru P1 and Tdc2 expression have been shown in the mushroom body.

We did not detect Tdc2fru P1 neurons using the genetic intersectional strategy, with a recently published Tdc2-Gal4 driver (Deng et al., 2019). There are previous reports of overlap based on FruM antibody staining and a different Tdc2-Gal4 driver (Certel et al., 2010; Certel et al., 2007), suggesting that there may be limitations to our visualization approach. One additional possibility is that there are Gal4 expression differences between these Tdc2-Gal4 driver lines. Further, this may be a FruM-expressing population where fruFLP is not expressed, given there are FruM-expressing neurons that are fruFLP negative (Yu et al., 2010).

fru P1 neurons that produce both aminergic and FANs

We find that one male dopaminergic cluster and two serotonergic clusters have VGlut marker gene expression (Figure 6D, Figure 6—figure supplement 1G, Source data 3). This suggests that there are neurons that may be both aminergic and FAN producing. This prompted us to evaluate at the neuron level co-expression of genes indicative of biogenic amine and FAN production or transport (Figure 6F–H). Here, aminergic neurons were classified based on DAT and/or ple (dopaminergic), SerT (serotonergic), or Tdc2 (tyraminergic/octopaminergic) expression and we asked about their overlapping expression with genes we previously used to identify FAN-producing neurons. For the three classes of aminergic neurons the majority are also cholinergic, based on their co-expression of either ChAT or VAChT (Figure 6F–H, green bars). The male tyraminergic/octopaminergic neurons have a higher proportion of glutamatergic (VGlut) co-expressing neurons compared to GABAergic (VGAT or Gad1), whereas the opposite is seen in females (Figure 6H). Overall, we find a range of fru P1 aminergic neurons show co-expression with genes indicative of producing/transporting FANs (10–54% of aminergic neurons). This suggests that populations of fru P1 neurons likely release at least two neurotransmitters.

Identification of fru P1 neurons that express circadian clock genes

Research of circadian/sleep behaviors have identified sexual dimorphisms (reviewed in Andretic and Shaw, 2005; Ho and Sehgal, 2005; King and Sehgal, 2020; Shafer and Keene, 2021), with fru P1 neurons and mating behavior implicated in mediating sex differences (Fujii and Amrein, 2010; Fujii et al., 2007; Hanafusa et al., 2013; Sakai and Ishida, 2001; Tauber et al., 2003). GO enrichment analyses on marker genes identified two clusters in the full data set, clusters 108 and 109, enriched with genes that have circadian functions (Figures 1C and 7A). The marker genes include period (per), timeless (tim), Clock (Clk), vrille (vri), and Pdp1-epsilon (Pdp1) (Source data 3). We also annotated other clusters with circadian maker genes (Source data 3), but 108 and 109 are the only clusters with more than one marker gene with known circadian functions in the full data set. The male data set has one cluster with per, tim, Clk, vri, and Pdp1 marker genes (Source data 3), whereas there are no female clusters with multiple marker genes with known circadian functions. In the full data set, cluster 109 is male biased in cell number (>twofold more after normalization), though both clusters 108 and 109 contain cells from males and females (Source data 5). An examination of the expression of per, tim, Clk, vri, and Pdp1 in all neurons did not reveal any additional clusters that are likely to be involved in circadian functions that might have been missed when only examining marker genes in clusters (Figure 7—figure supplement 1A–C).

Figure 7 with 3 supplements see all
Annotation of neurons that express circadian clock genes.

(A) Uniform Manifold Approximation and Projection (UMAP) plot showing clusters annotated as circadian clock neurons (blue), in full data set (Source data 3). (B) Subclustering analysis of neurons from annotated circadian clusters (colored clusters from panel A), defining three subclusters. (C) The three subclusters are shown with sex indicated by color. The male cells are in blue and female cells are in pink. (D) Dot plot of top 5 marker genes in each circadian subcluster based on log fold-change gene expression (Source data 3). Dot size indicates the percentage of neurons expressing each gene per cluster (Percent Expressed). Average normalized expression level is shown by color intensity (Average Expression). (E–E″) Confocal maximum projections of brains immunostained for Per (magenta) and fru P1>mcd8::GFP neurons, from 48-hr after puparium formation (APF) male pupae, at ×40 magnification. (E″) contains zoomed inset of fru P1>mcd8::GFP and Per (magenta) neurons in upper right. The right brain hemisphere is shown. (F) Brain and ventral nerve cord (VNC) confocal maximum projections for Clk856fru P1 expression in 0- to 24-hr adult males (left) and females (right). Scale bars = 50 μm.

To further examine the neurons in clusters 108 and 109, we performed a subclustering analysis, resulting in three subclusters that each have male and female neurons (Figure 7B, C). Expression of per, tim, Clk, vri, and Pdp1 is different in the three subclusters (Figure 7—figure supplement 1D), as is expression of the top five maker genes (based on log-fold-change in expression, Figure 7D), suggesting the three subclusters are comprised of different circadian regulatory neurons, or the neurons could have distinct gene expression profiles due to slight differences in time of day when the neurons were processed for 10× Genomics library preparation. Based on expression data from a recent scRNA-seq study of clock neurons (Ma et al., 2021), we postulate that subcluster 0 is most similar to DN1p neurons given this cluster has gl, VGlut, Rh7, and Dh31 as marker genes (Figure 7B, Source data 6). We postulate that subcluster 1 has similar expression to DN1p neurons, due to having Rh7 as a marker gene (Kistenpfennig et al., 2017; Ma et al., 2021). Consistent with these observations, DN1s have been shown to express FruM in adults and DN1ps show a sex difference in cell number in adults (Fujii and Amrein, 2010; Hanafusa et al., 2013). Marker gene expression in subcluster 2 indicates that neurons in this cluster are similar to s-LNvs or l-LNvs clock neurons, because cry is a marker gene (Yoshii et al., 2008). Previous studies of developing clock neurons, based on per and tim expression, indicate that two DN1 neurons, four small ventral LNs (s-LNvs), and two large ventral LNs (l-LNvs) are present at 48-hr APF (Kaneko et al., 1997). However, we are not able to visualize DN1s or l-LNv neurons at 48-hr APF in either sex with Per antibody staining. We do identify four s-LNvs in both sexes with Per antibody staining at 48-hr APF, three of which overlap with fru P1 expression (Figure 7E–E'', Figure 7—figure supplement 1E–E''). This supports that subcluster 2 may be comprised of developing s-LNvs. However, there is no expression of pdf in subcluster 2, leaving open the possibility that these neurons may be a different subpopulation of circadian neurons that are present in pupae (Helfrich-Förster, 1995; Kaneko et al., 1997).

Functional studies of fru P1 neurons that express circadian clock genes

To visualize circadian neurons that overlap with fru P1 neurons, we use the genetic intersection approach to visualize Clk856fru P1 neurons in the CNS. The Clk856-Gal4 transgene was recently used to classify clock neuron expression in adults, using scRNA-seq (Ma et al., 2021). We did not detect Clk856fru P1 >sm.GDP.Myc neurons at 48-hr APF in either sex, perhaps due to a lag in reporter gene expression, as noted above. When we examined 0- to 24-hr adults, we find Clk856fru P1 >sm.GDP.Myc neurons that have cell bodies positioned in the DN1 and DN3 circadian regions. Additional cell bodies are detected in suboesophageal zone (SEZ) with projections in the SEZ or in the median bundle, and cell bodies are in several regions of the VNC (Figure 7F). When we quantified the number of Clk856fru P1 >sm.GDP.Myc cell bodies in the brain, we observe that males have significantly more DN3 neurons at 0–24 hr (Source data 7). The male-biased number of fru P1-expressing DN3s has not been previously described, to our knowledge (Figure 7F, Source data 7). We note that it is uncertain if DN3 neurons would be present in our 48-hr APF scRNA-seq data set, as DN3 neurons are thought to arise later in pupal development (Kaneko et al., 1997). In 4- to 7-day adults, this sex difference in the number of Clk856fru P1 DN3 neurons does not persist, but we find a larger number of DN3s in both males and females. Now, we find a significant male bias in the number of Clk856fru P1 DN1 neurons at 4–7 days. This is in agreement with previous studies that have quantified fru-expressing DN1s and found a larger number in males (Figure 7—figure supplement 1F–I, Source data 7; Hanafusa et al., 2013). In addition, in 4–7 day adults we find equal numbers of LNds in both sexes.

To examine the behavioral role of Clk856fru P1 neurons, we activated these neurons using the intersectional genetic approach, with UAS<stop<TrpA1Myc (Clk856fru P1>TrpA1 Myc) (von Philipsborn et al., 2011). The TrpA1 channel has been shown to be activated by temperature and downstream of Gq and phospholipase C (Kang et al., 2011; Kim et al., 2010; Kwon et al., 2008; Luo et al., 2017; Roessingh and Stanewsky, 2017; Zhong et al., 2012). We assayed the effect on locomotor activity patterns using the Drosophila activity monitor (DAM, Trikinetics) system, in 12-hr light:12-hr dark (LD) conditions. The TrpA1 channel is Myc tagged, so we further visualized Clk856fru P1 neurons in 4–7 day adults. When we count the number of Clk856fru P1 neurons in both sexes, we find significantly more male DN1s, LNds, and both types of SEZ neurons, as compared to females (Figure 7—figure supplement 1, Source data 7). Generally, when we find these significant sex differences in neuron numbers in Clk856fru P1>TrpA1 Myc or Clk856fru P1>sm.GDP.Myc, the effect size is small (Source data 7).

To examine locomotor activity, we raised flies at 19°C, a temperature where the TrpA1 channel is not activated by temperature, and recorded activity in the DAM system at 25°C, a temperature where the TrpA1 channel is activated by temperature (von Philipsborn et al., 2011). When we compare activity between the sexes across genotypes, we find that the wild-type control, Canton S (CS), males are significantly more active than females (Figure 7—figure supplement 2A). This is not observed in the experimental comparisons of Clk856fru P1>TrpA1 Myc males and females, where the trend is that females are more active (Figure 7—figure supplement 2A). To evaluate activity across the 24-hr day, we examined actograms for all genotypes (Figure 7—figure supplement 2B, C). Clk856fru P1>TrpA1 Myc males exhibit little daytime activity (Figure 7—figure supplement 2B). Clk856fru P1>TrpA1 Myc females show robust activity in the early afternoon compared to all other genotypes (Figure 7—figure supplement 2C). This suggests that the Clk856fru P1 neurons have sleep-reducing roles in females, and sleep-promoting roles in males. These neurons may normally direct the dimorphism in daytime sleep, which has been referred to as the male daytime siesta (Andretic and Shaw, 2005; Ho and Sehgal, 2005). A previous study has also shown that DN1 neurons have a role in promoting the daytime siesta (Guo et al., 2016), and this population of neurons are activated in our Clk856fru P1 intersection.

Next, we determined if Clk856fru P1 neurons had a role in regulating circadian period. To test this, we conducted the DAM assay in 12-hr dark:12-hr dark (DD) conditions after entraining the flies in LD. Consistent with the LD data, actograms containing 2 days of LD data show reduced daytime activity in Clk856fru P1>TrpA1 Myc males and increased activity in the early afternoon in Clk856fru P1>TrpA1 Myc females (Figure 7—figure supplement 3A, B). Remarkably, these actograms also show that Clk856fru P1>TrpA1 Myc males have an observable activity shift over the 10 DD days, indicative of a shortening circadian period (Figure 7—figure supplement 3A, B). To further evaluate the circadian period in DD for all genotypes, we examined periodograms, period peaks, and period strength (Figure 7—figure supplement 3C–E). Strikingly, this analysis reveals a shortened period in Clk856fru P1>TrpA1 Myc males (mean period = 22.8 hr), and a longer period in Clk856fru P1>TrpA1 Myc females (mean period = 24.6 hr), compared to control genotypes (Figure 7—figure supplement 3C, D). When we examine period strength, we find that Clk856fru P1>TrpA1 Myc also exhibit the strongest period strength compared to all other genotypes. Altogether, we find that activating Clk856fru P1>TrpA1 Myc neurons has sexually dimorphic effects on circadian period, with males having a shortened period and females having a lengthened period. These circadian sex differences with Clk856fru P1>TrpA1 activation could be due to quantitative sex differences in Clk856fru P1 neuron number, sex differences in their connectivity, or sex differences in their intrinsic physiology.

Neuropeptide and neuropeptide receptor expression in fru P1 neurons

Out of the 49 annotated neuropeptide-encoding genes in the Drosophila genome (Larkin et al., 2021), we find that 47 genes are expressed in fru P1 neurons in the full data set analysis (Figure 8A), and 18 genes are identified as marker genes (Source data 3). In addition, 64% of the clusters in the full data set (72 clusters) have at least one neuropeptide-encoding gene as a marker gene (Source data 3). Here, we focus on those with known roles in reproductive behaviors in females and males.

Figure 8 with 1 supplement see all
Annotation of neurons that express neuropeptides.

(A) Heatmap of log-normalized gene expression for neuropeptide-encoding genes. Gene expression values were mean scaled and log normalized. (B) Uniform Manifold Approximation and Projection (UMAP) plot showing annotated clusters with neurons that produce neuropeptides with known roles in fru P1 neurons for directing reproductive behavior, in the full data set (Source data 3). (C, D) Confocal maximum projections of brains and ventral nerve cord (VNC) showing NPFfru P1 expression in 0- to 24-hr adult males and females, using an NPF-T2A-Gal4 driver (Deng et al., 2019). White arrows on male brain indicate male-specific NPFfru P1 neurons that have been previously identified (Liu et al., 2019). Scale bars = 50 μm.

We first characterize clusters that express genes that encode neuropeptides with known roles in female-specific reproductive behaviors. For example, Insulin-like peptide 7 (Ilp7)-expressing neurons are involved in oviduct function, and Diuretic hormone 44 (Dh44)-expressing neurons have a role in sperm ejection (Castellanos et al., 2013; Garner et al., 2018; Lee et al., 2015). We find Ilp7 is a marker gene in two clusters, in all three data sets (Figure 8B, Figure 8—figure supplement 1A, B). Ilp7 has been shown to be expressed in fru P1 neurons in both sexes, with female-specific neurons identified in the abdominal ganglion of the VNC, due to cell death in males (Garner et al., 2018). However, neither cluster with Ilp7 as a marker gene in the full data set is female-specific nor shows a sex bias in cell number (Source data 5), so it is possible that we did not identify the female-specific Ilp7 neurons. Dh44 has been shown to have a role in female sperm storage post-mating, with six cholinergic Dh44-expressing neurons in the Pars intercerebralis of the female brain shown to have a role in sperm ejection behavior (Lee et al., 2015). We find Dh44 is a marker gene in two clusters in both the male and female data sets and six clusters in the full data set (Source data 3). In the full data set, three of these clusters with Dh44 as a marker gene also have marker genes that indicated they are cholinergic (ChAT and/or VAChT are marker genes, Source data 3). This previous work showed co-expression of fru P1 with the gene that encodes the Dh44 receptor (Dh44-R1) but did not examine if Dh44-expressing neurons were fru P1 expressing (Lee et al., 2015). Using a T2A-Gal4 knock-in in the Dh44 gene and the genetic intersectional approach (Deng et al., 2019), we visualized the spatial expression of Dh44fru P1 neurons, finding expression at 48-hr APF and in 0- to 24-hr adults in both sexes (Figure 8—figure supplement 1C–F). In 0- to 24-hr adults, this Dh44fru P1 expression pattern includes neurons in the Pars intercerebralis, which may contain the neurons implicated to have a role in female sperm ejection. fru P1 neurons that produce neuropeptides also have known roles in male courtship, copulation, and aggression behaviors (Asahina et al., 2014; Liu et al., 2019; Tayler et al., 2012; Wohl et al., 2020; Wu et al., 2019). For example, a subset of male-specific Tackykinin (Tk) and fru P1 co-expressing neurons in the brain promote male aggression, though a larger number of Tk-expressing neurons have been identified in both sexes (Asahina et al., 2014; Nässel et al., 2019; Wohl et al., 2020). Here, we find that Tk is a marker gene in 10 clusters in the full data set, and five in both the male and female data set analyses (Figure 8B, Figure 8—figure supplement 1A, B, Source data 3). This suggests that beyond the male-specific Tk-Gal4 and fru P1 co-expressing neurons that have been well characterized for their role in aggression (Asahina et al., 2014; Wohl et al., 2020), there are additional Tk-expressing neurons that co-express fru P1 in both sexes.

Male-specific Neuropeptide F (NPF) and fru P1 co-expressing neurons have a role in regulating courtship drive, by inhibiting courtship in sexually satiated males (Liu et al., 2019). We annotate two clusters where NPF is a marker gene in both male and full data set analyses (Figure 8B, Figure 8—figure supplement 1A, Source data 3). NPF is not a marker gene in the female data set (Source data 3). To spatially visualize NPFfru P1 neurons, we used the genetic intersection approach with an NPF T2A-Gal4 knock-in (Deng et al., 2019). There were no visible NPFfru P1 neurons at 48-hr APF in either sex. In 0- to 24-hr adult brains, we identify dramatically different sexually dimorphic neuronal projections, with some cell bodies in similar positions (Figure 8C, D). Based on location of the neurons, we potentially find the previously described male-specific NPF neurons in the brain using the intersectional approach (Figure 8C, D, white arrows). However, the NPFfru P1 expression pattern differs from previous reports that used different NPF-Gal4 or LexA insertions (Liu et al., 2019). Additionally, when we examined expression of all genes that encode neuropeptides, we find additional clusters that express the neuropeptides Corazonin (Crz) and Drosulfakinin (Dsk), which have been shown to be expressed in subpopulations of fru P1 neurons and have impacts on male behaviors, but are not considered marker genes in our data sets (Figure 8A) (sperm transfer, Tayler et al., 2012; courtship inhibition, Wu et al., 2019).

Neuropeptide receptors are also widely expressed in fru P1 neurons, with 104 clusters having at least one gene encoding a neuropeptide receptor as a marker gene in the full data set (Source data 3). Previous studies have found that fru P1 neurons that express neuropeptide receptors have roles in male courtship behaviors. These include Corazonin receptor (CrzR) (sperm transfer, Tayler et al., 2012), Cholecystokinin-like receptor at 17D3 (CCKLR-17D3) (courtship inhibition, Wu et al., 2019), and SIFamide receptor (SIFaR) (increased male–male courtship, Sellami and Veenstra, 2015). In addition, the neuropeptide receptor Leucine-rich repeat-containing G protein-coupled receptor 3 (Lgr3) has been shown to expressed in fru P1 neurons and have a role in female behavior (female receptivity and fecundity, Meissner et al., 2016). All these neuropeptide receptors are found as marker genes in our data set (Source data 3). We visualized CrzRfru P1 and Lgr3fru P1 neurons using T2A-Gal4 knock-in driver lines (Deng et al., 2019; Figure 8—figure supplement 1G–N). CrzRfru P1 neurons in the abdominal ganglia have a known role in male copulation behavior (Tayler et al., 2012). We find that these neurons are in similar positions as previously shown in males and additionally visualize sexually dimorphic neurons in the brain as well as a small set of neurons in the female abdominal ganglia (Tayler et al., 2012; Figure 8—figure supplement 1G–J). The 48-hr APF Lgr3fru P1 neurons show expected median bundle expression, with more GFP-expression in females compared to males, consistent with Lgr3 expression being regulated by FruM (Figure 8—figure supplement 1K–L; Meissner et al., 2016). However, in 0- to 24-hr adult males, we observe male-specific Lgr3fru P1 neuronal projections in the lateral protocerebral complex and in the VNC that were not previously reported (Figure 8—figure supplement 1M–N), which may be due to differences in visualization strategies. Given that 109 of the 113 clusters in the full data set express at least one neuropeptide or neuropeptide receptor as a marker gene, indicates that many fru P1-expressing neurons use neuropeptides for signaling (Source data 3).

nAChR receptor subunit genes expressed in fru P1 neurons

We find nicotinic acetylcholine receptor (nAChR) subunits are marker genes for the majority of clusters in the full data set (91 clusters out of 113) (Source data 3). Nicotinic acetylcholine receptor are pentameric receptors that assemble with combinations of 10 subunits [seven α nAChR subunits (1–7) and three β nAChR subunits (1–3)] (Rosenthal and Yuan, 2021). Of these 10 nAChR subunit genes, we find that nine are marker genes and all 10 are expressed in the full data set (Source data 3). Furthermore, they are broadly expressed with at least 97% of fru P1 neurons expressing at least one nAChR receptor. Out of the 10 subunits, nAChRα5 is the most widely expressed in all three data sets and nAChRβ3 has the narrowest expression (Source data 8). The different subunit combinations confer different physiological binding properties (Perry et al., 2021). To evaluate subunit co-expression, we examined the pairwise expression correlation between the 10 subunits for the three data set analyses and show significant correlations (Figure 9A–C). For all data sets, we find that the overall correlation trends are similar, with some additional significant correlations observed for the female data set (Figure 9A–C). We find the highest positive expression correlations are between: nAChRα5 and nAChRβ1, nAChRα6 and nAChRα7, and nAChRα1 and nAChRα2 for the full data set analysis (Figure 9C). In contrast to the analysis of subunit correlations performed in the adult mid-brain data set, we find anti-correlations between subunit expression data (Figure 9A–C; Croset et al., 2018). These negative correlations occur between nAChRα5 and nAChRα6 and nAChRα5 and nAChRα7, with the strongest negative correlations in the female data set analysis (Figure 9A–C). Given that nAChRα5 and nAChRα6 have the strongest expression correlation in the adult midbrain data set and we observe this pair as having the strongest expression anti-correlation points to the importance of examining different neuronal populations using single-cell approaches.

nAChR subunit gene expression in fru P1 neurons.

Heatmap showing correlation of nAChR subunit gene expression in neurons in the male data set (A), female data set (B), and full data set (C). Pearson’s r values denoted on heatmap with color, according to legends. (D–S) Confocal maximum projections of brain and ventral nerve cord (VNC) for nAChRαfru P1-expressing neurons, with intersecting expression shown in green for males and females, as indicated. Male and female animals from 48-hr after puparium formation (APF) and 0- to 24-hr adults are shown, as indicated. Scale bars = 50 μm.

We next visualized the spatial expression patterns for some nAChRα that have available Gal4 lines. We examined nAChRα1, nAChRα2, nAChRα3, and nAChRα6fru P1 neurons and find broad expression at 48-hr APF and in 0- to 24-hr adults of both sexes, which is consistent with our scRNA-seq expression data (Figure 9D–S). All nAChRfru P1 spatial populations examined have similar expression patterns within each sex, which is consistent with their correlated co-expression (Figure 9D–S). The nAChRfru P1 spatial populations are in all segments of the VNC in both sexes (Figure 9D–S). We also find that spatial expression of nAChRα1, nAChRα2, and nAChRα6fru P1 neurons in 0- to 24-hr adult male brains contain previously characterized sexually dimorphic projections in the tritocerebral loop (Figure 9D–S). Notably, we find that nAChRα3fru P1 expression is absent from the mushroom body in both sexes, whereas we observe mushroom body expression for all other nAChRs ∩ fru P1 examined (Figure 9D–S). We also find that nAChRα3 is not a marker gene in mushroom body clusters identified in our data sets (Source data 3), consistent with previous reports indicating its absence from the mushroom body (Croset et al., 2018; Shih et al., 2019). Our expression findings here, and the identification of several nAChRs in genomic studies examining regulation by FruM, expression in fru P1 neurons, and expression changes post-mating, suggest that nAChRs are important for Drosophila reproductive behaviors (Brovkina et al., 2021; Dalton et al., 2013; Neville et al., 2014; Newell et al., 2016; Newell et al., 2020; Palmateer et al., 2021; Vernes, 2014).

Wnt expression in fru P1 neurons

Wnts have been shown to have a range of function in neurons, including axon guidance, synapse formation, neuronal plasticity, morphogenesis, and signaling in several species (He et al., 2018; Kolodkin and Tessier-Lavigne, 2011; Salinas and Zou, 2008; Sanes and Yamagata, 2009). Here, we further examine Wnt expression, given the GO enrichment analysis of marker genes in the full data set identified several wingless pathway terms including: ‘WG ligand binding’, ‘wingless pathway’, ‘Beta-catenin independent Wnt signaling’, and ‘signaling by Wnt’ (Source data 4). Furthermore, our previous genome-wide association study showed four Wnt pathway signaling genes had allele variants associated with differences in female remating behavior (Newell et al., 2020), adding further impetus to evaluate the Wnt pathway. We found three Wnt genes were marker genes, Wnt4 (14 clusters), Wnt5 (2 clusters), and Wnt10 (4 clusters) (Source data 3).

We visualized the spatial expression patterns for Wnt4fru P1 and Wnt10fru P1 neurons and observe several sexual dimorphisms in their neuronal projection patterns at 48-hr APF and in 0- to 24-hr adults of both sexes (Figure 10A–H). We additionally visualized Wnt5fru P1 in 0- to 24-hr adults (Figure 10I, J). In 0- to 24-hr adult males, Wnt4fru P1 neuron projections look similar to previously identified sexually dimorphic vPR1/vPr-a ascending neurons (Figure 10C, white arrows), which have been suggested to have a role in sensory integration and motor output to control wing song (Cachero et al., 2010; Yu et al., 2010). Wnt10fru P1 neurons are observed only in the brain, and not the VNC, and show sexually dimorphic projections (Figure 10E–H, white arrows). Male Wnt10fru P1 neurons project across hemispheres at both 48-hr APF and 0- to 24-hr stages (Figure 10E, G, white arrows), whereas in female brains no projections between the brain hemispheres were observed (Figure 10F, H, white arrows). Sexually dimorphic midline crossing of gustatory sensory neurons was previously observed in the VNC and regulated by fru and dsx (Mellert et al., 2010; Possidente and Murphey, 1989). The observation that Wnt10fru P1 neurons also have sexually dimorphic midline crossing in the brain suggest this may be a mechanism to generate sex differences in behavior. In 0- to 24-hr adults, Wnt5fru P1 shows broader expression in the brain than Wnt4fru P1 and Wnt10fru P1 neurons, and also exhibits expression in all VNC segments (Figure 10I, J).

Wntfru P1 expression patterns.

(A–J) Brain and ventral nerve cord (VNC) confocal maximum projections showing intersecting expression patterns (in green), for males and females, as indicated. The patterns of Wnt4fru P1, Wnt10fru P1, and Wnt5fru P1 neurons are shown. Scale bars = 50 μm. The Wnt5 stock was not available to perform expression analyses at 48-hr after puparium formation (APF) due to the stock being unhealthy.

Sex differences in gene expression between male and female fru P1 neurons

We next examined differential expression between the sexes within individual cluster in the full data set analysis. For each gene, the expression in male neurons within a cluster was compared with expression in female neurons within the same cluster. This differential expression analysis reveals 6162 genes are significantly differentially expressed between males and females, at the cluster level (3504 higher in males, 2658 higher in females). When we reduce these gene lists to remove redundant genes (the same genes are differentially expressed in more than one cluster) and compare between males and females, we find 711 genes uniquely male biased in expression (always show higher expression in males within at least one cluster), 586 are uniquely female biased in expression (always show higher expression in females within at least one cluster) (Source data 9). There are 147 genes that are either male- or female biased, depending on the cluster (Source data 9). This analysis shows that the same genes are re-used in different neuronal clusters to impart sexual dimorphism, given the redundancy in the gene lists found across the clusters. When we compare these sex-biased genes to genes identified as sex biased in our previous cell-type-specific TRAP study at 48-hr APF, we find significant overlap between gene lists (Source data 3; Palmateer et al., 2021). One limitation to this analysis is if the clusters are comprised of more than one neuronal type, as this may impact the ability to detect sex-differential expression due to averaging the expression. We provide GO enrichment analysis of these lists (Source data 4); several of the enriched GO categories motivated the follow-up investigations presented below.

Immunoglobulin-like domain superfamily (IgSF) expression in fru P1 neurons

A Flymine analysis of the marker gene lists to find ‘protein domain’ enrichments identify ‘Immunoglobulin-like domain superfamily’ with the most significant enrichment in each of the three data sets (Source data 4; Lyne et al., 2007). In the full data set analysis, we find that all clusters have at least one marker gene that is a member of the Immunoglobulin-like domain superfamily, with most expressing a large and unique repertoire (Source data 3). Using a heatmap we visualized expression of IgSF-encoding genes, which shows the average expression of a gene in a cluster, regardless of marker gene status (Figure 11). Consistent with what we observe when we examine marker genes (Source data 3), each cluster exhibits unique expression combinations of IgSF-encoding genes (Figure 11).

Figure 11 with 1 supplement see all
IgSF gene expression in fru P1 neurons.

(A) Heatmap of log-normalized gene expression for IgSF-encoding genes. Gene expression values were mean scaled and log normalized.

Previously we examined expression of dpr and DIP genes in the male CNS from 48-hr APF using 10× single-cell genomic approaches, with fru P1 neurons identified based on expression of fru P1 and/or a GFP reporter (Brovero et al., 2021). Similar to our previous study on males, we find that in both sexes the majority of dprs show broad expression, whereas DIPs exhibit more restricted expression (Figure 11—figure supplement 1). With the addition of female data, we next determine if there are dpr and DIP expression differences between the sexes, for each cluster of the full data set analysis (Figure 11—figure supplement 1). This analysis revealed that there are sex differences in dpr and DIP expression for several clusters in our data set (Figure 11—figure supplement 1, Source data 9). Of the dprs and DIPs that show a significant sex bias in expression within clusters, some show expression that is always higher in female neurons (uniquely female biased; dprs 1, 5, 8, 10, 11, 12, 14, 15, 16, 17, and 20, DIP-alpha and DIP-beta), whereas fewer show expression that is always higher in males (uniquely male biased; dprs 7, 18, and 21, DIPs gamma, theta, and zeta, Figure 11—figure supplement 1, Source data 9). In contrast, dprs 2, 3, 6, 9, and 13 and DIP-delta and DIP-eta show cluster dependent differences in their significant sex-biased expression (Figure 11—figure supplement 1, Source data 9). Future in vivo expression analyses are needed to confirm these quantitative differences. Differences in dpr/DIP co-expression within individual fru P1 neurons may offer a mechanism to generate different cell adhesion properties to mediate synaptic connections, as we previously proposed (Brovero et al., 2021).

Transcription factor expression in fru P1 neurons

During the stage of metamorphosis that we examined, cell fate decisions are being directed by transcription factor (TF) expression, with different combinations of TFs found in different neuronal populations (Doe, 2017; Shirasaki and Pfaff, 2002). Consistent with this, the top ‘molecular function’ GO enrichment terms for marker genes across all three data set analyses is ‘DNA-binding transcription factor activity’ (Figure 1—figure supplement 6, Source data 4). Out of the 628 Drosophila genes encoding TFs (Flybase), 240 are marker genes in the male data set analysis, 228 in the female data set analysis, and 253 in the full data set analysis (Source data 3). We next determined the enriched TF subcategories in Flymine, evaluating subcategories that contained more than 25 genes (Lyne et al., 2007). This identified three subcategories of TF marker genes: homeobox-like domain superfamily (74 genes), zinc finger C2H2 (74 genes), and helix–loop–helix (HLH) DNA-binding domain superfamily (25 genes). We next examined if certain subcategories have restricted or broad expression, to gain insight into which TFs might impart cluster-specific identities.

Most of the homeobox-like domain superfamily TF-encoding genes, including abd-A, Abd-B, and slouch (slou), show cluster specificity, with high expression in a large percent of neurons in the cluster, in only a subset of clusters (Figure 12—figure supplement 1A).There are some homeobox TF-encoding genes, such as pipsqueak (psq), extradenticle (exd), and CG16779, that are expressed more broadly, in nearly all clusters (Figure 12—figure supplement 1A), On the other hand, zinc finger C2H2 TF-encoding genes, which includes fru, has most genes displaying broad expression (Figure 12—figure supplement 2). Though, some zinc finger C2H2 TFs exhibit cluster-specific expression, such as buttonhead (btd), which is predominately expressed in cluster 101 (Figure 12—figure supplement 2). Genes that encode HLH DNA-binding domain superfamily TFs also have several showing broad expression across clusters (Figure 12—figure supplement 3), and others with more restricted expression. These analyses suggest that the homeobox-like domain superfamily may have a large role in directing differences in cell types among the fru P1 neurons, given the overall number of genes with large differences in expression across the clusters (Figure 12—figure supplements 13).

We next determined if the 253 marker genes that encode TFs display significant sex-differential expression. There are 69 genes that encode TFs with cluster-specific sex-differential expression. Seventeen genes are uniquely male biased, 38 are uniquely female biased, and 8 are either male- or female-biased depending on the cluster (Source data 9). We used dot plots to visualize the expression of these TFs with cluster-specific differential expression between the sexes (Figure 12—figure supplement 4A), though future in vivo validation is needed to confirm these quantitative differences. This shows that 49 clusters exhibit significant sex-differential expression of at least one marker gene TF, with several clusters showing differential expression of multiple TFs (Figure 12—figure supplement 4A, black circles). We also visualized expression of the top three sex-differentially expressed TFs in each sex, based on log fold-change between the sexes (Figure 12A). The top three differentially expressed TFs with male-biased expression are: cut (ct), dachshund (dac), and dsx. Those with female-biased expression are: abnormal chemosensory jump 6 (acj6), broad (br), and CG43689 (Source data 9). We show all clusters with significant sex bias in expression for these TFs using UMAP featureplots and dot plots for visualization (Figure 12). This differential TF expression between the sexes is likely a mechanism to direct sex differences in fru P1 neurons.

Figure 12 with 4 supplements see all
Sex differences in transcription factor (TF) expression in fru P1 neurons.

(A) Gene expression feature plots showing gene expression in neurons for TFs that have significant sex-biased gene expression within single-cell clusters between female (left; top three) and male (right; top three). Top sex-biased TFs determined by log FC in Source data 9. The top three male-biased TFs shown are: ct, dac, and dsx. The top three female-biased TFs shown are: acj6, br, and CG43689. Bold cluster numbers in the dot plot indicate the cluster where the gene had the sex-differential expression detection in the top three for each sex. A dot plot representation for all TF genes that have cluster-specific sex-biased marker gene expression is shown in Figure 12—figure supplement 4.

Conclusions

We performed scRNA-seq on pupal fru P1 neurons from males and females to understand how sex differences are built into the nervous system to direct adult reproductive behaviors. The full data set analysis revealed the heterogeneity of fru P1 neurons at 48-hr APF with 113 different molecular clusters identified. This exceeds the number of previously designated spatial cluster and morphological classifications. This suggests that there are more functional differences in neurons than predicted by FruM/fru P1 spatial patterns (Billeter and Goodwin, 2004; Cachero et al., 2010; Lee et al., 2000; Manoli et al., 2005; Stockinger et al., 2005; Yu et al., 2010). The analyses also revealed that nearly all clusters have neurons from both males and females. This suggests that male and female neurons that direct vastly different reproductive behaviors, have a core gene expression program that are overlaid with sex differences in gene expression. This is consistent with our previous TRAP-seq data sets, where we postulated that shared molecular properties underlie common behavioral needs. We proposed that sex differences in gene expression are overlaid on a sex-shared baseline gene expression program to direct male and female behavioral differences (Newell et al., 2016; Palmateer et al., 2021). We also identify sex-specific clusters in our data set. We find a male-specific dsx-expressing cluster, which is consistent with the known male-specific fru P1 and dsx co-expressing neurons (Billeter et al., 2006; Ishii et al., 2020; Rideout et al., 2007; Sanders and Arbeitman, 2008), and we validate a newly identified female-specific CCHa2-expressing cluster (Figure 2, Figure 2—figure supplement 2). Our results also point to sex differences in neuron number across the clusters, which is consistent with previous in vivo studies (for characterization of all fru P1 neurons see Cachero et al., 2010; Manoli et al., 2005; Stockinger et al., 2005; Yu et al., 2010).

Using the enriched marker genes, we annotate 46 clusters, and provide information for all the clusters based on the distribution of the marker genes, with comparisons to other single-cell Drosophila studies and in-depth examinations of different GO groupings (Source data 3). For example, we use Hox gene expression to annotate clusters from the VNC (Figure 3, Source data 3). We used known gene expression profiles to categorize fru P1 neurons as γ and αβ KCs of the mushroom body, including some newly identified molecular subtypes (Figure 4, Source data 3). The marker gene analyses led to the identification of two clusters that have circadian clock neurons, which we further subcluster to reveal additional populations by using known gene expression signatures from a previous single-cell RNA-seq studies (Figure 7; Ma et al., 2021). We perform functional analysis of these fru P1 circadian neurons and show that they have dimorphic roles in period length (Figure 7—figure supplement 3). The analyses of marker genes also show that neuropeptides (64% of clusters) and neuropeptide receptors (92% of clusters) genes have enriched expression in the majority of fru P1 neurons, (Figure 8, Source data 3), and indicate the importance of these signaling molecules in reproductive behaviors. Our previous GWAS analyses of natural variation of female remating found Wnt pathway genes contributing to differences (Newell et al., 2020). The Wnt pathway marker gene identification led to the identification of Wntfru P1 neurons, with sexual dimorphism in their projections (Wnt 4, Wnt 5, and Wnt 10; Figure 10). The top protein domain enrichment across the marker genes was the ‘Immunoglobulin-like domain superfamily’, consistent with our identification of the IgSF genes being expressed in fru P1 neurons in our previous studies (see Brovero et al., 2021; Goldman and Arbeitman, 2007). An examination of the marker gene distribution and the overall expression levels shows that all clusters display a unique repertoire of IgSF genes (Source data 3; Figure 11). The diversity of IgSFs across the clusters may be a way to generate neurons with different adhesion properties to ultimately generate different neuronal connections. Another functional category enriched in the marker genes was ‘DNA-binding transcription factor activity’, with homeobox-like domain TF genes showing the largest expression differences across the clusters. We also annotate the clusters with respect to the neurotransmitters they produce, based on marker gene expression and overall gene expression levels (fast-acting and biogenic amines). One key finding is that 80% of the clusters have nAChR subunit genes as marker genes (Figure 9), and most fru P1 neurons express at least one nAchR subunit gene, with nAChRalpha5 expressed in 97% of neurons.

For many of the marker gene analyses, we provide in vivo spatial links that support the conclusions. Additional in vivo validation is certain to produce interesting new findings contributing new insights into the molecular-genetic and circuit basis of behaviors and will provide valuable information about how expression in scRNA-seq studies correlates with in vivo mRNA, protein, and reporter gene patterns. Our study used careful clustering and filtering criteria consistent with published Drosophila single-cell studies (Allen et al., 2020; Croset et al., 2018; Davie et al., 2018; Konstantinides et al., 2018; Kurmangaliyev et al., 2020; Li et al., 2017; Li et al., 2022; McLaughlin et al., 2021; Özel et al., 2021; Simon and Konstantinides, 2021; Xie et al., 2021). We expect new computational algorithms will likely improve single-cell data analyses. For example, after we started our analyses a new doublet detection algorithm (DoubletFinder) was released (McGinnis et al., 2019), which we show found additional potential doublets in our data set throughout the clusters (Source data 2). The data set will provide opportunities to further our biological understanding of behaviors and can also be reanalyzed to provide additional molecular-genetic insights as new algorithms are generated.

Materials and methods

Key resources table
Reagent type (species) or resourceDesignationSource or referenceIdentifiersAdditional information
Genetic reagent (D. melanogaster)fru P1-Gal4Provided by Baker lab PMID:15959468fruP1-GAL4
Genetic reagent (D. melanogaster)Canton SUlrike HeberleinWild type+/+;+/+;+/+ (Canton S)
Genetic reagent (D. melanogaster)fru P1FLPDonor: Barry Dickson, Howard Hughes Medical Institute, Janelia Research CampusRRID:BDSC66870w[*]; TI{FLP}fru[FLP]/T+D6:D27+D6:D18M3, Sb[1]
Genetic reagent (D. melanogaster)10XUAS-IVS-mCD8::GFPDonor: Gerald M. Rubin & Barret Pfeiffer, Howard Hughes Medical Institute, Janelia Research CampusRRID:BDSC_32186w[*]; P{y[+t7.7] w[+mC]=10XUAS-IVS-mCD8::GFP}attP40
Genetic reagent (D. melanogaster)10xUAS >stop > myr::smGdP-cMycPfeiffer, B., Rubin, G. (2014.4.16). Recombinase and tester constructs and insertions.RRID:BDSC_62125w[1118]; P{y[+t7.7] w[+mC]=10xUAS(FRT.stop)myr::smGdP-cMyc}attP40
Genetic reagent (D. melanogaster)UAS >stop > TrpA1MycBarry Dickson Stocks. Flybase ID: FBrf0234603RRID:BDSC_66871P{UAS(FRT.stop)TrpA1myc}VIE-260B
Genetic reagent (D. melanogaster)Lgr3-Gal4Donor: Bruce Baker, Howard Hughes Medical Institute, Janelia Research Campus; Donor's Source: James Truman, Howard Hughes Medical Institute, Janelia Research CampusRRID:BDSC_66683w[*]; P{y[+t7.7] w[+mC]=Lgr3-GAL4::VP16}attP40/CyO
Genetic reagent (D. melanogaster)NPF-Gal4Donor: Paul Garrity, Brandeis University & Bowen Deng, National Institute of Biological Sciences; Donor's Source: Yi Rao, National Institute of Biological SciencesRRID:BDSC_84671w[*]; TI{2 A-GAL4}NPF[2 A-GAL4]/TM6B, Tb[1]
Genetic reagent (D. melanogaster)Wnt10-Gal4Donor: Hugo J. Bellen, Baylor College of Medicine; Donor's Source: Gene Disruption ProjectRRID:BDSC_86484y[1] w[*]; TI{GFP[3xP3.cLa]=CRIMIC.TG4.2}Wnt10[CR01661-TG4.2]/SM6a
Genetic reagent (D. melanogaster)Wnt4-Gal4Donor: Gene Disruption Project; Donor's Source: Hugo J. Bellen, Baylor College of MedicineRRID:BDSC_67449y[1] w[*]; Mi{Trojan-GAL4.2}Wnt4[MI03717-TG4.2]
Genetic reagent (D. melanogaster)Wnt5-Gal4Donor: Huey Hing, State University of New York, BrockportRRID:BDSC_59034w[*] TI{w[+mC]=GAL4}Wnt5[Gal4]
Genetic reagent (D. melanogaster)Vmat-Gal4Donor: Gene Disruption Project; Donor's Source: Hugo J. Bellen, Baylor College of MedicineRRID:BDSC_66806y[1] w[*]; Mi{Trojan-GAL4.0}Vmat[MI07680-TG4.0]
Genetic reagent (D. melanogaster)ChAT-Gal4Donor: Paul Garrity, Brandeis University & Bowen Deng, National Institute of Biological Sciences; Donor's Source: Yi Rao, National Institute of Biological SciencesRRID:BDSC_84618TI{2 A-GAL4}ChAT[2 A-GAL4]/TM3, Sb[1]
Genetic reagent (D. melanogaster)CrzR-Gal4Donor: Paul Garrity, Brandeis University & Bowen Deng, National Institute of Biological Sciences; Donor's Source: Yi Rao, National Institute of Biological SciencesRRID:BDSC_84621w[*]; TI{2 A-GAL4}CrzR[2A-B.GAL4]/TM3, Sb[1]
Genetic reagent (D. melanogaster)CCHa2-Gal4Donor: Paul Garrity, Brandeis University & Bowen Deng, National Institute of Biological Sciences; Donor's Source: Yi Rao, National Institute of Biological SciencesRRID:BDSC_84602w[*]; TI{2 A-GAL4}CCHa2[2 A-GAL4]/TM3, Sb[1]
Genetic reagent (D. melanogaster)Dh44-Gal4Donor: Paul Garrity, Brandeis University & Bowen Deng, National Institute of Biological Sciences; Donor's Source: Yi Rao, National Institute of Biological SciencesRRID:BDSC_84627w[*]; TI{2 A-GAL4}Dh44[2 A-GAL4]/TM3, Sb[1]
Genetic reagent (D. melanogaster)nAChR⍺1 Gal4Donor: Paul Garrity, Brandeis University & Bowen Deng, National Institute of Biological Sciences; Donor's Source: Yi Rao, National Institute of Biological SciencesRRID:BDSC_84662TI{2 A-GAL4}nAChRalpha1[2A-AC.GAL4]/TM6B, Tb[1]
Genetic reagent (D. melanogaster)nAChR⍺2 Gal4Donor: Paul Garrity, Brandeis University & Bowen Deng, National Institute of Biological Sciences; Donor's Source: Yi Rao, National Institute of Biological SciencesRRID:BDSC_84663w[*]; TI{2 A-GAL4}nAChRalpha2[2 A-GAL4]/TM3, Sb[1]
Genetic reagent (D. melanogaster)nAChR⍺3 Gal4Donor: Paul Garrity, Brandeis University & Bowen Deng, National Institute of Biological Sciences; Donor's Source: Yi Rao, National Institute of Biological SciencesRRID:BDSC_84664w[*] TI{2 A-GAL4}nAChRalpha3[2 A-GAL4]
Genetic reagent (D. melanogaster)nAChR⍺6 Gal4Donor: Paul Garrity, Brandeis University & Bowen Deng, National Institute of Biological Sciences; Donor's Source: Yi Rao, National Institute of Biological SciencesRRID:BDSC_84665TI{2 A-GAL4}nAChRalpha6[2 A-GAL4]/CyO
Genetic reagent (D. melanogaster)VGAT-Gal4Donor: Paul Garrity, Brandeis University & Bowen Deng, National Institute of Biological Sciences; Donor's Source: Yi Rao, National Institute of Biological SciencesRRID:BDSC_84696TI{2 A-GAL4}VGAT[2 A-GAL4]/CyO
Genetic reagent (D. melanogaster)VGlut-Gal4Donor: Paul Garrity, Brandeis University & Bowen Deng, National Institute of Biological Sciences; Donor's Source: Yi Rao, National Institute of Biological SciencesRRID:BDSC_84697TI{2 A-GAL4}VGlut[2 A-GAL4]/CyO
Genetic reagent (D. melanogaster)Clk856-Gal4Obtained from Michael Rosbashw; CLK856-Gal4/Cyo; MKRS/TM6B
AntibodyRat polyclonalSanders and Arbeitman, 2008RRID:AB_2569440anti-FruM; 1:200 in TNT
AntibodyRabbit polyclonalObtained from Michael RosbashRRID:AB_2315101anti-Per; 1:500 in TNT
AntibodyRabbit polyclonalAbcamRRID:AB_307014anti-Myc; 1:6050 in TNT
AntibodyMouse monoclonalDevelopmental Studies Hybridoma BankRRID:AB_2314866anti-NC82; 1:20 in TNT
AntibodyMouse monoclonalDevelopmental Studies Hybridoma BankRRID:AB_528440anti-Prospero(MR1A); 1:100 in TNT
AntibodyRabbit polyclonalInvitrogenRRID:AB_221477anti-GFP 488; 1:600 in TNT
AntibodyGoat polyclonalInvitrogenRRID:AB_141373anti-rat 488; 1:1000 in TNT
AntibodyGoat polyclonalInvitrogenRRID:AB_10563566anti-rabbit 568; 1:500 in TNT
AntibodyGoat polyclonalInvitrogenRRID:AB_2535719anti-mouse 633; 1:500 in TNT
Software, algorithmZenCarl ZeissRRID:SCR_013672
Software, algorithmIllustratorAdobeRRID:SCR_010279
Software, algorithmSeuratStuart et al., 2019RRID:SCR_016341
Software, algorithmCell Ranger10 x GenomicsRRID:SCR_017344
Software, algorithmGraphPad PrismGraphPadRRID:SCR_002798
Software, algorithmShinyR-DAMCichewicz and Hirsh, 2018

Fly strains and husbandry

Request a detailed protocol

Flies for scRNA-seq experiments were the genotype: w[*]; P[y[+t7.7] w[+mC]=10XUAS-IVS-mCD8::GF]attP40/UAS-Gal4;fru P1-Gal4/+. This strain expressed membrane-bound GFP in fru P1 neurons in males and females. The flies were sex sorted at the white pre-pupal stage using the male gonad to distinguish between the sexes. We note that the UAS-Gal4 present in this genotype was determined to be non-functional (Feng and Cook, 2017). The pupae were aged to 48 hr APF on 3% agar plates with Bromophenol blue for color contrast. All strains, unless otherwise indicated, are aged in a humidified incubator at 25°C on a 12-hr light and 12-hr dark cycle. Additional fly strains used in this study are listed in the Resources Table. The laboratory Drosophila media composition is: 33 l H2O, 237 g agar, 825 g dried deactivated yeast, 1560 g cornmeal, 3300 g dextrose, 52.5 g Tegosept in 270 ml 95% ethanol and 60 ml propionic acid.

Dissociation of pupal CNS for single-cell mRNA sequencing

Request a detailed protocol

The CNS (brain and VNC) from 20 freshly dissected male or female pupae were used per replicate (n = 2 replicates per sex). The dissected tissue was dissociated as previously described (Brovero et al., 2021). All dissections occurred within an hour after the lights turned on (ZT0). GFP positive cells (fru P1-expressing neurons) were enriched using FACS with a FACSAria SORP with BD FACSDiva software.

10× genomics library preparation and sequencing

Request a detailed protocol

scRNA-seq libraries were generated using Single Cell 3′ Library & Gel Bead Kit v3, Chip B Kit, and the GemCode 10× Chromium instrument (10× Genomics, CA), according to the manufacturer’s protocol (Zheng et al., 2017). In brief, FACS enriched fru P1-expressing neurons were suspended in S2 medium with 10% fetal bovine serum and the maximum volume of cell suspension (46.6 μl) was mixed with single-cell Master Mix and loaded into an individual chip channel for each sex and replicate. For all samples, this process occurred between 4 and 5 hr after lights on (ZT4–ZT5). The library preparation steps were conducted as previously described, with modifications to polymerase chain reaction (PCR) thermal cycling steps for v3 chemistry (Brovero et al., 2021). For cDNA amplification PCR settings were: 98°C for 3 min, 12 cycles of (98°C for 15 s, 63°C for 20 s), 72°C for 1 min, held at 4°C. For sample index PCR amplification the settings were: 98°C for 45 s, 16 cycles of (98°C for 20 s, 54°C for 30 s, 72°C for 20 s), 72°C for 1 min, and hold at 4°C. All single-cell libraries were sequenced on an Illumina NovaSeq 6000 at Florida State University Translational Core Laboratory.

scRNA-seq data processing and clustering in Seurat

Request a detailed protocol

CellRanger software (v.3.0.2) ‘cellranger count’ command was used to align sequencing reads to a customized Drosophila melanogaster (BDGP6.92) STAR reference genome, which contained the sequence for mCD8-GFP cDNA to generate a multidimensional feature-barcode matrix for each replicate. Using the inflection point of the barcode-rank plots, the CellRanger pipeline called the number of cells captured in each replicate (Source data 1). We captured 26,149 total cells across both sexes and replicates with 871,092,958 sequencing reads, reaching an average of 65% sequencing saturation (Source data 1). The feature-barcode matrices were next analyzed in Seurat (v3.2.2) (Stuart et al., 2019). Each expression matrix was filtered to obtain high-quality cells using the following criteria: cells with >5% mitochondrial transcripts (stressed/dead/dying cells), <200 genes (empty droplets), those expressing more than 4000 features (genes) and/or possessing more than 20,000 UMIs (potential doublets or triplets) were removed in each replicate. In addition, we provide the number of cells per replicate identified by DoubletFinder (version 2.0.3) in Source data 2. The male replicates and the female replicates were merged to form sex-specific data sets and additionally all replicates for both sexes were merged into a full data set. This yielded 7988 male cells, 17,530 female cells and 25,518 cells in total. We used default normalization and scaling steps for each data set as outlined in the Seurat ‘Guided clustering tutorial’ (https://satijalab.org/seurat/v3.2/pbmc3k_tutorial.html). The default ‘NormalizeData’ function was used to normalize the data and the data were subsequently scaled using the default ‘ScaleData’ function. We performed a principal component analysis (PCA) using the top 2000 highly variable features in each data set. We calculated statistically significant principal components (PCs) for each data set based on Jackstraw analysis. We used the number of significant PCs, up to where the first non-significant PC was met, for each data set (Figure 1—figure supplement 2). This resulted in 80, 104, and 107 PCs. We proceeded to use Seurat’s standard workflow to reduce dimensionality and cluster cells by default ‘FindNeighbors’, ‘FindClusters’, and ‘runUMAP’ functions. We compared a range of cluster resolutions for each data set using the clustree R package (Zappia and Oshlack, 2018) before selecting an optimal resolution for each (resolution = 3.0, 0.7, and 1.2, Figure 1—figure supplements 3 and 4). To further validate the clustering resolutions chosen, we examined expression of DIP-iota in our data set, which we have previously shown to be expressed in a small subset of adult fru P1 neurons (Brovero et al., 2021). We find that DIP-iota expression is restricted to a small subset of neurons in our UMAP plots for each data set (Figure 1—figure supplement 8), and is considered a significant marker gene for one cluster in the male and full data sets (Source data 3). We examined expression in the pupal CNS and find one neuron present in the male and two in the female VNC (Figure 1—figure supplement 8).

We also performed an analysis where we combined the male and female data sets using Seurat data integration to generate an integrated full data set (Figure 1—figure supplement 5). We used default ‘FindIntegrationAnchors’ and ‘IntegrateData’ steps as outlined in the Seurat ‘Introduction to scRNA-seq data integration’ tutorial (https://satijalab.org/seurat/articles/integration_introduction.html). Consistent with our initial merged full data set analysis, we performed PCA using the top 2000 highly variable features in each data set and calculated statistically significant PCs based on Jackstraw analysis. We selected the number of significant PCs, up to where the first non-significant PC was met, resulting in 125 significant PCs for this analysis (Figure 1—figure supplement 5A). We proceeded through the standard Seurat workflow as presented above using ‘FindNeighbors’, ‘FindClusters’, and ‘runUMAP’ functions, selecting an optimal cluster resolution that also produced a similar number of clusters to our merged full data set (resolution = 1.8). We find that the integration methodology performs marginally better in terms of replicate balance within cluster, though there are still clusters that have replicate contributions from only one technical replicate (integrated analysis: seven clusters; merged analyses: nine clusters). Given our biological question of comparing the expression of fru P1-expessing neurons between the sexes, we proceeded with the merged data in this study, as the integrated approach may limit true differences.

Marker gene identification and sex-differential expression

Request a detailed protocol

Marker genes were identified per cluster for each data set using the Seurat ‘FindAllMarkers’ function, using the Wilcox rank sum test (min.pct = 0.25, logfc.threshold = 0.25). For each gene, the expression in a given cluster was compared with expression in neurons in the remaining clusters. Differentially expressed genes between the sexes per cluster were identified using the Seurat ‘FindAllMarkers’ function, using the Wilcox rank sum test (min.pct = 0.25, logfc.threshold = 0.25). For each gene, the expression in a male neurons within a cluster was compared with expression in female neurons within the same cluster. The p values for marker gene and sex-differential expression were adjusted for multiple testing using the Bonferroni method and all results are presented in Source data 3. A marker was considered a significant marker gene if it was present in the marker gene list based on these criteria: min.pct = 0.25, logfc.threshold = 0.25, p_val <0.05.

Sex differences in cell number within clusters

Request a detailed protocol

A cluster was considered ‘sex-specific’ if only cells from one sex were present in that cluster. Given the data set contains an unbalanced number of cells between the sexes in the full data set, the female cell numbers per cluster were divided by a scaling factor of 2.19. For a cluster to be considered ‘sex-biased’, cells from one sex were required to be >twofold higher in that cluster. For a cluster to be considered ‘sex-biased, strong’ the number of cells was >fourfold higher.

Downsampling and random cell removal analyses

Request a detailed protocol

To perform random cell removal analyses, a random subset of 7988 female cells (to equal the number of male cells) was selected by the ‘sample’ R function. The Seurat ‘Dimplot’ function was used to visualize these data sets and the number of cells per sex in each cluster was quantified. To perform random downsampling analyses, a random subset of 7988 female cells were selected from the 17,530 filtered female cells using the ‘sample’ R function, as above.

The random subset of female cells was then merged with the Seurat object containing the 7988 filtered male cells into a new Seurat object using the ‘merge’ function. This was performed three times to create three independent random downsampled data sets. Each downsampled data set underwent the Seurat workflow described above. Consistent with our analysis of the full data set, we used the number of significant PCs up to where the first non-significant PC was met. This resulted in 99, 98, and 99 PCs. We chose cluster resolutions that yielded ~113 clusters, the number resolved in the analysis of the full data set, resulting in resolutions of 3.0, 3.0, and 3.1. To match these down sampled clusters with those in the full data set, we used the R package clustifyr (Fu et al., 2020). The ‘seurat_ref’ function was used to make a cluster reference from the full data set and the default ‘clustify’ wrapper function was used to perform Spearman correlations between the reference downsampled analysis clusters (Source data 5).

Immunostaining and microscopy

Request a detailed protocol

Adult and 48-hr APF brains were dissected and imaged as previously described (Palmateer et al., 2021). Primary antibodies used were as follows: rat α-FruM (1:200) (Sanders and Arbeitman, 2008), rabbit α-Myc (1:6050; abcam, ab9106), rabbit α-Per (1:500) (gift from Michael Rosbash), mouse α-Nc82 (1:20, DSHB), and mouse α-Prospero (1:100, DSHB). The secondary antibodies were as follows: goat α-rat 488 (1:1000, Invitrogen A11006), goat α-rabbit 568 (1:500; Invitrogen, A11036), rabbit α-GFP 488 (1:600; Invitrogen A21311), and goat α-mouse 633 (1:500; Invitrogen, A21052). Both primary and secondary antibodies were diluted in TNT (Tris-NaCL-Triton buffer; 0.1 M Tris–HCl, 0.3 M NaCl, 0.5% Triton X-100). All confocal microscopy was performed on a Zeiss LSM 700 system, with Zeiss Plan-Apochromat ×20/0.8 and ×0/1.4 objectives. The z-stack slice interval for all images was 1.0 μm. A 1 Airy Unit (AU) pinhole size was calculated in Zeiss Zen software (Black edition, 2012) for each laser line: 488 nm: 38 μm; 555 nm: 34 μm; and 639 nm: 39 μm. All images were acquired at 1024 × 1024 pixel resolution with bidirectional scanning.

Clk856fru P1>sm.GDP.Myc and Clk856fru P1>TrpA1 Myc cell bodies were scored blinded, in each brain hemisphere for 0- to 24-hr adults and 4- to 7-day adults of both sexes (Source data 7). Data from both left and right hemispheres for each region within sex were pooled for analysis for each genotype and time point. Statistical testing within each quantified region was performed between the sexes within each genotype and time point using a Mann–Whitney U-test in GraphPad Prism (9.3.0).

Subclustering male-specific/-biased dsx-expressing neuron clusters

Request a detailed protocol

All male specific or male biased that were enriched for dsx expression (Source data 3), clusters 21, 47, and 68 in the full data analysis were subset and re-analyzed. Only dsx expressing neurons in these clusters were maintained in this analysis (dsx expression >0 UMIs). Twenty-seven significant PCs were used to perform new dimensionality reduction based on Jackstraw analysis. Cluster resolution was determined based on visual inspection of distinct UMAP populations being identified as individual clusters (resolution = 0.5). Differential expression was calculated between the clusters using the same FindAllMarkers criteria as above.

Subclustering circadian neuron clusters

Request a detailed protocol

All clusters classified as ‘circadian’ in our full data set UMAP analysis were subset and PCA was performed only on these cells. The two significant PCs were used to perform new dimensionality reduction based on Jackstraw analysis. Cluster resolution was determined based on visual inspection of distinct UMAP populations being identified as individual clusters (resolution = 2.5). Differential expression was calculated between the clusters using the same FindAllMarkers criteria as above.

Annotating mushroom body KC populations and subclustering

Request a detailed protocol

Cell clusters were annotated based on the presence genes that have been shown to be expressed in the mushroom body KCs and those specific to subpopulations. All of these genes were manually annotated for their presence (Source data 3), however, a cluster was only confidently annotated as a mushroom body KC subtype if at least two genes were enriched in expression in that cluster. For example, if a cluster had a marker gene with known general KC expression present (ey and/or Dop1R2) and additionally had Fas2, sNPF, and/or trio as a marker gene, the cluster was annotated as a KC subtype. These subtypes were determined based on the presence of trio, which shows high expression in only the γ KCs at 48-hr APF (Awasaki et al., 2000). αβ KCs were annotated based on marker genes for ey and/or Dop1R2 with sNPF and/or Fas2 in the absence of trio. Further, we also found clusters with Fas2 and sNPF as marker genes with expression of ey or Dop1R2 in the cluster, but not present as a marker gene. We also find one cluster per data set that has marker gene expression of trio in addition to one other KC-expressed gene, either Fas2 or Dop1R2. Clusters with incomplete enriched marker gene expression but suggestive of being KC subtypes are annotated as lower confidence ab_KC clusters (indicated by *).

All high-confidence KC clusters in our full data set UMAP analysis were subset and PCA was performed only on these cells. The 21 first significant PCs were used to perform new dimensionality reduction based on Jackstraw analysis. Cluster resolution was determined based on visual inspection of distinct UMAP populations being identified as individual clusters (resolution = 0.5). Differential expression was calculated between the clusters using the same FindAllMarkers criteria as above.

Locomotor activity assays and analysis

Request a detailed protocol

Wild-type CS, transgene controls without a Gal4 driver (w; UAS <stop < TrpA1Myc; fruFLP/+), and experimental flies with expression restricted to Clk856fru P1 neurons (w; UAS<stop<TrpA1Myc/Clk856-Gal4; fruFLP/+) were collected 0- to 6-hr post-eclosion and loaded into glass tubes that contained our standard laboratory food. Tubes containing the flies were loaded into DAM (Trikinetics, Waltham, MA). The light:dark (LD) assay was conducted at 25°C on a 12-hr:12-hr LD cycle and beam cross activity was recorded in one min. bins for 15 days. Incubator lights turned on at 8 am and off at 8 pm. Flies were aged to 5 days during the assay, therefore the first 5 days of data were removed from the analysis. This resulted in 10 days of data analyzed for all genotypes in the LD assay. In the dark:dark (DD) assay, flies were first entrained and aged at 25°C on a 12-hr:12-hr LD cycle for 7 days, allowing for 2 days of LD data for analysis. Next, we switched to a 12-hr:12-hr DD cycle for 10 days (constant darkness for 10 days). Beam cross activity for the DD assay was recorded in 1-min bins. Data were analyzed using ShinyR-DAM for both assays (Cichewicz and Hirsh, 2018). Dead flies were considered those with less than 50 beam cross events per day and were removed from the analyses, resulting in an n = 26–30 for all genotypes in LD and n = 23–30 for all genotypes in DD. ShinyR-DAM analysis of sleep was only performed on LD assay data. ShinyR-DAM measures sleep events using a 5-min sliding window, where 5 min of inactivity is considered a sleep event. Statistical tests were performed on ShinyR-DAM output data in R Studio and GraphPad Prism (9.3.0). One-way analysis of variance with Tukey-HSD post hoc testing was performed for comparisons across all genotypes. Comparisons between daytime and nighttime sleep were statistically compared using a Student’s t-test for daytime vs. nighttime sleep. Circadian period analysis was performed in ShinyR-DAM using the DD assay data. The default ShinyR-DAM parameters were used as follows: Chi-Sq testing range of 18–30 hr, a Chi-Sq period testing resolution of 0.2, and a rhythmicity threshold for filtering arhythmic individuals (Qp.act/Qp.sig) of 1. All ShinyR-DAM output data presented are provided in Source data 7.

Data availability

The Gene Expression Omnibus accession number for all scRNA-seq the data is https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE160370. Code used for the analyses presented can be found at: https://github.com/cpalmateer/fruP1SC_48hrAPF (copy archived at swh:1:rev:e82363ab1c3ef08dc4c067b155b28607bde3a861).

The following data sets were generated
    1. Palmateer C
    2. Arbeitman M
    (2023) NCBI Gene Expression Omnibus
    ID GSE160370. A single-cell transcriptomic atlas of fruitless-expressing neurons in the Drosophila pupal central nervous system.

References

    1. Gailey DA
    2. Hall JC
    (1988)
    Behavioral and cytogenetic analysis of the fruitless phenotype
    Behavior Genetics 18:716–717.
    1. Kim B
    2. Shortridge RD
    3. Seong C
    4. Oh Y
    5. Baek K
    6. Yoon J
    (1998)
    Molecular characterization of a novel Drosophila gene which is expressed in the central nervous system
    Molecules and Cells 8:750–757.
    1. Kimura KI
    (2008)
    Sexually dimorphic neural circuitry: formation and function in courtship behavior of Drosophila melanogaster
    Genes & Genetic Systems 83:525.
    1. Li H
    2. Janssens J
    3. De Waegeneer M
    4. Kolluru SS
    5. Davie K
    6. Gardeux V
    7. Saelens W
    8. David FPA
    9. Brbić M
    10. Spanier K
    11. Leskovec J
    12. McLaughlin CN
    13. Xie Q
    14. Jones RC
    15. Brueckner K
    16. Shim J
    17. Tattikota SG
    18. Schnorrer F
    19. Rust K
    20. Nystul TG
    21. Carvalho-Santos Z
    22. Ribeiro C
    23. Pal S
    24. Mahadevaraju S
    25. Przytycka TM
    26. Allen AM
    27. Goodwin SF
    28. Berry CW
    29. Fuller MT
    30. White-Cooper H
    31. Matunis EL
    32. DiNardo S
    33. Galenza A
    34. O’Brien LE
    35. Dow JAT
    36. Jasper H
    37. Oliver B
    38. Perrimon N
    39. Deplancke B
    40. Quake SR
    41. Luo L
    42. Aerts S
    43. Agarwal D
    44. Ahmed-Braimah Y
    45. Arbeitman M
    46. Ariss MM
    47. Augsburger J
    48. Ayush K
    49. Baker CC
    50. Banisch T
    51. Birker K
    52. Bodmer R
    53. Bolival B
    54. Brantley SE
    55. Brill JA
    56. Brown NC
    57. Buehner NA
    58. Cai XT
    59. Cardoso-Figueiredo R
    60. Casares F
    61. Chang A
    62. Clandinin TR
    63. Crasta S
    64. Desplan C
    65. Detweiler AM
    66. Dhakan DB
    67. Donà E
    68. Engert S
    69. Floc’hlay S
    70. George N
    71. González-Segarra AJ
    72. Groves AK
    73. Gumbin S
    74. Guo Y
    75. Harris DE
    76. Heifetz Y
    77. Holtz SL
    78. Horns F
    79. Hudry B
    80. Hung RJ
    81. Jan YN
    82. Jaszczak JS
    83. Jefferis G
    84. Karkanias J
    85. Karr TL
    86. Katheder NS
    87. Kezos J
    88. Kim AA
    89. Kim SK
    90. Kockel L
    91. Konstantinides N
    92. Kornberg TB
    93. Krause HM
    94. Labott AT
    95. Laturney M
    96. Lehmann R
    97. Leinwand S
    98. Li J
    99. Li JSS
    100. Li K
    101. Li K
    102. Li L
    103. Li T
    104. Litovchenko M
    105. Liu HH
    106. Liu Y
    107. Lu TC
    108. Manning J
    109. Mase A
    110. Matera-Vatnick M
    111. Matias NR
    112. McDonough-Goldstein CE
    113. McGeever A
    114. McLachlan AD
    115. Moreno-Roman P
    116. Neff N
    117. Neville M
    118. Ngo S
    119. Nielsen T
    120. O’Brien CE
    121. Osumi-Sutherland D
    122. Özel MN
    123. Papatheodorou I
    124. Petkovic M
    125. Pilgrim C
    126. Pisco AO
    127. Reisenman C
    128. Sanders EN
    129. Dos Santos G
    130. Scott K
    131. Sherlekar A
    132. Shiu P
    133. Sims D
    134. Sit RV
    135. Slaidina M
    136. Smith HE
    137. Sterne G
    138. Su YH
    139. Sutton D
    140. Tamayo M
    141. Tan M
    142. Tastekin I
    143. Treiber C
    144. Vacek D
    145. Vogler G
    146. Waddell S
    147. Wang W
    148. Wilson RI
    149. Wolfner MF
    150. Wong YCE
    151. Xie A
    152. Xu J
    153. Yamamoto S
    154. Yan J
    155. Yao Z
    156. Yoda K
    157. Zhu R
    158. Zinzen RP
    159. FCA Consortium§
    (2022) Fly cell atlas: a single-nucleus transcriptomic atlas of the adult fruit fly
    Science 375:eabk2432.
    https://doi.org/10.1126/science.abk2432
    1. Yamamoto D
    2. Sato K
    3. Koganezawa M
    (2014) Neuroethology of male courtship in Drosophila: from the gene to behavior
    Journal of Comparative Physiology. A, Neuroethology, Sensory, Neural, and Behavioral Physiology 200:251–264.
    https://doi.org/10.1007/s00359-014-0891-5

Decision letter

  1. Nikos Konstantinides
    Reviewing Editor; Institut Jacques Monod, France
  2. Claude Desplan
    Senior Editor; New York University, United States

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Single-cell transcriptome profiles of Drosophila fruitless-expressing neurons from both sexes" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Claude Desplan as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1) The reviewers have raised concerns regarding the "sex-specific" versus "shared but sexually dimorphic" cell types. They recommend a careful re-analysis of the data to address this exact question. In particular, they recommend generating a single atlas with both male and female cells under different integration or merging conditions. This would include:

a) a careful parameter selection, such as resolution, that will allow for evaluation of subclusters. They also recommend a comparison between integration and merging of male and female datasets

b) doublet elimination using appropriate algorithms (e.g. DoubletFinder or other)

c) elimination of cell cycle influence

After generating such an atlas, an experimental validation of differences in cell number or sex-specific markers in a few well-known clusters would be necessary to address the question of sex-specificity of clusters.

2) The reviewers have noticed that, in multiple cases, there are discrepancies between the paper's conclusions and established knowledge or previous datasets. There has to be a clear discussion of the reasons or experimental resolution.

3) The paper is long and detailed and the reviewers feel that there has to be a Conclusion Section that summarizes the most interesting observations and makes the paper accessible to a broader audience.

4) In many cases, the reviewers noticed images that were of low quality or resolution. This also has to be addressed.

Reviewer #2 (Recommendations for the authors):

I have some concerns about the way neuron number was normalized per cluster per sex. I do not have a better way to do this analysis but would instead like to see a specific example cell type/cluster pinned down via counting the number of real neurons. This could perhaps be done using a known cell type and antibody stains for a marker protein or a GAL4-reporter combination, to determine whether the difference in neural number between sexes for a particular cluster is real. This approach was used to evaluate female-specific cluster 107, but not to validate any differences in neuron number in a cluster that is present in both sexes. The conclusion that so many clusters contain different numbers of neurons between the sexes is an important finding and additional "ground truthing" would help support the authors conclusions.

Many of the figure panels that contain antibody stains are not of sufficiently high quality to really see what is going on, even in the separate higher res version of the Figure. This is true in several places. For example, for Figure 7E, please consider cropping or including a second set of panels with insets showing the region with expression. Many brains, exe. all of Figure 10, are not high enough resolution to see what the tiny arrows are pointing at.

I was not able to evaluate a section of Figure 5: there are no panels after 5S. The text refers to experiments in 5T and 5U that are not shown.

"However, we do not find broad expression of ChAT at 48hr 478 APF, which is not consistent with our single cell data that shows extensive expression of ChAT." I was hoping for some discussion of this discrepancy, or an experimental resolution. mRNA expression does not always match protein (or reporter protein) expression. This can be especially true for terminal genes like neurotransmitters or Rhodopsins, where there are cases of high mRNA expression long before protein expression. I would love to see HCR in situs for ChAT vs. an antibody stain or protein reporter. This examination could potentially be restricted to a specific cell type for clarity. Comparing the two would be an interesting addition, especially given such a distinct difference that is otherwise unexplained in the manuscript. At the very least, this result should be discussed.

I'm not entirely sure I understand the link between the section on circadian rhythm and the rest of the paper. The authors were able to identify and label DN1p in the scSeq data, but unless I misunderstand, do not focus on DN1p in the experiments on sexually dimorphic roles of activation of a subset of circadian rhythm neurons.

Reviewer #3 (Recommendations for the authors):

In addition, we have some comments and questions about the data analysis that could impact some of these "big picture" conclusions.

A. One of the big findings the authors report from their single cell analysis is the apparent detection of male-specific and female-specific neuron populations, a finding that would be of wide interest to those interested in sex differences. However, we feel that support from a further analysis would help make this result more convincing. The authors currently infer sex-specificity in some of their clusters based on the absence of cells from one sex after merging the datasets. This raises an obvious question – to what extent does this inference depend on clustering parameters, and how do you know what the optimal clustering strategy is? For example, in the combined-sex analysis, clusters 12/5/16 are considered separate clusters (Figure 1), and cluster 12 is designated as strongly male-biased and clusters 5 and 16 as strongly female-biased (Figure 2B). On the other hand, clusters 3 and tachykinin-1 are each considered a single cluster (Figure 1) and are considered sexually monomorphic (Figure 2B), but a look at Figure 2A shows that each of these clusters has a male half and a female half. Is there an objective reason for deciding when we are looking at a single monomorphic cluster with sex-biased gene expression, and when we are looking at a closely related pair of sex-limited clusters? The authors describe their statistical approach to determine the number of clusters; but that is going to be affected by sequencing depth, variation in staging, batch effects, cell cycle differences, etc. What is the most biologically significant clustering strategy, and how do you determine that? How do you decide, as a neurobiologist, whether you are under- or over-clustering your data, and when you hit that Goldilocks spot where your clusters are most likely to correspond to biologically meaningful cell types? These issues are worth discussing. (On a technical note, please use different colors in Figure 2B – the color for strong male bias looks almost the same as the color from strong female biased, which confused us until we figured that out).

There is a technical approach that could help with this. Merging combines the raw count matrices from different datasets, but an alternative is to integrate them (implemented through Seurat), which uses common anchors between cells across datasets to promote the identification of shared cell types. We think it would be worth seeing if this sex-specificity holds after integration. If it does, then that's two independent pipelines that identify sex-specific populations. We think that integrating, rather than merging, might help tighten up some of the sex difference analyses too. Looking at Figure 2A, it's clear that many of the clusters don't merge particularly well. The authors use this as evidence of variation in the extent of sexual dimorphism between populations (as labelled in Figure 2B). But we worry that this creates fuzzy boundaries between male and female populations of what may be the same neuron type. And this fuzziness could affect where the cluster boundaries are called. Comparing the bottom part of Figure 1F and 2A, we can see a cluster 6 that is called as a single cluster despite clearly separating into male and female populations in Figure 2A. This contrasts with Octopamine_1, which seems to be largely/entirely female-biased, while Octopamine_2 is male-biased. There are other examples like that. We wonder whether integration (rather than merging) will do a better job of identifying homologous cell types. The authors could then ask what is differentially expressed between male and female cell-types within a cluster.

B. A general conclusion the authors reach is that male and female fru+ neurons 'share common gene expression repertoires with sex-specific information overlaid on these core patterns' (lines 145-149; see also 1057). This conclusion rests on their observation that 'nearly all clusters are comprised of male and female neurons'. On the one hand, this is what UMAP does – it clusters cells by expression similarities. How do we know that a cluster containing both male and female cells on a UMAP actually corresponds to the same population in male and female brains? On the other hand, taken alongside the small number of 'sex-specific' clusters they resolve, it seems that the authors are arguing that the vast majority of the neurons are shared between the sexes, but that these shared types generally show sexual dimorphism in their expression. This is an interesting result, but it rests exclusively on the distribution of cells in UMAP space and the cluster identification boundaries. For the reasons outlined above, these features of the UMAP/clustering are heavily dependent on the input parameters. Without some sort of orthogonal validation, it's hard to know how robust this result is.

C. A potential problem with the sex difference analyses, which the authors themselves recognize (line 249), is how to exclude the possibility that much of the difference is driven by sex differences in development time and/or cell cycle phase. The data discussed on lines 970-974 seems to confirm that the differences in developmental timing could be a significant contributor to male- and female-specific gene expression. The authors show some of their stainings at two different time points, which helps for those specific genes, but it's a potential issue that limits the interpretation of the transcriptomic data itself and the authors should probably highlight it. On the analysis side, there's not much that can be done about development time, but have the authors tried to account for cell cycle?

D. Organizing the section on transcription factors (starting at line 1023) by TF superfamily may not be the most informative approach, since the type of DNA binding domain is not directly related to developmental function. It may be better to organize this section by the degree of specificity: start with which TFs, regardless of their family, are the most specific to candidate cell types – and which TFs are specific to particular clusters that were singled out for detailed analysis in previous sections (e.g. KC cells, clock neurons, which TFs show correlation with particular neurotransmitters, etc). Better biological insights are likely to be obtained this way. Figure 12 and its supplements are near-impossible to take in. It would be better to replace the main-text figure with examples of highly specific TFs mapped on UMAPs, perhaps in relation to cluster annotation developed previously in this paper and move the big grid to the supplement. Showing a separate panel with 5-10 most sex-biased TFs for each sex would be more informative than annotating them on the large cluster*TF grid.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Single-cell transcriptome profiles of Drosophila fruitless-expressing neurons from both sexes" for further consideration by eLife. Your revised article has been evaluated by Claude Desplan (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

When comparing the merged and integrated dataset, the authors argue that they are highly concordant. Nevertheless, based on Figure 1 and Figure Supplement 5C-D, it is pretty obvious that integration worked better since there are a number of library-specific groups of datapoints in the merged UMAP. It would make it easier to visualise the integration of the libraries, even if the points are shuffled. In R, this is implemented in DimPlot of Seurat, by the option shuffle=TRUE.

We appreciate that the authors do not want to repeat the analysis, but they should definitely compare how the two approaches affect their analysis and not only rely on the visual (which, furthermore, is not convincing).

This can indeed lead to serious misinterpretations of the data when clusters that are library-specific are simply the product of batch effect. For example, the authors identify more mushroom body subtypes (9 or 13 depending on the clustering) than what has been reported in the literature (7). Are these clusters reliable? Or do they result from batch effect? Comparing Figure 1, Figure Supplement 5C-D, and Figure 4A, it seems very plausible that cluster γ_KCs_3 is library-specific. This problem might also occur in other neuronal structures, where the cell type composition is not as well known as the mushroom bodies.

https://doi.org/10.7554/eLife.78511.sa1

Author response

Essential revisions:

1) The reviewers have raised concerns regarding the "sex-specific" versus "shared but sexually dimorphic" cell types. They recommend a careful re-analysis of the data to address this exact question. In particular, they recommend generating a single atlas with both male and female cells under different integration or merging conditions. This would include:

a) a careful parameter selection, such as resolution, that will allow for evaluation of subclusters. They also recommend a comparison between integration and merging of male and female datasets

We performed an additional analysis where we integrated the data sets using the Seurat FindIntergrationAnchors function and the IntegrateData functions, as suggested. The authors did not do an integrated analysis initially given we did not see large batch effects across the replicates, and all the samples were generated using the same 10x Genomics platform and sequencing methodologies. The Integrated results are highly consistent with our original merged data analyses. We have added a new figure (Figure 1 —figure supplement 5), to provide a side-by-side visual of these UMAP plots for the reader. The number of cells contributed from each sex in this integration analysis has been added to Source Data 2 and reveals highly similar trends to that of our presented analysis. The integrated analysis also reveals overlap between the neurons from males and females across clusters, few sex-specific clusters, and clusters with sex-biases in cell number. Further, when we examine the sex-specific clusters derived from the integrated analysis, we find that the male-specific cluster has dsx as a top marker gene, serving as a ground truth in both analyses, given there are known small populations of male-specific neurons that express both dsx and fru (PMID: 16950103, PMID: 32314964, PMID: 12745633, PMID: 17716899, PMID: 18599032). Given the methodologies produced similar results, and the merged analyses was appropriate given there were no large batch effects or changes in methodologies in making the libraries, the authors continue to move forward with the original merged data presented and the substantial analyses on that data set. The integrated analyses are available in the supplemental materials and uploaded to GEO. A summary of this analysis and a figure (Figure 1—figure supplement 5) have been added to the manuscript, lines 221-225. The methodology for this analysis has also been added to the Materials and methods section, lines 1176-1187.

To address the concern about parameter selection, we clarify our approaches and provide in vivo evidence that the resolution chosen was appropriate. In the analyses presented, we used the JackStraw statistical approach to decide on the number of principle components (p <0.05). We next compared several resolutions and examined the resulting trees using Clustree, to visually inspect the additional clusters produced at each resolution. We decided to use the male-specific dsx cluster as a ground truth, as we know these neurons exist biologically and are a small population in vivo. Subclustering of the full data set did not reveal additional male-specific dsx-expressing clusters, giving us further confidence in the resolution chosen. We acknowledge that higher resolution clustering will result in more clusters, though we did not find substantial additional cluster separations with more resolution: the larger clusters were no longer sub-dividing by increasing the resolution. The data set presented here has more clusters than the published full brain data set, further suggesting we used resolution aligned with what is expected for a smaller neuronal data set. Finally, it does not appear that we used too high a resolution, given our biological validation of two clusters comprised of few neurons: the female-specific CCHa2 cluster, and the sex-shared DIP-iota-expressing cluster a new figure, Figure 1 —figure supplement 8 (DIP-iota described in lines 1169-1175). The in vivo results showed there were few neurons that are CCHa2fru P1 or DIP-iotafru P1, consistent with scRNA-seq clustering data. Further, we also present in vivo validation of large clusters by examining genes that are broadly expressed in large clusters, such as the nAchR-encoding genes (Figure 9, lines 894-912).

b) doublet elimination using appropriate algorithms (e.g. DoubletFinder or other).

We applied DoubletFinder to further evaluate our filtering criteria. DoubletFinder does identify additional potential doublets that were not filtered using our initial criteria. We have added the DoubletFinder results to Source Data 2 and add this point to our discussion (lines 1099-1102). The potential additional doublets are not concentrated in any one cluster or replicate, indicating that doublets are not a large confound across the cluster analyses presented and are not the only cells in any cluster. As with any genomic analyses, researchers need to establish their criteria using the tools and genome releases available at the time and move forward with their extensive annotation and validations. Additionally, our initial filtering criteria to remove potential doublets were rigorous. We eliminated outliers by maintaining cells in our analysis that met the following criteria: >5% mitochondrial transcripts (dead/dying/stressed cells), <200 genes (empty droplets), those expressing more than 4,000 features (genes) and/or possessing more than 20,000 UMIs (potential doublets or triplets) (Lines 172-177 and 1149-1157). These parameters are similar to those used in Drosophila cell atlases (PMID: 32314735, PMID: 29671739, PMID: 31746739, PMID: 36289550, PMID: 31915294, PMID: 32339165, PMID: 32396065).

c) elimination of cell cycle influence.

Animals used in this study were 48 hours after puparium formation (APF). At this time point, all neuroblasts divisions have ceased, except for four mushroom body neuroblasts (PMID: 1728583), so cell division is unlikely to be having a global impact on this data set. To further address this concern, we examined expression of stg, a cell-cycle gene with elevated expression in mitotic neuroblasts. stg expression was only found in 0.3% of neurons in our data set (lines 193-197 and Source data 2). Furthermore, we observe that a large percentage of neurons express the post-mitotic genes elav (77%) and nsyb (93%). These results suggest that there is minimal cell-cycle influence on gene expression in our data set.

After generating such an atlas, an experimental validation of differences in cell number or sex-specific markers in a few well-known clusters would be necessary to address the question of sex-specificity of clusters.

In this study we find two high confidence sex-specific clusters. These clusters were identified after both downsampling and data normalization approaches. One is the known male-specific dsx-expressing cluster (PMID: 16950103, PMID: 32314964, PMID: 12745633, PMID: 17716899, PMID: 18599032) and the other is the newly identified female-specific CCHa2-expressing cluster validated in this study. Therefore, there is in vivo experimental evidence for these small sex-specific clusters.

One limitation of scRNA-seq studies is that the workflows may be performed with neurons/cells from several animals and the workflows may results in loss of neurons/cells. Therefore, the number of neurons/cells present in any cluster may differ from what is present in the animal. However, even given this caveat we observed a high correlation in the number of neurons in each cluster between the replicates from each sex (male replicates r=0.93; female replicates r=0.88)(Source data 2). This indicates that there is limited technical variation in terms of the cells captured in each replicate, providing confidence in the comparisons made based on cell numbers.

As further validation that cluster size reflects in vivo biology, we show that clusters comprised of few neurons have correspondingly low numbers in the animal (CCHa2, DIP-iota, and circadian neurons, Figure 2 and Figure 1 —figure supplement 8, Figure 7 and Figure 7 —figure supplement 1) and clusters comprised of many neurons have a high number in animals (several nAchR subunit genes; Figure 9).

2) The reviewers have noticed that, in multiple cases, there are discrepancies between the paper's conclusions and established knowledge or previous datasets. There has to be a clear discussion of the reasons or experimental resolution.

We address all the reviewer comments. We have also added commentary about limitations of our study to address additional concerns.

3) The paper is long and detailed and the reviewers feel that there has to be a Conclusion Section that summarizes the most interesting observations and makes the paper accessible to a broader audience.

We have added a more comprehensive conclusions section with a summary of our results.

4) In many cases, the reviewers noticed images that were of low quality or resolution. This also has to be addressed.

We appreciate this being pointed out. Higher resolution figures have been provided or will be provided upon publication. We are planning on submitting for the reviewers high quality images using the Zipped related manuscript file to avoid issue with individual file sizes, as needed.

Reviewer #2 (Recommendations for the authors):

I have some concerns about the way neuron number was normalized per cluster per sex. I do not have a better way to do this analysis but would instead like to see a specific example cell type/cluster pinned down via counting the number of real neurons. This could perhaps be done using a known cell type and antibody stains for a marker protein or a GAL4-reporter combination, to determine whether the difference in neural number between sexes for a particular cluster is real. This approach was used to evaluate female-specific cluster 107, but not to validate any differences in neuron number in a cluster that is present in both sexes. The conclusion that so many clusters contain different numbers of neurons between the sexes is an important finding and additional "ground truthing" would help support the authors conclusions.

We have added biological validation that shows the number of neurons per cluster reflects biology. We now show the small sex-shared DIP-iota-expressing cluster (Cluster 57) reflects the in vivo observations of few neurons in the animal, with sex-differences in number (Figure 1 —figure supplement 8, lines 1169-1175). We show that one neuron is restricted to the TN1 segment of the VNC in males and two in females at 48hr APF. In addition, there are several neurons in the male abdominal ganglion portion of the VNC that are absent in females, which is reflected in the male-bias in cell number in the cluster. Cluster 57 in the full data set contains 91 males cells and 46 female cells (Source Data 5). Given that 20 animals were used per replicate, this also validates that we are capturing very small neuronal populations in our data. As noted above, the two sex-specific clusters (female-specific CCHa2 and male-specific dsx) also show that the number of neurons found in a cluster reflects in vivo biology. Additionally, we note above that large clusters with genes expressed in large percentage of the neurons have broad expression in vivo (nAchR subunits). We also perform independent validation for the circadian neurons in both sexes using antibody staining and Gal4 tools (Figure 7 and Figure 7 – supplement 1).

Furthermore, the results presented are consistent with studies that have shown that there are differences in fru P1 neuron number between the sexes (PMID: 15959468, PMID: 15935765, PMID: 20832311, PMID: 20832315). Several studies have also shown sex differences in neuron number in fru P1 neuron sub-populations defined by immunostaining with antibodies and Gal4 driver intersectional strategies (PMID: 20832311, PMID: 20832315, PMID: 18786359, PMID: 16281036).

Additionally, we provide strong evidence that the number of cells per cluster is unlikely due to batch effects, which would reflect technical issues, given the high correlation of numbers of cells per cluster across the replicates (male replicates r=0.93; female replicates r=0.88) (Source data 2).

Many of the figure panels that contain antibody stains are not of sufficiently high quality to really see what is going on, even in the separate higher res version of the Figure. This is true in several places. For example, for Figure 7E, please consider cropping or including a second set of panels with insets showing the region with expression. Many brains, exe. all of Figure 10, are not high enough resolution to see what the tiny arrows are pointing at.

We appreciate this feedback and have made sure to provide higher resolution figures to address this issue. We have asked to provide higher quality images for review through the submission system and will be certain that these are used for publication.

I was not able to evaluate a section of Figure 5: there are no panels after 5S. The text refers to experiments in 5T and 5U that are not shown.

We apologize for our error in these figure callouts, this has now been resolved.

"However, we do not find broad expression of ChAT at 48hr 478 APF, which is not consistent with our single cell data that shows extensive expression of ChAT." I was hoping for some discussion of this discrepancy, or an experimental resolution. mRNA expression does not always match protein (or reporter protein) expression. This can be especially true for terminal genes like neurotransmitters or Rhodopsins, where there are cases of high mRNA expression long before protein expression. I would love to see HCR in situs for ChAT vs. an antibody stain or protein reporter. This examination could potentially be restricted to a specific cell type for clarity. Comparing the two would be an interesting addition, especially given such a distinct difference that is otherwise unexplained in the manuscript. At the very least, this result should be discussed.

We acknowledge that this is a limitation of our study and now provide discussion of this result. HCR analyses will take several months to years of work with multiple HCR probes. This is a large undertaking, for a manuscript that is already lengthy. Follow-up with such an approach would be very interesting future work that we now discuss in our conclusions section.

I'm not entirely sure I understand the link between the section on circadian rhythm and the rest of the paper. The authors were able to identify and label DN1p in the scSeq data, but unless I misunderstand, do not focus on DN1p in the experiments on sexually dimorphic roles of activation of a subset of circadian rhythm neurons.

Here, we have implemented existing Gal4 tools to provide a more in-depth biological validation of the scRNA-seq dataset. We are capturing these neurons in our cell atlas at 48hr APF and sought to assess their function in behaving adult flies, when locomotor behaviors can be examined in a circadian context. Given this comment, we have modified and shortened this section to highlight the major findings. Based on the immunostaining staining results, the in vivo behavioral studies performed with the TrpA1 transgene expression modify activity of DN1, DN3, and LNds. We hope we understood the question being asked.

Reviewer #3 (Recommendations for the authors):

In addition, we have some comments and questions about the data analysis that could impact some of these "big picture" conclusions.

A. One of the big findings the authors report from their single cell analysis is the apparent detection of male-specific and female-specific neuron populations, a finding that would be of wide interest to those interested in sex differences. However, we feel that support from a further analysis would help make this result more convincing. The authors currently infer sex-specificity in some of their clusters based on the absence of cells from one sex after merging the datasets. This raises an obvious question – to what extent does this inference depend on clustering parameters, and how do you know what the optimal clustering strategy is? For example, in the combined-sex analysis, clusters 12/5/16 are considered separate clusters (Figure 1), and cluster 12 is designated as strongly male-biased and clusters 5 and 16 as strongly female-biased (Figure 2B). On the other hand, clusters 3 and tachykinin-1 are each considered a single cluster (Figure 1) and are considered sexually monomorphic (Figure 2B), but a look at Figure 2A shows that each of these clusters has a male half and a female half. Is there an objective reason for deciding when we are looking at a single monomorphic cluster with sex-biased gene expression, and when we are looking at a closely related pair of sex-limited clusters? The authors describe their statistical approach to determine the number of clusters; but that is going to be affected by sequencing depth, variation in staging, batch effects, cell cycle differences, etc. What is the most biologically significant clustering strategy, and how do you determine that? How do you decide, as a neurobiologist, whether you are under- or over-clustering your data, and when you hit that Goldilocks spot where your clusters are most likely to correspond to biologically meaningful cell types? These issues are worth discussing. (On a technical note, please use different colors in Figure 2B – the color for strong male bias looks almost the same as the color from strong female biased, which confused us until we figured that out).

There is a technical approach that could help with this. Merging combines the raw count matrices from different datasets, but an alternative is to integrate them (implemented through Seurat), which uses common anchors between cells across datasets to promote the identification of shared cell types. We think it would be worth seeing if this sex-specificity holds after integration. If it does, then that's two independent pipelines that identify sex-specific populations. We think that integrating, rather than merging, might help tighten up some of the sex difference analyses too. Looking at Figure 2A, it's clear that many of the clusters don't merge particularly well. The authors use this as evidence of variation in the extent of sexual dimorphism between populations (as labelled in Figure 2B). But we worry that this creates fuzzy boundaries between male and female populations of what may be the same neuron type. And this fuzziness could affect where the cluster boundaries are called. Comparing the bottom part of Figure 1F and 2A, we can see a cluster 6 that is called as a single cluster despite clearly separating into male and female populations in Figure 2A. This contrasts with Octopamine_1, which seems to be largely/entirely female-biased, while Octopamine_2 is male-biased. There are other examples like that. We wonder whether integration (rather than merging) will do a better job of identifying homologous cell types. The authors could then ask what is differentially expressed between male and female cell-types within a cluster.

We address our clustering and analysis parameters in detail above. Based on the reviewers’ suggestions we used the integration functions in Seurat. We agree that there are challenges in discerning between what could be considered single monomorphic clusters with sex-biased gene expression and closely related pairs of sex-specific clusters. Given this, we felt it best to assign a cluster resolution that revealed known sex-specific populations of fru P1-expressing neurons and stopped at a resolution where we no longer saw large clusters splitting into smaller clusters, using Clustree (see above). We acknowledge the additional potential limitations reviewer 3 raises in the conclusions section of the manuscript.

The colors in Figure 2B and in Figure 2 supplement 1 have been changed.

B. A general conclusion the authors reach is that male and female fru+ neurons 'share common gene expression repertoires with sex-specific information overlaid on these core patterns' (lines 145-149; see also 1057). This conclusion rests on their observation that 'nearly all clusters are comprised of male and female neurons'. On the one hand, this is what UMAP does – it clusters cells by expression similarities. How do we know that a cluster containing both male and female cells on a UMAP actually corresponds to the same population in male and female brains? On the other hand, taken alongside the small number of 'sex-specific' clusters they resolve, it seems that the authors are arguing that the vast majority of the neurons are shared between the sexes, but that these shared types generally show sexual dimorphism in their expression. This is an interesting result, but it rests exclusively on the distribution of cells in UMAP space and the cluster identification boundaries. For the reasons outlined above, these features of the UMAP/clustering are heavily dependent on the input parameters. Without some sort of orthogonal validation, it's hard to know how robust this result is.

The authors point to previous studies in the field to provide orthogonal validation. First, results from our laboratory using the cell-type-specific, RNA sequencing approach called Translating Ribosome affinity purification (TRAP) showed that there are a large number of genes that are expressed in fru P1 neurons from both males and females (core genes), and a much smaller set of male- and female-biased genes. This was true across two independent studies, examining three developmental time points (PMID: 27247289, and PMID: 33901168). That there is a much larger core set of genes expressed in fru P1 neurons, than those with sex-biased expression supports the idea of shared core expression networks, overlaid with sex-specific information. The TRAP results are completely independent of cluster-based tools and were performed using independent statistical methodologies.

Additional evidence that neurons are “shared” between the sexes comes from a cell lineage study that mapped fru neurons to their neuroblast lineage origins, with high resolution. This allowed detailed comparisons and showed they have shared origins (PMID: 27618265)

Furthermore, several studies from our lab and others in the field have shown that fru P1 neurons identified using genetic intersectional approaches identify populations in both males and females with highly similar spatial patterns (PMID: 20832315, PMID: 33901168, PMID: 20832311, PMID: 33616528). The intersectional approaches rely on expression of individual genes that have overlapping expression with fru P1, so this also provides support to the idea that cells in the same spatial position have shared gene expression. The finding of homologously positioned neurons in males and females using intersectional approaches is true across a large number of independent genes.

Text has been added to the manuscript on Lines 261-276

C. A potential problem with the sex difference analyses, which the authors themselves recognize (line 249), is how to exclude the possibility that much of the difference is driven by sex differences in development time and/or cell cycle phase. The data discussed on lines 970-974 seems to confirm that the differences in developmental timing could be a significant contributor to male- and female-specific gene expression. The authors show some of their stainings at two different time points, which helps for those specific genes, but it's a potential issue that limits the interpretation of the transcriptomic data itself and the authors should probably highlight it. On the analysis side, there's not much that can be done about development time, but have the authors tried to account for cell cycle?

As the reviewer indicates, the authors note the potential impacts of sex-differences in developmental timing at this stage. We have provided evidence that the cell cycle phase is not a major source of variation in this data set (see above).

D. Organizing the section on transcription factors (starting at line 1023) by TF superfamily may not be the most informative approach, since the type of DNA binding domain is not directly related to developmental function. It may be better to organize this section by the degree of specificity: start with which TFs, regardless of their family, are the most specific to candidate cell types – and which TFs are specific to particular clusters that were singled out for detailed analysis in previous sections (e.g. KC cells, clock neurons, which TFs show correlation with particular neurotransmitters, etc). Better biological insights are likely to be obtained this way. Figure 12 and its supplements are near-impossible to take in. It would be better to replace the main-text figure with examples of highly specific TFs mapped on UMAPs, perhaps in relation to cluster annotation developed previously in this paper and move the big grid to the supplement. Showing a separate panel with 5-10 most sex-biased TFs for each sex would be more informative than annotating them on the large cluster*TF grid.

The reviewer makes a good point and modify the manuscript figure 12 and provide a more focused analysis of 6 genes (3 per sex). We provide the TF superfamily cluster*TF grid results in the supplemental materials, as a reader may find them informative for a particular question.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

When comparing the merged and integrated dataset, the authors argue that they are highly concordant. Nevertheless, based on Figure 1 and Figure Supplement 5C-D, it is pretty obvious that integration worked better since there are a number of library-specific groups of datapoints in the merged UMAP. It would make it easier to visualise the integration of the libraries, even if the points are shuffled. In R, this is implemented in DimPlot of Seurat, by the option shuffle=TRUE.

We appreciate that the authors do not want to repeat the analysis, but they should definitely compare how the two approaches affect their analysis and not only rely on the visual (which, furthermore, is not convincing).

This can indeed lead to serious misinterpretations of the data when clusters that are library-specific are simply the product of batch effect. For example, the authors identify more mushroom body subtypes (9 or 13 depending on the clustering) than what has been reported in the literature (7). Are these clusters reliable? Or do they result from batch effect? Comparing Figure 1, Figure Supplement 5C-D, and Figure 4A, it seems very plausible that cluster γ_KCs_3 is library-specific. This problem might also occur in other neuronal structures, where the cell type composition is not as well known as the mushroom bodies.

The authors recognize that the reviewer wants to ensure that the computational analyses are robust. We agree that batch correction methodologies are critical for analyzing data sets that exhibit large degrees of variation between libraries or use different scRNA-seq chemistry platforms. While batch corrections are important in these circumstances, the use of the integrated analysis approach needs to be weighed against the outcome that the approach limits detection of true biological differences due to the algorithmic method of converging the data sets. Integrated analyses may have the unintended consequence of limiting the detection of true biological differences between the sexes in fru P1-expressing neurons, the primary goal of this study.

At the start of our analyses, we examined if the data have batch effects by visual inspection and by comparing the number of cells contributed per replicate for each cluster (Source data 2). We find high concordance in terms of number of cells contributed to each cluster between the replicates, as detailed in our previous response. In the full data set, using the merged analysis approach, we do identify 9 clusters that exhibit replicate-specific data from either sex when we performed data merging (Source data 2). However, we still argue that our technical replicates exhibit only a small degree of batch effects, based on the criteria above and using the merged approach is superior for this study. As suggested, the authors have additionally performed integrated analyses and find that this analysis has 7 clusters that exhibit replicate-specific data from either sex (Source data 2). Therefore, with respect to potentially improving replicate-specific population of a cluster, the integrated analyses does not perform that differently. We have added these points to the manuscript (lines: 222-228; 1181-1198).

We proceeded with the merged analyses for the data presented in the paper to ensure that we are able to best address our biological question, even with the potential small batch effects. This is reasonable as the data from the replicates overlaps for the majority of the clusters, and the integrated analysis also has replicate-specific data. Furthermore, integrated analyses may obscure detecting true biological differences, limiting our study results. We provide additional data for our reader to interrogate the integrated analysis further, by including the marker gene lists for that analysis in this manuscript (Source data 3), though we argue the merged data set analyses are better overall. Given the recommendation to improve the visualization, the authors have changed the UMAP figures that show the replicates using the option “shuffle = TRUE” (Figure 1 —figure supplement 1I, Figure 1 —figure supplement 5C-D). We agree that this is a better method to visualize the data for each replicate.

We further address the issue that using the merged analyses may skew the mushroom body results. The mushroom body analysis is mentioned by the reviewers, so the authors visualized those clusters by replicate, as well as cross-checked with Source data 2. We conclude that γ_KCs_3 is not library-specific. It is largely female-biased but neurons are contributed by all replicates in the merged data set (Source data 2). In addition, this recommendation prompted us to reexamine the subclustering results for the mushroom body analyses. Again, we find that no clusters are library specific and have added this visualization for transparency (Figure 4 —figure supplement 1K). Based on this visual inspection, we do find that male-specific subcluster 4 does exhibit a slight batch effect between the libraries based on distance of cells within that cluster, but there are cells from each replicate in the cluster.

https://doi.org/10.7554/eLife.78511.sa2

Article and author information

Author details

  1. Colleen M Palmateer

    Department of Biomedical Sciences, Florida State University, College of Medicine, Tallahassee, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Validation, Investigation, Visualization, Writing - original draft, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-7254-0829
  2. Catherina Artikis

    Department of Biomedical Sciences, Florida State University, College of Medicine, Tallahassee, United States
    Contribution
    Validation, Visualization
    Competing interests
    No competing interests declared
  3. Savannah G Brovero

    Department of Biomedical Sciences, Florida State University, College of Medicine, Tallahassee, United States
    Contribution
    Methodology
    Competing interests
    No competing interests declared
  4. Benjamin Friedman

    Department of Biomedical Sciences, Florida State University, College of Medicine, Tallahassee, United States
    Contribution
    Validation, Visualization
    Competing interests
    No competing interests declared
  5. Alexis Gresham

    Department of Biomedical Sciences, Florida State University, College of Medicine, Tallahassee, United States
    Contribution
    Validation, Visualization
    Competing interests
    No competing interests declared
  6. Michelle N Arbeitman

    1. Department of Biomedical Sciences, Florida State University, College of Medicine, Tallahassee, United States
    2. Program of Neuroscience, Florida State University, Tallahassee, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    michelle.arbeitman@med.fsu.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-2437-4352

Funding

National Institute of General Medical Sciences (R01GM073039)

  • Colleen M Palmateer
  • Catherina Artikis
  • Savannah G Brovero
  • Benjamin Friedman
  • Alexis Gresham
  • Michelle N Arbeitman

National Institute of General Medical Sciences (R01GM116998)

  • Colleen M Palmateer
  • Catherina Artikis
  • Savannah G Brovero
  • Benjamin Friedman
  • Alexis Gresham
  • Michelle N Arbeitman

National Institute of General Medical Sciences (1R35GM148205)

  • Michelle N Arbeitman

The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank the Florida State University College of Medicine Translation Core for running the sequencing core. Several stocks were obtained from the Bloomington Drosophila Stock Center (NIH P40OD018537). Several antibodies used in this study were obtained from the Developmental Studies Hybridoma Bank, created by the NICHD of the NIH and maintained at The University of Iowa, Department of Biology, Iowa City, IA 52242. We thank colleagues that generously provided reagents, including Drosophila strains and antibody resources.

Senior Editor

  1. Claude Desplan, New York University, United States

Reviewing Editor

  1. Nikos Konstantinides, Institut Jacques Monod, France

Version history

  1. Received: March 9, 2022
  2. Preprint posted: March 11, 2022 (view preprint)
  3. Accepted: January 8, 2023
  4. Version of Record published: February 1, 2023 (version 1)

Copyright

© 2023, Palmateer et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 2,679
    Page views
  • 217
    Downloads
  • 3
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Colleen M Palmateer
  2. Catherina Artikis
  3. Savannah G Brovero
  4. Benjamin Friedman
  5. Alexis Gresham
  6. Michelle N Arbeitman
(2023)
Single-cell transcriptome profiles of Drosophila fruitless-expressing neurons from both sexes
eLife 12:e78511.
https://doi.org/10.7554/eLife.78511

Share this article

https://doi.org/10.7554/eLife.78511

Further reading

    1. Evolutionary Biology
    2. Genetics and Genomics
    Thomas A Sasani, Aaron R Quinlan, Kelley Harris
    Research Article

    Maintaining germline genome integrity is essential and enormously complex. Although many proteins are involved in DNA replication, proofreading, and repair, mutator alleles have largely eluded detection in mammals. DNA replication and repair proteins often recognize sequence motifs or excise lesions at specific nucleotides. Thus, we might expect that the spectrum of de novo mutations – the frequencies of C>T, A>G, etc. – will differ between genomes that harbor either a mutator or wild-type allele. Previously, we used quantitative trait locus mapping to discover candidate mutator alleles in the DNA repair gene Mutyh that increased the C>A germline mutation rate in a family of inbred mice known as the BXDs (Sasani et al., 2022, Ashbrook et al., 2021). In this study we developed a new method to detect alleles associated with mutation spectrum variation and applied it to mutation data from the BXDs. We discovered an additional C>A mutator locus on chromosome 6 that overlaps Ogg1, a DNA glycosylase involved in the same base-excision repair network as Mutyh (David et al., 2007). Its effect depends on the presence of a mutator allele near Mutyh, and BXDs with mutator alleles at both loci have greater numbers of C>A mutations than those with mutator alleles at either locus alone. Our new methods for analyzing mutation spectra reveal evidence of epistasis between germline mutator alleles and may be applicable to mutation data from humans and other model organisms.

    1. Evolutionary Biology
    2. Genetics and Genomics
    Ban Wang, Alexander L Starr, Hunter B Fraser
    Research Article

    Although gene expression divergence has long been postulated to be the primary driver of human evolution, identifying the genes and genetic variants underlying uniquely human traits has proven to be quite challenging. Theory suggests that cell-type-specific cis-regulatory variants may fuel evolutionary adaptation due to the specificity of their effects. These variants can precisely tune the expression of a single gene in a single cell-type, avoiding the potentially deleterious consequences of trans-acting changes and non-cell type-specific changes that can impact many genes and cell types, respectively. It has recently become possible to quantify human-specific cis-acting regulatory divergence by measuring allele-specific expression in human-chimpanzee hybrid cells—the product of fusing induced pluripotent stem (iPS) cells of each species in vitro. However, these cis-regulatory changes have only been explored in a limited number of cell types. Here, we quantify human-chimpanzee cis-regulatory divergence in gene expression and chromatin accessibility across six cell types, enabling the identification of highly cell-type-specific cis-regulatory changes. We find that cell-type-specific genes and regulatory elements evolve faster than those shared across cell types, suggesting an important role for genes with cell-type-specific expression in human evolution. Furthermore, we identify several instances of lineage-specific natural selection that may have played key roles in specific cell types, such as coordinated changes in the cis-regulation of dozens of genes involved in neuronal firing in motor neurons. Finally, using novel metrics and a machine learning model, we identify genetic variants that likely alter chromatin accessibility and transcription factor binding, leading to neuron-specific changes in the expression of the neurodevelopmentally important genes FABP7 and GAD1. Overall, our results demonstrate that integrative analysis of cis-regulatory divergence in chromatin accessibility and gene expression across cell types is a promising approach to identify the specific genes and genetic variants that make us human.