Abstract
In mammals, spermatogonial cells (SPGs) are undifferentiated male germ cells in testis that are quiescent until birth and then self-renew and differentiate to produce spermatogenic cells and functional sperm from early postnatal life throughout adulthood. The transcriptome of SPGs is highly dynamic and timely regulated during postnatal development. We examined if such dynamics involves changes in chromatin organization by profiling the transcriptome and chromatin accessibility of SPGs from early postnatal stages to adulthood in mice using deep RNA-seq, ATAC-seq and computational deconvolution analyses. By integrating transcriptomic and epigenomic features, we show that SPGs undergo massive chromatin remodeling during postnatal development that partially correlates with distinct gene expression profiles and transcription factors (TF) motif enrichment. We identify genomic regions with significantly different chromatin accessibility in adult SPGs that are marked by histone modifications associated with enhancers and promoters. Some of the regions with increased accessibility correspond to transposable element subtypes enriched in multiple TFs motifs and close to differentially expressed genes. Our results underscore the dynamics of chromatin organization in developing germ cells and complement existing datasets on SPGs by providing maps of the regulatory genome at high resolution from the same cell populations at early postnatal, late postnatal and adult stages collected from single individuals.
Introduction
Spermatogonial cells (SPGs) are the initiators and supporting cellular foundation of spermatogenesis in testis in many species, including mammals. In the mammalian testis, the founding germ cells are primordial germ cells (PGCs), which give rise sequentially to different populations of SPGs: primary transitional (T1)-prospermatogonia (ProSG), secondary transitional (T2)-ProSG and then spermatogonial stem cells (SSCs) (McCarrey, 2013; Rabbani et al., 2022; Tan et al., 2020). The ProSG population is exhausted by postnatal day (PND) 5 (Drumond et al., 2011) and by PND6-8, distinct SPGs subtypes can be distinguished on the basis of specific marker proteins and regenerative capacity (Cheng et al., 2020; Ernst et al., 2019; Green et al., 2018; Hermann et al., 2018; Tan et al., 2020).
SSCs represent an undifferentiated population of SPGs that retain regenerative capacity and divide to either self-renew or generate progenitors that initiate spermatogenic differentiation, giving rise to differentiating SPGs (diff-SPGs). Diff-SPGs form chains of daughter cells that become primary and secondary spermatocytes around PND10 to 12. Spermatocytes then undergo meiosis and give rise to haploid spermatids that develop into spermatozoa. Spermatozoa are then released into the lumen of seminiferous tubules and continue to mature in the epididymis until becoming capable of fertilization by PND42-48 in mice (De Rooij, 2017; Kubota and Brinster, 2018; Oatley and Griswold, 2017).
Recent work showed that SPGs in early postnatal life have distinct transcriptional signatures (Green et al., 2018; Hammoud et al., 2014; Hermann et al., 2018; Law et al., 2019). During the first week of postnatal development, SPGs display unique features necessary for the rapid establishment and expansion of the cell population along the basement membrane. This includes high expression of genes involved in cell cycle regulation, stem cell proliferation, transcription and RNA processing (Grive et al., 2019). In comparison, genes expressed in adult SPGs are involved in the maintenance of a balance between proliferation and differentiation and help constitute a steady cell population that ensures sperm formation across life. Previous transcriptome analyses have revealed that adult SPGs prioritize pathways related to paracrine signaling and niche communication, as well as mitochondrial functions and oxidative phosphorylation (Grive et al., 2019; Hermann et al., 2018). In addition, changes to chromatin accessibility, histone tail modifications and DNA methylation have also been reported in SPGs during postnatal development (Maezawa et al., 2017) (Hammoud et al., 2014; Hammoud et al., 2015). Today however, the relationship between transcriptome dynamics and chromatin landscape in SPGs during the transition from early postnatal to adult stage has not been fully characterized.
We characterized the transcriptome and chromatin accessibility in SPGs during the transition from early postnatal to adult stages in mice. The results reveal extensive changes in transcription and chromatin accessibility in particular, increased accessibility at enhancer regions in adult compared to postnatal SPGs. Regions with changes in chromatin accessibility are enriched in binding motifs of several TFs that are differentially expressed between postnatal and adult stages, suggesting a possible role in changes in chromatin accessibility. Analyses of chromatin accessibility at transposable elements (TEs) identified previously uncharacterized changes at long terminal repeats (LTR) and LINE L1 subtypes between developing and adult SPGs. Together, these findings suggest a functional link between transcriptional dynamics and chromatin accessibility in SPGs during development and underscore the plasticity of genome organization in male germ cells.
Results
FACS enriches SPGs collected from postnatal and adult mouse testis
We collected testes from 8- and 15-day old pups (PND8 and 15) and from adult males and prepared cell suspensions from each animal by enzymatic digestion. The preparations were enriched for SPGs by fluorescence-activated cell sorting (FACS) using specific surface markers (Kubota et al., 2004) (Fig. 1A, B). The purity of sorted cells was evaluated by immunocytochemistry using PLZF (ZBTB16), a well-established marker for SPGs (Costoya et al., 2004). FACS enriched cell populations from 3-6% PLZF+ (before FACS) to 85-95% PLZF+ (Fig. 1B), suggesting high SPGs enrichment.
To characterize the molecular identity of enriched SPGs populations, we profiled their transcriptome at PND8, PND15 and adulthood by total RNA sequencing (RNA-seq) (n=6 for each group) and examined the expression of known SSCs and somatic markers (Leydig and Sertoli cells) (Fig. 1C and Fig. S1A, B). Classical SPGs markers such as c-kit (Schrans-Stassen et al., 1999), Id4 (Helsel et al., 2017; Sun et al., 2015), Lin28a (Chakraborty et al., 2014; Wang et al., 2020), Zbtb16 (Costoya et al., 2004; Song et al., 2020) and Gfra1 (He et al., 2007) were robustly expressed in all SPGs samples at each developmental stage (Fig. 1C and Fig. S1A, B), indicating high purity of sorted cells. In contrast, low to negligible expression of somatic cells markers such as Vim (Bernardino et al., 2018) and Tspan17 (Gewiss et al., 2021) for Sertoli cells, and Fabp3 and Hsd3b1 for Leydig cells (Sararols et al., 2021) (Fig. 1C) was detected. To further validate the samples purity, we manually curated a list of spermatogonial and somatic cell markers derived from recent single-cell RNA-seq datasets (Cao et al., 2021; Sararols et al., 2021) and determined their expression level in our datasets. Again, we detected robust and consistent expression of all reported spermatogonial markers and low expression of somatic cells markers in all samples at each developmental stage (Fig. S1A).
To evaluate the cellular composition of samples, we applied computational deconvolution to our bulk RNA-seq datasets, employing publicly available single-cell RNA-seq datasets (Cobos et al., 2023). Trained on high-quality RNA-seq datasets from pure or single-cell populations, deconvolution algorithms create expression matrices reflecting the cellular diversity in reference datasets. These cell type-specific expression matrices are subsequently used to determine the cellular composition of bulk RNA-seq samples. First, we assessed the cellular composition of all our RNA-seq libraries, using datasets generated by Hermann et al. (2018), which characterized the single-cell transcriptomes of various populations of SPGs and testicular somatic cells in early postnatal (PND6) and adult stages. This enabled us to analyze the composition of isolated SPGs but also to assess potential somatic cell contamination using a unified dataset source. Upon re-analyzing single-cell RNA-seq datasets, we identified distinct cell-type clusters, marked by specific cellular markers as reported in the original and subsequent studies (Fig. S2A,B). Then, CIBERSORTx generated gene-expression signature matrices and estimated the cell-type proportions within our 18 bulk RNA-seq libraries (Newman et al., 2019). Evaluation of our postnatal libraries (PND8 and 15) against a PND6 signature matrix revealed a predominant derivation from SPGs, with average estimated proportions of SSCs being 0.99 and 0.85 for PND8 and PND15 samples, respectively (Fig. S2F). Notably, the analysis of PND15 libraries also suggested the presence of additional SPGs types, including progenitors and differentiating SPGs (Fig. S2F), albeit at lower frequency. Similarly, evaluation of our adult RNA-seq libraries using an adult signature matrix, showed an average SSCs proportion of 0.82, indicating a primary derivation from SSCs (Fig. S2G). Consistent with the findings from PND15 libraries, our deconvolution analysis also suggests the presence of additional SPGs types, including progenitors and differentiating SPGs in adult samples (Fig. S2G). However, unlike in postnatal libraries, the deconvolution analysis of adult libraries indicated the presence of other cell types (labelled “Other”), not corresponding to the major somatic cell types identified by Hermann et al. (2018). The estimated average proportion of these cells was less than 0.05 in two adult libraries and 0.10 in the others. This variance in cellular composition underlines the effectiveness of the deconvolution method to dissect the complex cellular composition of bulk RNA-seq samples. To further strengthen our observations, we re-analyzed two additional testicular single-cell RNA-seq datasets derived from an early postnatal (PND7) (Tan et al, 2020) and adult (Green et al., 2018) stage (Fig. S2C-E). Evaluation of our libraries confirmed a derivation from germ cells, in particular SSCs (Fig. S2G, F). Therefore, the results of the deconvolution analyses and our immunofluorescence data showing 85-95% PLZF+ cells in our cellular preparations confirm the efficiency of our FACS-based enrichment method and underscore that our bulk RNA-seq libraries are mainly composed of SPGs. The deconvolution analyses also suggest a predominant cellular composition of SSCs and to a lesser degree of differentiating SPGs, while also detecting a small proportion of somatic cells (<0.10) in our adult RNA-seq libraries, which is expected given the use of an immunolabeling-based enrichment method.
Dynamic transcriptomic states characterize postnatal and adult SPGs
We examined the transcriptome dynamics of SPGs from postnatal to adult stages by identifying differentially expressed genes (DEGs) in the RNA-seq datasets. We used a stringent cut-off (adjusted-P≤0.05, abs Log2 fold change (FC)≥1) on 17,000 genes expressed during at least one developmental stage (See Methods). A total of 663 DEGs were identified between PND8 and PND15 (146 down-regulated and 517 up-regulated) and 2483 DEGs between PND15 and adult stage (914 down-regulated and 1579 up-regulated) (Fig. 1D-E and Table S1). Consistent with previous reports (Hammoud et al., 2015), we observed a dynamic regulation of germ cell factors, transcription factors (TF) involved in core pluripotency pathways and signaling molecules important for self-renewal (Fig. S1B). For instance, Pou5f1 (Oct4), a TF necessary for pluripotency (Nichols et al., 1998), is significantly down-regulated in adult SPGs while the TFs Klf4 and Sox2, also needed for pluripotency (An et al., 2019), are expressed similarly in postnatal and adult stages although at different level (i.e Klf4 is expressed more than Sox2) (Fig. S1B).
We conducted Gene Ontology (GO) enrichment analyses to identify the biological processes which DEGs are involved in. GO analyses showed that DEGs between PND8 and PN15 are involved in cell differentiation, cilium movement, sperm motility and transcription and include for instance, TFs such as Junb, Hoxb6, Cebpa and Pparg (Fig. 1F and Table S2). Further, genes coding for histone proteins such as histone H2B Hist2h3b and epigenetic modifiers like the methyltransferase Pygo1 and histone acetyltransferase Ncoa1 are also differentially expressed between PND8 and 15. Interestingly, genes down- or up-regulated between postnatal stages are involved in different processes. While down-regulated genes are implicated in cell differentiation, cell migration and regulation of proliferation for instance, Cdkn1a that is down-regulated at PND15 regulates genetic diversity during spermatogenesis (Kanatsu-Shinohara et al., 2022), up-regulated genes are rather involved in germ cell development, cell signalling, insulin secretion and transcription (Fig. S1C and Table S2). Further, DEGs between PND15 and adulthood play roles in reproduction, spermatogenesis, cell differentiation, DNA replication, glycolysis and extracellular matrix (ECM)-receptor interaction pathways (Fig. 1G, Fig. S1D and Table S2). For example, the DNA methyltransferase Dnmt1, necessary for epi/genome regulation (Edwards et al., 2017) and Col4a2, a subunit of type IV collagen and component of the basal membrane (Reissig et al., 2019), are specifically down-regulated in adult SPGs (Fig. S1C). Interestingly, expression of collagen has been associated to a high proliferative potential and the ability to form germ cell colonies in SPGs (He et al., 2005), suggesting that regulation of collagen genes in adult SPGs may decrease germ-stem cell potential. Genes up-regulated in adult SPGs are involved in cilium movement, germ cell development and glycolysis (Table S2).
We then examined the differences and similarities of transcriptional profiles across the three developmental stages. To identify unique or common changes in gene expression during the transition from early postnatal to adult stages, we compared the list of DEGs at PND8 versus PND15 and at PND15 versus adulthood. Remarkably, the vast majority of DEGs show significant stage-specific changes in transcription (Fig. S2A). For instance, 75% (495/663) of DEGs between PND8 and PND15 are not statistically changed in the transition to adult SPGs (Fig. S2B and Table S1 and S2). Similarly, 93% (2325/2493) of DEGs in adult SPGs are not statistically changed when compared to PND8 versus PND15 SPGs (Fig. 1H and Table S1 and S2). GO enrichment analyses showed that adult-specific down-regulated genes are involved in cell differentiation, cell migration, regulation of signal transduction and Wnt signalling, regulation of DNA replication and transcription as well as collagen biosynthesis and laminin interactions while adult-specific up-regulated genes are involved in sexual reproduction, male gamete generation, germ cell development, cilium movement and glycolysis (Fig. 1H and Table S1 and S2).
Interestingly, a small fraction of all DEGs (4.9%) are detected as significantly changed in the same direction and magnitude (adjusted-P≤0.05, absLog2FC≥1) across the three developmental stages (Fig. S2C and Table S1 and S2). For instance, 121 genes are consistently up-regulated from PND8 to adult stage, while just 26 are consistently down-regulated from PND8 to adult stage (Fig. S2C). Such DEGs are involved in different biological pathways like chromatin organization (Table S2). In particular, the histone gene clusters displayed significant down-regulation across postnatal development and in adulthood (Table S2).
Landscape of chromatin accessibility in SPGs during postnatal development
Cellular differentiation is generally accompanied by changes in chromatin accessibility at regulatory elements (Atlasi et al., 2017). We examined how chromatin accessibility in SPGs is modified during development using omni-ATAC-seq (Corces et al., 2017). We focused on SPGs at PND15 and adulthood, stages that showed the highest changes in gene expression (Fig. 1). Accessible regions were identified by peak-calling on merged nucleosome-free fragments (NFF) as proxy for genomic regions with potential regulatory activity. We identified 158,977 peaks with clear ATAC-seq signal compared with surrounding genomic regions (Fig. S3A and Table S3). Most accessible regions were located at distal intergenic regions (38%), introns (28%) and putative promoter regions (23%) (Fig. S3B, C) and encompassed sequences enriched for histone PTMs associated with active chromatin such as H3K4 mono-, di- and tri-methylation (Fig. S2D). Notably, signal enrichment was higher for H3K4 methylation than for other histone PTMs (Fig. S3D). These results are consistent with previous observations that ATAC-seq peaks are identified at regulatory elements such as enhancers and promoters and are preferentially located at genomic regions with nucleosomes carrying H3K4me (Henikoff et al., 2020).
We then examined differences in chromatin accessibility between PND15 and adult SPGs by differential accessibility analysis. 3212 differentially accessible regions (DARs) were identified (adjusted-P≤0.05, absLog2FC≥1) with a total of 760 regions with decreased chromatin accessibility (DARs-down) and 2452 regions with increased accessibility (DARs-up) in adult SPGs (Figure 2A, B and Table S3). DARs were predominantly localized in distal intergenic regions (54% DARs-down, 37% DARs-up) and introns (33% DARs-down, 36% DARs-up) with a minority located in putative promoter regions (8% DARs-down, 12% DARs-up), consistent with the genomic distribution of all detected ATAC-seq peaks (Fig. 2C). GO analyses of the closest genes assigned to each DAR showed that these genes are involved in male gonad development, cell adhesion, sex differentiation and regulation of cell communication among others (Fig. 2D, E).
Epigenomic annotation of DARs using high-quality publicly available ChIP-seq datasets for postnatal SPGs (Cheng et al., 2020) revealed that DARs are predominantly located in regions enriched for histone PTMs associated with enhancers and promoters. 51% of regions with increased accessibility are located at genomic loci that carry histone PTMs since early postnatal stages, and about half (48%) overlap with regions significantly enriched in H3K4me1, a histone PTM associated to enhancers. In contrast, 80% of regions with decreased accessibility are located in regions with H3K4me1 and 1/3 (33%) also overlap with the repressive histone PTM, H3K27me3 (Fig. 2F). Interestingly, while 16% of regions with increased accessibility are located at potential active regulatory elements that carry H3K27ac, none of the regions with decreased accessibility overlap with H3K27ac (Fig. 2F). Therefore, our data indicate that the transition from postnatal to adult SPGs is accompanied by discrete, yet robust, changes in chromatin accessibility at potential regulatory elements, suggesting their involvement in the control of gene transcription.
DARs are associated with binding sites for distinct families of TFs
We examined the relationship between differences in chromatin accessibility and TF binding by TF motif analysis using DARs as input. We observed that regions with decreased accessibility in adult SPGs are enriched for binding motifs of members of specific TF families such as KLF (KLF2, KLF5, KLF11, KLF12, KLF15, KLF16), SP (SP1-5, SP9), FOXO (FOXO1, FOXO3), ETS (ETV1-6) among others (Fig. 3A and Table S4). Expression analyses showed that many of these TFs are differentially expressed in postnatal or adult SPGs (adjusted-P<0.05, 10 down-regulated and 11 up-regulated) (Fig. 3B, C). For example, KLF11,12 and 15, TFs that regulate stem cell maintenance and development (Bialkowska et al., 2017), are upregulated in adult SPGs compared to PND15 (Fig. 3C). These TFs and their motif match top enriched motifs detected in DARs in adult SPGs (Fig. 3A). The SP family of TFs also show differential expression and enrichment of binding motifs in regions with decreased accessibility. SP5, which has its lowest expression in adult SPGs (Fig. 3B, C), promotes self-renewal of mESCs by directly regulating Nanog (Tang et al., 2017) .
Members of the FOXO family of TFs also have an enrichment of motifs in regions with decreased accessibility. While FOXO3 has increased expression from PND15 onwards, FOXO1 is decreased in adult compared to postnatal SPGs (Fig. 3B, C). Interestingly, FOXO1 and FOXO3 regulate SSC function and maintenance in mouse (Goertz et al., 2011). Finally, ETS-type of TFs also show differential expression during SPGs development and their motif is enriched in regions with decreased accessibility. ELK4 is up-regulated in adult SPGs and can act as a transcriptional repressor via recruitment of the HDAC Sirt7 and deacetylation of H3K18ac (Fig. 3B-D) (Barber et al., 2012). EKL4 can also act as transcriptional activator of immediate early genes such as c-Fos (Dalton and Treisman, 1992). Interestingly, c-Fos itself codes for a TF important for the regulation of proliferation (He et al., 2008) and shows a trend towards up-regulation in adult SPGs (Table S1). In contrast, GABPA is downregulated in adult SPGs compared to PND15 and has been involved in the regulation of proliferation in mESCs (Ueda et al., 2017).
Regions with decreased chromatin accessibility also show an enrichment for the binding motif of the TF DMRT1. Dmrt1 is progressively repressed during SPGs postnatal development and is the lowest in adult SPGs (Fig 3B, C and E). DMRT1 can act either as a repressor or activator and controls testis development and male germ cell proliferation (Zhang et al., 2016). DMRT1 can inhibit meiosis and promote mitosis in SPGs by repressing Stra8 (Matson et al., 2010). Consistently, we observed transcriptional repression of Stra8 in adult SPGs (Table S1).
Next, we identified motifs overrepresented in regions with increased chromatin accessibility in adult SPGs. The identified motifs correspond to binding motifs of members of specific TF families such as POU (POU2F2, POU1F1, POU5F1), RFX (RXF1-7), DMRT (DMRT1, DMRT3, DMRTA1-2, DMRTC2), SOX (SOX2, SOX3, SOX5, SOX13), NFY (NFYA, NFYB, NFYC) and AP-1 (JUN, JUNB, JUND, FOS, ATF3, BATF) (Fig. 3F and Table S4). A subset of them is also differentially expressed (adjusted-P<0.05, 13 down-regulated and 5 up-regulated). The most overrepresented motif is similar to the binding motif of the POU family of TFs, which are critical regulators of stem cells (Nichols et al., 1998). Pou1f1 and Pou5f1 are transcriptionally repressed in adult SPGs while Pou2f2 is maximally expressed in PND15 SPGs and down-regulated in adult cells (Fig. 3G, H). We also identified motifs for members of the RFX family of TFs, which are master regulators of ciliogenesis (Choksi et al., 2014) implicated in regulation of neural stem cells (Kawase et al., 2014). RFX2 is robustly expressed in adult SPGs (Fig. 3G, H and I) and has been reported to induce the expression of ciliary genes in association with the TF FOXJ, which has a trend towards up-regulation in adult SPGs (Fig. S4) (Quigley and Kintner, 2017). Interestingly, ciliary genes are among the top genes specifically up-regulated in adult SPGs (Fig. 1H), suggesting a regulatory relationship between RFX2 and ciliary genes expression in adult SPG cells.
Other binding motifs enriched at regions with increased accessibility correspond to members of the NF-Y complex, NF-YA, NF-YB and NF-YC. In mESCs, NF-Y TFs facilitate a permissive chromatin conformation and play an important role in the expression of core ESC pluripotency genes (Oldfield et al., 2014). NF-YA/B motif enrichment has also been found in regions of open chromatin in human SPGs (Guo et al., 2017). Interestingly, the expression of NF-YA and NF-YC is progressively down-regulated starting at PND15 (Fig. 3G, H). We also detected an enrichment for the binding motifs of members of the AP-1 family of TFs, which are involved in many processes, from regulation of cell proliferation to differentiation and acute responses to environmental clues (Bejjani et al., 2019; Vierbuchen et al., 2017). Interestingly, except for Fosl2, other TFs do not show any significant change in transcription in postnatal or adult SPGs and are either constitutively expressed or repressed. For instance, c-FOS and JUN are robustly expressed in postnatal and adult SPGs (Fig. 3J) consistent with their role in promoting the proliferative potential of SSCs (He et al., 2008; Wang et al., 2018). Together, these data reveal that a shift in TFs repertoire accompanies chromatin reorganization occurring in SPGs during postnatal to adult development.
A subset of DARs is associated with differential gene expression
Thus far, our results show that in SPGs, the transition from postnatal to adult stage is accompanied by changes in gene expression and chromatin accessibility. In some instances, differences in gene expression can be correlated with differences in chromatin accessibility of regulatory elements. Since DARs in postnatal and adult SPGs overlap with putative enhancer and promoter elements, we examined if these changes in chromatin accessibility correlate with changes in gene expression. First, we tested if promoter regions of DEGs (+/-2.5kb from TSS) have differential chromatin accessibility by identifying all possible ATAC-seq peaks located in promoters and calculating an average accessibility signal. No significant difference was detected for down-regulated genes in adult SPGs, despite a trend for decreased accessibility around the TSS (Fig. 4A). However, down-regulated genes had a significant increase in chromatin accessibility in their body (Fig. 4B). Up-regulated genes had a significant increase in chromatin accessibility both at the promoter region and gene body (Fig. 4A, B). Then, we conducted motif enrichment analyses and identified families of TFs with overrepresented binding motifs at the promoter of DEGs. Binding motifs for members of KLF and SP family of TFs were overrepresented in the promoter of both down- and up-regulated DEGs, while motifs for members of TF families SOX and NFY were detected only in the promoter of down-regulated genes. Promoters of up-regulated genes were specifically enriched in motifs for RFX and AP-1 (Fig. 4C). These observations suggest that changes in chromatin accessibility, in particular at the promoter of up-regulated DEGs may be associated with differential binding of RFX and AP-1 TFs.
Next, we examined the overlap between DARs and the promoter of DEGs. Unexpectedly, only 3.2% of all promoters of DEGs overlap with at least one DAR (Fig. 4D). The majority of overlapping DARs (71%) is associated with promoters of up-regulated genes in adult SPGs (Fig. 4D). For example, Gpx4, a peroxidase involved in the control of stemness (Hu et al., 2021), is transcriptionally active in adult SPGs and has an open chromatin at its promoter region (Fig. 4E). Since the majority of DARs are located at distal intergenic regions and have histone PTMs associated with active regulatory elements, we extended our search of their target genes by assigning a large region (+/-100kb) around each DEG. We observed that a third of DEGs were associated with at least one DAR (Fig. 4F), with the majority of DARs (69%) overlapping regulatory elements of up-regulated genes. For instance, Braf is transcriptionally up-regulated in adult SPGs, which correlates with increased chromatin accessibility at distal regulatory elements (Fig. 4G). Our results overall show that only a fraction of DEGs is associated with DARs, suggesting that changes in gene expression are not always mirrored at the level of chromatin accessibility, consistent with recent observations (Kiani et al., 2022).
Chromatin accessibility at transposable elements (TEs) undergoes remodeling during the transition from early postnatal to adulthood in SPGs
TEs are tightly regulated in the germline by coordinated epigenetic mechanisms involving DNA methylation, chromatin silencing and PIWI proteins-piRNA pathway (Deniz et al., 2019). SPGs were recently shown to have a unique landscape of chromatin accessibility at long-terminal repeats (LTRs) retrotransposons, compared with other stages of germ cells in testis (Sakashita et al., 2020). We examined chromatin accessibility at TEs in PND15 and adult SPGs by quantifying ATAC-seq reads overlapping TEs defined by UCSC RepeatMasker then analyzing differential accessibility at subtypes level (see Methods section). These analyses showed that SPGs transitioning from PND15 to adult stages have significantly different chromatin accessibility at 135 TEs subtypes (adjusted-P≤0.05, absLog2FC≥1) (Fig. 5A, B and Table S5). Most TEs subtypes have decreased chromatin accessibility (93/135, 68,9%) (Fig. 5A) and 42 have increased accessibility (Fig. 5B) in adult SPGs. Differentially accessible TEs loci are predominantly located in intergenic (68%) and intronic (25%) regions and 6% re close to a gene (+/-1kb from TSS) (Fig. 5C).
LTR retrotransposons were the most abundant TEs with altered chromatin accessibility, specifically ERVK and ERV1 subtypes (Fig. 5A, B). Exemplary ERVK subtypes with less accessible chromatin included RLTR17, RLTR9A3, RLTR12B and RMER17B (Table S5). Enrichment of RLTR17 and RLTR9 repeats was previously reported in mESCs, specifically at TFs important for pluripotency maintenance such as Oct4 and Nanog (Fort et al., 2014). Interestingly, we identified the promoter region of the long non-coding RNA (lncRNA) Lncenc1, an important regulator of pluripotency in mESCs (Fort et al., 2014; Sun et al., 2018), to harbor several LTR loci with decreased accessibility in adult SPGs. One of these loci, RLTR17, overlaps with the TSS of Lncenc1, and its decreased accessibility correlates with markedly lower Lncenc1 expression in adult SPGs (Fig. 5D). Lncenc1 (also known as Platr18) is part of the pluripotency-associated transcript (Platr) family of lncRNAs identified as potential regulators of pluripotency-associated genes Oct4, Nanog and Zfp42 in mESCs (Bergmann et al., 2015). We identified several other Platr genes such as Platr27 and Platr14, whose TSS overlaps with LTRs with reduced accessibility, RLTR17 and RLTR16B_MM, respectively (Fig. 5D and Table S5). These two pluripotency-associated transcripts tended to be downregulated in adult SPGs but were unchanged at PND8 and PND15 (Fig. 5D and Table S1). The other LTR subtypes with decreased accessibility in adult SPGs belong to ERV1, ERVL and MaLR families (Fig. 5A). A few non-LTR TEs also had decreased chromatin accessibility, particularly 7 DNA element subtypes, 2 satellite subtypes and 1 LINE subtype, respectively (Table S5).
TE subtypes with increased accessibility mostly included members of ERVK, ERVL and ERV1 families (24/42, 57.1%) (Table S5). Interestingly, many LINE L1 subtypes had increased chromatin accessibility in adult SPGs (Table S5). When parsing the data for more accessible loci within L1 subtypes, we found several L1 loci situated less than +/-5kb from the TSS of olfactory (Olfr) genes, particularly Olfr gene clusters on chromosome 2, 7 and 11 (Table S5).
TEs are known to provide tissue-specific substrates for TF binding (Sundaram and Wysocka, 2020). We examined the regulatory potential of differentially accessible LTR subtypes by determining the enrichment of TF motifs in these regions. We grouped LTR subtypes by family (EVK, ERV1, ERVL and ERVL-MaLR families) and observed that among families with less accessible chromatin, ERVKs had the highest number of enriched TF motifs in adult SPGs (Table S5). Top hits included TFs with known regulatory functions in cell proliferation and differentiation such as FOXL1 and FOXQ1, stem cell maintenance factors ELF1, EBF1 and THAP11 and TFs important for spermatogenesis PBX3, ZNF143 and NFYA/B (Table S5). ERVLs displayed motif enrichment for few TFs, among which ETV2, recently reported to be the SSC factor ZBTB7A and the testis-specific CTCF paralog CTCFL (Table S5) (Green et al., 2018).
More accessible LINE L1 subtypes were highly enriched in TF motifs, particularly in multiple members of ETS, E2F and FOX families (Table S5). The most significant motifs belonged to SSCs maintenance and stem cell potential regulators FOXO1 and ZEB1, as well as TFs recently associated with active enhancers of the stem cell-enriched population of SPGs such as ZBTB17 and KLF5 (Table S5) (Cheng et al., 2020). More accessible ERV1s were also enriched in several TF binding sites, including spermatogenesis-related TFs (PBX3, PRDM1, NFYA/B), hypoxia inducible HIF1A and cytokine regulators STAT5A/B, suggesting different metabolic demand between postnatal and adult stage (Table S5).
Discussion
This study examines the transcriptomic landscape and chromatin accessibility of mouse SPGs from early postnatal development to adulthood. Using FACS-enriched SPGs populations at PND8, PND15 and adulthood, we show that the transcriptome of SPGs is dynamically regulated during development and that many factors including pluripotency-related TFs and signaling molecules are differentially expressed. This correlates with changes in chromatin accessibility although not all DEGs have altered accessibility. TEs such as LTR retrotransposons have modified chromatin accessibility, which in some cases affects pluripotency-associated lncRNAs. TF motif enrichment in differentially accessible TEs suggests various regulatory roles for SPGs. Our findings complement existing datasets on spermatogonial cells by providing parallel transcriptomic and chromatin accessibility maps at high resolution from the same cell populations at early postnatal, late postnatal and adult stages collected from single individuals (for adults).
Different transcriptional states characterize the postnatal development of SPGs
SPGs are known to undergo genome-wide transcriptional changes during development (Grive et al., 2019; Hammoud et al., 2015; Hermann et al., 2018). The present data complement previous findings by showing that postnatal maturation of SPGs is accompanied by an overall decrease in the expression of pluripotency markers and differential expression of signaling molecules and regulators of cell cycle and spermatogenesis. The high quality and depth of our RNA-seq datasets allowed us to uncover specific transitional states of SPGs postnatal maturation that were not known before. For instance, we observed that while the transition from early to mid-postnatal stage (PND8 to PND15) is accompanied by changes in gene expression, the transition from mid-postnatal to adult stage is more extensive, both in number of regulated genes and magnitude of changes. We also observed a clear trend towards up-regulation of gene expression as postnatal maturation proceeds, that may reflect gradual transcriptional increase for some genes and possibly sharp changes at certain developmental stages for others.
Consistently, many transcriptional regulators and chromatin modifiers are differentially regulated during SPGs development. Different families of TFs including factors with a specific role in SPGs like RFX and DMRT, and factors with more widespread functions across cell types such as members of the AP-1 family are modulated, suggesting the contribution of multiple classes of TFs to SPGs development. Notably, several transcriptional repressors and epigenetic silencers are differentially down-regulated which may explain the marked trend towards transcriptional up-regulation in our datasets (Fig. 1D). Interestingly, we also observed a global down-regulation of histone genes starting at PND15 (Table S1), which may modify the composition and structure of chromatin and be associated with chromatin openness (Prado et al., 2017). Overall, our transcriptional data highlight the yet unappreciated dynamic regulation of TFs and epigenetic/chromatin modifiers in SPGs during postnatal development.
Chromatin accessibility in SPGs is modified at enhancers during postnatal development
Our results suggest that the transition of SPGs from early postnatal to adult stage is accompanied by changes in chromatin accessibility. In particular, accessibility is increased in regions enriched in H3K4me1, a histone PTM typically associated with primed enhancers, but also to a lesser extent, in regions enriched in H3K27ac, a mark associated with active regulatory elements (Gasperini et al., 2020; Shlyueva et al., 2014). This suggests that the transition of SPGs from postnatal to adult stage involves changes in the regulatory genome, in particular at primed enhancers. Enhancer priming has recently been proposed as an important mechanism for SPGs differentiation (McCarrey, 2023). Differences in chromatin accessibility may be partially explained by differential binding of TFs to regulatory elements (Klemm et al., 2019). This may involve at least three non-mutually exclusive events: 1) a change in the abundance of TFs, 2) a change in the epigenetic status of TF binding sites due to different amount and/or activity of epigenetic modifiers, and 3) a change in nucleosome positioning around TF binding sites due to different amount and/or activity of chromatin remodelers. For instance, CpG methylation and nucleosome positioning can influence TF binding at regulatory elements and affect chromatin accessibility (Barisic et al., 2019; Kaluscha et al., 2022). The fact that Dnmt1 is downregulated in adult SPGs concomitantly with differential expression of TFs and chromatin modifiers may explain changes in chromatin accessibility at potential enhancers.
Modest correspondence between transcription and chromatin accessibility in SPGs
Despite robust changes in gene expression in adult SPGs, only a subset of DARs could be assigned to DEGs. This suggests that transcriptional regulation and chromatin accessibility at regulatory elements are not always linked and can be dissociated, as previously reported (Chereji et al., 2019; Kiani et al., 2022). This implies that chromatin accessibility may be modified without any effects on transcription and conversely, transcription can be activated or repressed without requiring any change in chromatin accessibility. In turn, it is possible that a repressor can be exchanged with an activator at a regulatory element without detectable consequences. It is also possible that changes in chromatin accessibility have no immediate effects on transcription but reflect an intermediate state that primes a gene or genomic locus for later activation or repression such as observed with developmentally- or experience-dependent primed enhancers (Marco et al., 2020; Rada-Iglesias et al., 2011). Consistently, the majority of DARs with increased accessibility overlap with regions enriched in H3K4me1, a histone modification characteristic of primed enhancers thought to control future transcriptional responses in different cell types (Rada-Iglesias et al., 2011). These regions may be important for responses to external cues in adult SPGs. The apparent lack of correspondence between transcription and accessibility may also be in part technical and due to the stringency of filters we used to assign genomic loci to DARs. We cannot exclude the possibility that modest changes in accessibility (<log2FC1) correspond to changes in TF occupancy. Chromatin accessibility has been suggested to not necessarily be the primary determinant of chromatin-mediated gene regulation (Chereji et al., 2019). Such possibility could be addressed by examining genome-wide maps of TFs, cofactors and RNA-Pol II occupancy that would provide more accurate information about the regulatory genome and its relationship with observed gene expression programs.
Chromatin accessibility at TEs during SPGs maturation
Our results reveal differences in chromatin accessibility and TFs motif landscape at different TEs subtypes between PND15 and adult SPGs. ERVK and ERV1 subtypes were the most abundant categories of TEs that become less accessible in adult SPGs, whilst LINE L1 subtypes gained accessibility. Although the majority of these TEs reside in intergenic and intronic regions, we could detect specific loci belonging to differentially accessible ERVK and LINE L1 subtypes localized nearby TSS of distinct gene families. Notably, ERVKs are known to play an important role in the regulation of mRNA and lncRNAs transcription during spermatogenesis (Davis et al., 2017; Sakashita et al., 2020). The landscape of chromatin accessibility at LTRs loci in SPGs is known to be unique compared to other mitotic germ cells and meiotic gametes in testis (Sakashita et al., 2020).
RLTR17, one of the LTR subtypes with decreased chromatin accessibility in adult SPGs overlaps with the TSS of several downregulated lncRNAs from the Platr family. Platr genes, including Lncenc1 and Platr14 identified in our study, are LTR-associated lncRNAs important for gene expression programs in ESCs (Bergmann et al., 2015). Interestingly, RLTR17 has also previously been linked to pluripotency maintenance. In mESCs, it is highly expressed and enriched in open chromatin regions and has been shown to provide binding sites for the pluripotency factors Oct4 and Nanog (Fort et al., 2014). Therefore, we suggest that RLTR17 chromatin organization may play a role in regulating pluripotency programs between early postnatal and adult SPGs.
In contrast to decreased accessibility at LTRs loci, LINE L1 subtype had increased accessibility in adult SPGs. Some of these L1 loci were located close to olfactory receptor genes with upregulated mRNA expression. Recent findings in mouse and human ESCs suggest a non-random genomic localization of L1 elements, specifically at genes encoding proteins with specialized functions (Lu et al., 2020). Among them, the Olfr gene family was the most enriched in L1 elements (Lu et al., 2020). Although their role in SPGs is currently not established, Olfr proteins have been implicated in swimming behaviour of sperm cells (Fukuda and Touhara, 2005). Given their dynamic regulation in SPGs, we speculate that Olfr genes play additional roles in spermatogenesis, other than in sperm physiology. Together with the high number of enriched TF motifs identified at differentially accessible ERVKs and LINE L1 elements, these data highlight a regulatory role for chromatin organization of TEs in SPGs during the transition from development to adult stage (Sundaram et al., 2014; Sundaram and Wysocka, 2020).
Limitations
One limitation of our study is the partial purity of SPGs populations obtained by FACS, due to the use of a purification method that does not fully remove other testicular cells from the samples. This therefore does not exclude the possible influence of contaminating cells on the data and their interpretation. Such influence may explain differences between our datasets and datasets in the literature. Nevertheless, by comparing chromatin landscape in developing and adult SPGs, the study reveals an age-dependent dynamic reorganization of chromatin accessibility in these cells. When integrated with our transcriptomic datasets and published histone PTMs profiles, the data provide novel insight into transcriptome-chromatin dynamics of mouse SPGs from early postnatal life to adulthood with great depth and high resolution. This represents an important resource for future studies on mouse germ cells.
Methods
Mouse husbandry
Male C57Bl/6J mice were purchased from Janvier Laboratories (France) and bred in-house to C57Bl/6J primiparous females to generate males for the experiments. All animals were kept on a reversed 12-h light/12-h dark cycle in a temperature- and humidity-controlled facility with food (M/R Haltung Extrudat, Provimi Kliba SA, Switzerland) and water provided ad libitum. Cages were changed once weekly. Animals from 2 independent cohorts generated separately were used for the experiments.
Germ cells isolation
Germ cells were isolated at postnatal day (PND) 8 or 15 for RNA-seq and ATAC-seq, and at postnatal week 20 (adult) for ATAC-seq. Testicular single-cell suspensions were prepared as previously described with slight modifications (Kubota et al., 2004a, 2004b). For PND8 and PND15 cells, testes from 2 animals were pooled for each sample while for adults, testes from individual males were used. Pup testes were collected in sterile HBSS on ice. Tunica albuginea was gently removed from each testis, making sure to keep seminiferous tubules as intact as possible. Tubules were enzymatically digested in 0.25% trypsin-EDTA (ThermoFisher Scientific) and 7mg/ml DNase I (Sigma-Aldrich) solution for 5 min at 37oC. The suspension was vigorously pipetted up and down 10 times and incubated again for 3 min at 37oC. The digestion was stopped by adding 10% fetal bovine serum (ThermoFisher Scientific) and cells were passed through a 20μm-pore-size cell strainer (Miltenyi Biotec) and pelleted by centrifugation at 600g for 7 min at 4oC. Cells were resuspended in PBS-S (PBS with 1% PBS, 10 mM HEPES, 1 mM pyruvate, 1mg/ml glucose, 50 units/ml penicillin and 50 μg/ml streptomycin) and used for sorting. Adult testes were collected and the tunica was removed. Seminiferous tubules were digested in 1 mg/mL collagenase type IV (Sigma Aldrich) and then in 0.25 % Trypsin-EDTA (Gibco), both times for 8-12 minutes and in the presence of DNase I solution (Sigma Aldrich). FBS (Cytiva HyClone) was added to a concentration of 10 % and the cell suspension was filtered through a 40μm-pore-size cell strainer (Corning) and washed with DPBS-S (1 % FBS, 10 mM HEPES (Gibco), 1mM sodium pyruvate (Gibco), 1mg/mL Glucose (Gibco) and 1X Penicillin-Streptomycin (Gibco) in DPBS (Gibco). 5 mL of the single cell suspension was then overlayed on 2 mL 30 % Percoll (Sigma) and centrifuged with disabled break at 600g for 8 minutes at 4 °C. The supernatant was removed and cell suspension was washed twice in DPBS-S and used for sorting.
SPGs enrichment by FACS
For pup testis, dissociated cells were stained with BV421-conjugated anti-β2M, biotin-conjugated anti-THY1 (53-2.1) and PE-conjugated anti-αv-integrin (RMV-7) antibodies. THY1 was detected by staining with Alexa Fluor 488-Sav. For adult testes, cells were stained with anti-α6-integrin (CD49f; GoH3), BV421-conjugated anti-β2 microglobulin (β2M; S19.8) and R-phycoerythrin (PE)-conjugated anti-THY1 (CD90.2; 30H-12) antibodies. α6-Integrin was detected by Alexa Fluor 488-SAv after staining with biotin-conjugated rat anti-mouse IgG1/2a (G28-5) antibody. Prior to FACS, 1 μg/ml propidium iodide (Sigma) was added to cell suspensions to discriminate dead cells. All antibody incubations were performed in PBS-S for at least 30 min at 4oC followed by washing in PBS-S. Antibodies were obtained from BD Biosciences (San Jose, United States) unless otherwise stated. Cell sorting was performed at 4oC on a FACS Aria III 5L using an 85μm nozzle at the Cytometry Facility of University of Zurich. For RNA-seq at PND8 and PND15, cells were collected in 1.5 ml Eppendorf tubes in 500 μL PBS-S, immediately pelleted by centrifugation and snap frozen in liquid N2. Cell pellets were stored at -80oC until RNA extraction. For RNA-seq of adults, 1000 spermatogonial cells (MHC I negative, alpha-6-integrin and Thy1 positive) per male were sorted into PBS. For Omni-ATAC at PND15, 25’000 cells were collected in a separate tube, pelleted by centrifugation and immediately processed using a library preparation protocol (Corces et al., 2017). For Omni-ATAC in adults, 5000 cells from each animal were collected in a separate tube and processed using the same protocol.
Immunocytochemistry
The protocol used for assessing SPGs enrichment after sorting was kindly provided by Jon Oatley at Washington State University, Pullman, USA (Yang et al., 2013). Briefly, 30,000-50,000 cells were adhered to poly-L-Lysine coated coverslips (Corning Life Sciences) in 24-well plates for 1 h. Cells were fixed in freshly prepared 4% PFA for 10 min at room temperature then washed in PBS with 0.1% Triton X-100 (PBS-T). Non-specific antibody binding was blocked by incubation with 10% normal goat serum for 1 h at room temperature. Cells were incubated overnight at 4oC with mouse anti-PLZF (0.2 μg/ml, Active Motif) primary antibody. Alexa488 goat anti-mouse IgG (1 µg/mL, ThermoFisher Scientific) was used for secondary labelling at 4oC for 1 h. Coverslips were washed 3 times, mounted onto glass slides with VectaShield mounting medium containing DAPI (Vector Laboratories) and examined by fluorescence microscopy. SPGs enrichment was determined by counting PLZF+ cells in 10 random fields of view from each coverslip and dividing by the total number of cells present in the same fields of view (DAPI-stained nuclei). The number of PLZF+ and PLZF-cells from the 10 different fields of view was averaged.
RNA extraction and RNA-seq library preparation
For RNA-seq at PND8 and PND15, total RNA was extracted from sorted cells using AllPrep RNA/DNA Micro kit (Qiagen). RNA quality was assessed using a Bioanalyzer 2100 (Agilent Technologies). Samples were quantified using Qubit RNA HS Assay (ThermoFisher Scientific). RNA sequencing libraries were prepared using SMARTer Stranded Total RNA-Seq Kit v3 (Takara Bio USA, Inc) following the recommended protocol with minor adjustments. For PND8 and PND15 SPGs libraries, RNA was fragmented for 4 minutes at 94 °C. After reverse transcription, samples were barcoded using SMARTer Unique Dual Index Kit (Takara Bio USA, Inc). Five PCR cycles were run and PCR products were purified using AMPure XP reagent (Beckman Coulter Life Sciences; bead:sample ratio 0.8X). Ribosomal cDNA was depleted following the protocol and final library amplification was performed with 12 PCR cycles. Libraries were purified twice using AMPure XP reagent (bead:sample ratio 1X) and eluted in Tris buffer. Adult SPGs libraries were prepared using the option to start directly from cells (1000 cells as input). Cells were incubated for 6 minutes at 85 °C and processed as indicated for PND8 and PND15 SPGs. Samples were pooled at equal molarity, as determined in a 100-800 bp window on the Bioanalyzer.
Omni-ATAC library preparation and sequencing
Chromatin accessibility was profiled from PND15 and adult SPGs. Libraries were prepared according to Omni-ATAC protocol, starting from 25,000 PND15 and 5000 adult sorted SPGs (Corces et al., 2017). Briefly, sorted cells were lysed in cold lysis buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% NP40, 0.1% Tween-20, and 0.01% digitonin) and nuclei were pelleted and transposed using Nextera Tn5 (Illumina) for 30 min at 37oC in a thermomixer with shaking at 1000 rpm. Transposed fragments were purified using MinElute Reaction Cleanup Kit (Qiagen). Following purification, libraries were generated by PCR amplification using NEBNext High-Fidelity 2X PCR Master Mix (New England Biolabs) and purified using Agencourt AMPure XP magnetic beads (Beckman Coulter) to remove primer dimers (78bp) and fragments of 1000-10,000 bp. Library quality was assessed on an Agilent High Sensitivity DNA chip using the Bioanalyzer 2100 (Agilent Technologies). 6 samples were sequenced from PND15 and 5 from adult SPGs.
RNA-seq analysis
Quality control and alignment: 100bp single end sequencing was performed at the Functional Genomics Center Zurich (FGCZ) using a Novaseq SP flowcell on the Novaseq 6000 platform. Quality assessment of FASTQ files was done using FastQC (Andrews et al., 2012) (version 0.11.8). TrimGalore (Krueger, 2015) (version 0.6.2) was used to trim adapters and low-quality ends from reads with Phred score less than 30 (-q 30) and to discard trimmed reads shorter than 30bp (--length 30). Trimmed reads were pseudo-aligned using Salmon (Patro et al., 2017) (version 0.9.1) with automatic detection of the library type (-l A), correcting for sequence-specific bias (--seqBias) and correcting for fragment GC bias correction (--gcBias) on a transcript index prepared for the Mouse genome (GRCm38) from GENCODE (version M18) (Harrow et al., 2012), with additional piRNA precursors and TEs (concatenated by family) from Repeat Masker as in (Gapp et al., 2018).
Downstream analysis
Data analyses and plotting were conducted with R (R Core Team, 2020) (version 3.6.2) using packages from The Comprehensive R Archive Network (CRAN) (https://cran.r-project.org) and Bioconductor (Huber et al., 2015). Pre-filtering of genes was done using the filterByExpr function from edgeR (Robinson et al., 2009) (version 3.28.1) with a design matrix requiring at least 15 counts (min.counts=15). Normalization factors were obtained using TMM normalization (Robinson and Oshlack, 2010) from edgeR package and differential gene expression analyses were done using limma-voom (Law et al., 2014) pipeline from limma (Ritchie et al., 2015) (version 3.42.2). GO enrichment analyses were performed using g:Profiler (https://biit.cs.ut.ee/gprofiler/gost) querying against Mus musculus database.
Deconvolution analyses
Hermann et al., 2018: Raw data were obtained from PND6 and adult sorted ID4-EGFP+ SPGs. Using the cluster IDs provided in the supplemental datasets, CIBERSORTx analysis (Newman et al., 2019) was performed to estimate the proportion of labeled cell types in each RNA-seq library using the default settings (G.min=300, G.max=500, q=0.1). For deconvolution analyses, we used the aggregated P6 and adult datasets as references.
Tan et al., 2020: Raw data from E18, P2 and P7 were obtained. We applied the reported filtering and normalization parameters, followed by UMAP analysis (Becht et al., 2018) using the top 10 principal components (PCs). For clustering, the Seurat package was employed (Butler et al., 2018, dims=1:10 in FindNeighbors, resolution=0.025 in FindClusters) and resultant cell counts in each cluster corresponded to those reported in Tan et al., 2020. Clusters identified as germ, Sertoli, stroma, Leydig and peritubular myoid (PTM) cells were validated using specific marker genes as per Tan et al., 2020; Green et al., 2018; and Hermann et al., 2018. Cells not classified into these clusters were categorized as ’Other Cells.’ Then, CIBERSORTx was utilized to estimate the proportion of these labeled cell types in each RNA-seq library using the default settings (G.min=300, G.max=500, q=0.1). For the deconvolution analyses, two replicate libraries from PND7 were used as references, and their averaged results were reported. Additionally, we performed a re-analysis focusing on re-clustered germ cells. Within this subset, SSCs and differentiating SPGs were identified, using marker genes referenced in Tan et al., 2020; Green et al., 2018; and Hermann et al., 2018.
Green et al., 2018: Raw data from adult were obtained. We applied the reported filtering and normalization parameters, followed by UMAP analysis (Becht et al., 2018) using the top 10 principal components (PCs). For clustering, the Seurat package (Butler et al., 2018, dims=1:20 in FindNeighbors, resolution=0.011 in FindClusters). Clusters identified as germ, Sertoli, stroma, Leydig, and peritubular myoid (PTM) cells were validated using specific marker genes as per Tan et al., 2020; Green et al., 2018; and Hermann et al., 2018. Cells not classified into these clusters were categorized as ’Other Cells.’ Then CIBERSORTx was utilized to estimate the proportion of these labeled cell types in each RNA-seq library using the default settings (G.min=300, G.max=500, q=0.1). For deconvolution analyses, we used 8 seminiferous tubule (ST datasets), 3 spermatogonia enriched (SPGs datasets) and 8 Sertoli enriched (SER datasets) as our references separately, and the averages of each group were reported.
ATAC-seq analysis
Quality control, alignment, and peak calling: Paired-end (PE) sequencing was performed on PND15 and adult SPGs on Illumina HiSeq2500 platform at FGCZ. FASTQ files were assessed for quality using FastQC (Andrews et al., 2012) (version 0.11.8). QC was performed using TrimGalore (Krueger, 2015) (version 0.6.2) in PE mode (--paired), trimming adapters, low-quality ends (-q 30) and discarding reads shorter than 30bp after trimming (--length 30). Alignment on GRCm38 genome was performed using Bowtie2 (Langmead and Salzberg, 2012) (version 2.3.5) with the following parameters: allowing fragments up to 2kb to align (-X 2000), entire read alignment (--end-to-end), suppressing unpaired alignments for paired reads (--no-mixed), suppressing discordant alignments for paired reads (--no-discordant) and minimum acceptable alignment score with respect to read length (--score-min L,- 0.4,-0.4). Using alignmentSieve (version 3.3.1) from deepTools (Ramirez et al., 2016) (version 3.4.3), aligned data (BAM files) were adjusted for read start sites to represent the center of the transposon cutting event (--ATACshift), and filtered for reads with a high mapping quality (--minMappingQuality 30). Reads mapping to the mitochondrial chromosome and ENCODE blacklisted regions were filtered out. To call nucleosome-free regions, all aligned files were merged within groups (PND15 and adult), sorted and indexed using SAMtools (Li et al., 2009) (version 0.1.19) and nucleosome-free fragments (NFFs) were obtained by selecting alignments with a template length between 40 and 140bp in length. Peak calling on NFFs was performed using MACS2 (Zhang et al., 2008) (version 2.2.7.1) with mouse genome size (-g 2744254612) and PE BAM file format (-f BAMPE).
Differential accessibility analysis
These analyses were conducted in R (version 3.6.2) using packages from CRAN (https://cran.r-project.org) and Bioconductor (Huber et al., 2015). Peaks were annotated based on overlap with GENCODE (version M18) (Harrow et al., 2012) transcript and/or distance to the nearest TSS (https://github.com/mansuylab/SC_postnatal_adult/bin/annoPeaks.R). The number of extended reads overlapping in peak regions was calculated using csaw package (Lun and Smyth, 2015) (version 1.20.0). Peak regions that did not have at least 15 reads in at least 40% of the samples were filtered out. Normalization factors were obtained on filtered peak regions using TMM normalization method (Robinson and Oshlack, 2010) and differential analysis on peaks (PND15 versus adult) was performed using Genewise Negative Binomial Generalized Linear Models with Quasi-likelihood (glmQLFit) Tests from edgeR package (Robinson et al., 2009) (version 3.28.1). Peak regions with an absLog2FC≥1 and adjusted-P≤0.05 were categorized as DARs.
Downstream analysis
Heatmaps of normalized ATAC-seq signal were created using deepTools with default parameters. Genomic annotation of ATAC-seq peak and DARs and identification of closest gene DARs were performed using ChIPpeakAnno (Zhu et al., 2010). GO enrichment analysis of genes associated to DARs was performed using g:Profiler (Raudvere et al., 2019) querying against the Mus musculus database. For the epigenomic annotation of DARs, publicly available signal files and peak files for all histone marks were used (Cheng et al., 2020). Heatmaps of ChIP-seq signal for histone PTMs around ATAC-seq peaks and DARs were generated with deepTools. Overlap between ChIP-seq peaks and DARs was generated using Intervene (Khan et al., 2017). TF motif analysis was performed using MEME-ChIP (Ma et al., 2014) and motif identification was done querying identified motifs against JASPAR database (Castro-Mondragon et al., 2021). Quantification and identification of differences in chromatin accessibility between promoter regions of DEGs was performed with deepStats (Richard 2019). To quantify differences in chromatin accessibility between DEGs, bamscale cov (Pongor et al., 2020) was employed using ATAC-seq BAM files as input and genomic coordinates of up-regulated and down-regulated genes. Kruskal-Wallis test was used to test for significant differences in chromatin accessibility between each category of DEGs at postnatal and adult stage.
Differential accessibility analysis at TEs
TE gene transfer format (GTF) file was obtained from http://labshare.cshl.edu/shares/mhammelllab/www-data/TEtranscripts/TE_GTF/mm10_rmsk_TE.gtf.gz on 03.02.2020. This file provides hierarchical information about TEs: Class (level 1, eg. LTR), family (level 2, eg. LTR ->L1), subtype (level 3, eg. LTR-> L1->L1_Rod), and locus (level 4, eg. LTR->L1 -> L1_Rod -> L1_Rod_dup1). TE loci were annotated based on overlap with GENCODE (version M18) as described above for ATAC-seq peaks. Filtered BAM files (without reads mapping to blacklisted or mitochondrial regions) were used to analyze TEs. Mapped reads were assigned to TEs using featureCounts from R package Rsubread (Liao et al., 2019) (version 2.0.1) and were summarized to Subtypes (level 3), allowing for multi-overlap with fractional counts, while ignoring duplicates. The number of extended reads overlapping at TE loci were obtained using csaw package (Lun and Smyth, 2015) (version 1.20.0). Subtypes without at least 15 reads, and loci without at least 5 reads in at least 40% of samples were filtered out. Normalization and differential accessibility analysis were performed as described above. Subtypes which had an absolute Log2FC≥0.5 and adjusted-P≤0.05 were categorized as differentially accessible subtypes and loci with an absLog2FC≥1 and adjusted-P≤0.05 were categorized as differentially accessible loci. For further downstream data analysis, only differentially accessible loci or differentially accessible subtypes were considered. TF motif enrichment analysis was performed using the marge package (Amezquita, 2018) (version 0.0.4.9999), which is a wrapper around the Homer tool (Heinz et al., 2010) (version 4.11.1).
Acknowledgements
We thank Francesca Manuella and Martin Roszkowski for taking care of animals breeding, Yvonne Zipfel for animal care, Silvia Schelbert and Alberto Corcoba for taking care of animal licenses and lab organization and Niharika Gaur for technical help. We thank Catherine Aquino and Emilio Yángüez from Functional Genomics Center Zurich for support and advice with libraries preparation and sequencing. We are very grateful to Jon Oatley, Melissa Oatley, Tessa Lord and Nathan Law for conceptual advice, hands-on training, and for providing detailed protocols for testis dissection and preparation, and immunocytochemistry of SPG cells. We thank Zuguang Gu for support with heatmaps. We thank Service and Support for Science IT (www.s3it.uzh.ch) for computational infrastructure.
Competing interest
The authors declare that they have no competing interests.
Funding
The Mansuy lab is funded by the University Zürich, the ETH Zürich, the Swiss National Science Foundation grant number 31003A_175742/1, the National Centre of Competence in Research (NCCR) RNA&Disease funded by the Swiss National Science Foundation (grant number 182880/Phase 2 and 205601/Phase 3), ETH grants (ETH-10 15-2 and ETH-17 13-2), European Union Horizon 2020 Research Innovation Program Grant number 848158, European Union projects FAMILY and HappyMums funded by the Swiss State Secretariat for Education, Research and Innovation (SERI), FreeNovation grant from Novartis Forschungsstiftung, and the Escher Family Fund. RGA-M received an ETH Postdoctoral Fellow grant number 20-1 FEL-28. DKT received a Swiss Government Excellence Scholarship.
Data and material availability
Sequencing data is available via the EBI BioStudies database within the ArrayExpress collection (RNA-seq: E-MTAB-12721; ATAC-seq: E-MTAB-12722). The code employed for data analysis is available from https://github.com/mansuylab/SC_postnatal_adult/. Additional information is available upon request.
References
- marge: An API for Analysis of Motifs Using HOMER in RbioRxiv https://doi.org/10.1101/249268
- Sox2 and Klf4 as the Functional Core in Pluripotency Induction without Exogenous Oct4Cell Rep 29:1986–2000
- Andrews S, Krueger F, Segonds-Pichon A, Biggins L, Krueger C, Wingett S. (2012). FastQC. A quality control tool for high throughput sequence data. Babraham BioinformaticsBabraham Bioinformatics
- Multi-omics profiling of mouse gastrulation at single-cell resolutionNature 576:487–491
- Remembering through the genome: the role of chromatin states in brain functions and diseasesTranslational Psychiatry 13:1–11
- The interplay of epigenetic marks during stem cell differentiation and developmentNat Rev Genet 18:643–658
- SIRT7 links H3K18 deacetylation to maintenance of oncogenic transformationNature 487:114–118
- Mammalian ISWI and SWI/SNF selectively mediate binding of distinct transcription factorsNature 569:136–140
- The AP-1 transcriptional complex: Local switch or remote command?Biochimica et Biophysica Acta (BBA) - Reviews on Cancer 1872:11–23
- Regulation of the ESC transcriptome by nuclear long noncoding RNAsGenome Res 25:1336–1346
- Evaluation of the purity of sertoli cell primary culturesMethods in Molecular Biology 1748:9–15
- Krüppel-like factors in mammalian stem cells and developmentDevelopment 144
- Enhancer-associated H3K4 methylation safeguards in vitro germline competenceNature Communications 2021 12:1 12:1–19
- Single-Cell RNA Sequencing Defines the Regulation of Spermatogenesis by Sertoli-Cell Androgen SignalingFront Cell Dev Biol 9
- JASPAR 2022: the 9th release of the open-access database of transcription factor binding profilesNucleic Acids Res 50:D165–D173
- LIN28A marks the spermatogonial progenitor population and regulates its cyclic expansionStem Cells 32
- Unique Epigenetic Programming Distinguishes Regenerative Spermatogonial Stem Cells in the Developing Mouse TestisiScience 23
- Accessibility of promoter DNA is not the primary determinant of chromatin-mediated gene regulationGenome Res 29:1985–1995
- Switching on cilia: transcriptional networks regulating ciliogenesisDevelopment 141:1427–1441
- Convergent evolution of RFX transcription factors and ciliary genes predated the origin of metazoansBMC Evol Biol 10
- Effective methods for bulk RNA-seq deconvolution using scnRNA-seq transcriptomesGenome Biol 24https://doi.org/10.1186/s13059-023-03016-6
- An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissuesNat Methods 14:959–962
- Essential role of Plzf in maintenance of spermatogonial stem cellsNat Genet 36:653–659
- Characterization of SAP-1, a protein recruited by serum response factor to the c-fos serum response elementCell 68:597–612
- Spermatogonial Stem Cell Self-Renewal Requires OCT4, a Factor Downregulated During Retinoic Acid-Induced DifferentiationStem Cells 26:2928–2937
- Transposon-driven transcription is a conserved feature of vertebrate spermatogenesis and transcript evolutionEMBO Rep 18:1231–1247
- The nature and dynamics of spermatogonial stem cellsDevelopment (Cambridge) https://doi.org/10.1242/dev.146571
- Regulation of transposable elements by DNA modificationsNat Rev Genet https://doi.org/10.1038/s41576-019-0106-6
- Spermatogonial morphology and kinetics during testis development in mice: a high-resolution light microscopy approachReproduction 142:145–155https://doi.org/10.1530/rep-10-0431
- DNA methylation and DNA methyltransferasesEpigenetics Chromatin 10:1–10
- Staged developmental mapping and X chromosome transcriptional dynamics during mouse spermatogenesisNat Commun 10https://doi.org/10.1038/s41467-019-09182-1
- Deep transcriptome profiling of mammalian stem cells supports a regulatory role for retrotransposons in pluripotency maintenanceNat Genet 46:558–566
- Developmental expression patterns of testicular olfactory receptor genes during mouse spermatogenesisGenes to Cells 11:71–81
- Alterations in sperm long RNA contribute to the epigenetic inheritance of the effects of postnatal traumaMolecular Psychiatry 25:2162–2174
- Towards a comprehensive catalogue of validated and target-linked human enhancersNature Reviews Genetics 2020 21:5 21:292–310
- Two distinct Sertoli cell states are regulated via germ cell crosstalkBiol Reprod 105
- Foxo1 is required in mouse spermatogonial stem cells for their maintenance and the initiation of spermatogenesisJ Clin Invest 121:3456–3466
- A Comprehensive Roadmap of Murine Spermatogenesis Defined by Single-Cell RNA-SeqDev Cell 46:651–667
- Dynamic transcriptome profiles within spermatogonial and spermatocyte populations during postnatal testis maturation revealed by single-cell sequencingPLoS Genet 15
- Chromatin and Single-Cell RNA-Seq Profiling Reveal Dynamic Signaling and Metabolic Transitions during Human Spermatogonial Stem Cell DevelopmentCell Stem Cell 21:533–546
- Dnmt1 has de novo activity targeted to transposable elementsNature Structural & Molecular Biology 2021 28:7 28:594–603
- Chromatin and Transcription Transitions of Mammalian Adult Germline Stem Cells and SpermatogenesisCell Stem Cell 15:239–253
- Transcription and imprinting dynamics in developing postnatal male germline stem cellsGenes Dev 29:2312–24
- GENCODE: The reference human genome annotation for the ENCODE projectGenome Research 22:1760–1774
- Expression of Col1a1, Col1a2 and procollagen I in germ cells of immature and adult mouse testisReproduction 130:333–341
- Gfra1 Silencing in Mouse Spermatogonial Stem Cells Results in Their Differentiation Via the Inactivation of RET Tyrosine Kinase 1Biol Reprod 77:723–733
- Gdnf upregulates c-Fos transcription via the Ras/Erk1/2 pathway to promote mouse spermatogonial stem cell proliferationStem Cells 26:266–278
- Histone modifications at human enhancers reflect global cell-type-specific gene expressionNature 459https://doi.org/10.1038/nature07829
- ID4 levels dictate the stem cell state in mouse spermatogoniaDevelopment 144:624–634
- Efficient chromatin accessibility mapping in situ by nucleosome-tethered tagmentationElife 9:1–19
- The Mammalian Spermatogenesis Single-Cell Transcriptome, from Spermatogonial Stem Cells to SpermatidsCell Rep 25:1650–1667
- GPX4 and vitamin E cooperatively protect hematopoietic stem and progenitor cells from lipid peroxidation and ferroptosisCell Death & Disease 2021 12:7 12:1–9
- Orchestrating high-throughput genomic analysis with BioconductorNature Methods 12:115–121
- Evidence that direct inhibition of transcription factor binding is the prevailing mode of gene and repeat repression by DNA methylationNature Genetics 2022 54:12 54:1895–1906
- Regulation of male germline transmission patterns by the Trp53-Cdkn1a pathwayStem Cell Reports 17:1924–1941
- Regulatory Factor X Transcription Factors Control Musashi1 Transcription in Mouse Neural Stem/Progenitor CellsStem Cells Dev 23
- Intervene: a tool for intersection and visualization of multiple gene or genomic region setsBMC Bioinformatics 18
- Changes in chromatin accessibility are not concordant with transcriptional changes for single-factor perturbationsMol Syst Biol 18
- Chromatin accessibility and the regulatory epigenomeNature Reviews Genetics 2018 20:4 20:207–220
- Krueger F. (2015). Trim Galore. A wrapper tool around Cutadapt and FastQC to consistently apply quality and adapter trimming to FastQ files, Babraham Bioinformatics www.bioinformatics.babraham.ac.uk/projects/trim_galore/.Babraham Bioinformatics
- Culture Conditions and Single Growth Factors Affect Fate Determination of Mouse Spermatogonial Stem CellsBiol Reprod 71:722–731
- Spermatogonial stem cellsBiol Reprod https://doi.org/10.1093/biolre/ioy077
- Fast gapped-read alignment with Bowtie 2Nature Methods 9:357–359
- voom: precision weights unlock linear model analysis tools for RNA-seq read countsGenome Biology 15
- Developmental kinetics and transcriptome dynamics of stem cell specification in the spermatogenic lineageNat Commun 10
- The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing readsNucleic Acids Research 47:e47–e47
- Genomic Repeats Categorize Genes with Distinct Functions for Orchestrated RegulationCell Rep 30:3296–3311
- csaw: a Bioconductor package for differential binding analysis of ChIP-seq data using sliding windowsNucleic Acids Research 44
- Motif-based analysis of large nucleotide data sets using MEME-ChIPNat Protoc 9:1428–1450
- Dynamic reorganization of open chromatin underlies diverse transcriptomes during spermatogenesisNucleic Acids Res 46https://doi.org/10.1093/nar/gkx1052
- Mapping the epigenomic and transcriptomic interplay during memory formation and recall in the hippocampal engram ensembleNature Neuroscience 2020 23:12 23:1606–1617
- The mammalian Doublesex homolog DMRT1 is a transcriptional gatekeeper that controls the mitosis versus meiosis decision in male germ cellsDev Cell 19
- Toward a More Precise and Informative Nomenclature Describing Fetal and Neonatal Male Germ Cells in RodentsBiol Reprod 89:1–9https://doi.org/10.1095/biolreprod.113.110502
- Epigenetic priming as a mechanism of predetermination of spermatogonial stem cell fateAndrology 11:918–926
- Determining cell type abundance and expression from bulk tissues with digital cytometryNat Biotechnol 37:773–782https://doi.org/10.1038/s41587-019-0114-2
- Formation of pluripotent stem cells in the mammalian embryo depends on the POU transcription factor Oct4Cell 95:379–391
- The Biology of Mammalian SpermatogoniaSpringer New York https://doi.org/10.1007/978-1-4939-7505-1
- Histone-Fold Domain Protein NF-Y Promotes Chromatin Accessibility for Cell Type-Specific Master Transcription FactorsMol Cell 55:708–722
- BAMscale: quantification of next-generation sequencing peaks and generation of scaled coverage tracksEpigenetics & Chromatin 13
- Histone availability as a strategy to control gene expressionRNA Biology 14:281–286
- Rfx2 Stabilizes Foxj1 Binding at Chromatin Loops to Enable Multiciliated Cell Gene ExpressionPLoS Genet 13
- Decoding the Spermatogenesis Program: New Insights from Transcriptomic AnalysesAnnu Rev Genet 56:339–368https://doi.org/10.1146/annurev-genet-080320-040045
- A unique chromatin signature uncovers early developmental enhancers in humansNature 470
- deepTools2: a next generation web server for deep-sequencing data analysisNucleic Acids Research 44:W160–W165
- g:Profiler: a web server for functional enrichment analysis and conversions of gene listsNucleic Acids Research 47:W191–W198
- The Col4a2em1(IMPC)Wtsi mouse line: Lessons from the Deciphering the Mechanisms of Developmental Disorders programBiol Open 8
- gtrichard/deepStats: deepStats 0.3.1 (Version 0.3.1)Zenodo https://doi.org/10.5281/zenodo.3336593
- limma powers differential expression analyses for RNA-sequencing and microarray studiesNucleic Acids Research 43:e47–e47
- edgeR: a Bioconductor package for differential expression analysis of digital gene expression dataBioinformatics 26:139–140
- A scaling normalization method for differential expression analysis of RNA-seq dataGenome Biology 11
- Endogenous retroviruses drive species-specific germline transcriptomes in mammalsNat Struct Mol Biol 27:967–977
- Specific Transcriptomic Signatures and Dual Regulation of Steroidogenesis Between Fetal and Adult Mouse Leydig CellsFront Cell Dev Biol 9
- Differential expression of c-kit in mouse undifferentiated and differentiating type A spermatogoniaEndocrinology 140:5894–5900
- AP-1 as a regulator of cell life and deathNat Cell Biol 4:E131–E136
- Transcriptional enhancers: from properties to genome-wide predictionsNature Reviews Genetics 2014 15:4 15:272–286
- PLZF suppresses differentiation of mouse spermatogonial progenitor cells via binding of differentiation associated genesJ Cell Physiol 235:3033–3042
- Id4 Marks Spermatogonial Stem Cells in the Mouse TestisScientific Reports 5
- The Long Noncoding RNA Lncenc1 Maintains Naive States of Mouse ESPGs by Promoting the Glycolysis PathwayStem Cell Reports 11:741–755
- Widespread contribution of transposable elements to the innovation of gene regulatory networksGenome Res 24:1963–1976
- Transposable elements as a potent source of diverse cis-regulatory sequences in mammalian genomesPhilosophical Transactions of the Royal Society B: Biological Sciences https://doi.org/10.1098/rstb.2019.0347
- Single-cell RNAseq analysis of testicular germ and somatic cell development during the perinatal periodDevelopment 147https://doi.org/10.1242/dev.183251
- Sp5 induces the expression of Nanog to maintain mouse embryonic stem cell self-renewalPLoS One 12
- Long Terminal Repeats: From Parasitic Elements to Building Blocks of the Transcriptional Regulatory RepertoireMol Cell https://doi.org/10.1016/j.molcel.2016.03.029
- GA-Binding Protein Alpha Is Involved in the Survival of Mouse Embryonic Stem CellsStem Cells 35:2229–2238
- AP-1 Transcription Factors and the BAF Complex Mediate Signal-Dependent Enhancer SelectionMol Cell 68:1067–1082
- Single-Cell RNA Sequencing Analysis Reveals Sequential Cell Fate Transition during Human SpermatogenesisCell Stem Cell 23:599–614
- LIN28A binds to meiotic gene transcripts and modulates their translation in male germ cellsJ Cell Sci 133
- Long non-coding RNAs potentially function synergistically in the cellular reprogramming of SCNT embryosBMC Genomics 19
- The POU domain transcription factor POU3F1 is an important intrinsic regulator of GDNF-induced survival and self-renewal of mouse spermatogonial stem cellsBiol Reprod 82:1103–1111
- Oct4 differentially regulates chromatin opening and enhancer transcription in pluripotent stem cellsElife 11https://doi.org/10.7554/ELIFE.71533
- Overlapping functions of krüppel-like factor family members: Targeting multiple transcription factors to maintain the naïve pluripotency of mouse embryonic stem cellsDevelopment (Cambridge) 145
- DMRT1 Is Required for Mouse Spermatogonial Stem Cell Maintenance and ReplenishmentPLoS Genet 12
- Model-based Analysis of ChIP-Seq (MACS)Genome Biology 9
- ChIPpeakAnno: a Bioconductor package to annotate ChIP-seq and ChIP-chip dataBMC Bioinformatics 11
Article and author information
Author information
Version history
- Sent for peer review:
- Preprint posted:
- Reviewed Preprint version 1:
- Reviewed Preprint version 2:
Copyright
© 2023, Lazar-Contes et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- views
- 745
- downloads
- 34
- citations
- 0
Views, downloads and citations are aggregated across all versions of this paper published by eLife.