Despite intense research on mice, the transcriptional regulation of neocortical neurogenesis remains limited in humans and non-human primates. Cortical development in rhesus macaque is known to recapitulate multiple facets of cortical development in humans, including the complex composition of neural stem cells and the thicker supragranular layer. To characterize temporal shifts in transcriptomic programming responsible for differentiation from stem cells to neurons, we sampled parietal lobes of rhesus macaque at E40, E50, E70, E80, and E90, spanning the full period of prenatal neurogenesis. Single-cell RNA sequencing produced a transcriptomic atlas of the developing rhesus macaque neocortex parietal lobe. Identification of distinct cell types and neural stem cells emerging in different developmental stages revealed a terminally bifurcating trajectory from stem cells to neurons. Notably, deep-layer neurons appear in the early stages of neurogenesis, while upper-layer neurons appear later. While these different lineages show overlap in their differentiation program, cell fates are determined post-mitotically. Pseudotime trajectories from vRGs to oRGs revealed differences in dynamic gene expression profiles and identified divergence in their activation of BMP, FGF, and WNT signaling pathways. These results provide a comprehensive picture of the temporal patterns of gene expression leading to different fates of radial glial progenitors during neocortex layer formation.
This study presents a useful resource for the gene expression profiles of different cell types in the parietal lobe of the cerebral cortex of prenatal macaques. The evidence supporting the claims of the authors is solid, and revision has clarified some of the cell isolation and cell classification issues flagged by reviewers. This dataset will be of interest to developmental neurobiologists and could potentially be used for future comparative studies on early brain development.
The neocortex is the center for higher brain functions, such as perception and decision-making. Therefore, the dissection of its developmental processes can be informative of the mechanisms responsible for these functions. Several studies have advanced our understanding of the neocortical development principles in different species, especially in mice. Generally, the dorsal neocortex can be anatomically divided into six layers of cells occupied by distinct neuronal cell types. The deep-layer neurons project to the thalamus (layer VI neurons) and subcortical areas (layer V neurons), while neurons occupying more superficial layers (upper-layer neurons) preferentially form intracortical projections1. The generation of distinct excitatory neuron cell types follows a temporal pattern in which early-born neurons migrate to deep layers (i.e., layers V and VI), while the later-born neurons migrate and surpass early-born neurons to occupy the upper layers (layers II-IV) 2. In Drosophila, several transcription factors are sequentially explicitly expressed in neural stem cells to control the specification of daughter neuron fates, while very few such transcription factors have been identified in mammals thus far. Using single-cell RNA sequencing (scRNA-seq), Telley and colleagues found that daughter neurons exhibit the same transcriptional profiles of their respective progenitor radial glia, although these apparently heritable expression patterns fade as neurons mature3. However, the temporal expression profiles of neural stem cells and the contribution of these specific temporal expression patterns in determining neuronal fate have yet to be wholly clarified in humans and non-human primates. Over the years, non-human primates (NHP) have been widely used in neuroscience research as mesoscale models of the human brain. Therefore, exploring the similarities and differences between NHP and human cortical neurogenesis could provide valuable insight into unique features during human neocortex development.
In mammals, radial glial cells are found in the ventricular zone (VZ), where they undergo proliferation and differentiation. The neocortex of primates exhibits an extra neurogenesis zone known as the outer subventricular zone (OSVZ), which is not present in rodents. As a result of evolution, the diversity of higher mammal cortical radial glia populations increases. Although ventricular radial glia (vRG) is also found in humans and non-human primates, the vast majority of radial glia in these higher species occupy the outer subventricular zone (OSVZ) and are therefore termed outer radial glia (oRG). Outer radial glial (oRG) cells retain basal processes but lack apical junctions 4 and divide in a process known as mitotic somal translocation, which differs from vRG 5.
VRG and oRG are both accompanied by the expression of stem cell markers such as PAX6 and exhibit extensive self-renewal and proliferative capacities 6. However, despite functional similarities, they have distinct molecular phenotypes. Previous scRNA-seq analyses have identified several molecular markers, including HOPX for oRGs, CRYAB, and FBXO32 for vRGs7. Furthermore, oRGs are derived from vRGs, and vRGs exhibit obvious differences in numerous cell-extrinsic mechanisms, including activation of the FGF-MAPK cascade, SHH, PTEN/AKT, and PDGF pathways, and oxygen (O2) levels. These pathways and factors involve three broad cellular processes: vRG maintenance, spindle orientation, and cell adhesion/extracellular matrix production8. Some transcription factors have been shown to participate in vRG generation, such as INSM and TRNP1. Moreover, the cell-intrinsic patterns of transcriptional regulation responsible for generating oRGs have not been characterized.
ScRNA-seq is a powerful tool for investigating developmental trajectories, defining cellular heterogeneity, and identifying novel cell subgroups9. Several groups have sampled prenatal mouse neocortex tissue for scRNA-seq 10,11, as well as discrete, discontinuous prenatal developmental stages in human and non-human primates, 7,12 13,14. The diversity and features of primate cortical progenitors have been explored 4,6,7,15.
The temporally divergent regulatory mechanisms that govern cortical neuronal diversification at the early postmitotic stage have also been focused on 16. Studies spanning the full embryonic neurogenic stage in the neocortex of humans and other primates are still lacking. Rhesus macaque and humans share multiple aspects of neurogenesis, and more importantly, the rhesus monkey and human brains share more similar gene expression patterns than the brains of mice and humans17–19. To establish a comprehensive, global picture of the neurogenic processes in the rhesus macaque neocortex, which can be informative of neocortex evolution in humans, we sampled neocortical tissue at five developmental stages (E40, E50, E70, E80, and E90) in rhesus macaque embryos, spanning the full neurogenesis period. Through strict quality control, cell type annotation, and lineage trajectory inference, we identified two broad transcriptomic programs responsible for the differentiation of deep-layer and upper-layer neurons. We also defined the temporal expression patterns of neural stem cells, including oRGs, vRGs, and IPs, and identified novel transcription factors involved in oRG generation. These findings can substantially enhance our understanding of neocortical development and evolution in primates.
scRNA-Seq Analysis of Cell Types in the Developing Macaque Neocortex
In order to establish a comprehensive view of the cellular composition of the rhesus macaque brain at different stages in prenatal period development, we conducted single-cell RNA sequencing (scRNA-seq) on the dissected parietal lobes of eight total rhesus macaque embryos, spanning five developmental stages of prenatal neurogenesis, including E40 (stage of peak neurogenesis) and E50 (Layer 6 formation); as well as at E70 (Layer 5 formation), E80 (Layer 4 formation), and E90 (Layer2-3 formation) (Figure 1A, Figure S1A)2. We obtained a transcriptome from 53,295 cells after filtering out low-quality cells and removing potential doublets. Each embedding was visualized using uniform manifold approximation and projection (UMAP) of dimension reduction using Seurat, which identified 28 distinct clusters. All cell clusters present in the samples were then annotated (Figure 1B), and the respective cell types were identified (Figure 1C) based on their expression of molecular markers (Figure 1D and Figure S2, B to C). According to the expression of the marker genes, we assigned clusters to cell type identities of neurocytes (including radial glia (RG), outer radial glia (oRG), intermediate progenitor cells (IPCs), ventral precursor cells (VP), excitatory neurons (EN), inhibitory neurons (IN), oligodendrocyte progenitor cells (OPC), oligodendrocytes, astrocytes, ventral LGE-derived interneuron precursors and Cajal-Retzius cells, or non-neuronal cell types (including microglia, endothelial, meninge/VALC(vascular cell)/pericyte, and blood cells). Based on the expression of the marker gene, cluster 23 was identified as thalamic cells, which are small numbers of non-cortical cells captured in the sample collection at earlier time points. Each cell cluster was composed of multiple embryo samples, and the samples from similar stages generally harbored similar distributions of cell types.
In general, ventricular radial glia (vRG) (cluster 10) showed characteristic VIM and PAX6 expression; outer radial glia (oRG) (cluster 12 and cluster 14) highly expressed HOPX, previously verified marker7; two clusters (cluster 8 and cluster 22) of intermediate progenitor cells (IPCs) that strongly expressed EOMES, and one cluster of IPCs (cluster 22) were all topologically close to radial glia, while the other IPC cluster (cluster 8) was closer to neurons, indicating the presence of two stages of IPCs20; excitatory neurons were identified by the expression of well-established markers, such as NEUROD2 and NEUROD6; and the astrocyte and oligodendrocyte lineages were identified by AQP4 and SOX10 expression, respectively21. In addition, we identified DLX1 and GAD-positive cells, which suggested the presence of inhibitory neurons (Figure 1C and Figure S2).
Collectively, these results suggested that cortical neural progenitors undergo neurogenesis processes during the early stages of macaque prenatal cortical development, while gliogenic differentiation, including oligodendrocytes and astrocytes, occurs in later stages (Figure 1A and 1B).
Distinct Excitatory Neuronal Types Sequentially Emerge in Developing Cortex
To better understand the temporal dynamics of excitatory neuron (EN) development and differentiation, we compiled a subset of all ENs and re-clustered them into ten subclusters (EN1-10) (Figure 2A). We then calculated the relative expression levels of cellular markers for each group and annotated the EN subclusters based on published descriptions of marker function (Figure S3B). We found that all EN subclusters could be well-distinguished by differential expression of either deep-layer neuron markers (BCL11B, FEZF2, and SOX5)22 or upper-layer neuron markers (CUX1 and SATB2)23,24(Figure 2, B to C). In previous seminal studies, 3H-thymidine (3H-dT) tracing in macaque rhesus unraveled the sequential generation of cortical neurons, with deep-layer neurons appearing prior to upper-layer neurons2. In agreement with this early work, we found that early-born neurons (E40 and E50) predominantly outnumbered later-born neurons in the deep-layer neuron subclusters (EN5 and 10), while upper-layer neuron subclusters contained the inverse proportions (Figure 2, C and D).
We then characterized temporal changes in the composition of each EN subcluster. While the EN 5 and EN 10 (deep-layer neurons) subclusters emerged at E40 and E50 and disappeared in later stages, EN subclusters 1, 2, 3, and 4 gradually increased in population size from E50 to E80, EN subclusters 8 and 9 gradually increased in population size from E80 to E90 (Figure 2D). Notably, EN 6 was exclusively found in the cortex at E70 (Figure S3A). EN7 is identified as CUX1-positive, PBX3-positive, and ZFHX3-positive excitatory neuron subcluster.
Specification of Different Progenitor Fates Controlled by Regulator Genes
To focus on the differentiation of neural progenitors, we generated subsets of cell clusters 8, 10, 12, 14, 15, 16, and 22, then annotated them as ventricular radial glia (RG_C10), outer radial glia (oRG_C12 and oRG_C14), intermediate progenitor cells (IPC_C22 and IPC_C8), ventral precursor cells (VP_C15), or oligodendrocyte precursor cells (OPC_C16) (Figure 3A). Subclustering analysis revealed that these neural progenitors differentiated into diverse cell types with distinct lineages across the macaque prenatal neocortex (Figure 3B). We next investigated trajectories within the neural stem cell pool prior to branching as upper-layer or deep-layer excitatory neurons (EN). The oRG and IPC precursor cell groups exhibited characteristically high specific expression of HOPX and EOMES respectively.
Although the presence of oRGs is a well-established feature of primate neurogenesis, and their molecular markers are widely used, the genetic basis and molecular processes leading to their emergence are still poorly understood. To construct gene expression profiles that illustrate the progression from RG_C10 cells to oRGs, we specifically examined changes in expression among RG_C10, oRG_C12, and oRG_C14 cells, then calculated pseudotime trajectories (Figure 3, C to D). Based on these trajectories, we then profiled the temporal shifts in the expression of each gene and selected the 150 most significant genes (Figure 3E). We found a set of genes that were previously reported as highly expressed in outer radial glia that also showed high expression in the oRG_C12 differentiated pseudotime terminal 7,25,26 (such as SFRP1, HOPX, FAM107A, TNC, PTN, and MOXD1), while high ASPM expression was typical of cells in the earlier, RG, pseudotime terminal (Figure 3F). In addition to these markers, we also found enrichment for some potential regulatory genes at the oRG_C12 terminal, such as Regulator of the cell cycle (RGCC), which controls mitotic spindle orientation 27, and TTYH1, which regulates cell adhesion 28. We also found genes that regulate ion channel expression, such as ATP1A2, ATP1B2, and SCN4B, which were not previously known to participate in oRG differentiation.
Analysis of DEGs specific to the RG pseudotime terminal identified cell cycle-related genes such as MKI67, TOP2A, CDC20, and CCNA2. Within the glia-like cells that largely comprised the oRG_C12 terminal, we also detected a population of true glial genes that exhibited high expression of SLC1A3, ZFP36L1 29, GFAP, DBI, and EDNRB 30. In addition, our analyses uncovered several DEGs that have not yet been investigated for a role in radial glia function and differentiation, such as DKK3, DDAH1, SAT1, and PEA15. Finally, we screened for significant DEGs associated with RG-to-IPC and OPC-to-astrocyte/oligodendrocyte differentiation trajectories (Figures S5 and S6).
The Generation of Deep-layer and Upper-layer Neurons Follows Distinct Terminal Trajectories
Based on our finding above that the deep-layer neurons appear earlier than upper-layer neurons, we next sought to characterize dynamic shifts in the expression of transcriptional regulators that potentially contribute to determining neuronal fates. To this end, we performed pseudotime trajectory analysis of EN lineages (including deep-layer and upper-layer neurons) (Figure 4A), then superimposed the cluster labels from RG_C10, oRG_C12, and oRG_C14 stem cells in UMAP plots (Figure 3A) and early (E40 and E50) and late (E70, 80 and 90) emergence stages (Figure 2A) over the extracted transcriptomic data. We then calculated the RG-specific differentially expressed genes (rgDEGs) and the DEGs of early and late neuron-specific (nDEGs). The set of genes that overlapped between rgDEGs and nDEGs (termed mapping genes) can thus be used to map neuronal subtypes to different neural stem cell populations (Figure. S4). We hypothesized that mapping genes, such as FEZF2 and DOK5, may function in radial glial cells to specify neuronal progeny (Figure 4B). Our results provide single-cell transcriptome data to support that apical progenitors and daughter excitatory neurons share molecular temporal identities during neocortex prenatal corticogenesis 31–33.
Since substantial evidence indicates that neuronal fate is determined post-mitotically in the mammalian neocortex, analysis of differentially expressed genes throughout the differentiation process could reveal the genetic mechanisms responsible for deciding stem cell fate as different neuronal progeny. Using Dynverse R packages, we constructed pseudotime trajectories for the cluster data, which identified a bifurcating trajectory from neural stem cells (including RG_C10, oRG_C12, and oRG_C14) that leads to either deep-layer neurons in one branch or to upper-layer neurons in the alternative branch (Figure 4C). However, it should be noted that the two distinct fates share a common path from neural stem cells to immature neurons (Figure 4C, from dot1 to dot4), supporting the likelihood that neuronal fate is primarily determined post-mitotically.
Based on this trajectory, we categorized the DEGs that exhibit dynamic, temporal shifts in expression over pseudotime, resulting in broad clusters. The first of these clusters was enriched with neural stem cells, characterized by high expression of PAX6, VIM, and TOP2A, which likely participate in regulating stemness and proliferation. The three remaining DEG clusters were enriched in either deep-layer neurons, upper-layer neurons, or both (Figure 4D). More specifically, we found that most DEGs, such as STMN2, TUBB3, NEUROD2, and NEUROD6, were shared between both branches, illustrating the differentiation processes between deep and upper-layer ENs. We also identified deep layer-specific DEGs (including FEZF2, TBR1, and LMO3) and upper layer-specific genes (including MEF2C and DOK5). We then organized the cell types into a branching tree based on their differential expression of these marker genes, including stem-cell genes PAX6, HOPX, and EOMES, in addition to the above deep-layer-and upper-layer-specific genes (Figure 4E). The separation into distinct, branch-specific sets of DEGs suggested a role for these cell type-specific transcription factors and other genes in terminal neuron differentiation.
Conserved and Divergent Features of Human, Macaque and Mouse Neocortical Progenitor Cells
Next, we performed liger integration 34 to integrate the macaque single-cell dataset with published mouse 11 and human datasets, which were created by combing two published human scRNA-seq data of 7 to 21 postconceptional week (PCW) neocortex 35,36, to conduct a cross-species comparison. Species-integrated UMAP showed that all major cell types were well integrated between the three species (Figure 5A).
It is clearly observed that similar cellular processes with comparable temporal progression sequentially generate deep-layer and upper-layer neurons in both primates and rodents (Figure S7, A to C). However, in primates, the number and relative development period of upper neurons are more extended than that of rodents (Figure S7D). Besides, another important evolutionarily divergent feature in cortical development between primates and rodents is the abundance of oRG cells in primates’ outer subventricular zone (oSVZ) (human and macaque). Our human-macaque-mouse cross-species analysis showed that strong expression of the oRG marker genes, such as HOPX, MOXD1, FAM107A and CLU (3) in both human and macaque datasets (Figure 5B), while barely detectable expression in mouse dataset, which is consistent with previous reports 37.
Furthermore, We picked HOPX-positive and EOMES-positive cells (ligercluster 5, 20, 13, 15) from the human, macaque, and mouse datasets and performed developmental trajectories analysis with slingshot (Figure 5C). Comparing vRG to IPC trajectory between human, macaque, and mouse, we found this biological process of vRG-to-IPC is remarkably conserved across species. However, the vRG to oRG trajectory is divergent between species because the oRG population was not identified in the mouse dataset. The latter process is almost invisible in mice but similar in humans and macaques.
Previous studies have shown that neural progenitors exhibit distinct temporal expression patterns during neurogenesis in mice. However, similar temporal profiling has yet to be thoroughly investigated in macaque neural stem cells. Based on its topological position and associated markers that suggest it functions as a developmental root, we examined the temporal expression profile of RG_C10 cells (Figure 3A). We first identified marker genes to distinguish RG_C10 cells in different prenatal stages (i.e., E40-E90) and used these markers to generate dynamic expression profiles. We then clustered these dynamically expressed macaque genes into five types (Figure 6A), consisting of Type 1 and 2 genes that decreased in expression during neurogenesis, Type 3, 4 and 5 genes that gradually ramped up in expression throughout neurogenesis. We then performed a similar profiling of dynamic gene expression using data obtained from mouse apical progenitors10 and categorized the DEGs into the same five categories. A total of 72 homologous transcription factors were shared between human, macaque and mouse transcriptomes. Analysis of their temporal dynamics revealed that they exhibited similar temporal patterns between species such as NEUROD1, NEUROG2, POU3F2, ETV1, ETV5 and FOS (Figure 6B and Figure S8), which indicated that the majority of transcription factors had conserved temporal dynamics between species, and thus evolutionarily conserved patterns of regulation in neural stem cell differentiation.
To identify the master regulators (MRs) related to cortical neurogenesis among macaque and mouse, we used the SCENIC workflow to analyze the gene regulatory networks of each transcription factor (Figure 6, C to H). Among all the regulatory pairs associated with each prenatal stage, we selected ten TFs with the highest regulon specificity scores (Figure 6 C, E and G) and their top 5 target genes, as visualized by Cytoscape. Analysis of the regulatory activity of human, macaque and mouse11 prenatal neocortical neurogenesis indicated that despite commonalities in the roles of classical developmental TFs such as GATA1, SOX2, HMGN3, TCF7L1, ZFX, EMX2, SOX10, NEUROG1, NEUROD1 and POU3F1. The top 10 TFs of the human, macaque, and mouse vRG each time point and their top 5 target genes identified by pySCENIC as an input to construct the transcriptional regulation network (Figure 6 D, F and H). Some conserved regulatory TFs present in more than one species are identified, such as HMGN3, EMX2, SOX2, and HMGA2 at an early stage when deep-lager generation and SOX10, ZNF672, and ZNF672 at a late stage when upper-lay generation.
The cerebral cortex region of the brain is responsible for extraordinary cognitive capacities, such as abstract thinking and language. The neocortex of non-human primate rhesus macaque resembles that of humans in many aspects. Thus, a comprehensive investigation of macaque neurogenesis can improve our understanding of neocortical development and evolution. For this purpose, we performed scRNA-seq in prenatal neocortical tissues of macaque. Our data support previous omics studies of prenatal macaque brain development 38 39 17 and recently published human neocortex development studies 40. We constructed a single-cell resolution transcriptomic atlas of the developing macaque neocortex, which we then used to identify dynamically expressed genes that likely contribute to the maturation of distinct neuron types, temporal expression patterns of neural stem cells, and the generation of oRGs.
The existence of oRGs has long been demonstrated in humans and primates through live imaging and immunostaining 41. ORGs share similarities in their patterns of gene expression with vRGs (e.g., SOX2, PAX6, and NESTIN) but also specifically express several genes (such as MOXD1, HOPX, and FAM107A) 7,42. Recently, the molecular mechanisms associated with the generation and amplification of oRGs were shown to include signaling pathways such as the FGF-MAPK cascade, SHH, PTEN/AKT, and PDGF pathways, as well as proteins such as INSM, GPSM2, ASPM, TRNP1, ARHGAP11B, PAX6, and HIF1α. A number of these proteins were validated by genetic manipulation of their activities or expression in mouse, ferret, and marmoset 43–46. The dynamically expressed genes and pseudotime trajectories from vRGs to oRGs presented in this work are in agreement with these previous reports. In addition to known regulators, we detected differential expression of ion channel regulators in oRGs, such as ATP1A2, ATP1B2, and SCN4B, suggesting that hyperpolarization and depolarization processes may participate in promoting or maintaining oRG populations. Indeed, hyperpolarization in mouse apical progenitors has been shown to promote the generation of IPCs and indirect neurogenesis 47, although it remains unclear if similar mechanisms contribute to the development of the primate neocortex.
The sequential generation of cortical neurons has long been observed in both rodents and primates, and many transcription factors are known to drive post-mitotic specification of neuronal progeny, mainly in rodents. Here, we found genes conserved across species, such as FEZF2 and TBR1, involved in specifying deep-layer neurons in the macaque neocortex. The slingshot pseudotime analysis, which reflected the expression pattern of genes during appropriate lineage trajectories, suggests that some genes are lineage-restricted or even involved in lineage fate determination. For example, FEZF2, PPP1R1B (also known as DARPP32), SOX5, LMO3, and CELF4 may be involved in the fate specialization of deep neurons. In contrast, TMOD1, PID1, LINGO1, CRYM, and MEF2C for upper-layer neurons. Notably, FEZF2 48 and SOX5 49 have been confirmed associated with deep layer neuron specification in mice, while others, such as TMOD1, PID1, and LINGO1, have not been reported before, and their potential functions in the fate specification of cortical projection neurons need to be further explored. To date, only a very limited set of genes have been shown to function in specifying neuronal progeny in progenitor cells, which suggests that regulation in progenitors is more complex than in neurons, where a single transcription factor can trigger a transcriptional cascade to promote the maturation of neurons. In the current study, we found two types of genes (Dynamic type 1 and 5), which have gradually altered expression patterns along with the neurogenic stage, indicating a transition in cell state in both vRGs and oRGs. In addition to these gradual transition genes, we also identified sets of genes (Dynamic type 2-4) with sharp changes in expression during neurogenesis, which likely function in specifying different neuronal subtypes. These findings led us to postulate that a combination of biological processes and pathways change over time to coordinate gradual transitions from neural stem cells to daughter neurons.
This study also faces limitations. During the macaque prenatal neurogenesis, dorsal radial glia cells generate excitatory neurons by direct and indirect neurogenesis. In general, the number of progenitor cells gradually decreases throughout development, with progenitor cells generating deep neurons in the early stage of development and upper excitatory neurons in the later stages. We observed the disappearance of two subclusters of ENs (EN 5 and EN 10) from later-stage samples (Figure S3). This disappearance could also be explained by the development of axons and dendrites in early-born neurons, which have higher morphological complexity and greater vulnerability. Due to the fixed size of the BD Rhapsody microwells, this sing-cell capture method might be less efficient in capturing mature excitatory neurons but has a good capture effect on newborn neurons at each sampling time point. Nonetheless, this transcriptomic atlas uncovers the molecular signatures of the primary cell types and temporal shifts in gene expression of differentiating progenitor cells during neocortex layer formation, thus providing a global perspective into neocortical neurogenesis in the macaque.
Materials and Methods Animal
Animals and frozen tissue samples from prenatal rhesus macaque (Macaca mulatta) were provided by the National Medical Primate Center, Institute of Medical Biology, Chinese Academy of Medical Sciences. Timed pregnancy-derived biological replicate specimens were profiled at each prenatal developmental stage (E40, E50, E70, E80 and E90)50. These time points were selected to coincide with peak periods of neurogenesis for the different layers of the neocortex based on previous studies 50,51. All animal procedures followed international standards and were approved in advance by the Ethics Committee on Laboratory Animals at IMBCAMS (Institute of Medical Biology, Chinese Academy of Medical Sciences).
Fetal brain sample details
We collected eight pregnancy-derived fetal brains of rhesus macaque (Macaca mulatta) at five prenatal developmental stages (E40, E50, E70, E80, E90) and dissected the parietal lobe cortex. Because of the different development times of rhesus monkeys, prenatal cortex size and morphology are different. To ensure that the anatomical sites of each sample are roughly the same, we use the lateral groove as a reference to collect the parietal lobe for single-cell sequencing (as indicated by bright yellow in Figure S1A) and do not make a clear distinction between the different regional parts including primary somatosensory cortex and association cortices in the process of sampling.
We followed the BD Cell Preparation Guide to wash, count, and concentrate cells in preparation for use in BD Rhapsody™ System Whole transcriptome analysis.
Single-cell RNA-seq data processing
Single-cell capture was achieved by random distribution of a single-cell suspension across >200,000 microwells through a limited dilution approach. Beads with oligonucleotide barcodes were added to saturation so that a bead was paired with a cell in a microwell. The cell-lysis buffer was added so that poly-adenylated RNA molecules hybridized with the beads. Beads were collected into a single tube for reverse transcription. Upon cDNA synthesis, each cDNA molecule was tagged on the 5′ end ( the 3′ end of a mRNA transcript) with a molecular index and cell label indicating its cell of origin. Whole transcriptome libraries were prepared using the BD Resolve single-cell whole-transcriptome amplification workflow. In brief, the second strand of cDNA was synthesized, followed by ligation of the adaptor for universal amplification. Eighteen cycles of PCR were used to amplify the adaptor-ligated cDNA products. Sequencing libraries were prepared using random priming PCR of the whole-transcriptome amplification products to enrich the 3′ ends of the transcripts linked with the cell label and molecular indices. Sequencing was performed with Illumina NovaSeq 6000 according to the manufacturer’s instructions. The filtered reads were aligned to the macaque reference genome file with Bowtie2 (v2.2.9). Macaque reference genome was downloaded from Ensemble database (ftp://ftp.ensembl.org/pub/release100/fasta/macaca_mulatta/dna/Macaca_mulatta.Mmul_10.dna.toplevel.fa.gz, ftp://ftp.ensembl.org/pub/release-100/gtf/macaca_mulatta/Macaca_mulatta.Mmul_10.100.gtf.gz). We analyzed the sequencing data following BD official pipeline and obtained the cell-gene expression matrix file.
QC and Data analysis
To mitigate the effect of over-estimation of molecules from PCR and sequencing errors, we used the Unique Molecular Identifier (UMI) adjustment algorithms recursive substitution error correction (RSEC) and distribution-based error correction (DBEC), which were contained in the BD Rhapsody™ pipeline. Quality control was applied based on the detected gene number and the percentage of counts originating from mitochondrial RNA, ribosomal RNA, and hemoglobin gene per cell. Then, cells were filtered to retain only higher quality (with less than 7.5% mitochondrial gene counts, less than 5.5% ribosomal gene counts, and detected genes above 400 and less than 6000). Additionally, we identified cell doublets using Scrublet v0.142 with default parameters and removed doublets before data analysis. After quality control, a total of 59,045 cells remained for subsequent analysis.
To integrate cells into a shared space from different datasets for unsupervised clustering, we used SCTransform workflow in R packages Seurat52 V4.0.3 to normalize the single-cell RNA-seq data from different samples. We identified the variable features of each donor sequencing data using the “FindVariableFeatures” function. A consensus list of 2000 variable genes was then formed by detecting the most significant recovery rates genes across samples. Mitochondrial and ribosomal genes were not included in the variable gene list. Next, we used the SCTransform workflow in Seurat to normalize the single cell-seq data from different samples. During normalization, we also removed confounding sources of variation, including mitochondrial mapping and ribosomal mapping percentages. Standard RPCA workflow was used to perform the integration.
With a “resolution” of 0.5 upon running “FindClusters,” we distinguished major cell types of nerve cells and non-nerve cells in the umap according to known markers, including radial glia cell, outer radial glia cell, intermediate progenitor cell, excitatory neuron, inhibitory neuron, astrocyte, oligodendrocyte, Cajal-Retzius cell, microglia, endothelial cell, meningeal cells, and blood cells, which were distinguished on the first level. By following a similar pipeline, subclusters were identified. We finalized the resolution parameter on the FindClusters function once the cluster numbers did not increase when we increased the resolution. Then, we checked the differentially expressed genes (DEGs) between each of the clusters using the “FindAllMarkers” and “FindMarkers” functions with logfc.threshold=0.25.
Construction of single-cell developmental trajectory
Single-cell developmental lineage trajectories construction and discovery of trajectory transitions were performed using Slingshot(V 2.2.0) 53, from the Dyno (V 0.1.2) 54 platform, with PCA dimensionality reduction plot results. The direction of the developmental trajectory was adjusted by reference to the verified relevant studies.
Cross-species transcriptome data integration and analysis
Mouse datasets were downloaded from the Gene Expression Omnibus (GEO SuperSeries GSE153164) and at the Single Cell Portal: https://singlecell.broadinstitute.org/single_cell/study/SCP1290/molecular-logic-of-cellular-diversification-in-the-mammalian-cerebral-cortex. Human datasets were download from the Gene Expression Omnibus (GEO) under the accession number GSE104276 and GSE104276. We created a human database by combining the two published human prenatal cortical development datasets of 7 to 21 postconceptional weeks (PCW) into one containing cell numbers comparable to our macaque and published mouse dataset using the single cell transcriptome analysis pipeline of R package Seurat. We used the biomaRt package to convert the gene symbols in the mouse and macaque expression matrices into their human homologs. To perform cross-species analysis, we used the LIGER  method to integrate the macaque single-cell dataset with the mouse and human datasets. We ran the “optimizeALS” function in rliger to perform integrative non-negative matrix factorization(iNMF) on the scaled species-integrated dataset (k = 20, lambda = 5, max.iters = 30). Species-integrated UMAP showed that all major cell types were well integrated between the three species (Figure 5A). We used the “calcAlignment” function in rliger to quantify how well-aligned datasets are and got the metric of 0.891231545756746, which was very close to “1”. This suggested LIGER well-integrated cross-species datasets.
Transcription Factor Regulatory Network Analysis
We inferred regulon activities for vRG cells of human, macaque and mouse using pySCENIC  workflow (https://pyscenic.readthedocs.io/en/latest/tutorial.html) with default parameters. Since there is no transcriptional regulator database for macaques, macaques’ genome is similar to the human genome. When we analyzed the macaque dataset, we used the human transcriptional regulator database for pySCENIC analysis. GRNboost2 in scenic was used to infer gene regulatory networks based on co-expression patterns. RcisTarget55 was used to analyze the TF-motif enrichment and direct targets among different species databases. The regulon activity for each developmental time point was identified using AUCell, and the top enriched activated TFs were ranked by –log10(p_value). Lastly, we evaluated the interaction networks analysis of TFs and their target genes by Cytoscape56 (version 3.8.1).
Firstly, we picked up vRG cells in each species dataset and screened the differentially expressed genes (DEGs) between adjacent development time points using the “FindMarkers” function (with min.pct = 0.25, logfc.threshold = 0.25). After separate normalization of the DEG expression matrix from different species datasets, we use the “standardise” function of mfuzz to perform standardization. The DEGs of vRG in each species were grouped into five different clusters using the Mfuzz package in R with fuzzy c-means algorithm57.
Transcription Factor and RNA binding protein analysis
Mouse, macaque, and human transcription factor (TF) gene lists were downloaded from AnimalTFDB (http://bioinfo.life.hust.edu.cn/AnimalTFDB/). Mouse, macaque, and human RNA binding protein (RBP) gene lists were downloaded from EuRBPDB (http://EuRBPDB.syshospital.org). Heatmaps of the TFs and RBPs expression profiles were generated with normalized and standardized DEGs expression matrixes from mfuzz.
Funding: This work was supported by CAMS Innovation Fund for Medical Sciences (CIFMS, 2021-I2M-1-024, 2021-I2M-1-019) and National Science and Technology Innovation 2030 Major Program 2021ZD0200900.
The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive in National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences (GSA: CRA007718) that are publicly accessible at https://ngdc.cncb.ac.cn/gsa. Codes used to analyze results in this paper are available on GitHub at https://github.com/cheneylemon/developing-macaque-neocortex.
Competing Interest Statement
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Dataset S1. (separate file)
Marker list of 28 cell clusters.
Dataset S2. (separate file)
Gene expression importance across the vRG to oRG (C10-C14-C12) lineage trajectory related to Fig. 3E.( vRG: milestone2; oRG: milestone4)
Dataset S3. (separate file)
Gene expression importance across the RG to deep-layer neurons and upper-layer.
Dataset S4. (separate file)
Gene expression importance across the vRG to IPC(C10-C22-C8) trajectory related to fig. S5B.
Dataset S5. (separate file)
Gene expression importance across the OPC differentiation into astrocytes and oligodendrocytes lineage trajectory related to fig. S6E.
Dataset S6. (separate file)
The original data of normalized gene expression matrix in human, macaque, and mouse vRG related to fig.6A.
Dataset S7. (separate file)
Regulon specificity score for each timepoint in human, macaque, and mouse vRG related to fig. 6 C to H.
- 1.Fate and freedom in developing neocortical circuitsNat Commun 8https://doi.org/10.1038/ncomms16042
- 2.Neurons in rhesus monkey visual cortex: systematic relation between time of origin and eventual dispositionScience 183:425–427
- 3.Temporal patterning of apical progenitors and their daughter neurons in the developing neocortexScience 364https://doi.org/10.1126/science.aav2522
- 4.Neurogenic radial glia in the outer subventricular zone of human neocortexNature 464:554–561https://doi.org/10.1038/nature08845
- 5.Control of outer radial glial stem cell mitosis in the human brainCell Reports 8:656–664https://doi.org/10.1016/j.celrep.2014.06.058
- 6.Precursor diversity and complexity of lineage relationships in the outer subventricular zone of the primateNeuron 80:442–457https://doi.org/10.1016/j.neuron.2013.09.032
- 7.Molecular Identity of Human Outer Radial Glia during Cortical DevelopmentCell 163:55–67https://doi.org/10.1016/j.cell.2015.09.004
- 8.Genes and Mechanisms Involved in the Generation and Amplification of Basal Radial Glial CellsFront Cell Neurosci 13https://doi.org/10.3389/fncel.2019.00381
- 9.Current best practices in single-cell RNA-seq analysis: a tutorialMol Syst Biol 15https://doi.org/10.15252/msb.20188746
- 10.Progenitor cell diversity in the developing mouse neocortexProc Natl Acad Sci U S A 118https://doi.org/10.1073/pnas.2018866118
- 11.Molecular logic of cellular diversification in the mouse cerebral cortexNature 595:554–559https://doi.org/10.1038/s41586-021-03670-5
- 12.Chromatin and gene-regulatory dynamics of the developing human cerebral cortex at single-cell resolutionCell 184https://doi.org/10.1016/j.cell.2021.07.039
- 13.Spatiotemporal gene expression trajectories reveal developmental hierarchies of the human cortexScience 358:1318–1323https://doi.org/10.1126/science.aap8809
- 14.Single-cell atlas of early human brain development highlights heterogeneity of human neuroepithelial cells and early radial gliaNature Neuroscience 24:584–594https://doi.org/10.1038/s41593-020-00794-1
- 15.Human intermediate progenitor diversity during cortical developmentProceedings of the National Academy of Sciences of the United States of America 118https://doi.org/10.1073/pnas.2019415118
- 16.Temporally divergent regulatory mechanisms govern neuronal diversification and maturation in the mouse and marmoset neocortexNature Neuroscience 25:1049–1058https://doi.org/10.1038/s41593-022-01123-4
- 17.A comprehensive transcriptional map of primate brain developmentNature 535:367–375https://doi.org/10.1038/nature18637
- 18.Transcriptional architecture of the primate neocortexNeuron 73:1083–1099https://doi.org/10.1016/j.neuron.2012.03.002
- 19.Large-scale cellular-resolution gene profiling in human neocortex reveals species-specific molecular signaturesCell 149:483–496https://doi.org/10.1016/j.cell.2012.02.052
- 20.Intermediate progenitors and Tbr2 in cortical developmentJ Anat 235:616–625https://doi.org/10.1111/joa.12939
- 21.Towards improved animal models of neonatal white matter injury associated with cerebral palsyDisease Models & Mechanisms 3:678–688https://doi.org/10.1242/dmm.002915
- 22.Transcriptional repression by FEZF2 restricts alternative identities of cortical projection neuronsCell Rep 35https://doi.org/10.1016/j.celrep.2021.109269
- 23.Transcriptional priming as a conserved mechanism of lineage diversification in the developing mouse and human neocortexSci Adv 6https://doi.org/10.1126/sciadv.abd2068
- 24.Comprehensive transcriptome analysis of neocortical layers in humans, chimpanzees and macaquesNat Neurosci 20:886–895https://doi.org/10.1038/nn.4548
- 25.The Primate-Specific Gene TMEM14B Marks Outer Radial Glia Cells and Promotes Cortical Expansion and FoldingCell Stem Cell 21:635–649https://doi.org/10.1016/j.stem.2017.08.013
- 26.Aspm knockout ferret reveals an evolutionary mechanism governing cerebral cortical sizeNature 556:370–375https://doi.org/10.1038/s41586-018-0035-0
- 27.RGCC balances self-renewal and neuronal differentiation of neural stem cells in the developing mammalian neocortexEMBO Rep 22https://doi.org/10.15252/embr.202051781
- 28.Cellular and molecular properties of neural progenitors in the developing mammalian hypothalamusNat Commun 11https://doi.org/10.1038/s41467-020-17890-2
- 29.Single-Cell Transcriptomics Uncovers Glial Progenitor Diversity and Cell Fate Determinants during Development and GliomagenesisCell Stem Cell 24:707–723https://doi.org/10.1016/j.stem.2019.03.006
- 30.Endothelin-1 regulates astrocyte proliferation and reactive gliosis via a JNK/c-Jun signaling pathwayJ Neurosci 28:2394–2408https://doi.org/10.1523/JNEUROSCI.5652-07.2008
- 31.Temporal patterning of apical progenitors and their daughter neurons in the developing neocortexScience 364https://doi.org/10.1126/science.aav2522
- 32.Behavior and lineage progression of neural progenitors in the mammalian cortexCurr Opin Neurobiol 66:144–157https://doi.org/10.1016/j.conb.2020.10.017
- 33.Opposing Gradients of MicroRNA Expression Temporally Pattern Layer Formation in the Developing NeocortexDev Cell 49:764–785https://doi.org/10.1016/j.devcel.2019.04.017
- 34.Single-Cell Multi-omic Integration Compares and Contrasts Features of Brain Cell IdentityCell 177https://doi.org/10.1016/j.cell.2019.05.006
- 35.A single-cell RNA-seq survey of the developmental landscape of the human prefrontal cortexNature 555:524–528https://doi.org/10.1038/nature25980
- 36.Single-cell transcriptome analysis reveals cell lineage specification in temporal-spatial patterns in human cortical developmentSci Adv 6https://doi.org/10.1126/sciadv.aaz2978
- 37.A new subtype of progenitor cell in the mouse embryonic neocortexNature Neuroscience 14:555–561https://doi.org/10.1038/nn.2807
- 38.Spatiotemporal transcriptomic divergence across human and macaque brain developmentScience 362https://doi.org/10.1126/science.aat8077
- 39.3D Genome of macaque fetal brain reveals evolutionary innovations during primate corticogenesisCell 184https://doi.org/10.1016/j.cell.2021.01.001
- 40.An atlas of cortical arealization identifies dynamic molecular signaturesNature 598:200–204https://doi.org/10.1038/s41586-021-03910-8
- 41.Neurogenic radial glia in the outer subventricular zone of human neocortexNature 464:554–561https://doi.org/10.1038/nature08845
- 42.Transformation of the Radial Glia Scaffold Demarcates Two Stages of Human Cerebral Cortex DevelopmentNeuron 91:1219–1227https://doi.org/10.1016/j.neuron.2016.09.005
- 43.A new subtype of progenitor cell in the mouse embryonic neocortexNat Neurosci 14:555–561https://doi.org/10.1038/nn.2807
- 44.Human-specific ARHGAP11B induces hallmarks of neocortical expansion in developing ferret neocortexElife 7https://doi.org/10.7554/eLife.41241
- 45.Expression of human-specific ARHGAP11B in mice leads to neocortex expansion and increased memory flexibilityEMBO J 40https://doi.org/10.15252/embj.2020107093
- 46.Human-specific increases size and folding of primate neocortex in the fetal marmosetScience 369:546–550https://doi.org/10.1126/science.abb2401
- 47.Progenitor Hyperpolarization Regulates the Sequential Generation of Neuronal Subtypes in the Developing NeocortexCell 174:1264–1276https://doi.org/10.1016/j.cell.2018.06.036
- 48.Fezl is required for the birth and specification of corticospinal motor neuronsNeuron 47:817–831https://doi.org/10.1016/j.neuron.2005.08.030
- 49.The determination of projection neuron identity in the developing cerebral cortexCurr Opin Neurobiol 18:28–35https://doi.org/10.1016/j.conb.2008.05.006
- 50.A comprehensive transcriptional map of primate brain developmentNature 535:367–375https://doi.org/10.1038/nature18637
- 51.Transcriptomic Perspectives on Neocortical Structure, Development, Evolution, and DiseaseAnnu Rev Neurosci 40:629–652https://doi.org/10.1146/annurev-neuro-070815-013858
- 52.Spatial reconstruction of single-cell gene expression dataNat Biotechnol 33:495–502https://doi.org/10.1038/nbt.3192
- 53.Slingshot: cell lineage and pseudotime inference for single-cell transcriptomicsBMC Genomics 19https://doi.org/10.1186/s12864-018-4772-0
- 54.A comparison of single-cell trajectory inference methodsNat Biotechnol 37:547–554https://doi.org/10.1038/s41587-019-0071-9
- 55.iRegulon and i-cisTarget: Reconstructing Regulatory Networks Using Motif and Track EnrichmentCurr Protoc Bioinformatics 52:2–12https://doi.org/10.1002/0471250953.bi0216s52
- 56.Cytoscape: a software environment for integrated models of biomolecular interaction networksGenome Res 13:2498–2504https://doi.org/10.1101/gr.1239303
- 57.Mfuzz: a software package for soft clustering of microarray dataBioinformation 2:5–7