Research Article

Regulatory and coding sequences of TRNP1 co-evolve with brain size and cortical folding in mammals

Anthropology and Human Genomics, Faculty of Biology, Ludwig-Maximilians-Universität, Germany
Physiological Genomics, BioMedical Center - BMC, Ludwig-Maximilians-Universität, Germany
Institute for Stem Cell Research, Helmholtz Zentrum München, Germany Research Center for Environmental Health, Germany
Department of Environmental Microbiology, Eawag, Switzerland
Department of Environmental Systems Science, ETH Zurich, Switzerland
SYNERGY, Excellence Cluster of Systems Neurology, BioMedical Center (BMC), Ludwig-Maximilians-Universität München, Germany

Mar 22, 2023

Open access
Copyright information

Abstract
Editor's evaluation
Introduction
Results
Discussion
Materials and methods
Appendix 1
Data availability
References
Article and author information
Metrics

Abstract

Brain size and cortical folding have increased and decreased recurrently during mammalian evolution. Identifying genetic elements whose sequence or functional properties co-evolve with these traits can provide unique information on evolutionary and developmental mechanisms. A good candidate for such a comparative approach is TRNP1, as it controls proliferation of neural progenitors in mice and ferrets. Here, we investigate the contribution of both regulatory and coding sequences of TRNP1 to brain size and cortical folding in over 30 mammals. We find that the rate of TRNP1 protein evolution (ω) significantly correlates with brain size, slightly less with cortical folding and much less with body size. This brain correlation is stronger than for >95% of random control proteins. This co-evolution is likely affecting TRNP1 activity, as we find that TRNP1 from species with larger brains and more cortical folding induce higher proliferation rates in neural stem cells. Furthermore, we compare the activity of putative cis-regulatory elements (CREs) of TRNP1 in a massively parallel reporter assay and identify one CRE that likely co-evolves with cortical folding in Old World monkeys and apes. Our analyses indicate that coding and regulatory changes that increased TRNP1 activity were positively selected either as a cause or a consequence of increases in brain size and cortical folding. They also provide an example how phylogenetic approaches can inform biological mechanisms, especially when combined with molecular phenotypes across several species.

Editor's evaluation

This is an important paper that combines comparative analysis and experimental assays to investigate the role of protein-coding and regulatory changes at TRNP1 in mammalian brain evolution. The evidence supporting a contribution of TRNP1 is convincing, although the strength of the link between protein-coding changes and trait evolution is stronger and more readily interpretable than the data on gene regulation. The work will be of interest to researchers interested in mammalian evolution, brain evolution, and evolutionary genetics.

https://doi.org/10.7554/eLife.83593.sa0

Introduction

Understanding the genetic basis of complex phenotypes within and across species is central for biology. Brain phenotypes – even when as simple as size or folding – are of particular interest to many fields, because they are linked to cognitive abilities, which are of particular interest to humans (Reader et al., 2011; DeCasien et al., 2022).

Brain size and cortical folding show extensive variation across mammals, including recurrent independent increases and decreases (Montgomery et al., 2016; Boddy et al., 2012; Lewitus et al., 2013; Smaers et al., 2021). For example, most rodents have a small brain and an unfolded cortex (Kelava et al., 2013), while carnivores, cetaceans, and primates generally have enlarged and folded cortices, peaking in dolphin and human. Also within primates these traits vary, showing an increase on the great ape branch, but also decreases in several New World monkey species. Using comparative, that is, phylogenetic, approaches across primates and mammals, these variations have been correlated with different life history traits, such as longevity, diet, or energetic constraints (DeCasien et al., 2017; DeCasien et al., 2022; Heldstab et al., 2022) revealing underlying ecological factors that drive selection for larger brains.

The underlying genetic and cellular factors that are associated with these evolutionary variations in brain size and folding have not been studied across such large phylogenies. However, observational and experimental studies, especially in mice, but increasingly also in other systems like the ferret, macaques and humans, have led to major insights into the genetic and cellular mechanisms of cortical development (Pinson and Huttner, 2021; Del-Valle-Anton and Borrell, 2022; Villalba et al., 2021). Briefly, proliferation of neuroepithelial stem cells (NECs) that have contacts with the apical surface and basal lamina leads to the formation of the neuroepithelium during early development. NECs then become Pax6-positive apical radial glia cells (aRGCs), that continue to self-amplify before producing basal progenitors (BPs). BPs include basal radial glia cells (bRGCs) that remain Pax6 positive, loose the apical contact, and – depending on the species – can also self-amplify before eventually producing neurons. The extent of proliferation of all these neural progenitors is also influenced by their cell cycle length where a short cell cycle leads to more cycles of symmetric divisions, a delayed onset of neurogenesis, and subsequently to more neurons and a bigger cortex. Notably, proliferation of bRGCs at a particular cortical location is thought to be crucial to generate a cortical fold at this location. Hence, genes that influence the proliferation of these neural progenitors to evolutionary changes in brain size and folding.

The major focus in this respect has been on identifying and functionally characterizing genetic changes on the human or primate lineage. For example, the human-specific gene ARHGAP11B was found to induce bRGC proliferation and folding in cortices of mice, ferrets, and marmosets (Florio et al., 2015; Kalebic et al., 2018; Heide et al., 2020). Other examples include an amino acid substitution specific to modern humans in TKTL1 (Pinson et al., 2022), human-specific NOTCH2 paralogs (Fiddes et al., 2018; Suzuki et al., 2018), the primate-specific genes TMEM14B and TBC1D3 (Liu et al., 2017; Ju et al., 2016), and an enhancer of FZD8, a receptor of the Wnt pathway (Boyd et al., 2015). While mechanistically convincing, it is unclear whether the proposed evolutionary link can be generalized as only one evolutionary lineage is investigated. Conversely, comparative approaches that correlate sequence changes with brain size changes have investigated more evolutionary lineages (Boddy et al., 2017; Montgomery et al., 2016), but these studies lack mechanistic evidence and are limited to the analysis of protein-coding regions. Here, we combine mechanistic and phylogenetic approaches to study TRNP1, a gene that is known to be important for cortical growth and folding by influencing aRGC and bRGC proliferation and differentiation in mice (Stahl et al., 2013; Pilz et al., 2013; Kerimoglu et al., 2021) and ferrets (Martínez-Martínez et al., 2016).

On a cellular level, expressing Trnp1 in neural stem cells (NSCs) isolated from mouse cortices induces phase separation, accelerates mitosis, and increases proliferation (Stahl et al., 2013; Esgleas et al., 2020). Increasing Trnp1 expression by in utero electroporation in mice and ferrets (embryonic day 13 [E13] in mice) leads to increased proliferation of aRGCs (Stahl et al., 2013; Martínez-Martínez et al., 2016). Decreasing Trnp1 expression levels in mice or ferrets (E13) reduces aRGC proliferation, increases their differentiation into BPs, and induces cortical folding (Stahl et al., 2013; Pilz et al., 2013; Martínez-Martínez et al., 2016). Notably, increasing Trnp1 expression levels by in utero electroporation at E14.5 increases bRGC proliferation (Kerimoglu et al., 2021) and also induces cortical folding.

Hence, Trnp1 levels can alter proliferation and differentiation of neural progenitors and in turn alter brain size and folding in mice and ferrets. However, whether genetic changes in TRNP1 did alter cortical size and folding during mammalian evolution is unclear. Here, we analyse the evolution of TRNP1 regulatory and coding sequences across mammals and investigate their link to the evolution of brain size and cortical folding.

Results

TRNP1 amino acid substitution rates co-evolve with rates of change in brain size and cortical folding in mammals

We experimentally and computationally collected (Camacho et al., 2009) and aligned (Löytynoja, 2021) 45 mammalian TRNP1 coding sequences, including dolphin and 18 primates (99.0% completeness, Figure 1—figure supplement 1A). For 30 of those species, we could also compile estimates for brain size and cortical folding, as well as body mass as a potentially confounding parameter (Figure 1A; Supplementary file 1c). We quantify brain size as its weight and cortical folding as the ratio of the cortical surface over the perimeter of the brain surface, the gyrification index (GI), where a GI $= 1$ indicates a completely smooth brain and a GI gt₁ indicates higher levels of cortical folding (Zilles et al., 1989). This phenotypic data together with the coding sequences are the basis for our investigation in the evolutionary relation between the rate of TRNP1 protein evolution and the evolution of brain size and gyrification.

Figure 1 with 3 supplements see all

Download asset Open asset

TRNP1 amino acid substitution rates co-evolve with brain size and cortical folding in mammals.

(A) Mammalian species for which body mass, brain size, gyrification index (GI) measurements, and TRNP1 coding sequences were available (n=30)(Figure 1—figure supplement 1). Log2-transformed units: body mass and brain size in kg; GI is a ratio (cortical surface/perimeter of the brain surface). (B) Estimated marginal and partial correlation between ω of TRNP1 and the three traits using Coevol (Lartillot and Poujol, 2011). Size indicates posterior probability (pp). (C) TRNP1 protein substitution rates (ω) significantly correlate with brain size ( $r = 0.83$ , $p p$ = 0.97).(D) The average correlation across 124 control proteins with brain size ( $\bar{r}$ =0.10). (E) TRNP1 ω correlation with GI compared to the average across control proteins. (F) TRNP1 ω correlation with body mass compared to the average across control proteins. (**C, D, E, F**) Error bars indicate standard errors. (G) Distribution of partial correlations between ω and brain size of the control proteins and TRNP1. (H) Distribution of partial correlations between ω and GI of the control proteins and TRNP1. (I) Scheme of the mouse TRNP1 protein (223 amino acids [AAs]) with intrinsically disordered regions (orange) and sites (red lines) subject to positive selection in mammals (ω > 1, $p p > 0.95$ Figure 1—figure supplement 1). Letter size of the depicted AAs represents the abundance of AAs at the positively selected sites.

The ratio of the non-synonymous (non-neutral) and the synonymous substitution rates, $ω$ , is easily accessible and hence one of the most widespread measures of selection on protein-coding sequences, despite its limitations (Yang, 2006; Nei et al., 2000). In the absence of additional evidence, only an $ω > 1$ can be interpreted as proof of positive selection. However, an $ω$ gt₁ requires many recurrent selective events and hence is underpowered to detect moderate amounts of positive selection. Therefore, it has become common practice to identify increases of $ω$ on certain branches or subtrees relative to the remainder of the tree. For our question, we are analyzing the variation of $ω$ across branches. To this end, we use the software Coevol that allows estimating the co-variance between rates of phenotypic and evolutionary sequence changes ( $ω$ ), while both types of information go into the optimization of branch length estimates of the underlying phylogenetic tree (Lartillot and Poujol, 2011). This allows to detect a correlation between the strength of selection ( $ω$ ) and a phenotypic trait. The question remains whether this correlation is directly caused by selection on that trait, or what we observe are indirect effects. This is not uncommon, because the strength of selection depends on the effective population size ( $N_{e}$ ) of a species, which is often linked to life history traits and body size (Ohta, 1987; Lynch and Walsh, 2007). For example, species with a large body size tend to have a small $N_{e}$ and thus a low efficacy of selection (Figuet et al., 2016; Lartillot and Poujol, 2011). With purifying selection being the dominant force in protein sequence evolution, we would thus expect a positive correlation between $ω$ and body size due to indirect effects of $N_{e}$ . However, in contrast to directed selection on one trait which is targeted to specific genes, a lower efficacy in purifying selection due to $N_{e}$ will have an impact on all genes.

Therefore, we compiled a set of control genes in the same 30 species for which we have TRNP1 sequences and phenotypic data. We started with all human autosomal genes that – as TRNP1 – have only one coding exon (n=1997; Human CCDS; Pujar et al., 2018) and a similar length (n=1088; 291–999 bp vs. 682 bp of TRNP1). For 133 (12.3%) of these we could find full-length high-quality one-to-one orthologous sequences for all 30 species (Figure 1—figure supplement 3A; Supplementary file 1f; Materials and methods). To ensure the quality of the resulting multiple sequence alignments, all of them were manually inspected. Based on the overall tree length we removed one outlier ( $σ_{l o g (d S)} > 3$ ) leaving us with 132 control proteins that are well comparable to TRNP1 with respect to tree length, alignment quality, and ω (Figure 1—figure supplement 3B). Eight rather conserved genes (six with ω<0.04 and two with $ω$ <0.19) did not show an acceptable parameter convergence between runs of Coevol, leaving 124 control genes well comparable to TRNP1 (Supplementary file 1f). If a species such as human or dolphin evolved a large, gyrified brain due to positive selection on TRNP1, we expect those lineages to show an increased rate of phenotype (brain size and GI) change and an increased $ω$ . If this pattern is consistent across the majority of branches, Coevol would infer a positive correlation between $ω$ and the trait. Moreover, if this correlation is stronger than that for the average control protein, we can exclude that this is solely due to variation in the efficacy of selection.

Indeed, we find that ω of TRNP1 positively correlates with brain size (r=0.83; p=0.97), GI (r=0.75; p=0.98), and also body mass (r=0.76; p=0.97) and that these correlations are stronger than those of the average control protein (Figure 1C–F, Figure 1—figure supplement 3C), showing that the interaction between TRNP1 and the phenotypes goes beyond pure efficacy of selection effects. All three traits are highly correlated with one another. It is well known that brain and body size are not independent, and the same is true for GI and brain size (Montgomery et al., 2016; Smaers et al., 2021). To disentangle which trait is most likely to be causal for the observed correlation with ω, we compare the partial correlations and find that brain size has the highest partial correlation (r=0.4), followed by GI (r=0.34), while the partial correlation with body mass (r=0.19) has a much larger drop compared to the marginals (Figure 1B, Figure 1—figure supplement 3C), making selection on brain size and/or GI the more likely causes for the variation in ω. This said, TRNP1 is unlikely to be the sole evolutionary modifier of such an important and complex phenotype as brain size and gyrification. Because our control proteins represent a random selection of genes that based on sequence properties should give us comparable power to detect a link to these phenotypes, we can use the distribution of partial correlations of ω of the controls with brain size and GI to gauge the relative importance of TRNP1 for brain evolution (Figure 1G and H; Supplementary file 1g). We find that TRNP1 protein evolution is among 4.0% and 6.4% of the most correlated proteins for brain size and GI, respectively.

Having established that the rate of protein evolution of TRNP1 is linked to brain size evolution, we now want to pinpoint the relevant sites or domains in the protein to facilitate further functional studies. Using the site model of PAML (Yang, 1997), we find 9.8% of the codons to show signs of recurrent positive selection (i.e., $ω > 1$ , site models M8 vs. M7, $χ^{2} p$ -value <0.001, df = 2). Eight codons with a selection signature could be pinpointed with high confidence (Supplementary file 1d). Seven out of those eight reside within the first intrinsically disordered region (IDR) and one in the second IDR of the protein (Figure 1I; Figure 1—figure supplement 1B). The IDRs of TRNP1 are thought to mediate homotypic and heterotypic protein-protein interactions and are relevant for TRNP1-dependent phase separation, nuclear compartment size regulation, and M‐phase length regulation (Esgleas et al., 2020). Hence, the positively selected sites indicate that these IDR-mediated TRNP1 functions were repeatedly adapted during mammalian evolution and the identified sites are candidates for further functional studies.

TRNP1 proliferative activity co-evolves with brain size and cortical folding in mammals

Next, we investigated whether the correlation between TRNP1 protein evolution and cortical phenotypes can be linked to functional properties of TRNP1 at a cellular level. A central property of TRNP1 is to promote proliferation of aRGC (Stahl et al., 2013; Esgleas et al., 2020) and also of BPs (Kerimoglu et al., 2021). This proliferative activity can be assessed in an in vitro assay in which TRNP1 is transfected into NSCs isolated from E14 mouse cortices (Stahl et al., 2013; Esgleas et al., 2020).

To compare TRNP1 orthologues in this assay, we synthesized and cloned the TRNP1 coding sequence of human, rhesus macaque, galago, mouse, and dolphin that cover the observed range of $ω$ (Figure 1C). After co-transfection with green fluorescent protein (GFP), we quantified the number of proliferating (Ki67+, GFP+) over all transfected (GFP+) NSCs for each TRNP1 orthologue in ≥7 replicates (Figure 2A and B). We confirmed that TRNP1 transfection does increase proliferation compared to a GFP-only control (p-value $< 2 \times 10^{- 16}$ ; Figure 2—figure supplement 1A) as shown in previous studies (Stahl et al., 2013; Esgleas et al., 2020). Remarkably, the proportion of proliferating cells was highest in cells transfected with dolphin TRNP1 followed by human, which was significantly higher than the two other primates, galago and macaque (Figure 2C; Figure 2—figure supplement 1B; Supplementary file 2a-c). Indeed, the proliferative activity of TRNP1 is a significant predictor for brain size (BH-adjusted p-value = 0.0018, $R^{2} = 0.89$ ) and GI (BH-adjusted p-value = 0.016, $R^{2} = 0.69$ ) of its species of origin (phylogenetic generalized least squares [PGLS], likelihood ratio test [LRT]; Figure 2C). Note that the three primates and the dolphin are phylogenetically equally distant to the mouse (Figure 2C) and hence a bias due to the murine assay system cannot explain the observed correlations with brain size and GI. Hence, these results further support that the TRNP1 protein co-evolves with brain size and cortical folding.

Figure 2 with 1 supplement see all

Download asset Open asset

TRNP1 proliferative activity correlates with brain size and cortical folding.

(A) Five different TRNP1 orthologues were transfected into neural stem cells (NSCs) isolated from cerebral cortices of 14-day-old mouse embryos and proliferation rates were assessed after 48 hr using Ki67 immunostaining as proliferation marker and green fluorescent protein (GFP) as transfection marker in 7–12 independent biological replicates. (B) Representative image of the transfected cortical NSCs immunostained for GFP and Ki67. Arrows indicate three transfected cells of which two (solid arrows) are Ki67-positive (Figure 2—figure supplement 1). (C) Induced proliferation in NSCs transfected with TRNP1 orthologues from five different species (Supplementary file 2). Proliferation rates are a significant predictor for brain size ( $χ^{2}$ =10.04, df = 1, BH-adjusted p-value = 0.0018 = 11.75 ± 2.412, $R^{2}$ = 0.89) and GI ( $χ^{2}$ =5.85, df = 1, BH-adjusted p-value = 0.016 = 16.97 ± 6.568, $R^{2}$ = 0.69) in the respective species (phylogenetic generalized least squares [PGLS], likelihood ratio test [LRT]). Error bars indicate standard errors. Included species: human (*Homo sapiens*), rhesus macaque (*Macaca mulatta*), northern greater galago (*Otolemur garnettii*), house mouse (*Mus musculus*), common bottlenose dolphin (*Tursiops truncatus*).

Activity of a cis-regulatory element of TRNP1 likely co-evolves with cortical folding in catarrhines

Experimental manipulation of Trnp1 expression levels alters proliferation and differentiation of aRGC and bRGC in mice and ferrets (Stahl et al., 2013; Martínez-Martínez et al., 2016; Kerimoglu et al., 2021). Therefore, we next investigated whether changes in TRNP1 regulation may also be associated with the evolution of cortical folding and brain size by analyzing co-variation in the activity of TRNP1 associated cis-regulatory elements (CREs), using massively parallel reporter assays (MPRAs). To this end, a library of putative regulatory sequences is cloned into a reporter vector and their activity is quantified simultaneously by the expression levels of element-specific barcodes (Inoue and Ahituv, 2015). To identify putative CREs of TRNP1, we used DNase hypersensitive sites (DHS) from human foetal brain (Bernstein et al., 2010) and found three upstream CREs, the promoter-including exon 1, an intron CRE, one CRE overlapping the second exon, and one downstream CRE (Figure 3A). We obtained the orthologous sequences of the human CREs using a reciprocal best blat (RBB) strategy across additional mammalian species either from genome databases or by sequencing, yielding a total of 351 putative CREs in a panel of 75 mammalian species (Figure 3—figure supplement 1).

Figure 3 with 2 supplements see all

Download asset Open asset

Activity of a cis-regulatory element (CRE) of *TRNP1* correlates with cortical folding in catarrhines.

(A) Experimental setup of the massively parallel reporter assay (MPRA). Regulatory activity of seven putative TRNP1 CREs from 75 species were assayed in neural progenitor cells (NPCs) derived from human and cynomolgus macaque induced pluripotent stem cells. (Figure 3—figure supplement 1). (B) Fraction of the detected CRE tiles in the plasmid library per species across regions. The detection rates are unbiased and uniformly distributed across species and clades with only one extreme outlier *Dipodomys ordii*. (C) Fraction of the detected CRE tiles in the plasmid library per region across species. (D) Log-transformed total regulatory activity per CRE in human NPCs across species with available brain size and gyrification index (GI) measurements (n=45). (E) Total activity per CRE across species. Exon 1 (E1), intron (I), and the downstream (D) regions are more active and longer than other regions. (**B, C, E**) Each box represents the median and first and third quartiles with the whiskers indicating the furthest value no further than 1.5 * IQR from the box. Individual points indicate outliers. Figure 3—figure supplement 2 (F) Regulatory activity of the intron CRE is weakly associated with gyrification across mammals (phylogenetic generalized least squares [PGLS], likelihood ratio test [LRT] p-value=0.097, R²=0.07, n=37) and strongest across great apes and Old World monkeys, that is, catarrhines (PGLS, LRT p-value=0.003, R²=0.58, n=10).

Due to limitations in the length of oligonucleotide synthesis, we split each orthologous putative CRE into highly overlapping, 94 bp fragments. The resulting 4950 sequence tiles were synthesized together with a barcode unique for each tile. From those, we constructed a complex and unbiased lentiviral plasmid library containing at least 4251 (86%) CRE sequence tiles (Figure 3B and C). Next, we stably transduced this library into neural progenitor cells (NPCs) derived from two humans and one cynomolgus macaque (Geuder et al., 2021). We calculated the activity per CRE sequence tile as the read-normalized reporter gene expression over the read-normalized input plasmid DNA (Figure 3A, Materials and methods). Finally, we use the per-tile activities (Figure 3—figure supplement 2A) to reconstruct the activities of the putative CREs. To this end, we summed all tile sequence activities for a given CRE while correcting for the built-in sequence overlap (Figure 3D; Materials and methods). CRE activities correlate well within the two human NPC lines and between the human and cynomolgus macaque NPC lines, indicating that the assay is robust across replicates and species (Pearson’s $r$ 0.85–0.88; Figure 3—figure supplement 2B). The CREs covering exon 1, the intron, and the CRE downstream of TRNP1 show the highest total activity across species while the CREs upstream of TRNP1 show the lowest activity (Figure 3E).

Next, we tested whether CRE activity is associated with either brain size or GI across the 45 of the 75 mammalian species for which these phenotypes were available (Figure 3D). None of the CREs showed a significant association with brain size or GI (PGLS, LRT uncorrected p-value > 0.05) and only the intron CRE had a tendency to be positively associated with gyrification (PGLS, uncorrected LRT p-value=0.097, Figure 3F, left; Supplementary file 3b). Our power to detect such associations might be considerably lower than for coding sequences also because regulatory elements have a high turn-over rate (Danko et al., 2018; Berthelot et al., 2018; Huber et al., 2020). Hence, we expect that some orthologous DNA sequences that are CREs in one species do not function as CREs in others and can even be lost. The latter effect might explain why the sequences orthologous to human CREs are shorter in non-primate species more distantly related to humans (Figure 3—figure supplement 1). So phylogenetic comparisons of regulatory elements might be more powerful when restricted to species closely related to the species from which the CRE annotation is derived (humans in our case). Indeed, when we restrict our analysis to the catarrhine clade that encompasses Old World monkeys, great apes, and humans, the association between intron CRE activity and GI becomes considerably stronger (PGLS, uncorrected LRT p-value=0.003, Bonferroni-corrected for seven regions p-value=0.02, Figure 3F, right; Supplementary file 3). To validate that our model results are rather specific, we generated a null distribution for the observed correlation across catharrines, permuting the activities of all other CREs of this study. In agreement with our model results, we find 8/1000 (0.8%) of the random CRE combinations to have such a significant association of p ≤ 0.003. Moreover, the intron CRE activity-GI association was consistently detected across all three cell lines including the cynomolgus macaque NPCs (Supplementary file 3). Furthermore, Reilly et al. compared enhancer activity by histone modifications in the developing cortex of humans, rhesus macaques, and mice and found a gain in activity on the human lineage in a region overlapping the intron CRE (Reilly et al., 2015). Thus, while the statistical evidence from our MPRA data alone is limited, we consider the GI association in catarrhines together with the additional evidence from Reilly et al., 2015, strong enough to warrant a more detailed analysis of the intron CRE.

Transcription factors with binding site enrichment on intron CREs regulate cell proliferation and are candidates to explain the observed activity across catarrhines

Reasoning that differences in CRE activities will likely be mediated by differences in their interactions with transcription factors (TF), we analysed the sequence evolution of putative TF binding sites (Figure 4A). First, we performed RNA-seq on the same samples that were used for the MPRA. Notably, also TRNP1 was expressed (Figure 4B), supporting the relevance of our cellular system. Moreover, TRNP1 expression was significantly higher in human NPC lines than that of cynomolgus macaque’s (BH-adjusted p-value <0.05, Figure 4—figure supplement 1A–C), consistent with higher intron CRE activity. Among the 392 expressed TFs with known binding motifs, we identified 22 with an excess of binding sites (Frith et al., 2003) within the catarrhine intron CRE sequences (Figure 4B and D). In agreement with TRNP1 itself being involved in the regulation of cell proliferation (Volpe et al., 2006; Stahl et al., 2013; Esgleas et al., 2020), these 22 TFs are enriched in biological processes regulating cell proliferation, neuron apoptotic process, and hormone levels (Gene Ontology, Fisher’s exact p-value <0.05, background: 392 expressed TFs; Figure 4C; Supplementary file 3).

Figure 4 with 2 supplements see all

Download asset Open asset

Transcription factors (TFs) with binding site enrichment on intron cis-regulatory elements (CREs) regulate cell proliferation and are candidates to explain the observed activity across catarrhines.

(A) Orthologous intron CRE sequences show different regulatory activities under the same cellular conditions, suggesting variation in cis regulation across species. (B) Variance-stabilized expression in neural progenitor cells (NPCs) of *TRNP1* and the 22 TFs with enriched binding sites (motif weight ≥ 1) on the intron CREs. Each box represents the median, first and third quartiles with the whiskers indicating the furthest value no further than 1.5 * IQR from the box. Points indicate individual expression values. Vertical line indicates average expression across all 392 TFs (5.58), grey area: standard deviation (1.61). (C) Eight top enriched biological processes (Gene Ontology, Fisher’s exact test p-value <0.05) of the 22 TFs. Background: all expressed TFs (392). (D) Variation in binding scores of the enriched TFs across catarrhines. Heatmaps indicate standardized binding scores (grey), gyrification index (GI) values (blue) and intron CRE activities (yellow) from the respective species. TF background colour indicates gene ontology assignment of the TFs to the two most significant biological processes. The bottom panel indicates the spatial position of the top binding site (motif score >3) for each TF on the human sequence. (E) Binding scores of three TFs (CTCF, ZBTB26, SOX8) are the best candidates to explain intron CRE activity, whereas only CTCF binding shows an association with the GI (phylogenetic generalized least squares [PGLS], likelihood ratio test [LRT] p-value <0.05). (F) Predicted intron CRE activity by the binding scores of the three TFs vs. the measured intron CRE activity across catarrhines.

To further prioritize these 22 TFs, we used the motif binding scores in the 10 catarrhine intron CREs to predict the observed intron CRE activity in the MPRA and to predict the GI of the respective species. We found three TFs (CTCF, ZBTB26, SOX8) to be the best candidates to explain the variation in the intron CRE activity and one TF (CTCF) to co-vary with GI (PGLS, uncorrected LRT p-value <0.05, Figure 4D–F). While the statistical support for this association is not strong, which is expected given that we were screening 22 candidate TFs in only 10 species, CTCF ChIP-seq data from the relevant cell types suggests that this particular CTCF binding site is indeed bound by CTCF in human NPCs (ChiP-seq, Encode Project Consortium, 2012, Figure 4—figure supplement 2). Moreover, HiC data show a topologically associated domain (TAD) boundary just upstream of TRNP1 in the germinal zone of the developing human brain (postconception week 8, Won et al., 2016). Hence, variations in the binding strength of CTCF across species might likely have consequences for the stability of the TAD boundary and TRNP1 expression, affecting the associated phenotypes given its crucial role for brain development (Stahl et al., 2013).

In summary, we find a suggestive correlation between the activity of the intron CRE and gyrification in catarrhines, indicating that also regulatory changes of TRNP1 might have contributed to the evolution of gyrification.

Discussion

Previous studies in mice and ferrets have elucidated mechanisms how Trnp1 is necessary for proliferation and differentiation of neural progenitors and how it could contribute to the evolution of brain size and cortical folding. We applied phylogenetic methods to explore associations between sequence and trait evolution and found that the rate of protein evolution and the proliferative activity of TRNP1 positively correlate with brain size and gyrification in mammals. Moreover, we find tentative evidence that the activity of a regulatory element in the intron of TRNP1 might be associated with gyrification in catarrhines. At the sequence level, such a correlation could also be caused by confounding factors that affect the efficacy of natural selection such as the effective population size (Ohta, 1987; Lynch and Walsh, 2007). However, body size – a reasonable proxy for effective population size (Figuet et al., 2016; Lartillot and Poujol, 2011) – correlates much less with TRNP1 protein evolution than brain size or gyrification. Even more convincingly, the correlation of TRNP1 with brain size and gyrification is much stronger than the average correlation of these traits with the evolution of other proteins, that would have had to experience the same population size changes. Furthermore, it is unclear how an increased proliferative activity of TRNP1 or an increased CRE activity could be caused by a reduced efficacy of selection or other confounding factors. Together with the known role of TRNP1 in brain development, we think that the observed correlations are best interpreted as co-evolution of TRNP1 activity with brain size and gyrification, that is, that more active TRNP1 alleles were selected because they were advantageous to increase brain size and/or gyrification.

Of note, the effect of structural changes appears stronger than the effect of regulatory changes. This is contrary to the notion that regulatory changes should be the more likely targets of selection as they are more cell-type specific (Carroll, 2008) (but see also Hoekstra and Coyne, 2007). However, current measures of regulatory activity are inherently less precise than counting amino acid changes, which will necessarily deflate the estimated association strength (Danko et al., 2018; Berthelot et al., 2018; Huber et al., 2020). Not only is gene regulation cell-type and time-dependent, but regulatory elements also evolve much faster, making a comprehensive and informative comparison across large phylogenies much more difficult. Moreover, while MPRAs function well in deciphering the regulatory activities of individual CREs, they are still limited in their in vivo interpretation. In any case, our analysis suggests that evolution likely combined both regulatory and structural evolution to modulate TRNP1 activity.

The MPRA also allowed to identify TFs that have a binding site enrichment to the intron CRE and are likely direct regulators of TRNP1. These include INSM1 (Tavano et al., 2018), which also has been shown to control NEC-to-neural-progenitor transition, as well as other relevant factors with increased activity in human neural stem and progenitor cells during early cortical development compared to later stages, such as TFAP2A, NFIC, TCF3, KLF12, and again INSM1 (Trevino et al., 2021; de la Torre-Ubieta et al., 2018). Among the enriched TFs that bind to the intron CRE, CTCF had the strongest association with gyrification. Although CTCF is best known for its insulating properties, it can also act as transcriptional activator and recruit co-factors in a lineage-specific manner (Arzate-Mejía et al., 2018). In neural progenitors, CTCF loss causes severe impairment in proliferative capacity through the increase in premature cell cycle exit, which results in drastically reduced progenitor pool and early differentiation (Watson et al., 2014). The overlapping molecular roles of TRNP1 and CTCF in neural progenitors support the possibility that TRNP1 is among the cell-fate determinants downstream of CTCF (Wu et al., 2006; Delgado-Olguín et al., 2011). Differences between species in CTCF binding strength and/or length to the intron CRE might have direct consequences for the binding of additional TFs, TRNP1 expression, and the resulting progenitor pool. However, the effects of CTCF binding in vitro and in vivo might differ and the exact mechanism, including the developmental timing and cellular context in which this might be relevant, is yet to be disentangled.

Independent from the mechanisms and independent whether caused by regulatory or structural changes, it is relevant how an increased TRNP1 activity could alter brain development. When overexpressing Trnp1 in aRGCs of developing mice (E13) and ferrets (E30), aRGC proliferation increases (Stahl et al., 2013; Pilz et al., 2013; Martínez-Martínez et al., 2016). Similarly, overexpression of Trnp1 increases proliferation in vitro in NSCs (Stahl et al., 2013; Esgleas et al., 2020) or breast cancer cells (Volpe et al., 2006). Hence, TRNP1 evolution could contribute to evolving a larger brain by increasing the pool of aRGCs. In addition, increases in brain size and especially increases in cortical folding are highly dependent on increases in proliferation of BPs, in particular bRGCs (Pinson and Huttner, 2021; Del-Valle-Anton and Borrell, 2022; Villalba et al., 2021). Remarkably, recent evidence indicates that Trnp1 could be important also for the proliferation of BPs (Kerimoglu et al., 2021): Firstly, in contrast to non-proliferating BPs from mice, proliferating BPs from human do express TRNP1 (Kerimoglu et al., 2021). Furthermore, when activating expression of Trnp1 using CRISPRa at E14.5, more proliferating BPs and induction of cortical folding is observed (Kerimoglu et al., 2021). Hence, a more active TRNP1 can increase proliferation in aRGCs and BPs and this could cause the observed co-evolution with brain size and cortical folding. TRNP1 is the first case where analyses of protein sequence, regulatory activity, and protein activity across a larger phylogeny have been combined to investigate the role of a candidate gene in brain evolution. Functional evidence from evolutionary changes on the human lineage, for example, for ARHGAP11B and NOTCH2NL, but also phylogenetic evidence from correlating sequence changes with brain size changes (Montgomery et al., 2016; Boddy et al., 2017) indicate that a substantial number of genes could adapt their function when brain size changes in mammalian lineages. Improved genome assemblies (Rhie et al., 2021) will decisively improve phylogenetic approaches (Cavassim et al., 2022; Stephan et al., 2022; Jourjine and Hoekstra, 2021; Smith et al., 2020). In combination with the increased possibilities for functional assays due to DNA synthesis (Chari and Church, 2017) and comparative cellular resources across many species (Enard, 2012; Housman and Gilad, 2020; Geuder et al., 2021), this offers exciting possibilities to study the genetic basis of complex phenotypes within and across species.

Reagent type (species) or resource	Designation	Source or reference	Identifiers	Additional information
Gene (45 mammal species)	TRNP1	See Supplementary file 1a	See Supplementary file 1a	See Supplementary file 1a
Strain, strain background (E. coli)	NEB 10-beta	New England Biolabs; Rowley, MA, United States	Cat# C3020K	Electrocompetent E. coli
Strain, strain background (E. coli)	NEB 5-alpha High Efficiency	New England Biolabs; Rowley, MA, United States	Cat# C2987I	Chemically competent E. coli
Cell line (Macaca fascicularis)	Cynomolgus Macaque NPC	This paper, based on Geuder et al., 2021	N15_39B2	Macaca fascicularis neural progenitor cells
Cell line (Mus musculus)	N2A	ATCC; Manassas, VA, United States	CCL-131
Cell line (Homo sapiens)	HEK293T	ATCC; Manassas, VA, United States	CRL-11268
Cell line (Homo sapiens, female)	Human NPC 1	This paper, based on Geuder et al., 2021	N4_29B5	Human neural progenitor cells
Cell line (Homo sapiens, male)	Human NPC 2	This paper, based on Geuder et al., 2021	N4_12 C2	Human neural progenitor cells
Biological sample (Mus musculus)	Primary murine cerebral cortex cells (NSC)	This paper, based on Esgleas et al., 2020	primary	See Methods
Sequence-based reagent	MPRA oligo Library Trnp1 CRE	Custom Array; Redmond, WA, United States	custo	See https://github.com/Hellmann-Lab/Co-evolution-TRNP1-and-GI
Transfected construct (multiple species)	MPRA Library in lentiviral particles	This paper	custom	Lentiviral particles with pMPRA-lenti and TRNP1 CRE library
Antibody	rabbit anti Ki67 (monoclonal)	Abcam; Waltham, MA, United States	Cat# ab92742, Clone EPR3610	1:100
Antibody	chicken anti-GFP (polyclonal)	Aves Labs; Davis, CA, United States	RRID: AB_2307313, Cat# GFP-1010, Polyclonal	1:500
Recombinant DNA reagent	pCAG-GFP_Gateway plasmid	Dr. Paolo Malatesta	NA	Kind gift of Dr. Paolo Malatesta
Recombinant DNA reagent	pMDLg/pRRE plasmid	Addgene; Waterton, MA, United States	Addgene 12251
Recombinant DNA reagent	pRSV-Rev plasmid	Addgene; Waterton, MA, United States	Addgene 12253
Recombinant DNA reagent	pMD2.G plasmid	Addgene; Waterton, MA, United States	Addgene 12259
Recombinant DNA reagent	pMPRAlenti1 plasmid	Addgene; Waterton, MA, United States	Addgene 61600	Kind gift of Dr. Davide Cacchiarelli
Recombinant DNA reagent	pNL3.1[Nluc/minP] plasmid, SfiI restriction site mutated	Dr. Davide Cacchiarelli	NA	Kind gift of Dr. Davide Cacchiarelli
Recombinant DNA reagent	pMPRA1 plasmid	Addgene; Waterton, MA, United States	Addgene 49349	Kind gift of Dr. Davide Cacchiarelli
Recombinant DNA reagent	pENTR1a plasmid	Stahl et al., 2013	pENTR1a
Peptide, recombinant protein	hEGF	Miltenyi Biotec; Bergisch Gladbach, Germany	Cat#130-093-825
Peptide, recombinant protein	B-27 Supplement	Thermo Fisher Scientific; Waltham, MA, United States	Cat#12587–010
Peptide, recombinant protein	N2 Supplement	Thermo Fisher Scientific; Waltham, MA, United States	Cat#17502048
Peptide, recombinant protein	L-Ascorbic acid 2-phosphate	Sigma/Merck; St. Louis, MO, United States	Cat#A8960-5G
Peptide, recombinant protein	poly-D-lysine	Sigma/Merck; St. Louis, MO, United States	Cat# A-003-E
Peptide, recombinant protein	bFGF	PeproTech, Cranbury, New Jersey, United States	Cat#100-18B
Commercial assay or kit	GenomiPhi V2 DNA-Amplification Kit	Sigma/Merck; St. Louis, MO, United States	Cat# GE25-6600-32
Commercial assay or kit	Gateway LR Clonase Enzyme mix	Thermo Fisher Scientific; Waltham, MA, United States	Cat# 11791019
Commercial assay or kit	Lipofectamine 2000	Thermo Fisher Scientific; Waltham, MA, United States	Cat# 11668019
Commercial assay or kit	Lipofectamine 3000	Thermo Fisher Scientific; Waltham, MA, United States	Cat# L3000015
Commercial assay or kit	Micellula DNA Emulsion & Purification Kit	Roboklon; Berlin, Germany	Cat# E3600-01
Commercial assay or kit	Agilent High Sensitivity DNA Kit	Agilent; Santa Clara, CA, United States	Cat# 5067–4626
Commercial assay or kit	Nextera XT DNA Library Preparation Kit	Illumina; San Diego, CA, United States	Cat# FC-131–1024
Chemical compound, drug	GlutaMax-I	Thermo Fisher Scientific; Waltham, MA, United States	Cat# 35050038
Chemical compound, drug	Blasticidin S HCl	Thermo Fisher Scientific; Waltham, MA, United States	Cat# R21001
Chemical compound, drug	DMEM-GlutaMAX	Thermo Fisher Scientific; Waltham, MA, United States	Cat# 10566016
Chemical compound, drug	Polybrene	Sigma/Merck; St. Louis, MO, United States	Cat# TR-1003-G
Chemical compound, drug	TRI reagent	Sigma/Merck; St. Louis, MO, United States	Cat# T9424-200ML
Chemical compound, drug	Geltrex	Thermo Fisher Scientific; Waltham, MA, United States	Cat# A1413302

Share this article

Cite this article

TRNP1 amino acid substitution rates co-evolve with brain size and cortical folding in mammals.

TRNP1 proliferative activity correlates with brain size and cortical folding.

Activity of a cis-regulatory element (CRE) of TRNP1 correlates with cortical folding in catarrhines.

Transcription factors (TFs) with binding site enrichment on intron cis-regulatory elements (CREs) regulate cell proliferation and are candidates to explain the observed activity across catarrhines.

Author details

Zane Kliesmete

Contribution

Contributed equally with

Competing interests

Lucas Esteban Wange

Contribution

Contributed equally with

Competing interests

Beate Vieth

Contribution

Competing interests

Miriam Esgleas

Contribution

Competing interests

Jessica Radmer

Contribution

Competing interests

Matthias Hülsmann

Contribution

Competing interests

Johanna Geuder

Contribution

Competing interests

Daniel Richter

Contribution

Competing interests

Mari Ohnuki

Contribution

Competing interests

Magdelena Götz

Contribution

Competing interests

Ines Hellmann

Contribution

Contributed equally with

For correspondence

Competing interests

Wolfgang Enard

Contribution

Contributed equally with

For correspondence

Competing interests

Citations by DOI

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Categories and tags

Research organisms