1. Genomics and Evolutionary Biology
Download icon

Subterranean mammals show convergent regression in ocular genes and enhancers, along with adaptation to tunneling

  1. Raghavendran Partha
  2. Bharesh K Chauhan
  3. Zelia Ferreira
  4. Joseph D Robinson
  5. Kira Lathrop
  6. Ken K Nischal
  7. Maria Chikina Is a corresponding author
  8. Nathan L Clark Is a corresponding author
  1. University of Pittsburgh, United States
  2. Children’s Hospital of Pittsburgh, United States
  3. University of Pittsburgh School of Medicine, United States
  4. University of California, United States
Research Article
  • Cited 1
  • Views 1,823
  • Annotations
Cite as: eLife 2017;6:e25884 doi: 10.7554/eLife.25884

Abstract

The underground environment imposes unique demands on life that have led subterranean species to evolve specialized traits, many of which evolved convergently. We studied convergence in evolutionary rate in subterranean mammals in order to associate phenotypic evolution with specific genetic regions. We identified a strong excess of vision- and skin-related genes that changed at accelerated rates in the subterranean environment due to relaxed constraint and adaptive evolution. We also demonstrate that ocular-specific transcriptional enhancers were convergently accelerated, whereas enhancers active outside the eye were not. Furthermore, several uncharacterized genes and regulatory sequences demonstrated convergence and thus constitute novel candidate sequences for congenital ocular disorders. The strong evidence of convergence in these species indicates that evolution in this environment is recurrent and predictable and can be used to gain insights into phenotype–genotype relationships.

https://doi.org/10.7554/eLife.25884.001

eLife digest

Over the past 100 million years, many mammals, such as moles or mole rats, for example, have evolved to live almost entirely underground. During their transition to adapt to life underground, many species have reduced or completely lost their sense of sight, and often have only a small remnant of an eye that can sometimes be completely covered by skin and fur.

In addition, the sections of the DNA that usually control how the eyes form have changed in these animals. Since there is less need for a working eye in dark environments, DNA related to the eye is no longer protected from damaging mutations in mammals that live underground. So, by comparing the DNA of mammals that live aboveground and underground, scientists can identify the parts of DNA that help form mammals’ eyes.

Previous studies have discovered many sections of DNA responsible for producing the proteins that make up the eye. However, scientists know less about which sections of DNA control when and where these proteins are made. To address this, Partha et al. have studied the DNA of four underground mammals: the star-nosed mole, the cape golden mole, the naked mole-rat and the blind mole-rat.

By comparing the DNA of these animals with that of mammals that live above ground, Partha et al. identified sections of DNA that contained an abnormally high number of changes in the blind underground mammals. Many of these sections are involved in forming the eye, including controlling when and where proteins are made. Overall, the findings show that comparing rates of evolution in different species can help uncover sections of DNA that guide and influence how organisms develop.

Understanding how the eye is formed is not only of interest to scientists studying evolution and biology; it also has wider applications in healthcare. Many people suffer from unexplained eye abnormalities, and insight into the sections of DNA that control the eye’s development could help medical professionals diagnose these cases and design new treatments.

https://doi.org/10.7554/eLife.25884.002

Introduction

The subterranean habitat has been colonized by numerous animal species for its shelter and unique sources of food (Andersen, 1987; Nevo, 1979). Obligate fossorial species in particular have adopted the underground as a dedicated home, yet the intense demands of life underground often require unique specializations. For one, the air present in tunnels is often low in oxygen (hypoxic) and high in carbon dioxide (hypercapnic) (Nevo, 1979). This dark environment also requires the development of enhanced senses to compensate for loss of vision. These and other subterranean specializations have been reported in many independent evolutionary lineages of insects, amphibians, reptiles, and mammals (Leys et al., 2003; Lacey et al., 2000; Albert et al., 2007; Wilkinson, 2012). Within mammals alone, there are several unrelated subterranean species, including the true moles (family Talpidae), the African golden moles (Chrysochloridae), and the marsupial moles (Notoryctidae). There are also at least three unrelated lineages of subterranean rodents: the naked mole-rat (Heterocephalus glaber), blind mole-rats (Spalacidae), and the pocket gophers (Geomyidae).

Fossorial mammals have evolved a number of morphological and physiological traits that are considered adaptations to subterranean life, affecting how they sense their environment, how they move through it, or how they deal with its physiological demands. For one, species that dig with their forelimbs, like the star-nosed mole, have enlarged forelimbs and claws that allow them to tunnel through substrate, whereas species that dig with their teeth, such as the naked mole-rat, have reduced limbs and continuously growing incisors (Dubost, 1968; Ellerman, 1956). In the absence of light, non-visual sensory systems have been elaborated. These include the sensitive vibrissae (i.e., sensory hairs) of the naked mole-rat and the large and elaborate snout of the star-nosed mole, which has become a remarkable ‘tactile eye’ (Eloff, 1951; Hill et al., 2009; Catania, 1999). Fossorial mammals have also evolved an ability to withstand hypoxic conditions that rivals that of species living at high altitude (Nevo, 1979; Ar et al., 1977). For example, some species have high erythrocyte counts and high skeletal-muscle myoglobin to facilitate oxygen exchange. Some of these adaptive strategies are shared by different fossorial lineages and as such represent prime examples of convergent evolution (Nevo, 1979). Yet, some of the most striking convergent transformations in subterranean mammals are the reductions and losses of traits shared among aboveground mammals, of which the most prominent example is the reduction of the eye (Dubost, 1968; Hill et al., 2009; Cei, 1946a, 1946b).

Vision in many subterranean mammals is limited, and the degree of limitation in each species is related to the extent of its underground habitation (Nemec et al., 2008; Quilliam, 2009; Sanyal et al., 1990). For example, star-nosed moles (Condylura cristata) that share their time aboveground and underground possess diminutive eyes with thick eyelids (Catania, 1999), whereas the naked mole-rat , which spends almost all of its time underground, has tiny eyes that are rarely opened (Hetling et al., 2005). Even more extreme are the completely subcutaneous eyes of the cape golden mole (Chrysochloris asiatica) and the blind mole-rat (genus Nannospalax), which are thought to reflect their strictly subterranean lifestyle (Sanyal et al., 1990; Sweet, 1909). While some degree of visual regression is shared between subterranean mammals, not all visual structures and genetic pathways have regressed to the same degree. For instance, the eyes of true moles and mole-rats show anatomical regression and always exhibit a small eye but retain ocular architecture, suggesting that the basic eye developmental programs must be largely intact in these animals (Carmona et al., 2008, 2010; Quilliam, 1966). The convergent loss of vision and visual structures in subterranean mammals allows us to ask which genetic regions – coding or non-coding – contributed to regression in these species and which were conserved.

The genetic causes of these malformations have been probed through studies of blind cavefish and evolutionary analysis of retinal genes in subterranean mammals (Jeffery, 2009; Emerling and Springer, 2014). Pioneering work by Hendriks et al. found the evolutionary rate of the lens and retina protein αA-crystallin to be markedly accelerated in the Middle Eastern blind mole-rat (Spalax ehrenbergi), as would be expected under relaxed constraint (Hendriks et al., 1987). Furthermore, Emerling and Springer (2014) revealed that regressive genetic changes in retinal proteins are unevenly distributed across different visual pathways and eye tissues. Previous studies have placed more emphasis on retinal components of vision and connections to the visual cortex because it is these components that sense light and transmit images to the brain for vision (Emerling and Springer, 2014; Cooper et al., 1993). Less emphasis has been placed on the genes contributing to other eye tissues, such as the cornea.

The genomes of four subterranean mammals have been sequenced and studied for changes that have occurred in response to their unique environment. The naked mole-rat genome revealed genetic changes in key genes involved in thermogenesis and circadian rhythm, as well as gene loss and deactivating mutations in core visual perception genes (Kim et al., 2011). The genome of the blind mole-rat (Nannospalax gailili) also yielded diverse insights into its subterranean adaptations, such as an impactful change to the P53 protein that allows cells to escape hypoxia-induced apoptosis, as well as the upregulation of specific pathways involved in the response to hypoxia and hypercapnia (Fang et al., 2014). Additionally, parallel evolution was seen in the deactivation of visual perception genes in the blind mole-rat and naked mole-rat. The convergence of such changes provides evidence that they have occurred in response to the subterranean environment rather than as a result of unrelated species-specific conditions or neutral processes, highlighting a potential strategy to discover additional genetic regions showing a similar response (Losos, 2011; Stern, 2013; Rosenblum et al., 2014).

Previous studies have used convergent evolution to reveal genetic changes that are related to environmental shifts without a priori expectations of which regions might respond. One strategy has been to search for convergent amino acid substitutions at specific protein sites (Foote et al., 2015; Liu et al., 2010; Dobler et al., 2012). A complementary strategy is to search for convergent changes in selective pressure on larger functional regions, such as genes or regulatory sequences, because evolution at different nucleotides within a gene could nevertheless lead to convergent phenotypic effects. In practice, convergent changes in selective pressure are inferred by studying evolutionary rates, because selective constraint slows evolution, whereas lack of constraint and adaptation speed it. Computational methods employing this strategy search for functional elements whose evolutionary rates changed on those branches exhibiting the convergent environmental change (Marcovitz et al., 2016; Hiller et al., 2012; Chikina et al., 2016; Lartillot and Poujol, 2011). One demonstration of this approach by our group identified genes that convergently responded when mammalian lineages shifted from a terrestrial to a marine environment (Chikina et al., 2016). Another recent study by Prudent et al.(2016) demonstrated that regions showing convergent rate acceleration in the subterranean environment were enriched in visual perception genes and also contained circadian rhythm genes. Together, these studies show the promise of convergent rates to reveal genes underlying major changes in morphology and physiology that are related to drastic environmental shifts.

To investigate the demands placed upon subterranean species by their extreme environment, we searched for genes exhibiting convergent rate changes in four subterranean mammals. We report a large set of genes showing marked relaxation of constraint in subterranean species, which were highly enriched for visual functions. This set also contained many genes of undetermined function, which could be unrecognized causative genes in eye-related diseases. Finally, we pinpointed the eye-specific transcriptional enhancers in the Pax6 gene region using a new variant of our method and demonstrated the potential to detect new eye-specific enhancers at key developmental genes.

Results

Many genes have altered evolutionary rates specifically in subterranean mammals

We first sought to identify the genes that responded to conditions in the subterranean environment. Accordingly, we used relative evolutionary rate (RER) methods to identify protein-coding genes that evolved at a more rapid rate specifically on subterranean branches of the mammalian phylogenetic tree. Subterranean branches consisted of those leading to the star-nosed mole (Condylura cristata), the cape golden mole (Chrysochloris asiatica), the naked mole-rat (Heterocephalus glaber) and the blind mole-rat (Nannospalax galili). Each of these species represents a lineage that independently colonized the subterranean habitat, as each is more closely related to aboveground mammals than they are to each other (Figure 1A). Hence, similar phenotypic changes within these species are regarded as convergent traits. To demonstrate our RER methods, we first present the case of the eye-specific gene LIM2, which encodes Lens intrinsic membrane protein 2. First, the amount of amino acid divergence in LIM2 on each mammalian branch was quantified using sequences from 39 species and standard evolutionary models (Figure 1B) (see Materials and methods). The resulting LIM2 tree is markedly different from the genome-wide average tree in Figure 1A, and reveals distinctly high amounts of divergence in LIM2 for the four subterranean species. This rapid divergence probably resulted from loss of selective constraint in the dark subterranean environment. To quantify this rate acceleration in the LIM2 tree, we normalized all branch lengths for the expected amount of change as defined by the genome-wide average divergence for each branch. This average, after scaling (see Materials and methods), should reflect both the underlying speciation times in the mammalian phylogeny as well as changes in demographic factors affecting substitution rates. The resulting RER values for each branch are plotted in Figure 1C. An RER of zero indicates that LIM2 evolved at exactly the expected rate on that branch, while positive and negative values reflect faster and slower rates, respectively. By examining RERs it becomes clear that LIM2 changed at abnormally rapid rates in the four subterranean mammals; the rates for all four subterranean species are more rapid than all aboveground species, and this difference is supported statistically (p=0.00084, Mann-Whitney U test). Thus, extending the RER calculations to all other genes, we can distinguish the functions that responded during adaptation to the subterranean environment. Importantly, the convergence of these species allows us to confidently infer genes that responded specifically to subterranean life, because faster rates in all four species are not likely to be due to random fluctuations, as reflected by the low P-value for LIM2.

Lens intrinsic membrane protein 2 (LIM2) evolutionary rates across species.

(A) Mammalian transitions to a subterranean environment occurred in four lineages shown in red. The branch lengths on the mammalian tree reflect the average evolutionary rate across 18,980 protein-coding genes. (B) LIM2 protein-coding sequence shows accelerated rates of evolution on subterranean branches compared to those on the proteome-wide average tree. (C) Relative evolutionary rates of LIM2 showed the strongest acceleration on the subterranean branches amongst all of the genes studied. Illustrations by Michelle Leveille (Artifact Graphics).

https://doi.org/10.7554/eLife.25884.003

We performed the same RER analysis on 18,980 protein-coding genes to determine which shifted to faster or slower evolutionary rates specifically in subterranean species. We will hereafter refer to such genes as ‘mole-accelerated’ and ‘mole-decelerated’, respectively (see Materials and methods). At a false discovery rate (FDR) of 15%, we identified 55 mole-accelerated genes. We expect mole-accelerated genes to result from either selection for amino acid changes (i.e., positive Darwinian selection) or, alternatively, from a reduction in purifying selection, as suggested for the LIM2 protein. At the other extreme, we identified 1306 mole-decelerated genes at the same FDR. We expect genes to show rate deceleration if there is stronger purifying selection on that gene’s function in the subterranean environment, perhaps as the result of increased importance for fitness.

Vision-related functions are highly enriched among mole-accelerated genes

Genes with the strongest evidence of mole-acceleration were consistently associated with function in two organs, eye and skin. To illustrate, 17 of the top 30 mole-accelerated genes are expressed solely in eye tissues or are associated with eye-related disorders, whereas three accelerated genes are associated with skin, hair, and nails (Table 1). Among the genes showing very strong signals of mole-acceleration, we find proteins tha are specifically expressed in tissues of the eye such as the retina-specific proteins ROM1 and GNAT1 (Figure 2). The complete list of the 55 mole-accelerated genes similarly contains a large proportion that are related to vision and external tissues (Supplementary file 1), and they were highly enriched for functional annotations including eye morphology, photoreceptors, visual signal transduction, and eye-related mutant phenotypes (Table 2, Supplementary file 2). The strength of this enrichment is clearly illustrated by examining all genes annotated to the Gene Ontology (GO) term ‘visual perception’, because a large fraction of genes that have this annotation are ranked very highly in the list of mole-accelerated genes (Figure 3A ‘subterranean’). Furthermore, if we were to employ mole-acceleration as a sole predictor of visual function, a search would correctly identify many known visual perception genes with high accuracy, even when searching the entire genome (Figure 3B). This strong enrichment allows us to pose specific hypotheses in subsequent sections about which tissues and genetic pathways were altered during the regressive evolution of the eye.

Relative evolutionary rates of two retinal proteins across species.

Relative evolutionary rates of two retinal proteins, (A) Retinal outer segment membrane protein 1 (ROM1) and (B) Rod cell-specific G protein, subunit alpha (GNAT1), show strong acceleration in the subterranean mammals (marked in red).

https://doi.org/10.7554/eLife.25884.004
Enrichment of visual perception genes.

(A) Histogram of the rankings of 189 visual perception genes based on their mole-acceleration. We see a clear enrichment of the genes with low rank numbers, reflecting the strong signal of mole-acceleration in visual perception genes. As a control, we use four non-subterranean species, and as expected, genes involved in vision do not show convergent rate acceleration. (B) Mole-acceleration can equivalently serve as a predictor for function in visual perception. The plot shows the Precision-Recall values at varying p-value thresholds reflecting the fraction of visual perception genes significant at a particular threshold (Precision) and the fraction of visual perception genes retrieved at the same threshold (Recall). We see that mole-acceleration specifically identifies visual perception genes with high precision when compared to acceleration in two sets of four non-subterranean control species.

https://doi.org/10.7554/eLife.25884.005
Table 1
Top 30 of 55 subterranean-accelerated genes.
https://doi.org/10.7554/eLife.25884.006
GeneP-valueTissuesDescription
LIM2*0.00084LensLens intrinsic membrane protein 2
CRYBB3*0.00087LensLens-specific crystallin, beta B3
R0M1*0.00096RetinaRetinal outer segment membrane protein 1
CRYBA1*0.00098LensLens-specific crystallin, beta Al
CRYGC*0.00119LensLens-specific crystallin, gamma C
CRYBB2*0.00128LensLens-specific crystallin, beta B2
GPR89B0.00130UbiquitousG-protein-coupled receptor 89B, pH mediator in Golgi
GNAT1*0.00133RetinaRod cell-specific G-protein, subunit alpha
GPRS9A0.00134UbiquitousG-protein-coupled receptor 89A, pH mediator in Golgi
NRL*0.00138RetinaNeural retina leucine zipper responsible for expression of rhodopsin
CRYGS*0.00146LensLens-specific crystallin, gamma S
GRM6*0.00150RetinaMetabotropic glutamate receptor 6, required for normal vision
GBX20.00165EmbryoGastrulation brain homeobox 2, developmental transcription factor
LGSN*0.00171LensLengsin, lens protein with glutamine synthetase domain
CRYBB1*0.00183LensLens-specific crystallin, beta Bl
KLHDC30.00186UbiquitousKelch-domain-containing 3, high expression in brain
KRT81#0.00186Hair and nailsKeratin 81, primarily in hair cortex
WDFY10.00192UbiquitousWD repeat and FYVE-domain-containing 1, endosomal protein
KRT9#0.00195SkinKeratin 9, specific to palms of hands and soles of feet
POMP#0.00199UbiquitousProteasome maturation protein, associated with rare skin disorder
RRH*0.00201RetinaRetinal pigment epithelium-derived rhodopsin homolog
DPCD*0.00201Ciliated cellsDeleted in primary ciliary dyskinesia; maintenance of ciliated cells
RAD54L0.00217UbiquitousRAD54-like: DNA double-strand break repair
TATDN10.00235UbiquitousTatD DNase-domain-containing 1
ITLN20.00244Small intestineIntelectin 2, may play a role in defense against pathogens
STX3*0.00245UbiquitousSyntaxin 3, associated with congenital cataracts and intellectual disability
SKJV2L*0.00254UbiquitousDEAD box protein, yeast SKI2 homolog, implicated in macular degeneration
DPY19L10.00254Ubiquitousdpy-19-like 1 (Caenorhabditis elegans), probable C-mannosyltransferase
TFPT0.00266UbiquitousTCF3 (E2A) fusion partner (in childhood leukemia)
RSI*0.00275RetinaRetinoschisin 1, extracellular protein involved in organization of retina
  1. *related to vision.

    #related to skin and hair.

  2. Refer to Supplementary file 1 for a full list of the subterranean-accelerated genes.

Table 2
Representative enriched functions in mole-accelerated genes.
https://doi.org/10.7554/eLife.25884.007
Functional annotationFold enrichmentp-valueFDR q-value
Visual perception23.166.84E-161.02E-11
Sensory perception of light stimulus22.699.12E-L66.82E-12
Sensory perception8.475.83E-102.91E-06
Neurological system process5.391.75E-076.53E-O4
Detection of light stimulus29.577.04E-072.10E-03
Detection of light stimulus involved in sensory perception56.351.92E-054.77E 02
Detection of light stimulus involved in visual perception56.351.92E-054.09E-02
Detection of external stimulus14.382 49E-054.66E-02

We performed a control analysis to demonstrate that these functional enrichments are unique to subterranean species. We chose four aboveground species (Control species) for which there is no reason to expect phenotypic convergence and whose branch lengths are similar to the moles – pika, guinea pig, squirrel and cow. Whereas mole-accelerated genes were enriched in 15 GO categories at a FDR of 15%,control-accelerated genes had no enriched categories at the same FDR (Supplementary file 2). Furthermore, these control species showed no enrichment of visual perception genes specifically (Figure 3).

There were also 1306 mole-decelerated genes that evolved at significantly slower rates in subterranean species than in other mammals (Supplementary file 3). Although mole-decelerated genes are individually significant, only one GO category showed significant functional enrichment – GO Biological Process: Nucleic acid binding transcription factor activity – at an FDR of 15% (Supplementary file 4). A similar control analysis showed 626 genes as being significantly decelerated at an FDR of 15%, and these control-decelerated genes were enriched in 24 GO categories. Therefore, despite there being vastly more mole-decelerated genes than mole-accelerated genes, mole-decelerated genes as a group do not show strong functional enrichment. This result stands in stark contrast to the strong enrichment seen in the mole-accelerated genes.

Most mole-accelerated genes are under relaxed constraint

Accelerated rates could have resulted from adaptive evolution or, alternatively, from relaxation of constraint. We distinguished between these scenarios using codon-based evolutionary models to detect signatures of adaptive evolution. We tested whether the nonsynonymous to synonymous rate ratio (dN/dS) was significantly greater than 1 – the expectation for positive selection – for any portion of the gene specifically on the subterranean species branches, and also more generally across the entire mammalian phylogeny (Yang, 2007). Of the top 55 mole-accelerated genes, only one gene rejected a neutral model not allowing dN/dS ratios exceeding 1 in favor of a model allowing positive selection (dN/dS > 1) on subterranean branches (Supplementary file 5). This gene is involved in connective tissue and hair structure (KRTAP17-1).

The other accelerated genes did not show evidence of adaptive evolution and thus are probably under relaxed constraint. Almost all accelerated genes rejected a model requiring them to have identical constraints in all mammals (model M1) in favor of a model that allowed subterranean-specific relaxation of constraint (model BS1) (Supplementary file 5). Some of these genes seem to have lost all functional constraint because they show genetic lesions such as stop codons and frameshifts in some subterranean species (Supplementary file 6). This evidence of relaxed constraint is consistent with the expectation that some vision-related genes have been undergoing regressive evolution.

Skin-related genes were accelerated possibly in response to the demands of tunneling

The fossorial lifestyle of subterranean species has selected for traits related to digging and locomotion underground (Nevo, 1979). Perhaps because of this selective pressure, many of the top mole-accelerated genes encode proteins that are structural components of skin, hair and epithelial connective tissues. The reasons for their acceleration are the result of relaxation of constraint on their coding sequence. Genes encoding keratin proteins 9, 12, and 81 (KRT9, KRT12, KRT81) were studied using codon models, and the results indicated that they experienced relaxed constraint in subterranean species but not positive selection for amino acid diversification (Supplementary file 5). They contain early stop codons in multiple subterranean species, which is consistent with complete loss of constraint (Supplementary file 6).

The convergent acceleration and pseudogenization of KRT9 is particularly interesting in relation to burrowing (Figure 4). In mice, KRT9 expression is confined to footpads, and Krt9−/− null mutants develop footpad calluses due to hyperproliferation of skin (Fu et al., 2014). In humans, KRT9 is expressed solely on the palms of hands and the soles of feet, and mutations lead to a skin disorder characterized by hyperkeratosis (thickening) of the surfaces of palms and soles – epidermolytic palmoplantar keratoderma (Hennies et al., 1995). By extension, the loss of KRT9 in subterranean species may also have led to hyperproliferation of footpads, which could carry benefits for tunneling. For example, the star-nosed mole digs with its forepaws, and naked mole-rats collect and remove dirt with their feet (Jarvis and Sale, 2010; Hamilton, 1931). Such abrasive tasks could place high demands on the footpad surfaces. In addition, mole-acceleration of the POMP gene could similarly have resulted from demands on footpads. A human mutation in POMP is associated with KLICK syndrome, a skin disorder also characterized by hyperproliferation and thickening of palms and footpads (Dahlqvist et al., 2010).

Relative rates of footpad-specific keratin 9 (KRT9).

KRT9 shows strong acceleration on the subterranean branches. The image shown is the footpad of the star-nosed mole, showing characteristic hyperkeratosis. Keratin 9 mutations also lead to hyperkeratosis in mouse models and humans. Illustrations by Michelle Leveille (Artifact Graphics).

https://doi.org/10.7554/eLife.25884.008

In addition, we discovered skin- and hair-related genes showing evidence of positive selection rather than of loss of function in subterranean species. Although these genes were not significantly mole-accelerated at a FDR of 15%, they potentially reflect functional changes in response to subterranean adaptation. One such gene, COL4A4, a gene encoding a subunit of Type IV collagen, was strongly accelerated, did not contain genetic lesions, and showed evidence of positive selection in subterranean species (Supplementary file 1,5,6). Type IV collagen is the major structural component of the basal lamina in many tissues, including skin epithelium, and is composed of six subunits, three of which (COL4A4, COL4A5 and COL4A3) were notably mole-accelerated. On average, the six subunits were more accelerated than 71% of all other genes, which is a significant difference (p=0.0342, Mann-Whitney U test). Whereas Type IV Collagen seems to have responded to the subterranean environment, other major components of the basal lamina, the laminin proteins (e.g., LAMA1), were not notably accelerated.

Regressive evolution is limited to the lens, retina, and eye-specific developmental genes

In order to compare how specific eye tissues have evolved in subterranean species, we first compiled tissue-specific gene sets using expression data from 91 mouse tissues (Su et al., 2004). We identified tissue-specific genes for cornea, iris, lens and retina by selecting those genes with significant differential expression in the tissue of interest but not in other tissues. Using literature, we also compiled a set of 71 important eye developmental genes (Supplementary file 7). We first asked whether there is a relationship between the degree of tissue-specificity and the degree of mole-acceleration measured as the difference in dN/dS between subterranean and aboveground species (Figure 5A). We found a clear positive correlation between eye tissue-specificity and mole-acceleration, which is consistent with a greater relaxation of constraint on genes with few or no roles outside the eye. Next, we asked which genes with eye-tissue-specific expression showed acceleration and found that genes that are specifically expressed in the cornea (a protective tissue of the outer eye) and the iris were not accelerated in subterranean species when compared to a set of randomly chosen genes (background) (Figure 5B and C). By contrast, many lens- and retina-specific genes are accelerated. On average, lens genes are more accelerated than 84% of background genes, and retina genes are more accelerated than 82% (p=9.07×10−6 and p=6.10×10−10 for lens and retina, respectively, Mann-Whitney U test). The contrast between the front and the interior of the eye suggests that the sensory functions of the inner eye, such as phototransduction and the visual cycle, are under relaxed constraint, whereas the protective function of the cornea is not. Indeed, two of these subterranean species have eyes that are open to the environment, such that the cornea may continue to serve as a barrier to pathogens and debris.

Tissue-specific retinal and lens genes are highly accelerated in subterranean species.

(A) Ocular genes that are more tissue-specific exhibit stronger acceleration in subterranean ‘mole’ species. The y-axis represents the change in the rate of evolution on branches shifting to a subterranean environment. (B) Panels of tissue-specific genes were tested for their relative accelerations in the subterranean mammals. One hundred randomly chosen ‘background’ genes were not faster or slower on average, and provide an estimate of the variance expected for random genes. Retina- and lens-specific genes show many cases of acceleration in the subterranean environment, and their distributions are significantly elevated when compared to background (p=1.4×10−5 and 3.2 × 10−4, respectively). (C) Representation of average mole-acceleration for genes specifically expressed in four different tissues of the eye.

https://doi.org/10.7554/eLife.25884.009

Eye developmental genes as a whole were not accelerated compared to background, which may reflect the fact that most of them, such as Sonic Hedgehog (Shh), are important in the development of non-eye tissues. However, five eye-specific developmental genes were notably present at the top of the accelerated list (Vax2, Nrl, Foxe3, Crx and Aldh3A1), whereas no eye-specific genes were found lower in the list (Supplementary file 7). This is consistent with the positive relationship between eye-specificity and relaxation of constraint (Figure 5A).

Eye-specific enhancers of PAX6 show convergent acceleration in subterranean mammals

Although we observe specific instances in which eye developmental genes show accelerated rates in subterranean mammals, there is no significant global trend. This is understandable given that a majority of these developmental transcription factors have important roles in the development of non-eye-related tissues. For example, Pax6 is important in the development of pancreas and brain in addition to the eye (Kleinjan et al., 2006; Kammandel et al., 1999; Xu et al., 1999). Hence the protein-coding sequences of the transcription factors encoded by these genes experience selective pressure against deleterious mutations. However, regulatory regions controlling the expression of these developmental genes in the eye might be under relaxed constraint in subterranean mammals, given the relaxation of the need to maintain the functionality of visual pathways. We hypothesize that these eye-specific cis-regulatory elements (CREs) would thus show accelerated rates of evolution in the subterranean mammals.

We tested this hypothesis by applying our evolutionary-based method toward identifying eye-specific regulatory elements controlling the expression of the developmental transcription factor PAX6. We chose the PAX6 system because extensive effort has gone into characterizing the spatiotemporal regulation of its expression (Kleinjan et al., 2006; Kammandel et al., 1999; Xu et al., 1999; Kleinjan et al., 2001; Dimanlig et al., 2001; Griffin et al., 2002), and there exists comprehensive annotation ofCRE that control the expression of PAX6 in various tissues including the eye. On the basis of existing literature on the transcriptional regulation of Pax6 expression, we identified a 500-kb window containing Pax6 and its neighboring gene Elp4 as our genomic window of interest (Kleinjan et al., 2006). Experiments involving transgenic mice revealed various tissue-specific enhancers in a 200-kb region within this genomic window to be important for Pax6 expression. We subsequently identified 150 highly conserved non-coding elements in this genomic window and estimated their evolutionary rates on each mammalian branch. We then calculated the relative rates of the branches using the same projection operator method as was employed for the protein-coding gene trees. We then employed the Mann-Whitney U hypothesis-testing framework to identify non-coding elements that have evolved at an accelerated rate specifically on the subterranean branches (Materials and methods).

The results of our analyses show that the three regions showing the strongest signals of convergent acceleration in the subterranean mammals extensively overlap the regions previously annotated to be enhancers important for regulation in eye-specific tissues (Figure 6A, B). (i) ‘cre149’ is a 558-basepair (bp) region containing the 530-bp region annotated as the ‘alpha, intron 4 retinal’ enhancer (Kammandel et al., 1999). (ii) ‘cre21’ is a 552-bp region located within the fragment containing HS2 and HS3 of the Distal Regulatory Region, a retina-specific enhancer of PAX6 (Kleinjan et al., 2001). (iii) ‘cre86’ is a 429-bp region containing the 341-bp long ‘ectodermal enhancer’, which has been shown to be important in driving the expression of PAX6 in the developing lens (Dimanlig et al., 2001). Regions overlapping an enhancer element shown to be regulating PAX6 expression in lens, hindbrain and diencephalon (the ‘EI’ enhancer element) do not show significant rate acceleration in the moles (Kleinjan et al., 2001). This is in concordance with our expectation that only eye-specific elements show convergent acceleration, and hence the regions overlapping the EI enhancer do not show acceleration given their importance for PAX6 expression in non-eye tissues. Similarly, a 120-bp region overlapping the pancreas enhancer also does not show significant rate acceleration in the moles, as expected (Kleinjan et al., 2006; Xu et al., 1999). In addition to the eye-specific enhancer elements, we observe other regions showing comparable rate acceleration in the moles that are not yet characterized (Supplementary file 8). These regions are candidate CREs for PAX6 expression in the eye. This preliminary study of the Pax6 transcriptional regulatory module serves to confirm our hypothesis that eye-specific regulatory elements are under relaxed constraint and thus show accelerated rates of evolution in the subterranean mammals.

Mole-acceleration of eye-specific enhancers in the Pax6 gene region.

(A) Genomic region spanning Pax6 and its neighbor Elp4. The exons and introns of the two genes are represented by black blocks and lines respectively, whereas the conserved non-coding regions analyzed are represented in light blue. The conservation signal as given by the 100 vertebrates Basewise Conservation is shown in dark blue. The mole-acceleration scores for these regions are represented in red. The three most accelerated non-coding regions identified in this analysis are consistent with the eye-specific enhancers regulating Pax6 expression in the eye. (B) The mole-acceleration scores for the three eye-specific enhancers of Pax6 are the highest among 150 regions analyzed, including enhancers of other tissues and uncharacterized non-coding regions. (C) The relative rates in each species for the most accelerated region ‘cre149’.

https://doi.org/10.7554/eLife.25884.010

Mole-accelerated non-coding elements are strongly enriched near transcription factor genes driving eye development

Expanding from our analysis of Pax6, we performed a large-scale scan for convergently accelerated non-coding elements near transcription factor genes in the mammalian genome. We compiled two sets of transcription factor genes – one comprising 20 genes known to be important in eye development (the Eye set), such as Pax6, Pax2, Otx2, and another set consisting of an equal number of tissue-specific transcription factor genes that are expressed in other tissues and with no evidence of expression in eye (the Other set), which includes Hoxa9, Pax8 and Sox13 (Supplementary file 9). We identified 200 conserved non-coding elements near each gene in both sets, totaling to 8000 elements split equally between the two gene sets (see Materials and methods). We subsequently applied our method and calculated the mole-acceleration of each element. This large-scale scan revealed a total of 17 elements as convergently accelerated at an FDR of 10% (Figure 7A, Supplementary file 10). Fourteen of the 17 elements are found near to genes belonging to the Eye set, reflecting a significant enrichment of mole-accelerated elements near transcription factor genes driving eye development (Hypergeometric test, p-value=0.001). We subsequently checked the genomic locations of these mole-accelerated elements to ensure that they are not clustered at the same locus for instance. These 17 elements are found close to 14 unique genes, with 11 unique genes belonging to the Eye set, and three genes belonging to the Other set, further showcasing the strong enrichment of unique eye developmental transcription factor genes close to mole-accelerated elements (Hypergeometric test p-value = 0.0016).

Evidence of mole-acceleration in candidate eye-specific enhancers.

(A) Enrichment of mole-accelerated elements near eye developmental transcription factor genes. The bar plot shows the 17 mole-accelerated conserved non-coding elements identified. Fourteen of the 17 elements are present near transcription factor genes in the Eye set, denoted in red. (B) FANTOM5 Eye enhancers show strong mole-acceleration. The plot shows the relative proportion of FANTOM5 eye enhancers identified among all enhancers significant at the corresponding p-value threshold. We see a strong enrichment of eye enhancers identified at low mole-acceleration p-values (red points) whereas no such enrichment is observed using control-species-acceleration p-values (blue points).

https://doi.org/10.7554/eLife.25884.011

FANTOM5 eye enhancers show strong convergent acceleration in subterranean mammals

The FANTOM5 consortium has identified putative enhancer sites in the human and mouse genomes based on bidirectional enhancer transcription across tissues as well as at multiple developmental time points (Andersson et al., 2014). These putative enhancer sites include genomic regions that are transcribed in the eyeball of mouse embryos at four developmental time points. On the basis of this resource, we compiled two sets of FANTOM5 enhancer sites (see Materials and methods) – a set consisting of 900 genomic regions with non-zero expression in the eyeball across four developmental time points (‘Eye’ enhancers), and another set consisting of 6000 regions with zero expression across the same samples (‘Other’ enhancers). We subsequently calculated the convergent rate acceleration of these genomic elements in the four subterranean mammals and compared the acceleration observed for the ‘Eye’ enhancers to that of the ‘Other’ enhancers. Our analysis revealed a strong enrichment of FANTOM5 ‘Eye’ enhancers showing convergent rate acceleration in the four subterranean species when compared to the four control species (Figure 7B). We observe 62 FANTOM5 enhancers in total that showed significant mole acceleration at an FDR of 15% (Supplementary file 11). Fifteen of these correspond to the FANTOM5 Eye enhancers set, reflecting a significant enrichment of detected FANTOM5 eye enhancers using mole-acceleration (Hypergeometric test p-value=0.006).

Some aboveground species exhibit gene acceleration indicative of their altered visual capacities

To understand differences in the visual capabilities of mammals systematically, we studied the overall relative rates of evolution of visual genes across all mammals. Our gene set of interest (189 genes in total) was comprised of all genes with ‘Visual perception’ GO term annotation, excluding developmental transcription factors. For each species, we then calculated the mean relative rate across all of the genes (Figure 8). We observed the four subterranean mammals to be among the accelerated species (with mean >0), as was our expectation. However, we also observed aboveground species with overall rate accelerations comparable to those of the moles, such as the armadillo, the thirteen-lined ground squirrel, the big brown bat, David’s myotis bat and a shrew. Notably, all of these mammals show varying types of visual regression: the armadillo has poor vision characterized by a lack of cone cells in the retina (McDonough and Loughry, 2013), and shrews also have poor vision and diminutive eyes, which in some species are hidden in fur (Nowak, 1999). The nocturnal big brown bat and David’s myotis bat possess reduced eyes and rely on echolocation for navigation (Koay et al., 1998). The thirteen-lined ground squirrel displays a rare visual trait: the central region of its retina is dominated by cone photoreceptors in contrast to the retinas of most mammals (Kim et al., 2016). These scenarios could have important implications because the ground squirrel is used as a model for vision research (Li et al., 2010; Chen and Li, 2012).

Some aboveground species show accelerated rates of evolutionary change in visual perception genes.

On the basis of the relative evolutionary rates across all species for 189 genes with the GO term annotation ‘visual perception’, we calculated the species-wise mean relative rate across of the genes. Our previous observations of mole-acceleration in visual perception genes are recapitulated here – the four subterranean mammals are among the species that show an accelerated rate across these genes. Interestingly, we find other non-subterranean species showing acceleration comparable to the subterranean mammals, indicating adaptations in visual systems.

https://doi.org/10.7554/eLife.25884.012

Discussion

The independent transitions of four mammals to a subterranean environment has been accompanied by convergent phenotypic changes that have arisen as a result of adaptation to new environmental stresses in the underground ecotope (Nevo, 1979; Leys et al., 2003; Lacey et al., 2000; Albert et al., 2007; Wilkinson, 2012). Here, we report a genome-wide effort encompassing both coding and regulatory regions to identify the changes in genotype that have accompanied this phenotypic adaptation by studying changes in their evolutionary rates. Our study reveals that genes showing convergent acceleration in subterranean species are highly enriched for function in visual pathways. The decreased selective pressure on visual pathways in the dim-light subterranean environment leads to a relaxation of constraint on genetic elements involved in various eye-related phenotypes, including eye morphology, photoreception and visual transduction. In addition to accelerated change in genes in visual pathways, we observe an accelerated rate of evolution of many genes involved in skin-related phenotypes in the subterranean mammals. Whereas we see accelerated change in visual genes primarily as a result of relaxation of constraint, we see that some skin-related genes also show accelerated change due to positive selection, perhaps as a result of selection of traits contributing to a fossorial lifestyle. Aside from these two phenotypes, we do not observe a comparably strong enrichment for genes involved in the other environmental challenges associated with a subterranean lifestyle, such as hypoxia, hypercapnia and high infectivity. It is possible that the subterranean mammals may show species-specific adaptations to these stresses, whereas our analysis from a convergent evolutionary perspective reflects changes common to all the species.

Closer examination of the accelerated genes that are enriched for vision-related pathways reveals that accelerated genes tend to be lens- or retina-specific. On the other hand, genes encoding specifically for the outer ocular structure, the cornea, do not show significant acceleration, indicating the preservation of developmental programs that are important for ocular architecture. In two of the four moles with non-subcutaneous eyes, the cornea can come into direct contact with the external environment, perhaps necessitating the proper development of the structure in the highly infective subterranean niche. Lens- and retina-specific genes that are involved with the processes of photoreception and phototransduction would be under greater relaxed constraint given the dim-light environment, accruing damaging mutations at a much higher rate. Our analyses also reveal that genes that are associated with congenital eye diseases are accelerated in the four subterranean mammals. For the lens, which is largely made up of crystallins, we find many crystallin genes (Crybb3, Cryba1, Crybb1, Crygc, Crygs, etc.) in our accelerated set of genes contributing to various forms of cataracts (Graw, 2009). Similarly, we find multiple genes involved in ciliopathies to be accelerated, including ‘deleted in primary ciliary dyskinesia’ (Dpcd), Iqcb1 (a component of primary cilia), and ciliary neurotrophic factor (Cntf). Further inspection of the accelerated list of genes could potentially reveal new candidate genes that are important for congenital eye diseases.

Genes that are involved in the embryonic development of eyes do not show significant global acceleration, potentially due to their pleiotropic nature: these developmental transcription factors tend to have important regulatory roles in non-eye-related pathways that are not under relaxed constraint. We successfully tested our hypothesis that eye-specific regulatory elements of such genes are under relaxed constraint in the moles, using a novel variant of our approach that calculates convergent rate acceleration at the non-coding level. Although the strong rate acceleration in the three eye-specific enhancers of PAX6 suggests relaxation of constraint in the subterranean mammals, in the absence of functional tests we cannot be sure that the eye-specific activity is truly lost. Furthermore, we found an enrichment of such convergently accelerated non-coding regions preferentially near eye developmental transcription factor genes, identifying potential enhancer elements driving the expression of these genes specifically in the eye. As a large-scale validation approach, we show that rate acceleration in subterranean mammals strongly overlaps regions identified as eye enhancers by the FANTOM5 consortium. These proof-of-principle analyses serve to illustrate the power of convergent-evolution-based tools for the identification of eye-specific regulatory elements. Despite the apparent rapid rate of enhancer evolution across mammals, our methods and those of colleagues showcase the utility of applying evolution-based approaches to conserved non-coding regions in identifying regulatory elements underlying important developmental functions (Marcovitz et al., 2016; Villar et al., 2015). These methods present a unique opportunity to perform genome-wide scans for eye- and other tissue-specific regulatory elements, and potentially serve as complementary approaches to genome-wide assays in the identification of active enhancer elements in the genome. As more genomes are sequenced, we expect these methods to become more powerful in revealing gene regulatory changes underlying convergent phenotypes.

Overall, our results suggest that genes and non-coding regions that are involved in vision pathways are accumulating deleterious mutations by neutral processes, given the relaxation of constraint on these pathways in the subterranean environment. However, this does not preclude the possibility that the initial inactivating mutations in these pathways were adaptive in nature. The initial shutdown of eye development may have been caused by positively selected changes, followed by continued regression of structural and physiological eye genes through neutral processes. Indeed, there is evidence of such a progression of events during eye regression in blind cavefish (Jeffery, 2005). Adaptive forces for reduced eyes may have been driven by the energetic costs of maintaining functioning eyes and the risk of pathogen entry through the eye (Moran et al., 2015). We note that our rate-based analysis detects signatures of sequence divergence that are based on what is observed at the end of these processes and does not shed light on the nature of the initial inactivating changes. In addition, our methods detect convergent changes in the rates of evolution of genes and hence are not designed to detect species-specific changes that might contribute to the subterranean adaptation.

Our results showcasing acceleration in the rates of convergent evolution of visual genes strongly supports previous reports of visual regression in the subterranean habitat. Emerling and Springer (2014) studied the regression of retinal genes in three of these four subterranean species and showed that a decrease in the amount of light entering the retina is associated with higher incidence of inactivating mutations in retinal genes. They found a significantly higher number of retinal pseudogenes in the moles compared to that in closely related subaerial species, an observation concordant with our results based on rate acceleration. Genome sequencing efforts for naked mole-rat and blind mole-rat also showed a strong enrichment of pseudogenes in visual pathways that are associated with the degradation of vision in these species (Cooper et al., 1993; Kim et al., 2011). A genome-wide study by Prudent et al. (2016) detected significant genomic differences in genes involved in vision-related pathways such as eye development and perception of light in two of these four subterranean mammals, namely cape golden mole and blind mole-rat. Using our rates-based framework, we performed a rigorous investigation of convergently evolving genes in a large set of four subterranean species, and elucidated the tissue-specificity and underlying reasons for their convergent rate changes. In a first-of-its-kind demonstration at the non-coding level, we applied our methods successfully to detect eye-specific enhancers showing accelerated evolution in subterranean mammals.

Visual regression is not limited to these four mole species, and mammals display specific types of regression and other general differences in visual capabilities. Our analysis of visual gene rates across other species revealed interesting patterns and trends, wherein some aboveground species with poor or remodeled visual systems showed mean rate acceleration comparable to subterranean mammals (Figure 8). This provides an opportunity to further probe specific differences in the development and function of visual systems in terms of the specific pathways that are relaxed or under constraint across species. In addition, integrating these other species into our rate-based framework can help in fine-tuning the predictive power of the evolutionary-based approaches. Deliberate selection of foreground branches based on specific combinations among these vision-impaired mammals might greatly improve the power of the methods in detecting convergent changes, especially at the non-coding level. In this regard, the availability of rich and diverse phenotypic annotations across mammals further lays the ground for the development of evolutionary-based approaches in functional and phenotypic annotation of non-coding regions (Marcovitz et al., 2016; O’Leary et al., 2011; O'Leary et al., 2013).

Materials and methods

Adding Nannospalax galili orthologs to alignment

Given the absence of Nannospalax galili (blind mole-rat or BMR) in the 100-species alignments made available by the UCSC genome browser, we employed a custom approach to add the correct BMR orthologous sequence based on its closest relative on the mammalian species phylogeny, mouse. Using the publicly available BMR gene models (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF_000622305.1_S.galili_v1.0/), we first perform ed pairwise reciprocal nucleotide blast of all BMR gene cDNA sequences and the corresponding cDNA sequences of all genes in the mouse mm9 genome. For every mm9 gene sequence, we subsequently identify the correct BMR ortholog using the InParanoid program as follows: the program clusters pairs of sequences from the two queried genomes into groups of orthologs, and the BMR sequence forming the main ortholog pair (pairs with mutually best hit) in every group was identified as the correct ortholog (Remm et al., 2001). We then performed a profile alignment using the openly available Muscle program to add the identified BMR ortholog to the gene’s multi-species alignment (Edgar, 2004). For all analyses involving non-coding regions, we utilized a simpler approach to identify the BMR orthologous region. For each non-coding region of interest, we performed blastn with the mm9 orthologous sequence as the query against the BMR assembled genome, with the default Expect (E) value of 10 (NCBI Resource Coordinators, 2016). The resulting best scoring blastn hit in the BMR genome, if any, was added to the non-coding region’s multi-species alignment (obtained from the UCSC genome browser) using the profile alignment utility of the Muscle program (Edgar, 2004).

Calculating gene correlations with the subterranean environment

Using the 100-way 100 vertebrate species amino acid alignments from the multiz alignment available at the UCSC genome browser (Blanchette et al., 2004; Harris, 2007), those alignments with a minimum of 10 species that are also present in at least two subterranean species were selected for the study. We pruned each alignment to include only the 39 species of interest represented in the proteome-wide average tree (Figure 1A), after adding the BMR ortholog of the corresponding gene sequence to this alignment as described in the previous section. For each resulting amino acid alignment, we estimated branch lengths using the ‘aaml’ program from the phylogenetic analysis using the maximum likelihood (PAML) package (Yang, 2007). Branch lengths were estimated under an empirical model of amino acid substitution rates with rate variability between sites modeled as a gamma distribution approximated with four discrete classes (for computational efficiency) and an additional class for invariable sites (aaml model ‘Empirical + F’) (Whelan and Goldman, 2001; Yang, 1996). Branch lengths were estimated on a published mammalian species tree topology (Murphy et al., 2004), modified to include Nannospalax galili whose position in the tree was inferred based on existing literature on its ancestry (Fang et al., 2014). For the analyses involving conserved non-coding elements, we identified the elements of interest based on the human phastCons track generated from the 100-way vertebrate multiz alignment, eliminating any region of overlap with the human mRNAs track. For each such element, we obtained an alignment of orthologous regions across our species of interest, pruning from the UCSC 100-way multiz alignment. The previous section further details the procedure employed for adding the BMR orthologous region. We subsequently estimated the branch lengths using the baseml program of the PAML package under the general reversible process (REV) model for nucleotide substitution rates, with rate variability between sites modeled as a gamma distribution approximated with four discrete classes and an additional class for invariable sites (Blanchette et al., 2004; Rodríguez et al., 1990).

Subsequent to reconstructing the maximum likelihood trees using PAML, we filter out the trees of genetic elements that have zero branch lengths in at least 80% of the species present in the tree. Raw branch lengths in trees retained after this filtering step were transformed into relative rates using a projection operator method (Sato et al., 2005). These branch-specific relative rates were then used to perform a Mann-Whitney U test and correlation analysis over the binary variable of ‘subterranean’ or ‘aboveground’ (i.e., not subterranean) branches (Figure 1A). Subterranean branches are those leading to the star-nosed mole (Condylura cristata), cape golden mole (Chrysochloris asiatica), naked mole-rat (Heterocephalus glaber) and blind mole-rat (Nannospalax galili).

Datasets of ‘Eye’ vs ‘Other’ conserved non-coding elements and FANTOM5 enhancers

A. Conserved non-coding elements near transcription factor genes

We identified 20 developmental transcription factors that have important developmental roles in the formation of eye tissues (‘Eye’ set) based on a literature search. The detailed functional roles of these transcription factors and the specific eye tissues where they are relevant are provided in Supplementary file 9. The second part of the dataset comprises 20 transcription factors identified as belonging to the ‘Other’ set. These transcription factors have no known role in eye development and their tissue-specific functions were identified from a census of human transcription factors [Supplementary file 9] (Vaquerizas et al., 2009). Subsequently, we used the UCSC phastConsElements100way track to identify conserved non-coding elements near each transcription factor gene. For each gene, we identify 200 elements expanding the search window from the center of the gene along either direction. We limit the number of elements to 200 in order to avoid any biases arising out of the total number of elements studied near any particular gene. This leads to a total of 8000 elements split equally between the ‘Eye’ and the ‘Other’ set.

B. FANTOM5 enhancers

We downloaded a dataset of 44,460 putative enhancers identified by the FANTOM5 consortium, including their mm9 coordinates and expression quantification across 1190 tissue samples (Andersson et al., 2014). These include 1217 enhancers with non-zero expression level in the eyeball of mice (‘Eye’) across four developmental time-points and 13,100 enhancers with zero expression in eyeball (‘Other’). We obtained the corresponding hg19 coordinates of these enhancers using the UCSC liftOver utility (Kent et al., 2002). Based on this, we were able to map correctly 995 enhancers belonging to the ‘Eye’ set and 7695 enhancers belonging to the ‘Other’ set. From among these enhancers, we were able to confidently calculate the evolutionary rate correlation with the subterranean environment for 946 FANTOM5 ‘Eye’ enhancers and 6,331 FANTOM5 ‘Other’ enhancers, after filtering out enhancers that were either poorly conserved across our species set or whose trees were dominated by branches of length zero.

Functional enrichment analysis

We performed functional enrichment analysis using the GOrilla tool by searching for enriched GO terms in the foreground set of genes compared to the full background set of genes tested for mole convergence (Eden et al., 2009). In addition to this, functional information for subterranean-associated genes was mined from the Uniprot and RefSeq databases, and from literature cited directly (UniProt Consortium, 2007; Pruitt et al., 2007). Enrichment analysis was performed using the hypergeometric test with the background set of genes restricted to genes that were tested for mole convergence and that had at least one annotation in the corresponding annotation file. Correction for multiple testing was performed using false discovery rate q-values (Storey, 2002).

Multiple hypothesis testing correction

False Discovery Rate analysis was performed on probabilities resulting from the Mann-Whitney U test. We employed an empirical permutation-based FDR calculation, often the standard approach in genome-wide analysis. We generated 10,000 null datasets, obtaining each dataset by randomly permuting the species labels of the relative rates. This process is equivalent to calculating the Mann-Whitney U test and correlation analysis over four random branches vs the rest instead of over the binary variable of ‘subterranean’ vs ‘aboveground’ branches. The subsequent permuted datasets were used to estimate the FDR q-values for each p-value in our subterranean correlation analysis. Genes showing a correlation greater than or equal to 0.5 in the Mann-Whitney U test and significant at a FDR of 15% were considered mole-accelerated genes, and the p-value reflecting the strength of this acceleration is referred to as mole-acceleration. Similarly, genes showing a correlation less than 0.5 and significant at a FDR of 15% are considered mole-decelerated genes.

Tissue-specific gene analysis

In order to determine how specific eye tissues have evolved across subterranean species, we first identified tissue-specific gene sets using microarray expression data from 91 mouse tissues (Su et al., 2004). We isolated tissue-specific genes for cornea, iris, lens and retina (including retinal pigmented epithelium). These sets were defined as those with significant differential expression only in the tissue of interest compared to all other tissues at an alpha of 0.05 (T-test).

Phylogenetic models of selective pressure

The subterranean-accelerated genes were subjected to phylogenetic models of codon evolution to test for significant evidence of relaxation of constraint or positive selection over the subterranean mammal branches. Using PAML, we ran codeml using five different models: the branch-site neutral model (BS Neutral), the branch-site selection model (BS Alt Mod), the sites neutral model (M1), the positive selection model (M8) and its null model (M8A) (Yang, 2007). To assess the significance of relaxation of constraint on subterranean mammal branches, we performed likelihood ratio tests (LRT) between BS Neutral and its nested null model M1. LRTs between BS Alt Mod and its null BS Neutral were used to infer positive selection on subterranean mammal branches. Probabilities were assigned for each of these two LRTs using the chi-square distribution with 1 degree of freedom. Mammal-wide positive selection was inferred using the M8 vs M8A models and their respective LRT, using 1 degree of freedom chi square distribution to assess LRT significance. For calculating the correlation between mole-acceleration and degree of tissue-specificity of genes, we estimated the mole-acceleration of each gene as follows: using a branch-site selection model (BS Alt Mod) we estimated two different values of ω (dN/dS) – one for the four subterranean branches and one for the rest of the branches on the tree. Mole-acceleration was calculated as the difference in the two ω values that were estimated.

References

  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10
    Ortogenesi parallela e degradazione degli organi della vista negli Spalacidi
    1. G Cei
    (1946a)
    Monit Zool Ital, Firenze 55:69–88.
  11. 11
    L’occhio di “Heterocephalus glaber” Rupp. Note anatomodes -crittive e istologiche
    1. G Cei
    (1946b)
    Monit Zool Ital, Firenze 55:89–96.
  12. 12
  13. 13
  14. 14
  15. 15
  16. 16
    The upstream ectoderm enhancer in Pax6 has an important role in lens induction
    1. PV Dimanlig
    2. SC Faber
    3. W Auerbach
    4. HP Makarenkova
    5. RA Lang
    6. LR a
    (2001)
    Development 128:4415–4424.
  17. 17
  18. 18
    Les mammiferes souterrains
    1. G Dubost
    (1968)
    Rev Ecol Biol Sol 5:99–133.
  19. 19
  20. 20
  21. 21
    The subterranean mammals of the world
    1. JR Ellerman
    (1956)
    Transactions of the Royal Society of South Africa 35:11–20.
    https://doi.org/10.1080/00359195609519005
  22. 22
    Orientation in the mole-rat cryptomys
    1. G Eloff
    (1951)
    British Journal of Psychology. General Section 42:134–145.
    https://doi.org/10.1111/j.2044-8295.1951.tb00285.x
  23. 23
  24. 24
  25. 25
  26. 26
  27. 27
  28. 28
  29. 29
  30. 30
    Improved pairwise alignment of genomic DNA
    1. RS Harris
    (2007)
    ProQuest.
  31. 31
  32. 32
  33. 33
  34. 34
  35. 35
  36. 36
  37. 37
  38. 38
  39. 39
  40. 40
  41. 41
  42. 42
  43. 43
  44. 44
  45. 45
    Life Underground: The Biology of Subterranean Rodents
    1. EA Lacey
    2. JL Patton
    3. GN Cameron
    (2000)
    Chicago: University of Chicago Press.
  46. 46
  47. 47
  48. 48
  49. 49
  50. 50
  51. 51
  52. 52
    The Nine-Banded Armadillo: A Natural History
    1. CM McDonough
    2. WJ Loughry
    (2013)
    University of Oklahoma Press.
  53. 53
  54. 54
  55. 55
  56. 56
  57. 57
    Walker’s Mammals of the World
    1. RM Nowak
    (1999)
    Johns Hopkins University Press.
  58. 58
  59. 59
  60. 60
    MorphoBank: phylogenomics in the cloud
    1. MA O’Leary
    2. S Kaufman
    (2011)
    Cladistics : The International Journal of the Willi Hennig Society 27:529–537.
    https://doi.org/10.1111/j.1096-0031.2011.00355.x
  61. 61
  62. 62
  63. 63
  64. 64
  65. 65
  66. 66
  67. 67
  68. 68
    The eye of the blind mole rat, Spalax ehrenbergi. Rudiment with hidden function?
    1. S Sanyal
    2. HG Jansen
    3. WJ de Grip
    4. E Nevo
    5. WW de Jong
    6. GWJ de
    7. JWW de
    (1990)
    Investigative ophthalmology & visual science 31:1398–1404.
  69. 69
  70. 70
  71. 71
    A direct approach to false discovery rates
    1. JD Storey
    (2002)
    Journal of the Royal Statistical Society: Series B 64:479–498.
    https://doi.org/10.1111/1467-9868.00346
  72. 72
  73. 73
  74. 74
    The eyes of Chrysochloris hottentota and C. asiatica
    1. G Sweet
    (1909)
    Journal of Cell Science 2:328–338.
  75. 75
  76. 76
  77. 77
  78. 78
  79. 79
  80. 80
    Regulation of Pax6 expression is conserved between mice and flies
    1. PX Xu
    2. X Zhang
    3. S Heaney
    4. A Yoon
    5. AM Michelson
    6. RL Maas
    (1999)
    Development 126:383–395.
  81. 81
  82. 82
  83. 83

Decision letter

  1. Duncan T Odom
    Reviewing Editor; University of Cambridge, United Kingdom

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Subterranean mammals show convergent regression in ocular genes and enhancers, and adaptation in tunneling-related genes" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Diethard Tautz as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Leslie M Turner (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

The authors employ a sequenced-based evolutionary rate approach to identify genomic regions under accelerated or decelerated evolution in four species of subterranean mammals, thereby associating convergent genetic changes in these species with phenotypic traits (with a focus on loss of vision). Their results show an enrichment of vision-related gene annotations among genes showing convergent accelerated evolution in subterranean mammals, most of which show signatures of relaxed constraint, rather than adaptation. By comparing accelerated genes to either genes showing eye-specific gene expression or genes involved in eye development, the authors further suggest that regression of eye function associates with relaxed constraint on genes expressed in the eye lens and retina. Lastly, they demonstrate the applicability of this approach to non-coding regions by analysing rates of evolution among subterranean mammals for non-coding regions in the Pax6 locus. Based on these results the authors propose that the regions they identify include novel candidate sequences for congenital eye disorders.

The reviewers agreed that your analysis of the genetic evolution occurring in four separate lineages of subterranean mammals has potential to be of wide interest, and so the reviewers and handling editor are requesting resubmission. However, the two serious concerns outlined below must be exhaustively addressed, or a revised manuscript will be rejected.

Essential revisions for resubmission:

1) Presentation and novelty (see below for specifics).

All reviewers and editors agreed that your current manuscript needs serious editorial revision to address pervasive logic and presentational problems that results in masking of the story's novelty. Many problematic examples are listed below, compiled from all three reviewer's comments. A specific subpoint that was extensively discussed was whether your manuscript was sufficiently novel for publication in eLife, given the recent publication of a highly similar analysis by Prudent et al. of two of the same species in MBE. The additional power of using two more species must be far better argued, or a revision will not be adequate for eLife publication.

2) Genome-wide analysis of enhancers (see below for specifics).

The final computational analysis that dissected Pax6 enhancers must be extended to a genome-wide analysis of all enhancers to identify the complete set of regions where fossorial pressures have resulted in genetic changes. This analysis should likely be an entirely new figure that compellingly displays the genome-wide data in a way that is broadly understandable and persuasive in highlighting the novelty over Prudent et al. MBE.

The title is accurate but a bit awkward: authors please suggest your own revision.

Further detail on points above:

1) Presentation and novelty (specific details).

Note that the concerns listed below arose from all four reviews, so some are not in exact occurrence order. In addition, the Reviewing Editor had great difficulty understanding what gene sets were used, when, why, and how – throughout the manuscript. The entire revision must be heavily revised to didactically lay out each analysis more slowly and clearly to non-expert readers.

In Table 1 and Supplementary file 1, Tables S1 to S6, the authors continually shift between the top 1000, 500 and 200 accelerated or decelerated genes. For clarity, it would be better to decide on a threshold of statistical significance and state in the main text how many accelerated and decelerated genes are detected using this threshold. The authors could also homogenise the set of genes used in supplementary tables in Supplementary file 1, unless there are specific advantages to using different sets.

Related to the above, how does the RER method compare to the one used by Prudent et al.?

Figure 3 could additionally use as control a set of convergent genes in a different subset of species, such as the one reported by the authors for marine adaptations (Chikina et al., 2016).

The supplemental excel tables in Supplementary file 1 could be better annotated, for example by adding legends at the top. Table S1 in Supplementary file 1 seems to be duplicated in the excel file.

Why does Table S7 in Supplementary file 1 only have 71 genes, given that the authors mention in the main text their using 98 eye developmental genes? It would be helpful to mark significantly accelerated genes in this table.

Wording in the subsection “Regressive evolution is limited to the lens, retina, and eye-specific developmental genes” is somewhat misleading and should be revised. ("we asked which eye tissues showed acceleration", whereas I think the authors are actually testing whether genes with eye-specific expression are enriched among accelerated genes).

While the acceleration that the authors observe in Pax6 eye enhancers is consistent with relaxed constraint or "regression", in the absence of functional tests of these enhancers it is possible that the function of these enhancers is intact. In contrast, many of the vision genes that exhibit convergent sequence alteration have elevated dN/dS ratios or have stop codons and frame shift mutations that provide clear evidence of relaxed constraint. I don't feel that the authors need to perform functional tests of the Pax6 eye enhancers of subterranean species, but they should mention this caveat in the Discussion section.

In the Introduction, the authors cite Fang et al. as providing evidence for convergent evolution of circadian rhythm genes in blind mole-rat and naked mole-rat genomes. However, a corrigendum for this paper reported an error in the amino acid sequence alignment of the CLOCK genes. After correcting this error, there was no evidence of CLOCK protein sequence convergence between blink mole-rat and naked mole rat. Thus, it seems that Feng et al. may not provide support for the claim that circadian rhythm genes have cover gently evolved similar sequence changes. The authors should adjust their statements in the introduction accordingly. Fang et al., 2014. Nat Commun 5:1039-1052 for Corrigendum see http://www.nature.com/articles/ncomms9051

The authors repeatedly use the term "non-genic" to refer to non-coding sequences. The use of this term is somewhat ambiguous. For instance, the Pax6 gene has known (as well as putative) enhancer elements that are located within introns of the Elp4 gene. Thus, those elements lie within the Elp4 gene and could be considered "intra-genic" relative Elp4. The term "non-genic" should be replaced with the term "non-coding" throughout the manuscript. I also suggest that the authors replace "genic" with "coding," since the term "genic" can refer to the entire gene (not just coding exons, but also promoter regions, 5'UTR, 3'UTR, etc.).

Gene and protein nomenclature: There are some errors and inconstancies in gene and protein names. For instance, "Lim2" is used to refer to the "LIM2" protein. "Pax6" is used instead of "PAX6." Also, when referring to "krt-/-" mutants, the first letter should be capitalized and the gene name should be italicized.

Figure 3A: The gene rank labeling in the histogram is truncated and needs to be extended to include the bars on the right end of the chart.

Figure 5B: The abbreviation "Dev" and "Bgd" need to be defined in the figure legend.

Figure 6A: The species that is used to diagram the gene region should be specified. If the species is mouse, then mouse gene nomenclature should be followed when labeling the diagram (first letter capitalized, name in italics).

Introduction includes long, highly detailed lists of adaptations – hard to follow and main points are lost. A figure or table summarising adaptations observed in subterranean taxa would be better.

Writing is wordy and colloquial in many places. e.g. "Even though it is possible to visualize this accelerated rate of change in the Lim2 tree, it is more rigorous to use a quantitative framework."

Results include a lot of interpretation, sometimes quite speculative. As a result, much of the Discussion is repetitive.

Overselling of enhancer results: I think the Pax6 enhancer results are a compelling example, suggesting regulatory elements may show similar convergent relaxed constraint to coding genes. However, title, Abstract, and Discussion suggest genome-wide analysis of enhancers was performed and found visual-specific enhancers accelerated ("successfully proved our hypothesis").

Adaptation in tunneling-related genes: title and in the subsection “Skin-related genes were accelerated possibly in response to the demands of tunneling”. “reasons for their acceleration are split equally between relaxation of constraint and positive selection.” No numbers are given for # positively selected and total # tested – I believe only one positive selection example is listed (COL4A4). How is 'skin-related' defined – which GO terms?

Significance of acceleration/deceleration.

What is p-value threshold considered 'significant'?

How was multiple testing accounted for?

P values for top 1000 accelerated and decelerated in Supplementary file 1, Table S1/S3 very different – 614 accelerated have P <0.05; max for 1000 decelerated is P = 0.018.

How does P value distribution compare to 2 control sets used for GO analysis? Or random sets of 4 species?

Rods vs. cone genes analysis (subsection “Each subterranean species exhibited selective gene acceleration consistent with its visual capacities”)

It is hard to evaluate these results because I can't find a legend explaining ++s and --s in Supplementary file 1, Table S8 (top) and there are several empty cells in the pseudogene section (bottom). Nevertheless, this section is overly descriptive and I don't think much can be concluded based on inconsistent patterns across 3 genes/category.

Some Methods are unclear:

Beginning of Materials and methods should clearly explain overall rationale for data included in RER. e.g., describe 100-spp alignment, criteria for inclusion in proteome average tree, minimum # of subterranean sequences required for analysis (vs. 10 overall).

Figure 3B – how was this analysis done? What counts as 'mole acceleration' and 'visual perception'?

Arbitrary/inconsistent gene list #s. Why top 30 genes in Table 1? Top 500 accelerated in GO analysis Table 2? Why 1000 top genes included in supplementary tables in Supplementary file 1? Why 200 included in PAML analysis?

Pax6 enhancer analysis – (subsection “Eye-specific enhancers of PAX6 show convergent acceleration in subterranean mammals”, second paragraph), if enhancers in 200 kb window, why 500 kb 'window of interest', is 200 kb in center? Specify locations.

Tissue-specific expression (subsection “Regressive evolution is limited to the lens, retina, and eye-specific developmental genes”). It is unclear what 'significant differential expression only in the tissue of interest' means – significantly higher expression only in one tissue compared to background?

Figure 1A – include scientific names.

Figure 1C, 2, 4: what are unlabeled blue dots at bottom? Why are species ordered alphabetically by common name (it appears, not specified) instead of by phylogeny?

Figure 5A – r2? P value?

Figure 5C – not described in legend; shades hard to distinguish.

Figure 6A – indicate 200 kb and 500 kb windows for analysis.

Subsection “Most subterranean-accelerated genes are under relaxed constraint” – how many mole-accelerated genes show lesions in 1 or more subterranean species?

Subsection “Regressive evolution is limited to the lens, retina, and eye-specific developmental genes”, “clustered at the top of the accelerated list” – what does this mean? Overall list in Supplementary file 1, Table S1?

2) Extending Pax6 enhancer analysis (specific details)

The analysis on non-coding regions (Figure 6) must be expanded. If the point the authors want to make is that their approach is applicable to non-coding regions, this approach should be applied to all conserved non-coding elements rather than a single locus. An increased scope of this analysis would add novelty to the author's findings. Can one still detect an enrichment of eye-related gene annotations proximal to enhancers with accelerated evolution in subterranean mammals? How does it compare with that of coding regions?

The authors highlight their search for convergent patterns of sequence divergence as a means to identify genetic elements that may underlie changes in phenotype. If the authors analyzed each subterranean mammal on its own (looking at sequence divergence but not convergence between subterranean species), do they see strong enrichment of vision-related GO terms in each species? Do they also see other non-vision GO terms (e.g., hypoxia)? How different are the GO enrichments for accelerated proteins in individual subterranean species vs. the list of proteins that show convergent accelerations among subterranean species?

3) Other (non-required) points

As a strategy to inform the molecular basis of phenotypic traits, the author's approach is an alternative to direct experimental mapping across mammals (e.g. of gene expression; Ma et al., 2016). The authors hypothesize that genomic regions under accelerated evolution in subterranean mammals may inform candidate loci for congenital eye disorders, but no direct evidence is provided. The manuscript would tremendously benefit from substantiating this hypothesis with experimental data, for example by evaluating a few of the accelerated loci (genes or regulatory regions) in animal models of eye development, such as mouse or zebrafish. Can the authors show a requirement for some of their novel candidate elements in eye development or eye-specific expression? (Please see (Kvon et al., 2016) for a similar approach).

https://doi.org/10.7554/eLife.25884.031

Author response

Essential revisions for resubmission:

1) Presentation and novelty (see below for specifics).

All reviewers and editors agreed that your current manuscript needs serious editorial revision to address pervasive logic and presentational problems that results in masking of the story's novelty. Many problematic examples are listed below, compiled from all three reviewer's comments. A specific subpoint that was extensively discussed was whether your manuscript was sufficiently novel for publication in eLife, given the recent publication of a highly similar analysis by Prudent et al. of two of the same species in MBE. The additional power of using two more species must be far better argued, or a revision will not be adequate for eLife publication.

We thank the reviewers for identifying areas where the manuscript could be improved upon and have addressed each point to the best of our ability. Regarding the question of novelty over Prudent et al., there are a number of reasons our manuscript presents a major advance over their paper:

1) Our method makes use of 4 species compared to the two used by Prudent et al. and hence can more precisely and confidently identify genetic elements convergently responding to the subterranean environment.

2) We perform a more detailed investigation including studying the reasons for the rate acceleration in these genes as well as where they were expressed across specific tissues of the eye. On the other hand, Prudent et al.’s presentation of mole evolution was a short description. Their presentation seems to have fulfilled its purpose of demonstrating their new methods but was not intended to reveal biological insights.

3) To highlight the contrast in the performance of our methods, we conducted a head-to-head comparison of the Prudent method to ours on a common dataset of 3 mole species. We found that ours performs better on both protein-coding and enhancer sequences. For full details see our second response to point 1 below. In the revised manuscript however, we chose to remain focused on the biological conclusions and did not include this comparison of methodologies. If the reviewers feel the comparison to Prudent et al. is crucial for publication, we are happy to add those results to the manuscript.

4) In a first-of-its-kind demonstration we show the applicability of our methods on non-coding regions, which we view as a more urgent challenge, and correctly identify experimentally validated eye-specific enhancers of Pax6. Prudent et al. did not report mole-acceleration for non-coding sequences.

5) Additionally, we now include two large-scale non-coding analyses covering 16,000 potential enhancer sites across the genome to illustrate the power of our method in identifying candidate vision-specific genetic elements. We also provide the chromosomal coordinates of the top-scoring regions in a Supplemental Table for the research community.

We believe that the combination of the aforementioned points clearly illustrates the advancement of our findings in comparison to those reported in the Prudent et al. publication in MBE.

2) Genome-wide analysis of enhancers (see below for specifics).

The final computational analysis that dissected Pax6 enhancers must be extended to a genome-wide analysis of all enhancers to identify the complete set of regions where fossorial pressures have resulted in genetic changes. This analysis should likely be an entirely new figure that compellingly displays the genome-wide data in a way that is broadly understandable and persuasive in highlighting the novelty over Prudent et al. MBE.

We have now performed two large-scale scans for convergently mole-accelerated non-coding elements, totaling to 16,000 in number. These two scans give rise to the following findings – 1) Mole-accelerated genetic elements are strongly enriched near transcription factors with known function in eye development compared to tissue-specific transcription factors outside the eye; 2) Eye enhancers identified by the FANTOM5 consortium show significant mole-acceleration compared to other Fantom5 enhancers with zero expression in the eyeball of developing mouse embryos. These new analyses strongly reflect the power of our method in identifying putative vision-specific non-coding elements, illustrating the first successful attempt at large-scale prediction of vision-specific genetic elements using a convergent-evolution-based approach. For the gene regulation research community, we provide the coordinates of these newly described mole-accelerated non-coding reqions (Supplementary file 1, Table S10). In addition to this, we perform a direct comparative analysis that showed better performance of our RER method over the Prudent et al. method as part of our response to point 1.

The title is accurate but a bit awkward: authors please suggest your own revision.

We have clarified the title.

“Subterranean mammals show convergent regression in ocular genes and enhancers, along with adaptations to tunneling”

Further detail on points above:

1) Presentation and novelty (specific details).

Note that the concerns listed below arose from all four reviews, so some are not in exact occurrence order. In addition, the Reviewing Editor had great difficulty understanding what gene sets were used, when, why, and how – throughout the manuscript. The entire revision must be heavily revised to didactically lay out each analysis more slowly and clearly to non-expert readers.

In Table 1 and Supplementary file 1, Tables S1 to S6, the authors continually shift between the top 1000, 500 and 200 accelerated or decelerated genes. For clarity, it would be better to decide on a threshold of statistical significance and state in the main text how many accelerated and decelerated genes are detected using this threshold. The authors could also homogenise the set of genes used in supplementary tables in Supplementary file 1, unless there are specific advantages to using different sets.

We address this in the revised submission by performing multiple hypothesis testing corrections. We utilize a permutation-based False Discovery Rate estimation, controlling the FDR at 15%. This results in a total of 55 genes as being identified as accelerated, and 1306 genes as decelerated. The supplementary tables in Supplementary file 1 have been modified to consistently utilize these numbers of accelerated and decelerated genes.

Related to the above, how does the RER method compare to the one used by Prudent et al.?

The two methods developed by Prudent et al. address the problem of identifying phenotypic-genomic associations in a different manner compared to our RER method. For one, our RER method relies on estimating sequence divergence by computing branch lengths on a Maximum Likelihood phylogenetic tree (number of substitutions per site). These calculations are based on standard models of amino acid or nucleotide evolution, whereas the methods in Prudent et al. rely on percent sequence identity and sequence mismatch to estimate the sequence divergence, which do not account for several confounding factors. For instance, sequence mismatch has a non-linear relationship with the actual divergence due to hidden mutation events such as back substitutions, multiple substitutions etc. This results in a consistent underestimation of actual divergence that further worsens as the divergence increases. Furthermore, sequence mismatch based measures fail to account for non-uniform substitution rates between nucleotide types and between sites [Li et al.]. Model-based measures of genetic divergence developed over the past years have been done so to control for many of these potentially confounding factors, and hence have been the standard methods employed in the field of phylogenetic inference [Kimura et al., Felsenstein et al., Cantor et al., Holmquist et al.].

Additionally, our normalization method handles missing orthologs for genes automatically by pruning away the corresponding branch(es) in all trees present, and subsequently generating relative rates, whereas it is unclear if the Prudent et al. methods handle such cases.

We performed a direct comparative analysis between our RER method and the GLS p-value method developed in Prudent et al. paper by studying the ability of the methods again to rank eye-specific regulatory regions around PAX6 on the same dataset. Our method performed better than the Prudent et al. method by putting the three known eye-specific enhancers at the top of the list of discovered regions (Author response image 1).

Author response image 1

We additionally compared the performance of the methods in identifying high-ranking genes for enrichment in “eye morphology” and “eye physiology” as annotated by Mouse Genome Informatics. Using the same coding sequence alignments, our RER method enriched for known eye genes at a significantly higher rate than the Prudent et al. method (Author response image 2).

Author response image 2

We believe that these analyses could be included in the manuscript if the reviewers deem it necessary, but we prefer the biological findings of the study to be the focus of the paper.

Figure 3 could additionally use as control a set of convergent genes in a different subset of species, such as the one reported by the authors for marine adaptations (Chikina et al., 2016).

We understand this would be one way to perform a control analysis but we believe that using four non-subterranean control species that have similar average genome-wide relative evolutionary rates to the subterranean mammals is perhaps a more conservative approach and strongly illustrates the distinction.

The supplemental excel tables in Supplementary file 1 could be better annotated, for example by adding legends at the top. Table S1 in Supplementary file 1 seems to be duplicated in the excel file.

We have added legends to the top of each supplementary table in Supplementary file 1, describing the columns present. We have additionally removed the duplication. E.g. for Table S1 “The genome-wide ranking of genes whose rates are positively associated with the subterranean branches was identified based on the Mann-Whitney U test p-values (column 2) that were significant at a FDR of 15% under a permutation-based FDR correction procedure (column 3). Other columns list brief gene descriptions (4), known biological function (5), tissue specificity if any (6), known disease links (7), the predominant evolutionary mode on subterranean branches as decided with codon models (8, see Materials and methods), and the subterranean species containing genetic lesions in the corresponding protein-coding sequences (9).”

Why does Table S7 in Supplementary file 1 only have 71 genes, given that the authors mention in the main text their using 98 eye developmental genes? It would be helpful to mark significantly accelerated genes in this table.

We started out with a set of 98 eye developmental genes and a filtering procedure removed a set of 27 genes as the branch lengths on their trees did not show sufficient variation (see Materials and methods). We have now rectified the mismatch in the numbers accordingly. “Using literature we also compiled a set of 71 important eye developmental genes”.

Wording in the subsection “Regressive evolution is limited to the lens, retina, and eye-specific developmental genes” is somewhat misleading and should be revised. ("we asked which eye tissues showed acceleration", whereas I think the authors are actually testing whether genes with eye-specific expression are enriched among accelerated genes).

We thank the reviewers for pointing out the distinction, and have suitably reworded the sentence: “we asked which genes with eye–tissue specific expressions showed acceleration and found that the cornea genes specifically expressed in cornea – a protective tissue of the outer eye – and the iris were not accelerated in subterranean species when compared to a set of randomly chosen genes”.

While the acceleration that the authors observe in Pax6 eye enhancers is consistent with relaxed constraint or "regression", in the absence of functional tests of these enhancers it is possible that the function of these enhancers is intact. In contrast, many of the vision genes that exhibit convergent sequence alteration have elevated dN/dS ratios or have stop codons and frame shift mutations that provide clear evidence of relaxed constraint. I don't feel that the authors need to perform functional tests of the Pax6 eye enhancers of subterranean species, but they should mention this caveat in the Discussion section.

We added this caveat as follows: “Although the strong rate acceleration in the three eye-specific enhancers of PAX6 suggests relaxation of constraint in the subterranean mammals, in the absence of functional tests we cannot be sure that the eye-specific activity is truly lost.”

In the Introduction, the authors cite Fang et al. as providing evidence for convergent evolution of circadian rhythm genes in blind mole-rat and naked mole-rat genomes. However, a corrigendum for this paper reported an error in the amino acid sequence alignment of the CLOCK genes. After correcting this error, there was no evidence of CLOCK protein sequence convergence between blink mole-rat and naked mole rat. Thus, it seems that Feng et al. may not provide support for the claim that circadian rhythm genes have cover gently evolved similar sequence changes. The authors should adjust their statements in the introduction accordingly. Fang et al., 2014. Nat Commun 5:1039-1052 for Corrigendum see http://www.nature.com/articles/ncomms9051

We thank the reviewers for pointing out the error and the reference to the corrigendum. We have now removed the corresponding incorrect statement from our Introduction section.

The authors repeatedly use the term "non-genic" to refer to non-coding sequences. The use of this term is somewhat ambiguous. For instance, the Pax6 gene has known (as well as putative) enhancer elements that are located within introns of the Elp4 gene. Thus, those elements lie within the Elp4 gene and could be considered "intra-genic" relative Elp4. The term "non-genic" should be replaced with the term "non-coding" throughout the manuscript. I also suggest that the authors replace "genic" with "coding," since the term "genic" can refer to the entire gene (not just coding exons, but also promoter regions, 5'UTR, 3'UTR, etc.).

We have replaced all occurrences of the term non-genic with non-coding. We agree that it makes the terminology more uniform and precise.

Gene and protein nomenclature: There are some errors and inconstancies in gene and protein names. For instance, "Lim2" is used to refer to the "LIM2" protein. "Pax6" is used instead of "PAX6." Also, when referring to "krt-/-" mutants, the first letter should be capitalized and the gene name should be italicized.

We have appropriately capitalized the protein names and the nomenclature regarding Krt-/- mutants.

Figure 3A: The gene rank labeling in the histogram is truncated and needs to be extended to include the bars on the right end of the chart.

The bars corresponding to the ranks at the right end of the plot have now been added.

Figure 5B: The abbreviation "Dev" and "Bgd" need to be defined in the figure legend.

We have added a legend describing all the abbreviations used in the plot.

Figure 6A: The species that is used to diagram the gene region should be specified. If the species is mouse, then mouse gene nomenclature should be followed when labeling the diagram (first letter capitalized, name in italics).

The genomic window plotted in Figure 6 corresponds to the hg19 assembly and the labels defining the species and the chromosomal coordinates have been added to the plot.

Introduction includes long, highly detailed lists of adaptations – hard to follow and main points are lost. A figure or table summarising adaptations observed in subterranean taxa would be better.

We have now pruned out multiple paragraphs in the Introduction section, especially with regard to the various specific vision-related adaptations. We agree that it makes the Introduction more concise and gets to the objective of the study much quicker.

Writing is wordy and colloquial in many places. e.g. "Even though it is possible to visualize this accelerated rate of change in the Lim2 tree, it is more rigorous to use a quantitative framework."

We have reworded the particular sentence, and would appreciate further comments and feedback to strengthen the style of writing.

“To quantify this rate acceleration in the LIM2 tree, we normalized all branch lengths for the expected amount of change as defined by the genome-wide average divergence for each branch”.

Results include a lot of interpretation, sometimes quite speculative. As a result, much of the Discussion is repetitive.

We have pruned the Discussion to avoid repetition.

Overselling of enhancer results: I think the Pax6 enhancer results are a compelling example, suggesting regulatory elements may show similar convergent relaxed constraint to coding genes. However, title, Abstract, and Discussion suggest genome-wide analysis of enhancers was performed and found visual-specific enhancers accelerated ("successfully proved our hypothesis").

We have now included two large-scale validation analyses to expand from the example demonstration using Pax6 enhancers. We believe our new results clearly strengthen our argument for the hypothesis that convergent relaxation at the non-coding level can be used to identify vision-specific enhancers.

“Mole-accelerated non-coding elements are strongly enriched near transcription factors driving eye development”

“FANTOM5 eye enhancers show strong convergent acceleration in subterranean mammals”

Adaptation in tunneling-related genes: title and in the subsection “Skin-related genes were accelerated possibly in response to the demands of tunneling”. “reasons for their acceleration are split equally between relaxation of constraint and positive selection.” No numbers are given for # positively selected and total # tested – I believe only one positive selection example is listed (COL4A4). How is 'skin-related' defined – which GO terms?

We have now reworded the section describing the results pertaining to skin-related genes.

Significance of acceleration/deceleration.

What is p-value threshold considered 'significant'?

How was multiple testing accounted for?

P values for top 1000 accelerated and decelerated in Supplementary file 1, Table S1/S3 very different – 614 accelerated have P <0.05; max for 1000 decelerated is P = 0.018.

How does P value distribution compare to 2 control sets used for GO analysis? Or random sets of 4 species?

We address these questions in the revised submission by performing multiple hypothesis testing corrections. We utilize a permutation-based False Discovery Rate estimation, controlling the FDR at 15%. This results in a total of 55 genes as being identified as accelerated, and 1306 genes as decelerated.

The p-value distribution in random sets of 4 species was used for calculating the FDR q-values (see Materials and methods). At the same FDR of 15%, we have 10 genes significantly accelerated in control1 and 3 genes in control2. Unsurprisingly, given the small numbers of genes identified, neither of the control-accelerated gene sets showed any significantly enriched GO terms. Given the highly similar nature of results across the two controls, we decided to report the findings based on just one of the sets of control species (control 1 – pika, guinea pig, squirrel, cow) in the revised submission.

The numbers of genes identified as having decelerated rates in the moles is 1306, at a FDR of 15%. At the same FDR, 626 genes are decelerated in the control species. Mole-decelerated genes despite being 20-fold more in number than mole-accelerated genes, do not show significant functional enrichment. We find only 1 category as significantly enriched, whereas 24 GO term categories are significantly enriched across control species. We are separately investigating the wide differences in the numbers of genes that are being detected as accelerated versus decelerated alongside other potential methodological refinements, and focus the results of this paper on the biological findings.

Acceleration/ DecelerationForeground speciesNumber of significant genes at FDR of 15%Number of significant GO terms
AccelerationMoles5515
Control100
DecelerationMoles13061
Control62624

Rods vs. cone genes analysis (subsection “Each subterranean species exhibited selective gene acceleration consistent with its visual capacities”)

It is hard to evaluate these results because I can't find a legend explaining ++s and --s in Supplementary file 1, Table S8 (top) and there are several empty cells in the pseudogene section (bottom). Nevertheless, this section is overly descriptive and I don't think much can be concluded based on inconsistent patterns across 3 genes/category.

Upon consideration of reviewers’ comments’, we have now removed the results of the section corresponding to the rods vs. cone genes analysis. We feel that with the new results corresponding to the non-coding elements, the results of the rod/cone genes analyses perhaps become less significant.

Some Methods are unclear:

Beginning of Materials and methods should clearly explain overall rationale for data included in RER. e.g., describe 100-spp alignment, criteria for inclusion in proteome average tree, minimum # of subterranean sequences required for analysis (vs. 10 overall).

We have now addressed these points in more detail in the paper.

“Using the 100-way 100 vertebrate species amino acid alignments from the multiz alignment available at the UCSC genome browser [Blanchette et al., 2004; Harris, 2007], those alignments with a minimum of 10 species were selected for study, additionally requiring the presence of at least two subterranean species. We pruned each alignment to include 39 species of interest represented in the proteome-wide average tree (Figure 1A) after adding the BMR ortholog of the corresponding gene sequence to this alignment as described in the previous section”.

Figure 3B – how was this analysis done? What counts as 'mole acceleration' and 'visual perception'?

We have now added the appropriate definitions.

“False Discovery Rate analysis was performed on probabilities resulting from the Mann-Whitney U test. We employ an empirical permutation-based FDR calculation, often the standard approach in genome-wide analysis. We generated 10,000 null datasets, where each dataset was obtained by randomly permuting the species labels of the relative rates. This process is equivalent to calculating the Mann-Whitney U test and correlation analysis over four random branches vs the rest instead of over the binary variable of “subterranean” vs. “aboveground” branches. The subsequent permuted datasets were used to estimate the FDR q-values for each p-value in our subterranean correlation analysis. Genes showing a correlation greater than equal to 0.5 in the Mann-Whitney U test and significant at a FDR of 15% are considered mole-accelerated genes, and the p-value reflecting the strength of this acceleration is referred to as mole-acceleration.”

Arbitrary/inconsistent gene list #s. Why top 30 genes in Table 1? Top 500 accelerated in GO analysis Table 2? Why 1000 top genes included in supplementary tables in Supplementary file 1? Why 200 included in PAML analysis?

We have homogenized the gene sets used across various analysis.

Pax6 enhancer analysis – (subsection “Eye-specific enhancers of PAX6 show convergent acceleration in subterranean mammals”, second paragraph), if enhancers in 200 kb window, why 500 kb 'window of interest', is 200 kb in center? Specify locations.

We performed the scan across a larger window to include a larger number of conserved non-coding elements and potentially discover new elements that may be vision-specific. Labels corresponding to the genomic locations have now been added to Figure 6.

Tissue-specific expression (subsection “Regressive evolution is limited to the lens, retina, and eye-specific developmental genes”). It is unclear what 'significant differential expression only in the tissue of interest' means – significantly higher expression only in one tissue compared to background?

The genes had significantly higher expression in that tissue compared to all the other tissues.

Figure 1A – include scientific names.

We decided against including scientific names as it makes the text in the figure too cramped.

Figure 1C, 2, 4: what are unlabeled blue dots at bottom? Why are species ordered alphabetically by common name (it appears, not specified) instead of by phylogeny?

The unlabeled blue dots correspond to ancestral branches in the tree. We now explain this in the figure legend. The dots are not ordered by phylogeny because the relative rates are obtained after correcting for the inherent phylogenetic relationship between the branch lengths. The alphabetical order was used merely for convenience.

Figure 5A – r2? P value?

Added.

Figure 5C – not described in legend; shades hard to distinguish.

Added.

Figure 6A – indicate 200 kb and 500 kb windows for analysis.

Added the legend corresponding to the chromosomal coordinates. (Refer our twenty-fourth response to point 1).

Subsection “Most subterranean-accelerated genes are under relaxed constraint” – how many mole-accelerated genes show lesions in 1 or more subterranean species?

Subsection “Regressive evolution is limited to the lens, retina, and eye-specific developmental genes”, “clustered at the top of the accelerated list” – what does this mean? Overall list in Supplementary file 1, Table S1?

We have replaced the word ‘clustered’ with ‘present’ in the sentence now. The list here refers to the list of mole-acceleration of developmental genes.

2) Extending Pax6 enhancer analysis (specific details)

The analysis on non-coding regions (Figure 6) must be expanded. If the point the authors want to make is that their approach is applicable to non-coding regions, this approach should be applied to all conserved non-coding elements rather than a single locus. An increased scope of this analysis would add novelty to the author's findings. Can one still detect an enrichment of eye-related gene annotations proximal to enhancers with accelerated evolution in subterranean mammals? How does it compare with that of coding regions?

We have now added two new sections to the paper expanding from the example analysis using Pax6 enhancers. We have analyzed a total of roughly 16,000 non-coding elements, 8,000 elements scanned for the study contrasting two sets of transcription factors and 8,000 elements corresponding to FANTOM5 putative enhancer elements. We believe these new results further strengthen our argument that convergent rate acceleration can identify candidate vision-specific genetic elements in the mammalian genome.

The authors highlight their search for convergent patterns of sequence divergence as a means to identify genetic elements that may underlie changes in phenotype. If the authors analyzed each subterranean mammal on its own (looking at sequence divergence but not convergence between subterranean species), do they see strong enrichment of vision-related GO terms in each species? Do they also see other non-vision GO terms (e.g., hypoxia)? How different are the GO enrichments for accelerated proteins in individual subterranean species vs. the list of proteins that show convergent accelerations among subterranean species?

We have considered the question of looking at relative rates in subterranean mammals individually as against convergent rate changes. Generally, we do not see any GO term significantly enriched from these individual analyses, none related to vision at any rate. Due to the very weak nature of the associations that come out of these analyses, we do not include these results in the main manuscript.

3) Other (non-required) points

As a strategy to inform the molecular basis of phenotypic traits, the author's approach is an alternative to direct experimental mapping across mammals (e.g. of gene expression; Ma et al., 2016). The authors hypothesize that genomic regions under accelerated evolution in subterranean mammals may inform candidate loci for congenital eye disorders, but no direct evidence is provided. The manuscript would tremendously benefit from substantiating this hypothesis with experimental data, for example by evaluating a few of the accelerated loci (genes or regulatory regions) in animal models of eye development, such as mouse or zebrafish. Can the authors show a requirement for some of their novel candidate elements in eye development or eye-specific expression? (Please see (Kvon et al., 2016) for a similar approach).

We thank the reviewers for suggesting a complementary strategy to our approach. We are currently in the process of conducting experiments that can provide direct evidence in validating our candidate vision-specific genetic elements. However, because we are developing an expression construct system for those experiments, they are more than a year from being completed. We hope that these experiments will be the subject of future publications.

https://doi.org/10.7554/eLife.25884.032

Article and author information

Author details

  1. Raghavendran Partha

    Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, United States
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Visualization, Writing—original draft, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID icon 0000-0002-7900-4375
  2. Bharesh K Chauhan

    1. UPMC Eye Center, Children’s Hospital of Pittsburgh, Pittsburgh, United States
    2. Department of Ophthalmology, University of Pittsburgh School of Medicine, Pittsburgh, United States
    Contribution
    Data curation, Formal analysis, Funding acquisition, Writing—original draft, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID icon 0000-0002-6429-9190
  3. Zelia Ferreira

    Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, United States
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Visualization, Writing—original draft, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID icon 0000-0002-7619-7466
  4. Joseph D Robinson

    Department of Molecular and Cell Biology, University of California, Berkeley, United States
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Funding acquisition
    Competing interests
    No competing interests declared
  5. Kira Lathrop

    1. UPMC Eye Center, Children’s Hospital of Pittsburgh, Pittsburgh, United States
    2. Department of Ophthalmology, University of Pittsburgh School of Medicine, Pittsburgh, United States
    Contribution
    Visualization
    Competing interests
    No competing interests declared
  6. Ken K Nischal

    1. UPMC Eye Center, Children’s Hospital of Pittsburgh, Pittsburgh, United States
    2. Department of Ophthalmology, University of Pittsburgh School of Medicine, Pittsburgh, United States
    Contribution
    Funding acquisition, Writing—review and editing
    Competing interests
    No competing interests declared
  7. Maria Chikina

    Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, United States
    Contribution
    Conceptualization, Software, Formal analysis, Funding acquisition, Writing—review and editing
    For correspondence
    mchikina@pitt.edu
    Competing interests
    No competing interests declared
  8. Nathan L Clark

    Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, United States
    Contribution
    Conceptualization, Software, Formal analysis, Funding acquisition, Visualization, Writing—original draft, Writing—review and editing
    For correspondence
    nclark@pitt.edu
    Competing interests
    No competing interests declared
    ORCID icon 0000-0003-0006-8374

Funding

National Eye Institute (Core Grant P30 EY008098)

  • Bharesh K Chauhan
  • Ken K Nischal

Research to Prevent Blindness

  • Bharesh K Chauhan
  • Ken K Nischal

Eye and Ear Foundation of Pittsburgh

  • Bharesh K Chauhan
  • Ken K Nischal

Jack Buncher Foundation

  • Bharesh K Chauhan
  • Ken K Nischal

National Institutes of Health (Core Grant P30 EY008098)

  • Bharesh K Chauhan
  • Ken K Nischal

National Science Foundation and Department of Defense (Research Experience for Undergraduates NSF-DBI 1263020)

  • Joseph D Robinson

National Institutes of Health (U54HG008540)

  • Maria Chikina
  • Nathan L Clark

National Institutes of Health (R03MH109009)

  • Maria Chikina

National Institutes of Health (R01HG009299)

  • Maria Chikina
  • Nathan L Clark

Pittsburgh Foundation (Charles E. Kaufman New Investigator Award)

  • Nathan L Clark

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank John Wible of the Carnegie Museum for discussion of visual phenotypes. This study was supported by a Charles E Kaufman New Investigator Award from The Pittsburgh Foundation to NLC, National Institutes of Health (NIH) grants (R01HG009299 and U54HG008540) to MC and NLC, NIH grant R03MH109009 to MC, a National Eye Institute/National Institutes of Health Core Grant (P30 EY008098), unrestricted grants from Research to Prevent Blindness, Inc., the Jack Buncher Foundation, and the Eye and Ear Foundation of Pittsburgh to BKC and KKN, and a Research Experience for Undergraduates grant funded by the National Science Foundation and the Department of Defense (NSF- DBI 1263020).

Reviewing Editor

  1. Duncan T Odom, Reviewing Editor, University of Cambridge, United Kingdom

Publication history

  1. Received: February 9, 2017
  2. Accepted: August 22, 2017
  3. Version of Record published: October 16, 2017 (version 1)

Copyright

© 2017, Partha et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,823
    Page views
  • 240
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading