1. Developmental Biology
  2. Evolutionary Biology
Download icon

Developmental hourglass and heterochronic shifts in fin and limb development

  1. Koh Onimaru  Is a corresponding author
  2. Kaori Tatsumi
  3. Chiharu Tanegashima
  4. Mitsutaka Kadota
  5. Osamu Nishimura
  6. Shigehiro Kuraku  Is a corresponding author
  1. Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research (BDR), Japan
  2. Laboratory for Bioinformatics Research, RIKEN BDR, Japan
  3. Molecular Oncology Laboratory, Graduate School of Medicine, Nagoya University, Japan
Research Article
  • Cited 0
  • Views 561
  • Annotations
Cite this article as: eLife 2021;10:e62865 doi: 10.7554/eLife.62865

Abstract

How genetic changes are linked to morphological novelties and developmental constraints remains elusive. Here, we investigate genetic apparatuses that distinguish fish fins from tetrapod limbs by analyzing transcriptomes and open-chromatin regions (OCRs). Specifically, we compared mouse forelimb buds with the pectoral fin buds of an elasmobranch, the brown-banded bamboo shark (Chiloscyllium punctatum). A transcriptomic comparison with an accurate orthology map revealed both a mass heterochrony and hourglass-shaped conservation of gene expression between fins and limbs. Furthermore, open-chromatin analysis suggested that access to conserved regulatory sequences is transiently increased during mid-stage limb development. During this stage, stage-specific and tissue-specific OCRs were also enriched. Together, early and late stages of fin/limb development are more permissive to mutations than middle stages, which may have contributed to major morphological changes during the fin-to-limb evolution. We hypothesize that the middle stages are constrained by regulatory complexity that results from dynamic and tissue-specific transcriptional controls.

eLife digest

Animals come in all shapes and sizes. This diversity arose through genetic mutations during evolution, but it is unclear exactly how these variations led to the formation of new shapes. There is increasing evidence to suggest that not all shapes are possible and that variability between animals is limited by a phenomenon known as “developmental constraint”. These limitations direct parts of the body towards a specific shape as they develop in the embryo. Therefore, understanding the mechanisms underlying these developmental constraints could help explain how different body shapes evolved.

The limbs of humans and other mammals evolved from the fins of fish, and this transition is often used to study the role developmental constraints play in evolution. This is an ideal model as there is already a detailed fossil record mapping this evolutionary event, and data pinpointing some of the genes involved in the development of limbs and fins. But this data is incomplete, and a full comparison between the genes activated in the fin and the limb during embryonic development had not been achieved. This is because most fish used for research have undergone recent genetic changes, making it hard to spot which genetic differences are linked to the evolution of the limb.

To overcome this barrier, Onimaru et al. compared genetic data from the developing limbs of mice to the developing fins of the brown-banded bamboo shark, which evolves much slower than other fish. This revealed that although many genes commonly played a role in the development of the fin and the limb in the embryo, the activity of these shared genes was not the same. For example, genes that switched on in the late stages of limb development, switched off in the late stages of fin development. But in the middle of development, those differences were relatively small and both species activated very similar sets of genes. Many of these genes were pleiotropic, which means they have important roles in other tissues and therefore mutate less often. This suggests that the mid-stage of limb development is under the strongest level of constraint.

Darwin’s theory of natural selection explains that mutations drive evolution. But the theory cannot predict what kinds of new body shapes new mutations will produce. Understanding how the activity levels of different genes affect development could help to fill this knowledge gap. This has potential medical applications, for example, understanding why some genetic changes cause more serious problems than others. This work suggests that mutations in genes that are active during the mid-stage of limb development may have the most serious impact.

Introduction

The genetic mechanism of morphological diversity among multicellular organisms is of central interest in evolutionary biology. In particular, our understanding of how morphological novelties are linked to the emergence of their respective genetic apparatuses is limited (Rebeiz and Tsiantis, 2017). In addition, it is still unclear to what extent internal constraints, such as pleiotropy, affect evolvability (Wagner and Zhang, 2011). The fin-to-limb transition is a classic, yet still influential, case study that contributes to our understanding of morphological evolution. In general, tetrapod limbs are composed of three modules, the stylopod, zeugopod, and autopod, which are ordered proximally to distally (Figure 1A). In contrast, fish fins are often subdivided into different anatomical modules along the anterior−posterior axis—the propterygium, mesopterygium, and metapterygium (Figure 1A). Although it is still controversial how this different skeletal arrangement compares with the archetypal tetrapod limb, the autopod (wrist and digits) seems to be the most apparent morphological novelty during the fin-to-limb transition (Clack, 2009). Despite intensive comparative studies of developmental gene regulation, genetic machinery that differs between fins and limbs remains elusive. Instead, several studies revealed that autopod-specific regulation of Hoxa13 and Hoxd10−13, which control autopod formation, is also conserved in non-tetrapod vertebrates (Davis et al., 2007; Freitas et al., 2007; Schneider et al., 2011), except that the expression domains of Hoxa13 and Hoxa11 are mutually exclusive in mouse and chick limbs while overlapping in examined fish fin buds (note that axolotl limbs also exhibit such fish-like overlap of these expression domains; Ahn and Ho, 2008; Metscher et al., 2005; Sakamoto et al., 2009; Woltering et al., 2019). Although several gene regulatory differences have been proposed to explain the anatomical difference between fins and limbs, these proposals have been exclusively focused on Hox genes (Kherdjemil et al., 2016; Nakamura et al., 2016; Sheth et al., 2012; Woltering et al., 2014). Therefore, a genome-wide systematic study is required to identify the genetic differences between fish fins and tetrapod limbs.

Figure 1 with 8 supplements see all
Transcriptome analysis and orthology assignment.

(A) The skeletal patterns of a mouse limb (top) and a bamboo shark pectoral fin (bottom). Anterior is to the top; distal is to the right. (B) Mouse forelimb buds and bamboo shark pectoral fin buds that were analyzed by RNA-seq. (C) Comparison of the accuracy of three orthology assignment methods. Vertical axis, the percentages of correctly assigned Hoxa and Hoxd paralogs (black bars) and Fgf paralogs (white bars). (D) Heat map visualization of the transcription profile of Hoxa and Hoxd genes in mouse limb buds (left) and bamboo shark fin buds (right) with scaled TPMs.

There have been several difficulties that limit genetic comparisons between tetrapods and non-tetrapod vertebrates. For example, whereas zebrafish and medaka are ideal models for molecular studies, their rapid evolutionary speed and a teleost-specific whole-genome duplication hinder comparative analyses with tetrapods at both the morphological and genetic levels (Ravi and Venkatesh, 2008). This obstacle can be circumvented by using more slowly evolving species such as spotted gar, coelacanths, and elephantfish (also known as elephant shark, a cartilaginous fish that is not a true shark) with their genome sequences that have not experienced recent lineage-specific genome duplications and thus facilitate the tracing of the evolution of gene regulation (Amemiya et al., 2013; Braasch et al., 2016). However, the major disadvantage of these slowly evolving species is the inaccessibility of developing embryos. In contrast, although the eggs of sharks and rays (other slowly evolving species; Hara et al., 2018) are often more accessible, their genomic sequence information has not been available until recently. As a solution for these problems, this study used embryos of the brownbanded bamboo shark (referred to hereafter as the bamboo shark), because a usable genome assembly was recently published for this species (Hara et al., 2018). Importantly, its non-coding sequences seem to be more comparable with those of tetrapods than with teleosts (Hara et al., 2018). In addition, this species is common in aquariums and has a detailed developmental staging table, providing an opportunity to study embryogenesis (Onimaru et al., 2018). These unique circumstances of the bamboo shark enabled a comprehensive study to identify the genetic differences between fins and limbs.

In this study, to identify genetic differences between fins and limbs, we performed RNA sequencing (RNA-seq) analyses of developing bamboo shark fins and mouse limbs. Along with this transcriptomic comparison, we also generated an accurate orthology map between the bamboo shark and mouse. In addition, we applied an assay for transposase‐accessible chromatin with high‐throughput and chromatin accessibility analysis (ATAC-seq; Buenrostro et al., 2013) across a time series of mouse limb buds, which generated a high-quality data set the showing dynamics of open-chromatin regions (OCRs; putative enhancers) during limb development. We also analyzed the evolutionary conservation of sequences in these OCRs to gain insights into the gene regulatory changes during the fin-to-limb transition.

Results

Comparative transcriptome analysis

To compare the temporal dynamics of gene expression between bamboo shark fin and mouse limb development, we obtained RNA-seq data from a time series of growing fin and limb buds with three replicates (Figure 1B; Supplementary file 1 for the details of RNA-seq). We selected limb buds from embryonic day (E)9.5 to E12.5 mice because this is the period during which the major segments of the tetrapod limb—the stylopod, zeugopod, and autopod—become apparent. In particular, the presumptive autopod domain, which is a distinct structure in the tetrapod limb, is visually recognizable from E11.5. For the bamboo shark, we selected developing fin stages from as wide a time period as possible (Figure 1B). To perform fine-scale molecular-level comparison, we annotated its coding genes using BLASTP against several vertebrates (listed in the Materials and methods) and our custom algorithm. As a result, 16443 unique genes from 63898 redundant coding transcripts were annotated as orthologous to known genes of vertebrates, among which 13,005 genes were uniquely orthologous to mouse genes (Table 1 for details of the transcriptome assembly; Figure 1—figure supplement 1—3, Supplementary files 2 and 3 for gene annotations and Supplementary data for sequence information). The number of detected orthologs is reasonable when compared with other studies (e.g. Hao et al., 2020). The quality of the ortholog assignment, which was assessed by examining Hox and Fgf genes, showed that our custom algorithm is more accurate than other methods (Figure 1C; see Materials and Methods and Supplementary file 4 for details). Using this assembly for the bamboo shark and RefSeq genes for mice, the means and standard errors of the transcripts per million (TPM) values were calculated from three replicates (see Figure 1—figure supplement 4 for other normalization methods and Supplementary files 5 and 6 for the full list of TPM values). In addition, for most of the analyses, TPMs were scaled by setting the highest TPM in each gene of each species to ‘1’ (which we refer to as the Max one method) to capture temporal dynamics rather than absolute transcript amounts. Compared to using intact TPMs and other scaling methods, Max one is relatively sensitive to interspecific differences in dynamically regulated gene expression (see Materials and methods and Figure 1—figure supplements 5 and 6 for details).

Table 1
Assembly statistics of bamboo shark transcriptome.
CharacteristicBamboo shark transcriptomeBamboo shark gene model (Hara et al., 2018)
Total number of sequences22201534038
Total sequence length (bp)19554136736633751
Average length (bp)8801076
Maximum length (bp)18451108594
N count010208
L50247655666
N50 length (bp)20751749
Protein coding6389834038
Orthology detected4163318180
Unique orthologs1413914907
Unique orthologs without gene symbols18211780
Unique orthologs only in elephantfish826552
Sequences with no orthology2089215254
Orthologs with mouse genes1232613005

With this transcriptome data set and gene annotation, we first validated our data by analyzing the expression profiles of Hoxa and Hoxd genes. In mouse limb development, Hoxa and Hoxd genes undergo two phases of global regulation (Deschamps and Duboule, 2017). During the first phase, Hoxd genes are regulated by an enhancer group located 3ʹ of the entire HoxD cluster, and the Hoxd genes are sequentially upregulated from 3ʹ to 5ʹ. The outcome of this first phase helps to establish the arm and the forearm. During the second phase, enhancers located 5ʹ of the HoxD cluster start to activate expression of Hoxd10 to Hoxd13 in the presumptive autopod region (Hoxa genes are regulated in a similar manner; Deschamps and Duboule, 2017). As expected, we detected the two phases of Hoxd gene regulation in mouse limb transcriptomes; the expression levels of Hoxd1 to Hoxd8 were highest at E9.5 (the first phase regulation), and Hoxd11 to Hoxd13 were gradually upregulated later (the second phase regulation; Figure 1D). Interestingly, the expression levels of Hoxd9 and Hoxd10 were highest at E10.5, which probably represents the transitional state between the first and second global regulation (Andrey et al., 2013). A similar profile was observed for Hoxa genes (Figure 1D). As with mouse limb buds, we found similar phasic regulation of Hoxa and Hoxd genes in the bamboo shark fin transcriptome (Figure 1D), suggesting that these transcriptomic data cover comparable developmental stages between the two species at least with respect to Hox gene regulation.

The overall similarity in the temporal dynamics of Hox gene expression between the mouse limb bud and the bamboo shark fin bud is an expected result because the second phase of Hoxd gene regulation has been found to be conserved in the fins of many fish (Ahn and Ho, 2008; Davis et al., 2007; Freitas et al., 2007; Schneider et al., 2011; Tulenko et al., 2017). However, there are several differences that are worth noting. For example, in mouse limb buds, Hoxd11 and Hoxd12 expression was highest at E11.5, followed by further upregulation of Hoxd13 at E12.5 (Figure 1D). In contrast, in bamboo shark fin buds, these three genes reached their peak expression simultaneously at [stage (st)]31 (Figure 1D). This led us to investigate further whether the quantitative collinearity of 5ʹ Hoxd genes, where the expression of Hoxd13 is much higher than that of its neighboring Hoxd genes, whose transcription levels decrease with increasing distance from Hoxd13 (Montavon et al., 2008), is conserved in the bamboo shark fin buds. First, as a confirmation of the previous observation, we also found quantitative collinearity of Hoxd genes in our transcriptome data of mouse limb buds at E12.5 (Figure 1—figure supplement 7). However, the bamboo shark fin buds exhibited no clear relationship between the genomic loci and the expression levels of Hoxd genes at either st31 or st32 (Figure 1—figure supplement 7): Hoxd12 expression was highest among its neighbors. Hoxd9 showed the second highest expression, followed by Hoxd10 and Hoxd11, which had roughly identical levels of transcripts. Hoxd13 expression was lowest among these 5ʹ members. Given that quantitative collinearity is considered to be a consequence of the characteristic global regulation of the HoxD cluster in the mouse limb bud (Montavon et al., 2008), this result suggests that the bamboo shark fin bud may have a different mechanism for Hoxd gene regulation. Interestingly, a recent study also showed that the presumptive autopod domains of chick limb buds express nearly a same amount of Hoxd13 and Hoxd12 transcripts (Yakushiji-Kaminatsui et al., 2018), suggesting that quantitative collinearity is not a universal feature of fins and limbs, rather varies among species. Taken together, although the overall temporal dynamics of Hox gene expression are conserved between the mouse limb bud and the bamboo shark fin bud, some differences in the regulation of Hox genes may exist between species.

To investigate to what extent our bulk transcriptome data captured the processes of cellular differentiation, we also analyzed genes related to chrondrogenesis and myogenesis. As a result, we found that the chondrogenic pathway was at least partially conserved between bambooshark fin buds and mouse limb buds; the expression level of Sox9 and Runx3 (key transcription factors of chondrogenesis; Fowler and Larsson, 2020) increased relatively early, and that of Acan (a cartilage-specific proteoglycan; Fowler and Larsson, 2020) was upregulated later (Figure 1—figure supplement 8). In contrast, although Nog is known to be expressed in cartilaginous condensations in mouse limb buds (Brunet et al., 1998), we did not detect a Nog ortholog in either the fin transcriptome or the genome assembly of the bamboo shark. As for myogenesis, our transcriptome data captured both conserved and divergent myogenetic regulation: Pax3 (a marker of myogenic precursor cells) was downregulated over developmental time, and the MyoD gene family (Myog, Myod1, Myf5) took turns for further differentiation (Chal and Pourquié, 2017). In contrast, whereas mouse limb buds showed upregulation of three myosin genes (Myh3, Myh7, Myh8) at E12.5, we detected the upregulation of only Myh7 in bamboo shark fin buds. Again, we did not find Myh3 and Myh8 in either the transcriptome or the genome assembly of the bamboo shark. These results suggest that our transcriptome data, even though based on bulk sampling of RNA, can reveal conserved and diverged cellular differentiation processes.

Heterochronic gene expressions

Next, to find differences in gene regulation between the two species, we performed a gene-by-gene comparison of expression dynamics with hierarchical clustering (Figure 2A). To find potential candidate genes that contribute to the different morphologies between fins and limbs, we annotated genes with mouse mutant phenotypes (see Supplementary file 7 for the full list of genes, expression data, and annotation). The result showed that 6701 genes were significantly expressed in only one of these species (‘Fin-specific’ and ‘Limb-specific’ in Figure 2A; 3284 and 3417 genes, respectively). While the fin-specific gene group consisted of many uncharacterized genes, it included ones that are known to control only fish fin development (Fischer et al., 2003; Zhang et al., 2010), such as And1 (TRINITY_DN62789_c1_g1_i3 in Supplementary data; ortholog of a coelacanth gene, XP_015216565) and Fgf24 (TRINITY_DN92536_c7_g1_i2 in Supplementary data; ortholog of a coelacanth gene, XP_006012032). In the limb-specific gene group, several interesting genes were listed that exhibit abnormal phenotype in the mouse limb (e.g. Bmp2, Ihh, and Megf8). However, the number of these species-specific genes is probably unreliable and overestimated because these groups also contain genes for which their orthology was not assigned correctly. We also detected 1884 genes that were upregulated during late stages of fin/limb development for both species, including genes that are well known to be expressed later during fin/limb development, such as the autopod-related transcription factors Hoxd13 and Hoxa13 and differentiation markers Col2a1 and Mef2c (‘Conserved, late1 and Conserved, late2’ in Figure 2A). Intriguingly, 5388 genes exhibited heterochronic expression profiles; their expression levels were highest during the late stages of mouse limb bud development but were relatively stable expression throughout fin development (‘Heterochronic1’; 3178 genes) or decreased during the late stages of fin development (‘Heterochronic2’; 2223 genes; see Supplementary file 7 for the full list of genes and annotations). For validation, we examined the spatio-temporal expression pattern of three heterochronic genes that exhibit limb abnormality in mouse mutants, Aldh1a2 from Heterochronic1 and, Hand2 and Vcan from Heterochronic2. Aldh1a2 is upregulated in the interdigital web of mouse limb buds from E11.5 (Figure 2—figure supplement 1A) and known to positively regulate interdigital cell death (Kuss et al., 2009). On the other hand, in bamboo shark fin buds, Aldh1a2 expression was initially uniform and was later restricted to the distal edge of fin buds (Figure 2—figure supplement 1A). Hand2 and Vcan transcripts were upregulated in mouse forelimb buds at E12.5 and downregulated in bamboo shark fin buds at st32 (Figure 2B,C). Thus, the temporal transcriptomic profiles were consistent with spatial expression patterns.

Figure 2 with 1 supplement see all
Detection of heterochronic gene expression between mouse limb buds and bamboo shark fin buds.

(A) Clustering analysis of gene expression dynamics. Each column represents an ortholog pair between the bamboo shark and the mouse. Each row indicates scaled gene expression at a time point indicated to the right of the heat map. Values are scaled TPMs. (B, C) Whole-mount in situ hybridization of Hand2 (B) and Vcan (C) as examples of the heterochronic genes detected in (A). Asterisks, background signals; scale bars, 200 μm. Error bars: SEM.

For a comparison, we found relatively few genes that were downregulated over time in the mouse limb bud but were upregulated in the shark fin. There was a total of 241 such genes, but only 43 of them displayed a clear heterochrony (yellow empty box in Figure 2—figure supplement 1B and Supplementary file 8 for the list of the genes). Of those, Fgf8 is particularly interesting as FGF8 plays a crucial role as a growth signal from the apical ectodermal ridge (AER) in mouse and chick limb buds (Lewandoski et al., 2000). As shown in Figure 2—figure supplement 1C, Fgf8 expression was high during the early stages of limb buds and was gradually downregulated at later stages. In contrast, in bamboo shark fin buds, Fgf8 was expressed very weakly (around 0.1 TPM) at st. 27 and st. 27.5 and was upregulated at later stages. Indeed, this late upregulation of Fgf8 was also reported in the apical fin fold (roughly equivalent to the AER) of zebrafish pectoral fin buds (Nomura et al., 2006). In the zebrafish pectoral fin bud, Fgf16 and Fgf24 are upregulated earlier than Fgf8 (Draper et al., 2003; Nomura et al., 2006). In addition, Fgf4, Fgf9, and Fgf17 are expressed in the AER and have a redundant function in the mouse limb bud (Mariani et al., 2008). Therefore, we also examined these other Fgf genes and found that moderate expression of Fgf9, Fgf16, and Fgf24 were detected in the early stages of bamboo shark fin buds (Figure 2—figure supplement 1C). Although we cannot infer the ancestral state of the expression pattern, the overlapping functions of these genes may have allowed subfunctionalization of the signaling molecules of the AER during vertebrate divergence. In sum, we detected mass heterochronic shifts in gene expression between bamboo shark fin buds and mouse forelimb buds. In particular, a mechanism to maintain upregulation of the expression of genes involved in early fin development may have been either gained in the tetrapod lineages or lost in the cartilaginous fish lineages.

Comparison of SHH signaling pathways in limb and fin buds

In tetrapod limbs, SHH controls growth and asymmetric gene expression along the anterior-posterior axis. Although previous studies have repeatedly implied a relatively delayed onset of Shh expression or a short signal duration in developing fins of several elasmobranch species (Dahn et al., 2007; Sakamoto et al., 2009; Yonei-Tamura et al., 2008), there has not been solid evidence to support such a delay due to the lack of systematic gene expression analysis and the poor staging system of these species. Because the heterochronic genes identified above include basic SHH target genes, such as Ptch1 and Gli1, we reexamined the expression dynamics of Shh and its target genes in mouse limb and bamboo shark fin buds. Because HOX genes are the upstream factors relative to Shh transcription (Zeller et al., 2009), we used them as a potential reference for developmental time. We first found that Shh transcription was present by the earliest stages examined in both bamboo shark fin and mouse limb buds, and it peaked when the transcription level of Hoxd9 and Hoxd10 was highest, suggesting that there was no apparent heterochrony in Shh transcription timing at least between these two species (red rectangles in Figure 3A and B). In contrast, SHH target genes, such as Ptch1/2, Gli1, Gremlin, and Hand2 (Vokes et al., 2008), did show a relatively extended period of expression in mouse limb buds as compared with their expression in bamboo shark fin buds. Namely, whereas the expression peak of SHH target genes was concurrent with that of Shh in the bamboo shark fin bud, these SHH target genes were highly expressed in E11.5 limb buds, which is one day later than the Shh expression peak (yellow rectangles in Figure 3A and B; see Figure 3—figure supplement 1 for intact TPM values). This timing difference is also apparent when comparing the expression peak of Hoxd11 and Hoxd12, which was concurrent with that of SHH target genes in mouse limb buds, but came after downregulation of SHH target genes in bamboo shark fin buds (green rectangles in Figure 3A and B). To confirm this observation, we performed whole-mount in situ hybridization for Ptch1 and Hoxd12 in mouse limb buds and bamboo shark fin buds. As previously reported (Lewis et al., 2001; Zákány et al., 2004), mouse limb buds showed a clear expansion of the expression domain of Ptch1 (upper panel in Figure 3C) from E10.5 to E11.5, which is accompanied by the anterior extension of the Hoxd12 expression domain (black arrowheads in Figure 3C). In contrast, Ptch1 was expressed in the posterior domain of bamboo shark fin buds at st. 29 (white arrowheads in Figure 3D), but was substantially downregulated by st. 31, whereas the Hoxd12 expression domain extended anteriorly at this stage (black arrowheads in Figure 3D). These results were roughly consistent with the RNA-seq data. We cannot completely reject the possibility that this timing difference is due to the different physical time-resolution of data sampling between these species (six time points over 20 days in the bamboo shark and four time points over 4 days in the mouse). However, given that this data set captured the similar expression dynamics of HoxA/D clusters between these species (Figure 1D; also see Figure 4C) as well as the differentiation dynamics of myocytes and chondrocytes (Figure 1—figure supplement 8), these results quite likely represent an interesting difference in the transcriptional regulation of SHH downstream genes between fins and limbs.

Figure 3 with 1 supplement see all
Shh pathway in mouse limb buds and bamboo shark fin buds.

(A, B) Scaled expression of Shh and related genes in mouse limb buds (A) and bamboo shark fin buds (B), respectively. The rectangles indicate the expression peaks of Shh, Hoxd9, and Hoxd10 (magenta), Shh target genes (yellow) and Hoxd11 and Hoxd12 (green). (C, D) Whole-mount in situ hybridization of Ptch1 and Hoxd12 in mouse limb buds (C) and bamboo shark fin buds (D); scale bars, 200 μm. White arrowheads in (D) indicate restricted expression of Ptch1 in bamboo shark fin buds. Black arrowheads in (C and D) indicate anteriorly extended expression of Hoxd12.

Figure 4 with 2 supplements see all
Hourglass-shaped conservation of the transcriptome profile between fins and limbs.

(A) Euclidean distances of the transcriptome profiles. Every combination of time points of bamboo shark fin buds and mouse limb buds is shown. The darker colors indicate a greater similarity between gene expression profiles. (B) A line plot of the Euclidean distances shown in (A). The x axis indicates the mouse limb stages, and the y axis is the Euclidean distance. (C) The same as (A) except that only Hoxd genes are included. (D, E) Scatter plots of the first and second principal components (D) and of the second and third components (e). Arrows in (E) indicate the time-order of transcriptome data. (F) Count of tissue-associated genes expressed in mouse forelimb buds. Genes with 0.65 ≤ entropy were counted.

Hourglass-shaped conservation

Several studies have reported a temporally heterogeneous diversification of embryonic transcriptomes, such that the middle stages are more conserved than early or late stages (e.g. Irie and Kuratani, 2011; Kalinka et al., 2010; Levin et al., 2012). These observations are considered to support the notion of the developmental hourglass (or egg timer), which has been proposed to explain the morphological similarity of mid-stage embryos based on developmental constraints, such as strong interactions between tissues or Hox-dependent organization of the body axis (Duboule, 1994; Raff, 1996). In addition, a previous transcriptomic analysis reported that the late stage of mammalian limb development has experienced relatively rapid evolution (Maier et al., 2017). To examine which developmental stages of fins and limbs are conserved, we calculated the distance between the fin and limb transcriptome data. As a result, four different distance methods that we examined consistently indicated that the limb bud at E10.5 and the fin buds at st27.5–30 tended to have a relatively similar expression profile (Figure 4A for a Euclidean distance measure and Figure 4—figure supplement 1 for other types of distance measures). In addition, the transcriptomic profile of all the stages of examined fin buds showed the highest similarity to that of E10.5 limb bud (Figure 4B). Therefore, the mid-stages of limb and fin buds tend to be conserved over 400 million years of evolution.

To find factors that underlie the mid-stage conservation, we analyzed Hox genes, which were proposed to be responsible for the developmental hourglass (Duboule, 1994). We found that the comparison of only Hox gene expression did not reproduce the hourglass-shaped conservation (Figure 4C), suggesting that other mechanisms constrain the middle stage of development. We further performed principal component analysis (PCA) of gene expression profiles to identify genes responsible for the hourglass-shaped conservation. The first component, PC1, distinguished transcriptome data mostly by species differences (Figure 4D). In contrast, PC2 was correlated with the temporal order of mouse limb buds (Figure 4D). PC2 was also weakly correlated with the temporal order of bamboo shark fin buds except at st27 (Figure 4D), but PC3 showed a clearer correlation (Figure 4E). These three components were mostly sufficient to reproduce the mid-stage conservation in Figure 4A (Figure 4—figure supplement 2A for the ratio of explained variables and 2B for the Euclidean distance measure). Interestingly, the plot with PC2 and PC3 roughly mirrored the hourglass-shaped conservation because the earliest and latest stages were placed more distantly than the middle stages in this representation (Figure 4E). Indeed, the major loadings of PC2 consisted of the conserved late expressed genes (C8) and the heterochronically regulated genes (C9 and C12) identified in Figure 2A (see Table 2 for the top 25 genes of PC2). Similarly, PC3 consisted of the conserved early genes (a part of C15) and the heterochronically regulated genes (C12 and C13; see Supplementary file 9 for the loadings of PC3 and others), suggesting that the presence of heterochronically regulated genes may at least partly contribute to the mid-stage conservation and the distant relationship between the early/late stages of fins and limbs. These results indicate that the mass heterochronic shift in gene expression, at least in part, contributes to the long distances between early- and late-stage expression profiles (Figure 4E).

Table 2
PCA loadings.
Loading axis: PC2
Gene symbolCluster nameLoading
TRHDEC80.31
PAX9C110.3
COL9A2C80.3
RTN4RC80.3
APC2C90.3
CNMDC80.29
HOXD13C80.29
FAM69CC80.29
WFIKKN2C80.29
HOXA13C80.29
LRRN3C120.29
HPSE2C90.29
SERPINB1AC110.29
CDKN2BC80.28
LTBP1C80.28
CDH19C80.28
PDZD2C80.28
NLGN3C90.28
MATN1C80.28
MYOD1C80.28
TSPAN11C120.28
SERINC2C90.28
FYBC80.28
KIF1AC80.28
COL9A3C80.28

Because a recent report suggests that pleiotropy of genes is related to hourglass-shaped conservation (EXPANDE Consortium et al., 2017), we counted the number of genes with stage- or tissue-specific expression. Consistent with the previous report (EXPANDE Consortium et al., 2017), we detected a relatively low number of stage-associated genes during the middle stages of mouse forelimb and bamboo shark fin development (Figure 4—figure supplement 2C). To evaluate the tissue specificity of genes, we first calculated Shannon entropy of gene expression patterns by analyzing RNA-seq data from 71 mouse tissues as released by the ENCODE project (Davis et al., 2018; Supplementary file 10 for the list of RNA-seq data). Namely, genes expressed only in a few tissues score lower with respect to entropy (thus, these genes are more specific). We counted genes with 1.0 ≥ TPM and 0.65 ≤ entropy and, again, found that the number of tissue-associated genes was relatively low at E10.5 (Figure 3F). Together, these results indicate an inverse correlation between the hourglass-shaped conservation and the number of tissue- and stage-specific genes.

Open chromatin region (OCR) conservation

Next, we systematically identified putative gene regulatory sequences involved in mouse limb development and sought a possible cause for the hourglass-shaped conservation in gene regulatory sequences. To this end, we applied ATAC-seq, which detects OCRs (putative active regulatory sequences), to time-series of forelimb buds at E9.5–E12.5 with three replicates. First, as a positive control, we found that ATAC-seq peaks that were determined by MACS2 peak caller covered 10 of 11 known limb enhancers of the HoxA cluster (Figure 5A and Figure 5—figure supplement 1), suggesting a high coverage of true regulatory sequences. Consistently, our ATAC-seq data showed relatively high scores for a quality control index, fraction of reads in peaks (FRiP), as compared with data downloaded from the ENCODE project (Davis et al., 2018; Figure 5B). Next, to examine evolutionary conservation, we performed BLASTN (Camacho et al., 2009) for the sequences in the ATAC-seq peaks against several vertebrate genomes. Reinforcing the result of the transcriptome analysis, we found that evolutionarily conserved sequences were most accessible at E10.5 (Figure 5C). To confirm this result, we also used a different alignment algorithm, LAST (Kiełbasa et al., 2011) with the bamboo shark and the alligator (Green et al., 2014) genomes. Alignment results for both analyses consistently indicated that the OCRs of E10.5 forelimb bud more frequently contained conserved sequences relative to those of other time points (Figure 5D; see Figure 5—figure supplement 2A and B for the absolute counts of conserved sequences). Therefore, activation of conserved gene regulatory sequences may be one of the proximate causes for the hourglass-shaped conservation of fin and limb transcriptome data.

Figure 5 with 2 supplements see all
Hourglass-shaped conservation of OCRs in mouse limb development.

(A) ATAC-seq signals in the enhancer regions of the HoxA cluster. e1 to e4, known limb enhancers. Green vertical lines below the signals, peak regions determined by MACS2. (B) Comparison of a quality index, FRiP, for ATAC-seq data. Blue bars are samples with a FRiP score >0.2. The number in the end of the label name indicates the replicate number. (C) Conservation analysis of sequences in ATAC-seq peaks with BLASTN. The values to the right of each graph indicate the fraction of conserved sequences in the total peak regions. The common name of each genome sequence is indicated above the graph. The not-conserved heatmap indicates the fraction of sequences that were not aligned to any genome sequences and thus serve as a negative control. (D) Temporal changes of sequence conservation frequency in ATAC-seq peaks with LAST. Error bars: SEM.

Temporal dynamics of open chromatin domains

To further characterize the ATAC-seq peaks, we next performed a clustering analysis. Using one of the three replicates for each stage, we collected the summits of peaks and the surrounding 1400 bp and carried out hierarchical clustering, which resulted in eight clusters (C1–C8; Figure 6A) that consisted of broad (C1 and C2), E11.5/E12.5-specific (C3 and C4), stable (C5 and C6), E10.5-specific (C7), and E9.5-specific (C8) peaks. The overall clustering pattern was reproducible by other combinations of replicates if its FRiP was ≥0.20 (Figure 6—figure supplement 1). Consistent with the above conservation analysis, E10.5-specific peaks frequently overlapped conserved sequences (Figure 5—figure supplement 2C and D).

Figure 6 with 5 supplements see all
Temporal dynamics of OCRs during mouse limb development.

(A) The heatmap (left) shows whole-genome clustering of ATAC-seq peaks. Each row indicates a particular genome region with a length of 1400 bp. Columns indicate developmental stages. C1−C8 are cluster numbers. The motifs (right) show the rank of enriched motifs in the sequences of each cluster. (B) Top, volcano plots of ATAC-seq signals between indicated stages (p-values, two-sided Student’s t-test). Bottom, the counts of differential signals (black dots in the top panel). + and − are genomic regions with increased or decreased signals, respectively. (C) The fraction of limb-specific OCRs for each cluster.

To characterize the regulatory features of the clusters, we performed motif analysis in each cluster using HOMER (Heinz et al., 2010). First, it was convincing that stable peaks (C5 and C6) were enriched for the CTCF binding motif both in de novo motif discovery (Figure 6A) and known motif enrichment analysis (Figure 6—figure supplement 2), which is a major regulator of three-dimensional genomic structure. This result was consistent whether random genomic regions or other peak regions were used for the background (Figure 6—figure supplement 3). In addition, E11.5/E12.5-specific peak C3 was enriched for the HOX13 motif (Figure 6A), which was consistent with the increase in the expression of 5ʹ Hox genes (Figure 1D). C4 was also enriched for motifs similar to those of C3, but the HOX13 motif was detected only in known motif enrichment analysis (compare Figure 6—figure supplements 2 and 3). The enrichment of the HOX9 motif in E10.5-specific peaks (C7) was also consistent with our RNA-seq data, in which Hoxd9 and Hoxa9 expression levels peaked at E10.5 (Figure 1D). Interestingly, in E10.5-specific peaks (C7), the LHX1-binding motif was ranked at the top of the motif enrichment list the closely related transcription factors Lhx2, Lhx9, and Lmx1b are required to mediate a signaling feedback loop between ectoderm and mesenchyme in limb development (Tzchori et al., 2009). C8 was enriched for motifs similar to those in C7 (e.g. COUP-TFII), but the top-ranked transcription factor in the de novo motif discovery analysis was VSX2, which has a very similar binding sequence to the LHX motif (Figure 6—figure supplement 4). The LHX motif was top-ranked in C8 for the known motif enrichment analysis (Figure 6—figure supplement 2). For a better understanding of the dynamics of transcription factor motifs, we counted the average number of the above detected motifs within the OCRs of each stage, which revealed a transitional increase in LHX and HOX9 motifs at E10.5 and a gradual increase in the motifs detected in C3 over the developmental stages (Figure 6—figure supplement 3). In addition, Gene Ontology (GO) analysis for the peaks in each cluster revealed that the constitutively accessible peaks (C5, C6) were closely located to genes annotated with ‘cellular components’ (Supplementary file 11). Interestingly, the dynamically regulated peaks (C3, C4, C7, C8) were associated with genes with ‘developmental process’, ‘multicellular organism development’, and ‘anatomical structure morphogenesis’ (Supplementary file 11), suggesting that these dynamic OCRs regulate developmental genes. Together, these results suggest that there are E10.5-specific transient OCRs that exhibit several characteristics including their evolutionary conservation, the presence of LHX and HOX9 motifs and a close relation with developmental genes.

To confirm the results from the above clustering analysis, we also determined the genomic regions that showed a statistically significant increase or decrease in the ATAC-seq signal within a day by using all replicates. As a result, ATAC-seq signals were most increased during the transition from E9.5 to E10.5 in the mouse limb bud. From E10.5 to E11.5, the total number of decreased and increased signals was highest, indicating that the OCR landscape was most dynamically changing at E10.5 (Figure 6B). In contrast, relatively few significant changes were observed from E11.5 to E12.5. Thus, in contrast to the transcriptome analysis, stage-specific gene regulatory sequences are likely to be most accessible at E10.5. Moreover, by comparing the peaks of each cluster identified above with ATAC-seq peaks of other cells and tissues released by the ENCODE project (Davis et al., 2018; Supplementary file 10 for the full list of cells and tissues), we discovered that the C7 cluster (E10.5-specific peaks) contained more peaks that did not overlap with those of other cells and tissues. Again, in contrast to the transcriptome analysis, the data suggest that gene regulatory sequences that are accessible only at E10.5 tend to be limb-specific (Figure 6C). Taken together, these analyses revealed a unique regulatory landscape of forelimb buds at E10.5, which is enriched for evolutionarily conserved stage-specific and tissue-specific OCRs.

Discussion

In this work, we applied transcriptomics and chromatin accessibility analysis to systematically study genetic changes that differentiate fins from limbs. Because of the slow sequence evolution and the embryo availability of the bamboo shark, we were able to compare transcriptional regulation of genes with high accuracy and found both heterochronic shifts and hourglass-shaped conservation of transcriptional regulation between fin and limb development. Here, we discuss the interpretations, limitations, and implications of these results.

Our time-series transcriptome data indicated that a remarkable number of genes that exhibit the highest expression during the late stages of mouse limb bud development are decreased during the late stages of bamboo shark fin development (Figure 2). The simplest hypothesis for this mass heterochronic shift is that the later stages of limb development gained expression of one or a few upstream transcription factor(s) or signaling molecules that collectively regulate this group of genes. Interestingly, we also observed relatively extensive expression of the downstream targets of the SHH signaling pathway in mouse limb buds, as compared with bamboo shark fin buds (Figure 3). Because SHH-independent regulation of its target genes through the GLI3-HOX complex was previously reported (Chen et al., 2004), the mismatch between the peak expression of Shh and its target genes may be caused by such SHH-independent regulatory mechanisms that are absent in bamboo shark fin development. Given that direct and genetic interactions of GLI3 and HOX have a significant impact on autopod formation, the emergence of this interaction may be a key component of the mass heterochronic shift and the acquisition of autopod-related developmental regulation in the tetrapod lineages. However, because we compared only two species, it is equally possible that the late stages of shark fin development lost this SHH-independent gene regulation. Alternatively, given that the evolutionary distance between these two species is >400 million years, it is also possible that every one of these genes independently shifted their expression to the later stages of limb development or to the early stages of shark fin development. Further taxon sampling and functional analyses will reveal the relation between the mass heterochronic shift and the emergence of the autopod.

Related to the potential changes in regulation of SHH target genes, by analyzing catshark fin buds, we previously proposed that the expression domains of genes that are positively regulated by SHH might have expanded anteriorly during the fin-to-limb transition (Onimaru et al., 2015). We speculated that this expression changes may be linked to the loss of pro- and mesopterygial elements. Recently, this hypothesis was partially supported by another group who compared the gene expression pattern of lungfish and cichlid fin development (Woltering et al., 2020), where lungfish fin buds seem to exhibit an intermediate condition between non-sarcopterygian fish fins and tetrapod limbs in terms of gene expression distribution along the anterior-posterior axis. This group particularly emphasized that the absence and presence of the dynamics of the anterior expansion of Hoxd13 expression correlate with the difference between the metapterygial morphologies of lungfish and tetrapods (also see Johanson et al., 2007 for a conflicting report). However, the significance of changes in Hoxd13 expression remains unclear because of the following two reasons: (a) Hoxd13 expression pattern seems to quite vary among species—the anterior expansion of Hoxd13 expression has been observed in the fin buds of the little skate, the small-spotted catshark, and Polyodon (Davis et al., 2007; Freitas et al., 2007; Nakamura et al., 2015), while not in those of zebrafish and cichlids (Ahn and Ho, 2008; Woltering et al., 2020) and (b) in fish fins, the expression domain of Hoxa13, whose function is mostly redundant with Hoxd13, commonly spans from anterior to posterior regions in fish fin buds like as tetrapod limbs (Davis et al., 2007; Freitas et al., 2007; Nakamura et al., 2016). Therefore, while changes in Hoxd13 expression domain are likely to contribute to some degree of anatomical diversity, their impact is questionable in the context of the fin-to-limb transition. Nevertheless, our previous study and Woltering et al. commonly suggest that the anterior expansion of gene expression domains is likely associated with the substantial anatomical changes during the fin-to-limb transition. As discussed above, we speculate that the mass heterochronic shifts that we observed in the present study may be related to the gain of SHH-independent regulation of its target genes. Therefore, whether the anterior expansion of SHH-target gene expression is related to the mass heterochronic shifts will be one of the interesting questions to address in the future.

We observed that gene expression profiles are most highly conserved between bamboo shark fin buds at st. 27.5–30 and mouse forelimb buds at E10.5 (Figure 4). Consistent with this result, our chromatin accessibility analysis reveals that OCRs at E10.5 tend to contain evolutionarily conserved sequences (Figure 5). Whereas transcriptomic conservation during the middle of embryonic development has been reported by many groups using different species (e.g. Irie and Kuratani, 2011; Kalinka et al., 2010), analysis of regulatory sequence conservation during embryonic development has been either incomplete or controversial. For example, by analyzing histone acetylation marks on several developing organs in mouse embryos, Nord et al., 2013 proposed regulatory sequences active at E11.5 are exposed by the highest evolutionary constraint. However, they used stem cell lines as the substitutes for organs at early stages. Another study showed that genes expressed at the segmentation stage of zebrafish embryos tended to be surrounded by highly conserved non-coding sequences (Piasecka et al., 2013). Although their results are in line with our present study as discussed below, they did not show that these highly conserved non-coding sequences were indeed active at the segmentation stage. In addition to these studies, there is a conflicting observation that early, instead of middle, embryonic stages tend to be regulated by conserved OCRs (Uesaka et al., 2019). Therefore, our present study is the first to convincingly show a clear correlation of conservation status between transcriptomic data and OCRs. Our results suggest that evolutionary constraints on the gene regulatory apparatus are present during the middle stage of fin and limb development. What drives the hourglass-shaped conservation is still under debate. Interestingly, we found that stage- and tissue-specific OCRs were enriched in this conserved period, during which a relatively low number of stage- and tissue-specific genes were expressed (Figure 6). These quite contrasting observations imply that the mid-stage limb development is enriched for pleiotropic genes controlled by multiple tissue-specific enhancers, including limb-specific ones, rather than by constitutive promoters that often regulate housekeeping genes. Therefore, we speculate that, at least in the case of limb development, complex regulatory sequences that execute spatiotemporally specific transcriptional controls over pleiotropic genes constrain the evolvability of this particular period of morphogenesis, probably due to the vulnerability of complex regulation to genetic mutations.

In conclusion, the present study provides insights for the evolutionary origin of gene regulation that differentiates fins from limbs. In particular, comparative transcriptional analyses prompted us to hypothesize that mass heterochronic shifts of gene expression may have occurred during the fin-to-limb evolution. In addition, both transcriptome and open chromatin data point to an evolutionary constraint during mid-stage limb development, likely owing to gene regulatory complexity. Although these hypotheses require further taxon sampling and experimental tests, this study opens up new prospects for understanding not only the genetic basis of the fin-to-limb transition but also the general nature of morphological evolution.

Materials and methods

Key resources table
Reagent type (species) or resourceDesignationSource or referenceIdentifiersAdditional information
Gene (Mus musculus)Hand2ENSEMBLENSMUST00000040104.4N/A
Gene (Mus musculus)VcanENSEMBLENSMUST00000109546.8N/A
Gene (Mus musculus)Aldh1a2ENSEMBLENSMUST00000034723.5N/A
Gene (Mus musculus)Ptch1ENSEMBLENSMUST00000192155.5N/A
Gene (Mus musculus)Hoxd12ENSEMBLENSMUST00000001878.5N/A
Gene (Chiloscyllium punctatum)Hand2ENSEMBLChipun0004250/g4250.t1/ TRINITY_DN85524_c0_g1_i1N/A
Gene (Chiloscyllium punctatum)VcanThis paperChipun0003941/g3941.t1/ TRINITY_DN95522_c0_g1_i8N/A
Gene (Chiloscyllium punctatum)Hoxd12This paperChipun0005654/g5654.t1/TRINITY_DN85970_c0_TRINITYg1_i1N/A
Gene (Chiloscyllium punctatum)Ptch1This paperChipun0003320/g3320.t1/TRINITY_DN92499_c0_g1_i3N/A
Gene (Chiloscyllium punctatum)Aldh1a2This paperChipun0010503/g10503.t1/TRINITY_DN81423_c0_g1_i1N/A
Strain, strain background (Mus musculus)C52BL/6Laboratory for Animal Resources and Genetic Engineering RIKEN,N/AN/A
AntibodyAnti-Digoxigenin-AP, Fab fragments (Sheep)Millipore SigmaCat# 11093274910polyclonal
(1:4000)
Sequence-based reagentMus musculus Hand2 forward primerThis paperPCR primersACCAAACTCTCCAAGATCAAGACACTG
Sequence-based reagentMus musculus Hand2 reverse primerThis paperPCR primersTTGAATACTTACAATGTTTACACCTTC
Sequence-based reagentMus musculus Vcan forward primerThis paperPCR primersTGCAAAGATGGTTTCATTCAGCGACAC
Sequence-based reagentMus musculus Vcan reverse primerThis paperPCR primersACACGTGCAGAGACCTGCAAGATGCTG
Sequence-based reagentMus musculus Aldh1a2 forward primerThis paperPCR primersACCGTGTTCTCCAACGTCACTGATGAC
Sequence-based reagentMus musculus Aldh1a2 reverse primerThis paperPCR primersTCTGTCAGTAACAGTATGGAGAGCTTG
Sequence-based reagentMus musculus Hoxd12 forward primerThis paperPCR primersCTCAACTTGAACATGGCAGTGCAAGTG
Sequence-based reagentMus musculus Hoxd12 reverse primerThis paperPCR primersAGCTCTAGCTAGGCTCCTGTTTCATGC
Sequence-based reagentMus musculus Ptch1 forward primerThis paperPCR primersGGGAAGGCAGTTCATTGTTACTGTAACTG
Sequence-based reagentMus musculus Ptch1 reverse primerThis paperPCR primersTGTAATACGACTCACTATAGGTCAGAAGCTGCCACACACAGGCATGAAGC
Sequence-based reagentChiloscyllium punctatum Hand2 forward primerThis paperPCR primersACCAGCTACATTGCCTACCTCATGGAC
Sequence-based reagentChiloscyllium punctatum Hand2 reverse primerThis paperPCR primersCACTTGTTGAACGGAAGTGCACAAGTC
Sequence-based reagentChiloscyllium punctatum Vcan forward primerThis paperPCR primersAGCTTGGGAAGATGCAGAGAAGGAATG
Sequence-based reagentChiloscyllium punctatum Vcan reverse primerThis paperPCR primersAGAGCAGCTTCACAATGCAGTCTCTGG
Sequence-based reagentChiloscyllium punctatum Aldh1a2 forward primerThis paperPCR primersTTGAACTTGTACTAAGTGGTATCGCTG
Sequence-based reagentChiloscyllium punctatum Aldh1a2 reverse primerThis paperPCR primersAGGATGTGAACATTAGGCTGACCTCAC
Sequence-based reagentChiloscyllium punctatum Hoxd12 forward primerThis paperPCR primersGCCAGTATGCAACAGATCCTCTGATGG
Sequence-based reagentChiloscyllium punctatum Hoxd12 reverse primerThis paperPCR primersCTAATGACCTGTTGTACTTACATTCTC
Sequence-based reagentChiloscyllium punctatum Ptch1 forward primerThis paperPCR primersTTCAGCCAGATTGCAGATTACATCAACC
Sequence-based reagentChiloscyllium punctatum Ptch1 reverse primerThis paperPCR primersTTCTCTGTGTTTCACATTCAACGTCCTG
Commercial assay or kitNextera DNA Sample Preparation KitIlluminaCat# FC-121–1031
Commercial assay or kitTruSeq Stranded mRNA LT Sample Prep KitIlluminaCat# RS-122–2101
Software, algorithmTrinityhttps://github.com/trinityrnaseq/trinityrnaseqRRID:SCR_013048N/A
Software, algorithmBowtie2http://bowtie-bio.sourceforge.net/bowtie2/index.shtmlRRID:SCR_016368N/A
Software, algorithmBWAhttp://bio-bwa.sourceforge.net/RRID:SCR_010910N/A
Software, algorithmMACS2https://github.com/macs3-project/MACSRRID:SCR_013291N/A
Software, algorithmHOMERhttp://homer.ucsd.edu/homer/motif/RRID:SCR_010881N/A
Software, algorithmRSEMhttps://github.com/deweylab/RSEMRRID:SCR_013027N/A
Software, algorithmscikit-learnhttps://scikit-learn.org/stable/RRID:SCR_002577N/A

Animals

Animal experiments were conducted in accordance with the guidelines approved by the Institutional Animal Care and Use Committee (IACUC), RIKEN Kobe Branch, and experiments involving mice were approved by IACUC (K2017-ER032). The eggs of brownbanded bamboo shark (C. punctatum) were kindly provided by Osaka Aquarium Kaiyukan and were incubated at 25°C in artificial seawater (MARINE ART Hi, Tomita Pharmaceutical Co., Ltd.) and staged according to the published staging table (Onimaru et al., 2018). For mouse embryos, C52BL/6 timed-pregnant females were supplied by the animal facility of Kobe RIKEN, LARGE and sacrificed at different days after 9.5–12.5 days of gestation. For RNA-seq, fin buds and limb buds were dissected in cold seawater and phosphate-buffered saline (PBS), respectively, and stored at −80°C. For in situ hybridization, embryos were fixed overnight in 4% paraformaldehyde in PBS, dehydrated in a graded methanol series, and stored in 100% methanol at −20°C.

RNA-seq

Request a detailed protocol

We sampled mouse forelimb buds at E9.5, E10.5, E11.5 and E12.5 and bamboo shark pectoral fin buds at st27, st27.5, st29, st30, st31, and st32 and pooled several individual samples by stage to obtain enough RNA for each time point. We considered this pooled sample to represent one biological replicate (other replicates were generated using different individuals). Total RNAs from these samples were extracted with the RNeasy Micro and Mini plus kit (QIAGEN, Cat. No. 74034 and 74134) and PicoPure RNA Isolation Kit (ThermoFisher, Cat. No. KIT0214). Genomic DNA was removed with gDNA Eliminator columns included with this kit. For quality control, the Agilent 2100 Bioanalyzer system and Agilent RNA 6000 Nano Kit (Agilent, Cat. No. 5067–1511) were used to measure the RNA integrity number for each sample. Using 237 ng of each of the RNA samples, strand-specific single-end RNA-seq libraries were prepared with the TruSeq Stranded mRNA LT Sample Prep Kit (Illumina, Cat. No. RS-122–2101 and/or RS-122–2102). For purification, we applied 1.8× (after end repair) and 1.0× (after adapter ligation and PCR) volumes of Agencourt AMPure XP (Beckman Coulter, Cat. No. A63880). The optimal number of PCR cycles for library amplification was determined by a preliminary quantitative PCR using KAPA HiFi HotStart Real-Time Library Amplification Kit (KAPA, Cat. No. KK2702) and was estimated to be 11 cycles for mouse limb buds and 10 cycles for bamboo shark fin buds. The quality of the libraries was checked by Agilent 4200 TapeStation High Sensitivity D1000. The libraries were sequenced after on-board cluster generation for 80 cycles using 1× HiSeq Rapid SBS Kit v2 (Illumina, Cat. No. FC-402–4022) and HiSeq SR Rapid Cluster Kit v2 (Illumina, Cat. No. GD-402–4002) on a HiSeq 1500 (Illumina) operated by HiSeq Control Software v2.2.58 (Run type: SR80 bp). The output was processed with Illumina RTA v1.18.64 for base-calling and with bcl2fastq v1.8.4 for de-multiplexing. Quality control of the obtained fastq files for individual libraries was performed with FASTQC v0.11.5. RNA-seq was performed with three biological replicates for each stage.

Transcriptome assembly and orthology assignment

Request a detailed protocol

We used the NCBI RefSeq mouse proteins (GRCm38.p5; only curated proteins were used) and two bamboo shark gene lists: a genome sequence-based gene model (Hara et al., 2018) and transcripts assembled from RNA-seq in this study (see below) for orthology assignment. The amino acid sequences of the published gene model of the bamboo shark are available from https://doi.org/10.6084/m9.fig (Supplementary file 1). For the transcriptome assembly, the short reads from the bamboo shark RNA-seq data were trimmed and filtered with Trim Galore! (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) and assembled using Trinity v2.4.0 (Grabherr et al., 2011; options: --SS_lib_type RF --normalize_max_read_cov 200 min_kmer_cov 2). Protein coding sequences were predicted with a program that finds coding regions, TransDecoder v3.0.1 (Haas et al., 2013), according to the guide in TransDecoder (Supplementary file 2 and 3). Using these coding gene lists as queries, orthologous pairs were assigned as illustrated in Figure 1—figure supplement 1. The idea behind this algorithm is the ‘gar bridge’ (Braasch et al., 2016), an empirical observation that a comparison including intermediate and slowly evolving animals yields a better resolution for identifying homologous sequences than a direct comparison between two species. First, BLASTP v2.7.1 was performed between mouse and bamboo shark genes reciprocally, and also against the coding genes of the elephantfish (or elephant shark; Callorhinchus_milii-6.1.3), spotted gar (LepOcu1), coelacanth (LatCha1), chicken (GRCg6a), alligator (ASM28112v4), and human (GRCh38.p12; options: -outfmt 6 -evalue 1e-30 -window 0). Then, the BLASTP results of bamboo shark queries against the animals listed above (except for the elephantfish) were concatenated, and the best hit across species (cross-species best hit) was identified for each of the bamboo shark genes. If there was no cross-species best hit, then the best hit among the elephantfish genes was retrieved, which may include cartilaginous fish-specific genes. Subsequently, orthologous pairs between mouse and bamboo shark genes were assigned by checking if a cross-species best hit from the bamboo shark BLASTP results also had a best hit in the BLASTP result of mouse genes against the corresponding animal (species-wise best hit; Supplementary files 4, 5, 6).

For quality control, the orthology of Fgf family members was independently determined by generating molecular phylogenetic trees (Figure 1—figure supplements 2 and 3). Amino acid sequences were aligned with an alignment tool, MAFFT v7.419–1 (Katoh et al., 2002; options: --localpair --maxiterate 1000) and trimmed with trimAL v1.2 (Capella-Gutiérrez et al., 2009; options: -gt 0.9 -cons 60). Then, maximum-likelihood trees were constructed with RaxML v8.2.12 (Stamatakis, 2014; options: -x 12345 p 12345 m PROTGAMMAWAG -f a -# 100). The orthology of Hox genes was confirmed based on genome synteny. These independently confirmed orthologous pairs were compared with the results of the above orthology assignment algorithm. For a comparison, we also used the results from a reciprocal best hit algorithm, proteinOrtho v6.0.4 (Lechner et al., 2011) and the previously generated orthology groups (Hara et al., 2018; Figure 1B).

Quantification and scaling

Request a detailed protocol

The trimmed RNA-seq short reads were aligned to the transcript contigs for the bamboo shark and curated RefSeq genes (GRCm38.p5) for the mouse using RSEM v1.3.0 (Li and Dewey, 2011) and Perl scripts (align_and_estimate_abundance.pl and abundance_estimates_to_matrix.pl) in the Trinity package. TPM (transcripts per million), but not TMM (trimmed mean of M-values), was used for all analyses, because we found some artificial biases in TMM values (see Figure 1—figure supplement 4). TPM values from the splicing variants of a single gene were summed up to generate a single value per gene. Then, the means and standard errors of TPM values from three replicates were used for the downstream analyses. Genes with a maximum TPM <1.0 were considered not expressed. For clustering and distance measures, TPM values were scaled so that the maximum value of each gene of each species was set to ‘1’ (Max 1). Whereas this scaling method loses information with respect to the absolute value of the TPMs, it has a substantial advantage when comparisons are being made between evolutionarily distant species. Indeed, previous comparative transcriptome studies have scaled gene expression values in different ways. Among those approaches, the use of Z-scores (standardization) and log transformations are relatively common strategies (e.g. Kalinka et al., 2010; Leiboff and Hake, 2019; Levin et al., 2016). Some researchers have used the intact RPKM (reads per kilobase per million) values to compare closely related species (Wang et al., 2013), but, because the RPKM is known to be inconsistent between samples even within a species (Wagner et al., 2012). Scaled transcriptional values are commonly used for clustering analyses and visualization of transcriptomic data from different samples within a single species. In this case, scaling is mainly aimed at flattening the dynamic range of transcription levels among genes. For inter-specific comparisons, scaling is also useful for being simultaneously sensitive to differentially regulated genes and also insensitive to conserved housekeeping genes. Here, we examine the effect of several scaling methods and the use of intact TPM values. We define the four relevant scaling methods as follows:

Mg,s,t=xg,s,tmax{xg,s,t:t=1..Ts}
Zg,s,t=xg,s,t-x¯g,sσg,s
Ug,s,t=xg,s,t{xg,s,t:t=1..Ts}
Lg,s,t=log10xg,s,t+1

where xg,s,t is the intact TPM of gene g, species s, and time point t; Ts is the total number of time points in species s; Mg,s,t, Zg,s,t, Ug,s,t and Lg,s,t are scaled values that we refer to as the Max 1, Z-score, Unit vector and Log10 methods, respectively; and x¯g,s and σg,s are the mean and standard deviation, respectively, of {xg,s,1...xg,s,Ts}.

First, we take a simple example to develop some intuition as to how these calculations transform TPM values. Let us assume that we compare two species [(species 1 and species 2)], and each species has two genes (gene 1 and gene 2) and three developmental time points (t1, t2, and t3; Figure 1—figure supplement 5A). Gene 1 is a constitutively active gene (i.e. a housekeeping gene), and gene 2 is differentially regulated between species. In this example, we want to identify t2 as the most conserved time point because gene two is expressed in both species at this time point. In addition, we want to ignore the subtle expression differences of gene one within and between species. As seen in Figure 1—figure supplement 5A, scaling by the Max 1, Unit vector, and Log10 methods effectively conserves the expression dynamics of gene two while suppressing the expression noise of gene 1. In contrast, Z-score scaling amplifies the expression dynamics of both genes to the same degree, which suggests that the Z-score method is sensitive to noise. Calculation of the Euclidean distances for each time point between species 1 and 2 (‘Distance’ in Figure 1—figure supplement 5A) shows that although all scaling methods and the use of intact TPMs indicate that t2 is the most similar time point, Max one creates a greater contrast between conserved and non-conserved time points than the other methods. Therefore, Max one is likely to be able to sensitively detect inter-specific differences. We also examined a subset of our real transcriptomic data from mouse limb buds and bamboo shark fin buds. As an example, we chose three housekeeping genes conserved in most vertebrates, Psmd5, Mrpl21, and Polr1b—these genes are listed both in a housekeeping gene list https://www.tau.ac.il/~elieis/HKG/HK_genes.txt (Eisenberg and Levanon, 2013) and in the BUSCO data set, a gene list used to assess the completeness of genome assemblies (Simão et al., 2015). As shown in Figure 1—figure supplement 5B and C, the TPM values of these genes were stable throughout developmental time in both species, suggesting that these genes also play a role in the maintenance of basic cellular function in bamboo shark fin development. However, the TPM values of Mrpl21 and Polr1b in mouse limb buds were roughly twice as high as those in bamboo shark fin buds. One explanation for this finding is that the expression of housekeeping genes is low in the bamboo shark because the relatively low temperature of the environment in which it lives slows its metabolic activity. We note, however, that there are many technical uncertainties when directly interpreting TPM values, particularly when comparing distantly related species. For example, differences in DNA sequences of transcripts (such as variations in GC content) between species probably affects the efficiency of library preparation and sequencing. The TPM values are also likely to be biased because of the incompleteness of the reference transcriptome sequence that we used for the bamboo shark (e.g. some genes lack 3ʹ untranlated regions). Therefore, the dynamics of TPM values extracted by scaling methods rather than absolute TPM values are likely to contain more biologically relevant information. Of the scaling methods, Max 1, Unit vector, and Log10 conserved the stable expression profile of the housekeeping genes, whereas the Z-score method amplified the subtle variation in TPM values as seen in the above simple example (Figure 1—figure supplement 5B). In particular, the Max one and Unit vector methods transformed the TPM values into relatively comparable values between the two species (compare Figure 1—figure supplement 5B with C). For a comparison, we also examined three genes that are heterochronically regulated between bamboo shark fin buds and mouse limb buds (Figure 1—figure supplement 6A and B). In this case, all of the scaling methods seemed to conserve the temporal dynamics of gene expression.

To obtain an objective measure, we calculated the ratio of the interspecific Euclidean distance of the three housekeeping genes to that of the three heterochronic genes with different scaling methods (Figure 1—figure supplement 6C and D). Namely, the Euclidean distance of expression values was close to zero if we used only housekeeping genes (left panel of Figure 1—figure supplement 6C), but it was larger when comparing heterochronic genes (right panel of Figure 1—figure supplement 6C). As a result, the Max1 method resulted in the highest ratio (Figure 1—figure supplement 6D), suggesting that Max1 is most sensitive to interspecific differences in dynamically regulated genes.

Clustering analyses of transcriptome data

Request a detailed protocol

The scaled values of each orthologous pair were concatenated as a 10-dimensional vector (consisting of four stages for mouse limb buds and six stages for bamboo shark fin buds), and all gene expression vectors were dimensionally reduced with UMAP (hyper parameters: a = 10, b = 1.8) followed by hierarchical clustering (hyper parameters: method = 'ward', metric = 'euclidean'; the code is available at https://github.com/koonimaru/easy_heatmapper; copy archived at https://archive.softwareheritage.org/swh:1:dir:b1b8edece650ac9e8a7458354aaf69e74f437092;origin=https://github.com/koonimaru/easy_heatmapper;visit=swh:1:snp:a69e903d0efcde99cb203ec86832c5e5c56a43e5;anchor=swh:1:rev:ba1fde133621a52390b82b4c9f73711a56f252b8/). To find genes that have an opposite trend in their expression relative to 'Heterochronic2', a Pearson correlation coefficient (PCC) for TPM values and developmental stages was calculated for each gene for each species, and genes with PCC > 0.5 for bamboo shark fin buds and PCC < −0.5 for mouse limb buds were listed (Figure 2—figure supplement 1B and Supplementary file 8). For the distance measurements, four different distance methods were calculated: Euclidean distance (ui-vi2), correlation distance (1-u-u¯v-v¯u-u¯2v-v¯2), Shannon distance (-12uilogui+vi2ui+vilogui+vi2vi), standardized Euclidean distance (ui-vi2/Vi), where u and v are gene expression vectors of two samples and Vi is the variance computed over all the values of gene i. For PCA analysis, we used the PCA module in a python package, scikit-learn (https://scikit-learn.org/stable/).

For the stage-associated gene analysis in Figure 3—figure supplement 1B and C, we first calculated the z-score of each gene at each stage as uk,i-u¯iσi, where uk,i is the TPM value of gene i at stage k, u¯i is a mean of TPM over all the stages, and σi is the standard deviation of the TPM. Genes with TPM ≥ 1.0 and the absolute Z-score ≥1.0 were counted as stage-associated genes. For the tissue-associated gene analysis, the entropy of each gene was calculated using RNA-seq data of 71 tissues downloaded from the ENCODE web site (https://www.encodeproject.org/; see Supplementary file 10 for all list). Entropy was calculated as follows:

pk,i=TPMk,ikTPMk,i
Hi=-kpk,ilogpk,i

where TPMk,i is the TPM value of gene i in tissue k, pk,i is a probability distribution and Hi is entropy. Genes with TPM (of mouse limb buds) ≥ 1.0 and 0.65 ≤ entropy were counted as tissue-associated genes.

Whole-mount in situ hybridization

Request a detailed protocol

To clone DNA sequences for RNA probes, we used primers that were based on the nucleotide sequences in the ENSEMBL database (https://www.ensembl.org) for mouse genes and in the transcriptome assembly (Supplementary file 3); bamboo shark Hand2 (Chipun0004250/g4250.t1/ TRINITY_DN85524_c0_g1_i1), 5′-ACCAGCTACATTGCCTACCTCATGGAC-3′ and 5′-CACTTGTTGAACGGAAGTGCACAAGTC-3′; bamboo shark Vcan (Chipun0003941/g3941.t1/ TRINITY_DN95522_c0_g1_i8), 5′-AGCTTGGGAAGATGCAGAGAAGGAATG-3′ and 5′-AGAGCAGCTTCACAATGCAGTCTCTGG-3′; bamboo shark Hoxd12 (Chipun0005654/g5654.t1/TRINITY_DN85970_c0_g1_i1), 5′-GCCAGTATGCAACAGATCCTCTGATGG-3′ and 5′-CTAATGACCTGTTGTACTTACATTCTC-3′; bamboo shark Ptch1 (Chipun0003320/g3320.t1/TRINITY_DN92499_c0_g1_i3), 5′-TTCAGCCAGATTGCAGATTACATCAACC-3′ and 5′-TTCTCTGTGTTTCACATTCAACGTCCTG-3′; bamboo shark Aldh1a2 (Chipun0010503/g10503.t1/TRINITY_DN81423_c0_g1_i1), 5′-TTGAACTTGTACTAAGTGGTATCGCTG-3′ and 5′-AGGATGTGAACATTAGGCTGACCTCAC-3′; mouse Hand2 (ENSMUST00000040104.4), 5′-ACCAAACTCTCCAAGATCAAGACACTG-3′ and 5′-TTGAATACTTACAATGTTTACACCTTC-3′; mouse Vcan (ENSMUST00000109546.8), 5′-TGCAAAGATGGTTTCATTCAGCGACAC-3′ and 5′-ACACGTGCAGAGACCTGCAAGATGCTG-3′; mouse Hoxd12 (ENSMUST00000109546.8), 5′-TGCAAAGATGGTTTCATTCAGCGACAC-3′ and 5′-ACACGTGCAGAGACCTGCAAGATGCTG-3′; mouse Aldh1a2 (ENSMUST00000034723.5), 5′-ACCGTGTTCTCCAACGTCACTGATGAC-3′ and 5′-TCTGTCAGTAACAGTATGGAGAGCTTG-3′; mouse Ptch1 (ENSMUST00000192155.5), 5′-GGGAAGGCAGTTCATTGTTACTGTAACTG-3′ and 5′-TGTAATACGACTCACTATAGGTCAGAAGCTGCCACACACAGGCATGAAGC-3′. Note that although we also tried bamboo shark Shh expression analysis using several RNA probes, we did not obtain specific signals. Fixed embryos were processed for in situ hybridization as described (Westerfield, 2000) with slight modifications. Briefly, embryos were re-hydrated with 50% MeOH in PBST (0.01% Tween 20 in PBS) and with PBST for 5–30 min each at room temperature (RT). Then, embryos were treated with 20 μg/ml proteinase K (Roche) in PBST (5 s for mouse E11.5 and E12.5 embryos, 5 min for st. 27 and st. 29 bamboo shark embryos, 10 min for st. 31 and st. 32 bamboo shark embryos). After the proteinase treatment, embryos were fixed in 4% paraformaldehyde/PBS for 1 hr, followed by one or two washes with PBST for 5–10 min each. Optionally, if embryos had some pigmentation, they were immersed in 2% H2O2 until they became white. Then, embryos were incubated for 1 hr in preheated hybridization buffer (50 ml formaldehyde; 25 ml 20× SSC, pH 5.0; 100 μl 50 mg/ml yeast torula RNA; 100 μl 50 mg/ml heparin; 1 ml 0.5 M EDTA; 2.5 ml 10% Tween 20; 5 g dextran sulfate; and DEPC-treated MilliQ water to a final volume of 100 ml) at 68°C. Subsequently, embryos were incubated with fresh hybridization buffer containing 0.25–4 μl/ml of RNA probes at 68°C overnight. Embryos were washed twice with preheated Wash buffer 1 (50 ml formaldehyde; 25 ml 20× SSC, pH 5.0; 2.5 ml 10% Tween 20; and DEPC-treated MilliQ water to a final volume of 100 ml) for 1 hr each at 68°C; once with preheated Wash buffer 2, which consisted of equal volumes of Wash buffer 1 and 2× SSCT (10 ml 20× SSC, pH 7.0; 1 ml 10% Tween 20; and MilliQ water to a final volume of 100 ml), for 10 min at 68°C; once with preheated 2× SSCT at 68°C for 10 min; and once with TBST at room temperature for 10 min. Embryos were then incubated with a blocking buffer (20 μl/ml 10% bovine serum albumin, 20 μl/ml heat-inactivated fetal bovine serum in TBST) for 1 hr at room temperature, followed by incubation with 1/4000 anti-digoxigenin (Roche) in fresh blocking buffer at 4°C overnight. Embryos were washed four times with TBST for 10–20 min each and were incubated at 4°C overnight. Finally, embryos were incubated with NTMT (200 μl 5 M NaCl; 1 ml 1 M Tris-HCl, pH 9.8; 500 μl 1 M MgCl2; 100 μl 10% Tween 20; and MilliQ water to a final volume of 10 ml) for 20 min and then with 15 μg/ml nitro-blue tetrazolium chloride (NBT) and 175 μg/ml 5-bromo-4-chloro-3-indolyphosphate p-toluidine salt (BCIP) in NTMT for 10 min to 2 hr until signals appeared. Pictures were taken with an Olympus microscope. For bamboo shark embryos, experiments were performed for at least two biological replicates.

ATAC-seq

Request a detailed protocol

Mouse forelimb buds at E9.5, E10.5, E11.5, and E12.5 were dissected, and samples from several individuals were pooled by stage to obtain enough cells. We considered this pooled sample to represent a biological replicate (other replicates were generated using different individuals). To obtain single-cell suspensions, pooled samples were treated with collagenase for 10 min at room temperature. The tissues were then dissociated into single-cell suspensions by pipetting the mixture and passing it through a 40 μm mesh filter (Funakoshi, Cat. No. HT-AMS-14002); the cell suspension was frozen in CryoStor medium (STEMCELL Technologies, Cat. No. ST07930) with Mr. Frosty (Thermo Scientific, Cat. No. 5100–0001) at −80°C overnight, according to Milani et al., 2016. An ATAC-seq library was prepared as described (Buenrostro et al., 2013) with some minor modifications. For library preparation, stored cells were thawed in a 38°C water bath and centrifuged at 500 g for 5 min at 4°C, which was followed by a wash using 50 μl of cold PBS and a second centrifugation at 500 g for 5 min at 4°C. Ten thousand cells per sample were collected, without distinguishing dead cells, and were lysed using 50 μl of cold lysis buffer (10 mM Tris-HCl, pH 7.4; 10 mM NaCl; 3 mM MgCl2; and 0.1% IGEPAL CA-630). Immediately after lysis, cells were spun at 1000 g for 10 min at 4°C, and the supernatant was discarded. For the transposition reaction, cells were re-suspended in the transposase reaction mix (25 μl 2× TD buffer, 2.5 μl Tn5 transposase [in the Nextera DNA Sample Preparation Kit, Illumina, Cat. No. FC-121–1031], and 22.5 μl nuclease-free water) and incubated for 30 min at 37°C. The reaction mix was purified using DNA Clean and Concentrator-5 (Zymo Research, Cat. No. D4014) by adding 350 μl of DNA-binding buffer and eluting in a volume of 10 μl. After a five-cycle pre-PCR amplification, the optimal number of PCR cycles was determined by a preliminary PCR using KAPA HiFi HotStart Real-Time Library Amplification Kit and was estimated to be four cycles. The PCR products were purified using 1.8× volumes of Agencourt AMPure XP. As a control, 50 ng of mouse genomic DNA was also transposed following the standard procedure of the Nextera DNA Sample Preparation Kit. Sequencing with HiSeq X was outsourced to Macrogen, Inc, which was carried out with HiSeq Control Software 3.3.76 (Run type: PE151bp). The output was processed with Illumina RTA 2.7.6 for base-calling and with bcl2fastq 2.15.0 for de-multiplexing. Quality control of the obtained fastq files for individual libraries was performed with FASTQC v0.11.5. ATAC-seq was performed with three biological replicates for each stage.

ATAC-seq data analysis

Request a detailed protocol

The short-read data from ATAC-seq were trimmed and filtered with Trim-Galore! (v0.5.0; options: --paired --phred33 -e 0.1 -q 30). We also removed reads that originated from mitochondrial genome contamination by mapping reads to the mouse mitochondrial genome using bowtie2 v2.3.4.1 (Langmead and Salzberg, 2012). The rest of the reads were mapped onto the mouse genome (mm10) using bwa v0.7.17 with the ‘mem’ option (Li and Durbin, 2010). Among the mapped reads, we removed reads with length >320 bp to reduce noise. The rest of the reads were further down-sampled to around 83.2 million reads to equalize the sequence depth of every sample. Peak calls were done with MACS2 v2.1.1 (Zhang et al., 2008; options: --nomodel --shift −100 --extsize 200 f BAMPE -g mm -B -q 0.01; the genomic reads were used as a control for all samples). For FRiP score calculation, a module, ‘countReadsPerBin.CountReadsPerBin’ in deepTools v3.2.1 (Ramírez et al., 2016), was used to count reads in peaks, and these read counts were then divided by the total number of reads. To evaluate reproducibility among the replicates, we first divided the mouse genome into 500 bp bins. Then, the ATAC-seq peaks were re-distributed into these bins with bedtools (Quinlan and Hall, 2010; options: intersect -F 0.4 f 0.4 -e -wo). Peaks of >500 bp were subdivided into 500-bp-long regions, and those of <500 bp were extended to fit within the closest 500 bp window. Subsequently, these peaks were converted into one-hot vectors, in which ‘1’ means that a 500-bp-long genomic region harbors an ATAC-seq peak. Genomic regions that lacked ATAC-seq peaks in all data were omitted. Using these one-hot vectors, Euclidean distances between the ATAC-seq data were calculated (Figure 5—figure supplement 1A).

For the conservation analysis, the significant variation in the length of ATAC-seq peaks complicated this evaluation. To deal with such variation, we the ATAC-seq peaks were re-distributed into 100 bp bins with bedtools (Quinlan and Hall, 2010; options: intersect -F 0.4 f 0.4 -e -wo) as described above. The sequences in these peaks were retrieved with BLASTN v2.7.1 against the genomes of 16 vertebrate species listed in Supplementary file 10 (BLASTN options: -task dc-megablast -max_target_seqs 1). The blast hits that scored ≥40 were considered as conserved sequences. In this way, the final figures shown in Figure 5C represent the fraction of the total conserved sequence length in the peaks of each stage rather than the number of conserved peaks. For confirmation, we also used a different alignment algorithm, LAST v961 (Kiełbasa et al., 2011) to find conserved sequences. To generate mouse genome databases for LAST, we first masked repeat sequences with N and split the genome file into multiple files, each of which contained a single chromosome sequence. Then, databases were generated using lastdb (options: -cR01). Alignments with the bamboo shark genome (Cpunctatum_v1.0; https://transcriptome.riken.jp/squalomix/resources/01.GCA_003427335.1_Cpunctatum_v1.0_genomic.rn.fna.gz) and the alligator genome (ASM28112v3) were carried out by lastal (options: -a1 -m100). Only a unique best alignment was selected using last-split. These alignment results were converted into the bed format, and regions that overlapped with the ATAC-seq peaks that were subdivided into 100 bp bins were counted.

For the clustering analysis, we converted the alignment files of the ATAC-seq reads into mapped reads in bins per million (BPM) coverage values with 200 bp resolution using bamCoverage in deepTools v3.2.1 (Ramírez et al., 2016; options: -of bedgraph --normalizeUsing BPM --effectiveGenomeSize 2652783500 -e -bs 200). Then, BPMs at the summits of ATAC-seq peaks and an additional 600 bp to the left and to the right of each summit (1400 bp in total) were collected and clustered by t-SNE (https://github.com/DmitryUlyanov/Multicore-TSNE; hyper parameters: perplexity = 30.0, n_iter = 5000) followed by hierarchical clustering (hyper parameters: method = ‘ward’, metric = ‘euclidean’). Enriched motifs were detected using a Perl script, findMotifsGenome.pl in HOMER v4.10.4 (Heinz et al., 2010; options: -size 100 -mask). To count the number of motif occurrences, ‘-find’ option of findMotifsGenome.pl was used, and sequences that scored ≥75% of the highest motif score were counted. For GO analysis, annotatePeaks.pl in HOMER was used. For the tissue-specificity analysis, we downloaded several aligned and unaligned reads of ATAC-seq experiments on 25 different tissues from the ENCODE web site (https://www.encodeproject.org/; see Supplementary file 10 for a complete list), and peaks were called as described above. Then, peaks that did not overlap with other tissues/cells were detected using bedtools.

Data and materials availability

Request a detailed protocol

RNA-seq and ATAC-seq data sets generated during the current study are available in the Gene Expression Omnibus (GEO) repository under accession number GSE136445. Other sequence data and raw data are available in the figshare (DOI: 10.6084/m9.figshare.9928541). Code for clustering analysis is available at https://github.com/koonimaru/easy_heatmapper. Materials related to this paper are available upon request from the corresponding authors.

References

    1. Duboule D
    (1994)
    Temporal colinearity and the phylotypic progression: a basis for the stability of a vertebrate bauplan and the evolution of morphologies through heterochrony
    Development 120:135–142.
    1. Johanson Z
    2. Joss J
    3. Boisvert CA
    4. Ericsson R
    5. Sutija M
    6. Ahlberg PE
    (2007) Fish fingers: digit homologues in sarcopterygian fish fins
    Journal of Experimental Zoology Part B: Molecular and Developmental Evolution 308B:757–768.
    https://doi.org/10.1002/jez.b.21197
    1. Westerfield M
    (2000)
    The Zebrafish Book
    Eugene.

Decision letter

  1. Karen E Sears
    Reviewing Editor; University of California, Los Angeles, United States
  2. Detlef Weigel
    Senior Editor; Max Planck Institute for Developmental Biology, Germany
  3. Gunter Wagner
    Reviewer; Yale University, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

This manuscript presents cutting edge data to compare the development of shark fin and mouse limb and an important discovery – the existence of a conserved mid-developmental stage in paired appendage development. This study is timely and important, and will make an excellent publication to eLife.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting your work entitled "Developmental hourglass and heterochronic shifts in fin and limb development" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Jose Luis Gomez-Skarmeta (Reviewer #2).

Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that your work will not be considered further for publication in eLife.

While the general scope of the paper is potentially suitable for eLife, it requires substantial additional analyses and discussion, as outlined in the reviews. In particular, it would be important to show the pattern of expression of Shh and its main targets in the bamboo shark to sustain the conclusions in Figure 2. Given that these experiments would likely take longer than the normal revision times for eLife papers, we reject the current version for now, but would consider a substantially revised new version.

Reviewer #1:

In this manuscript, to identify differences between limbs and fins, the authors generate and compare the temporal transcriptomes of mouse forelimb and bamboo shark pectoral fins. The comparison reveals a notable heterochrony of gene expression between limbs and fins. The analysis of distances of transcriptome profiles indicate stronger conservation at intermediate stages (hourglass-shaped conservation). Next the authors generate the ATAC-seq profiles of developing limb buds and find that conserved regulatory sequences are most active during mid-stage limb development.

This is an interesting study that requires some additional analysis and discussion to better sustain the conclusions reached.

Major concerns:

A major concern regarding this work is the use of the whole limb/fin for the transcriptomic and ATAC analyses. The limb bud is very heterogenous, even more as it develops, and this makes it difficult to extract conclusions using this kind of bulk analysis, more taking into account that late processes such as chondrogenic differentiation may greatly vary between mouse and shark.

Another concern is the scaling (maximum TPM=1, minimum TPM=0). While I agree that this helps capturing the dynamics of gene expression, it does not reflect the magnitude of the change and could lead to misleading interpretations. I think this may happen with the interpretation of the Shh pathway

1) Figure 2A: More information on the list of genes in each of the categories after gene-by-gene comparison of expression dynamics should be provided and discussed rather than only mentioning a couple of genes. The list should also be provided (i.e. excel). Particularly interesting is the inverse behavior of some genes in the "Heterochrony" group that are downregulated over time in the mouse limb bud being upregulated in the shark fin.

2) Figure 2D-E- I think that some hybridizations for Shh and its main targets in the bamboo shark are required to sustain the conclusions from these two panels. It may be that the sustained expression in the mouse corresponds to later chondrogenic stages that have already started at E12.5 and the whole heterochrony responding to different time resolution as mentioned by the authors. I don't agree with the authors in that the expression dynamics of HoxA/D genes is similar in both species, at least for the 5' members.

3) The consideration of ATAC sequences as active sequences should be softened as this is not always hold true. My interpretation is that most changes happen between 9.5 and 10.5, rather than conserved sequences being more active at E10.5

Reviewer #2:

This is an interesting study which aims to identify differences between fins and limbs. Performing transcriptomic comparison between pectoral fins from a non-model Chondricthyan species and forelimbs from mouse across a series of developmental stages, Onimaru et al. show that a noteworthy number of genes shows a heterochronic shift, alias a reverse temporal dynamic of expression between the two species. Moreover, they present an hourglass-shaped conservation of gene expression, but also of active regulatory regions in middle stages of development. Interestingly, in these stages they also detect more tissue- and stage-specific enhancers leading to the hypothesis that the middle developmental stages are evolutionary constrained by the increased regulatory complexity over pleiotropic genes.

This work shows how comparing distant species constitutes a good approach to understand how morphological novelties occur or are constrained during evolution and hints towards some of the changes that might have occurred during fin-to-limb transition. The data that Onimaru et al. have produced are also a good resource for the scientific community and the overall work leads to many interesting follow-up questions. Due to all the above reasons, I support the publication of this article in eLife.

However, the analyses presented are not always described as clearly or in-depth as desired and few observations are overstated. Therefore, I recommend the implementation of the comments below to strengthen the reliability, to enrich the content of the data presented and to prevent any confusion for the reader.

1) In Figure 2A, using hierarchical clustering the authors show that there is a heterochronic shift in gene expression between mouse limbs and shark fins. However, this group consists of different subclusters which not all follow exactly the general trend that the authors describe in their results. Could the authors discuss on these genes that still show different temporal dynamics of expression between sharks and mouse, but do not show opposite -timewise- trend than in mouse?

2) The authors should provide individual tables for each cluster described in Figure 2A (fin-specific, limb-specific, stable, conserved/late, heterochronic) instead of the Supplementary file 3, which is quite confusing in its current form.

3) In Figure 5, which are the GO terms associated to the genes for these clusters? What are the enriched motifs in cluster 8, largely specific to E9.5, and the GOs of the associated genes? The full list of motifs and associated genes for each cluster should be available. Moreover, is the conservation degree of the ATAC peaks different for each cluster?

4) As far as the HOMER analysis is concerned in Figure 5, why the authors used the extended sequence length of 1400 bp to perform TF motif analysis? Also, did they perform the enrichment analysis with default HOMER options? If so, random genomic regions were used as a statistical background. Can these results be replicated when using a more biologically relevant background? For example, the peaks of Cluster5 are enriched in CTCF when compared to random regions of the DNA (the default HOMER approach), but are they also enriched in CTCF when compared to all the open chromatin regions that were detected during development?

5) Results first paragraph: the figure supplement 4, not 1, refers to the details of RNA-seq data.

6) Have the authors used all three replicates in the transcriptomic analyses? We could assume that due to the last sentence referring to that in Figure 1—figure supplement 4, but it should be clearly stated.

7) Could the authors explain why they used only one of the replicates for the ATAC-seq hierarchical clustering in Figure 5A and comment whether the ATAC-seq peaks tested were present in all 3 replicates?

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for submitting your article "Developmental hourglass and heterochronic shifts in fin and limb development" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Detlef Weigel as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Gunter Wagner (Reviewer #2).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

We would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). Specifically, we are asking editors to accept without delay manuscripts, like yours, that they judge can stand as eLife papers without additional data, even if they feel that they would make the manuscript stronger. Thus the revisions requested below primarily address clarity and presentation.

Summary:

The authors present a detailed RNAseq and ATACseq comparison of mouse limb development and bamboo shark pectoral fin development. This study provides a wealth of functional genomic data leading to an important discovery: the existence of a mid – developmental "hour glass like" constrained stage of limb development. This is a significant discovery because it suggests a mechanistic basis for conserved developmental identities as the sub-organismal level. The homology of paired fins and limbs is not in question since upwards of 200 years, but understanding the developmental/mechanistic basis for this fact is a still unresolved issue in biology. This paper makes an important step in resolving this issue.

Revisions:

Given the recent publication of Dr. Woltering (https://pubmed.ncbi.nlm.nih.gov/32875118/), the authors may want to comment on this paper in relation with the Shh, Hox expressions they report.

We suggest that the authors indicate, at least in Materials and methods, their failure to detect Shh expression by ISH. Knowing this may be of help for other researchers.

Results paragraph three: Figure 1—figure supplement 7 instead of 8?

Introduction: the expression of Hoxa11 and Hoxa13 is actually not conserved in fin development, because the critical spatial exclusion of their expression domains is NOT seen in fins, even though a distal bias of Hoxa13 expression is shared. Please correct.

It is surprising to find 16,442 orthologs between shark and mouse, given that 1-1 orthologs among a sample of 10 eutherian species finds only <8,000 orthologs. A comment on this finding might be in order.

Results paragraph three: a non-colinear relationship in Hoxd gene expression levels also applies to chicken wings, where Hoxd12 is higher expressed than all the others, but I am not sure that was ever published. This could point to a scenario where mouse limb development is not as paradigmatic as it often seems.

Subsection “Comparison of SHH signaling pathways in limb and fin buds”: it is hard to see how Shh delayed onset can be supported without a rigid mapping of developmental stages between fin and limb development.

Discussion paragraph three: Please add reference to Piasecka et al., 2013.

Discussion paragraph four: We think it is important to be precise here. The correct statement is that mutations affecting this stage have more dramatic fitness consequences, rather than that it is less susceptible to mutation. What creates this impression is that the substitution rate is less not necessarily the mutation rate, as the authors note in the next sentence.

https://doi.org/10.7554/eLife.62865.sa1

Author response

[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]

Reviewer #1:

In this manuscript, to identify differences between limbs and fins, the authors generate and compare the temporal transcriptomes of mouse forelimb and bamboo shark pectoral fins. The comparison reveals a notable heterochrony of gene expression between limbs and fins. The analysis of distances of transcriptome profiles indicate stronger conservation at intermediate stages (hourglass-shaped conservation). Next the authors generate the ATAC-seq profiles of developing limb buds and find that conserved regulatory sequences are most active during mid-stage limb development.

This is an interesting study that requires some additional analysis and discussion to better sustain the conclusions reached.

Major concerns:

A major concern regarding this work is the use of the whole limb/fin for the transcriptomic and ATAC analyses. The limb bud is very heterogenous, even more as it develops, and this makes it difficult to extract conclusions using this kind of bulk analysis, more taking into account that late processes such as chondrogenic differentiation may greatly vary between mouse and shark.

We agree that our comparative transcriptomic analysis has certain limitations owing to the use of bulk samples. In particular, whereas our analysis can capture overall temporal expression differences between the fin and limb buds, spatial information is lacking in our data. However, in contrast to the reviewer’s concern, we found that our data correctly identified some differentiation processes such as chondrogenesis and myogenesis (Figure 1—figure supplement 8).

In the case of chondrogenesis, bamboo shark fin buds and mouse limb buds share the early rise of Sox9, Runx3 and Tgfb2 expressions (key factors for chondrogenic differentiation). Subsequently, a cartilage-specific proteoglycan, Acan is upregulated slightly later in both species. On the other hand, Noggin, which is used for early chodrogenic marker in mouse limb development, was not detected neither in the genome nor transcriptome assemblies of the bamboo shark. Therefore, while it is convincing that the chondrogenic process of the bamboo shark have varied due to the lack of endochondral ossification in cartilaginous fishes, our trascriptome data revealed a partially conserved process for chondrogenic differentiation.

Regarding myogenesis, our trascriptome data captures a myogenic process conserved in the both species: Pax3 (a marker of myogenic precursor cells) is downregulated over the developmental time, and MyoD gene family (MyoG, MyoD1, Myf5) takes turns for further differentiation. On the other hand, whereas mouse limb buds upregulate three myosin genes (Myh3, Myh7, Myh8) at E12.5, we detected only upregulation of Myh7 in bamboo shark fin buds. This is probably an example that our transcriptome data revealed an interesting divergence between sharks and mice.

Thus, our transcriptome data contain biologically meaningful information that allows us to discuss many aspects of differences and conserved processes between the shark fins and mouse limbs, and we carefully draw conclusions within the limitation of our data. We have added these discussions in the text and related data to Figure 1—figure supplement 8.

Another concern is the scaling (maximum TPM=1, minimum TPM=0). While I agree that this helps capturing the dynamics of gene expression, it does not reflect the magnitude of the change and could lead to misleading interpretations. I think this may happen with the interpretation of the Shh pathway

We apologize for the poor explanation for why we scaled TPMs. While we agree that the scaling loses information that can be extracted only from intact TPMs, it has a significant advantage when comparing between evolutionarily distant species besides capturing the dynamics of gene expression. Indeed, other past comparative transcriptome studies also scaled gene expression values in their own ways. Of those, z-score (or standardization) may be a relatively common strategy and better reflect the magnitude of changes in TPM values (e.g., Levin et al., 2016; Leiboff and Hake, 2019). However, we found that z-score has a disadvantage that it could misrepresent differences between species.

To validate our scaling method with an objective measure, we have examined the effect of intact TPM values and scaling methods by analyzing the expression dynamics of several housekeeping genes which exhibit a stable expression profile and are likely conserved in all vertebrates (these genes are listed in the BUSCO dataset; Simão et al., 2015) and that of heterochronically expressed genes. Namely, we assumed that the euclidean distance of expression values is close to zero if we only use housekeeping genes (left panel of Figure 1—figure supplement 6C), but it is larger when comparing heterochronic genes (right panel of Figure 1—figure supplement 6C). The result is that Max1 scored the highest ratio of the euclidean distance of the housekeeping genes to that of the heterochronic genes (Figure 1—figure supplement 6D), suggesting that Max1 is most sensitive to interspecific differences of dynamically regulated genes. Please see the main text, Figure 1—figure supplement 5,6, and section 1 in Supplementary file 7.

Regarding the SHH pathway shown in Figure 2 (now moved to Figure 3), we compared the temporal dynamics with intact TPM values, and did not find any misinterpretation caused by the scaling (please see Figure 3—figure supplement 1 and the main text). In addition, the in situ data for Ptch1 and Hoxd12 that we have added as a response to another comment are consistent with the RNA-seq data (Figure 3). Please note that comparing intact TPM values between species involves many technical uncertainties particularly when comparing distantly related species. For example, the difference in DNA sequences of transcripts (such as GC contents) between species probably affects the efficiency of library preparations and sequencing. TPM values are also very likely to be biased by the incompleteness of the reference transcriptome sequence that we used for the bamboo shark (e.g., some genes lack 3ʹUTR). Therefore, the fraction to the maximum TPM values (probably reflect the saturation point of genes) or the deviation to the average TPM values are more relevant biological indices than intact TPM values (please see section 1 in Supplementary file 7 for details).

1) Figure 2A: More information on the list of genes in each of the categories after gene-by-gene comparison of expression dynamics should be provided and discussed rather than only mentioning a couple of genes. The list should also be provided (i.e. excel). Particularly interesting is the inverse behavior of some genes in the "Heterochrony" group that are downregulated over time in the mouse limb bud being upregulated in the shark fin.

We apologize, the list of genes related to Figure 2A was originally shown in Supplementary file 3. However, as pointed by the reviewer 2, the table was not readily understandable. Therefore, we have improved the table to be readable and consistent with Figure 2A (Supplementary file 8).

As requested, we also sought genes that are downregulated over time in the mouse limb bud and being upregulated in the shark fin (Figure 2—figure supplement 1B for a heatmap visualization and Supplementary file 9 for the full list of genes). As a result, we identified 241 genes. Of those, Fgf8 is particularly interesting as it functions as a crucial growth signal from the apical ectodermal ridge (AER). On the other hand, in the zebrafish, FGF24 and Fgf16 are known to be indispensable signaling molecules of the apical fin fold (roughly equivalent to the AER). As shown in Figure 2—figure supplement 1C, Fgf9, Fgf16 and Fgf24 (known to be expressed in the AER/apical fin fold), instead of Fgf8, are expressed in the early stages of bamboo shark fin buds. Although we cannot infer the ancestral state of the expression pattern of these genes, the overlapping function of these genes may have allowed subfunctionalization. We have added these discussions in section 3 in Supplementary file 8.

2) Figure 2D-E- I think that some hybridizations for Shh and its main targets in the bamboo shark are required to sustain the conclusions from these two panels. It may be that the sustained expression in the mouse corresponds to later chondrogenic stages that have already started at E12.5 and the whole heterochrony responding to different time resolution as mentioned by the authors. I don't agree with the authors in that the expression dynamics of HoxA/D genes is similar in both species, at least for the 5' members.

As requested, we have added in situ hybridization data for Ptch1 to visualize Shh pathway. In addition, we also examined Hoxd12 expression pattern (a) to highlight the temporal difference of Ptch1 expression and (b) to confirm the conserved dynamics of Hox genes. Overall, we concluded that these data are consistent with the transcriptome data. For some technical reasons, we failed to obtain in situ data for Shh on bamboo shark fin buds, even though we tried three different RNA probes (all resulted in non-specific expression pattern). However, we believe that Ptch1 and Hoxd12 are enough to address the comment raised here.

Please note that although we have prepared for a stable supply of bamboo shark embryos from an aquarium since 2015, the number of available embryos is still limited compared to other model species. Therefore, while we wanted to examine all the related genes, there is a fundamental limitation for it.

Regarding the HOX gene expression, the overall similarity of Hox gene expressions between the two species is supported by the measurement of euclidean distance as shown in Figure 4C. In addition, the in situ data for Hoxd12 mentioned above also show a conserved expression dynamics, in which Hoxd12 expression is initially restricted in the posterior part of fin/limb buds and extend anteriorly in later stages. Therefore, at least in some aspects, the transcriptional dynamics of Hox genes is comparable between these species.

That being said, we also agree that there are some differences in Hox gene expressions between these species. To address this, we have re-examined the 5ʹ members and described some differences that we think important. Namely, in the mouse limb bud, 5ʹ genes are known to exhibit a quantitative collinearity that the expression amount of Hoxd13 is much higher than other neighbors, whose transcription levels decrease with distance from Hoxd13. On the other hand, in the bamboo shark fin bud, Hoxd13 expression is lowest and there is no clear trend like quantitative collinearity (Figure 1—figure supplement 8; again, we emphasize that we cannot perform in situ for all of these genes due to the limited availability of bamboo shark embryos). Therefore, some degree of gene regulatory differences may exist between the two species. Accordingly, we have modified the manuscript to mention the difference of HOX gene expression (main text and section 2 in Supplementary file 7). Nevertheless, we think that our argument on the overall similarity of the spatio-temporal dynamics of HOX genes is well supported by the euclidean distance measure (Figure 4C) and the in situ data for Hoxd12 (Figure 3C and D).

3) The consideration of ATAC sequences as active sequences should be softened as this is not always hold true. My interpretation is that most changes happen between 9.5 and 10.5, rather than conserved sequences being more active at E10.5

As suggested, we have softened the word “active” to either “open” or “accessible”.

Regarding the temporal dynamics of evolutionarily conserved open chromatin regions, we have analyzed how the number of conserved sequences differs between stage-specific open chromatin regions in two ways. First, we plotted the absolute count of accessible conserved sequences at each stage (Figure 5—figure supplement 2A and B). In these plots, accessible conserved sequences are clearly increased at E10.5 compared to other stages. Note that because there is a difference in total peak numbers between stages, the fraction of conserved sequences in peaks is also worth to account for. To do so, we showed the fraction of accessible conserved sequence counts in Figure 5D in the previous submission. Second, we also counted the number of conserved sequences in E10.5-specific peaks by using the output of clustering analysis indicated in Figure 5A. As a result, the E10.5-specific cluster (c7) contains the highest fraction of accessible conserved sequences (Figure 5—figure supplement 2C and D). In addition to these results, we have already shown in Figure 5B that many ATAC-seq peak signals are decreased and increased from E10.5 to E11.5. Therefore, conserved sequences are certainly more accessible at E10.5 than other stages.

Reviewer #2:

This is an interesting study which aims to identify differences between fins and limbs. Performing transcriptomic comparison between pectoral fins from a non-model Chondricthyan species and forelimbs from mouse across a series of developmental stages, Onimaru et al. show that a noteworthy number of genes shows a heterochronic shift, alias a reverse temporal dynamic of expression between the two species. Moreover, they present an hourglass-shaped conservation of gene expression, but also of active regulatory regions in middle stages of development. Interestingly, in these stages they also detect more tissue- and stage-specific enhancers leading to the hypothesis that the middle developmental stages are evolutionary constrained by the increased regulatory complexity over pleiotropic genes.

This work shows how comparing distant species constitutes a good approach to understand how morphological novelties occur or are constrained during evolution and hints towards some of the changes that might have occurred during fin-to-limb transition. The data that Onimaru et al. have produced are also a good resource for the scientific community and the overall work leads to many interesting follow-up questions. Due to all the above reasons, I support the publication of this article in eLife.

However, the analyses presented are not always described as clearly or in-depth as desired and few observations are overstated. Therefore, I recommend the implementation of the comments below to strengthen the reliability, to enrich the content of the data presented and to prevent any confusion for the reader.

1) In Figure 2A, using hierarchical clustering the authors show that there is a heterochronic shift in gene expression between mouse limbs and shark fins. However, this group consists of different subclusters which not all follow exactly the general trend that the authors describe in their results. Could the authors discuss on these genes that still show different temporal dynamics of expression between sharks and mouse, but do not show opposite -timewise- trend than in mouse?

We agree that the original cluster contained several subclusters that exhibit different temporal dynamics. To improve the clustering analysis, we used the UMAP method, which resulted in two major heterochronic groups (Heterochonic1 and 2 in Figure 2A). Of them, “Heterochronic2” mostly follows the previously identified trend (the c7 better fits with such trend than c6). “Heterochonic1” includes genes whose expression is highest during the late stages of mouse limb bud development but relatively stable throughout fin bud development. These genes are also interesting, and we examined the expression pattern for Aldh1a2 as an example of genes in “Heterochonic 1”. A retinoic acid-generating enzyme, Aldh1a2 is known to be expressed in the interdigital web and retinoic acid signaling regulates the interdigital cell death, which is very likely to be specific to the tetrapod limb. On the other hand, we found that in bamboo shark fin buds, Aldh1a transcripts are initially uniform and later restricted to the edge of fin buds (arrowheads in Figure2—figure supplement 1A). We have mentioned this in the Results.

In addition, as requested by the reviewer 1, we separately investigated genes that follow the inverse trend (downregulated over time in the mouse limb bud being upregulated in the shark fin; Figure 2—figure supplement 1B and Supplementary file 9). Of those, Fgf8 is particularly interesting as it plays a crucial role as a growth signal from the apical ectodermal ridge (AER). On the other hand, in the zebrafish, FGF24 and Fgf16 are known to be indispensable signaling molecules of the apical fin fold (roughly equivalent to the AER). As shown in Figure 2—figure supplement 1C, Fgf9 (known to be expressed in mouse AER), instead of Fgf8, is expressed in the early stages of bamboo shark fin buds. Although we cannot infer the ancestral state of the expression pattern of these genes, the overlapping function of these genes may have allowed subfunctionalization. We have added these discussions in section 3 in Supplementary file 7.

2) The authors should provide individual tables for each cluster described in Figure 2A (fin-specific, limb-specific, stable, conserved/late, heterochronic) instead of the Supplementary file 3, which is quite confusing in its current form.

We apologize for the confusing table. We have examined how the table should be presented. If the table is split into the cluster annotations as requested, we think that there will be too many supplementary tables. Instead, we have added the cluster annotations to the original form of the table and the cluster numbers to the Figure 2A to be consistent.

3) In Figure 5, which are the GO terms associated to the genes for these clusters? What are the enriched motifs in cluster 8, largely specific to E9.5, and the GOs of the associated genes? The full list of motifs and associated genes for each cluster should be available. Moreover, is the conservation degree of the ATAC peaks different for each cluster?

We thank the reviewer for suggesting us an interesting analysis. We have performed GO analysis and the result was quite consistent with the clustering analysis. That is, while the constitutionally accessible peaks (C5, C6) are closely located to genes related to “cellular components”, the dynamically regulated peaks (C3, C4, C7, C8) are associated with GO terms, “developmental process”, “multicellular organism development”, and “anatomical structure morphogenesis”. Although there are subtle differences between C3, C4, C7 and C8, we did not detect any C7-specific GO terms. As requested, we have made a supplementary file that lists top 10 GO terms for every clusters (Supplementary file 12), and a supplementary file that contain full GO terms and all associated genes (Supplementary data) and mentioned this in the text.

As for the C8, the enriched motifs are mostly similar to those in the C7 (both consist of TAATT, ATTTAT and GACCTC), though the best matched transcription factors are different (VSX2 for C8 and LHX for C7). LHX and VSX motifs include TAATT sequence, but VSX1 and VSX2 are hardly expressed in mouse limb buds. As requested, we now showed the list of top 5 motifs of all clusters (Figure 6—figure supplement 2 and 3) and source data that include all motif list of all clusters (Supplementary data). In addition, for a better understanding of how motifs are dynamically changing over time, we have added plots of the average number of the motifs identified against mouse limb stages (Figure 6—figure supplement 5). These plots showed that COUP-TFII detected in C8 sharply decreases from E11.5. On the other hand, VSX2 (detected in C8) and LHX (detected in C7) follow a similar trend in which the number of the motif transiently increases at E10.5, suggesting that these motifs may represent a mostly same sequence feature. We have added this discussion in the text.

Regarding the conservation level of each cluster, E10.5-specific cluster (C7) is indeed enriched for conserved sequences as expected. We have added bar plots in Figure 5—figure supplement 2 and discussed in the text.

4) As far as the HOMER analysis is concerned in Figure 5, why the authors used the extended sequence length of 1400 bp to perform TF motif analysis? Also, did they perform the enrichment analysis with default HOMER options? If so, random genomic regions were used as a statistical background. Can these results be replicated when using a more biologically relevant background? For example, the peaks of Cluster5 are enriched in CTCF when compared to random regions of the DNA (the default HOMER approach), but are they also enriched in CTCF when compared to all the open chromatin regions that were detected during development?

We apologize that the HOMER analysis was poorly explained in the manuscript. First, we performed the motif analysis using +-50 bp from the center of a peak region by setting the “-size 100” option (the default of the program is “-size 200” , but we did not see any significant differences between -size 100 and 200). Therefore, although the input sequence length was 1400 bp, the regions that we analyzed are 100 bp. We have add the options that we used for running HOMER.

Second, we used random genomic regions as a background. We think that using random genomic regions is one way to fairly estimate enriched motifs, because some clusters share several enriched motifs (e.g., c5 and c6 are both enriched for the CTCF motif; please see Figure 6—figure supplement 2 and 3). However, as pointed by the reviewer, we also performed motif analyses with the other peak regions as a background (Figure 6—figure supplement 4), which consistently detected the CTCF motif.

5) Results first paragraph: the figure supplement 4, not 1, refers to the details of RNA-seq data.

We have corrected the figure number.

6) Have the authors used all three replicates in the transcriptomic analyses? We could assume that due to the last sentence referring to that in Figure 1—figure supplement 4, but it should be clearly stated.

We apologize for not mentioning this point. We used all three replicates, and means and standard errors of them were used for the all downstream analyses. We have corrected sentences to explicitly describe how the data was processed.

7) Could the authors explain why they used only one of the replicates for the ATAC-seq hierarchical clustering in Figure 5A and comment whether the ATAC-seq peaks tested were present in all 3 replicates?

One of the ways to integrate replicates in this analysis could be to extract peaks shared by all replicates. However, we faced two problems to do that. First, we used summits (an output of the MACS2 peak caller) to estimate the location of the center of peaks, but summits are rarely overlapped between replicates due to the narrow range of summits. Second, as shown in Figure 5B, because the quality of our ATAC-seq data is not uniform among replicates, data with relatively poor quality affects the analysis.

Now, to ensure that Figure 6A is reproducible, we showed data that repeatedly performed the same analysis with different combinations of replicates (Figure 6—figure supplement 1 and the Results). In addition, we have also added plots of the average number of the motifs identified using all the three replicates of ATAC-seq peaks at each stage (Figure 6—figure supplement 4), which support the clustering result.

[Editors’ note: what follows is the authors’ response to the second round of review.]

Revisions:

Given the recent publication of Dr. Woltering (https://pubmed.ncbi.nlm.nih.gov/32875118/), the authors may want to comment on this paper in relation with the Shh, Hox expressions they report.

As requested, we have discussed Woltering et al., 2020 in relation to the present and previous studies including ours. While we agree with the authors that an anterior expansion of the expression domain of genes regulated by SHH may have contributed to the substantial anatomical changes during the fin-to-limb transition (as we have already proposed in Onimaru et al., 2015), we partly disagree with their argument. They proposed that the absence and presence of the phase of the anterior expansion of Hoxd13 correlate with the metapterygial morphologies. However, this argument is valid ONLY within species that they picked up (cichlids, lungfish and tetrapods). Indeed, such anterior expansion of Hoxd13 expression has been already observed in the fin buds of Polyodon, the little skate, and the small-spotted catshark (Davis MC et al. 2007, Freitas R et al. 2007, Nakamura T, et al., 2015). In addition, their Hoxd13 expression data of lungfish fin buds is still needed to be confirmed, as there is a conflicting report by another group (Johanson et al., 2007). Therefore, while there is an overall consensus that an anterior expansion of gene expression domains may have contributed to the substantial anatomical changes during the fin-to-limb transition, further studies are necessary in order to understand the involvement of Hoxd13 regulatory changes in the future.

We suggest that the authors indicate, at least in Materials and methods, their failure to detect Shh expression by ISH. Knowing this may be of help for other researchers.

We have mentioned on Shh ISH in the Materials and methods section.

Results paragraph three: Figure 1—figure supplement 7 instead of 8?

Thank you for pointing this out. We have corrected the figure number.

Introduction: the expression of Hoxa11 and Hoxa13 is actually not conserved in fin development, because the critical spatial exclusion of their expression domains is NOT seen in fins, even though a distal bias of Hoxa13 expression is shared. Please correct.

We have included the fact that the expression domains of Hoxa11 and Hoxa13 are overlapped in fin buds. We also note that the separation of Hoxa11/13 expression domain may not be a critical factor that differentiate limbs from fins, because axolotl limb buds also exhibit a similar overlapping expression of these genes (Woltering et al., 2019).

It is surprising to find 16,442 orthologs between shark and mouse, given that 1-1 orthologs among a sample of 10 eutherian species finds only <8,000 orthologs. A comment on this finding might be in order.

First of all, we would like to clarify that the number of genes that were uniquely orthologous to mouse ones is 13005. The figure of “16,442” represents the number of orthologs mapped to all vertebrate genes we analyzed. Although we are not sure which paper the reviewer referred to, based on a very recent study that infers orthology relations between human genes and genes from each of 43 other vertebrate genomes, the number of orthologous pairs for each genome comparison ranges from 9775 to 17027 with the average of around 14000 (Hao et al., 2020). Therefore, the number of the orthologs we detected is reasonable. We have mentioned this in the text.

Results paragraph three: a non-colinear relationship in Hoxd gene expression levels also applies to chicken wings, where Hoxd12 is higher expressed than all the others, but I am not sure that was ever published. This could point to a scenario where mouse limb development is not as paradigmatic as it often seems.

We thank the reviewer for making us realize this point. Indeed, we found a report that the expression level of Hoxd12 is slightly higher (around 30–40%) than that of Hoxd13 in the presumptive autopod region of chick fore- and hind-limb buds (Yakushiji-Kaminatsui et al., 2018). Since they did not perform replicates, the significance of this expression difference is not clear. Thus, at least we can say that Hoxd12 and Hoxd13 show a nearly same level of transcripts in chick limb buds. On the other hand, in bamboo shark fin buds at stage 31, the expression level of Hoxd12 is 5.6 times as high as that of Hoxd13 (p-value, 0.00712). This sharp contrast of Hoxd12/Hoxd13 expression amount may represent a substantial difference between shark fin buds and chick limb buds. Therefore, chick limbs and bamboo shark fins are probably not in the same situation. Although this limited taxon sampling does not give us a solid conclusion about the ancestral state, we appreciate that the mouse limb is not the paradigmatic case of the tetrapod limb. We have discussed the divergent controls of Hoxd genes.

Subsection “Comparison of SHH signaling pathways in limb and fin buds”: it is hard to see how Shh delayed onset can be supported without a rigid mapping of developmental stages between fin and limb development.

We agree that the original sentence was misleading. We have changed the words and added more explanation about Shh delay.

Discussion paragraph three: Please add reference to Piasecka et al., 2013.

We have added the paper and modified the Discussion accordingly.

Discussion paragraph four: We think it is important to be precise here. The correct statement is that mutations affecting this stage have more dramatic fitness consequences, rather than that it is less susceptible to mutation. What creates this impression is that the substitution rate is less not necessarily the mutation rate, as the authors note in the next sentence.

We apologize for the confusing discussion. This whole paragraph was not intended to discuss either the mid-stage conservation or fitness. Instead, we wanted to discuss the significance of regulatory sequences in terms of human diseases. We realized that this paragraph was quite off-topic and could mislead the audience. Therefore, we have deleted the whole paragraph.

https://doi.org/10.7554/eLife.62865.sa2

Article and author information

Author details

  1. Koh Onimaru

    1. Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research (BDR), Kobe, Japan
    2. Laboratory for Bioinformatics Research, RIKEN BDR, Wako City, Japan
    3. Molecular Oncology Laboratory, Graduate School of Medicine, Nagoya University, Nagoya, Japan
    Contribution
    Conceptualization, Resources, Data curation, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing - original draft, Project administration
    For correspondence
    koh.onimaru@riken.jp
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-2428-9510
  2. Kaori Tatsumi

    Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research (BDR), Kobe, Japan
    Contribution
    Data curation, Validation, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
  3. Chiharu Tanegashima

    Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research (BDR), Kobe, Japan
    Contribution
    Data curation, Validation, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
  4. Mitsutaka Kadota

    Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research (BDR), Kobe, Japan
    Contribution
    Data curation, Validation, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
  5. Osamu Nishimura

    Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research (BDR), Kobe, Japan
    Contribution
    Data curation, Validation, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-1969-2580
  6. Shigehiro Kuraku

    Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research (BDR), Kobe, Japan
    Contribution
    Conceptualization, Supervision, Funding acquisition, Validation, Project administration, Writing - review and editing
    For correspondence
    shigehiro.kuraku@riken.jp
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-1464-8388

Funding

Japan Society for the Promotion of Science (17K15132)

  • Koh Onimaru

Ministry of Education, Culture, Sports, Science and Technology

  • Koh Onimaru
  • Kaori Tatsumi
  • Chiharu Tanegashima
  • Mitsutaka Kadota
  • Osamu Nishimura
  • Shigehiro Kuraku

RIKEN

  • Koh Onimaru

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Kenta Sumiyama, James Shape and Masahiro Uesaka for fruitful discussions; Itoshi Nikaido and Laboratory for Bioinformatics Research for providing computational resources; Itsuki Kiyatake, Kiyonori Nishida, and Osaka Aquarium Kaiyukan for kindly providing bamboo shark eggs; Laboratory for Animal Resources and Genetic Engineering for supplying mouse embryos; the ENCODE consortium and the ENCODE production laboratories for generating ATAC-seq data sets.

Ethics

Animal experimentation: Animal experiments were conducted in accordance with the guidelines approved by the Institutional Animal Care and Use Committee (IACUC), RIKEN Kobe Branch, and experiments involving mice were approved by IACUC (K2017-ER032).

Senior Editor

  1. Detlef Weigel, Max Planck Institute for Developmental Biology, Germany

Reviewing Editor

  1. Karen E Sears, University of California, Los Angeles, United States

Reviewer

  1. Gunter Wagner, Yale University, United States

Publication history

  1. Received: September 7, 2020
  2. Accepted: February 1, 2021
  3. Accepted Manuscript published: February 9, 2021 (version 1)
  4. Version of Record published: March 4, 2021 (version 2)

Copyright

© 2021, Onimaru et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 561
    Page views
  • 122
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Developmental Biology
    Feng Wang et al.
    Research Article Updated

    The X-linked gene Rlim plays major roles in female mouse development and reproduction, where it is crucial for the maintenance of imprinted X chromosome inactivation in extraembryonic tissues of embryos. However, while females carrying a systemic Rlim knockout (KO) die around implantation, male Rlim KO mice appear healthy and are fertile. Here, we report an important role for Rlim in testis where it is highly expressed in post-meiotic round spermatids as well as in Sertoli cells. Systemic deletion of the Rlim gene results in lower numbers of mature sperm that contains excess cytoplasm, leading to decreased sperm motility and in vitro fertilization rates. Targeting the conditional Rlim cKO specifically to the spermatogenic cell lineage largely recapitulates this phenotype. These results reveal functions of Rlim in male reproduction specifically in round spermatids during spermiogenesis.

    1. Developmental Biology
    2. Neuroscience
    Hiroki Takechi et al.
    Research Article

    Transmembrane protein Golden goal (Gogo) interacts with atypical cadherin Flamingo to direct R8 photoreceptor axons in the Drosophila visual system. However, the precise mechanisms underlying Gogo regulation during columnar- and layer-specific R8 axon targeting are unknown. Our studies demonstrated that the insulin secreted from surface and cortex glia switches the phosphorylation status of Gogo, thereby regulating its two distinct functions. Non-phosphorylated Gogo mediates the initial recognition of the glial protrusion in the center of the medulla column, whereas phosphorylated Gogo suppresses radial filopodia extension by counteracting Flamingo to maintain a one axon to one column ratio. Later, Gogo expression ceases during the midpupal stage, thus allowing R8 filopodia to extend vertically into the M3 layer. These results demonstrate that the long- and short-range signaling between the glia and R8 axon growth cones regulates growth cone dynamics in a stepwise manner, and thus shape the entire organization of the visual system.