Landscape of histone modifications in a sponge reveals the origin of animal cis-regulatory complexity

Abstract

Combinatorial patterns of histone modifications regulate developmental and cell type-specific gene expression and underpin animal complexity, but it is unclear when this regulatory system evolved. By analysing histone modifications in a morphologically simple, early-branching animal, the sponge Amphimedon queenslandica, we show that the regulatory landscape used by complex bilaterians was already in place at the dawn of animal multicellularity. This includes distal enhancers, repressive chromatin and transcriptional units marked by H3K4me3 that vary with levels of developmental regulation. Strikingly, Amphimedon enhancers are enriched in metazoan-specific microsyntenic units, suggesting that their genomic location is extremely ancient and likely to place constraints on the evolution of surrounding genes. These results suggest that the regulatory foundation for spatiotemporal gene expression evolved prior to the divergence of sponges and eumetazoans, and was necessary for the evolution of animal multicellularity.

https://doi.org/10.7554/eLife.22194.001

eLife digest

Animals come in many shapes and sizes, and vary in how they move, grow and reproduce. The long-held thought that animal complexity is related to the number of genes that are in the animal’s DNA has now been largely dismissed; simple animals like sponges and cnidarians (for example, jellyfish, anemones and corals) have comparable gene numbers to vertebrates, insects, mollusks and other complicated bilaterians (animals that feature a plane of symmetry, meaning that they have a top, a bottom, a front and a back). This observation led to the idea that gene regulation (how and when genes are turned off and on) is responsible for the evolution of animal diversity.

Genomic DNA packs into cells by winding around proteins called histones. Histones themselves can bear certain chemical marks, which in turn determine if the genes contained in the DNA associated with the histones are going to be turned on or off. In bilaterians and cnidarians these marks substantially contribute to gene regulation. Some of these marks predate the evolution of multicellular animals from single-celled organisms. However, the origin of the marks that associate with the gene regulatory elements that are essential for animals to be multicellular remained unknown. In other words, does the evolution of histone marks underpin animal complexity?

Gaiti et al. turned to the marine sponge Amphimedon queenslandica to address this question. Sponges are one of the morphologically simplest animals, lacking a gut, nerves and muscles. By analyzing histone marks in this sponge, Gaiti et al. found they were remarkably similar to the networks of histone marks seen in more complex animals. This is consistent with this form of gene regulation being present at the dawn of the animal kingdom. Indeed, this mode of gene regulation may have been necessary for multicellular animals to first evolve.

It now appears that most of the genes and regulatory mechanisms underlying the formation of complex animals, like ourselves, had an unexpected early origin – probably as early as the first steps in the evolution of multicellular animals from single-celled organisms. Further studies of animals that are close relatives of sponges, such as comb jellies, and their single-celled cousins, may further improve our understanding of how these simple single-celled organisms became multicellular animals.

https://doi.org/10.7554/eLife.22194.002

Introduction

Animals rely on genomic regulatory systems to direct the dynamic spatiotemporal and cell type-specific gene expression that is essential for the development and maintenance of a multicellular lifestyle. However, how such a system originated and evolved in animals remains unclear. As the last common ancestor of modern animals already possessed an extensive repertoire of regulatory genes, including most transcription factors and signaling pathways used in bilaterian development (Srivastava et al., 2010; Larroux et al., 2008; Degnan et al., 2009; Larroux et al., 2006; Richards and Degnan, 2009; Ryan et al., 2013; Moroz et al., 2014; King et al., 2008; Sebé-Pedrós et al., 2011; de Mendoza et al., 2013; King et al., 2003; Richter and King, 2013), the evolution of animal multicellularity likely required more than the origin of novel genes. Other regulatory features, such as cis-regulatory DNA and combinatorial patterns of histone covalent post-translational modifications (PTMs) (Davidson and Peter, 2015), would have been instrumental to direct differential gene expression in the first multicellular animals. For instance, recent analysis of the genome of Capsaspora, one of the closest unicellular relatives of animals, reveals a lack of chromatin repressive marks, developmental promoter types and distal cis-regulatory elements (enhancers) typically present in complex animals (i.e., eumetazoans) (Sebé-Pedrós et al., 2016).

The development of high-throughput chromatin assays like chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq) (Robertson et al., 2007) has allowed the dissection of chromatin-encoded information beyond the primary DNA sequence, especially the systematic examination of histone PTMs and their role(s) in transcriptional regulation (Zhou et al., 2011; Thurman et al., 2012; Kundaje et al., 2015; ENCODE Project Consortium, 2012). Although combinatorial patterns of histone acetylation and methylation are key components of gene regulatory mechanisms underpinning the formation and maintenance of eumetazoans (Schwaiger et al., 2014), it remains unknown if this system is restricted to these animals or is indeed more ancient.

Porifera (sponges) are considered one of the oldest surviving phyletic lineages of animals, diverging from other metazoans around 700 Mya (Erwin et al., 2011). Despite being one of the morphologically simplest animals, lacking a gut, nerves and muscles, sponges possess an extensive gene repertoire for transcriptional regulation required in eumetazoan development and body patterning (Srivastava et al., 2010; Larroux et al., 2008, 2006; Adamska et al., 2007; Gaiti et al., 2015; Nakanishi et al., 2014; Conaco et al., 2012; Riesgo et al., 2014; Grimson et al., 2008; Richards et al., 2008; Leininger et al., 2014; Fortunato et al., 2015, 2014; Bråte et al., 2015). Here, following on from our recent transcriptomic studies that revealed that the sponge Amphimedon queenslandica (herein Amphimedon) has dynamic developmental gene expression akin to eumetazoans (Gaiti et al., 2015; Fernandez-Valverde et al., 2015; Levin et al., 2016), we set out to determine whether this transcriptional complexity is paralleled by regulatory complexity encoded by combinatorial histone PTM patterns. By analysing an extensive ChIP-seq compendium of histone H3 PTMs in this sponge, we show that a complex gene regulatory landscape composed of combinatorial histone modifications was already in place at the dawn of animals. Moreover, we provide evidence for the evolution and expansion of distal cis-regulatory genomic capabilities at the origin of the animal kingdom.

Results

Amphimedon key regulatory chromatin states are shared with eumetazoans

We carried out chromatin immunoprecipitation (ChIP) on sexually reproducing Amphimedon adults and larvae using antibodies against specific histone H3 PTMs that have been used to define chromatin states in model bilaterians (Zhou et al., 2011; Ho et al., 2014) (Figure 1A). These analyses were undertaken on separate admixtures of adult and larval somatic cell types and, thus, a diversity of gene transcriptional states. Importantly, Amphimedon adults and larvae are comprised of different cell types with markedly different transcriptional profiles and regulatory states (Gaiti et al., 2015; Conaco et al., 2012; Fernandez-Valverde et al., 2015; Degnan et al., 2015). While our sampling strategy increases the biological complexity of chromatin states in toto, it may dilute cell type-specific signals. This contrasts with ChIP-seq analyses performed on cell lines, embryos with few cell types, or distinct tissue samples, which encapsulate more homogenous cellular populations and environments (Sebé-Pedrós et al., 2016; Kundaje et al., 2015; Schwaiger et al., 2014; Gerstein et al., 2010; Pérez-Lluch et al., 2015a). Given the current Amphimedon genome is a draft sequence, our analyses may also be incomplete in regions that have incomplete annotations and gaps in the assembly (13% of the total genome assembly) (Srivastava et al., 2010).

Figure 1 (with 4 supplements)

Chromatin states in Amphimedon.

(A) Schematic representation of the *Amphimedon* life cycle. Larvae (oval shaped, 300–500 µm long) emerge from maternal brood chambers and then swim in the water column before they develop competence to settle and initiate metamorphosis into a juvenile. The juvenile body plan, which displays the hallmarks of the adult body plan, including an aquiferous system with canals, choanocyte chambers and oscula, is the outcome of the dramatic reorganization of the radially-symmetrical, bi- or trilayered larva. This juvenile will then grow and mature into a benthic adult (ranging from 10–30 cm³) (Degnan et al., 2015; Edgar et al., 2002). (B) Definition and enrichments for a 9-state Hidden Markov Model based on five histone PTMs (H3K4me3, H3K27ac, H3K4me1, H3K36me3 and H3K27me3) in adult *Amphimedon*. From left to right: chromatin state definitions, abbreviations, histone PTM probabilities, genomic coverage, protein-coding gene functional annotation enrichments, expressed (Expr.) and repressed (Repr.) protein-coding gene enrichments. Blue shading indicates intensity, scaled by column. (C) Adult chromatin state annotations on a gene-rich, highly transcribed (active) scaffold (contig13500) showing the predominance of ‘TssA’, ‘TxFlnk’, and ‘TxEnhA’ states. For the definition of chromatin states see panel (B). Coding genes (purple) and long non-coding RNAs (blue) are shown, along with signal coverage tracks showing CEL-seq expression in adult. A grey scale indicates CEL-seq expression level: white (no expression); black (highest expression). (D) Adult chromatin state annotations on a predominantly silenced scaffold (contig13522 from 500,000 to 1,500,000 bp) showing the prevalence of ‘ReprPC’ and ‘ReprPCWk’ states. For the definition of chromatin states see panel (B). Coding genes (purple) and long non-coding RNAs (blue) are shown, along with signal coverage tracks showing CEL-seq expression in adult. A grey scale indicates CEL-seq expression level: white (no expression); black (highest expression).

https://doi.org/10.7554/eLife.22194.003

Figure 1—source data 1. Histone H3 covalent post-translational modifications and RNA Polymerase II investigated in this study and their typical genomic localization relative to coding genes and regulatory regions in bilaterian model organisms. https://doi.org/10.7554/eLife.22194.004 (file: elife-22194-fig1-data1-v2.xlsx)
Figure 1—source data 2. Histone H3 sequences used to generate Figure 1—figure supplement 1. https://doi.org/10.7554/eLife.22194.005 (file: elife-22194-fig1-data2-v2.xlsx)
Figure 1—source data 3. BLASTp search outcome of the relevant histone methyltransferases and acetyltransferases against Amphimedon queenslandica proteins (NCBI nr database; E-value <1e-09). https://doi.org/10.7554/eLife.22194.006 (file: elife-22194-fig1-data3-v2.xlsx)
Figure 1—source data 4. Summary statistics and quality metrics of the ChIP-seq datasets used in this study. See also Materials and methods for the ChIP-seq preprocessing procedure. https://doi.org/10.7554/eLife.22194.007 (file: elife-22194-fig1-data4-v2.xlsx)
Figure 1—source data 5. Validation of the ChIP-seq results by ChIP-quantitative PCRs (ChIP-qPCRs). https://doi.org/10.7554/eLife.22194.008 (file: elife-22194-fig1-data5-v2.xlsx)

The antibodies used target the following histone H3 PTMs: (i) monomethylated lysine 4 (H3K4me1), associated with distal cis-regulatory elements such as enhancers; (ii) trimethylated lysine 4 (H3K4me3), enriched in active promoters; (iii) trimethylated lysine 36 (H3K36me3), found with actively transcribed regions; (iv) trimethylated lysine 27 (H3K27me3), enriched in Polycomb-silenced regions; and (v) acetylated lysine 27 (H3K27ac), which occurs around activated regulatory regions. We also used an antibody against total histone H3 (Figure 1—source data 1). An antibody against unphosphorylated Ser2 residues of the RNA polymerase II (RNAPII 8WG16) C-terminal domain was also included (Brookes and Pombo, 2009) (Figure 1—source data 1). As the entire amino acid sequence of histone H3 is perfectly conserved in Amphimedon, along with the relevant histone methyltransferases and acetyltransferases, these antibodies are predicted to recognize the correct epitopes (Figure 1—figure supplement 1; Figure 1—source data 2; Figure 1—source data 3). These antibodies recognize the correct epitopes in even more distantly related organisms (i.e., non-metazoan eukaryotes) (e.g., Sebé-Pedrós et al., 2016; Ercan et al., 2009; Barraza et al., 2015; Harmeyer et al., 2015; Liu et al., 2007; Eckalbar et al., 2016).

ChIP-seq reads generated from immunoprecipitated and input (whole-cell extract) DNA were aligned to the Amphimedon genome (Srivastava et al., 2010), resulting in highly reproducible data sets (Figure 1—figure supplement 2; Figure 1—source data 4; Figure 1—source data 5). Uniquely mapped reads were subsequently used to identify a set of distinct chromatin states based on the five different histone H3 PTMs we assayed. Specifically, chromatin states were predicted throughout the genome by training a multivariate Hidden Markov Model with different a priori defined numbers of states (from 5 to 15) (Materials and methods). We elected to use a 9-state model for all further analyses as it covered all major gene coding and regulatory components (promoter, enhancer, gene body) that we expected to resolve with this selection of histone H3 PTMs. Despite the inherent cellular heterogeneity of our starting material, we were able to resolve specificities towards gene components between these nine chromatin states. They fell into two broad categories: one that correlated with actively transcribed genes and includes active promoters (‘TssA’), enhancers (‘TxEnhA’, ‘EnhWk’), and 5’ and 3’ boundaries of transcribed genes (‘TxFlnk’); and another associated with genes with little or no detectable transcription, which includes bivalent or poised regulatory (‘BivTx’, ‘EnhP’), repressed Polycomb (‘ReprPC’, ‘ReprPCWk’), and quiescent (‘Quies’) states (Figure 1B–D). The nine chromatin states were differentially associated with specific Amphimedon genomic features. For instance, the ‘TssA’ state (defined by the presence of H3K4me3) was enriched around transcription start sites (TSSs) of active genes. The ‘TxEnhA’ state (defined by H3K4me1, H3K27ac, and H3K36me3 enrichment) was associated with coding exons and introns that correspond to potential cis-regulatory elements and with short intergenic regions, which are common in the Amphimedon genome (Kundaje et al., 2015; Fernandez-Valverde et al., 2015; Kowalczyk et al., 2012; Ritter et al., 2012; Singer et al., 2015; Birnbaum et al., 2012; Zentner and Scacheri, 2012; Zentner et al., 2011; Fernandez-Valverde and Degnan, 2016). In contrast, the ‘ReprPC’ states (defined by H3K27me3 enrichment) were spread through the gene bodies of repressed genes, consistent with the known role of H3K27me3 in transcriptional silencing (Zhou et al., 2011; Ho et al., 2014) (Figure 1B and D; Figure 1—figure supplement 2; Figure 1—figure supplement 3).

Despite being comprised of different cell types and having a distinct gene expression profile from the adult, the larval genome possesses a remarkably similar set of chromatin states (Figure 1—figure supplement 4). Obtaining consistent chromatin states based on histone PTMs ChIP-seq data from two markedly different stages of the Amphimedon life cycle provides corroborating evidence that this sponge possesses the same regulatory states as present in eumetazoans.

Histone PTMs and the tuning of gene expression in Amphimedon

To investigate the distribution of histone H3 PTMs in Amphimedon genes, we calculated the average enrichment of histone H3 PTMs and RNAPII relative to the TSSs of protein-coding genes. Input-normalized ChIP-seq read coverage revealed a strong unimodal H3K4me3 peak positioned immediately after the TSS of expressed genes that co-localizes with H3K27ac and RNAPII (Figure 2A; Figure 2—figure supplement 1; Figure 2—figure supplement 2A). Additionally, H3K4me3 marked (i) genes with head-to-head orientation that may be under the control of a bidirectional promoter (a common feature in the Amphimedon genome [Fernandez-Valverde and Degnan, 2016]), and (ii) alternative TSSs (Figure 2—figure supplement 3). This is consistent with H3K4me3 being promoter-proximal and positioned on the +1 nucleosome (Zhou et al., 2011; Ho et al., 2014; Lenhard et al., 2012). A prominent nucleosome-depleted region was observed right upstream of the TSS of expressed genes (likely corresponding to the proximal promoter) followed by a narrowly localized nucleosome (the +1 nucleosome) (see below Figure 2—figure supplement 4D), suggesting that the interplay between nucleosome positioning and transcription is conserved in sponge promoters (Sebé-Pedrós et al., 2016; Schwaiger et al., 2014; Roy et al., 2010; Bai and Morozov, 2010; Jiang and Pugh, 2009). Overall, the distribution of histone H3 PTMs in Amphimedon correlated with the expression state of its genes, as in eumetazoans (Schwaiger et al., 2014; Roy et al., 2010) (Fisher’s exact test, FDR adjusted p-value<0.05) (Figure 2B and C; Figure 2—figure supplement 2B–D).

Figure 2 (with 4 supplements)

Histone PTMs are correlated with gene expression variations during development.

(A) TSS-centred average input DNA normalised read coverage plot of H3K4me3 across *Amphimedon* protein-coding genes. The x-axis spans ± 3 kb around TSSs and represents the position within the gene relative to TSS. The y-axis represents the input DNA normalised enrichment for H3K4me3 ChIP-seq reads in adult *Amphimedon*. Pink line: Non-expressed genes. Blue line: Low expressed genes. Orange line: Medium expressed genes. Light blue line: High expressed genes. The shaded gray area represents the average size of *Amphimedon* coding sequences. (B) Example of coding genes marked by H3K4me3 peaks. The genomic window shows input DNA-normalized H3K4me3 coverage and RNA-seq expression in both larva and adult. (C) The association of regions of enrichment of five histone H3 PTMs (H3K4me3, H3K27ac, H3K4me1, H3K36me3 and H3K27me3) and RNAPII with lists of various gene expression groups in adult is shown. The color key represents the log2(odds ratio) and the significant adjusted P-values (Fisher’s exact test) are superimposed on the grids. A P-value of zero means the overlap is highly significant. N.S.: not significant. Odds ratio represents the strength of association. (D) TSS-centred average input DNA normalised read coverage plots of H3K4me3 and RNAPII across ‘*high-variance*’ and ‘*low-variance*’ protein-coding genes. The x-axis spans ± 3 kb around TSSs and represents the position within the gene relative to TSS. The y-axis represents the input DNA normalised enrichment for ChIP-seq reads in adult *Amphimedon*. Light blue: high-variance coding genes. Orange line: low-variance coding genes. The shaded gray area represents the average size of *Amphimedon* coding sequences. (E) Top five most significantly enriched Gene Ontology (GO) terms for high-variance and low-variance protein-coding genes (adjusted P-values in brackets, Hypergeometric test). The full GO table is shown in Figure 2—source data 1.

https://doi.org/10.7554/eLife.22194.013

Figure 2—source data 1. GO biological process term enrichment outcome for the high-variance and low-variance gene sets (Hypergeometric test, FDR<0.01). https://doi.org/10.7554/eLife.22194.014 (file: elife-22194-fig2-data1-v2.xlsx)
Figure 2—source data 2. KEGG pathways significantly enriched in low-variance and high-variance genes. https://doi.org/10.7554/eLife.22194.015 (file: elife-22194-fig2-data2-v2.xlsx)

To investigate the dynamics of histone PTMs in genes regulated throughout Amphimedon development, we analysed CEL-seq data (Levin et al., 2016; Hashimshony et al., 2012; Anavy et al., 2014), comprising 82 Amphimedon developmental samples from early cleavage to adult compressed into 17 stages, in the context of ChIP-seq profiles of total histone H3, H3K4me3, and RNAPII. We selected genes with the highest median absolute deviation for gene expression across these 17 Amphimedon developmental stages (effectively measuring the amplitude of change in expression levels for a given gene), resulting in a set of 3,200 ‘high-variance’ expressed genes (Figure 2—figure supplement 4A). The remaining expressed genes were defined as ‘low-variance’ genes (3,999) (see Materials and methods for the complete list of selection criteria). It is noteworthy that the high-variance genes were, on average, also expressed at higher levels than the low-variance genes (average adult expression of 51 vs 7 CEL-seq normalized counts, respectively). The TSSs of high-variance genes were strongly marked by H3K4me3 and occupied by RNAPII (Figure 2D; Figure 2—figure supplement 4B). Additionally, they showed nucleosome depletion right upstream of the TSSs (seen as lack of total histone H3 signal), consistent with the notion that H3K4me3 near TSSs destabilizes the interaction between histones and DNA to direct RNAPII to facilitate binding of promoter regulatory elements and initiate transcription (Jiang and Pugh, 2009; Ha et al., 2011; Boeger et al., 2003) (Figure 2—figure supplement 4D). Conversely, lower levels of H3K4me3 or RNAPII (Mann-Whitney U test, p-value=0.05287 and p-value<2.2e-16, respectively; Figure 2D; Figure 2—figure supplement 4C) but higher nucleosome occupancy characterized low-variance genes (seen as lack of nucleosome depletion right upstream of the TSSs; Figure 2—figure supplement 4D). These results are consistent with H3K4me3 being predictive of gene expression levels (Ha et al., 2011; Karlić et al., 2010).
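The ranking step described above, ordering genes by the median absolute deviation (MAD) of their expression across developmental stages, can be illustrated with a minimal Python sketch. The gene names, the toy expression values and the `n_high` cutoff are hypothetical, and the additional selection criteria listed in Materials and methods are not reproduced here.

```python
from statistics import median


def median_absolute_deviation(values):
    """MAD = median(|x - median(x)|), a robust measure of spread."""
    centre = median(values)
    return median(abs(v - centre) for v in values)


def split_by_variance(expression, n_high):
    """Rank genes by MAD across stages (highest first).

    `expression` maps a gene name to its per-stage expression values.
    The top `n_high` genes are labelled high-variance, the rest low-variance.
    """
    ranked = sorted(
        expression,
        key=lambda gene: median_absolute_deviation(expression[gene]),
        reverse=True,
    )
    return ranked[:n_high], ranked[n_high:]


# Toy data (hypothetical genes, five stages instead of 17).
toy_expression = {
    "geneA": [0, 5, 50, 120, 30],  # large swings across stages
    "geneB": [7, 8, 7, 9, 8],      # nearly constant
    "geneC": [1, 40, 2, 60, 3],    # spiky, but most stages are low
}

high_variance, low_variance = split_by_variance(toy_expression, n_high=1)
```

Because the MAD is built from medians, a gene expressed in only a minority of stages (such as `geneC` in the toy data) receives a low score, which fits the description of this measure as capturing the amplitude of change rather than stage-restricted expression.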

The distinctive landscapes of histone PTMs in high-variance and low-variance genes also correlated with distinct functionally related gene groups, as indicated by Gene Ontology (GO) and KEGG pathway analyses. High-variance genes, which also include a significantly higher number of transcription factor gene families (e.g., JUN and ATF6; Jindrich and Degnan, 2016) compared to low-variance genes (Fisher’s exact test, p-value=3.872e-08), were predominantly enriched in signaling pathways (Hypergeometric test, FDR adjusted p-value<0.01; Figure 2E; see Figure 2—source data 1 and Figure 2—source data 2 for the complete list). In contrast, low-variance genes were enriched for metabolic GO terms (Figure 2E; see Figure 2—source data 1 and Figure 2—source data 2 for the complete list). This result is consistent with H3K4me3 being important for tuning the gene expression of dynamically expressed developmental genes, e.g., transcription factor and signaling genes. However, it remains unclear whether H3K4me3 is needed for high levels of gene expression or if it is needed for, or associated with, frequent switching of transcriptional status.

Absence of H3K4me3 in strongly developmentally regulated genes appears to be a conserved metazoan feature

The recent finding that transcription of a subpopulation of extremely dynamically expressed genes – typically being expressed at only one stage of development – in Drosophila and C. elegans occurs in the absence of H3K4me3 challenged the canonical role of histone PTMs in the modulation of gene expression (Pérez-Lluch et al., 2015a). To test whether this newly-discovered feature is conserved in non-bilaterians, we interrogated the above-mentioned CEL-seq data (Levin et al., 2016; Hashimshony et al., 2012; Anavy et al., 2014), comprising 82 Amphimedon developmental samples from early cleavage to adult compressed into 17 stages, and arbitrarily selected, similarly to Pérez-Lluch et al. (2015a), the 1,000 genes with the lowest coefficients of variation (‘stable’ genes), which are expressed with minor changes throughout development. Conversely, the 1,000 genes with the highest coefficients of variation were defined as ‘regulated’ genes. Notably, the ‘regulated’ genes consisted of a small population of genes that differed from the ‘high-variance’ genes described earlier in having much more restricted expression patterns, mainly expressed at late juvenile and/or adult stages (Figure 3—figure supplement 1). Although stable and regulated genes had similar levels of RNAPII and total histone H3 (Figure 3—figure supplement 1B and C), the stable genes were strongly marked by H3K4me3 and the regulated genes had significantly lower levels of H3K4me3 (Mann-Whitney U test, p-value=7.431e-05; Figure 3A), suggesting that reduction in H3K4me3 levels does not affect expression of the regulated genes (Pérez-Lluch et al., 2015a).
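As an illustration of the selection just described, the following Python sketch ranks genes by their coefficient of variation (CV, standard deviation divided by mean) and takes the two extremes of the ranking. It is a sketch under stated assumptions: the gene names and values are invented, the population standard deviation is used, and genes with zero mean expression are simply skipped, none of which is specified in the text.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV = standard deviation / mean (population SD; an assumption)."""
    return pstdev(values) / mean(values)


def stable_and_regulated(expression, n):
    """Return the n lowest-CV ('stable') and n highest-CV ('regulated') genes.

    Genes whose mean expression is zero are skipped, since their CV is
    undefined (an assumption made for this sketch).
    """
    expressed = {g: v for g, v in expression.items() if mean(v) > 0}
    ranked = sorted(expressed, key=lambda g: coefficient_of_variation(expressed[g]))
    stable = ranked[:n]
    regulated = ranked[-n:][::-1]  # highest CV first
    return stable, regulated


# Toy data (hypothetical genes, four stages instead of 17).
toy_expression = {
    "g1": [10, 10, 10, 10],  # constant: CV = 0
    "g2": [0, 0, 0, 100],    # single-stage expression: highest CV
    "g3": [8, 12, 10, 10],   # mild fluctuation
    "g4": [1, 2, 30, 3],     # one dominant stage
    "g5": [0, 0, 0, 0],      # not expressed: skipped
}

stable, regulated = stable_and_regulated(toy_expression, n=1)
```

Unlike the median absolute deviation, the CV is dominated by a single strongly expressed stage, so stage-restricted genes such as `g2` rise to the top of the ranking.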

Figure 3 (with 1 supplement)

Expression without H3K4me3 in strongly developmentally regulated genes.

(A) TSS-centred average input DNA normalised read coverage plot of H3K4me3 across *‘regulated’* and *‘stable’* protein-coding genes during *Amphimedon* development. The x-axis spans ± 3 kb around TSSs and represents the position within the gene relative to TSS. The y-axis represents the input DNA normalised enrichment for H3K4me3 ChIP-seq reads in adult *Amphimedon*. Light blue line: first 500 regulated genes. Orange line: second 500 regulated genes. Purple line: first 500 stable genes. Pink line: second 500 stable genes. The shaded gray area represents the average size of *Amphimedon* coding sequences. (B) Input DNA-normalized H3K4me3 coverage and RNA-seq expression in adult for Aqu2.1.40735_001, a gene stably expressed during *Amphimedon* development, Aqu2.1.39666_001, a regulated gene with adult-specific expression, and Aqu2.1.34366_001, a regulated gene with larva-specific expression. (C) TSS-centred average input DNA normalised read coverage plot of H3K4me3 across *‘regulated’* and *‘stable’* protein-coding genes during *Nematostella vectensis* development. The x-axis spans ± 3 kb around TSSs and represents the position within the gene relative to TSS. The y-axis represents the input DNA normalised enrichment for H3K4me3 ChIP-seq reads in *Nematostella* adult female polyps. The shaded gray area represents the average size of *Nematostella* coding sequences.

https://doi.org/10.7554/eLife.22194.020

We compared the pattern of H3K4me3 between one of the top three stably expressed genes during sponge development (Aqu2.1.40735_001, an E3 ubiquitin-protein ligase), and the gene with the highest coefficient of variation (Aqu2.1.39666_001, a putative sponge-specific gene specifically expressed in adult) (Figure 3B). The former showed a strong H3K4me3 enrichment at the TSS, whereas the latter lacked any marking, though its expression in the adult was ~70 times higher than the stable gene (33 vs 2361 CEL-seq normalized counts in adult, respectively). This lack of H3K4me3 at the TSS of regulated genes was similarly observed in the larva, exemplified here by a larva-specific regulated gene (Aqu2.1.34366_001) expressed 3.5-fold higher than the above-mentioned stable gene (Aqu2.1.40735_001) (147 vs 43 CEL-seq normalized counts in larva, respectively) (Figure 3B). Additionally, as shown in Drosophila (Pérez-Lluch et al., 2015a), regulated genes showed higher levels of H3K27me3 (Mann-Whitney U test, p-value<6.517e-06) and lower levels of H3K36me3 (Mann-Whitney U test, p-value<9.235e-08) than did stable genes (Figure 3—figure supplement 1D and E). Analyzing RNA-seq–based gene expression through the development of the cnidarian Nematostella vectensis (Helm et al., 2013) and previously published ChIP-seq data sets in Nematostella adult female polyps (Schwaiger et al., 2014), we obtained the same pattern (Mann-Whitney U test, p-value<2.2e-16; Figure 3C).

These results suggest that H3K4me3 might not be instrumental for extremely dynamic developmental expression and reinforce our interpretation that it is required for tuning the levels of gene expression, a pattern that appears to be a conserved metazoan feature (Pérez-Lluch et al., 2015a).

Polycomb repressive complex 2 (PRC2) is conserved in Amphimedon and its binding sites contain putative GAGA factor binding motifs

PRC2 is responsible for the trimethylation of lysine 27 of histone H3 (H3K27me3), one of the best-characterized repressive histone H3 PTMs (Margueron and Reinberg, 2011). As a step to investigate a putative mechanism of PRC2-mediated silencing in Amphimedon, we identified the sponge homologs of Drosophila PRC2 components and found that the Amphimedon genome contains four copies of E(z) homologs, two copies of ESC homologs and one copy for each of the remaining components, SU(z)12 and Nurf55 (Figure 4A; Figure 4—source data 1).

Figure 4 (with 1 supplement)

DNA motifs overrepresented in H3K27me3 transcriptionally silenced regions.

(A) Diagram representing the composition of *Drosophila* PRC2 complex and its four core components: the catalytic subunit of the complex E(z), the zinc finger protein SU(z)12, the WD-repeat protein ESC and the histone-binding protein Nurf55. E(z) is responsible for the main enzymatic activity of PRC2, which is to trimethylate histone H3 at lysine 27, yielding H3K27me3. Adapted from (Vissers et al., 2012). The presence (green) or absence (orange) of PRC2 and its core components in the different opisthokont species represented in the phylogenetic tree (left) is shown. *Amphimedon* is highlighted in green. (B) Sequence logos of a subset of the DNA motifs determined by MEME-ChIP analysis to be significantly enriched in the transcriptionally silenced regions marked by H3K27me3 in adult *Amphimedon*. For each motif, the best TOMTOM match to a motif in the JASPAR CORE and UniPROBE mouse databases, the E-value and the number of sites contributing to the construction of the motif are shown, respectively. The matched motif is shown on the top and the query motif is shown on the bottom.

https://doi.org/10.7554/eLife.22194.022

Figure 4—source data 1. Putative orthologs of Drosophila PcG components and associated factors in yeast, Capsaspora, sponge, nematode, and human genomes. Table of PcG proteins is adapted from http://www.igh.cnrs.fr/equip/cavalli/link.PolycombTeaching.html. https://doi.org/10.7554/eLife.22194.023 (file: elife-22194-fig4-data1-v2.xlsx)

PRC2 recruitment has been best characterised in Drosophila where PRC2 proteins repress their target genes by recruitment to Polycomb Response Elements (PREs), which contain binding sites for sequence-specific DNA binding proteins, including GAGA factor and members of the Krüppel-like factor family (Müller and Kassis, 2006; Brown et al., 2005; Strutt et al., 1997; Simon and Kingston, 2009; Kassis and Brown, 2013). To test whether Amphimedon PRC2 complexes might be recruited via a similar mechanism, we used the transcriptionally silenced regions marked by H3K27me3 in a de novo motif analysis (Materials and methods). We searched for short motifs (6–15 bp) on the basis that the known interaction sites of PRE-binding proteins in Drosophila are of approximately this length (~8 bp). Conserved binding motifs similar to the GAGA and Krüppel-like factors, in addition to binding motifs similar to homeodomain-containing developmental regulators (e.g., Irx family members), were significantly enriched (E-value<0.05) in the DNA associated with the H3K27me3 silenced regions in both adult and larva (Figure 4B; Figure 4—figure supplement 1). As in eumetazoans, this result suggests that Amphimedon PRC2 complexes are likely to be recruited through PRE-like sequences and may target developmental regulators for H3K27me3 deposition and transcriptional silencing (Margueron and Reinberg, 2011; Di Croce and Helin, 2013; Boyer et al., 2006).

Subset of Amphimedon lincRNAs is associated with an enhancer-like chromatin state

An additional layer of regulatory complexity in eumetazoan development is provided by long intergenic non-coding RNAs (lincRNAs) (Ulitsky, 2016; Hezroni et al., 2015; Quinn and Chang, 2016), which have recently been shown to be developmentally expressed in sponges (Gaiti et al., 2015; Bråte et al., 2015). Here, we extended these analyses by examining the chromatin states of Amphimedon lincRNAs (Gaiti et al., 2015), excluding lncRNAs located in introns of protein-coding genes or antisense to coding genes, which may yield ambiguous signals.

Previous studies have shown that the ratio of H3K4me1-to-H3K4me3 marks around TSSs can separate lincRNAs into enhancer-like lincRNAs (elincRNAs; high H3K4me1-to-H3K4me3 ratio) and canonical promoter-like lincRNAs (plincRNAs; low H3K4me1-to-H3K4me3 ratio) (Sebé-Pedrós et al., 2016; Marques et al., 2013; IIott et al., 2014). Thus, to explore whether sponge lincRNAs might originate from enhancer regions, we interrogated our ChIP-seq data sets and calculated the relative ratio of H3K4me1-to-H3K4me3 in a 4 kb window centered on lincRNA TSSs. Only lincRNAs in scaffolds larger than 10 kb that overlapped with regions of enrichment of H3K4me1, H3K4me3, and RNAPII were used in this analysis (n = 217). Similarly to IIott et al. (2014), we arbitrarily adopted a H3K4me1-to-H3K4me3 ratio of >1.2 and <0.8 to define elincRNAs and plincRNAs, respectively. Based on these criteria, we found 153 putative elincRNAs (70%) significantly enriched for H3K4me1 over H3K4me3 (Mann-Whitney U test, p-value=2.272e-05) and 21 (10%) putative plincRNAs with canonical promoter signature, i.e., higher enrichment of H3K4me3 over H3K4me1 (Mann-Whitney U test, p-value=1.925e-07). 43 (20%) lincRNAs could not be assigned to either group, that is, 0.8 < H3K4me1-to-H3K4me3 < 1.2 (Figure 5A–D; Figure 5—source data 1; Figure 5—figure supplement 1).
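The ratio-based assignment described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: only the two thresholds (>1.2 and <0.8) come from the text, while the function name, the input format and the example enrichment values are hypothetical.

```python
from collections import Counter

ELINC_MIN_RATIO = 1.2  # ratio above this -> enhancer-like (threshold from the text)
PLINC_MAX_RATIO = 0.8  # ratio below this -> promoter-like (threshold from the text)


def classify_lincrna(h3k4me1, h3k4me3):
    """Classify one lincRNA from its H3K4me1 and H3K4me3 enrichment around the TSS.

    Both inputs are assumed to be positive enrichment values summarised over
    the 4 kb window centred on the TSS (hypothetical input format).
    """
    if h3k4me3 <= 0:
        raise ValueError("H3K4me3 enrichment must be positive to form a ratio")
    ratio = h3k4me1 / h3k4me3
    if ratio > ELINC_MIN_RATIO:
        return "elincRNA"
    if ratio < PLINC_MAX_RATIO:
        return "plincRNA"
    return "unassigned"  # 0.8 <= ratio <= 1.2


# Hypothetical (H3K4me1, H3K4me3) enrichment values, for illustration only.
example_signals = {
    "lincRNA_a": (3.0, 1.5),  # ratio 2.0
    "lincRNA_b": (1.0, 4.0),  # ratio 0.25
    "lincRNA_c": (2.0, 2.0),  # ratio 1.0
}

labels = {
    name: classify_lincrna(me1, me3)
    for name, (me1, me3) in example_signals.items()
}
counts = Counter(labels.values())
```

Ratios that fall exactly on a threshold are left unassigned here, which is one reading of the strict inequalities given in the text.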

Figure 5 with 1 supplement


*Amphimedon* lincRNA populations defined by histone PTM signatures.

(A) Heatmap showing the average read normalised coverage of H3K4me1, H3K4me3 and their ratio in adult *Amphimedon* across a 4 kb interval centred on TSSs of lincRNAs. Each line of the heatmaps represents a single lincRNA (y-axis). Profiles are sorted based on the differences in enrichment between H3K4me1 and input DNA, and H3K4me3 and input DNA, respectively. Also provided is the H3K4me1:H3K4me3 log2(ratio) around TSSs. (B) Enrichment of H3K4me1 (left) and H3K4me3 (right) (ChIP versus input) at plincRNAs and elincRNAs. P-values are indicated for Mann-Whitney U test. (C) Example of lincRNAs with promoter-like chromatin signature (plincRNAs). For the definition of adult chromatin states see Figure 1A. Promoter-like lincRNAs (blue) are shown, along with input DNA-normalized coverage of different histone modifications and RNA-seq expression in adult. (D) Same as (C) but for lincRNAs with enhancer-like chromatin signature (elincRNAs).

https://doi.org/10.7554/eLife.22194.025

Figure 5—source data 1 Annotation of putative elincRNAs and plincRNAs.: https://doi.org/10.7554/eLife.22194.026

These results indicate that sponge lincRNAs can be separated into two distinct populations of poly(A)⁺ transcripts based on the chromatin status at their TSSs. Although these two populations resemble those found in human, mouse and Capsaspora lincRNAs (Sebé-Pedrós et al., 2016; Marques et al., 2013; IIott et al., 2014), their functional significance is yet to be determined.

Identification of enhancer elements in Amphimedon

To identify putative enhancer elements in Amphimedon in silico, we selected distal H3K4me1 regions of enrichment (high confidence regions, representing reproducible events across true biological replicates) that did not overlap TSSs (±200 bp) of protein-coding genes and lncRNAs, but overlapped with regions designated as being in an enhancer chromatin state based on the ChromHMM analysis (‘TxEnhA’ or ‘EnhWk’ or ‘EnhP’ state in adult; ‘TxEnhA1’ or ‘TxEnhA2’ or ‘EnhWk’ or ‘EnhP’ state in larva, which consist of typical eumetazoan enhancer histone H3 PTM patterns) (Figure 6A). A subset of these regions was also marked by H3K27ac, and therefore likely to be transcriptionally active (Figure 6A and B; Figure 6—source data 1). These predicted activated enhancer-like regions showed a significant enrichment of H3K4me1 and H3K27ac over H3K4me3 (Mann-Whitney U test, p-value<2.2e-16; Figure 6C; Figure 6—figure supplement 1), a biochemical signature typical of eumetazoan enhancers (Schwaiger et al., 2014). Interestingly, RNAPII occupied some of these Amphimedon predicted activated enhancer-like elements (35% and 41% in adult and larva, respectively), suggesting poly(A)+ enhancer RNAs could be transcribed from these regions (Natoli and Andrau, 2012; Li et al., 2016; Kim et al., 2010) (Figure 6A–D; Figure 6—figure supplement 2). Alternatively, but not exclusively, this might represent the result of chromatin looping and the simultaneous pulldown of both enhancers and promoters with the RNAPII antibody (Shlyueva et al., 2014).
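The filtering logic described in this paragraph can be sketched as a set of interval-overlap checks. This is a simplified illustration under stated assumptions: intervals are half-open (scaffold, start, end) tuples, the overlap search is brute force, and all coordinates and the non-enhancer state name `Other` are hypothetical. Only the ±200 bp TSS window and the adult enhancer state names are taken from the text.

```python
ENHANCER_STATES_ADULT = {"TxEnhA", "EnhWk", "EnhP"}  # adult state names from the text
TSS_FLANK = 200  # bp either side of each TSS, as in the text


def overlaps(a, b):
    """True if two half-open (scaffold, start, end) intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def predict_enhancers(h3k4me1_peaks, tss_positions, state_segments, h3k27ac_peaks):
    """Return (peak, is_activated) for each H3K4me1 peak that passes the filters."""
    tss_windows = [
        (scaf, max(0, pos - TSS_FLANK), pos + TSS_FLANK)
        for scaf, pos in tss_positions
    ]
    enhancer_segments = [
        (scaf, start, end)
        for scaf, start, end, state in state_segments
        if state in ENHANCER_STATES_ADULT
    ]
    predicted = []
    for peak in h3k4me1_peaks:
        if any(overlaps(peak, window) for window in tss_windows):
            continue  # too close to a TSS to be called distal
        if not any(overlaps(peak, seg) for seg in enhancer_segments):
            continue  # not in an enhancer chromatin state
        activated = any(overlaps(peak, ac) for ac in h3k27ac_peaks)
        predicted.append((peak, activated))
    return predicted


# Hypothetical coordinates, for illustration only.
peaks = [("s1", 1000, 1500), ("s1", 5000, 5600), ("s1", 9000, 9400), ("s2", 100, 400)]
tss = [("s1", 1200), ("s2", 5000)]
states = [
    ("s1", 4800, 6000, "TxEnhA"),
    ("s1", 8900, 9500, "EnhWk"),
    ("s2", 0, 1000, "Other"),
]
h3k27ac = [("s1", 5100, 5300)]

predicted = predict_enhancers(peaks, tss, states, h3k27ac)
```

For genome-scale peak sets an interval index or a dedicated interval tool would be preferable to the nested loops used here.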

Figure 6 with 5 supplements


Distal enhancer regulation at the dawn of animals.

(A) Overview of the computational filtering pipeline adopted to predict the putative *Amphimedon* activated enhancer-like elements. See main text and Materials and methods for details. (B) Heatmap showing different histone modifications enrichment at predicted activated enhancer-like elements (±2 kb of flanking regions). (C) Boxplot showing enrichment of different histone modifications (ChIP versus input) at predicted activated enhancer-like elements, showing that activated enhancer-like elements have higher H3K4me1 than H3K4me3 levels, a typical characteristic of eumetazoan enhancers. Four asterisks (****) indicate p-values<2.2e-16 for Mann-Whitney U test between H3K4me3 and H3K27ac, between H3K4me3 and H3K4me1, and between H3K4me3 and RNAPII, respectively. (D) Example of predicted activated enhancer-like elements. Protein coding genes (purple) are shown, along with input DNA-normalized coverage of different histone modifications and RNA-seq expression in adult. Regions of enrichments (high confidence peaks, representing reproducible events across true biological replicates) corresponding to the predicted activated enhancer-like elements are highlighted in grey. (E) Sequence logos of the DNA motifs determined by MEME-ChIP analysis enriched in the adult predicted activated enhancer-like sequences. For each motif, the best match to a motif in the JASPAR CORE and UniPROBE mouse databases, the E-value and the number of sites contributing to the construction of the motif are shown, respectively. The matched motif is shown on the top and the query motif is shown on the bottom. (F) Gene Ontology (GO) enrichment activities of the nearest neighbor protein-coding genes of the adult predicted activated enhancer-like elements are shown. Bar length indicates the significance of the enrichment (Hypergeometric test; -log10[adjusted P-value]). Only the top ten GO biological process terms are shown. See Figure 6—source data 2 for the complete list. (G) Boxplot showing the size of introns that harbour adult activated enhancer-like elements versus all introns in the genome. The y-axis indicates the intron size (bp) in log scale. P-value is indicated for Mann-Whitney U test.

https://doi.org/10.7554/eLife.22194.028

Figure 6—source data 1 Genomic location of all the predicted activated enhancer-like elements and their distance to the closest TSS.: https://doi.org/10.7554/eLife.22194.029
Figure 6—source data 2 Functional annotation of nearest neighbors genes of the adult predicted activated enhancer-like elements.: https://doi.org/10.7554/eLife.22194.030
Figure 6—source data 3 Functional annotation of nearest neighbors genes of the larva predicted activated enhancer-like elements.: https://doi.org/10.7554/eLife.22194.031
Figure 6—source data 4 GO term enrichment outcome for the nearest neighbors genes of the adult predicted activated enhancer-like elements (Hypergeometric test, FDR<0.01).: https://doi.org/10.7554/eLife.22194.032
Figure 6—source data 5 GO term enrichment outcome for the nearest neighbors genes of the larva predicted activated enhancer-like elements (Hypergeometric test, FDR<0.01).: https://doi.org/10.7554/eLife.22194.033

In eumetazoans, genes encoding transcriptional regulators are themselves regulated by multiple enhancer elements (Schwaiger et al., 2014; Shlyueva et al., 2014; Nègre et al., 2011; Bogdanovic et al., 2012; Woolfe et al., 2005; Heintzman et al., 2009). We therefore performed de novo motif analysis and, despite the limited power of motif detection due to the inherent cellular heterogeneity of our starting material, we were able to show that consensus binding motifs of key developmental transcription factor families were over-represented in the adult predicted activated enhancer-like sequences, including Zinc finger, Irx, SOX and POU binding motifs (Figure 6E; Figure 6—figure supplement 3). It is noteworthy that Zinc fingers can also be involved in roles that might be unrelated to directly regulating gene expression per se, e.g., chromatin remodeling (Wysocka et al., 2006). Similar binding motifs were obtained analysing the larva predicted activated enhancer-like sequences (Figure 6—figure supplement 4). Next, we examined whether the sponge predicted activated enhancer-like elements were preferentially located next to protein-coding genes involved in development and/or transcriptional regulation. By searching for the closest located TSSs to each of the predicted activated enhancer-like elements in Amphimedon, we nominated putative target protein-coding genes. Akin to eumetazoans, these nearest neighbor genes were significantly enriched for Gene Ontology (GO) terms associated with transcription factor activity and developmental processes (Hypergeometric test, FDR adjusted p-value<0.01) (Figure 6F), and comprised several transcription factors, including SOX2, FOS and NF-kB (Figure 6D; Figure 6—figure supplement 5; Figure 6—source data 2–5).
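The GO term enrichment test mentioned above can be illustrated with a short sketch of a one-sided hypergeometric test followed by a false discovery rate adjustment. The text states only that a hypergeometric test with FDR-adjusted p-values was used; the Benjamini-Hochberg procedure shown here is an assumption, and all function names and numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the GO term."""
    max_hits = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: 20 genes in the genome, 5 annotated with a term,
# 4 nearest-neighbour genes selected, 2 of which carry the term.
p_example = hypergeom_enrichment_pvalue(population=20, annotated=5, sample=4, hits=2)

# Hypothetical raw p-values for four GO terms.
adjusted_example = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In practice the background population would be the set of annotated Amphimedon protein-coding genes, and each GO term would be tested separately before adjustment.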

Vertebrates exhibit expansive intergenic regions where the majority of predicted enhancers are located (ENCODE Project Consortium, 2012; Djebali et al., 2012). In contrast, in Amphimedon, which has a highly compact genome with minimal intergenic regions (Fernandez-Valverde and Degnan, 2016), predicted activated enhancer-like elements were predominantly intragenic, with only a minority found in intergenic regions (9% and 20% in adult and larva, respectively). This, along with the strong enrichment of chromatin states typically associated with eumetazoan enhancers – ‘TxEnhA’ and ‘EnhWk’ – in introns (Figure 1A; Figure 1—figure supplement 4), suggests a similar overall genomic distribution between Amphimedon, Nematostella and Drosophila enhancer elements (Schwaiger et al., 2014; Nègre et al., 2011; Arnold et al., 2013).

Greater intron length often associates with the presence of highly conserved non-coding elements (Irimia et al., 2011). We therefore extracted the introns that harbour predicted activated enhancer-like elements and compared their size distribution to that of all intronic regions found across the genome. The former were significantly longer than genomic introns overall, with means of 332 bp versus 256 bp and medians of 99 bp versus 71 bp, respectively (Ansari-Bradley test, p-value=0.06151; Mann-Whitney U test, p-value=1.927e-06) (Figure 6G), suggesting that cis-regulatory expansion appears to have occurred primarily in intronic rather than intergenic regions in Amphimedon.
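The comparison of two size distributions with a Mann-Whitney U test can be sketched as below. This is an illustrative implementation using the normal approximation with a tie correction and no continuity correction; it is not the software used in the study, and the intron lengths are hypothetical.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, approximate p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Tag each value with its sample of origin (0 = x, 1 = y) and sort.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        tie_size = j - i
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += tie_size ** 3 - tie_size
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a shift
    z = (u_x - mean_u) / sqrt(var_u)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return u_x, p_value


# Hypothetical intron lengths (bp), for illustration only.
enhancer_introns = [10, 20, 30]
all_introns = [1, 2, 3]
u_stat, p_val = mann_whitney_u(enhancer_introns, all_introns)
```

For samples this small an exact test would normally be preferred; the approximation is shown only because it is compact and is appropriate for the large sample sizes involved in a genome-wide intron comparison.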

Cis-regulation constrains genome architectures over 700 Myr of evolution

Highly conserved non-coding regulatory elements are often associated not only with greater intron length, but also with genes encoding developmental regulators (Woolfe et al., 2005; Vavouri et al., 2007; Sandelin et al., 2004). Particularly interesting are the conserved ancestral microsyntenic pairs (herein microsyntenic units) that consist of either (i) two neighbor genes that share common cis-regulatory elements, or (ii) a developmental regulator and nearby functionally unrelated gene(s), whose introns harbor conserved cis-regulatory elements (Kikuta et al., 2007; Irimia et al., 2013; Engström et al., 2007; Irimia et al., 2012; Naville et al., 2015). Experimental evidence has been provided for the existence of this type of cis-regulation in vertebrates (Irimia et al., 2012; Naville et al., 2015).

To test whether this is an ancient cis-regulatory mechanism maintained through animal evolution, we assessed the spatial relationship between the genes of each of the 80 microsyntenic units previously reported to be present in the Amphimedon genome (Irimia et al., 2012) and clarified their orthology, confirming the presence of 60 unambiguous microsyntenic units. Remarkably, 43 of these 60 evolutionarily conserved metazoan microsyntenies contained putative enhancer-like signatures in Amphimedon adults (Figure 7A; Figure 7—source data 1; Figure 7—figure supplement 1). This was a much higher fraction relative to a control set consisting of 60 pairs of two randomly selected nonsyntenic neighbor genes (1,000 iterations; p-value<0.00001). This pattern was substantiated by the finding of larva enhancer-like signatures in 16 of the 60 microsyntenic units, seven of which contained both larva and adult predicted enhancer-like elements (Figure 7A; Figure 7—source data 1; Figure 7—figure supplement 1).
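A resampling comparison of this kind can be sketched as follows. This is one possible design and not necessarily the procedure used in the study: it repeatedly draws 60 control gene pairs from a pool and counts how often the draw contains at least as many enhancer-bearing pairs as observed. The control pool, its 20% background rate and the function name are hypothetical; the numbers 43, 60 and 1,000 come from the text.

```python
import random


def empirical_enrichment_pvalue(observed_hits, control_has_enhancer, n_units,
                                iterations=1000, seed=0):
    """Empirical one-sided p-value for seeing >= observed_hits enhancer-bearing
    pairs among n_units randomly drawn control pairs.

    Uses the (count + 1) / (iterations + 1) convention so the estimate is
    never exactly zero.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    as_extreme = 0
    for _ in range(iterations):
        draw = rng.sample(control_has_enhancer, n_units)
        if sum(draw) >= observed_hits:
            as_extreme += 1
    return (as_extreme + 1) / (iterations + 1)


# Hypothetical control pool: 1,000 nonsyntenic neighbor pairs, 200 of which
# contain an enhancer-like signature.
control_pool = [True] * 200 + [False] * 800

p_empirical = empirical_enrichment_pvalue(
    observed_hits=43, control_has_enhancer=control_pool, n_units=60
)
```

Under this convention the smallest attainable value is 1/(iterations + 1), so a sketch of this form cannot by itself reproduce a bound as small as the one reported; a parametric fit to the resampled distribution would be one way to obtain smaller values.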

Figure 7 with 1 supplement


*Amphimedon* enhancer-like elements are enriched in metazoan-specific microsyntenic units.

(A) Putative adult and larva enhancer-like signatures identified in the 60 metazoan-specific microsyntenic pairs investigated. (B) The cladogram represents known phylogenetic distribution of the *Isl2-Scaper* microsyntenic gene pair across opisthokonts. The orientation of the arrow corresponds to gene orientation. *Isl2-Scaper* is not conserved in yeast, *Capsaspora*, *Nematostella* and *C. elegans*. (C) Enhancer elements in the *Isl-Scaper* microsyntenic gene pair locus in *Amphimedon*. *Scaper* and *Isl* genes (purple) are shown, along with input DNA-normalized coverage of H3K4me3 and H3K4me1 and RNA-seq expression in both adult and larva. Regions of enrichments (high confidence peaks, representing reproducible events across true biological replicates) corresponding to the predicted enhancer-like elements located within the introns of *Scaper* are highlighted in grey.

https://doi.org/10.7554/eLife.22194.039

Figure 7—source data 1 60 microsyntenic units representing functional gene linkages and presence-absence of chromatin states containing typical eumetazoan enhancer histone PTM patterns (‘EnhP’, ‘EnhWk’, ‘TxEnhA’) (adult only) and/or in silico predicted enhancer-like elements (both larva and adult).: https://doi.org/10.7554/eLife.22194.040

A striking case of conserved gene linkage involves the Islet LIM homeobox gene (Isl), which plays conserved roles in animal development (Thor and Thomas, 1997; Liang et al., 2011), and Scaper (S-phase cyclin A-associated protein in the ER) (Figure 7B). The Amphimedon Scaper contains 25 introns, some of which are considerably longer (>1 kb) than the mean intron size (Fernandez-Valverde et al., 2015), with predicted enhancer-like elements located within introns 10, 17 and 21 (Figure 7C). Likewise, the microsynteny of Tfap4 (transcription factor AP-4) (Simionato et al., 2007) and Glis2 (GLIS family zinc finger 2) is deeply conserved. Similar to an observation in vertebrates (Abbasi et al., 2007), the sponge Glis2 contains two introns, of which the second harbors several adult predicted activated enhancer-like elements (Figure 7—source data 1; Figure 7—figure supplement 1). Together, these results suggest that the genomic location of some cis-regulatory elements likely places constraints on the evolution of nearby genes, leading to the occurrence of conserved microsyntenic gene blocks across the animal kingdom.

Discussion

Since their point of divergence over 700 Mya, sponges and eumetazoans have had radically different evolutionary histories, with the eumetazoan ancestor giving rise to a range of morphologically-complex body plans, and the sponge ancestor yielding one basic morphologically-simple body plan. As both these lineages share a remarkably similar repertoire of developmental gene families (Srivastava et al., 2010; Larroux et al., 2008; Richards et al., 2008), these different evolutionary trajectories have yet to be reconciled in terms of genome content and organization. Recently it has been shown that, despite having a highly compact genome with minimal intergenic regions (Fernandez-Valverde and Degnan, 2016), Amphimedon displays dynamic developmental gene expression akin to eumetazoans (Gaiti et al., 2015; Levin et al., 2016). By generating the first, to our knowledge, comprehensive genome-wide maps of histone H3 PTMs and putative enhancer elements in a non-eumetazoan animal, we determine that this transcriptional complexity is paralleled by regulatory complexity encoded by combinatorial histone H3 PTM patterns in this sponge.

Histone H3 PTMs have conserved role(s) in the modulation of gene expression across metazoans

Despite Amphimedon’s morphological simplicity, we find strong evidence in this sponge for the existence of a range of regulatory states that underlie eumetazoan development. For instance, the genome-wide promoter analysis of H3K4me3 – the canonical and widespread eukaryotic histone H3 PTM of active transcription – reveals a complex correlation between H3K4me3-containing nucleosome occupancy and gene expression in Amphimedon adults and larvae, consistent with an active and finely tuned role for H3K4me3 in modulating transcriptional activity and expression variability of developmental genes. Unexpectedly, we identify a small subpopulation of highly and specifically expressed genes that challenge this premise and are transcribed in the absence of H3K4me3 in Amphimedon and Nematostella. This subpopulation of genes differs from most other developmentally-expressed genes that possess the H3K4me3 mark, in having much more stage-restricted expression profiles; in this analysis, most are expressed only in one stage of development. Although it could be argued that this apparent absence of H3K4me3 is the consequence of the expression of regulated genes being spatially confined to specific cell populations, thus potentially limiting our detection sensitivity with our cell admixture ChIP-seq, these results run parallel to the recent finding by Pérez-Lluch et al. (2015a) that Drosophila and C. elegans exhibit the same pattern, suggesting that this newly-discovered feature is conserved across the animal kingdom. As the expression of the developmentally regulated genes is required only for a limited period, the absence of H3K4me3 mark would allow their rapid on-off switching. Alternative mechanisms, such as the transient binding of transcription factors, appear to play a major role in regulating the expression of these genes (Pérez-Lluch et al., 2015a, 2015b).

Model of an evolutionarily conserved mechanism of PRC2-mediated gene silencing

Polycomb Repressive Complex 2 (PRC2) primarily trimethylates histone H3 on lysine 27 and has been conserved throughout opisthokont evolution, with its core subunits (E(z), SU(z)12, ESC and Nurf55) being present in animals, choanoflagellates and multicellular fungi, but absent in Capsaspora, and budding and fission yeast (Sebé-Pedrós et al., 2016; Margueron and Reinberg, 2011; Shaver et al., 2010; Jamieson et al., 2013; Connolly et al., 2013; Ikeuchi et al., 2015; Whitcomb et al., 2007) (Figure 4A). This is consistent with the PRC2 complex being lost in several unicellular lineages. One of the ancestral roles of PRC2 in opisthokonts may have been in defense response against viruses and transposable elements, or insertion of new genes (Jamieson et al., 2013), prior to being co-opted for cell-type specific developmental regulation in animals, where H3K27me3 and PRC2 are required for transmitting the memory of repression across generations and during development (Margueron and Reinberg, 2011; Shaver et al., 2010; Gaydos et al., 2014; Barski et al., 2007). In fact, PRC2 often regulates deposition of H3K27me3 marks at loci encoding developmental regulators (Ha et al., 2011; Margueron and Reinberg, 2011; Barski et al., 2007). The finding of short conserved developmental transcription factor-binding-sites in Amphimedon H3K27me3 silenced regions is consistent with this evolutionary scenario. Analogous to recent findings in plants (Deng et al., 2013; Hecker et al., 2015), the identification of an enriched motif in the H3K27me3 silenced regions similar to the GAGA factor binding site, a component of the Drosophila Polycomb group response elements, suggests a role for the GAGA factor binding sites in strengthening PRC2 recruitment to target genes (Müller and Kassis, 2006; Simon and Kingston, 2009; Kassis and Brown, 2013).
It is noteworthy that a sponge homolog of Drosophila GAGA factor was not identified in the current Amphimedon genome assembly (Figure 4—source data 1), suggesting the convergent co-option of other DNA binding proteins with analogous role(s) in the recruitment of PRC2.

The origin of animal distal enhancer regulation

Analyses of cis-regulatory DNA and histone PTMs have revealed that some cis-regulatory mechanisms, such as those associated with proximal promoters, are present in non-animal holozoans, while others appear to have evolved later on the stem leading to the crown metazoans, most notably distal enhancers (Sebé-Pedrós et al., 2016; Schwaiger et al., 2014). The latter have been posited to be one of the key contributing factors underlying the spatial and temporal coordination of cell differentiation that defines animal development (Levine, 2010; Levine et al., 2014; Levine and Tjian, 2003; Peter and Davidson, 2011). Our in silico prediction of Amphimedon enhancer elements based on histone H3 PTM co-localization patterns is consistent with these elements evolving along the metazoan stem at the transition to multicellularity (Sebé-Pedrós et al., 2016). Interestingly, promoter DNA regulatory elements that allow for context- and cell type-specific gene expression also appear to have evolved in stem metazoans (Fernandez-Valverde and Degnan, 2016), suggesting these are also a critical component of the animal cis-regulatory landscape. Amphimedon predicted enhancer-like elements are characterized by the same combination of histone H3 PTMs as in eumetazoans, which appears to be lacking in unicellular holozoan relatives of animals (Sebé-Pedrós et al., 2016; Bulger and Groudine, 2011). Their preferential association with developmental and transcriptional regulators suggests that Amphimedon enhancer elements are likely to regulate developmental genes in a manner akin to eumetazoans (Schwaiger et al., 2014; Shlyueva et al., 2014; Nègre et al., 2011; Bogdanovic et al., 2012; Woolfe et al., 2005; Heintzman et al., 2009). Enhancer elements are known to be associated with the transcription of both short poly(A)⁻ and long poly(A)⁺ enhancer RNAs (2D and 1D eRNAs, respectively) (Natoli and Andrau, 2012; Li et al., 2016; Kim et al., 2010).
The presence of RNAPII and the detection of expression at a subset of the Amphimedon activated enhancer-like elements is consistent with this notion (Figure 6—figure supplement 2). Although non-coding transcription at these enhancers will need to be investigated in detail, this co-occupancy of enhancer elements and RNAPII has also been observed in Nematostella and bilaterians (Schwaiger et al., 2014; Li et al., 2016; Kim et al., 2010; De Santa et al., 2010; Chen et al., 2013), where these elements might be physically interacting with the transcription initiation complex at the TSS of their target gene(s) (Schwaiger et al., 2014).

Unlike bilaterians, where the transcriptional repressor CCCTC-binding factor (CTCF) localizes with cohesin genome-wide and is involved in enhancer-promoter long-range interactions and higher-order chromatin structure (Lee and Iyer, 2012; Seitan et al., 2013; Merkenschlager and Odom, 2013), Amphimedon lacks CTCF (Heger et al., 2012). This likely constrains Amphimedon enhancer interactions with the proximal promoter transcriptional machinery to short distances. Chromatin looping of enhancers to their target promoters in this sponge might therefore occur through a CTCF-independent cohesin binding mechanism, as proposed in cnidarians, which also lack CTCF (Schwaiger et al., 2014). Alternatively, but not exclusively, RNAPII and its associated transcriptional machinery may track through the intervening DNA between enhancers and promoters (Li et al., 2016), which might be the preferred mechanism of enhancer-promoter interaction in this sponge. The co-occupancy of Amphimedon enhancer-like elements and RNAPII is consistent with this mechanism of transcriptional activation. Future studies of the 3D genome architecture will be crucial in elucidating the mechanism of enhancer-promoter interaction in this sponge and other early-branching non-bilaterian animals lacking this architectural protein (Gaiti et al., 2016).

Finally, we find strong evidence for cis-regulatory elements being important for the maintenance of metazoan-specific microsyntenic gene blocks over 700 Myr of evolution. The emergence of distal enhancer regulation prior to metazoan cladogenesis could explain the pervasiveness of conserved syntenic regulatory blocks in animal genomes and the absence of these blocks in their unicellular relatives (Srivastava et al., 2010; Sebé-Pedrós et al., 2016; Irimia et al., 2013, 2012; Bulger and Groudine, 2011; Putnam et al., 2007; Duan et al., 2010). The enrichment of enhancer elements in deeply conserved metazoan-specific microsyntenic units suggests that their genomic location is likely to constrain genome architecture, leading to the occurrence of conserved microsyntenies across the animal kingdom (Irimia et al., 2013, 2012).

In conclusion, a conserved gene regulatory landscape similar to that of morphologically-complex eumetazoans appears to have been already in place at the dawn of animals, and is thus likely to have originated at least 700 Mya. Specifically, there appear to have been fundamental changes in the cis-regulatory architecture of the genome along the metazoan stem, concomitant with the evolution of animal multicellularity, including the apparent origin of distal enhancers and promoter types for cell-type-specificity and developmental regulation. With this in mind, we propose an evolutionary scenario in which quantitative rather than qualitative differences in regulatory mechanisms likely drive the evolution and diversification of eumetazoan body plans (Figure 8).

Figure 8


Origin of animal *cis*-regulatory complexity.

The phylogenetic relationship of representative animal lineages and unicellular holozoans is shown here. Highlighted are the major genomic innovations that correlate with the emergence and early diversification of animals. Some components of the metazoan regulatory landscape may predate the split of the metazoan and holozoan lineages, including core TF-TF regulatory interactions and long intergenic non-coding RNAs, which have been recently identified in unicellular relatives of animals (Sebé-Pedrós et al., 2016; de Mendoza et al., 2015) but for which the evolutionary origin is still unclear. With a complex gene regulatory landscape already in place at the dawn of animals, the expansion of developmental gene families (encoding transcription factors and components of signaling pathways), *cis*-regulatory DNA and non-coding RNAs, along with the emergence of the architectural protein CTCF to allow more complex enhancer-promoter interactions, appear to underlie the evolutionary diversification of the eumetazoan body plans.

https://doi.org/10.7554/eLife.22194.042

Materials and methods

Animal collection


Amphimedon queenslandica adults and larvae were collected from Heron Island Reef, Great Barrier Reef, Queensland, Australia, and reared as previously described (Leys et al., 2008).

Antibodies

We used a mouse monoclonal antibody against the unphosphorylated C-terminal repeat of RNA polymerase II (RRID:AB_492629) (clone 8WG16, #05–952, Merck Millipore, Billerica, MA), a rabbit polyclonal antibody against H3K4me3 (RRID:AB_1977252) (#07–473, Merck Millipore), a rabbit polyclonal antibody against H3K27me3 (RRID:AB_310624) (#07–449, Merck Millipore), a mouse monoclonal antibody against H3K4me1 (RRID:AB_10806625) (#17–676, Merck Millipore), a rabbit polyclonal antibody against H3K27ac (RRID:AB_310550) (#07–360, Merck Millipore), a rabbit monoclonal antibody against H3K36me3 (RRID:AB_10615601) (#17–10032, Merck Millipore), and a rabbit polyclonal antibody against histone H3 (RRID:AB_417398) (#07–690, Merck Millipore) (Figure 1—source data 1). The entire amino acid sequence of histone H3 is perfectly conserved between Amphimedon and other eukaryotes where these antibodies have been used successfully (Sebé-Pedrós et al., 2016; Ercan et al., 2009; Barraza et al., 2015; Harmeyer et al., 2015; Liu et al., 2007; Eckalbar et al., 2016) (Figure 1—figure supplement 1).

Chromatin immunoprecipitation (ChIP) assays


Approximately 1 cm³ of adult sponge tissue was squeezed through a fine cloth and cells (~10⁷) were crosslinked in 2% formaldehyde for 5 min at room temperature (RT). Larvae (~350) were pooled, homogenized and crosslinked as above. The same procedure was then followed for both developmental stages. Specifically, crosslinking was quenched with 125 mM glycine for 5 min at RT. Cells were washed twice in 0.22 µm filtered seawater and centrifuged at 500 g for 5 min. Pelleted cells were lysed in SDS Lysis buffer (10 mM EDTA, 50 mM Tris-HCl at pH 8.0, 1% SDS, plus protease and phosphatase inhibitors), incubated for at least 10 min on ice, and sonicated for 12 min (12 cycles, each one 30 s ‘ON’, 30 s ‘OFF’) in a Bioruptor Sonicator (Diagenode, Seraing, Belgium) to generate 200–300 bp fragments. Optimal sonication conditions were previously determined by testing a range of sonication cycles (from 5 to 30); 12 cycles were deemed optimal. Non-soluble material was removed from the lysate by centrifugation at 12,000 g for 10 min at 4°C. An aliquot of the soluble material was removed for input DNA and stored at −20°C. To reduce the SDS concentration to 0.1%, the remaining soluble material was diluted 10-fold in ChIP dilution buffer (1.2 mM EDTA, 16.7 mM Tris-HCl at pH 8.0, 167 mM NaCl, 1.1% Triton X-100, 0.01% SDS, plus PhosSTOP phosphatase inhibitor and cOmplete protease inhibitor cocktail [Roche, Basel, Switzerland]). To reduce non-specific background, the diluted soluble material was pre-cleared with Dynabeads protein G beads (#10003D, ThermoFisher, Waltham, MA), and, at the same time, the antibodies were linked to Dynabeads protein G beads (#10003D, ThermoFisher) by rotating for one hour at 4°C. At this point, the pre-cleared diluted soluble material was incubated with the antibody-bead mixtures, rotating at 4°C overnight.
Immunoprecipitated material was washed three times with Low Salt Wash Buffer (2 mM EDTA, 20 mM Tris-HCl at pH 8.0, 150 mM NaCl, 1% Triton X-100, 0.1% SDS), three times with High Salt Wash Buffer (2 mM EDTA, 20 mM Tris-HCl at pH 8.0, 500 mM NaCl, 1% Triton X-100, 0.1% SDS), three times with LiCl Wash Buffer (1 mM EDTA, 1 mM Tris-HCl at pH 8.0, 1% DOC, 1% NP-40, 250 mM LiCl), and three times with TE buffer (10 mM Tris·Cl, pH 8.0; 1 mM EDTA). DNA complexes were eluted 30 min at 65°C with TE-SDS (10 mM Tris·Cl, pH 8.0; 1 mM EDTA; 1% SDS) and decrosslinked overnight at 65°C, along with input DNA, with the addition of 125 mM NaCl. Decrosslinked DNA complexes and input DNA were treated with RNaseA, and subsequently with proteinase K. Finally, immunoprecipitated and input DNA were purified with phenol:chloroform:isoamyl extraction (25:24:1), recovered by precipitation with ethanol in the presence of 300 mM NaOAc pH 5.2 and 2 µl of glycogen carrier (10 mg/ml), and resuspended in UltraPure DNase/RNase-Free Distilled Water (ThermoFisher) for later use. Libraries of immunoprecipitated DNA and input DNA were prepared using the NEBNext ChIP-seq Library Prep Master Mix Set for Illumina (#E6240, New England Biolabs, Ipswich, MA) according to the manufacturer’s protocol. The quality and profile of the libraries was analyzed using Agilent High Sensitivity DNA Kit (#5067–4626, Agilent, Santa Clara, CA) and quantified using KAPA Library Quantification Kit (#KK4824, Kapa Biosystems, Wilmington, MA). Deep sequencing (100 bp paired-end) of the adult libraries – two biological replicates for H3K4me3, H3K4me1, H3K36me3, H3K27me3, RNAPII, input DNA and no biological replicates for H3K27ac and total histone H3 – was performed by the Macrogen Oceania NGS Unit on Illumina HiSeq 2000 instrument (Illumina, San Diego, CA, United States). 
Deep sequencing (40 bp paired-end) of the larva libraries – no biological replicates for H3K4me3, H3K4me1, H3K27me3, H3K27ac, RNAPII, input DNA – was performed by the Central Analytical Research Facility (CARF), Brisbane, Queensland, Australia, on an Illumina NextSeq 500 instrument (Illumina, San Diego, CA, United States).

ChIP-seq data analyses

Adult raw Illumina sequencing reads were checked using FastQC v0.52 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and quality filtered using Trimmomatic v1.0.0 (SLIDINGWINDOW: 4:15, LEADING: 3, TRAILING: 3, HEADCROP: 5, MINLEN: 50) (RRID:SCR_011848) (Bolger et al., 2014). Quality filtered paired-end Illumina sequencing reads were then aligned to the Amphimedon genome (Srivastava et al., 2010) using Bowtie v1.1.2 (RRID:SCR_005476) (Langmead et al., 2009) with -m 1, -n 2, -X 500, --best parameters (uniquely mapped reads and maximum of two mismatches within the seed). Non-aligned reads were removed using SAMtools v0.1.19 (RRID:SCR_002105) (Li et al., 2009). For all the ChIP-seq data sets, strand cross-correlation measures were used to estimate signal-to-noise ratios using SPP v1.11.0 (RRID:SCR_001790). ChIP-seq data sets for each mark were flagged if the scores were below a normalized strand cross-correlation coefficient (NSC) threshold of 1.05, as described in the modENCODE and ENCODE guidelines (ENCODE Project Consortium, 2012; Landt et al., 2012; Kellis et al., 2014; Kharchenko et al., 2008). These analyses were performed on Galaxy-qld server (http://galaxy-qld.genome.edu.au/galaxy) developed within the GVL project (Afgan et al., 2015, 2016) and maintained by the Research Computing Centre, University of Queensland, Australia.

Pearson’s correlation coefficients (Pearson’s r) of genome-wide fold enrichment (FE) signals (see below) were computed for biological replicates and a minimum threshold of 0.5 was required, as per Ho et al. (2014). In addition, to ensure consistency between biological replicates, we further required an Irreproducible Discovery Rate (IDR) of at least 0.5 (see below), as described in the modENCODE and ENCODE guidelines (ENCODE Project Consortium, 2012; Landt et al., 2012; Kellis et al., 2014; Kharchenko et al., 2008). ChIP-seq data sets that met these criteria were then merged across biological replicates.
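The replicate-concordance check can be illustrated with a minimal sketch, assuming the genome-wide FE signal of each replicate has already been reduced to two equal-length lists of per-bin values. The function names and the toy inputs are hypothetical and are not part of the pipeline used in this study; only the 0.5 threshold comes from the text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return sxy / math.sqrt(sxx * syy)

def replicates_concordant(fe_rep1, fe_rep2, min_r=0.5):
    """True if the two fold-enrichment tracks reach the minimum Pearson's r."""
    return pearson_r(fe_rep1, fe_rep2) >= min_r
```

In practice the per-bin FE values would be read from the signal tracks of the two replicates; the bin size is a free choice that this sketch does not fix.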

Histone PTM regions of enrichment relative to corresponding sequenced input DNA controls were determined using MACS2 v2.1.0 (RRID:SCR_013291) (Zhang et al., 2008) according to modENCODE, ENCODE and Roadmap Epigenomics consortiums guidelines (Kundaje et al., 2015; ENCODE Project Consortium, 2012; Landt et al., 2012; Kellis et al., 2014; Kharchenko et al., 2008). Specifically, MACS2 was used in broadpeak mode with a broadpeak P-value threshold of 0.1 and a narrowpeak threshold of 0.01 (-p 0.01, --broad, --nomodel, --extsize 146, -g 1.45e8). Enriched regions were scored on individual replicates (R1 and R2), pooled data (reads pooled across biological replicates) (P) and on subsampled pseudoreplicates (obtained by pooling reads from biological replicates and randomly subsampling, without replacement, two pseudoreplicates with half the total number of pooled reads) (PR1 and PR2). For each histone PTM, we defined ‘R’ as the set of peaks in P that overlap peaks in R1 and R2, and ‘PR’ as the set of peaks in P that overlap peaks in PR1 and PR2. Next, we defined ‘M’ as the set of peaks that match exactly in R and PR, and ‘T’ as the set of peaks that match exactly in R and PR as well as those that are unique to R or unique to PR. For a statement about reproducibility we required the M-to-T ratio to be at least 0.5 (Figure 1—source data 4). To obtain reliable regions of enrichment, we restricted all further analyses to enriched regions identified using pooled data that were also independently identified in both replicates and pseudoreplicates (the ‘M’ set). These regions of enrichment can be interpreted as high confidence regions, representing reproducible events across true biological replicates. For H3K27ac, for which no replication was available, we used the P-value column to rank peaks and only retained peaks with a p-value<0.001. We used the gappedPeak representation for the histone PTMs with relatively compact enrichment patterns, including H3K4me3, H3K27ac and H3K4me1. 
The gapped peaks are broad domains (passing P-value 0.1) that contain at least one narrow peak passing a P-value of 0.01. For the diffused histone PTMs – H3K36me3 and H3K27me3 – we used the broadPeak representation. RNAPII peaks were detected using the peakzilla software (RRID:SCR_007471) (Bardet et al., 2013), using input DNA reads as control (-c 1.5, -s 3). The fraction of reads falling within peak regions (FRiP) was also calculated (see Figure 1—source data 4). In line with ENCODE guidelines (ENCODE Project Consortium, 2012; Landt et al., 2012; Kellis et al., 2014; Kharchenko et al., 2008), all our data sets have a FRiP enrichment of 1% or more.
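The reproducibility criterion based on the ‘R’, ‘PR’, ‘M’ and ‘T’ peak sets can be sketched as follows. This is an illustrative reading of the definitions above, not the code used in the study: peaks are assumed to be (scaffold, start, end) tuples with half-open coordinates, any overlap of at least 1 bp counts as a match, and all function names are hypothetical.

```python
def overlaps(a, b):
    """True if two (scaffold, start, end) half-open intervals share >= 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def supported(pooled, set1, set2):
    """Pooled peaks that overlap at least one peak in each of two peak sets."""
    return {
        p for p in pooled
        if any(overlaps(p, q) for q in set1)
        and any(overlaps(p, q) for q in set2)
    }

def m_to_t(pooled, r1, r2, pr1, pr2):
    """Return the 'M' peak set and the M-to-T ratio."""
    r = supported(pooled, r1, r2)      # 'R': supported by both true replicates
    pr = supported(pooled, pr1, pr2)   # 'PR': supported by both pseudoreplicates
    m = r & pr                         # 'M': pooled peaks present in R and PR
    t = r | pr                         # 'T': M plus peaks unique to R or to PR
    ratio = len(m) / len(t) if t else 0.0
    return m, ratio

def passes_reproducibility(ratio, min_ratio=0.5):
    """Apply the M-to-T threshold stated in the text."""
    return ratio >= min_ratio
```

Because ‘R’ and ‘PR’ are both subsets of the pooled peak set, exact matching reduces to a set intersection, and the M-to-T ratio is the Jaccard index of the two sets.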

For every pair of aligned ChIP and matching input DNA data sets, we also used MACS2 (Zhang et al., 2008) to generate genome-wide signal coverage tracks for every position in the Amphimedon genome (Srivastava et al., 2010). Input DNA was used as a control for signal normalization for the histone ChIP-seq coverage. The three types of signal score statistics computed per base are as follows: (i) fold-enrichment ratio of ChIP-seq counts relative to expected background counts λ_local (FE); (ii) negative log10 of the Poisson P-value of ChIP-seq counts relative to expected background counts λ_local (ppois); and (iii) subtraction of noise from treatment sample (subtract).
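The first two per-base statistics can be illustrated with a small worked sketch. It is not a reimplementation of MACS2: the Poisson P-value is taken here as the upper tail P(X ≥ k), which is an assumption about the convention, no pseudocount is applied, and the function names are hypothetical.

```python
import math

def fold_enrichment(chip_count, lam_local):
    """FE: ChIP signal divided by the expected local background."""
    if lam_local <= 0:
        raise ValueError("expected background must be positive")
    return chip_count / lam_local

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summed term by term from k upward."""
    if lam <= 0:
        raise ValueError("expected background must be positive")
    if k <= 0:
        return 1.0
    # pmf(k), computed in log space to delay underflow
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while term > total * 1e-17:   # stop once further terms are negligible
        total += term
        i += 1
        term *= lam / i           # pmf(i) = pmf(i - 1) * lam / i
    return min(total, 1.0)

def neg_log10_ppois(k, lam):
    """ppois-style score: -log10 of the upper-tail Poisson probability."""
    p = poisson_upper_tail(k, lam)
    return math.inf if p == 0.0 else -math.log10(p)
```

For example, with an expected background of 2 reads, an observed count of 20 gives a much higher score than a count of 5, which is the behaviour the ppois track is meant to capture.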

Larva ChIP-seq data sets were analysed as described above, with the following minor modifications. Adapter contamination prior to read quality filtering was removed using Cutadapt (RRID:SCR_011841) (Martin, 2011). Reads were then quality filtered using Trimmomatic v1.0.0 (SLIDINGWINDOW: 4:15, LEADING: 3, TRAILING: 3, HEADCROP: 3, MINLEN: 20) (RRID:SCR_011848) (Bolger et al., 2014). Histone PTM and RNAPII regions of enrichment relative to sequenced input DNA controls were determined using MACS2 v2.1.0 (RRID:SCR_013291) (Zhang et al., 2008) in broadpeak mode with a broadpeak q-value threshold of 0.1 and a narrowpeak threshold of 0.05 (-q 0.05, --broad, --nomodel, --extsize 146, -g 1.45e8).

In both stages, chromatin states across the genome were defined using ChromHMM v1.10 (Ernst and Kellis, 2012), which is based on a multivariate Hidden Markov Model, using default parameters. For each ChIP-seq data set, read counts were computed in non-overlapping 200 bp bins across the Amphimedon genome (Srivastava et al., 2010). Each bin was discretised into two levels, 1 indicating enrichment and 0 indicating no enrichment. The binarization was performed by comparing ChIP-seq read counts to corresponding input DNA control read counts within each bin and using a Poisson P-value threshold of 1e-4 (the default discretization threshold in ChromHMM). We trained several models in parallel mode with the number of states ranging from 5 to 15 and chose a 9-state model as the one that best captures the key interactions between the chromatin marks and covers all the genomic locations (promoter, enhancer, gene body) that we expected to resolve given the selection of histone PTMs we used (H3K4me3, H3K27ac, H3K36me3, H3K4me1, H3K27me3 in adult; and H3K4me3, H3K27ac, H3K4me1, H3K27me3 in larva). To assign biologically meaningful mnemonics to the nine states, ChromHMM was used to compute the overlap and neighborhood enrichments of each state relative to various types of functional annotations (Figure 1B; Figure 1—figure supplement 2; Figure 1—figure supplement 3; Figure 1—figure supplement 4). State enrichment in different genomic features was calculated by dividing the percentage of nucleotides occupied by a state in a particular genomic feature by the percentage of nucleotides that this genomic feature represents in the entire genome. For the overlap enrichment plots in the figures, the enrichments for each genomic feature (column) across all states are normalized by subtracting the minimum value from the column and then dividing by the maximum of the column, so that the values always range from 0 (white) to 1 (dark blue) (i.e., a column-wise relative scale). 
For the neighborhood positional enrichment plots, the normalization is done across all columns (i.e., the minimum value over the entire matrix is subtracted from each value and divided by the maximum over the entire matrix). The functional annotations used were as follows: (1) CpG islands obtained using Hidden Markov Models as described in Wu et al. (2010). (2) Exons, genes, introns, transcription start sites (TSSs) and transcription end sites (TESs), 200 bp windows around TSSs and 200 bp windows around TESs based on Aqu2.1 gene model annotations (Fernandez-Valverde et al., 2015). (3) Expressed and repressed genes, their TSSs and TESs. Genes were classified into expressed (CEL-seq normalized counts > 0.5) and repressed (CEL-seq normalized counts < 0.5) classes based on their CEL-seq expression levels in the relevant stage (larva or adult) (Levin et al., 2016; Hashimshony et al., 2012; Anavy et al., 2014).
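The two heat-map normalizations described for the overlap and neighborhood enrichment plots can be sketched as below. This is a minimal illustration rather than ChromHMM’s own code: the enrichment matrix is assumed to be a list of rows (states) by columns (features or positions), ‘dividing by the maximum’ is interpreted as dividing by the maximum after the minimum has been subtracted (so that values span 0 to 1, as stated), and the function names are hypothetical.

```python
def normalize_columns(matrix):
    """Column-wise scaling: subtract each column's minimum, then divide by
    the maximum of the shifted column, giving values between 0 and 1."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        column = [matrix[i][j] for i in range(n_rows)]
        low = min(column)
        span = max(column) - low
        for i in range(n_rows):
            # a constant column carries no contrast; map it to 0
            out[i][j] = (column[i] - low) / span if span else 0.0
    return out

def normalize_matrix(matrix):
    """Whole-matrix scaling: one minimum and one maximum for all cells."""
    flat = [value for row in matrix for value in row]
    low = min(flat)
    span = max(flat) - low
    return [[(value - low) / span if span else 0.0 for value in row]
            for row in matrix]
```

The column-wise version makes states comparable within a feature, whereas the whole-matrix version preserves differences between columns, which suits the positional plots around TSSs and TESs.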

Regions of enrichment of the various histone H3 PTMs and RNAPII were overlapped with protein-coding genes, and the Bioconductor R package GeneOverlap v1.14.0 (https://www.bioconductor.org/packages/release/bioc/html/GeneOverlap.html) (R Core Team, 2014) was used to test and visualise their association with lists of genes grouped by expression level (Figure 2B; Figure 2—figure supplement 2B). Protein-coding genes were classified into ‘high’, ‘mid’, ‘low’ and ‘non-expressed’ based on their CEL-seq expression levels in the relevant stage (larva or adult) (Levin et al., 2016; Hashimshony et al., 2012; Anavy et al., 2014). Expressed genes were liberally defined as genes that had CEL-seq read counts > 0 in the relevant stage. Specifically, to define ‘high’, ‘mid’ and ‘low’ expressed genes, all protein-coding genes expressed in the relevant stage were sorted based on CEL-seq data values and separated into three bins of an equal number of genes, similar to previous analyses (Schwaiger et al., 2014).
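The expression classification can be sketched as follows, assuming a dictionary of per-gene CEL-seq counts for one stage. The function name and the tie-handling (ties keep their input order) are assumptions made for illustration; when the number of expressed genes is not divisible by three, the bins differ in size by at most one gene.

```python
def classify_expression(counts):
    """Map each gene to 'non-expressed', 'low', 'mid' or 'high'.

    counts: dict of gene id -> CEL-seq count in the relevant stage.
    Genes with a count > 0 are sorted by count and split into three
    bins of (near-)equal size.
    """
    labels = {gene: "non-expressed" for gene, c in counts.items() if c <= 0}
    expressed = sorted((g for g, c in counts.items() if c > 0),
                       key=lambda g: counts[g])
    n = len(expressed)
    for rank, gene in enumerate(expressed):
        tertile = rank * 3 // n          # 0, 1 or 2
        labels[gene] = ("low", "mid", "high")[tertile]
    return labels
```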

Enhancer elements were predicted as reliable H3K4me1 regions of enrichment that did not overlap TSSs (no intersection with 200 bp upstream or 200 bp downstream of the TSSs of protein-coding genes and lncRNAs), but overlapped with regions designated as being in an enhancer chromatin state (‘TxEnhA’ or ‘EnhWk’ or ‘EnhP’ state in adult; ‘TxEnhA1’ or ‘TxEnhA2’ or ‘EnhWk’ or ‘EnhP’ state in larva) based on the ChromHMM analysis. Activated enhancer elements were predicted by intersecting enhancer elements with significant H3K27ac peaks, requiring a minimal overlap fraction of 50%. BEDTools v2.23.0 (RRID:SCR_006646) (Quinlan and Hall, 2010) was used to calculate overlaps between regions of enrichment and chromatin states with the different genomic features, as well as to identify the nearest TSS for each of the activated enhancer elements.
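The enhancer prediction logic can be illustrated with a plain-Python sketch that mirrors the interval operations, without calling BEDTools. Intervals are assumed to be (scaffold, start, end) tuples with half-open coordinates and TSSs are (scaffold, position) pairs. The 50% overlap is measured here as the fraction of the enhancer element covered by a single H3K27ac peak, which is an assumption about how the overlap fraction was defined; all names are hypothetical.

```python
def overlap_bp(a, b):
    """Number of shared base pairs between two (scaffold, start, end) intervals."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))

def predict_enhancers(h3k4me1_peaks, tss_list, enhancer_state_regions, flank=200):
    """H3K4me1 peaks that avoid TSS +/- flank and overlap an enhancer state."""
    tss_windows = [(scaf, max(0, pos - flank), pos + flank)
                   for scaf, pos in tss_list]
    enhancers = []
    for peak in h3k4me1_peaks:
        near_tss = any(overlap_bp(peak, w) > 0 for w in tss_windows)
        in_state = any(overlap_bp(peak, s) > 0 for s in enhancer_state_regions)
        if in_state and not near_tss:
            enhancers.append(peak)
    return enhancers

def activated_enhancers(enhancers, h3k27ac_peaks, min_fraction=0.5):
    """Enhancers with at least min_fraction of their length under one H3K27ac peak."""
    active = []
    for enh in enhancers:
        length = enh[2] - enh[1]
        if any(overlap_bp(enh, p) >= min_fraction * length for p in h3k27ac_peaks):
            active.append(enh)
    return active
```

The nested loops are quadratic and only suitable for toy inputs; for genome-scale data a sorted sweep or an interval tree, as used by dedicated tools, would be the appropriate choice.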

De novo motif enrichment analyses were performed using MEME-ChIP against JASPAR CORE and UniPROBE Mouse databases (-meme-minw 6, -meme-maxw 15, -meme-nmotifs 20, -dreme-e 0.05, -meme-mod zoops) (RRID:SCR_001783) (Machanick and Bailey, 2011). Each motif was renamed according to its most similar motif in the TOMTOM database or literature, if any.

Gene Ontology (GO) functional enrichment analyses were performed using the Cytoscape plugin BiNGO (RRID:SCR_005736) (Maere et al., 2005; Shannon et al., 2003) with custom annotation and a FDR adjusted P-value cut-off of 0.01. All Amphimedon predicted peptides (Fernandez-Valverde et al., 2015) were annotated using BLASTp (RRID:SCR_001010) (Altschul et al., 1990) (E-value of 0.001) against the non-redundant (nr) NCBI protein database. All proteins were also searched for protein motifs and signal peptides using InterProScan 5 (Jones et al., 2014) with default parameters. KEGG pathway annotations were obtained on the webserver BlastKOALA for the taxonomic group ‘Animals’ against the ‘family_eukaryotes + genus_prokaryotes’ database file, using default settings. Pathway analyses were performed with the BlastKOALA annotation files using the KEGG Mapper – Reconstruct pathway tool (Kanehisa et al., 2016).

Transcription Start Site (TSS) input DNA-normalised coverage profiles and heatmaps were calculated using ngs.plot v2.61 (RRID:SCR_011795) (Shen et al., 2014) and deepTools v2.4.1 (Ramírez et al., 2016). As above, protein-coding genes were classified into ‘high’, ‘mid’, ‘low’ and ‘non-expressed’ based on their CEL-seq expression levels in the relevant stage (larva or adult) (Levin et al., 2016; Hashimshony et al., 2012; Anavy et al., 2014). Expressed genes were liberally defined as genes that had CEL-seq read counts > 0 in the relevant stage. Specifically, to define ‘high’, ‘mid’ and ‘low’ expressed genes, protein-coding genes expressed in the relevant stage were sorted based on CEL-seq data values and separated into three bins of an equal number of genes, similar to previous analyses (Schwaiger et al., 2014).

Only lincRNAs found in scaffolds larger than 10 kb were used for all the analyses and, given the compact genome of Amphimedon (Fernandez-Valverde and Degnan, 2016), all the TSS analyses were restricted to non-overlapping protein-coding genes with an intergenic distance >1 kb that were found in scaffolds larger than 10 kb.

All genome browser figures were generated using a local instance of the UCSC genome browser (RRID:SCR_005780) (Kuhn et al., 2013).

ChIP-quantitative PCRs (ChIP-qPCRs)

ChIP-quantitative PCRs (ChIP-qPCRs) were performed using the LightCycler 480 platform (Roche, Basel, Switzerland). ChIP (H3K4me1, H3K27ac, H3K4me3, H3K27me3) and Input DNA libraries were diluted in water, combined with LightCycler 480 SYBR green I master mix (Roche, Basel, Switzerland) and 0.2 µM primers, then cycled with the following profile: 95°C for 10 min, 40 cycles of 95°C for 10 s, 60°C for 10 s, 72°C for 20 s. Primer sequences are available in Supplementary file 1.

Quantification cycle (Cq) values were obtained from the manufacturer’s software (version 1.5.1.6.1 SP2) using High Confidence settings. A melt curve and no-template controls (NTC) were also run to ensure that single amplicons were responsible for the fluorescent signal. The numerical value 3.32 (log₂10, representing 10% of input chromatin) was subtracted from the Cq value of the input sample to generate the adjusted input Cq. Two different intergenic regions not bound by our histone PTMs of interest were used as negative controls. Double delta (dd) Cq analysis was computed (see Figure 1—source data 5).

Specifically, the following formulas were used to calculate fold increase in signal over background:

dCq_IP = Cq_IP - Cq_intergenic

dCq_Input = Cq_Input - Cq_intergenic

ddCq = dCq_IP - dCq_Input

Fold change = 2^(-ddCq)
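As a worked illustration of these formulas, the sketch below computes the adjusted input Cq and the fold change for one target region. It assumes that the intergenic Cq in each formula is measured in the matching sample (IP or input); this reading, the function names and the example values are hypothetical and are not taken from the source data.

```python
import math

def adjusted_input_cq(cq_input, input_fraction=0.10):
    """Subtract log2(1 / input_fraction) from the input Cq (3.32 for 10% input)."""
    return cq_input - math.log2(1.0 / input_fraction)

def fold_change(cq_ip, cq_intergenic_ip, cq_input, cq_intergenic_input):
    """Fold increase in signal over background, following the ddCq formulas."""
    d_cq_ip = cq_ip - cq_intergenic_ip              # dCq_IP
    d_cq_input = cq_input - cq_intergenic_input     # dCq_Input
    dd_cq = d_cq_ip - d_cq_input                    # ddCq
    return 2.0 ** (-dd_cq)
```

For instance, a target that amplifies three cycles earlier than the intergenic control in the IP, with no difference in the input, gives ddCq = -3 and a fold change of 8.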

High- and low-variance genes in Amphimedon

CEL-seq raw reads were processed and mapped back to the Amphimedon genome using Bowtie (RRID:SCR_005476) (Langmead et al., 2009). We then compressed the 82 Amphimedon developmental samples, from early cleavage to adult, into 17 stages by averaging the biological replicates within each developmental stage. Larval stages were combined into two groups (Larvae 0–7 hr and Larvae 6–50 hr), as these developmental time points have only one replicate each. To reduce noise, the protein-coding genes and long non-coding RNAs with an overall expression of less than 100 CEL-seq raw counts throughout the whole developmental time course were discarded. The CEL-seq raw gene counts were then normalized using the variance stabilizing transformation in DESeq2 v1.6.3 (RRID:SCR_000154) (Love et al., 2014) and the 15,000 most variable genes (14,698 protein coding genes + 301 lncRNAs) were extracted using median absolute deviation. The 14,698 protein-coding genes were then filtered to retain only non-overlapping protein-coding genes with detectable expression at adult stage (CEL-seq normalized counts > 0) with an intergenic distance >1 kb that were found in scaffolds larger than 10 kb. This resulted in a total of 3,200 ‘high-variance’ genes. The remaining expressed (CEL-seq normalized counts > 0 in adult) non-overlapping protein-coding genes with an intergenic distance >1 kb that were found in scaffolds larger than 10 kb were considered ‘low-variance’ genes (n = 3,999). To define low, medium and high expression, the 3,200 high-variance genes and 3,999 low-variance genes were sorted based on CEL-seq data values and separated into three bins of an equal number of genes.
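The selection of the most variable genes by median absolute deviation (MAD) can be sketched as follows, assuming a dictionary that maps each gene to its variance-stabilized expression values across the 17 stages. The function names and the tie-breaking rule (alphabetical by gene id) are hypothetical choices for this illustration, and the raw MAD is used without any scaling constant.

```python
from statistics import median

def mad(values):
    """Median absolute deviation of a sequence of numbers."""
    centre = median(values)
    return median(abs(v - centre) for v in values)

def most_variable(expr, n):
    """Return the n gene ids with the largest MAD across stages.

    expr: dict of gene id -> list of expression values (one per stage).
    """
    ranked = sorted(expr, key=lambda gene: (-mad(expr[gene]), gene))
    return ranked[:n]
```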

Regulated and stable genes in Amphimedon

CEL-seq raw reads were processed and mapped back to the Amphimedon genome using Bowtie (RRID:SCR_005476) (Langmead et al., 2009). Read counts were normalized by dividing by the total number of counted reads and multiplying by 10⁶. We then compressed the 82 Amphimedon developmental samples, from early cleavage to adult, into 17 stages by averaging the biological replicates within each developmental stage. Larval stages were combined into two groups (Larvae 0–7 hr and Larvae 6–50 hr), as these developmental time points have only one replicate each. To reduce noise, only the protein-coding genes with an expression of at least four CEL-seq normalised counts in at least two developmental time points were retained. To define the transcriptional stability of protein-coding genes, the coefficient of variation of gene expression was calculated for each protein-coding gene (n = 15,146), as reported by Pérez-Lluch et al. (2015a). For the TSS input DNA-normalised coverage plots, these 15,146 protein-coding genes were then filtered to retain only expressed (CEL-seq normalized counts > 0 in the relevant stage [larva or adult]) non-overlapping protein-coding genes with an intergenic distance >1 kb that were found in scaffolds larger than 10 kb. Finally, from the full ranking of these expressed protein-coding genes, we defined the 1,000 genes with the lowest variation in expression during development as ‘stable’ genes and the 1,000 genes with the highest variation in expression as strongly developmentally ‘regulated’ genes.
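The normalization, noise filter and stability ranking described above can be sketched as follows. This is an illustration under stated assumptions rather than the original analysis code: the coefficient of variation is computed with the population standard deviation (the text does not say which estimator was used), and all function names are hypothetical.

```python
from statistics import mean, pstdev

def counts_per_million(counts):
    """Divide each count by the total number of counted reads, times 10^6."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}

def passes_filter(values, min_count=4, min_points=2):
    """At least min_count normalised counts in at least min_points time points."""
    return sum(1 for v in values if v >= min_count) >= min_points

def coefficient_of_variation(values):
    """Standard deviation divided by the mean (mean must be non-zero)."""
    return pstdev(values) / mean(values)

def stable_and_regulated(expr, n):
    """Return (n lowest-CV genes, n highest-CV genes) from expr.

    expr: dict of gene id -> list of normalised counts across stages.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    ranked = sorted(expr, key=lambda g: coefficient_of_variation(expr[g]))
    return ranked[:n], ranked[-n:]
```

In the study n was 1,000 for both the ‘stable’ and the ‘regulated’ sets; the toy value used when trying the sketch can be much smaller.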

Regulated and stable genes in Nematostella vectensis

Available ChIP-seq data sets on adult female polyps for H3K4me3 and corresponding input DNA controls were used (Schwaiger et al., 2014). Aligned ChIP and matching input DNA data sets and developmentally stable and regulated genes were generated using the same procedures as in the sponge (see above). To obtain gene and transcript quantifications, we mapped available RNA-seq data sets (Helm et al., 2013) to NveGenes2.0 gene models (http://www.cnidariangenomes.org/) using kallisto (Bray et al., 2016).

Ortholog identification and phylogeny

Orthologs of Drosophila PcG components and associated factors were identified using BLASTp (RRID:SCR_001010) (Altschul et al., 1990) searches against the predicted proteomes of the selected species (Figure 4—source data 1) with a threshold E-value of 0.001 and taking a maximum of 5 hits per species. All the obtained protein hits were aligned using MAFFT in L-INS-i mode (RRID:SCR_011811) (Katoh and Standley, 2013). The alignments were automatically trimmed with trimAl v1.4 (Capella-Gutiérrez et al., 2009) in -automated1 mode. The resulting trimmed alignments were then used for phylogenetic inference using FastTree2 (Price et al., 2010) with -wag -cat 8 -gamma parameters. The phylogenetic trees were inspected manually to determine which BLASTp hits formed monophyletic clades with the Drosophila query sequences. The same methodology was used to identify the conserved ancestral microsyntenic pairs taken from Irimia et al. (2012), but using Homo sapiens sequences as query proteins. The phylogeny-validated Amphimedon ortholog pairs were manually checked for contiguity in the genome, and those found on different scaffolds or with more than two intervening genes were removed from the subsequent analyses.
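The contiguity criterion for microsyntenic pairs was checked manually in this study; an automated equivalent could look like the sketch below. Genes are assumed to be given as (gene id, scaffold, start) tuples, gene order is derived from the start coordinate, and all names are hypothetical.

```python
def build_gene_index(genes):
    """Map gene id -> (scaffold, rank), ranking genes by start along each scaffold.

    genes: iterable of (gene_id, scaffold, start) tuples.
    """
    by_scaffold = {}
    for gene_id, scaffold, start in genes:
        by_scaffold.setdefault(scaffold, []).append((start, gene_id))
    index = {}
    for scaffold, items in by_scaffold.items():
        for rank, (_, gene_id) in enumerate(sorted(items)):
            index[gene_id] = (scaffold, rank)
    return index

def is_contiguous(pair, index, max_intervening=2):
    """True if both genes lie on one scaffold with at most
    max_intervening genes between them."""
    gene_a, gene_b = pair
    if gene_a not in index or gene_b not in index:
        return False
    (scaf_a, rank_a), (scaf_b, rank_b) = index[gene_a], index[gene_b]
    return scaf_a == scaf_b and abs(rank_a - rank_b) - 1 <= max_intervening
```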

Data access

Amphimedon ChIP-seq data sets have been deposited to the NCBI Gene Expression Omnibus (GEO) (RRID:SCR_007303) (Edgar et al., 2002) under accession number GSE79645. Amphimedon genome assembly ampQue1 was used throughout the study. CEL-seq data sets can be obtained from NCBI GEO (GSE54364) (Anavy et al., 2014). Amphimedon RNA-seq data sets can be downloaded at NCBI's SRA (RRID:SCR_004891) with accession SRP044247 (Fernandez-Valverde et al., 2015). Nematostella vectensis RNA-seq data sets can be downloaded at NCBI's SRA with accession SRP018739 (Helm et al., 2013). N. vectensis ChIP-seq data sets can be obtained from NCBI GEO (GSE46488) (Schwaiger et al., 2014). We used the following gene model data sets for all analyses. A. queenslandica: Aqu2.1 models (http://amphimedon.qcloud.qcif.edu.au/) (last accessed February 25, 2017) (Fernandez-Valverde et al., 2015), lncRNAs (http://amphimedon.qcloud.qcif.edu.au/lncRNAs/) (last accessed February 25, 2017) (Gaiti et al., 2015); N. vectensis: NveGenes2.0 models (http://www.cnidariangenomes.org/) (last accessed February 25, 2017).

Data availability

The following data sets were generated

(2017) Landscape of histone modifications in a sponge reveals the origin of animal cis-regulatory complexity
Publicly available at the NCBI Gene Expression Omnibus (accession no: GSE79645).

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE79645

The following previously published data sets were used

(2014) A high-resolution Amphimedon queenslandica transcriptomic timecourse
Publicly available at NCBI Gene Expression Omnibus (accession no: GSE54364).

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE54364
(2015) Amphimedon queenslandica deep developmental transcriptomes
Publicly available at NCBI Sequence Read Archive (accession no: SRP044247).

https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?study=SRP044247
1. Helm RR
2. Siebert S
3. Tulin S
4. Smith J
5. Dunn CW
(2013) Characterization of differential transcript abundance through time during Nematostella vectensis development
Publicly available at NCBI Sequence Read Archive (accession no: SRP018739).

http://sra.dnanexus.com/studies/SRP018739/experiments
1. Schwaiger M
2. Schönauer A
3. Rendeiro AF
4. Pribitzer C
5. Schauer A
6. Gilles A
7. Schinko J
8. Renfer E
9. Fredman D
10. Technau U
(2013) Evolutionary conservation of the eumetazoan gene regulatory landscape
Publicly available at NCBI Gene Expression Omnibus (accession no: GSE46488).

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE46488

References

1. Abbasi AA
2. Paparidis Z
3. Malik S
4. Goode DK
5. Callaway H
6. Elgar G
7. Grzeschik KH
(2007) Human GLI3 intragenic conserved non-coding sequences are tissue-specific enhancers
PLoS One 2:e366.

https://doi.org/10.1371/journal.pone.0000366
1. Adamska M
2. Degnan SM
3. Green KM
4. Adamski M
5. Craigie A
6. Larroux C
7. Degnan BM
(2007) Wnt and TGF-beta expression in the sponge Amphimedon queenslandica and the origin of metazoan embryonic patterning
PLoS One 2:e1031.

https://doi.org/10.1371/journal.pone.0001031
1. Afgan E
2. Baker D
3. van den Beek M
4. Blankenberg D
5. Bouvier D
6. Čech M
7. Chilton J
8. Clements D
9. Coraor N
10. Eberhard C
11. Grüning B
12. Guerler A
13. Hillman-Jackson J
14. Von Kuster G
15. Rasche E
16. Soranzo N
17. Turaga N
18. Taylor J
19. Nekrutenko A
20. Goecks J
(2016) The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update
Nucleic Acids Research 44:W3–W10.

https://doi.org/10.1093/nar/gkw343
1. Afgan E
2. Sloggett C
3. Goonasekera N
4. Makunin I
5. Benson D
6. Crowe M
7. Gladman S
8. Kowsar Y
9. Pheasant M
10. Horst R
11. Lonie A
(2015) Genomics virtual laboratory: a practical bioinformatics workbench for the cloud
Plos One 10:e0140829.

https://doi.org/10.1371/journal.pone.0140829
1. Altschul SF
2. Gish W
3. Miller W
4. Myers EW
5. Lipman DJ
(1990) Basic local alignment search tool
Journal of Molecular Biology 215:403–410.

https://doi.org/10.1016/S0022-2836(05)80360-2
(2014) BLIND ordering of large-scale transcriptomic developmental timecourses
Development 141:1161–1166.

https://doi.org/10.1242/dev.105288
1. Arnold CD
2. Gerlach D
3. Stelzer C
4. Boryń ŁM
5. Rath M
6. Stark A
(2013) Genome-wide quantitative enhancer activity maps identified by STARR-seq
Science 339:1074–1077.

https://doi.org/10.1126/science.1232542
1. Bai L
2. Morozov AV
(2010) Gene regulation by nucleosome positioning
Trends in Genetics 26:476–483.

https://doi.org/10.1016/j.tig.2010.08.003
(2013) Identification of transcription factor binding sites from ChIP-seq data at high resolution
Bioinformatics 29:2705–2713.

https://doi.org/10.1093/bioinformatics/btt470
(2015) The Phaseolus vulgaris PvTRX1h gene regulates plant hormone biosynthesis in embryogenic callus from common bean
Frontiers in Plant Science 6:577.

https://doi.org/10.3389/fpls.2015.00577
1. Barski A
2. Cuddapah S
3. Cui K
4. Roh TY
5. Schones DE
6. Wang Z
7. Wei G
8. Chepelev I
9. Zhao K
(2007) High-resolution profiling of histone methylations in the human genome
Cell 129:823–837.

https://doi.org/10.1016/j.cell.2007.05.009
1. Birnbaum RY
2. Clowney EJ
3. Agamy O
4. Kim MJ
5. Zhao J
6. Yamanaka T
7. Pappalardo Z
8. Clarke SL
9. Wenger AM
10. Nguyen L
11. Gurrieri F
12. Everman DB
13. Schwartz CE
14. Birk OS
15. Bejerano G
16. Lomvardas S
17. Ahituv N
(2012) Coding exons function as tissue-specific enhancers of nearby genes
Genome Research 22:1059–1068.

https://doi.org/10.1101/gr.133546.111
(2003) Nucleosomes unfold completely at a transcriptionally active promoter
Molecular Cell 11:1587–1598.

https://doi.org/10.1016/S1097-2765(03)00231-4
(2012) Dynamics of enhancer chromatin signatures mark the transition from pluripotency to cell specification during embryogenesis
Genome Research 22:2043–2053.

https://doi.org/10.1101/gr.134833.111
(2014) Trimmomatic: a flexible trimmer for Illumina sequence data
Bioinformatics 30:2114–2120.

https://doi.org/10.1093/bioinformatics/btu170
1. Boyer LA
2. Plath K
3. Zeitlinger J
4. Brambrink T
5. Medeiros LA
6. Lee TI
7. Levine SS
8. Wernig M
9. Tajonar A
10. Ray MK
11. Bell GW
12. Otte AP
13. Vidal M
14. Gifford DK
15. Young RA
16. Jaenisch R
(2006) Polycomb complexes repress developmental regulators in murine embryonic stem cells
Nature 441:349–353.

https://doi.org/10.1038/nature04733
(2016) Near-optimal probabilistic RNA-seq quantification
Nature Biotechnology 34:525–527.

https://doi.org/10.1038/nbt.3519
(2015) Regulatory RNA at the root of animals: dynamic expression of developmental lincRNAs in the calcisponge Sycon ciliatum
Proceedings. Biological Sciences 282:20151746.

https://doi.org/10.1098/rspb.2015.1746
1. Brookes E
2. Pombo A
(2009) Modifications of RNA polymerase II are pivotal in regulating gene expression states
EMBO Reports 10:1213–1219.

https://doi.org/10.1038/embor.2009.221
1. Brown JL
2. Grau DJ
3. DeVido SK
4. Kassis JA
(2005) An Sp1/KLF binding site is important for the activity of a polycomb group response element from the Drosophila engrailed gene
Nucleic Acids Research 33:5181–5189.

https://doi.org/10.1093/nar/gki827
1. Bulger M
2. Groudine M
(2011) Functional and mechanistic diversity of distal transcription enhancers
Cell 144:327–339.

https://doi.org/10.1016/j.cell.2011.01.024
(2009) trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses
Bioinformatics 25:1972–1973.

https://doi.org/10.1093/bioinformatics/btp348
1. Chen RA
2. Down TA
3. Stempor P
4. Chen QB
5. Egelhofer TA
6. Hillier LW
7. Jeffers TE
8. Ahringer J
(2013) The landscape of RNA polymerase II transcription initiation in C. elegans reveals promoter and enhancer architectures
Genome Research 23:1339–1347.

https://doi.org/10.1101/gr.153668.112
1. Conaco C
2. Neveu P
3. Zhou H
4. Arcila ML
5. Degnan SM
6. Degnan BM
7. Kosik KS
(2012) Transcriptome profiling of the demosponge Amphimedon queenslandica reveals genome-wide events that accompany major life cycle transitions
BMC Genomics 13:209.

https://doi.org/10.1186/1471-2164-13-209
(2013) The Fusarium graminearum histone H3 K27 methyltransferase KMT6 regulates development and expression of secondary metabolite gene clusters
PLoS Genetics 9:e1003916.

https://doi.org/10.1371/journal.pgen.1003916
Book
1. Davidson EH
2. Peter IS
(2015)
Genomic Control Process

Academic Press: Oxford.
(2013) Transcription factor evolution in eukaryotes and the assembly of the regulatory toolkit in multicellular lineages
PNAS 110:E4858–E4866.

https://doi.org/10.1073/pnas.1311818110
(2015) Complex transcriptional regulation and independent evolution of fungal-like traits in a relative of animals
eLife 4:e08904.

https://doi.org/10.7554/eLife.08904
1. De Santa F
2. Barozzi I
3. Mietton F
4. Ghisletti S
5. Polletti S
6. Tusi BK
7. Muller H
8. Ragoussis J
9. Wei CL
10. Natoli G
(2010) A large fraction of extragenic RNA pol II transcription sites overlap enhancers
PLoS Biology 8:e1000384.

https://doi.org/10.1371/journal.pbio.1000384
Book
(2015)
Porifera

In: Wanninger A, editors. Evolutionary Developmental Biology of Invertebrates 1: Introduction, Non-Bilateria, Acoelomorpha, Xenoturbellida, Chaetognatha. Vienna, Austria: Springer. pp. 65–106.
(2009) Early evolution of metazoan transcription factors
Current Opinion in Genetics & Development 19:591–599.

https://doi.org/10.1016/j.gde.2009.09.008
1. Deng W
2. Buzas DM
3. Ying H
4. Robertson M
5. Taylor J
6. Peacock WJ
7. Dennis ES
8. Helliwell C
(2013) Arabidopsis polycomb repressive complex 2 binding sites contain putative GAGA factor binding motifs within coding regions of genes
BMC Genomics 14:593.

https://doi.org/10.1186/1471-2164-14-593
1. Di Croce L
2. Helin K
(2013) Transcriptional regulation by Polycomb group proteins
Nature Structural & Molecular Biology 20:1147–1155.

https://doi.org/10.1038/nsmb.2669
1. Djebali S
2. Davis CA
3. Merkel A
4. Dobin A
5. Lassmann T
6. Mortazavi A
7. Tanzer A
8. Lagarde J
9. Lin W
10. Schlesinger F
11. Xue C
12. Marinov GK
13. Khatun J
14. Williams BA
15. Zaleski C
16. Rozowsky J
17. Röder M
18. Kokocinski F
19. Abdelhamid RF
20. Alioto T
21. Antoshechkin I
22. Baer MT
23. Bar NS
24. Batut P
25. Bell K
26. Bell I
27. Chakrabortty S
28. Chen X
29. Chrast J
30. Curado J
31. Derrien T
32. Drenkow J
33. Dumais E
34. Dumais J
35. Duttagupta R
36. Falconnet E
37. Fastuca M
38. Fejes-Toth K
39. Ferreira P
40. Foissac S
41. Fullwood MJ
42. Gao H
43. Gonzalez D
44. Gordon A
45. Gunawardena H
46. Howald C
47. Jha S
48. Johnson R
49. Kapranov P
50. King B
51. Kingswood C
52. Luo OJ
53. Park E
54. Persaud K
55. Preall JB
56. Ribeca P
57. Risk B
58. Robyr D
59. Sammeth M
60. Schaffer L
61. See LH
62. Shahab A
63. Skancke J
64. Suzuki AM
65. Takahashi H
66. Tilgner H
67. Trout D
68. Walters N
69. Wang H
70. Wrobel J
71. Yu Y
72. Ruan X
73. Hayashizaki Y
74. Harrow J
75. Gerstein M
76. Hubbard T
77. Reymond A
78. Antonarakis SE
79. Hannon G
80. Giddings MC
81. Ruan Y
82. Wold B
83. Carninci P
84. Guigó R
85. Gingeras TR
(2012) Landscape of transcription in human cells
Nature 489:101–108.

https://doi.org/10.1038/nature11233
- PubMed
- Google Scholar
1. Duan Z
2. Andronescu M
3. Schutz K
4. McIlwain S
5. Kim YJ
6. Lee C
7. Shendure J
8. Fields S
9. Blau CA
10. Noble WS
(2010) A three-dimensional model of the yeast genome
Nature 465:363–367.

https://doi.org/10.1038/nature08973
- PubMed
- Google Scholar
1. Eckalbar WL
2. Schlebusch SA
3. Mason MK
4. Gill Z
5. Parker AV
6. Booker BM
7. Nishizaki S
8. Muswamba-Nday C
9. Terhune E
10. Nevonen KA
11. Makki N
12. Friedrich T
13. VanderMeer JE
14. Pollard KS
15. Carbone L
16. Wall JD
17. Illing N
18. Ahituv N
(2016) Transcriptomic and epigenomic characterization of the developing bat wing
Nature Genetics 48:528–536.

https://doi.org/10.1038/ng.3537
- PubMed
- Google Scholar
(2002) Gene Expression Omnibus: NCBI gene expression and hybridization array data repository
Nucleic Acids Research 30:207–210.

https://doi.org/10.1093/nar/30.1.207
- PubMed
- Google Scholar
1. ENCODE Project Consortium
(2012) An integrated encyclopedia of DNA elements in the human genome
Nature 489:57–74.

https://doi.org/10.1038/nature11247
- PubMed
- Google Scholar
(2007) Genomic regulatory blocks underlie extensive microsynteny conservation in insects
Genome Research 17:1898–1908.

https://doi.org/10.1101/gr.6669607
- PubMed
- Google Scholar
1. Ercan S
2. Dick LL
3. Lieb JD
(2009) The C. elegans dosage compensation complex propagates dynamically and independently of X chromosome sequence
Current Biology 19:1777–1787.

https://doi.org/10.1016/j.cub.2009.09.047
- PubMed
- Google Scholar
1. Ernst J
2. Kellis M
(2012) ChromHMM: automating chromatin-state discovery and characterization
Nature Methods 9:215–216.

https://doi.org/10.1038/nmeth.1906
- PubMed
- Google Scholar
(2011) The Cambrian conundrum: early divergence and later ecological success in the early history of animals
Science 334:1091–1097.

https://doi.org/10.1126/science.1206375
- PubMed
- Google Scholar
(2015) Deep developmental transcriptome sequencing uncovers numerous new genes and enhances gene annotation in the sponge Amphimedon queenslandica
BMC Genomics 16:387.

https://doi.org/10.1186/s12864-015-1588-z
- PubMed
- Google Scholar
1. Fernandez-Valverde SL
2. Degnan BM
(2016) Bilaterian-like promoters in the highly compact Amphimedon queenslandica genome
Scientific Reports 6:22496.

https://doi.org/10.1038/srep22496
- PubMed
- Google Scholar
(2015) Comparative analyses of developmental transcription factor repertoires in sponges reveal unexpected complexity of the earliest animals
Marine Genomics 24:121–129.

https://doi.org/10.1016/j.margen.2015.07.008
- PubMed
- Google Scholar
1. Fortunato SA
2. Adamski M
3. Ramos OM
4. Leininger S
5. Liu J
6. Ferrier DE
7. Adamska M
(2014) Calcisponges have a ParaHox gene and dynamic expression of dispersed NK homeobox genes
Nature 514:620–623.

https://doi.org/10.1038/nature13881
- PubMed
- Google Scholar
(2016) Origin and evolution of the metazoan non-coding regulatory genome
Developmental Biology pii: S0012-1606:30573–30575.

https://doi.org/10.1016/j.ydbio.2016.11.013
- PubMed
- Google Scholar
(2015) Dynamic and widespread lncRNA expression in a sponge and the origin of animal complexity
Molecular Biology and Evolution 32:2367–2382.

https://doi.org/10.1093/molbev/msv117
- PubMed
- Google Scholar
(2014) Gene repression. H3K27me and PRC2 transmit a memory of repression across generations and during development
Science 345:1515–1518.

https://doi.org/10.1126/science.1255023
- PubMed
- Google Scholar
1. Gerstein MB
2. Lu ZJ
3. Van Nostrand EL
4. Cheng C
5. Arshinoff BI
6. Liu T
7. Yip KY
8. Robilotto R
9. Rechtsteiner A
10. Ikegami K
11. Alves P
12. Chateigner A
13. Perry M
14. Morris M
15. Auerbach RK
16. Feng X
17. Leng J
18. Vielle A
19. Niu W
20. Rhrissorrakrai K
21. Agarwal A
22. Alexander RP
23. Barber G
24. Brdlik CM
25. Brennan J
26. Brouillet JJ
27. Carr A
28. Cheung MS
29. Clawson H
30. Contrino S
31. Dannenberg LO
32. Dernburg AF
33. Desai A
34. Dick L
35. Dosé AC
36. Du J
37. Egelhofer T
38. Ercan S
39. Euskirchen G
40. Ewing B
41. Feingold EA
42. Gassmann R
43. Good PJ
44. Green P
45. Gullier F
46. Gutwein M
47. Guyer MS
48. Habegger L
49. Han T
50. Henikoff JG
51. Henz SR
52. Hinrichs A
53. Holster H
54. Hyman T
55. Iniguez AL
56. Janette J
57. Jensen M
58. Kato M
59. Kent WJ
60. Kephart E
61. Khivansara V
62. Khurana E
63. Kim JK
64. Kolasinska-Zwierz P
65. Lai EC
66. Latorre I
67. Leahey A
68. Lewis S
69. Lloyd P
70. Lochovsky L
71. Lowdon RF
72. Lubling Y
73. Lyne R
74. MacCoss M
75. Mackowiak SD
76. Mangone M
77. McKay S
78. Mecenas D
79. Merrihew G
80. Miller DM
81. Muroyama A
82. Murray JI
83. Ooi SL
84. Pham H
85. Phippen T
86. Preston EA
87. Rajewsky N
88. Rätsch G
89. Rosenbaum H
90. Rozowsky J
91. Rutherford K
92. Ruzanov P
93. Sarov M
94. Sasidharan R
95. Sboner A
96. Scheid P
97. Segal E
98. Shin H
99. Shou C
100. Slack FJ
101. Slightam C
102. Smith R
103. Spencer WC
104. Stinson EO
105. Taing S
106. Takasaki T
107. Vafeados D
108. Voronina K
109. Wang G
110. Washington NL
111. Whittle CM
112. Wu B
113. Yan KK
114. Zeller G
115. Zha Z
116. Zhong M
117. Zhou X
118. Ahringer J
119. Strome S
120. Gunsalus KC
121. Micklem G
122. Liu XS
123. Reinke V
124. Kim SK
125. Hillier LW
126. Henikoff S
127. Piano F
128. Snyder M
129. Stein L
130. Lieb JD
131. Waterston RH
132. modENCODE Consortium
(2010) Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project
Science 330:1775–1787.

https://doi.org/10.1126/science.1196914
- PubMed
- Google Scholar
1. Grimson A
2. Srivastava M
3. Fahey B
4. Woodcroft BJ
5. Chiang HR
6. King N
7. Degnan BM
8. Rokhsar DS
9. Bartel DP
(2008) Early origins and evolution of microRNAs and Piwi-interacting RNAs in animals
Nature 455:1193–1197.

https://doi.org/10.1038/nature07415
- PubMed
- Google Scholar
1. Ha M
2. Ng DW
3. Li WH
4. Chen ZJ
(2011) Coordinated histone modifications are associated with gene expression variation within and between species
Genome Research 21:590–598.

https://doi.org/10.1101/gr.116467.110
- PubMed
- Google Scholar
1. Harmeyer KM
2. South PF
3. Bishop B
4. Ogas J
5. Briggs SD
(2015) Immediate chromatin immunoprecipitation and on-bead quantitative PCR analysis: a versatile and rapid ChIP procedure
Nucleic Acids Research 43:e38.

https://doi.org/10.1093/nar/gku1347
- PubMed
- Google Scholar
1. Hashimshony T
2. Wagner F
3. Sher N
4. Yanai I
(2012) CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification
Cell Reports 2:666–673.

https://doi.org/10.1016/j.celrep.2012.08.003
- PubMed
- Google Scholar
1. Hecker A
2. Brand LH
3. Peter S
4. Simoncello N
5. Kilian J
6. Harter K
(2015) The Arabidopsis GAGA-binding factor BPC6 recruits PRC1 component LHP1 to GAGA DNA-motifs
Plant Physiology 168:1013–1024.

https://doi.org/10.1104/pp.15.00409
- Google Scholar
1. Heger P
2. Marin B
3. Bartkuhn M
4. Schierenberg E
5. Wiehe T
(2012) The chromatin insulator CTCF and the emergence of metazoan diversity
PNAS 109:17507–17512.

https://doi.org/10.1073/pnas.1111941109
- PubMed
- Google Scholar
1. Heintzman ND
2. Hon GC
3. Hawkins RD
4. Kheradpour P
5. Stark A
6. Harp LF
7. Ye Z
8. Lee LK
9. Stuart RK
10. Ching CW
11. Ching KA
12. Antosiewicz-Bourget JE
13. Liu H
14. Zhang X
15. Green RD
16. Lobanenkov VV
17. Stewart R
18. Thomson JA
19. Crawford GE
20. Kellis M
21. Ren B
(2009) Histone modifications at human enhancers reflect global cell-type-specific gene expression
Nature 459:108–112.

https://doi.org/10.1038/nature07829
- PubMed
- Google Scholar
1. Helm RR
2. Siebert S
3. Tulin S
4. Smith J
5. Dunn CW
(2013) Characterization of differential transcript abundance through time during Nematostella vectensis development
BMC Genomics 14:266.

https://doi.org/10.1186/1471-2164-14-266
- PubMed
- Google Scholar
(2015) Principles of long noncoding RNA evolution derived from direct comparison of transcriptomes in 17 species
Cell Reports 11:1110–1122.

https://doi.org/10.1016/j.celrep.2015.04.023
- PubMed
- Google Scholar
1. Ho JW
2. Jung YL
3. Liu T
4. Alver BH
5. Lee S
6. Ikegami K
7. Sohn KA
8. Minoda A
9. Tolstorukov MY
10. Appert A
11. Parker SC
12. Gu T
13. Kundaje A
14. Riddle NC
15. Bishop E
16. Egelhofer TA
17. Hu SS
18. Alekseyenko AA
19. Rechtsteiner A
20. Asker D
21. Belsky JA
22. Bowman SK
23. Chen QB
24. Chen RA
25. Day DS
26. Dong Y
27. Dose AC
28. Duan X
29. Epstein CB
30. Ercan S
31. Feingold EA
32. Ferrari F
33. Garrigues JM
34. Gehlenborg N
35. Good PJ
36. Haseley P
37. He D
38. Herrmann M
39. Hoffman MM
40. Jeffers TE
41. Kharchenko PV
42. Kolasinska-Zwierz P
43. Kotwaliwale CV
44. Kumar N
45. Langley SA
46. Larschan EN
47. Latorre I
48. Libbrecht MW
49. Lin X
50. Park R
51. Pazin MJ
52. Pham HN
53. Plachetka A
54. Qin B
55. Schwartz YB
56. Shoresh N
57. Stempor P
58. Vielle A
59. Wang C
60. Whittle CM
61. Xue H
62. Kingston RE
63. Kim JH
64. Bernstein BE
65. Dernburg AF
66. Pirrotta V
67. Kuroda MI
68. Noble WS
69. Tullius TD
70. Kellis M
71. MacAlpine DM
72. Strome S
73. Elgin SC
74. Liu XS
75. Lieb JD
76. Ahringer J
77. Karpen GH
78. Park PJ
(2014) Comparative analysis of metazoan chromatin organization
Nature 512:449–452.

https://doi.org/10.1038/nature13415
- PubMed
- Google Scholar
1. Ilott NE
2. Heward JA
3. Roux B
4. Tsitsiou E
5. Fenwick PS
6. Lenzi L
7. Goodhead I
8. Hertz-Fowler C
9. Heger A
10. Hall N
11. Donnelly LE
12. Sims D
13. Lindsay MA
(2014) Long non-coding RNAs and enhancer RNAs regulate the lipopolysaccharide-induced inflammatory response in human monocytes
Nature Communications 5:3979.

https://doi.org/10.1038/ncomms4979
- PubMed
- Google Scholar
1. Ikeuchi M
2. Iwase A
3. Rymen B
4. Harashima H
5. Shibata M
6. Ohnuma M
7. Breuer C
8. Morao AK
9. de Lucas M
10. De Veylder L
11. Goodrich J
12. Brady SM
13. Roudier F
14. Sugimoto K
(2015) PRC2 represses dedifferentiation of mature somatic cells in Arabidopsis
Nature Plants 1:15089.

https://doi.org/10.1038/nplants.2015.89
- PubMed
- Google Scholar
(2011) Contrasting 5' and 3' evolutionary histories and frequent evolutionary convergence in Meis/hth gene structures
Genome Biology and Evolution 3:551–564.

https://doi.org/10.1093/gbe/evr056
- PubMed
- Google Scholar
1. Irimia M
2. Maeso I
3. Roy SW
4. Fraser HB
(2013) Ancient cis-regulatory constraints and the evolution of genome architecture
Trends in Genetics 29:521–528.

https://doi.org/10.1016/j.tig.2013.05.008
- PubMed
- Google Scholar
(2012) Extensive conservation of ancient microsynteny across metazoans due to cis-regulatory constraints
Genome Research 22:2356–2367.

https://doi.org/10.1101/gr.139725.112
- PubMed
- Google Scholar
(2013) Regional control of histone H3 lysine 27 methylation in Neurospora
PNAS 110:6027–6032.

https://doi.org/10.1073/pnas.1303750110
- PubMed
- Google Scholar
1. Jiang C
2. Pugh BF
(2009) Nucleosome positioning and gene regulation: advances through genomics
Nature Reviews Genetics 10:161–172.

https://doi.org/10.1038/nrg2522
- PubMed
- Google Scholar
1. Jindrich K
2. Degnan BM
(2016) The diversification of the basic leucine zipper family in eukaryotes correlates with the evolution of multicellularity
BMC Evolutionary Biology 16:1–12.

https://doi.org/10.1186/s12862-016-0598-z
- PubMed
- Google Scholar
1. Jones P
2. Binns D
3. Chang HY
4. Fraser M
5. Li W
6. McAnulla C
7. McWilliam H
8. Maslen J
9. Mitchell A
10. Nuka G
11. Pesseat S
12. Quinn AF
13. Sangrador-Vegas A
14. Scheremetjew M
15. Yong SY
16. Lopez R
17. Hunter S
(2014) InterProScan 5: genome-scale protein function classification
Bioinformatics 30:1236–1240.

https://doi.org/10.1093/bioinformatics/btu031
- PubMed
- Google Scholar
(2016) BlastKOALA and GhostKOALA: KEGG tools for functional characterization of genome and metagenome sequences
Journal of Molecular Biology 428:726–731.

https://doi.org/10.1016/j.jmb.2015.11.006
- PubMed
- Google Scholar
(2010) Histone modification levels are predictive for gene expression
PNAS 107:2926–2931.

https://doi.org/10.1073/pnas.0909344107
- PubMed
- Google Scholar
1. Kassis JA
2. Brown JL
(2013) Polycomb group response elements in Drosophila and vertebrates
Advances in Genetics 81:83–118.

https://doi.org/10.1016/B978-0-12-407677-8.00003-8
- PubMed
- Google Scholar
1. Katoh K
2. Standley DM
(2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability
Molecular Biology and Evolution 30:772–780.

https://doi.org/10.1093/molbev/mst010
- PubMed
- Google Scholar
1. Kellis M
2. Wold B
3. Snyder MP
4. Bernstein BE
5. Kundaje A
6. Marinov GK
7. Ward LD
8. Birney E
9. Crawford GE
10. Dekker J
11. Dunham I
12. Elnitski LL
13. Farnham PJ
14. Feingold EA
15. Gerstein M
16. Giddings MC
17. Gilbert DM
18. Gingeras TR
19. Green ED
20. Guigo R
21. Hubbard T
22. Kent J
23. Lieb JD
24. Myers RM
25. Pazin MJ
26. Ren B
27. Stamatoyannopoulos JA
28. Weng Z
29. White KP
30. Hardison RC
(2014) Defining functional DNA elements in the human genome
PNAS 111:6131–6138.

https://doi.org/10.1073/pnas.1318948111
- PubMed
- Google Scholar
(2008) Design and analysis of ChIP-seq experiments for DNA-binding proteins
Nature Biotechnology 26:1351–1359.

https://doi.org/10.1038/nbt.1508
- PubMed
- Google Scholar
1. Kikuta H
2. Laplante M
3. Navratilova P
4. Komisarczuk AZ
5. Engström PG
6. Fredman D
7. Akalin A
8. Caccamo M
9. Sealy I
10. Howe K
11. Ghislain J
12. Pezeron G
13. Mourrain P
14. Ellingsen S
15. Oates AC
16. Thisse C
17. Thisse B
18. Foucher I
19. Adolf B
20. Geling A
21. Lenhard B
22. Becker TS
(2007) Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates
Genome Research 17:545–555.

https://doi.org/10.1101/gr.6086307
- PubMed
- Google Scholar
1. Kim TK
2. Hemberg M
3. Gray JM
4. Costa AM
5. Bear DM
6. Wu J
7. Harmin DA
8. Laptewicz M
9. Barbara-Haley K
10. Kuersten S
11. Markenscoff-Papadimitriou E
12. Kuhl D
13. Bito H
14. Worley PF
15. Kreiman G
16. Greenberg ME
(2010) Widespread transcription at neuronal activity-regulated enhancers
Nature 465:182–187.

https://doi.org/10.1038/nature09033
- PubMed
- Google Scholar
(2003) Evolution of key cell signaling and adhesion protein families predates animal origins
Science 301:361–363.

https://doi.org/10.1126/science.1083853
- PubMed
- Google Scholar
1. King N
2. Westbrook MJ
3. Young SL
4. Kuo A
5. Abedin M
6. Chapman J
7. Fairclough S
8. Hellsten U
9. Isogai Y
10. Letunic I
11. Marr M
12. Pincus D
13. Putnam N
14. Rokas A
15. Wright KJ
16. Zuzow R
17. Dirks W
18. Good M
19. Goodstein D
20. Lemons D
21. Li W
22. Lyons JB
23. Morris A
24. Nichols S
25. Richter DJ
26. Salamov A
27. JGI Sequencing
28. Bork P
29. Lim WA
30. Manning G
31. Miller WT
32. McGinnis W
33. Shapiro H
34. Tjian R
35. Grigoriev IV
36. Rokhsar D
(2008) The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans
Nature 451:783–788.

https://doi.org/10.1038/nature06617
- PubMed
- Google Scholar
1. Kowalczyk MS
2. Hughes JR
3. Garrick D
4. Lynch MD
5. Sharpe JA
6. Sloane-Stanley JA
7. McGowan SJ
8. De Gobbi M
9. Hosseini M
10. Vernimmen D
11. Brown JM
12. Gray NE
13. Collavin L
14. Gibbons RJ
15. Flint J
16. Taylor S
17. Buckle VJ
18. Milne TA
19. Wood WG
20. Higgs DR
(2012) Intragenic enhancers act as alternative promoters
Molecular Cell 45:447–458.

https://doi.org/10.1016/j.molcel.2011.12.021
- PubMed
- Google Scholar
(2013) The UCSC genome browser and associated tools
Briefings in Bioinformatics 14:144–161.

https://doi.org/10.1093/bib/bbs038
- PubMed
- Google Scholar
1. Kundaje A
2. Meuleman W
3. Ernst J
4. Bilenky M
5. Yen A
6. Heravi-Moussavi A
7. Kheradpour P
8. Zhang Z
9. Wang J
10. Ziller MJ
11. Amin V
12. Whitaker JW
13. Schultz MD
14. Ward LD
15. Sarkar A
16. Quon G
17. Sandstrom RS
18. Eaton ML
19. Wu YC
20. Pfenning AR
21. Wang X
22. Claussnitzer M
23. Liu Y
24. Coarfa C
25. Harris RA
26. Shoresh N
27. Epstein CB
28. Gjoneska E
29. Leung D
30. Xie W
31. Hawkins RD
32. Lister R
33. Hong C
34. Gascard P
35. Mungall AJ
36. Moore R
37. Chuah E
38. Tam A
39. Canfield TK
40. Hansen RS
41. Kaul R
42. Sabo PJ
43. Bansal MS
44. Carles A
45. Dixon JR
46. Farh KH
47. Feizi S
48. Karlic R
49. Kim AR
50. Kulkarni A
51. Li D
52. Lowdon R
53. Elliott G
54. Mercer TR
55. Neph SJ
56. Onuchic V
57. Polak P
58. Rajagopal N
59. Ray P
60. Sallari RC
61. Siebenthall KT
62. Sinnott-Armstrong NA
63. Stevens M
64. Thurman RE
65. Wu J
66. Zhang B
67. Zhou X
68. Beaudet AE
69. Boyer LA
70. De Jager PL
71. Farnham PJ
72. Fisher SJ
73. Haussler D
74. Jones SJ
75. Li W
76. Marra MA
77. McManus MT
78. Sunyaev S
79. Thomson JA
80. Tlsty TD
81. Tsai LH
82. Wang W
83. Waterland RA
84. Zhang MQ
85. Chadwick LH
86. Bernstein BE
87. Costello JF
88. Ecker JR
89. Hirst M
90. Meissner A
91. Milosavljevic A
92. Ren B
93. Stamatoyannopoulos JA
94. Wang T
95. Kellis M
96. Roadmap Epigenomics Consortium
(2015) Integrative analysis of 111 reference human epigenomes
Nature 518:317–330.

https://doi.org/10.1038/nature14248
- PubMed
- Google Scholar
1. Landt SG
2. Marinov GK
3. Kundaje A
4. Kheradpour P
5. Pauli F
6. Batzoglou S
7. Bernstein BE
8. Bickel P
9. Brown JB
10. Cayting P
11. Chen Y
12. DeSalvo G
13. Epstein C
14. Fisher-Aylor KI
15. Euskirchen G
16. Gerstein M
17. Gertz J
18. Hartemink AJ
19. Hoffman MM
20. Iyer VR
21. Jung YL
22. Karmakar S
23. Kellis M
24. Kharchenko PV
25. Li Q
26. Liu T
27. Liu XS
28. Ma L
29. Milosavljevic A
30. Myers RM
31. Park PJ
32. Pazin MJ
33. Perry MD
34. Raha D
35. Reddy TE
36. Rozowsky J
37. Shoresh N
38. Sidow A
39. Slattery M
40. Stamatoyannopoulos JA
41. Tolstorukov MY
42. White KP
43. Xi S
44. Farnham PJ
45. Lieb JD
46. Wold BJ
47. Snyder M
(2012) ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia
Genome Research 22:1813–1831.

https://doi.org/10.1101/gr.136184.111
- PubMed
- Google Scholar
(2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
Genome Biology 10:R25.

https://doi.org/10.1186/gb-2009-10-3-r25
- PubMed
- Google Scholar
1. Larroux C
2. Fahey B
3. Liubicich D
4. Hinman VF
5. Gauthier M
6. Gongora M
7. Green K
8. Wörheide G
9. Leys SP
10. Degnan BM
(2006) Developmental expression of transcription factor genes in a demosponge: insights into the origin of metazoan multicellularity
Evolution & Development 8:150–173.

https://doi.org/10.1111/j.1525-142X.2006.00086.x
- PubMed
- Google Scholar
1. Larroux C
2. Luke GN
3. Koopman P
4. Rokhsar DS
5. Shimeld SM
6. Degnan BM
(2008) Genesis and expansion of metazoan transcription factor gene classes
Molecular Biology and Evolution 25:980–996.

https://doi.org/10.1093/molbev/msn047
- PubMed
- Google Scholar
1. Lee BK
2. Iyer VR
(2012) Genome-wide studies of CCCTC-binding factor (CTCF) and cohesin provide insight into chromatin structure and regulation
Journal of Biological Chemistry 287:30906–30913.

https://doi.org/10.1074/jbc.R111.324962
- PubMed
- Google Scholar
1. Leininger S
2. Adamski M
3. Bergum B
4. Guder C
5. Liu J
6. Laplante M
7. Bråte J
8. Hoffmann F
9. Fortunato S
10. Jordal S
11. Rapp HT
12. Adamska M
(2014) Developmental gene expression provides clues to relationships between sponge and eumetazoan body plans
Nature Communications 5:3905.

https://doi.org/10.1038/ncomms4905
- PubMed
- Google Scholar
(2012) Metazoan promoters: emerging characteristics and insights into transcriptional regulation
Nature Reviews Genetics 13:233–245.

https://doi.org/10.1038/nrg3163
- PubMed
- Google Scholar
1. Levin M
2. Anavy L
3. Cole AG
4. Winter E
5. Mostov N
6. Khair S
7. Senderovich N
8. Kovalev E
9. Silver DH
10. Feder M
11. Fernandez-Valverde SL
12. Nakanishi N
13. Simmons D
14. Simakov O
15. Larsson T
16. Liu SY
17. Jerafi-Vider A
18. Yaniv K
19. Ryan JF
20. Martindale MQ
21. Rink JC
22. Arendt D
23. Degnan SM
24. Degnan BM
25. Hashimshony T
26. Yanai I
(2016) The mid-developmental transition and the evolution of animal body plans
Nature 531:637–641.

https://doi.org/10.1038/nature16994
- PubMed
- Google Scholar
(2014) Looping back to leap forward: transcription enters a new era
Cell 157:13–25.

https://doi.org/10.1016/j.cell.2014.02.009
- PubMed
- Google Scholar
1. Levine M
2. Tjian R
(2003) Transcription regulation and animal diversity
Nature 424:147–151.

https://doi.org/10.1038/nature01763
- PubMed
- Google Scholar
1. Levine M
(2010) Transcriptional enhancers in animal development and evolution
Current Biology 20:R754–R763.

https://doi.org/10.1016/j.cub.2010.06.070
- PubMed
- Google Scholar
1. Leys SP
2. Degnan BM
(2002) Embryogenesis and metamorphosis in a haplosclerid demosponge: gastrulation and transdifferentiation of larval ciliated cells to choanocytes
Invertebrate Biology 121:171–189.

https://doi.org/10.1111/j.1744-7410.2002.tb00058.x
- Google Scholar
1. Leys SP
2. Larroux C
3. Gauthier M
4. Adamska M
5. Fahey B
6. Richards GS
7. Degnan SM
8. Degnan BM
(2008) Isolation of Amphimedon developmental material
Cold Spring Harbor Protocols 2008:pdb.prot5095.

https://doi.org/10.1101/pdb.prot5095
- Google Scholar
(2009) The Sequence Alignment/Map format and SAMtools
Bioinformatics 25:2078–2079.

https://doi.org/10.1093/bioinformatics/btp352
- PubMed
- Google Scholar
(2016) Enhancers as non-coding RNA transcription units: recent insights and future perspectives
Nature Reviews Genetics 17:207–223.

https://doi.org/10.1038/nrg.2016.4
- PubMed
- Google Scholar
1. Liang X
2. Song MR
3. Xu Z
4. Lanuza GM
5. Liu Y
6. Zhuang T
7. Chen Y
8. Pfaff SL
9. Evans SM
10. Sun Y
(2011) Isl1 is required for multiple aspects of motor neuron development
Molecular and Cellular Neuroscience 47:215–222.

https://doi.org/10.1016/j.mcn.2011.04.007
- PubMed
- Google Scholar
1. Liu Y
2. Taverna SD
3. Muratore TL
4. Shabanowitz J
5. Hunt DF
6. Allis CD
(2007) RNAi-dependent H3K27 methylation is required for heterochromatin formation and DNA elimination in Tetrahymena
Genes & Development 21:1530–1545.

https://doi.org/10.1101/gad.1544207
- PubMed
- Google Scholar
1. Love MI
2. Huber W
3. Anders S
(2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2
Genome Biology 15:1–21.

https://doi.org/10.1186/s13059-014-0550-8
- PubMed
- Google Scholar
1. Machanick P
2. Bailey TL
(2011) MEME-ChIP: motif analysis of large DNA datasets
Bioinformatics 27:1696–1697.

https://doi.org/10.1093/bioinformatics/btr189
- PubMed
- Google Scholar
(2005) BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks
Bioinformatics 21:3448–3449.

https://doi.org/10.1093/bioinformatics/bti551
- PubMed
- Google Scholar
1. Margueron R
2. Reinberg D
(2011) The polycomb complex PRC2 and its mark in life
Nature 469:343–349.

https://doi.org/10.1038/nature09784
- PubMed
- Google Scholar
1. Marques AC
2. Hughes J
3. Graham B
4. Kowalczyk MS
5. Higgs DR
6. Ponting CP
(2013) Chromatin signatures at transcriptional start sites separate two equally populated yet distinct classes of intergenic long noncoding RNAs
Genome Biology 14:R131.

https://doi.org/10.1186/gb-2013-14-11-r131
- PubMed
- Google Scholar
1. Martin M
(2011) Cutadapt removes adapter sequences from high-throughput sequencing reads
EMBnet.journal 17:10–12.

https://doi.org/10.14806/ej.17.1.200
- Google Scholar
1. Merkenschlager M
2. Odom DT
(2013) CTCF and cohesin: linking gene regulatory elements with their targets
Cell 152:1285–1297.

https://doi.org/10.1016/j.cell.2013.02.029
- PubMed
- Google Scholar
1. Moroz LL
2. Kocot KM
3. Citarella MR
4. Dosung S
5. Norekian TP
6. Povolotskaya IS
7. Grigorenko AP
8. Dailey C
9. Berezikov E
10. Buckley KM
11. Ptitsyn A
12. Reshetov D
13. Mukherjee K
14. Moroz TP
15. Bobkova Y
16. Yu F
17. Kapitonov VV
18. Jurka J
19. Bobkov YV
20. Swore JJ
21. Girardo DO
22. Fodor A
23. Gusev F
24. Sanford R
25. Bruders R
26. Kittler E
27. Mills CE
28. Rast JP
29. Derelle R
30. Solovyev VV
31. Kondrashov FA
32. Swalla BJ
33. Sweedler JV
34. Rogaev EI
35. Halanych KM
36. Kohn AB
(2014) The ctenophore genome and the evolutionary origins of neural systems
Nature 510:109–114.

https://doi.org/10.1038/nature13400
- PubMed
- Google Scholar
1. Müller J
2. Kassis JA
(2006) Polycomb response elements and targeting of Polycomb group proteins in Drosophila
Current Opinion in Genetics & Development 16:476–484.

https://doi.org/10.1016/j.gde.2006.08.005
- PubMed
- Google Scholar
(2014) Evolutionary origin of gastrulation: insights from sponge development
BMC Biology 12:26.

https://doi.org/10.1186/1741-7007-12-26
- PubMed
- Google Scholar
1. Natoli G
2. Andrau JC
(2012) Noncoding transcription at enhancers: general principles and functional models
Annual Review of Genetics 46:1–19.

https://doi.org/10.1146/annurev-genet-110711-155459
- PubMed
- Google Scholar
1. Naville M
2. Ishibashi M
3. Ferg M
4. Bengani H
5. Rinkwitz S
6. Krecsmarik M
7. Hawkins TA
8. Wilson SW
9. Manning E
10. Chilamakuri CS
11. Wilson DI
12. Louis A
13. Raymond FL
14. Rastegar S
15. Strähle U
16. Lenhard B
17. Bally-Cuif L
18. van Heyningen V
19. FitzPatrick DR
20. Becker TS
21. Roest Crollius H
(2015) Long-range evolutionary constraints reveal cis-regulatory interactions on the human X chromosome
Nature Communications 6:6904.

https://doi.org/10.1038/ncomms7904
- PubMed
- Google Scholar
1. Nègre N
2. Brown CD
3. Ma L
4. Bristow CA
5. Miller SW
6. Wagner U
7. Kheradpour P
8. Eaton ML
9. Loriaux P
10. Sealfon R
11. Li Z
12. Ishii H
13. Spokony RF
14. Chen J
15. Hwang L
16. Cheng C
17. Auburn RP
18. Davis MB
19. Domanus M
20. Shah PK
21. Morrison CA
22. Zieba J
23. Suchy S
24. Senderowicz L
25. Victorsen A
26. Bild NA
27. Grundstad AJ
28. Hanley D
29. MacAlpine DM
30. Mannervik M
31. Venken K
32. Bellen H
33. White R
34. Gerstein M
35. Russell S
36. Grossman RL
37. Ren B
38. Posakony JW
39. Kellis M
40. White KP
(2011) A cis-regulatory map of the Drosophila genome
Nature 471:527–531.

https://doi.org/10.1038/nature09990
- PubMed
- Google Scholar
1. Peter IS
2. Davidson EH
(2011) Evolution of gene regulatory networks controlling body plan development
Cell 144:970–985.

https://doi.org/10.1016/j.cell.2011.02.017
- PubMed
- Google Scholar
(2015a) Absence of canonical marks of active chromatin in developmentally regulated genes
Nature Genetics 47:1158–1167.

https://doi.org/10.1038/ng.3381
- PubMed
- Google Scholar
(2015b) Active transcription without histone modifications
Oncotarget 6:41401.

https://doi.org/10.18632/oncotarget.6437
- PubMed
- Google Scholar
(2010) FastTree 2 – approximately maximum-likelihood trees for large alignments
PLoS One 5:e9490.

https://doi.org/10.1371/journal.pone.0009490
- PubMed
- Google Scholar
1. Putnam NH
2. Srivastava M
3. Hellsten U
4. Dirks B
5. Chapman J
6. Salamov A
7. Terry A
8. Shapiro H
9. Lindquist E
10. Kapitonov VV
11. Jurka J
12. Genikhovich G
13. Grigoriev IV
14. Lucas SM
15. Steele RE
16. Finnerty JR
17. Technau U
18. Martindale MQ
19. Rokhsar DS
(2007) Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization
Science 317:86–94.

https://doi.org/10.1126/science.1139158
- PubMed
- Google Scholar
1. Quinlan AR
2. Hall IM
(2010) BEDTools: a flexible suite of utilities for comparing genomic features
Bioinformatics 26:841–842.

https://doi.org/10.1093/bioinformatics/btq033
- PubMed
- Google Scholar
1. Quinn JJ
2. Chang HY
(2016) Unique features of long non-coding RNA biogenesis and function
Nature Reviews Genetics 17:47–62.

https://doi.org/10.1038/nrg.2015.10
- PubMed
- Google Scholar
Software
1. R Core Team
(2014)
R: A Language and Environment for Statistical Computing

R Foundation for Statistical Computing, Vienna, Austria.
1. Ramírez F
2. Ryan DP
3. Grüning B
4. Bhardwaj V
5. Kilpert F
6. Richter AS
7. Heyne S
8. Dündar F
9. Manke T
(2016) deepTools2: a next generation web server for deep-sequencing data analysis
Nucleic Acids Research 44:W160–W165.

https://doi.org/10.1093/nar/gkw257
- PubMed
- Google Scholar
1. Richards GS
2. Degnan BM
(2009) The dawn of developmental signaling in the metazoa
Cold Spring Harbor Symposia on Quantitative Biology 74:81–90.

https://doi.org/10.1101/sqb.2009.74.028
- PubMed
- Google Scholar
(2008) Sponge genes provide new insight into the evolutionary origin of the neurogenic circuit
Current Biology 18:1156–1161.

https://doi.org/10.1016/j.cub.2008.06.074
- PubMed
- Google Scholar
1. Richter DJ
2. King N
(2013) The genomic and cellular foundations of animal origins
Annual Review of Genetics 47:509–537.

https://doi.org/10.1146/annurev-genet-111212-133456
- PubMed
- Google Scholar
1. Riesgo A
2. Farrar N
3. Windsor PJ
4. Giribet G
5. Leys SP
(2014) The analysis of eight transcriptomes from all poriferan classes reveals surprising genetic complexity in sponges
Molecular Biology and Evolution 31:1102–1120.

https://doi.org/10.1093/molbev/msu057
- PubMed
- Google Scholar
1. Ritter DI
2. Dong Z
3. Guo S
4. Chuang JH
(2012) Transcriptional enhancers in protein-coding exons of vertebrate developmental genes
PLoS One 7:e35202.

https://doi.org/10.1371/journal.pone.0035202
- PubMed
- Google Scholar
1. Robertson G
2. Hirst M
3. Bainbridge M
4. Bilenky M
5. Zhao Y
6. Zeng T
7. Euskirchen G
8. Bernier B
9. Varhol R
10. Delaney A
11. Thiessen N
12. Griffith OL
13. He A
14. Marra M
15. Snyder M
16. Jones S
(2007) Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing
Nature Methods 4:651–657.

https://doi.org/10.1038/nmeth1068
- PubMed
- Google Scholar
1. Roy S
2. Ernst J
3. Kharchenko PV
4. Kheradpour P
5. Negre N
6. Eaton ML
7. Landolin JM
8. Bristow CA
9. Ma L
10. Lin MF
11. Washietl S
12. Arshinoff BI
13. Ay F
14. Meyer PE
15. Robine N
16. Washington NL
17. Di Stefano L
18. Berezikov E
19. Brown CD
20. Candeias R
21. Carlson JW
22. Carr A
23. Jungreis I
24. Marbach D
25. Sealfon R
26. Tolstorukov MY
27. Will S
28. Alekseyenko AA
29. Artieri C
30. Booth BW
31. Brooks AN
32. Dai Q
33. Davis CA
34. Duff MO
35. Feng X
36. Gorchakov AA
37. Gu T
38. Henikoff JG
39. Kapranov P
40. Li R
41. MacAlpine HK
42. Malone J
43. Minoda A
44. Nordman J
45. Okamura K
46. Perry M
47. Powell SK
48. Riddle NC
49. Sakai A
50. Samsonova A
51. Sandler JE
52. Schwartz YB
53. Sher N
54. Spokony R
55. Sturgill D
56. van Baren M
57. Wan KH
58. Yang L
59. Yu C
60. Feingold E
61. Good P
62. Guyer M
63. Lowdon R
64. Ahmad K
65. Andrews J
66. Berger B
67. Brenner SE
68. Brent MR
69. Cherbas L
70. Elgin SC
71. Gingeras TR
72. Grossman R
73. Hoskins RA
74. Kaufman TC
75. Kent W
76. Kuroda MI
77. Orr-Weaver T
78. Perrimon N
79. Pirrotta V
80. Posakony JW
81. Ren B
82. Russell S
83. Cherbas P
84. Graveley BR
85. Lewis S
86. Micklem G
87. Oliver B
88. Park PJ
89. Celniker SE
90. Henikoff S
91. Karpen GH
92. Lai EC
93. MacAlpine DM
94. Stein LD
95. White KP
96. Kellis M
97. modENCODE Consortium
(2010) Identification of functional elements and regulatory circuits by Drosophila modENCODE
Science 330:1787–1797.

https://doi.org/10.1126/science.1198374
- PubMed
- Google Scholar
(2013) The genome of the ctenophore Mnemiopsis leidyi and its implications for cell type evolution
Science 342:1242592.

https://doi.org/10.1126/science.1242592
- PubMed
- Google Scholar
1. Sandelin A
2. Bailey P
3. Bruce S
4. Engström PG
5. Klos JM
6. Wasserman WW
7. Ericson J
8. Lenhard B
(2004) Arrays of ultraconserved non-coding regions span the loci of key developmental genes in vertebrate genomes
BMC Genomics 5:99.

https://doi.org/10.1186/1471-2164-5-99
- PubMed
- Google Scholar
1. Schwaiger M
2. Schönauer A
3. Rendeiro AF
4. Pribitzer C
5. Schauer A
6. Gilles AF
7. Schinko JB
8. Renfer E
9. Fredman D
10. Technau U
(2014) Evolutionary conservation of the eumetazoan gene regulatory landscape
Genome Research 24:639–650.

https://doi.org/10.1101/gr.162529.113
- PubMed
- Google Scholar
(2016) The dynamic regulatory genome of Capsaspora and the origin of animal multicellularity
Cell 165:1224–1237.

https://doi.org/10.1016/j.cell.2016.03.034
- PubMed
- Google Scholar
(2011) Unexpected repertoire of metazoan transcription factors in the unicellular holozoan Capsaspora owczarzaki
Molecular Biology and Evolution 28:1241–1254.

https://doi.org/10.1093/molbev/msq309
- PubMed
- Google Scholar
1. Seitan VC
2. Faure AJ
3. Zhan Y
4. McCord RP
5. Lajoie BR
6. Ing-Simmons E
7. Lenhard B
8. Giorgetti L
9. Heard E
10. Fisher AG
11. Flicek P
12. Dekker J
13. Merkenschlager M
(2013) Cohesin-based chromatin interactions enable regulated gene expression within preexisting architectural compartments
Genome Research 23:2066–2077.

https://doi.org/10.1101/gr.161620.113
- PubMed
- Google Scholar
1. Shannon P
2. Markiel A
3. Ozier O
4. Baliga NS
5. Wang JT
6. Ramage D
7. Amin N
8. Schwikowski B
9. Ideker T
(2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks
Genome Research 13:2498–2504.

https://doi.org/10.1101/gr.1239303
- PubMed
- Google Scholar
(2010) Origin of the polycomb repressive complex 2 and gene silencing by an E(z) homolog in the unicellular alga Chlamydomonas
Epigenetics 5:301–312.

https://doi.org/10.4161/epi.5.4.11608
- PubMed
- Google Scholar
1. Shen L
2. Shao N
3. Liu X
4. Nestler E
(2014) Ngs.plot: quick mining and visualization of next-generation sequencing data by integrating genomic databases
BMC Genomics 15:284.

https://doi.org/10.1186/1471-2164-15-284
- PubMed
- Google Scholar
(2014) Transcriptional enhancers: from properties to genome-wide predictions
Nature Reviews Genetics 15:272–286.

https://doi.org/10.1038/nrg3682
- PubMed
- Google Scholar
1. Sievers F
2. Wilm A
3. Dineen D
4. Gibson TJ
5. Karplus K
6. Li W
7. Lopez R
8. McWilliam H
9. Remmert M
10. Söding J
11. Thompson JD
12. Higgins DG
(2011) Fast, scalable generation of high-quality protein multiple sequence alignments using clustal omega
Molecular Systems Biology 7:539.

https://doi.org/10.1038/msb.2011.75
- PubMed
- Google Scholar
(2007) Origin and diversification of the basic helix-loop-helix gene family in metazoans: insights from comparative genomics
BMC Evolutionary Biology 7:33–18.

https://doi.org/10.1186/1471-2148-7-33
- PubMed
- Google Scholar
1. Simon JA
2. Kingston RE
(2009) Mechanisms of polycomb gene silencing: knowns and unknowns
Nature Reviews Molecular Cell Biology 10:697–708.

https://doi.org/10.1038/nrm2763
- PubMed
- Google Scholar
(2015) A diverse epigenetic landscape at human exons with implication for expression
Nucleic Acids Research 43:3498–3508.

https://doi.org/10.1093/nar/gkv153
- PubMed
- Google Scholar
1. Srivastava M
2. Simakov O
3. Chapman J
4. Fahey B
5. Gauthier ME
6. Mitros T
7. Richards GS
8. Conaco C
9. Dacre M
10. Hellsten U
11. Larroux C
12. Putnam NH
13. Stanke M
14. Adamska M
15. Darling A
16. Degnan SM
17. Oakley TH
18. Plachetzki DC
19. Zhai Y
20. Adamski M
21. Calcino A
22. Cummins SF
23. Goodstein DM
24. Harris C
25. Jackson DJ
26. Leys SP
27. Shu S
28. Woodcroft BJ
29. Vervoort M
30. Kosik KS
31. Manning G
32. Degnan BM
33. Rokhsar DS
(2010) The Amphimedon queenslandica genome and the evolution of animal complexity
Nature 466:720–726.

https://doi.org/10.1038/nature09201
- PubMed
- Google Scholar
(1997) Co-localization of polycomb protein and GAGA factor on regulatory elements responsible for the maintenance of homeotic gene expression
The EMBO Journal 16:3621–3632.

https://doi.org/10.1093/emboj/16.12.3621
- PubMed
- Google Scholar
1. Thor S
2. Thomas JB
(1997) The Drosophila islet gene governs axon pathfinding and neurotransmitter identity
Neuron 18:397–409.

https://doi.org/10.1016/S0896-6273(00)81241-6
- PubMed
- Google Scholar
1. Thurman RE
2. Rynes E
3. Humbert R
4. Vierstra J
5. Maurano MT
6. Haugen E
7. Sheffield NC
8. Stergachis AB
9. Wang H
10. Vernot B
11. Garg K
12. John S
13. Sandstrom R
14. Bates D
15. Boatman L
16. Canfield TK
17. Diegel M
18. Dunn D
19. Ebersol AK
20. Frum T
21. Giste E
22. Johnson AK
23. Johnson EM
24. Kutyavin T
25. Lajoie B
26. Lee BK
27. Lee K
28. London D
29. Lotakis D
30. Neph S
31. Neri F
32. Nguyen ED
33. Qu H
34. Reynolds AP
35. Roach V
36. Safi A
37. Sanchez ME
38. Sanyal A
39. Shafer A
40. Simon JM
41. Song L
42. Vong S
43. Weaver M
44. Yan Y
45. Zhang Z
46. Zhang Z
47. Lenhard B
48. Tewari M
49. Dorschner MO
50. Hansen RS
51. Navas PA
52. Stamatoyannopoulos G
53. Iyer VR
54. Lieb JD
55. Sunyaev SR
56. Akey JM
57. Sabo PJ
58. Kaul R
59. Furey TS
60. Dekker J
61. Crawford GE
62. Stamatoyannopoulos JA
(2012) The accessible chromatin landscape of the human genome
Nature 489:75–82.

https://doi.org/10.1038/nature11232
- PubMed
- Google Scholar
1. Ulitsky I
(2016) Evolution to the rescue: using comparative genomics to understand long non-coding RNAs
Nature Reviews Genetics 17:601–614.

https://doi.org/10.1038/nrg.2016.85
- PubMed
- Google Scholar
1. Vavouri T
2. Walter K
3. Gilks WR
4. Lehner B
5. Elgar G
(2007) Parallel evolution of conserved non-coding elements that target a common set of developmental regulatory genes from worms to humans
Genome Biology 8:R15–14.

https://doi.org/10.1186/gb-2007-8-2-r15
- PubMed
- Google Scholar
(2012) The emerging role of Polycomb repressors in the response to DNA damage
Journal of Cell Science 125:3939–3948.

https://doi.org/10.1242/jcs.107375
- PubMed
- Google Scholar
(2007) Polycomb Group proteins: an evolutionary perspective
Trends in Genetics 23:494–502.

https://doi.org/10.1016/j.tig.2007.08.006
- PubMed
- Google Scholar
1. Woolfe A
2. Goodson M
3. Goode DK
4. Snell P
5. McEwen GK
6. Vavouri T
7. Smith SF
8. North P
9. Callaway H
10. Kelly K
11. Walter K
12. Abnizova I
13. Gilks W
14. Edwards YJ
15. Cooke JE
16. Elgar G
(2005) Highly conserved non-coding sequences are associated with vertebrate development
PLoS Biology 3:e7.

https://doi.org/10.1371/journal.pbio.0030007
- PubMed
- Google Scholar
1. Wu H
2. Caffo B
3. Jaffee HA
4. Irizarry RA
5. Feinberg AP
(2010) Redefining CpG islands using Hidden Markov models
Biostatistics 11:499–514.

https://doi.org/10.1093/biostatistics/kxq005
- PubMed
- Google Scholar
1. Wysocka J
2. Swigut T
3. Xiao H
4. Milne TA
5. Kwon SY
6. Landry J
7. Kauer M
8. Tackett AJ
9. Chait BT
10. Badenhorst P
11. Wu C
12. Allis CD
(2006) A PHD finger of NURF couples histone H3 lysine 4 trimethylation with chromatin remodelling
Nature 442:86–90.

https://doi.org/10.1038/nature04815
- PubMed
- Google Scholar
1. Zentner GE
2. Scacheri PC
(2012) The chromatin fingerprint of gene enhancer elements
Journal of Biological Chemistry 287:30888–30896.

https://doi.org/10.1074/jbc.R111.296491
- PubMed
- Google Scholar
(2011) Epigenetic signatures distinguish multiple classes of enhancers with distinct cellular functions
Genome Research 21:1273–1283.

https://doi.org/10.1101/gr.122382.111
- PubMed
- Google Scholar
1. Zhang Y
2. Liu T
3. Meyer CA
4. Eeckhoute J
5. Johnson DS
6. Bernstein BE
7. Nusbaum C
8. Myers RM
9. Brown M
10. Li W
11. Liu XS
(2008) Model-based analysis of ChIP-Seq (MACS)
Genome Biology 9:R137.

https://doi.org/10.1186/gb-2008-9-9-r137
- PubMed
- Google Scholar
(2011) Charting histone modifications and the functional organization of mammalian genomes
Nature Reviews Genetics 12:7–18.

https://doi.org/10.1038/nrg2905
- PubMed
- Google Scholar

Article and author information

Author details

Federico Gaiti

School of Biological Sciences, University of Queensland, Brisbane, Australia
Present address
1. New York Genome Center, New York City, United States
2. Department of Medicine, Weill Cornell Medicine, New York City, United States
Contribution
FG, Conceptualization, Data curation, Formal analysis, Validation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing

Competing interests
The authors declare that no competing interests exist.

ORCID iD: 0000-0001-5111-8816
Katia Jindrich

School of Biological Sciences, University of Queensland, Brisbane, Australia

Contribution
KJ, Formal analysis, Validation, Methodology, Writing—review and editing

Competing interests
The authors declare that no competing interests exist.
Selene L Fernandez-Valverde

School of Biological Sciences, University of Queensland, Brisbane, Australia

Present address
Consejo Nacional de Ciencia y Tecnología, Laboratorio Nacional de Genómica para la Biodiversidad, Centro de Investigación y de Estudios Avanzados del IPN, Guanajuato, México

Contribution
SLF-V, Conceptualization, Formal analysis, Supervision, Project administration, Writing—review and editing

Competing interests
The authors declare that no competing interests exist.
Kathrein E Roper

School of Biological Sciences, University of Queensland, Brisbane, Australia

Contribution
KER, Validation, Methodology, Writing—review and editing, Performed ChIP-quantitative PCRs (ChIP-qPCRs). All authors agree with her inclusion and place in the author list.

Competing interests
The authors declare that no competing interests exist.
Bernard M Degnan

School of Biological Sciences, University of Queensland, Brisbane, Australia

Contribution
BMD, Conceptualization, Supervision, Funding acquisition, Writing—original draft, Project administration, Writing—review and editing

For correspondence
b.degnan@uq.edu.au

Competing interests
The authors declare that no competing interests exist.
Miloš Tanurdžić

School of Biological Sciences, University of Queensland, Brisbane, Australia

Contribution
MT, Conceptualization, Formal analysis, Supervision, Methodology, Writing—original draft, Project administration, Writing—review and editing

For correspondence
m.tanurdzic@uq.edu.au

Competing interests
The authors declare that no competing interests exist.

ORCID iD: 0000-0002-7564-0868

Funding

Australian Research Council (FL110100044)

Bernard M Degnan

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Alex de Mendoza for constructive comments on the manuscript, ortholog identification and phylogenetic analyses; William Hatleberg for KEGG pathway analysis; Simon Blomberg for statistical consultation; Nicholas Rhodes and Igor Makunin for bioinformatics support; Emily Wong for critical reading of the manuscript; and Kevin Dudley for sequencing of the larval ChIP-seq libraries.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.