Genetic Rearrangement: One genome’s junk is another’s garbage

  1. Lydia J Bright (corresponding author), State University of New York at New Paltz, United States
  2. Douglas L Chalker (corresponding author), Washington University in St. Louis, United States

Most of us store all kinds of junk in our garages and basements because we think that it might be useful at some time in the future. One day, we convince ourselves, we will dig out those old cleats or need that widget. Then the time comes for a good spring cleaning, and we declare that much of the junk we’ve been saving is just garbage and put it out on the curb for pick-up.

Much like basements, genomes are full of junk (such as repetitive sequences of DNA that have no obvious function). Of course, genomes also encode the instructions for making essential mRNA molecules and proteins, and these instructions need to be passed on to future generations. Many eukaryotes separate out these activities. For example, multicellular plants and animals use different cell types: the main “somatic” cells of the body express genes, while germline cells (such as egg and sperm cells) propagate DNA to offspring. Single-celled eukaryotes called ciliates, on the other hand, keep their germline genome in a germline micronucleus and their somatic genome in a separate somatic macronucleus. Now, in eLife, Robert Coyne of the J. Craig Venter Institute and colleagues – including Eileen Hamilton and Aurélie Kapusta as joint first authors – report that they have sequenced the germline genome of a ciliate called Tetrahymena thermophila (Hamilton et al., 2016).

When ciliates mate, their germline nuclei fuse to form a new nucleus that develops into both the somatic and germline nuclei of the offspring (Figure 1). The new germline genome remains intact and is transcriptionally inactive. To form the somatic genome, germline chromosomes break into fragments to form the chromosomes that end up in the somatic macronucleus.

Junk DNA in Tetrahymena.

Left: Tetrahymena is a single-celled ciliate that stores its germline genome in a micronucleus and its somatic genome in a macronucleus. During reproduction, two micronuclei fuse to form a zygotic nucleus that splits to form a new germline micronucleus and a new somatic macronucleus. This means that the genetic material in the somatic macronuclei of the parents is discarded. Right: Germline DNA (purple; top) remains intact in the germline micronucleus, but is processed in the somatic macronucleus to form somatic DNA (blue; bottom). Junk DNA in the form of internally eliminated sequences (IES; green boxes) is removed, and the DNA is fragmented at chromosome breakage sites (Cbs; red arrowhead) to form five chromosomes, which are stabilized by the addition of telomeres (purple triangles) at their ends.

During this developmental process, ciliates treat junk DNA like garbage, tossing it from their somatic genomes. One can therefore identify what these cells consider to be junk by comparing the contents of their germline and somatic genomes: junk DNA is only found in the germline. The fact that ciliates keep junk DNA in their germline, even though they have developed mechanisms to remove it, may indicate that this DNA serves (or has served) some role in the lifestyle of ciliates. On the other hand, the evolutionary cost of developing or deploying mechanisms to remove the nonfunctional DNA may exceed the cost of propagating it, leading to its retention.
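The comparison described above can be sketched in a few lines of Python. This is a toy illustration only, not the method used by Hamilton et al.: the sequences are invented, the function name `germline_only_segments` is hypothetical, and a real analysis would align sequencing reads or whole-genome assemblies rather than short strings. The sketch uses the standard-library `difflib.SequenceMatcher` to report stretches that are present in a "germline" string but missing from the matching "somatic" string.

```python
from difflib import SequenceMatcher


def germline_only_segments(germline: str, somatic: str, min_len: int = 4):
    """Return (start, end, sequence) for germline stretches absent from the somatic sequence."""
    matcher = SequenceMatcher(None, germline, somatic, autojunk=False)
    segments = []
    for tag, g_start, g_end, _s_start, _s_end in matcher.get_opcodes():
        # A 'delete' opcode marks a stretch found in the first string (germline)
        # that has no counterpart in the second string (somatic).
        if tag == "delete" and g_end - g_start >= min_len:
            segments.append((g_start, g_end, germline[g_start:g_end]))
    return segments


# Invented toy sequences: three retained blocks separated by two made-up
# eliminated sequences. None of these come from the Tetrahymena genome.
retained = ["ATGGCCATTG", "CCGGATACGA", "TTGACCGTAA"]
eliminated = ["TTAACGCGTTAA", "GGATATATCC"]

germline = retained[0] + eliminated[0] + retained[1] + eliminated[1] + retained[2]
somatic = "".join(retained)

segments = germline_only_segments(germline, somatic)
removed = sum(end - start for start, end, _ in segments)

for start, end, seq in segments:
    print(f"germline-only segment at {start}-{end}: {seq}")
print(f"{removed} of {len(germline)} germline bases are absent from the somatic sequence")
```

In this toy case the two invented eliminated sequences are recovered exactly. With real DNA, short repeats at the junctions can make the precise boundaries ambiguous, which is one reason why mapping exact locations is harder than simply detecting that something is missing.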

When Hamilton et al. compared the germline genome of Tetrahymena with the somatic genome, which was sequenced a decade ago (Eisen et al., 2006), they found that one third of the germline genome is discarded to form the somatic genome. In particular, they discovered that approximately 12,000 junk DNA loci are “internally eliminated sequences” that are removed during development (Figure 1), and they were able to map the exact locations of nearly 7,500 of these.

The sequence and distribution of these internally eliminated sequences in the germline genome reveal a history spent combating mobile DNA elements called transposons. Most of the internally eliminated sequences appear to be descendants of transposons and are highly enriched near the center of the five metacentric chromosomes. (A metacentric chromosome has its centromere, the structure that holds sister chromatids together, at its center, whereas an acentric chromosome lacks a centromere.) Some of these sequences must act as centromeric DNA, only to be removed when the acentric somatic chromosomes form.

The centromeres of most eukaryotes are rich in repetitive DNA that is packaged as heterochromatin to suppress the activity of the genes it contains. However, the germline chromosomes in Tetrahymena lack the form of heterochromatin found in the outer layer of the centromeres of most eukaryotes (Taverna et al., 2002). Perhaps junk DNA accumulates around centromeres in order to keep those parts of the genome that are transcribed away from the centromere, where chromatin suppresses the transcription of DNA (Fukagawa and Earnshaw, 2014).

Tetrahymena still forms heterochromatin to combat transposon proliferation, but not in the germline genome. The ciliates have adapted a process by which small RNA molecules direct the formation of heterochromatin to silence, then eliminate, transposons before they reach the expressed genome (Taverna et al., 2002; Mochizuki et al., 2002). To go from silencing to elimination, Tetrahymena cells have domesticated Tpb2p, an enzyme that normally helps transposons to hop around the genome, to cut out any DNA that is packaged in newly formed heterochromatin (Cheng et al., 2010).

A related ciliate called Paramecium tetraurelia removes internally eliminated sequences using a more precise method than Tetrahymena (Bétermier, 2004). Hamilton et al. show that all but a very small number of internally eliminated sequences in Tetrahymena are located within non-coding sequences, whereas thousands are located within coding sequences in Paramecium (Arnaiz et al., 2012). Tetrahymena’s imprecise excision mechanism likely prevents internally eliminated sequences (or, more accurately, their ancestral active transposons) from accumulating in protein-coding genes, which errant excision events might render non-functional.
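To see why imprecise excision inside a protein-coding gene is risky, consider the following minimal Python sketch. Everything in it is an assumption made for illustration: the coding sequence and the inserted element are invented, the helper `excise` is hypothetical, and imprecision is modeled simply as the right-hand cut site shifting by a few bases. It shows that only an exact excision restores the original coding sequence, while most small errors change the length by something other than a multiple of three and so shift the reading frame.

```python
def excise(sequence: str, start: int, end: int) -> str:
    """Remove sequence[start:end] and rejoin the two flanks."""
    return sequence[:start] + sequence[end:]


# Invented six-codon coding sequence (ATG GCT GAA ACC GGT TAA) and an
# invented element inserted after the third codon.
cds = "ATGGCTGAAACCGGTTAA"
element = "TTAACGCGTTAA"
insert_at = 9
interrupted = cds[:insert_at] + element + cds[insert_at:]

# results[offset] = (original sequence restored?, length still a multiple of 3?)
results = {}
for offset in (-2, -1, 0, 1, 2, 3):
    # Toy model of imprecision: the left cut is exact, the right cut is
    # shifted by `offset` bases relative to the true end of the element.
    product = excise(interrupted, insert_at, insert_at + len(element) + offset)
    results[offset] = (product == cds, len(product) % 3 == 0)

for offset, (restored, in_frame) in results.items():
    print(f"offset {offset:+d}: restored={restored}, in_frame={in_frame}")
```

Under this toy model only an offset of zero restores the gene. An offset of three keeps the reading frame but still deletes a codon, and every other offset in the loop produces a frameshift, which is consistent with the idea that elements excised imprecisely would be poorly tolerated within coding sequences.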

Does any of the eliminated junk DNA contain useful stuff? There is a lot left to explore. When Hamilton et al., who are based at institutes in the United States, Austria, the United Kingdom and China, mapped the 225 sites at which germline chromosomes break when creating the ends of the acentric somatic chromosomes, they discovered that 33 chromosome segments are not maintained in the somatic macronucleus. This appears to be more than happenstance. The sequence and position of each chromosome breakage site were conserved across Tetrahymena species, and new fragmentation sites appeared to be created by duplicating existing ones.

These 33 fragments encode 47 predicted open reading frames, some of which are transcribed during development before they are eliminated. Hamilton et al. propose that these fragments present a strategy for regulating gene expression during development. This is not without precedent, as the region encoding a subunit of the telomerase enzyme that is only required during development is also eliminated from the somatic genome of the ciliate Euplotes crassus (Karamysheva et al., 2003). In other words, the existence of junk and the means to remove it during development have been repeatedly co-opted to regulate gene expression, much like that widget in the basement proving to be useful after all. How much more of this junk DNA is more than garbage?

Article and author information

Author details

  1. Lydia J Bright

    Department of Biology, State University of New York at New Paltz, New Paltz, United States
    For correspondence
    brightl@newpaltz.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID iD: 0000-0001-7185-9988
  2. Douglas L Chalker

    Biology Department, Washington University in St. Louis, St. Louis, United States
    For correspondence
    dchalker@wustl.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID iD: 0000-0002-0285-3344

Copyright

© 2016, Bright et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Lydia J Bright
  2. Douglas L Chalker
(2016)
Genetic Rearrangement: One genome’s junk is another’s garbage
eLife 5:e23447.
https://doi.org/10.7554/eLife.23447
