• Figure 6. Repressive elements co-evolve with splice site sequences at cryptic exons.

    Alignment within primate genomes of Alu elements that contain human Alu-exons reveals a tight coupling between repressive U-tracts and the variation of 3' splice site strength.

    (A) As the strength of 3' splice sites increases, the repressive U-tracts lengthen, which recruits hnRNPC to prevent splicing of Alu-exons. Hence, the splice site sequence and the repressive U-tract undergo an evolutionary dynamic that we refer to as regulatory co-evolution. Splice site sequences without a nearby repressive element are depleted, likely due to strong negative selection. For as-yet unknown reasons, selection pressure for long U-tracts decreased at the more ancient Alu-exons (those that contain a splice site in a distant primate species, or whose sequence diverges from the Alu consensus), and these exons are less repressed by hnRNPC and have an increased incidence of tissue-specific splicing. At this stage, abundance of the Alu-exon isoform is also determined by its ability to trigger NMD, and likely by other factors such as tissue-specific splicing factors.

    (B) Based on the variation of the 3' splice sites between primate species, we characterised three evolutionary groups of Alu-exons. Most human Alu-exons have stronger 3' splice sites in human than in New World monkeys, and these exons also have longer repressive U-tracts in human. These exons are split according to the evolutionary trajectory of their 3' splice site sequence into emerging, evolving or stable 3' splice sites. The most ancient Alu-exons, which have a strong and stable 3' splice site, lack any trend towards longer U-tracts. This demonstrates that tight coupling between positive and negative splicing elements establishes a balanced regulatory environment at newly emerging exons.

    DOI: http://dx.doi.org/10.7554/eLife.19545.021

  • The following datasets were generated:

    Attig J, Haberman N, Ruiz de los Mozos I, Wang Z, Zarnack K, König J, Ule J, 2016, Transcriptome profiling of cells depleted of hnRNPC, UPF1, or co-depleted of hnRNPC and UPF1, http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-4008, Publicly available at the EBI ArrayExpress (accession no: E-MTAB-4008)

    Attig J, Haberman N, Ruiz de los Mozos I, Wang Z, Zarnack K, König J, Ule J, 2016, Transcription profiling of cytoplasmic and nuclear RNA of hnRNPC-depleted cells and ER-RAF1 activated HR1 cells, http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-4009, Publicly available at the EBI ArrayExpress (accession no: E-MTAB-4009)

    Attig J, Haberman N, Ruiz de los Mozos I, Wang Z, Zarnack K, König J, Ule J, 2016, Data from: Splicing repression and NMD control the emergence of Alu-exons, http://dx.doi.org/10.5061/dryad.7h81d, Available at Dryad Digital Repository under a CC0 Public Domain Dedication

    The following previously published dataset was used:

    Schroth P, 2014, RNA-Seq of human individual tissues and mixture of 16 tissues (Illumina Body Map), http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-513/, Publicly available at the EBI ArrayExpress (accession no: E-MTAB-513)

  • Figure 4—source data 1. Percent exon inclusion of Alu-exons in different human tissues.

    We list percent exon inclusion for each Alu-exon with sufficient coverage across the GTEx data (889 exons with at least 4000 junction-spanning reads across 8555 individual samples). In addition, the data table includes the chromosome position of the exon, mean skipping counts, the maximum and minimum inclusion across the tissues, the maximum difference in exon inclusion across the tissues, the strength of the 3' splice site in human, the length of the longest U-tract, substitutions in the Alu element and the substitution group. If no junction reads extending from the upstream exon directly to the downstream exon ('skipping junctions') were detected in a tissue, we report 100% exon inclusion in this tissue.

    DOI: http://dx.doi.org/10.7554/eLife.19545.015

    Download source data [figure-4—source-data-1.media-1.csv]
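
    The inclusion rule and the per-exon summary columns described above can be sketched as follows. This is only an illustrative sketch: the description does not give the exact formula, so the ratio of inclusion reads to inclusion plus skipping reads is an assumption, and the function names, argument names and tissue labels are hypothetical.

```python
from typing import Dict


def percent_exon_inclusion(inclusion_reads: float, skipping_reads: float) -> float:
    """Percent exon inclusion from junction read counts (assumed formula)."""
    if skipping_reads == 0:
        # Rule stated in the table description: with no skipping-junction
        # reads in a tissue, 100% inclusion is reported. How tissues with
        # no reads at all were handled is not specified; this follows the
        # rule literally, which is an assumption.
        return 100.0
    return 100.0 * inclusion_reads / (inclusion_reads + skipping_reads)


def summarise_across_tissues(inclusion_by_tissue: Dict[str, float]) -> Dict[str, float]:
    """Maximum, minimum and maximum difference in inclusion across tissues."""
    values = list(inclusion_by_tissue.values())
    highest, lowest = max(values), min(values)
    return {"max": highest, "min": lowest, "max_difference": highest - lowest}


# Hypothetical counts for one exon in two tissues.
example = {
    "brain": percent_exon_inclusion(30, 10),
    "liver": percent_exon_inclusion(5, 0),
}
summary = summarise_across_tissues(example)
```

    With these made-up counts, the exon has a skipping junction in one tissue only, so the second tissue falls under the 100% rule and the summary captures the spread between the two.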
  • Figure 5—source data 1. List of Alu-exons across our datasets and UCSC annotation, including cross-species annotation.

    We merged all Alu-exons identified from RNAseq data and annotated in UCSC (6731 exons), and filtered for non-overlapping exons. If any two exons overlapped, we retained the larger exon. This created a list of 6309 Alu-exons, for which we selected the 3' splice site and mapped orthologous positions in four primate genomes. For each 3' splice site (if present), we predicted the splice site strength using MaxEntScan (Yeo and Burge, 2004) and searched for the longest U-tract within the Alu element. The table lists, in order: the Alu-exon coordinates (in hg19); the annotation by UCSC as alternative or constitutive exon (exons without a UCSC annotation are annotated as cryptic exons); the furthest species in which we could identify the orthologous position; the exonisation group of the 3' splice site as defined in Figure 5 (see Materials and methods for details); the coordinates, repeat family and substitution group of the Alu element from which the exon arises (in hg19); and the position, predicted strength of the 3' splice site and longest U-tract of the Alu element in human (hg19), chimpanzee (panTro4), gibbon (nomLeu1), rhesus macaque (rheMac3) and marmoset (calJac3). All 3' splice site positions start with the −2 position of the canonical 3' splice site consensus (the AG dinucleotide).

    DOI: http://dx.doi.org/10.7554/eLife.19545.019

    Download source data [figure-5—source-data-1.media-2.xlsx]
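
    Two of the steps described above, retaining the larger of any two overlapping exons and finding the longest U-tract, can be sketched as follows. This is not the authors' code. The tuple layout, the half-open coordinates, the strand-aware overlap test and the longest-first greedy order are all assumptions made for illustration, and the function names are hypothetical. Splice site scoring with MaxEntScan is not shown.

```python
import re
from typing import List, Tuple

# Hypothetical exon record: (chromosome, start, end, strand), half-open coordinates.
Exon = Tuple[str, int, int, str]


def longest_u_tract(sequence: str) -> int:
    """Length of the longest uninterrupted run of U (T in DNA) in a sequence."""
    runs = re.findall(r"[UT]+", sequence.upper())
    return max((len(run) for run in runs), default=0)


def keep_larger_of_overlapping(exons: List[Exon]) -> List[Exon]:
    """Greedy filter: visit exons from longest to shortest and keep an exon
    only if it does not overlap an exon that was already kept."""
    kept: List[Exon] = []
    for exon in sorted(exons, key=lambda e: e[2] - e[1], reverse=True):
        chrom, start, end, strand = exon
        overlaps_kept = any(
            k[0] == chrom and k[3] == strand and start < k[2] and k[1] < end
            for k in kept
        )
        if not overlaps_kept:
            kept.append(exon)
    return sorted(kept)


# Made-up coordinates: the first two exons overlap, so only the larger survives.
candidates = [
    ("chr1", 100, 200, "+"),
    ("chr1", 150, 400, "+"),
    ("chr2", 100, 200, "+"),
]
filtered = keep_larger_of_overlapping(candidates)
tract_length = longest_u_tract("ACGTTTTTAGTTT")
```

    The greedy order is one possible reading of "retained the larger exon"; for chains of three or more mutually overlapping exons, other readings could keep a different subset.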
  • Supplementary file 1. List of RT-PCR and RT-qPCR primers used in this study.

    DOI: http://dx.doi.org/10.7554/eLife.19545.022

    Download source data [supplementary-file-1.media-3.xlsx]