The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture

  1. Athma A Pai
  2. Telmo Henriques
  3. Kayla McCue
  4. Adam Burkholder
  5. Karen Adelman
  6. Christopher B Burge (corresponding author)
  1. Massachusetts Institute of Technology, United States
  2. National Institute for Environmental Health Sciences, United States
  3. Harvard Medical School, United States

Abstract

Production of most eukaryotic mRNAs requires splicing of introns from pre-mRNA. The splicing reaction requires definition of splice sites, which are initially recognized in either intron-spanning ('intron definition') or exon-spanning ('exon definition') pairs. To understand how exon and intron length and splice site recognition mode impact splicing, we measured splicing rates genome-wide in Drosophila, using metabolic labeling/RNA sequencing and new mathematical models to estimate rates. We found that the modal intron length range of 60-70 nt represents a local maximum of splicing rates, but that much longer exon-defined introns are spliced even faster and more accurately. Surprisingly, we observed low variation in splicing rates across introns in the same gene, suggesting the presence of gene-level influences, and we identified multiple gene-level variables associated with splicing rate. Together, our data suggest that developmental and stress response genes may have preferentially evolved exon definition in order to enhance rates of splicing.
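As a rough illustration of what estimating a splicing rate involves, the sketch below assumes that each intron is excised with simple first-order kinetics, so that the fraction still unspliced a time t after synthesis is exp(-k t) and the intron half-life is ln(2)/k. This is a deliberately simplified stand-in and not the mathematical models developed in the article. The function names, time points, and rate value are hypothetical, and the data are simulated rather than taken from the study.

```python
import math

def estimate_rate(times_min, unspliced_fraction):
    """Estimate a first-order rate k (per minute).

    Assumes f(t) = exp(-k * t), so -ln(f) = k * t. The estimate is the
    least-squares slope of -ln(f) against t for a line through the origin.
    """
    num = sum(t * -math.log(f) for t, f in zip(times_min, unspliced_fraction))
    den = sum(t * t for t in times_min)
    return num / den

def half_life(k):
    """Half-life (minutes) of an intron spliced at first-order rate k."""
    return math.log(2) / k

# Simulated, noise-free example: hypothetical time points (minutes)
# and a hypothetical true rate of 0.3 per minute.
times = [5.0, 10.0, 20.0]
true_k = 0.3
fractions = [math.exp(-true_k * t) for t in times]

k_hat = estimate_rate(times, fractions)
print(f"estimated k = {k_hat:.3f} per min, half-life = {half_life(k_hat):.2f} min")
```

With real sequencing data the unspliced fraction would be noisy and would depend on how long each transcript has existed within the labeling window, which a single-exponential fit does not account for.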

Article and author information

Author details

  1. Athma A Pai

    Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-7995-9948
  2. Telmo Henriques

    Epigenetics and Stem Cell Biology Laboratory, National Institute for Environmental Health Sciences, Durham, United States
    Competing interests
    No competing interests declared.
  3. Kayla McCue

    Program in Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, United States
    Competing interests
    No competing interests declared.
  4. Adam Burkholder

    Center for Integrative Bioinformatics, National Institute for Environmental Health Sciences, Durham, United States
    Competing interests
    No competing interests declared.
  5. Karen Adelman

    Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, United States
    Competing interests
    Karen Adelman, Reviewing editor, eLife.
    ORCID: 0000-0001-5364-334X
  6. Christopher B Burge

    Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
    For correspondence
    cburge@mit.edu
    Competing interests
    No competing interests declared.
    ORCID: 0000-0001-9047-5648

Funding

National Institutes of Health (Z01-ES101987)

  • Telmo Henriques
  • Adam Burkholder
  • Karen Adelman

National Institutes of Health (R01-GM085319)

  • Athma A Pai
  • Christopher B Burge

Jane Coffin Childs Memorial Fund for Medical Research

  • Athma A Pai

U.S. Department of Energy (FG02-97ER25308)

  • Kayla McCue

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Timothy W Nilsen, Case Western Reserve University, United States

Publication history

  1. Received: October 5, 2017
  2. Accepted: December 22, 2017
  3. Accepted Manuscript published: December 27, 2017 (version 1)
  4. Version of Record published: January 10, 2018 (version 2)

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Cite this article

Athma A Pai, Telmo Henriques, Kayla McCue, Adam Burkholder, Karen Adelman, Christopher B Burge (2017)
The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture
eLife 6:e32537.
https://doi.org/10.7554/eLife.32537
