Unraveling the influences of sequence and position on yeast uORF activity using massively parallel reporter systems and machine learning

Abstract

Upstream open reading frames (uORFs) are potent cis-acting regulators of mRNA translation and nonsense-mediated decay (NMD). While both AUG- and non-AUG initiated uORFs are ubiquitous in ribosome profiling studies, few uORFs have been experimentally tested. Consequently, the relative influences of sequence, structural, and positional features on uORF activity have not been determined. We quantified thousands of yeast uORFs using massively parallel reporter assays in wildtype and ∆upf1 yeast. While nearly all AUG uORFs were robust repressors, most non-AUG uORFs had relatively weak impacts on expression. Machine learning regression modeling revealed that both uORF sequences and locations within transcript leaders predict their effect on gene expression. Indeed, alternative transcription start sites highly influenced uORF activity. These results define the scope of natural uORF activity, identify features associated with translational repression and NMD, and suggest that the locations of uORFs in transcript leaders are nearly as predictive as uORF sequences.

Data availability

Sequencing data have been deposited in NCBI SRA under accession PRJNA721222.

Article and author information

Author details

  1. Gemma E May

    Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
  2. Christina Akirtava

    Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
  3. Matthew Agar-Johnson

    Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
  4. Jelena Micic

    Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
  5. John Woolford

    Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, United States
    Competing interests
    The authors declare that no competing interests exist.
  6. Joel McManus

    Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, United States
    For correspondence
    mcmanus@andrew.cmu.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-6605-2642

Funding

National Institutes of Health (R01GM121895)

  • Gemma E May
  • Christina Akirtava
  • Matthew Agar-Johnson
  • Joel McManus

National Institutes of Health (R35GM145317)

  • Gemma E May
  • Christina Akirtava
  • Joel McManus

National Institutes of Health (R01GM028301)

  • Jelena Micic
  • John Woolford

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

© 2023, May et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Gemma E May, Christina Akirtava, Matthew Agar-Johnson, Jelena Micic, John Woolford, Joel McManus (2023)
Unraveling the influences of sequence and position on yeast uORF activity using massively parallel reporter systems and machine learning
eLife 12:e69611.
https://doi.org/10.7554/eLife.69611
