Unraveling the influences of sequence and position on yeast uORF activity using massively parallel reporter systems and machine learning
Abstract
Upstream open reading frames (uORFs) are potent cis-acting regulators of mRNA translation and nonsense-mediated decay (NMD). While both AUG- and non-AUG-initiated uORFs are ubiquitous in ribosome profiling studies, few uORFs have been experimentally tested. Consequently, the relative influences of sequence, structural, and positional features on uORF activity have not been determined. We quantified the activity of thousands of yeast uORFs using massively parallel reporter assays in wild-type and ∆upf1 yeast. While nearly all AUG uORFs were robust repressors, most non-AUG uORFs had relatively weak impacts on expression. Machine learning regression modeling revealed that both uORF sequences and their locations within transcript leaders predict their effect on gene expression. Indeed, alternative transcription start sites strongly influenced uORF activity. These results define the scope of natural uORF activity, identify features associated with translational repression and NMD, and suggest that the locations of uORFs in transcript leaders are nearly as predictive as uORF sequences.
Data availability
Sequencing data have been deposited in NCBI SRA under accession PRJNA721222.
Article and author information
Funding
National Institutes of Health (R01GM121895)
- Gemma E May
- Christina Akirtava
- Matthew Agar-Johnson
- Joel McManus
National Institutes of Health (R35GM145317)
- Gemma E May
- Christina Akirtava
- Joel McManus
National Institutes of Health (R01GM028301)
- Jelena Micic
- John Woolford
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Reviewing Editor
- Christian R Landry, Université Laval, Canada
Publication history
- Received: April 20, 2021
- Accepted: May 24, 2023
- Accepted Manuscript published: May 25, 2023 (version 1)
Copyright
© 2023, May et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.