Many lncRNAs, 5’UTRs, and pseudogenes are translated and some are likely to express functional proteins

  1. Zhe Ji
  2. Ruisheng Song
  3. Aviv Regev
  4. Kevin Struhl (corresponding author)
  1. Harvard Medical School, United States
  2. Broad Institute of MIT and Harvard, United States
  3. Howard Hughes Medical Institute, Massachusetts Institute of Technology, United States
7 figures and 3 additional files

Figures

Figure 1 with 1 supplement
Ribosome profiling reveals in vivo translation with single nucleotide resolution.

(A) Ribosome profiling experiment. (B) Read distribution (reads/million mappable reads; RPM) around start and stop codons of canonical protein coding genes. (C) Fractions of reads in 1st, 2nd and 3rd

https://doi.org/10.7554/eLife.08890.003
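
The RPM unit named in panel B (reads per million mappable reads) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `rpm`, the position offsets, and the read counts below are hypothetical values chosen only to show the arithmetic.

```python
def rpm(read_count: int, total_mappable_reads: int) -> float:
    """Normalize a raw read count to reads per million mappable reads."""
    if total_mappable_reads <= 0:
        raise ValueError("total_mappable_reads must be positive")
    return read_count * 1_000_000 / total_mappable_reads


# Hypothetical raw counts at offsets (in nucleotides) relative to a start codon.
position_counts = {-3: 10, 0: 250, 3: 120}
total_mappable_reads = 20_000_000  # hypothetical library size

# RPM profile around the start codon, keyed by offset.
profile = {pos: rpm(n, total_mappable_reads) for pos, n in position_counts.items()}
```

Normalizing by library size in this way lets read distributions from libraries of different sequencing depth be compared on the same scale.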
Figure 1—figure supplement 1
Ribosome profiling data.

(A) RPF length distribution. (B) The read distribution of RPFs around start and stop codons of canonical mRNA ORFs. RPFs were grouped based on their length.

https://doi.org/10.7554/eLife.08890.004
Figure 2 with 2 supplements
RibORF identifies translating ORFs.

(A) Receiver-operating characteristic (ROC) curves to measure algorithm performance using different training parameters. (B) Types of translated ORFs identified in this study, with ORF number:gene …

https://doi.org/10.7554/eLife.08890.005
Figure 2—figure supplement 1
RibORF algorithm performance.

(A) ORFs were grouped based on expression levels, and corresponding AUC values were plotted as in Figure 2A. (B) Correlation of predicted translating probability of candidate ORFs, using ribosome …

https://doi.org/10.7554/eLife.08890.006
Figure 2—figure supplement 2
Analysis of ribosome-associated RNA.

(A) Sucrose gradient fractionation of polyribosomes with fractions indicated. (B) Analysis of RNAs associated with 80S monoribosomes (fraction 1) and polyribosomes with 2 (fraction 2) or 3+ …

https://doi.org/10.7554/eLife.08890.007
Figure 3 with 1 supplement
RNA subcellular localization is a major determinant of translation efficiency.

(A) RNA expression levels of lncRNAs with or without translated ORFs and canonical mRNAs in MCF10A-ER-Src cells. (B) Relative subcellular location of translated and untranslated lncRNAs and …

https://doi.org/10.7554/eLife.08890.008
Figure 3—figure supplement 1
RNA subcellular localization regulates translation.

(A) RNA expression levels of expressed lncRNAs with or without translated ORFs and mRNAs in fibroblast cells measured by RNA-seq. (B) Translation efficiency of translated ORFs in lncRNAs and …

https://doi.org/10.7554/eLife.08890.009
Figure 4 with 6 supplements
Features and conservation of lncRNA peptides.

(A) Fraction of expressed lncRNAs that encode peptides longer than a certain length. (B) Peptide length encoded by lncRNAs. (C) Length of the longest peptide in a given lncRNA. (D) Length of …

https://doi.org/10.7554/eLife.08890.010
Figure 4—figure supplement 1
Features of lncRNA translation.

(A) Start codon of translated ORFs in lncRNAs and mRNAs. (B) Start codon of translated ORFs in lncRNAs grouped based on length. (C) Length of the longest candidate ORFs in a given lncRNA considering …

https://doi.org/10.7554/eLife.08890.011
Figure 4—figure supplement 2
Conservation of nucleotides encoding lncRNA and pseudogene peptides.

(A) PhastCons scores of nucleotides encoding lncRNA peptides grouped based on length. The median PhastCons value of translated ORFs in each group is shown. The PhastCons scores of random untranslated …

https://doi.org/10.7554/eLife.08890.012
Figure 4—figure supplement 3
Coding potential of nucleotides encoding lncRNA and pseudogene peptides.

(A) PhyloCSF scores of nucleotides encoding lncRNA peptides grouped based on length. The PhyloCSF scores of random untranslated sequences of matching sizes and locations are also plotted. Wilcoxon …

https://doi.org/10.7554/eLife.08890.013
Figure 4—figure supplement 4
BLASTP E-values of peptide sequences encoded by homologous human and mouse ORFs.

(A) LncRNAs. (B) Pseudogene RNAs. BLASTP E-values between human translated ORFs and their randomized sequences are shown as the control.

https://doi.org/10.7554/eLife.08890.014
Figure 4—figure supplement 5
BLASTP E-values of peptide sequences encoded by homologous human and mouse ORFs.

(A) uORFs. (B) Overlapping uORFs. (C) Internal ORFs. (D) dORFs. BLASTP E-values between human translated ORFs and their randomized sequences are shown as the control.

https://doi.org/10.7554/eLife.08890.015
Figure 4—figure supplement 6
The Ka/Ks ratios between human translated ORFs and 50 randomly generated sequences with BLASTP alignment E-value < 10⁻⁴.

(A) ORFs < 50 aa. (B) ORFs ≥ 50 aa.

https://doi.org/10.7554/eLife.08890.016
Figure 5
Features and conservation of pseudogene peptides.

(A) Fraction of expressed pseudogenes that encode peptides longer than a certain length. (B) Peptide length encoded by pseudogenes. (C) Length of the longest peptide in a given pseudogene. (D) …

https://doi.org/10.7554/eLife.08890.017
Figure 6 with 1 supplement
Features of ORFs encoded by protein coding genes.

(A) Length distribution of peptides encoded by human protein coding genes. (B) Relative translation efficiency comparing non-canonical ORF vs. canonical ORF from the same gene. (C) Translation …

https://doi.org/10.7554/eLife.08890.018
Figure 6—figure supplement 1
Example genes showing high translation of uORFs.

(A) RELA. (B) PTEN. (C) DICER1. Enlarged figures show supporting read distribution in uORFs.

https://doi.org/10.7554/eLife.08890.019
Figure 7 with 2 supplements
Conservation of non-canonical peptides encoded by mRNAs.

(A) Fraction of human mRNA peptides conserved in mouse. (B) Ka and Ks values of conserved mRNA peptides with Z-Test p-values shown. (C) Ka/Ks ratios of conserved mRNA peptides.

https://doi.org/10.7554/eLife.08890.020
Figure 7—figure supplement 1
Conservation of nucleotides encoding uORF and dORF peptides.

(A,B) PhastCons scores of nucleotides in uORFs (A) and dORFs (B) and their neighboring untranslated sequences of matching size and location (see Methods for details) are plotted. (C,D) PhyloCSF …

https://doi.org/10.7554/eLife.08890.021
Figure 7—figure supplement 2
Examples of conserved uORF peptides.
https://doi.org/10.7554/eLife.08890.022

Additional files

Supplementary file 1

Identified non-canonical human translated ORFs.

https://doi.org/10.7554/eLife.08890.023
Supplementary file 2

Human non-canonical peptides conserved in mouse.

https://doi.org/10.7554/eLife.08890.024
Supplementary file 3

uORFs and dORFs with high translational efficiency (>three-fold higher than canonical ORFs).

https://doi.org/10.7554/eLife.08890.025
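
The three-fold criterion for Supplementary file 3 can be sketched as a simple ratio filter. This is only an illustration under stated assumptions: translation efficiency values are taken as already computed, and the function name `highly_translated`, the ORF identifiers, and the numbers are hypothetical rather than taken from the study.

```python
def highly_translated(te_by_orf, fold=3.0):
    """Return ORF ids whose translation efficiency (TE) exceeds `fold` times
    the TE of the canonical ORF of the same gene.

    te_by_orf maps an ORF id to (non-canonical TE, canonical TE).
    """
    selected = []
    for orf_id, (te_noncanonical, te_canonical) in te_by_orf.items():
        # Skip genes with no measurable canonical TE to avoid division by zero.
        if te_canonical <= 0:
            continue
        if te_noncanonical / te_canonical > fold:
            selected.append(orf_id)
    return selected


# Hypothetical TE pairs: (uORF or dORF TE, canonical ORF TE of the same gene).
example = {
    "uORF_a": (9.0, 2.0),  # ratio 4.5, passes
    "dORF_b": (3.0, 2.0),  # ratio 1.5, fails
    "uORF_c": (6.0, 2.0),  # ratio exactly 3.0, fails the strict ">" test
}
hits = highly_translated(example)
```

The comparison is strict because the file description specifies efficiency greater than three-fold that of the canonical ORF.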
