Thousands of novel translated open reading frames in humans inferred by ribosome footprint profiling

  1. Anil Raj (corresponding author)
  2. Sidney H Wang
  3. Heejung Shim
  4. Arbel Harpak
  5. Yang I Li
  6. Brett Engelmann
  7. Matthew Stephens
  8. Yoav Gilad
  9. Jonathan K Pritchard
  1. Stanford University, United States
  2. University of Chicago, United States
  3. Purdue University, United States

Abstract

Accurate annotation of protein-coding regions is essential for understanding how genetic information is translated into function. We describe riboHMM, a new method that uses ribosome footprint data to accurately infer translated sequences. Applying riboHMM to human lymphoblastoid cell lines, we identified 7,273 novel coding sequences, including 2,442 translated upstream open reading frames. We observed an enrichment of footprints at inferred initiation sites after drug-induced arrest of translation initiation, validating many of the novel coding sequences. The novel proteins exhibit significant selective constraint in the inferred reading frames, suggesting that many are functional. Moreover, ~40% of bicistronic transcripts showed negative correlation in the translation levels of their two coding sequences, suggesting a potential regulatory role for these novel regions. Despite the known limitations of mass spectrometry in detecting proteins expressed at low levels, we estimated a 14% validation rate. Our work significantly expands the set of known coding regions in humans.

Article and author information

Author details

  1. Anil Raj

    Department of Genetics, Stanford University, Stanford, United States
    For correspondence: rajanil@stanford.edu
    Competing interests
    The authors declare that no competing interests exist.
  2. Sidney H Wang

    Department of Human Genetics, University of Chicago, Chicago, United States
    Competing interests
    The authors declare that no competing interests exist.
  3. Heejung Shim

    Department of Statistics, Purdue University, West Lafayette, United States
    Competing interests
    The authors declare that no competing interests exist.
  4. Arbel Harpak

    Department of Biology, Stanford University, Stanford, United States
    Competing interests
    The authors declare that no competing interests exist.
  5. Yang I Li

    Department of Genetics, Stanford University, Stanford, United States
    Competing interests
    The authors declare that no competing interests exist.
  6. Brett Engelmann

    Department of Human Genetics, University of Chicago, Chicago, United States
    Competing interests
    The authors declare that no competing interests exist.
  7. Matthew Stephens

    Department of Human Genetics, University of Chicago, Chicago, United States
    Competing interests
    The authors declare that no competing interests exist.
  8. Yoav Gilad

    Department of Human Genetics, University of Chicago, Chicago, United States
    Competing interests
    The authors declare that no competing interests exist.
  9. Jonathan K Pritchard

    Department of Genetics, Stanford University, Stanford, United States
    Competing interests
    The authors declare that no competing interests exist.

Reviewing Editor

  1. Nicholas T Ingolia, University of California, Berkeley, United States

Version history

  1. Received: December 24, 2015
  2. Accepted: May 26, 2016
  3. Accepted Manuscript published: May 27, 2016 (version 1)
  4. Version of Record published: July 11, 2016 (version 2)
  5. Version of Record updated: July 12, 2016 (version 3)

Copyright

© 2016, Raj et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Anil Raj, Sidney H Wang, Heejung Shim, Arbel Harpak, Yang I Li, Brett Engelmann, Matthew Stephens, Yoav Gilad, Jonathan K Pritchard (2016)
Thousands of novel translated open reading frames in humans inferred by ribosome footprint profiling
eLife 5:e13328.
https://doi.org/10.7554/eLife.13328
