Protein evidence of unannotated ORFs in Drosophila reveals diversity in the evolution and properties of young proteins

  1. Eric B Zheng
  2. Li Zhao (corresponding author)

Rockefeller University, United States

Abstract

De novo gene origination, where a previously non-genic genomic sequence becomes genic through evolution, has been increasingly recognized as an important source of evolutionary novelty across diverse taxa. Many de novo genes have been proposed to be protein-coding, and in several cases have been experimentally shown to yield protein products. However, the systematic study of de novo proteins has been hampered by doubts regarding the translation of their transcripts without the experimental observation of protein products. Using a systematic, ORF-focused mass-spectrometry-first computational approach, we identify almost 1000 unannotated open reading frames with evidence of translation (utORFs) in the model organism Drosophila melanogaster, 371 of which have canonical start codons. To quantify the comparative genomic similarity of these utORFs across Drosophila and to infer phylostratigraphic age, we further develop a synteny-based protein similarity approach. Combining these results with reference datasets on tissue- and life-stage-specific transcription and conservation, we identify different properties amongst these utORFs. Contrary to expectations, the fastest-evolving utORFs are not the youngest evolutionarily. We observe more utORFs in the brain than in the testis. Most of the identified utORFs may be of de novo origin, even accounting for the possibility of false-negative similarity detection. Finally, sequence divergence after an inferred de novo origin event remains substantial, raising the possibility that de novo proteins turn over frequently. Our results suggest that there is substantial unappreciated diversity in de novo protein evolution: many more may exist than have been previously appreciated; there may be divergent evolutionary trajectories; and de novo proteins may be gained and lost frequently. Overall, there may not exist a single characteristic model of de novo protein evolution; instead, there may be diverse evolutionary trajectories for de novo proteins.

Data availability

Raw MS data are deposited in PRIDE under accession number PXD032197. Relevant scripts and intermediate files can be found in our GitHub repository https://github.com/LiZhaoLab/utORF_mass_spec.

Article and author information

Author details

  1. Eric B Zheng

    Laboratory of Evolutionary Genetics and Genomics, Rockefeller University, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
  2. Li Zhao

    Laboratory of Evolutionary Genetics and Genomics, Rockefeller University, New York, United States
    For correspondence
    lzhao@rockefeller.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0001-6776-1996

Funding

National Institute of General Medical Sciences (R35GM133780)

  • Li Zhao

National Institute of General Medical Sciences (T32GM007739)

  • Eric B Zheng

Robertson Foundation

  • Li Zhao

Rita Allen Foundation (Rita Allen Foundation Scholar)

  • Li Zhao

Vallee Foundation (Vallee Scholar)

  • Li Zhao

Monique Weill-Caulier Trust

  • Li Zhao

Alfred P. Sloan Foundation (Alfred P. Sloan Research Fellowship)

  • Li Zhao

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

© 2022, Zheng & Zhao

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Eric B Zheng
  2. Li Zhao
(2022)
Protein evidence of unannotated ORFs in Drosophila reveals diversity in the evolution and properties of young proteins
eLife 11:e78772.
https://doi.org/10.7554/eLife.78772
