Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.

Editors

  • Reviewing Editor
    Xiaohua Shen
    Tsinghua University, Beijing, China
  • Senior Editor
    David Ron
    University of Cambridge, Cambridge, United Kingdom

Reviewer #1 (Public review):

This work compiles a comprehensive atlas of ncORFs across mammalian tissues and cell types, derived from reanalysis of ~400 public ribosome profiling datasets. The authors then evaluate cross-species conservation and functional signatures, proposing that evolutionarily ancient ncORFs tend to have higher translation potential, stronger expression, and closer relationships with canonical coding sequences.

Strengths:

In general, the study provides a large-scale and timely resource of annotated ncORFs, which could be broadly useful for the community. The authors collected ~400 public ribosome profiling datasets to annotate ncORFs, which, to the best of my knowledge, is the largest collection of data assembled for this purpose. The catalog could facilitate future investigations into ncORF biology and broaden understanding of the coding potential of the "non-coding" genome.

Weaknesses:

Some of the analyses built on the ncORF catalog were not properly performed, and some of the results are purely descriptive.

(1) Bias and representations of the data source. Public ribo-seq datasets are unevenly distributed across tissues and cell lines, raising concerns about heterogeneity and underrepresentation of certain contexts. This may limit the generalizability of the catalog.

(2) The discussion on modular domains of ncORFs is unclear, and the claim that they may originate via TE-related mechanisms is not well supported. Stronger evidence or clearer reasoning is needed.

(3) The conservation comparisons are not fully convincing. Figure S7 shows only mild differences between ncORFs and CDS, and statistical significance is not clearly demonstrated. Comparisons with other non-coding RNAs should be added, and overlapping sequences between ncORFs and CDS should be excluded to avoid bias.

(4) Figure 3 indicates that some ncORFs are subject to evolutionary constraint, which is not surprising. Small ORFs, especially uORFs, have been extensively studied for their functions and cross-species conservation, and informative work has already been done in Drosophila, worms, mice, and humans. More granular analyses contrasting the detailed features of "conserved" versus "non-conserved" ncORFs would be informative. The authors should explicitly show what is new in their analyses.

(5) Translation levels are reported using RPF counts. However, translation efficiency (normalized by RNA expression) is a more appropriate measure to account for expression heterogeneity.

(6) The correlation analyses between ncORF translation levels and PhyloCSF are confusing and largely descriptive. These sections need sharper framing and clearer conclusions.

(7) Public ribo-seq datasets, generated by different research labs, are known for their strong batch effects. Representations of tissues and cells are also very unbalanced. Therefore, the co-translation analysis between ncORFs and canonical CDS is not well controlled. This should be done by referring to a recent large-scale ribo-seq meta-analysis (Nat Biotechnol. 2025. doi: 10.1038/s41587-025-02718-5).
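
To illustrate point (5), the normalization in question can be sketched as follows. This is a minimal sketch, not the authors' pipeline: the function name, the counts-per-million scaling, and the pseudocount are assumptions chosen for illustration, and it assumes RPF and RNA-seq reads are counted over the same ORF region so that ORF length cancels out.

```python
import math

def log2_translation_efficiency(rpf_count, rna_count,
                                rpf_library_size, rna_library_size,
                                pseudocount=1.0):
    """Hypothetical helper: log2 ratio of ribosome footprint density to RNA abundance.

    Both counts are scaled to counts per million (CPM) of their own library,
    then a pseudocount is added to avoid division by zero.
    """
    rpf_cpm = rpf_count / rpf_library_size * 1e6
    rna_cpm = rna_count / rna_library_size * 1e6
    return math.log2((rpf_cpm + pseudocount) / (rna_cpm + pseudocount))

# Two ORFs with identical RPF counts but different RNA levels:
# raw RPF counts would rank them equally, translation efficiency does not.
te_low_rna = log2_translation_efficiency(200, 100, 1_000_000, 1_000_000, pseudocount=0.0)
te_high_rna = log2_translation_efficiency(200, 400, 1_000_000, 1_000_000, pseudocount=0.0)
print(te_low_rna, te_high_rna)
```

In this toy case the first ORF has twice as many footprints as RNA reads per million and the second has half as many, so their translation efficiencies differ even though their RPF counts are the same.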

Reviewer #2 (Public review):

Summary:

Chang et al. attempted to analyze a large number of ribo-seq datasets through a standardized pipeline, identifying novel non-canonical ORFs and elucidating their evolutionary and expression characteristics.

Strengths:

(1) The datasets analyzed by the authors are sufficiently comprehensive, and the use of standardized pipelines ensures excellent analytical consistency.

(2) Their analyses of ORF evolution and co-expression further deepen our understanding of these ORFs.

Weaknesses:

(1) The authors primarily conducted analyses through bioinformatics, lacking sufficient wet-lab experimental evidence.

(2) Regarding the evolution of non-canonical ORFs, a considerable amount of prior work already exists. The authors need to further clarify what new insights and discoveries they have made based on the analysis of such a large dataset.
