(A) We detected utORFs by searching via a two-round approach through a comprehensive database of potential open reading frames (ORFs) that was generated from a six-frame translation of the …
All unannotated translated open reading frame (utORF) sequences.
Unannotated translated open reading frame (utORF) supporting peptides.
Unannotated translated open reading frame (utORF) locations.
(A) When performing a simple homology search for a locus of interest (red arrow) across a given genome (blue line), the search space is orders of magnitude larger, requiring heuristic shortcuts to …
(A) The reference phylogenetic tree used for these analyses (UCSC 27-way insect alignment). Abbreviations are as follows: D. mel: Drosophila melanogaster, D. sim: D. simulans, D. sec: D. sechellia, D…
(A) Figure 3A, but also showing the melanogaster species subgroup, species group, and Drosophila taxa. (B) Change in furthest significant ortholog (using a significance threshold of 2.3 instead of …
(A) Class 1 is distinguished by a strong bias toward intergenic and antisense locations at the expense of sense locations. Class 2 is notable for being relatively unbiased and for being the only …
Unannotated translated open reading frame (utORF) inferred latent class analysis (LCA) classes.
(A–E) Same as Figure 4 but examining utORFs with canonical start sites.
(A) As expected, phastCons conservation scores vary by class. Scores near 0 indicate low conservation, while scores near 1 indicate high conservation. Note that fast-evolving and melanogaster-specifi…
Top panel: utORFs, separated by inferred latent class analysis (LCA) class. Mean TPMs across the given tissue in FlyAtlas2 are log10-transformed with a pseudocount of 1E-3. Horizontal line marks an …
(A–G) Same as Figure 5 but examining utORFs with canonical start sites.
(A) Proportion of utORFs by inferred class with genomic conservation consistent with de novo origin. Box widths correlate with size of class (Table 1). (B) Number of supporting outgroups by inferred …
(A) Cumulative distribution of differences between observed and predicted retention times for peptide-spectrum matches (PSMs) of peptides supporting annotated FlyBase proteins (orange) and PSMs of …
Mascot search results from embryo mass spectrometry (MS) data.
Rank of potentially biologically significant targets.
Dataset subset mappings.
Box widths correlate with size of class (Table 1).
Class | Interpretation | Estimated percent | Number of utORFs |
---|---|---|---|
1 | Putatively nonfunctional loci | 4.35% | 41 |
2 | melanogaster-specific ORFs | 5.71% | 54 |
3 | Fast-evolving ORFs | 12.03% | 96 |
4 | General unannotated ORFs | 57.61% | 591 |
5 | Alternative-frame ORFs | 20.30% | 161 |
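
The estimated percentages in the table do not have to match the raw share of utORFs assigned to each class. A plausible reading, which is an assumption here and is not stated in the table, is that they are model-estimated class prevalences from the latent class analysis, whereas the counts reflect per-utORF class assignments. The minimal sketch below recomputes the count-based share for each class so the two columns can be compared side by side. The variable and function names are illustrative only.

```python
# Counts and estimated percentages copied from the class table above.
counts = {1: 41, 2: 54, 3: 96, 4: 591, 5: 161}
estimated = {1: 4.35, 2: 5.71, 3: 12.03, 4: 57.61, 5: 20.30}


def count_shares(class_counts):
    """Return each class's share of the total count, in percent."""
    total = sum(class_counts.values())
    return {cls: 100.0 * n / total for cls, n in class_counts.items()}


shares = count_shares(counts)

# Print both columns and their difference for each class.
for cls in sorted(counts):
    diff = shares[cls] - estimated[cls]
    print(f"class {cls}: count share {shares[cls]:.2f}%, "
          f"estimated {estimated[cls]:.2f}%, difference {diff:+.2f}")
```

A nonzero difference for a class is therefore not necessarily an error in the table. It may simply reflect the gap between estimated prevalence and assigned membership under the assumption stated above.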
Supplementary tables 1A–1E.