(a) AltORF nomenclature. AltORFs partially overlapping the CDS must be in a different reading frame. (b) Pipeline for the identification of altORFs. (c) Size distribution of alternative (empty bars, …
Annotation of human altORFs.
While more than half of the human genome is composed of repeated sequences, only 9.83% of altORFs (18,003) are located inside these repeats (a), compared to 2.45% of CDSs (1,677) (b). AltORFs and CDSs …
10% of altORFs are present in different classes of repeats.
Percentage of altORFs with a TIS within an optimal Kozak sequence in hg38 (dark blue) compared to 100 shuffled hg38 (light blue). Mean and standard deviations for sequence shuffling are displayed, …
Proportion of altORFs with a Kozak motif in hg38 and shuffled hg38.
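The comparison in this legend (observed fraction of Kozak-matching TISs versus the mean and standard deviation over 100 shuffles) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the motif (purine at -3, G at +4), the 7-nt context window, and the shuffling scheme (permuting flanking nucleotides across contexts while keeping the ATG) are all assumptions, and the function names are hypothetical.

```python
import random
import re
import statistics

# Assumed motif: purine at -3 and G at +4 around the ATG (not taken from the source).
KOZAK = re.compile(r"[AG]..ATGG")

def kozak_fraction(contexts):
    """Fraction of 7-nt TIS contexts (positions -3 to +4) matching the assumed motif."""
    return sum(bool(KOZAK.fullmatch(c)) for c in contexts) / len(contexts)

def shuffled_fractions(contexts, n_shuffles=100, seed=0):
    """Kozak fraction after permuting the flanking nucleotides across all contexts."""
    rng = random.Random(seed)
    # Pool the 4 flanking nucleotides of every context; the ATG itself is kept fixed.
    flanks = [nt for c in contexts for nt in c[:3] + c[6:]]
    fractions = []
    for _ in range(n_shuffles):
        rng.shuffle(flanks)
        rebuilt = [
            "".join(flanks[i:i + 3]) + "ATG" + flanks[i + 3]
            for i in range(0, len(flanks), 4)
        ]
        fractions.append(kozak_fraction(rebuilt))
    return fractions

# Toy contexts, for illustration only.
contexts = ["ACCATGG", "TTTATGC", "GCCATGG", "CCCATGA"]
observed = kozak_fraction(contexts)
null = shuffled_fractions(contexts)
print(observed, statistics.mean(null), statistics.stdev(null))
```

Shuffling only the flanking positions preserves base composition, so any excess in the observed fraction reflects nucleotide order rather than nucleotide content.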
(a) Number of orthologous and paralogous alternative and reference proteins between H. sapiens and other species (pairwise study). (b) Phylogenetic tree: conservation of alternative (blue) and …
Conservation of alternative and reference proteins across different species.
Differences between altORF and CDS PhyloP scores (altORF PhyloP – CDS PhyloP, y-axis) are plotted against PhyloPs for their respective CDSs (x-axis). We restricted the analysis to altORF-CDS pairs …
Number of orthologous and co-conserved alternative and reference proteins between H. sapiens and other species (pairwise).
Chromosomal coordinates for the different CDSs and altORFs are indicated on the right. The regions highlighted in red indicate the presence of an altORF characterized by a region with elevated …
AltORFs completely nested within CDSs show more extreme PhyloP values (more conserved or faster evolving) than their CDSs.
(a) Percentage of CDSs and altORFs with detected TISs by ribosomal profiling and footprinting of human cells (Iacono et al., 2005). The total number of CDSs and altORFs with a detected TIS is …
Expression of human altORFs.
Example of validation for altSLC35A45’-specific peptide RVEDEVNSGVGQDGSLLSSPFLK. (a) Experimental MS/MS spectra (PeptideShaker graphic interface output). (b) MS/MS spectra of the synthetic peptide. …
Example of validation for altRELT5’-specific peptide VALELLK. (a) Experimental MS/MS spectra (PeptideShaker graphic interface output). (b) MS/MS spectra of the synthetic peptide. Matching peaks are …
Example of validation for altLINC01420nc-specific peptide WDYPEGTPNGGSTTLPSAPPPASAGLK. (a) Experimental MS/MS spectra (PeptideShaker graphic interface output). (b) MS/MS spectra of the synthetic …
Example of validation for altSRRM2CDS-specific peptide EVILDPDLPSGVGPGLHR. (a) Experimental MS/MS spectra (PeptideShaker graphic interface output). (b) MS/MS spectra of the synthetic peptide. …
Heatmap showing relative levels of spectral counts for phosphorylated peptides following the indicated treatment (Sharma et al., 2014). For each condition, heatmap colors show the percentage of …
(a) AltLINC01420nc amino acid sequence with detected peptides underlined and phosphorylated peptide in bold (73.9% sequence coverage). (b) MS/MS spectrum for the phosphorylated peptide …
The expression of 467 alternative proteins was detected by both ribosome profiling (translation initiation sites, TIS) and mass spectrometry (MS).
Number of alternative proteins detected by ribosome profiling and mass spectrometry.
(a) InterPro annotation pipeline. (b) Alternative and reference proteins with InterPro signatures. (c) Number of alternative and reference proteins with transmembrane domains (TM), signal peptides …
For each organism, the number of InterPro signatures (top graphs) and proteins with transmembrane (TM), signal peptide (SP), or TM+SP features (bottom pie charts) is indicated for alternative and …
Alternative proteome sequence analysis and classification in P. troglodytes, M. musculus, B. taurus, D. melanogaster and S. cerevisiae.
GO terms assigned to InterPro entries are grouped into 13 categories for each of the three ontologies. (a) 34 GO terms were categorized into cellular component for 107 alternative proteins. (b) 64 …
Gene ontology (GO) annotations for human alternative proteins.
(a) The top 10 InterPro families in the human alternative proteome. (b) A total of 110 alternative proteins have between 1 and 23 zinc finger domains.
Main InterPro entries in human alternative proteins.
(a) Distribution of the number of identical InterPro entries co-occurring between alternative and reference proteins coded by the same transcripts. 138 pairs of alternative and reference proteins …
Distribution of the percentage of sequence identity and overlap between alternative-reference protein pairs with (20) or without (80) identical InterPro signatures.
Pixels show the number of times entries co-occur in reference and alternative proteins. Blue pixels indicate that these domains do not co-occur, white pixels indicate that they co-occur once, and …
The number of reference/alternative protein pairs with identical domains (n = 49) is higher than expected by chance alone (p<0.001). The distribution of expected pairs with identical domains and the …
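One way to compare an observed count of pairs with identical domains against chance is a permutation test. The sketch below is an assumption about how such a null distribution could be built (re-pairing alternative proteins with reference proteins at random); the source does not state the procedure in this excerpt, and the function names and toy data are hypothetical.

```python
import random

def count_shared(ref_domains, alt_domains):
    """Number of pairs whose reference and alternative proteins share at least one domain."""
    return sum(bool(r & a) for r, a in zip(ref_domains, alt_domains))

def permutation_p_value(ref_domains, alt_domains, n_perm=1000, seed=0):
    """Observed count and empirical one-sided p-value from random re-pairing."""
    rng = random.Random(seed)
    observed = count_shared(ref_domains, alt_domains)
    shuffled = list(alt_domains)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the true reference/alternative pairing
        if count_shared(ref_domains, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data: each set holds the domain identifiers of one protein.
ref = [{"A"}, {"B"}, {"C"}, {"D"}]
alt = [{"A"}, {"B"}, {"C"}, {"D"}]
observed, p = permutation_p_value(ref, alt)
print(observed, p)
```

With 1,000 permutations the smallest attainable p-value is 1/1001, which is the resolution needed to report p<0.001.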
There is no significant difference between the two groups (p-value=0.6272; Kolmogorov-Smirnov test). We conclude that there is no significant association between identity/overlap and functional …
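For readers unfamiliar with the two-sample Kolmogorov-Smirnov test, the statistic it is based on is the largest vertical gap between the two empirical cumulative distribution functions. The following is a minimal standard-library sketch of that statistic only; it does not compute the p-value, and the function name and toy values are illustrative assumptions.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max absolute difference between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The maximum gap between two step functions is reached at an observed value.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Toy percentages of identity for two hypothetical groups of protein pairs.
with_signature = [30.0, 45.0, 52.0, 70.0]
without_signature = [28.0, 47.0, 55.0, 66.0, 80.0]
print(ks_statistic(with_signature, without_signature))
```

A small statistic means the two distributions are nearly superimposable, which is the situation a non-significant test result describes.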
(a) AltMiD515’ coding sequence is located in exon two of the MiD51/MIEF1/SMCR7L gene and in the 5’UTR of the canonical mRNA (RefSeq NM_019008). +2 and +1 indicate reading frames. AltMiD51 amino acid …
Mitochondrial morphologies in HeLa cells.
Example of validation for altMiD51-specific peptides YTDRDFYFASIR and GLVFLNGK. (a,c) Experimental MS/MS spectra (PeptideShaker graphic interface output). (b,d) MS/MS spectra of the synthetic …
(a) Confocal microscopy of HeLa cells transfected with MiD51GFP immunostained with anti-TOM20 (red channel) monoclonal antibodies. In each image, boxed areas are shown at higher magnification in the …
Trypan blue quenching experiment performed on HeLa cells stably expressing the indicated constructs: Matrix-Venus (Mx-Venus) and Intermembrane space-Venus (IMS-Venus). The fluorescence remaining …
(a) Oxygen consumption rates (OCR) in HeLa cells transfected with empty vector (mock) or altMiD51Flag. Mitochondrial function parameters were assessed in basal conditions (basal), in the presence of …
(a) Confocal microscopy of HeLa cells co-transfected with altMiD51GFP and Drp1(K38A)HA immunostained with anti-TOM20 (blue channel) and anti-HA (red channel) monoclonal antibodies. In each image, …
HeLa cells were transfected with empty vector (pcDNA3.1), altMiD51(WT)Flag, altMiD51(LYR→AAA)Flag, Drp1(K38A)HA, or Drp1(K38A)HA and altMiD51(WT)Flag, as indicated. Proteins were extracted and …
(a) Bar graphs show mitochondrial morphologies in HeLa cells treated with non-target or Drp1 siRNAs. Cells were mock-transfected (pcDNA3.1) or transfected with altMiD51Flag. Means of three …
Mitochondrial morphologies in HeLa cells treated with non-target or Drp1 siRNAs.
(a) AltDDIT35’ coding sequence is located in exons 1 and 2 of the DDIT3/CHOP/GADD153 gene and in the 5’UTR of the canonical mRNA (RefSeq NM_004083.5). +2 and +1 indicate reading frames. AltDDIT3 amino …
HeLa cells were co-transfected with GFP and mCherry, or altDDIT3GFP and DDIT3mCherry, as indicated. Proteins were extracted and analyzed by western blot with antibodies, as indicated. Molecular …
Scatter plots of Pearson’s Correlation Coefficient and Manders’ Correlation Coefficient after Costes’ automatic threshold (p-value<0.001, based on 1000 rounds of Costes’ randomization colocalization …
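The two colocalization measures named in this legend can be illustrated on paired pixel intensities from two channels. This is a minimal sketch under stated assumptions: thresholds are passed in by hand rather than derived by Costes' automatic method, the Manders coefficients use the common definition (fraction of one channel's intensity found where the other channel is above threshold), and all names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length intensity lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def manders(x, y, thr_x=0.0, thr_y=0.0):
    """Manders' M1 and M2 with user-supplied thresholds (assumed, not Costes-derived)."""
    # M1: share of channel-x intensity located where channel y exceeds its threshold.
    m1 = sum(a for a, b in zip(x, y) if b > thr_y) / sum(x)
    # M2: share of channel-y intensity located where channel x exceeds its threshold.
    m2 = sum(b for a, b in zip(x, y) if a > thr_x) / sum(y)
    return m1, m2

# Toy pixel intensities for a green and a red channel.
green = [0.0, 2.0, 4.0]
red = [3.0, 0.0, 1.0]
print(pearson(green, red), manders(green, red))
```

Pearson's coefficient reflects how intensities co-vary, whereas Manders' coefficients reflect how much signal overlaps, so the two can diverge when one channel is much dimmer than the other.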
Genomes | Transcripts: mRNAs | Transcripts: others* | Current annotations: CDSs | Current annotations: proteins | Annotations of alternative protein coding sequences: altORFs | Annotations of alternative protein coding sequences: alternative proteins |
---|---|---|---|---|---|---|
H. sapiens GRCh38 RefSeq GCF_000001405.26 | 67,765 | 11,755 | 68,264 | 54,498 | 539,134 | 183,191 |
P. troglodytes 2.1.4 RefSeq GCF_000001515.6 | 55,034 | 7527 | 55,243 | 41,774 | 416,515 | 161,663 |
M. musculus GRCm38p2, RefSeq GCF_000001635.22 | 73,450 | 18,886 | 73,551 | 53,573 | 642,203 | 215,472 |
B. taurus UMD3.1.86 | 22,089 | 838 | 22,089 | 21,915 | 79,906 | 73,603 |
X. tropicalis Ensembl JGI_4.2 | 28,462 | 4644 | 28,462 | 22,614 | 141,894 | 69,917 |
D. rerio Ensembl ZV10.84 | 44,198 | 8196 | 44,198 | 41,460 | 214,628 | 150,510 |
D. melanogaster RefSeq GCA_000705575.1 | 30,255 | 3474 | 30,715 | 20,995 | 174,771 | 71,705 |
C. elegans WBcel235, RefSeq GCF_000002985.6 | 28,653 | 25,256 | 26,458 | 25,750 | 131,830 | 45,603 |
S. cerevisiae YJM993_v1, RefSeq GCA_000662435.1 | 5471 | 1463 | 5463 | 5423 | 12,401 | 9492 |
*Other transcripts include miRNAs, rRNAs, ncRNAs, snRNAs, snoRNAs, tRNAs.
†Annotated retained-intron and processed transcripts were classified as mRNAs.
Alternative protein accession | Detection method* | Gene | Amino acid sequence | AltORF localization |
---|---|---|---|---|
IP_238718.1 | MS | RP11 | MLVEVACSSCRSLLHKGAGASEDGAALEPAHTGGKENGATT | nc |
IP_278905.1 | RP | ZNF761 | MSVARPLVGSHILYAIIDFILERNLISVMSVARTLVRSHPLYATIDFILERNLTSVMSVARPLVRSQTLHAIVDFILEKNKCNECGEVFNQQAHLAGHHRIHTGEKP | CDS |
IP_278745.1 | MS and RP | ZNF816 | MSVARPSVRNHPFNAIIYFTLERNLTNVKNVTMFTFADHTLKDIGRFILERDHTNVRFVTRFSGVIHTLQNIREFILERNHTSVINVAGVSVGSHPFNTIIHFTLERNLTHVMNVARFLVEEKTLHVIIDFMLERNLTNVKNVTKFSVADHTLKDIGEFILGKNHTNVRFVTRLSGVIHALQTIREFILERNLTSVINVRRFLIKKESLHNIREFILERNLTSVMNVARFLIKKQALQNIREFILQRNLTSVMSVAKPLLDSQHLFTIKQSMGVGKLYKCNDCHKVFSNATTIANHYRIHIEERSTSVINVANFSDVIHNL | CDS |
IP_138289.1 | MS | ZSCAN31 | MNIGGATLERNPINVRSVGKPSVPAMASLDTEESTQGKNHMNAKCVGRLSSSAHALFSIRGYTLERSAISVVSVAKPSFRMQGFSSISESTLVRNPISAVSAVNSLVSGHFLRNIRKSTLERDHKGDEFGKAFSHHCNLIRHFRIHTVPAELD | CDS |
IP_278564.1 | MS | ZNF808 | MIVTKSSVTLQQLQIIGESMMKRNLLSVINVACFSDIVHTLQFIGNLILERNLTNVMIEARSSVKLHPMQNRRIHTGEKPHKCDDCGKAFTSHSHLVGHQRIHTGQKSCKCHQCGKVFSPRSLLAEHEKIHF | 3’UTR |
IP_275012.1 | MS | ZNF780A | MKPCECTECGKTFSCSSNIVQHVKIHTGEKRYNVRNMGKHLLWMISCLNIRKFRIVRNFVTIRSVDKPSLCTKNLLNTRELILMRNLVNIKECVKNFHHGLGFAQLLSIHTSEKSLSVRNVGRFIATLNTLEFGEDNSCEKVFE | 3’UTR |
IP_270595.1† | MS | ZNF440 | MHSVERPYKCKICGRGFYSAKSFQIHEKSYTGEKPYECKQCGKAFVSFTSFRYHERTHTGENPYECKQFGKAFRSVKNLRFHKRTHTGEKPCECKKCRKAFHNFSSLQIHERMHRGEKLCECKHCGKAFISAKIL | CDS |
IP_270643.1† | MS | ZNF763 | MKKLTLERNPINACHVVKPSIFPVPFSIMKGLTLERNPMSVSVGKPSDVPHTFEGMVGLTGEKPYECKECGKAFRSASHLQIHERTQTHIRIHSGERPYKCKTCGKGFYSPTSFQRHEKTHTAEKPYECKQCGKAFSSSSSFWYHERTHTGEKPYECKQCGKAFRSASIQMHAGTHPEEKPYECKQCGKAFRSAPHLRIHGRTHTGEKPYECKECGKAFRSAKNLRIHERTQTHVRMHSVERPYKCKICGKGFYSAKSFQIPEKSYTGEKPYECKQCGKAFISFTSFR | 3’UTR |
IP_270597.1‡ | MS | ZNF440 | MKNLTLERNPMSVSNVGKPLFPSLPFDIMKGLTLERTPMSVSNLGKPSDLSKIFDFIKGHTLERNPVNVRNVEKHSIISLLCKYMKGCTEERSSVNVSIVGKHSYLPRSFEYMQEHTMERNPMNVKNAEKHSACLLPFIDMKRLTLEGNTMNASNVAKLSLLPVLFNIMKEHTREKPYQCKQCAKAFISSTSFQYHERTHMGEKPYECMPSGKAFISSSSLQYHERTHTGEKPYEYKQCGKAFRSASHLQMHGRTHTGEKPYECKQYGKAFRPDKIL | 3’UTR |
IP_270609.1‡ | MS | ZNF439 | MNVSNVAKAFTSSSSFQYHERTHTGEKPYQCKQCGKAVRSASRLQMHGSTHTWQKLYECKQYGKAFRSARIL | 3’UTR |
IP_270663.1‡ | MS | ZNF844 | MHGRTHTQEKPYECKQCGKAFIFSTSFRYHERTHTGEKPYECKQCGKAFRSATQLQMHRKIHTGEKPYECKQCGKAYRSVSQLLVHERTHTVEQPYEYKQYGKAFRFAKNLQIQTMNVNN | CDS |
IP_270665.1‡ | MS | ZNF844 | MHRKIHTGEKPYECKQCGKAYRSVSQLLVHERTHTVEQPYEYKQYGKAFRFAKNLQIQTMNVNN | CDS |
IP_270668.1‡ | MS | ZNF844 | MSSTAFQYHEKTHTREKHYECKQCGKAFISSGSLRYHERTHTGEKPYECKQCGKAFRSATQLQMHRKIHTGEKPYECKQCGKAYRSVSQLLVHERTHTVEQPYEYKQYGKAFRFAKNLQIQTMNVNN | 3’UTR |
IP_138139.1 | MS | ZNF322 | MLSPSRCKRIHTGEQLFKCLQCQLCCRQYEHLIGPQKTHPGEKPQQV | 3’UTR |
IP_204754.1 | RP | ZFP91-CNTF | MPGETEEPRPPEQQDQEGGEAAKAAPEEPQQRPPEAVAAAPAGTTSSRVLRGGRDRGRAAAAAAAAAVSRRRKAEYPRRRRSSPSARPPDVPGQQPQAAKSPSPVQGKKSPRLLCIEKVTTDKDPKEEKEEEDDSALPQEVSIAASRPSRGWRSSRTSVSRHRDTENTRSSRSKTGSLQLICKSEPNTDQLDYDVGEEHQSPGGISSEEEEEEEEEMLISEEEIPFKDDPRDETYKPHLERETPKPRRKSGKVKEEKEKKEIKVEVEVEVKEEENEIREDEEPPRKRGRRRKDDKSPRLPKRRKKPPIQYVRCEMEGCGTVLAHPRYLQHHIKYQHLLKKKYVCPHPSCGRLFRLQKQLLRHAKHHTDQRDYICEYCARAFKSSHNLAVHRMIHTGEKPLQCEICGFTCRQKASLNWHMKKHDADSFYQFSCNICGKKFEKKDSVVAHKAKSHPEVLIAEALAANAGALITSTDILGTNPESLTQPSDGQGLPLLPEPLGNSTSGECLLLEAEGMSKSYCSGTERSIHR | nc |
IP_098649.1 | RP | INO80B-WBP1 | MSKLWRRGSTSGAMEAPEPGEALELSLAGAHGHGVHKKKHKKHKKKHKKKHHQEEDAGPTQPSPAKPQLKLKIKLGGQVLGTKSVPTFTVIPEGPRSPSPLMVVDNEEEPMEGVPLEQYRAWLDEDSNLSPSPLRDLSGGLGGQEEEEEQRWLDALEKGELDDNGDLKKEINERLLTARQRALLQKARSQPSPMLPLPVAEGCPPPALTEEMLLKREERARKRRLQAARRAEEHKNQTIERLTKTAATSGRGGRGGARGERRGGRAAAPAPMVRYCSGAQGSTLSFPPGVPAPTAVSQRPSPSGPPPRCSVPGCPHPRRYACSRTGQALCSLQCYRINLQMRLGGPEGPGSPLLATFESCAQE | nc |
IP_115174.1 | RP | ZNF721 | MYIGEFILERNPTHVENVAKPLDSLQIFMRIRKFILERNPTRVETVAKPLDSLQIFMHIRKFILEIKPYKCKECGKAFKSYYSILKHKRTHTRGMSYEGDECRGL | CDS |
IP_275016.1 | RP | ZNF780A | MNVRSVGKALIVVHTLFSIRKFIPMRNLLYVGNVRWPLDIIANLLNILEFILVTSHLNVKTVGRPSIVAQALFNIRVFTLVRSPMNVRSVGRLLDFTYNFPNIRKLTQVKNHLNVRNVGNSFVVVQILINIEVFILERNPLNVRNVGKPFDFICTLFDIRNCILVRNPLNVRSVGKPFDFICNLFDIRNCILVRNPLNVRNVERFLVFPPSLIAIRTFTQVRRHLECKECGKSFNRVSNHVQHQSIRAGVKPCECKGCGKGFICGSNVIQHQKIHSSEKLFVCKEWRTTFRYHYHLFNITKFTLVKNPLNVKNVERPSVF | CDS or 3’UTR |
IP_278870.1 | RP | ZNF845 | MNVARFLIEKQNLHVIIEFILERNIRNMKNVTKFTVVNQVLKDRRIHTGEKAYKCKSL | CDS |
IP_278888.1 | RP | ZNF765 | MSVARPSAGRHPLHTIIDFILDRNLTNVKIVMKLSVSNQTLKDIGEFILERNYTCNECGKTFNQELTLTCHRRLHSGEKPYKYEELDKAYNFKSNLEIHQKIRTEENLTSVMSVARP | CDS |
IP_278918.1 | RP | ZNF813 | MNVARVLIGKHTLHVIIDFILERNLTSVMNVARFLIEKHTLHIIIDFILEINLTSVMNVARFLIKKHTLHVTIDFILERNLTSVMNVARFLIKKQTLHVIIDFILERNLTSLMSVAKLLIEKQSLHIIIQFILERNKCNECGKTFCHNSVLVIHKNSYWRETSVMNVAKFLINKHTFHVIIDFIVERNLRNVKHVTKFTVANRASKDRRIHTGEKAYKGEEYHRVFSHKSNLERHKINHTAEKP | CDS |
IP_280349.1 | RP | ZNF587 | MNAVNVGNHFFPALRFMFIKEFILDKSLISAVNVENPFLNVPVSLNTGEFTLEKGLMNAPNVEKHFSEALPSFIIRVHTGERPYECSEYGKSFAEASRLVKHRRVHTGERPYECCQCGKHQNVCCPRS | CDS |
IP_280385.1 | RP | ZNF417 | MNAMNVGNHFFPALRFMFIKEFILDKSLISAVNVENPLLNVPVSLNTGEFTLEKGLMNVPNVEKHFSEALPSFIIRVHTGERPYECSEYGKSFAETSRLIKHRRVHTGERPYECCQSGKHQNVCSPWS | CDS |
*MS, mass spectrometry; RP, ribosome profiling.
† These two proteins were not detected with unique peptides but with shared peptides. One protein only was counted in subsequent analyses.
‡ These five proteins were not detected with unique peptides but with shared peptides. One protein only was counted in subsequent analyses.
Gene | Polypeptides* | Reference | altORF localization | altORF size (aa) | Conservation | Summary of functional relationship with the annotated protein |
---|---|---|---|---|---|---|
CDKN2A, INK4 | Cyclin-dependent kinase inhibitor 2A or p16-INK4 (P42771), and p19ARF (Q8N726) | (61) | 5'UTR | 169 | Human, mouse | The unitary inheritance of p16INK4a and p19ARF may underlie their dual requirement in cell cycle control. |
GNAS, XLalphas | Guanine nucleotide-binding protein G(s) subunit alpha isoforms XLαs (Q5JWF2) and Alex (P84996) | (62) | 5'UTR | +700 | Human, mouse, rat | Both subunits transduce receptor signals into stimulation of adenylyl cyclase. |
ATXN1 | Ataxin-1 (P54253) and altAtaxin-1 | (63) | CDS | 185 | Human, chimpanzee, cow | Direct interaction |
Adora2A | A2A adenosine receptor (P30543) and uORF5 | (64) | 5'UTR | 134 | Human, chimpanzee, rat, mouse | A2AR stimulation increases the level of the uORF5 protein via post-transcriptional regulation. |
AGTR1 | Angiotensin type 1a receptor (P25095) and PEP7 | (65) | 5'UTR | 7 | Highly conserved across mammalian species | Inhibits non-G protein-coupled signalling of angiotensin II, without altering the classical G protein-coupled pathway activated by the ligand. |
*The UniProtKB accession is indicated when available.
12,616 alternative proteins and 26,531 reference proteins with translation initiation sites detected by ribosome profiling after re-analysis of large-scale studies.
Sheet 1: general information; sheet 2: list of alternative proteins; sheet 3: pie chart of corresponding altORFs localization; sheet 4: list of reference proteins.
4,872 alternative proteins detected by mass spectrometry (MS) after re-analysis of large proteomic studies.
Sheet 1: MS identification parameters; sheet 2: raw MS output; sheet 3: list of detected alternative proteins; sheet 4: pie chart of corresponding altORFs localization.
List of phosphopeptides.
Linker sequences separating adjacent zinc finger motifs.
100 alternative proteins with 25% to 100% identity and 10% to 100% overlap with their reference protein pairs.
Sheet 1: BlastP output and protein domains.
383 alternative proteins detected by mass spectrometry in the interactome of 118 zinc finger proteins.
Sheet 1: MS identification parameters; sheet 2: raw MS output; sheet 3: list of detected alternative proteins.
High-confidence list of predicted functional alternative proteins based on conservation and expression analyses.
Sheet 1: high-confidence list in mammals; sheet 2: high-confidence list in vertebrates.
Extraction of PhyloP scores.