Codon usage bias controls mRNA and protein abundance in trypanosomatids

  1. Laura Jeacock
  2. Joana Faria
  3. David Horn (corresponding author)
  1. University of Dundee, United Kingdom
9 figures, 1 table and 2 additional files

Figures

Protein expression is increased by GC3 codons in T. brucei.

(A) Schematic map of the pRPai-based, tetracycline-inducible reporter construct. Relevant restriction sites are shown. Black bars, tubulin untranslated regions; arrow, pol-I promoter; pA, …

https://doi.org/10.7554/eLife.32496.003
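As an illustration of the GC3 measure in the figure title, the following minimal sketch computes the fraction of codons that end in G or C. The function name `gc3_fraction` is hypothetical and is not taken from the paper. The two example sequences are the first seven codons that can be read from the EUluc5 and WTluc5 oligonucleotides in the key resources table, assuming that translation starts at the first ATG after the restriction sites.

```python
def gc3_fraction(cds: str) -> float:
    """Return the fraction of codons whose third base is G or C."""
    seq = cds.upper().replace("U", "T")
    if not seq or len(seq) % 3 != 0:
        raise ValueError("CDS must be non-empty and a multiple of 3 bases long")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    gc3_codons = sum(1 for codon in codons if codon[2] in "GC")
    return gc3_codons / len(codons)


# First seven codons read from the EUluc5 and WTluc5 primers (assumed reading frame).
eu_start = "ATGAAGCCCACCGAGAACAAC"  # human codon-optimised gLUC
wt_start = "ATGAAACCAACTGAAAACAAT"  # wild-type gLUC

print(gc3_fraction(eu_start))
print(gc3_fraction(wt_start))
```

Under that assumed reading frame, both fragments encode the same seven amino acids and differ only in their synonymous codon choice.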
mRNA expression is increased by GC3 codons in T. brucei.

(A) Schematic map of the reporter cassette. The grey bar indicates the position of the tubulin untranslated region probe. The RNA blot indicates native tubulin transcripts and the gLUC transcripts. …

https://doi.org/10.7554/eLife.32496.004
Genome scale analysis of codon usage bias.

(A) CAI value distribution is shown for all non-redundant T. brucei genes and the cohorts of genes indicated. See the text for more detail on each cohort. (B) CAI values are shown in heat-map format …

https://doi.org/10.7554/eLife.32496.005
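For readers unfamiliar with CAI (codon adaptation index), the sketch below follows the standard definition: each codon receives a weight relative to the most frequent synonymous codon in a reference set, and the CAI of a gene is the geometric mean of those weights. This is a simplified illustration and not the CAI calculator used for the figure. The reference sequence is a hypothetical toy example, the standard genetic code is assumed, and codons absent from the reference set are skipped rather than given a pseudocount.

```python
import math
from collections import Counter

# Standard genetic code, bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into codons, ignoring a trailing partial codon."""
    seq = cds.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_cds: list[str]) -> dict[str, float]:
    """Weight each codon by its count relative to the top synonymous codon."""
    counts = Counter()
    for cds in reference_cds:
        counts.update(split_codons(cds))
    weights = {}
    # Met and Trp have a single codon each; stop codons are excluded.
    for aa in set(AMINO) - {"*", "M", "W"}:
        family = [c for c in CODONS if CODON_TABLE[c] == aa]
        top = max(counts[c] for c in family)
        for codon in family:
            if counts[codon] > 0:
                weights[codon] = counts[codon] / top
    return weights


def cai(cds: str, weights: dict[str, float]) -> float:
    """Geometric mean of codon weights; codons without a weight are skipped."""
    logs = [math.log(weights[c]) for c in split_codons(cds) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Hypothetical reference set: lysine encoded three times by AAG, once by AAA.
toy_weights = relative_adaptiveness(["AAGAAGAAGAAA"])
print(cai("AAGAAG", toy_weights))
print(cai("AAGAAA", toy_weights))
```

In practice the reference set would be a cohort of highly expressed genes, and the choice of that cohort affects every weight.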
Genome scale analysis of codon pair bias in T. brucei.

(A) Codon co-occurrence by encoded amino acid. Amino acid pairs are over-represented; highlighted by white boxes. (B) Analysis of third position followed by first position pairs. Examples of …

https://doi.org/10.7554/eLife.32496.006
Figure 5 with 1 supplement
Transcriptome and proteome data and the impact of gene length in T. brucei.

(A) Correspondence between observed mRNA and protein expression. (B) Relationship between observed mRNA expression and protein coding sequence (CDS) length. RPKM, Reads Per Kilobase of transcript …

https://doi.org/10.7554/eLife.32496.007
Figure 5—figure supplement 1
RNA-seq data.

Replicate read counts and correspondence analysis. RPKM (Reads Per Kilobase of transcript per Million mapped reads).

https://doi.org/10.7554/eLife.32496.008
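The RPKM unit named in the caption can be illustrated with a short sketch of the standard formula: read count, scaled per kilobase of transcript and per million mapped reads. The function name and the example numbers are hypothetical and are not taken from the dataset.

```python
def rpkm(read_count: float, length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of transcript per Million mapped reads."""
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    return read_count * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2 kbp CDS in a library of 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))
```

Because the length term is in the denominator, two genes with equal read counts but different lengths receive different RPKM values.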
Figure 6 with 1 supplement
Codon usage is predictive of relative mRNA and protein expression in T. brucei.

(A) Correspondence between relative observed mRNA expression and CAI. (B) Correspondence between relative observed mRNA levels and predicted expression based on CAI and CDS length in kbp (L); the …

https://doi.org/10.7554/eLife.32496.009
Figure 6—figure supplement 1
Length-adjusted CAI is predictive of relative mRNA expression in previously published datasets.

Data from distinct life cycle stages of T. brucei and from different research groups were analysed; the data source is indicated in each case. Correspondence is shown between relative observed mRNA …

https://doi.org/10.7554/eLife.32496.010
Codon usage predicts the relative expression of protein complexes and cohorts of proteins with related functions in T. brucei.

Correspondence between observed peptide counts and predicted abundance based on CAI. The complexes and cohorts are listed in order of peptides/kbp and number of proteins is indicated for each; …

https://doi.org/10.7554/eLife.32496.011
Length-adjusted CAI and CAI are predictive of translation efficiency and mRNA half-life, respectively, in previously published data from T. brucei; the data source is indicated in each case.

(A) Correspondence between translation efficiency (footprint levels/mRNA levels) and length-adjusted CAI. n = 4880 genes. Data from bloodstream-form cells is shown; correlation coefficient for …

https://doi.org/10.7554/eLife.32496.012
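Translation efficiency is defined in the caption as footprint levels divided by mRNA levels. A minimal sketch of that ratio follows; the function name, the gene labels, and the values are hypothetical, and returning `None` for genes without detectable mRNA is an assumption rather than a rule from the source data.

```python
from typing import Optional


def translation_efficiency(footprint_level: float, mrna_level: float) -> Optional[float]:
    """Ratio of ribosome footprint level to mRNA level; None if mRNA is not detected."""
    if mrna_level <= 0:
        return None
    return footprint_level / mrna_level


# Hypothetical (footprint, mRNA) levels, in the same normalised units.
levels = {"geneA": (120.0, 40.0), "geneB": (15.0, 60.0), "geneC": (5.0, 0.0)}
efficiency = {gene: translation_efficiency(fp, rna) for gene, (fp, rna) in levels.items()}
print(efficiency)
```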
Length-adjusted CAI is predictive of relative mRNA and protein expression in previously published data from the other trypanosomatids, T. vivax and Leishmania mexicana; the data source is indicated in each case.

The plots indicate correspondence between relative observed mRNA or protein expression and our predictions based on CAI and CDS length in kbp (L). (A) T. vivax mRNA expression. n = 5170 genes. (B) T.…

https://doi.org/10.7554/eLife.32496.013

Tables

Key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
|---|---|---|---|---|
| gene (Gaussia princeps) | gLUC | PMID: 18408930 | AY015993.1 | wild-type |
| gene (Gaussia princeps) | gLUC | PMID: 18408930 | EU372000 | human codon-optimised |
| cell line (Trypanosoma brucei) | 2T1 | PMID: 16182389 | | |
| transfected construct (Trypanosoma brucei) | pRPa-iSL plasmid | PMID: 18588918 | 69244 | available from addgene.org |
| transfected construct (Trypanosoma brucei) | pRPa plasmid | PMID: 18588918 | | |
| transfected construct (Trypanosoma brucei) | pRPa-λ plasmid | this paper | | see materials and methods |
| antibody | α-gLUC | New England Biolabs | | one in 1000 |
| sequence-based reagent | EUluc5 oligonucleotide | this paper | | GATCCTGCAGCTCGAGATGAAGCCCACCGAGAACAACG |
| sequence-based reagent | EUluc3 oligonucleotide | this paper | | GATCGAATTCAGATCTAAGCTTTTACAGCTTCGAGTCGCCGCCGGCGCC |
| sequence-based reagent | WTluc5 oligonucleotide | this paper | | GATCCTCGAGATGAAACCAACTGAAAACAATG |
| sequence-based reagent | WTluc3 oligonucleotide | this paper | | GATCAAGCTTTTATAATTTACTATCACCACCGGCACCCTT |
| sequence-based reagent | Lambda5 oligonucleotide | this paper | | GATCAAGCTTTGCAGGGTGAGATTGTGGC |
| sequence-based reagent | Lambda3 oligonucleotide | this paper | | GATCGAATTCGCTCAGTTGTTCAGGAATATG |
| sequence-based reagent | TUBF oligonucleotide | this paper | | AGATCTTCAAACACTAGTTTAAGC |
| sequence-based reagent | TUBR oligonucleotide | this paper | | CATGATAAATAAATAGAAGTGCTTTGTTG |
| sequence-based reagent | λF oligonucleotide | this paper | | GATTCATAAGTTCCGCTGTGTGCCGCATCTC |
| sequence-based reagent | λR oligonucleotide | this paper | | GCTCAGTTGTTCAGGAATATGGTGCAGCAG |
| commercial assay or kit | BioLux Gaussia luciferase | New England Biolabs | | |
| software, algorithm | Bowtie 2 | PMID: 22388286 | | |
| software, algorithm | SAMtools | PMID: 19505943 | | |
| software, algorithm | edgeR | PMID: 19910308 | | |
| software, algorithm | CAI calculator | http://www.umbc.edu//codon/cai/cais.php | | |
| software, algorithm | ANACONDA | http://bioinformatics.ua.pt/software/anaconda/ | | |
| online database | TriTrypDB, RRID:SCR_007043 | http://tritrypdb.org/tritrypdb/ | | |

Additional files

Supplementary file 1

Sheet 1: Synthetic genes. Sequences of gLUC and GFP genes with high, medium or low proportions of GC3-codons.

Sheet 2: T. brucei expression data. CAI values, proteome and transcriptome data and predicted expression levels are tabulated for the non-redundant gene sets analysed in Figures 5 and 6.

Sheet 3: T. brucei gene cohorts. Data for the genes analysed in Figure 7.

Sheet 4: T. brucei - extended set of ranked CAI values and predictions, including GeneID and product description; n = 8479.

Sheet 5: T. vivax - ranked CAI values and predictions, including GeneID and product description; n = 7836.

Sheet 6: L. mexicana - ranked CAI values and predictions, including GeneID and product description; n = 5715.

https://doi.org/10.7554/eLife.32496.014
Transparent reporting form
https://doi.org/10.7554/eLife.32496.015
