Coenzyme-protein interactions since early life
Figures
Workflow of this study.
All available coenzymes in the Protein Data Bank (PDB) were identified according to the CoFactor database (Fischer et al., 2010). The PDB entries of structures bound to coenzymes were downloaded programmatically through the PDBe REST API (pdbe.org/api), together with the interatomic cofactor-protein interactions calculated by Arpeggio (Jubb et al., 2017). The coenzyme-binding amino acids were mapped to UniProt entries via Structure Integration with Function, Taxonomy and Sequences (SIFTS) (Velankar et al., 2013; Dana et al., 2019). PDB entries were grouped by UniProt code; redundancy was removed by clustering the UniProt sequences at 90% (and, in parallel, 30%) sequence identity.
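The grouping and redundancy-removal steps of the workflow can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the caption does not name the clustering tool, so the greedy procedure, the `toy_identity` measure (ungapped, position-by-position matching), and all accessions and sequences below are made-up stand-ins for illustration only.

```python
from collections import defaultdict


def group_by_uniprot(records):
    """Group (pdb_id, chain_id, uniprot_acc) records by UniProt accession."""
    groups = defaultdict(list)
    for pdb_id, chain_id, uniprot_acc in records:
        groups[uniprot_acc].append((pdb_id, chain_id))
    return dict(groups)


def toy_identity(seq_a, seq_b):
    """Ungapped identity: matching positions divided by the longer length.

    Assumption: a real analysis would use an alignment-based identity.
    """
    longest = max(len(seq_a), len(seq_b))
    if longest == 0:
        return 0.0
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / longest


def greedy_cluster(sequences, threshold):
    """Return representative accessions, visiting longer sequences first.

    A sequence becomes a new representative only if its identity to every
    existing representative is below the threshold.
    """
    representatives = []
    ordered = sorted(sequences.items(), key=lambda item: -len(item[1]))
    for acc, seq in ordered:
        if all(toy_identity(seq, sequences[rep]) < threshold
               for rep in representatives):
            representatives.append(acc)
    return representatives


# Hypothetical records and sequences (not taken from the dataset).
records = [("1abc", "A", "P00001"), ("2xyz", "B", "P00001"), ("3def", "A", "P00003")]
sequences = {
    "P00001": "MKTAYIAKQR",
    "P00002": "MKTAYIAKQK",
    "P00003": "GSHMLEDPVA",
    "P00004": "MKTAYGGGGG",
}
grouped = group_by_uniprot(records)
reps_90 = greedy_cluster(sequences, 0.90)
reps_30 = greedy_cluster(sequences, 0.30)
```

Running the clustering at both thresholds on the same input mirrors the parallel 90% and 30% analyses: the stricter 30% cut-off merges more sequences and therefore leaves fewer representatives.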
Classification of coenzymes and amino acids by their assumed evolutionary temporality.
The ‘Unclassified’ coenzymes Thiamine diphosphate, Coenzyme M, Factor F430, and Glutathione are not shown in the scheme.
Enzymatic class diversity per coenzyme class.
All coenzymes are colored by temporality: Ancient coenzymes are shown in purple, Last Universal Common Ancestor (LUCA) in turquoise, Post-LUCA in yellow, and Unclassified cofactors in gray. The EC numbers were retrieved from Structure Integration with Function, Taxonomy and Sequences (SIFTS) (Velankar et al., 2013; Dana et al., 2019).
Early versus late amino acid composition of the coenzyme binding sites, categorized according to the evolutionary temporality of coenzymes.
Early amino acids are shown in blue and late residues in red. The dashed line corresponds to the proportion of early vs. late amino acids within the UniProt composition of the sequences derived from our database (67% early and 33% late residues). The statistical significance of the early versus late amino acid composition was assessed by a Chi-squared test (p<0.0001). Detailed statistical data are listed in Supplementary file 9.
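As an illustration of how such a comparison can be set up, the sketch below runs a two-category Chi-squared goodness-of-fit test against the 67%/33% background. The caption does not state which form of the test was used, so the goodness-of-fit formulation is an assumption, and the residue counts are hypothetical. With one degree of freedom the p-value has the closed form erfc(sqrt(chi2/2)), which avoids any dependency beyond the standard library.

```python
import math


def chi2_early_vs_late(observed_early, observed_late, expected_early_frac=0.67):
    """Two-category Chi-squared goodness-of-fit test (1 degree of freedom).

    expected_early_frac is the background fraction of early residues
    (0.67 in the caption); the counts passed in are the observed numbers
    of early and late residues in a set of binding sites.
    """
    total = observed_early + observed_late
    expected_early = total * expected_early_frac
    expected_late = total * (1.0 - expected_early_frac)
    statistic = ((observed_early - expected_early) ** 2 / expected_early
                 + (observed_late - expected_late) ** 2 / expected_late)
    # Survival function of the Chi-squared distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: 800 early and 200 late residues in binding sites.
statistic, p_value = chi2_early_vs_late(800, 200)
print(f"chi2 = {statistic:.2f}, p = {p_value:.2e}")
```

An enrichment of early residues well above the 67% background, as in this made-up example, gives a large statistic and a very small p-value; counts that match the background exactly give a statistic of zero.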
Binding of coenzymes with early and late amino acids by backbone and side chain atoms.
‘Backbone’ interactions refer to residues in the coenzyme binding sites that interact purely through amino acid backbone atoms. ‘Side chain’ interactions involve residues that interact solely via side chain atoms. ‘Backbone & Side chain’ residues are those that interact with the coenzyme using both their backbone and side chain atoms. (A) Abundance of amino acids in the binding sites of the individual coenzymes studied. ‘Backbone & Side chain’ interactions are not depicted. Unclassified cofactors are in gray, Post-Last Universal Common Ancestor (LUCA) in yellow, LUCA in cyan, and Ancient in purple. Amino acids are ranked by their order of addition to the genetic code (Higgs and Pudritz, 2009). (B) Proportion of early versus late residues in coenzyme categories by interaction type. In each coenzyme category, the individual proportions add up to 100%. The amino acid composition was normalized by the percentage of late residues from the UniProt sequences retrieved from our database. The statistical significance of early versus late amino acid composition for each interaction type per coenzyme temporality was determined by a Chi-squared test (*p<0.05; **p<0.01; ***p<0.001; ****p<0.0001). For detailed statistical analysis, refer to Supplementary file 10.
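The three interaction categories defined above can be expressed as a small classification rule. The following sketch assumes standard PDB atom names and treats N, CA, C, O, and OXT as backbone atoms; that atom set, the function name, and the handling of glycine (whose CA is counted as backbone here) are assumptions, not details taken from the study.

```python
# Assumed backbone atom names (standard PDB nomenclature).
BACKBONE_ATOMS = frozenset({"N", "CA", "C", "O", "OXT"})


def classify_residue_contact(atom_names):
    """Classify a binding-site residue by which of its atoms contact the coenzyme.

    atom_names: iterable of atom names of one residue that interact
    with the coenzyme.
    """
    atoms = set(atom_names)
    if not atoms:
        raise ValueError("residue has no interacting atoms")
    has_backbone = bool(atoms & BACKBONE_ATOMS)
    has_side_chain = bool(atoms - BACKBONE_ATOMS)
    if has_backbone and has_side_chain:
        return "Backbone & Side chain"
    return "Backbone" if has_backbone else "Side chain"


# Example: a serine contacting the coenzyme through its amide N and hydroxyl OG.
print(classify_residue_contact(["N", "OG"]))
```

Working on sets of atom names keeps the rule independent of how many individual contacts each atom makes, so a residue is assigned to exactly one category.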
Interaction types.
(A) Interaction types for each amino acid-coenzyme binding event. (B) Interactions by coenzyme class. Early residues are shown as ‘E’ and late as ‘L’. The interactions were assigned by Arpeggio (Jubb et al., 2017).
Secondary structure content in coenzyme binding sites.
Composition of secondary structural elements in amino acids interacting with coenzymes. The Protein Data Bank (PDB) category represents secondary structure content across the dataset for comparison with coenzyme binding sites. Additional statistical analyses are shown in Supplementary file 11.
Secondary structure content per cofactor class.
All structural assignments were obtained from Structure Integration with Function, Taxonomy and Sequences (SIFTS) (Velankar et al., 2013; Dana et al., 2019).
Fold diversity of coenzyme binding sites.
(A) Folds represented by ECOD X-groups, according to numbers of coenzyme binding sites. (B) Comparison of numbers of ECOD X-groups vs. UniProt entries per cofactor class.
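The comparison in panel B amounts to counting distinct ECOD X-groups and distinct UniProt entries for each cofactor class. A minimal sketch of that tally is given below; the record layout, the function name, and every identifier in the example data are hypothetical placeholders rather than values from the dataset.

```python
from collections import defaultdict


def fold_diversity(binding_sites):
    """Count distinct X-groups and UniProt entries per coenzyme class.

    binding_sites: iterable of (coenzyme_class, uniprot_acc, x_group) tuples.
    Returns {coenzyme_class: (n_x_groups, n_uniprot_entries)}.
    """
    x_groups = defaultdict(set)
    uniprot_entries = defaultdict(set)
    for coenzyme_class, uniprot_acc, x_group in binding_sites:
        x_groups[coenzyme_class].add(x_group)
        uniprot_entries[coenzyme_class].add(uniprot_acc)
    return {cls: (len(x_groups[cls]), len(uniprot_entries[cls]))
            for cls in x_groups}


# Made-up records: class label, UniProt accession, X-group label.
sites = [
    ("Ancient", "P00001", "X1"),
    ("Ancient", "P00002", "X1"),
    ("Ancient", "P00003", "X2"),
    ("Post-LUCA", "P00004", "X3"),
]
print(fold_diversity(sites))
```

A class with many UniProt entries but few X-groups would indicate that its binding sites are concentrated in a small number of folds.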
Examples of coenzyme binding solely through early or late amino acids.
(A) Coenzyme bound exclusively by early residues (AMP bound by ATP-phosphoribosyltransferase; Protein Data Bank (PDB) code 6czm, chain B; created by LIGPLOT (Laskowski and Swindells, 2011)). (B) Coenzyme bound entirely by late residues (ascorbic acid bound by hyaluronate lyase; PDB code 1f9g, chain A; created by LIGPLOT).
Cofactor binding with only early amino acids.
ECOD X-group folds bound to those coenzymes are shown with the ConSurf (Ashkenazy et al., 2010; Ashkenazy et al., 2016) conservation scheme, and the enzyme cofactors are shown as spheres.
Additional files
- Supplementary file 1
  PDB codes assigned to each coenzyme class.
  - https://cdn.elifesciences.org/articles/94174/elife-94174-supp1-v1.zip
- Supplementary file 2
  Identification of coenzymes in Protein Data Bank (PDB).
  - https://cdn.elifesciences.org/articles/94174/elife-94174-supp2-v1.xlsx
- Supplementary file 3
  Amino Acid Composition of the Coenzyme Binding Sites.
  Table 2A_90 Residue Composition at 90% Sequence Identity. Table 2B_30 Amino Acid Composition at 30% Sequence Identity.
  - https://cdn.elifesciences.org/articles/94174/elife-94174-supp3-v1.xlsx
- Supplementary file 4
  Folds catalogue of ECOD X-groups in coenzyme binding sites.
  - https://cdn.elifesciences.org/articles/94174/elife-94174-supp4-v1.xlsx
- Supplementary file 5
  Proteins and nucleic acids with coenzyme binding mediated by metallic ions and water molecules.
  - https://cdn.elifesciences.org/articles/94174/elife-94174-supp5-v1.xlsx
- Supplementary file 6
  Amino acid fractional differences observed across all coenzyme binding sites.
  - https://cdn.elifesciences.org/articles/94174/elife-94174-supp6-v1.zip
- Supplementary file 7
  Amino acid fractional differences observed across all non-phosphate-containing coenzyme binding sites.
  - https://cdn.elifesciences.org/articles/94174/elife-94174-supp7-v1.zip
- Supplementary file 8
  Coenzymes interacting with nucleic acids.
  - https://cdn.elifesciences.org/articles/94174/elife-94174-supp8-v1.zip
- Supplementary file 9
  Chi-squared test of early versus late amino acid composition per coenzyme class.
  - https://cdn.elifesciences.org/articles/94174/elife-94174-supp9-v1.zip
- Supplementary file 10
  Chi-squared test comparing early versus late residue composition across all coenzyme temporalities in different interaction types.
  - https://cdn.elifesciences.org/articles/94174/elife-94174-supp10-v1.zip
- Supplementary file 11
  Average secondary structure content of the different coenzyme temporalities.
  - https://cdn.elifesciences.org/articles/94174/elife-94174-supp11-v1.zip
- MDAR checklist
  - https://cdn.elifesciences.org/articles/94174/elife-94174-mdarchecklist1-v1.docx