Coevolution-based prediction of key allosteric residues for protein function regulation

  1. Juan Xie
  2. Weilin Zhang
  3. Xiaolei Zhu
  4. Minghua Deng
  5. Luhua Lai (corresponding author)
  1. Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, China
  2. BNLMS, Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, China
  3. School of Sciences, Anhui Agricultural University, China
  4. School of Mathematical Sciences, Peking University, China
  5. Center for Statistical Science, Peking University, China
  6. Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), China
6 figures, 1 table and 10 additional files

Figures

Figure 1
Steps to identify key allo-residues.

(A) Multiple sequence alignment. (B) Evolutionary coupling (EC) analysis. (C–D) Calculation of the EC values between residues in allosteric and orthosteric pockets. (E) Pairwise compared the …

Figure 2 with 3 supplements
Z-scores of allosteric pockets and probabilities of ranking an allosteric pocket in the top 3.

(A) The sequence lengths of all proteins in our data set. (B) The number of homologous sequences. Neff represents the number of effective homologous sequences obtained under 80% reweighting. (C) …
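The effective number of homologous sequences mentioned in panel (B) can be illustrated with a small sketch. It assumes the common reweighting convention in which each sequence is down-weighted by the number of alignment members that share at least 80% identity with it, and Neff is the sum of those weights. The function names `sequence_identity` and `effective_sequences` and the toy alignment are hypothetical, and the exact identity definition used by the authors (for example, how gaps are treated) is not stated in this caption.

```python
from typing import List


def sequence_identity(a: str, b: str) -> float:
    """Fraction of aligned columns at which two equal-length sequences match.

    Assumption: gap characters are compared like any other symbol.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def effective_sequences(msa: List[str], threshold: float = 0.8) -> float:
    """Sum of per-sequence weights 1 / n_i.

    n_i counts the sequences (including sequence i itself) whose identity
    to sequence i is at least `threshold`.
    """
    neff = 0.0
    for s in msa:
        neighbours = sum(1 for t in msa if sequence_identity(s, t) >= threshold)
        neff += 1.0 / neighbours
    return neff


# Toy alignment: the first two sequences are 90% identical, the third is unrelated.
toy_msa = ["ACDEFGHIKL", "ACDEFGHIKV", "MNPQRSTVWY"]
neff = effective_sequences(toy_msa)
```

In this toy case the two near-duplicate sequences each receive weight 0.5 and the unrelated one receives weight 1, so three raw sequences count as two effective ones. The pairwise loop is quadratic in the number of sequences and is meant only to show the idea.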

Figure 2—figure supplement 1
Phylogenetic tree of the androgen receptor.
Figure 2—figure supplement 2
Comparison of evolutionary coupling strength between pockets when all residue pairs and partial residue pairs were used.

(A) Prediction accuracy when different numbers of residue pairs are used. A prediction is counted as successful when the Z-score of the allosteric pocket is greater than 0.5. (B) The …
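The Z-score criterion and the top-3 ranking can be sketched as follows. This is a minimal illustration under stated assumptions: each pocket of a protein is assumed to have a single coupling-strength score, the Z-score is assumed to be taken over all pockets of the same protein, and the population standard deviation is used. The caption does not specify these details, and the names `pocket_z_scores`, `rank_of`, `is_successful` and the toy scores are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict


def pocket_z_scores(coupling: Dict[str, float]) -> Dict[str, float]:
    """Standardize per-pocket coupling scores within one protein.

    Assumption: population standard deviation over all pockets.
    """
    values = list(coupling.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All pockets score the same, so no pocket stands out.
        return {name: 0.0 for name in coupling}
    return {name: (v - mu) / sigma for name, v in coupling.items()}


def rank_of(z: Dict[str, float], pocket: str) -> int:
    """1-based rank of a pocket when pockets are sorted by descending Z-score."""
    return 1 + sum(1 for v in z.values() if v > z[pocket])


def is_successful(z: Dict[str, float], allosteric: str, cutoff: float = 0.5) -> bool:
    """Success criterion from the caption: Z-score strictly greater than 0.5."""
    return z[allosteric] > cutoff


# Toy scores (made up for illustration): pocket_1 plays the allosteric pocket.
toy = {"pocket_1": 0.30, "pocket_2": 0.10, "pocket_3": 0.12, "pocket_4": 0.08}
z = pocket_z_scores(toy)
success = is_successful(z, "pocket_1")
in_top3 = rank_of(z, "pocket_1") <= 3
```

With these made-up numbers the first pocket lies well above the mean, so it passes the 0.5 cutoff and ranks first.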

Figure 2—figure supplement 3
Difference between the evolutionary coupling between orthosteric and allosteric sites and the evolutionary coupling between two random patches.

Two residues that are not part of the orthosteric and allosteric sites were randomly selected from the surface residues of proteins. Among them, one was taken as the first center, and the residues …

Figure 3 with 3 supplements
The number of predicted key allo-residues.

"Number of residues" refers to residue counts in the allosteric pockets: the total number of residues in each allosteric pocket and the number of predicted key allo-residues.

Figure 3—figure supplement 1
Distribution of the ratio of key allo-residues predicted by KeyAlloSite to all residues in the allosteric pockets, at different cutoffs, across all proteins.
Figure 3—figure supplement 2
Examples of distributions of the statistics corresponding to significant scores obtained from the t-test.

These three distributions are the distributions of the statistics in BCR-ABL1, Tar, and PDZ3.

Figure 3—figure supplement 3
Random sampling of homologous sequences.

For each of the seven proteins, we randomly sampled different numbers of homologous sequences, such as 1L, 2L, and so on. The ratio refers to the proportion of identical key allo-residues …

Figure 4
Key allo-residues predicted in BCR-ABL1.

(A) The crystal structure of the kinase domain of BCR-ABL1. The allosteric inhibitor asciminib, represented by sticks, binds to the myristoyl pocket (marine). (B) Predicted key allo-residues in the …

Figure 5
The key allo-residues predicted by our method in Tar and PDZ3.

(A) The crystal structure of holo-Tar. Aspartate (Asp) is represented by magenta sticks, the allosteric pocket is represented by marine surface, and the salmon helix is selected as the orthosteric …

Figure 6
KeyAlloSite predicted key allo-residues for enzymes.

(A) KeyAlloSite predicted key allo-residues for Candida antarctica lipase B. Among the predicted residues, those that have been annotated in the literature are shown as marine spheres, and …

Tables

Table 1
Predicted key allo-residues that were mutated in cancers.
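| Protein | Gene | Predicted key allo-residues | Mutation* | Cancer type |
|---|---|---|---|---|
| AR1 | AR | D732 | D732N | SKCM |
| AR2 | AR | M832 | M832I | SKCM |
| PTP-1B | PTPN1 | M282 | M282T | COAD |
| CDK2 | CDK2 | P155 | P155H | UCEC |
| CK2alpha | CSNK2A1 | F54; A110 | F54C; A110T | UCEC; UCEC, GBM |
| MAPK14 | MAPK14 | P191; E192 | P191S; P191H; E192Q | SKCM; KIRC; BLCA |
| MAPK8 | MAPK8 | E195; M200 | E195K; M200I | UCEC; SKCM |
| CYP3A4 | CYP3A4 | F219 | F219L | UCEC |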
* Mutation: confirmed disease mutations among the predicted key allo-residues.

Cancer type: COAD, colon adenocarcinoma; SKCM, skin cutaneous melanoma; UCEC, uterine corpus endometrial carcinoma; GBM, glioblastoma multiforme; KIRC, kidney renal clear cell carcinoma; BLCA, bladder urothelial carcinoma.

Additional files

Supplementary file 1

Information on the allosteric proteins in the data set.

https://cdn.elifesciences.org/articles/81850/elife-81850-supp1-v2.docx
Supplementary file 2

List of the Z-scores and ranking of allosteric pockets in the data set.

https://cdn.elifesciences.org/articles/81850/elife-81850-supp2-v2.docx
Supplementary file 3

KeyAlloSite prediction results of Aurora A kinase.

https://cdn.elifesciences.org/articles/81850/elife-81850-supp3-v2.docx
Supplementary file 4

List of the predicted key allo-residues in allosteric pockets.

https://cdn.elifesciences.org/articles/81850/elife-81850-supp4-v2.docx
Supplementary file 5

Key allo-residues predicted by KeyAlloSite with different cutoffs.

https://cdn.elifesciences.org/articles/81850/elife-81850-supp5-v2.docx
Supplementary file 6

KeyAlloSite prediction results of tyrosine-protein kinase ABL1.

https://cdn.elifesciences.org/articles/81850/elife-81850-supp6-v2.docx
Supplementary file 7

The key allo-residues predicted by our method for Candida antarctica lipase B.

https://cdn.elifesciences.org/articles/81850/elife-81850-supp7-v2.docx
Supplementary file 8

The confusion matrices of KeyAlloSite in different scenarios.

https://cdn.elifesciences.org/articles/81850/elife-81850-supp8-v2.docx
Supplementary file 9

Comparison of KeyAlloSite and SCA methods.

https://cdn.elifesciences.org/articles/81850/elife-81850-supp9-v2.docx
MDAR checklist
https://cdn.elifesciences.org/articles/81850/elife-81850-mdarchecklist1-v2.docx
