Intermolecular epistasis shaped the function and evolution of an ancient transcription factor and its DNA binding sites

  1. Dave W Anderson
  2. Alesia N McKeown
  3. Joseph W Thornton (corresponding author)
  1. University of Oregon, United States
  2. University of Chicago, United States
6 figures and 2 additional files

Figures

Recognition helix (RH) substitutions change DNA-binding affinity and specificity.

(A) Phylogenetic relationships of modern-day vertebrate SRs are shown, with ancestral proteins AncSR1 and AncSR2 marked. Each protein's preferred response element (RE) is shown: estrogen RE (ERE; …

https://doi.org/10.7554/eLife.07864.003
Protein intermediates between AncSR1 and AncSR1+RH are promiscuous or weak transcription factor proteins (TFs).

Binding energies of AncSR1 variants containing all combinations of ancestral and derived states at the RH sites with historical replacements are shown for all 16 REs as measured by fluorescence …

https://doi.org/10.7554/eLife.07864.004
Each amino acid replacement contributes to the evolution of novel DNA specificity.

For each protein intermediate in the sequence space between AncSR1 and AncSR1+RH, the energy logo depicts the main and epistatic effects of the RE nucleotide states and combinations on binding …

https://doi.org/10.7554/eLife.07864.005
Epistasis across the protein-DNA interface: effect of historical replacements in the TF on DNA determinants of affinity in the RE.

(A) Main and epistatic effects of RH replacement on DNA affinity. Bars indicate the mean change in binding energy caused by each amino acid change in the RH, averaged across all TF:RE combinations …

https://doi.org/10.7554/eLife.07864.006
Accessible mutational pathways in the joint TF-RE sequence space.

Each vertex of the cube represents a protein genotype between AncSR1 and AncSR1+RH; amino acid states at variable RH residues are shown; lower and upper case denote ancestral and derived states, …

https://doi.org/10.7554/eLife.07864.007
Figure 6 with 2 supplements
Hydrogen bonding and packing efficiency do not explain TF-RE affinity.

(A) The number of hydrogen bonds formed between atoms in the RH and atoms in the RE in molecular dynamics (MD) simulations is not positively correlated with the experimentally measured binding energy …

https://doi.org/10.7554/eLife.07864.008
Figure 6—figure supplement 1
Direct hydrogen bonding at the protein-DNA interface positively correlates with binding affinity for only 2 out of 8 protein genotypes.

From MD simulations, the number of direct hydrogen bonds formed at the protein-DNA interface was calculated for each protein genotype across all 16 REs. The best-fit linear regression was determined …

https://doi.org/10.7554/eLife.07864.009
Figure 6—figure supplement 2
Packing efficiency at the protein-DNA interface positively correlates with binding affinity for only 2 out of 8 protein genotypes.

From MD simulations, the number of protein-DNA atom pairs within 4.5 Å of one another was calculated for each protein genotype across all 16 REs. The best-fit linear regression was determined for …

https://doi.org/10.7554/eLife.07864.010
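
The packing-efficiency measure described above (protein-DNA atom pairs within 4.5 Å, followed by a best-fit line against binding energy) can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the function names `count_contacts` and `fit_line` are hypothetical, the coordinates and energies are random placeholders rather than MD output or measured data, and a single structure stands in for what would in practice be a trajectory.

```python
import numpy as np

CUTOFF = 4.5  # angstroms; the distance threshold named in the caption


def count_contacts(protein_xyz, dna_xyz, cutoff=CUTOFF):
    """Count protein-DNA atom pairs separated by at most `cutoff` angstroms."""
    protein_xyz = np.asarray(protein_xyz, dtype=float)
    dna_xyz = np.asarray(dna_xyz, dtype=float)
    # Pairwise differences: shape (n_protein, n_dna, 3)
    diff = protein_xyz[:, None, :] - dna_xyz[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return int((dist <= cutoff).sum())


def fit_line(x, y):
    """Least-squares line y = slope * x + intercept, plus Pearson's r."""
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r


# Placeholder data for one protein genotype across 16 REs (synthetic only).
rng = np.random.default_rng(1)
contacts = np.array([
    count_contacts(rng.uniform(0, 15, (30, 3)), rng.uniform(0, 15, (30, 3)))
    for _ in range(16)
])
energies = rng.normal(-8.0, 1.0, size=16)
slope, intercept, r = fit_line(contacts, energies)
```

Repeating the fit once per protein genotype would give one regression per genotype; how contacts were averaged over simulation frames is not stated in the caption and is left out of the sketch.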

Additional files

Supplementary file 1

First-order and epistatic genetic determinants of binding affinity. First-order effects indicate the difference in binding energy relative to the mean across all data, while second-order effects are the marginal addition to the additive sum of the first-order effects. Third-order effects are the marginal addition to the additive sum of all lower-order effects. (A) The energetic effects on binding of all first-order and epistatic terms in the RE, as determined by linear modeling for each protein genotype. (B) The energetic effects of amino acid replacements, averaged across all 16 REs. (C) The energetic effects from a global model, including all possible first-, second-, and third-order effects within and between the protein and DNA.

https://doi.org/10.7554/eLife.07864.011
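
The definitions of first- and second-order effects given above can be made concrete with a small sketch for one protein genotype and the two variable RE positions (16 REs). It uses simple averaging, which should coincide with a least-squares fit only when every nucleotide combination is measured once; it is not the authors' linear model. The function name `decompose` is hypothetical and the energies are synthetic placeholders.

```python
import numpy as np

NUCS = "ACGT"


def decompose(energy):
    """Split a 4x4 table of binding energies into mean, first- and second-order terms.

    energy[i, j] is the binding energy for nucleotide NUCS[i] at the first
    variable RE position and NUCS[j] at the second.
    """
    energy = np.asarray(energy, dtype=float)
    mu = energy.mean()                      # mean across all data
    first_pos1 = energy.mean(axis=1) - mu   # effect of each state at position 1
    first_pos2 = energy.mean(axis=0) - mu   # effect of each state at position 2
    additive = mu + first_pos1[:, None] + first_pos2[None, :]
    second = energy - additive              # marginal addition beyond the additive sum
    return mu, first_pos1, first_pos2, second


# Synthetic table, not measured data.
rng = np.random.default_rng(0)
toy = rng.normal(loc=-8.0, scale=1.0, size=(4, 4))
mu, a, b, ab = decompose(toy)
```

By construction the first-order effects at each position sum to zero, and each row and column of the second-order table sums to zero, so every term is expressed relative to the mean of the data.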
Supplementary file 2

abc/WYK encoding of sequence characters for linear modeling of genetic effects. (A) One-dimensional vectors for ancestral versus derived state at variable amino acid positions 25, 26, and 29 in the protein are shown. (B) Three-dimensional vectors for A, C, G, or T at variable positions 3 and 4 in the RE are shown. The encoding methods shown in panels A and B ensure that the origin in each vector space is associated with the mean value of the dependent variable (in this case, the delta-G of dissociation) across all the data. (C) Terms used in the linear model with abc/WYK coding. Each row shows the expression for the effect on the dependent variable of a nucleotide state, amino acid replacement, or interaction among them. Each genetic effect is calculated using the expression shown and the optimized values of the linear coefficients as described in ‘Materials and methods’.

https://doi.org/10.7554/eLife.07864.012
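
The property claimed for the encoding in panels A and B, that the origin corresponds to the mean across all data, can be checked with a short sketch. The specific vectors below are assumptions chosen for illustration (ancestral = -1 and derived = +1 for each amino acid site, and a tetrahedral W/Y/K code for nucleotides); the exact vectors in the supplementary file are not reproduced here. The names `AA_CODE`, `NUC_CODE`, and `encode` are hypothetical, only first-order terms are included, and the response values are synthetic.

```python
import itertools
import numpy as np

# Assumed one-dimensional code for each variable amino acid site.
AA_CODE = {"ancestral": -1.0, "derived": 1.0}

# Assumed three-dimensional (W, Y, K) code: W = A/T vs C/G,
# Y = C/T vs A/G, K = G/T vs A/C. Each coordinate sums to zero over A, C, G, T.
NUC_CODE = {
    "A": (1.0, -1.0, -1.0),
    "C": (-1.0, 1.0, -1.0),
    "G": (-1.0, -1.0, 1.0),
    "T": (1.0, 1.0, 1.0),
}


def encode(protein_states, re_nucs):
    """Return one design-matrix row: intercept, 3 protein terms, 2 x 3 RE terms."""
    row = [1.0]
    row += [AA_CODE[s] for s in protein_states]
    for nuc in re_nucs:
        row += NUC_CODE[nuc]
    return row


genotypes = list(itertools.product(AA_CODE, repeat=3))  # 8 protein genotypes
res = list(itertools.product("ACGT", repeat=2))         # 16 REs
X = np.array([encode(p, r) for p in genotypes for r in res])

# Synthetic response values standing in for delta-G measurements.
rng = np.random.default_rng(2)
y = rng.normal(-8.0, 1.0, size=len(X))

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept = coef[0]
```

Because every non-intercept column sums to zero over the complete set of 8 x 16 combinations, the fitted intercept equals the mean of `y`, and each coefficient is read as a deviation from that mean.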
