Simple biochemical features underlie transcriptional activation domain diversity and dynamic, fuzzy binding to Mediator

  1. Adrian L Sanborn (corresponding author)
  2. Benjamin T Yeh
  3. Jordan T Feigerle
  4. Cynthia V Hao
  5. Raphael JL Townshend
  6. Erez Lieberman Aiden
  7. Ron O Dror
  8. Roger D Kornberg (corresponding author)
  1. Department of Structural Biology, Stanford University School of Medicine, United States
  2. Department of Computer Science, Stanford University, United States
  3. The Center for Genome Architecture, Baylor College of Medicine, United States
  4. Center for Theoretical Biological Physics, Rice University, United States
7 figures, 1 table and 1 additional file

Figures

Figure 1 with 1 supplement
A quantitative screen identifies 150 activation domains from all yeast transcription factors.

(A) Schematic of the activation assay. To measure in vivo activation, we expressed fragments of TF proteins fused to a DNA-binding domain that binds uniquely in the promoter of a genome-integrated …

Figure 1—source data 1

Data from activation assays.

https://cdn.elifesciences.org/articles/68068/elife-68068-fig1-data1-v3.xlsx

Figure 1—source data 2

ADs identified in the activation screen and their overlap with previously known ADs.

Protein positions are 0-indexed and inclusive of start but exclusive of stop positions.
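The coordinate convention stated above (0-indexed, start inclusive, stop exclusive) matches Python slicing, so a region can be pulled from a protein sequence with a plain slice. The sketch below is illustrative only: the function names and the example sequence are hypothetical and do not come from the source data files.

```python
def extract_region(sequence: str, start: int, stop: int) -> str:
    """Return the residues covered by a 0-indexed, half-open interval [start, stop)."""
    if not 0 <= start <= stop <= len(sequence):
        raise ValueError(f"interval [{start}, {stop}) is outside the sequence")
    # Python slices are also 0-indexed and exclude the stop position.
    return sequence[start:stop]


def to_one_based_inclusive(start: int, stop: int) -> tuple[int, int]:
    """Convert a 0-indexed half-open interval to 1-indexed inclusive residue numbers."""
    # The first residue moves from index 0 to number 1; the exclusive stop
    # already equals the 1-indexed number of the last included residue.
    return start + 1, stop


if __name__ == "__main__":
    example = "MSEYQPSLFA"  # made-up sequence, for illustration only
    print(extract_region(example, 2, 5))
    print(to_one_based_inclusive(2, 5))
```

With this convention the length of a region is simply `stop - start`, and adjacent regions share a boundary value without overlapping.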

https://cdn.elifesciences.org/articles/68068/elife-68068-fig1-data2-v3.xlsx

Figure 1—figure supplement 1
Methodology, validation, and summary statistics for pooled activation screens.

(A) Distributions of signal from the GFP reporter (in arbitrary units), measured from FACS, in cells expressing artificial TF (aTF) fused to ADs from VP16, Gcn4, or Pho4, or without an AD. Cells …

Figure 2 with 1 supplement
Activation strength is primarily determined not by motifs but by acidic and hydrophobic content.

(A) Tiles were binned by their hydrophobic content and net charge and the median activation of each bin is displayed in color. Activation was strongest for tiles high in both acidic and hydrophobic …
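The binning described in panel A can be sketched as a two-dimensional grouping followed by a per-bin median. This is a minimal illustration, not the authors' analysis code: it assumes hydrophobic content and net charge have already been computed for each tile, and the bin widths and function name are hypothetical.

```python
from collections import defaultdict
from math import floor
from statistics import median


def median_activation_by_bin(tiles, hydro_width, charge_width):
    """Group tiles into 2D bins and return the median activation of each bin.

    tiles: iterable of (hydrophobic_content, net_charge, activation) tuples.
    hydro_width, charge_width: bin widths (assumed values, chosen by the caller).
    """
    bins = defaultdict(list)
    for hydro, charge, activation in tiles:
        # floor() keeps negative net charges in the correct bin,
        # unlike int(), which truncates toward zero.
        key = (floor(hydro / hydro_width), floor(charge / charge_width))
        bins[key].append(activation)
    return {key: median(values) for key, values in bins.items()}
```

Each key of the returned dictionary is a pair of bin indices, so the result can be laid out directly as a heat map with one axis per feature.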

Figure 2—figure supplement 1
Mutagenesis of hydrophobic and acidic residues in known and newly discovered ADs.

(A) Distribution of hydrophobic content (top) and net charge (bottom) for non-activating tiles (gray) and highly activating tiles (green). Hydrophobicity is computed using the Wimley-White scale and …

Figure 3 with 1 supplement
A deep learning model, termed PADDLE, predicts the location and strength of acidic activation domains in yeast and human.

(A) Activation of wild-type (orange) and eight scrambled sequences (green) for each of eight ADs. (B) De novo PADDLE predictions (purple) and experimentally measured activation (green) for Arg81 are …

Figure 3—source data 1

ADs predicted in human and virus proteins and core ADs predicted in yeast.

Protein positions are 0-indexed and inclusive of start but exclusive of stop positions.

https://cdn.elifesciences.org/articles/68068/elife-68068-fig3-data1-v3.xlsx

Figure 3—figure supplement 1
Validation of neural network models and PADDLE predictions on human TFs and yeast core ADs.

(A) Neural network predictions from amino acid composition alone on test data withheld from training. R² is the coefficient of determination. Compare to PADDLE predictions in Panel B. (B) PADDLE …

Figure 4 with 1 supplement
Sequence and structural determinants of activation domains.

(A) To quantify the importance of aa composition, we measured activation of 33 scrambled sequences for each of the 28 strongest 13-aa core ADs (cADs). (Top) Activation of each wild-type cAD divided …

Figure 4—figure supplement 1
Additional activation data for and analysis of core AD mutants and systematic 9mers.

(A) The wild-type-to-scramble ratio of core ADs was not correlated with their wild-type activation strength (Pearson’s r = 0.32, p = 0.09). Related to Figure 4A–B. (B) For each position assayed across …

Figure 5 with 1 supplement
The large majority of activation domains bind Mediator, and its recruitment is a key driver for activation.

(A) To measure binding in high-throughput, we used mRNA display, expressing our library of TF tiles as a pool of protein fragments covalently tagged with their mRNA sequences (left), and using this …

Figure 5—source data 1

Data from pull-down assays and list of Med15-binding domains.

https://cdn.elifesciences.org/articles/68068/elife-68068-fig5-data1-v3.xlsx

Figure 5—figure supplement 1
Methodology, validation, and summary statistics for Med15 and TFIID subcomplex pull-down screens.

(A) (Left) A 4–20% polyacrylamide gel showing in vivo biotinylated GST-TEV-Avi-Med15K123-6xHis protein (denoted Med15) eluted from NiNTA resin (lane 2), and the flow-through (lane 3) and bound beads …

Figure 6 with 1 supplement
Med15 uses a shape-agnostic, fuzzy interface to bind diverse activation domain sequences.

(A) We used a Rosetta peptide docking algorithm (Raveh et al., 2011) to build structural models of the 28 13-aa core ADs (cADs) described above (Figure 4A–D) interacting with two activator-binding …

Figure 6—source data 1

List of all Med15 ABD and core AD interactions modeled using FlexPepDock.

https://cdn.elifesciences.org/articles/68068/elife-68068-fig6-data1-v3.xlsx

Figure 6—source data 2

Structural models of Med15 ABD and core AD interactions generated by FlexPepDock.

The 10 best-scoring models from different binding poses are provided as pdb files. Top500.txt contains the score, cluster number, and RMSD to best model for the 500 best-scoring models.

https://cdn.elifesciences.org/articles/68068/elife-68068-fig6-data2-v3.zip

Figure 6—figure supplement 1
Validation and additional analysis for structural modeling of core AD interaction with Med15 ABDs.

(A) The best-scoring model of the Pdr1 cAD (blue) bound to the KIX domain (gray). KIX residues important for binding the Pdr1 AD are shown in red (Thakur et al., 2008). The cAD contacts the …

Figure 7 with 1 supplement
Functional consequences of high-valence coactivator interactions.

(A) We took 47 pairs of adjacent cADs and measured their activation enhancement factors—the activation of both cADs in tandem divided by the product of activation by each cAD individually. For 40 …
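The enhancement factor defined in panel A is a simple ratio, sketched below. The function name and the example numbers are hypothetical, and the sketch assumes all three activation values are expressed on the same positive scale.

```python
def enhancement_factor(tandem: float, first: float, second: float) -> float:
    """Activation of two cADs in tandem divided by the product of each cAD alone."""
    if first <= 0 or second <= 0:
        raise ValueError("individual activations must be positive")
    return tandem / (first * second)


if __name__ == "__main__":
    # Made-up values for illustration: tandem = 12, individual cADs = 2 and 3.
    print(enhancement_factor(12.0, 2.0, 3.0))
```

By this definition, a factor above 1 means the tandem pair activates more strongly than the product of the two individual activations, and a factor below 1 means it activates less strongly.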

Figure 7—figure supplement 1
Additional data and analysis.

(A) Percent of input bound to Med15 or individual ABDs for mRNA-tagged and pooled protein fragments of VP16 AD, Gcn4 AD, and the AD mutant library (sub-library B). Binding was tracked by qPCR of …

Tables

Key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
| --- | --- | --- | --- | --- |
| Cell line (S. cerevisiae) | BY4711 | ATCC | 200873 | MATalpha trp1delta63 |
| Cell line (human) | HEK293T | ATCC | CRL-3216 |  |
| Sequence-based reagent | Oligo pool libraries | Twist Bioscience | Custom synthesis | Sequences provided in source data files |
| Recombinant DNA reagent | pMVS142-pACT1-mCherry-Zif268-EBD-MCS-KAN | Addgene | 99049 | Barak Cohen |
| Recombinant DNA reagent | pMVS102-P3-GFP-NATMX6 | Addgene | 99048 | Barak Cohen |
| Chemical compound, drug | Fetal bovine serum | Millipore | TMS-031-B |  |
| Strain, strain background (Escherichia coli) | XL10 Gold Ultracompetent Cells | Agilent | 200315 | For cloning |
| Peptide, recombinant protein | Phusion polymerase | New England Biolabs | M0531L |  |
| Commercial assay or kit | AMPure XP beads | Beckman Coulter | A63880 |  |
| Recombinant DNA reagent | pFN26A BIND plasmid | Promega | E1380 |  |
| Recombinant DNA reagent | pGL4.35 plasmid | Promega | E1370 |  |
| Commercial assay or kit | Dual-Glo Luciferase assay | Promega | E2920 |  |
| Recombinant DNA reagent | pBirAcm | Avidity | AVB99 |  |
| Strain, strain background (Escherichia coli) | BL21 Star (DE3) competent cells | ThermoFisher | C601003 | For protein expression |
| Commercial assay or kit | Ni-NTA Superflow resin | Qiagen | 30410 |  |
| Recombinant DNA reagent | pFASTBac1 plasmid | ThermoFisher | 10360014 |  |
| Commercial assay or kit | MEGAscript T7 transcription kit | ThermoFisher | AM1333 |  |
| Peptide, recombinant protein | T4 DNA ligase | New England Biolabs | M0202S |  |
| Other | Amicon Ultracel 0.5 mL 30K MWCO column | Millipore Sigma | UFC503024 |  |
| Commercial assay or kit | Model 422 Electro-Eluter | Bio Rad | 1652976 |  |
| Commercial assay or kit | Retic Lysate IVT Kit | ThermoFisher | AM1200 |  |
| Sequence-based reagent | PF30P oligo | IDT | Custom synthesis | /5Phos/AA AAA AAA AAA AAA AAA AAA A/iSp9//iSp9//iSp9/AC C/3Puro/ |
| Peptide, recombinant protein | SuperScript II Reverse Transcriptase | ThermoFisher | 18064014 |  |
| Commercial assay or kit | Dynabeads MyOne Streptavidin T1 | ThermoFisher | 65602 |  |
| Sequence-based reagent | Salmon Sperm DNA | ThermoFisher | 15632011 |  |
| Peptide, recombinant protein | BSA | New England Biolabs | B9000S |  |
| Software, algorithm | PADDLE | This paper | github.com/asanborn/PADDLE |  |

Additional files
