Simple biochemical features underlie transcriptional activation domain diversity and dynamic, fuzzy binding to Mediator
Abstract
Gene activator proteins comprise distinct DNA-binding and transcriptional activation domains (ADs). Because few ADs have been described, we tested domains tiling all yeast transcription factors for activation in vivo and identified 150 ADs. By mRNA display, we showed that 73% of ADs bound the Med15 subunit of Mediator, and that binding strength was correlated with activation. AD-Mediator interaction in vitro was unaffected by a large excess of free activator protein, pointing to a dynamic mechanism of interaction. Structural modeling showed that ADs interact with Med15 without shape complementarity ('fuzzy' binding). ADs shared no sequence motifs, but mutagenesis revealed biochemical and structural constraints. Finally, a neural network trained on AD sequences accurately predicted ADs in human proteins and in other yeast proteins, including chromosomal proteins and chromatin remodeling complexes. These findings solve the longstanding enigma of AD structure and function and provide a rationale for their role in biology.
Data availability
All data from in vivo activation and in vitro screens are included in tables as source data files. PDB files of structural models of Med15-AD interactions are included in Figure 6-source data 2. All sequencing data have been deposited in GEO, under the accession code GSE173156.
Article and author information
Funding
National Institutes of Health (R01-DK121366 and R01-AI021144)
- Roger D Kornberg
U.S. Department of Energy (Office of Science Graduate Student Research (SCGSR) program (DE-SC0014664))
- Raphael J L Townshend
National Institutes of Health (F32-GM126704)
- Jordan T Feigerle
National Institutes of Health (R01-GM127359)
- Ron O Dror
U.S. Department of Energy (Scientific Discovery through Advanced Computing (SciDAC) program)
- Ron O Dror
National Science Foundation (Physics Frontiers Center Award (PHY1427654))
- Erez Lieberman-Aiden
Welch Foundation (Q-1866)
- Erez Lieberman-Aiden
U.S. Department of Agriculture (Agriculture and Food Research Initiative Grant (2017-05741))
- Erez Lieberman-Aiden
National Institutes of Health (4D Nucleome Grant (U01HL130010))
- Erez Lieberman-Aiden
National Institutes of Health (Encyclopedia of DNA Elements Mapping Center Award (UM1HG009375))
- Erez Lieberman-Aiden
U.S. Department of Defense (National Defense Science & Engineering Graduate (NDSEG) Fellowship)
- Adrian L Sanborn
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2021, Sanborn et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.