Epigenetics: Making the most of methylation
DNA methylation is a key mechanism used by higher eukaryotes to regulate gene expression. The addition of a methyl group to carbon atom number 5 of cytosine bases in DNA is known to repress the transcription of genes into messenger RNA molecules, thus reducing the production of the proteins encoded by these genes. Most methylation occurs at CpG dinucleotides (cytosines that are immediately followed by guanines), and these often cluster together to form CpG islands in the promoter regions of genes. In the late 1990s, it was discovered that transcription was repressed when methyl-CpG binding proteins were recruited to methylated CpG islands (Hendrich and Bird, 1998).
Subsequent studies have confirmed that the binding of these proteins throughout the genome is proportional to the density of DNA methylation (Baubec et al., 2013), and have identified additional proteins with a high affinity for methylated CpG sites (reviewed in Defossez and Stancheva, 2011). Moreover, in recent years, other screening approaches based mainly on mass spectrometry have revealed that more proteins bind to methylated DNA than previously thought (Mittler et al., 2009; Bartke et al., 2010; Bartels et al., 2011; Spruijt et al., 2013). Now, in eLife, Heng Zhu and co-workers at the Johns Hopkins University School of Medicine—including Shaohui Hu as first author—use a high-throughput screening method to show that many human transcription factors also interact with genomic DNA sequences containing methylated CpG sites (Hu et al., 2013).
To this end, the Johns Hopkins researchers used a published protein microarray comprising 1,321 transcription factors and 210 co-factors (Hu et al., 2009). Hu et al. incubated the array with 154 distinct human promoter sequences, each of which contained at least one methylated CpG dinucleotide. Their results revealed that 150 (97%) of the 154 methylated human promoter sequences showed specific binding to at least one protein on the microarray. Moreover, of the 1,531 proteins, 47 (3%) bound to methylated cytosines within the promoters. Most of the proteins bound to methylated DNA in a sequence-dependent manner; however, a minority bound to many different methylated DNA probes, indicating that binding can sometimes occur independently of DNA sequence (Figure 1).
![](https://iiif.elifesciences.org/lax/01387%2Felife-01387-fig1-v1.tif/full/617,/0/default.jpg)
Figure 1. Some human transcription factors can bind to both methylated and non-methylated DNA sequences. Hu et al. examined the ability of 17 human transcription factors to bind to 150 different DNA motifs containing methylated or non-methylated CpG islands. Each row represents one transcription factor. For each motif, some transcription factors bound only to the methylated version (red), some only to the non-methylated version (blue), some to both versions (green), and some to neither (grey). From Figure 2a in Hu et al., 2013.
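The counts quoted for the microarray screen can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the numbers given in the text (1,321 transcription factors, 210 co-factors, 47 methyl-binding proteins, 150 of 154 probes); the variable and function names are illustrative and do not come from Hu et al.

```python
# Counts taken from the text; names are illustrative, not from Hu et al.
transcription_factors = 1321
co_factors = 210
proteins_on_array = transcription_factors + co_factors

methyl_binding_proteins = 47
promoter_probes = 154
probes_with_binding = 150


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


protein_share = percent(methyl_binding_proteins, proteins_on_array)
probe_share = percent(probes_with_binding, promoter_probes)

print(f"Proteins on array: {proteins_on_array}")
print(f"Proteins binding methylated DNA: {protein_share:.1f}%")
print(f"Probes bound by at least one protein: {probe_share:.1f}%")
```

The two shares come out at roughly 3% and 97%, in line with the rounded figures in the text.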
A number of transcription factors, including KLF4—a recently identified methyl-CpG binding protein (Spruijt et al., 2013)—interacted with methylated sequences that did not resemble their known consensus DNA binding motifs. Using a technique based on electrophoresis, Hu et al. showed that KLF4 binds methylated and non-methylated DNA in a non-competitive manner: this suggests that different domains of the protein may be responsible for each type of binding, which they duly confirmed using molecular modeling and mutagenesis studies.
The Johns Hopkins researchers then mined published ChIP-sequencing data from stem cells to identify the target DNA sequences of KLF4, and compared these with data on genome-wide DNA methylation. Strikingly, KLF4 binding appears to be bimodal in nature throughout the genome, with 38% of KLF4 binding sites showing less than 20% methylation, and 48% showing methylation levels over 80%. Finally, Hu et al. used ChIP-bisulfite sequencing, which makes it possible to determine the methylation status of each cytosine within a target DNA sequence, to confirm that KLF4 also binds to both methylated and non-methylated DNA in vivo.
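The bimodal pattern described above amounts to sorting binding sites into methylation classes using the two thresholds named in the text (below 20% and above 80%). The following is a minimal sketch of that binning step, assuming each binding site has already been assigned a single methylation fraction between 0 and 1. The function names and the example values are hypothetical and are not data or code from Hu et al.

```python
from collections import Counter

LOW_CUTOFF = 0.20   # "less than 20% methylation" in the text
HIGH_CUTOFF = 0.80  # "methylation levels over 80%" in the text


def classify_site(methylation_fraction):
    """Assign one binding site to a methylation class."""
    if not 0.0 <= methylation_fraction <= 1.0:
        raise ValueError("methylation fraction must lie between 0 and 1")
    if methylation_fraction < LOW_CUTOFF:
        return "low"
    if methylation_fraction > HIGH_CUTOFF:
        return "high"
    return "intermediate"


def summarize_sites(methylation_fractions):
    """Return the share of sites falling in each methylation class."""
    fractions = list(methylation_fractions)
    if not fractions:
        raise ValueError("at least one binding site is required")
    counts = Counter(classify_site(f) for f in fractions)
    return {
        label: counts.get(label, 0) / len(fractions)
        for label in ("low", "intermediate", "high")
    }


# Hypothetical toy values for illustration only, NOT data from Hu et al.
example_sites = [0.05, 0.10, 0.15, 0.50, 0.85, 0.90, 0.95, 0.99]
summary = summarize_sites(example_sites)
print(summary)
```

A distribution is bimodal in the sense used here when most sites fall in the "low" and "high" classes and few in the "intermediate" class.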
Hu et al. only profiled a small fraction of the complete human methylome for interactions with transcription factors; further proteins capable of binding genomic methyl CpG sequences surely await identification. The same holds true for interactions with methylated non-CpG sequences such as methyl-CpA (cytosine adjacent to adenine), which are fairly abundant in embryonic stem cells (Ramsahoye et al., 2000; Lister et al., 2009). To determine the physiological relevance of these interactions, it will be important to deduce the affinity with which proteins bind these sequences compared to their known targets; initial experiments along these lines are presented in the current eLife paper. Furthermore, recent evidence suggests that non-methylated CpG islands recruit activator proteins, many of which contain a CXXC motif (reviewed in Long et al., 2013). The transcription factor microarray approach used by the Johns Hopkins team, combined with quantitative mass spectrometry-based technology (Spruijt et al., 2013), could thus be used to identify the complete cellular complement of proteins that bind specifically to non-methylated CpG islands.
Finally, this study and other recently published papers force us to reconsider the mechanism(s) via which CpG methylation regulates transcription. Although DNA methylation is generally considered to be a repressive epigenetic modification, experiments presented by Hu et al. suggest that in some cases, methylation of a given promoter sequence can result in activation of transcription. Moreover, other work has revealed a temporal uncoupling of DNA methylation and transcriptional repression during Xenopus embryogenesis (Bogdanovic et al., 2011). Further experiments are therefore required to determine whether the functional readout of CpG methylation is affected by the repertoire and abundance of different DNA methylation ‘readers’ acting at any given time in a cell or a developing organism.
References
- Defossez and Stancheva (2011). Biological functions of methyl-CpG-binding proteins. Prog Mol Biol Transl Sci 101:377–398. https://doi.org/10.1016/B978-0-12-387685-0.00012-3
- Hendrich and Bird (1998). Identification and characterization of a family of mammalian methyl-CpG binding proteins. Mol Cell Biol 18:6538–6547.
- Long et al. (2013). ZF-CxxC domain-containing proteins, CpG islands and the chromatin connection. Biochem Soc Trans 41:727–740. https://doi.org/10.1042/BST20130028
Copyright
© 2013, Vermeulen
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.