DNA methylation presents distinct binding sites for human transcription factors

  1. Shaohui Hu
  2. Jun Wan
  3. Yijing Su
  4. Qifeng Song
  5. Yaxue Zeng
  6. Ha Nam Nguyen
  7. Jaehoon Shin
  8. Eric Cox
  9. Hee Sool Rho
  10. Crystal Woodard
  11. Shuli Xia
  12. Shuang Liu
  13. Huibin Lyu
  14. Guo-Li Ming
  15. Herschel Wade (corresponding author)
  16. Hongjun Song (corresponding author)
  17. Jiang Qian (corresponding author)
  18. Heng Zhu (corresponding author)
  1. Johns Hopkins University School of Medicine, United States
  2. Hugo W Moser Research Institute at Kennedy Krieger, Johns Hopkins University School of Medicine, United States
  3. Chinese Academy of Sciences, China
4 figures and 1 additional file

Figures

Figure 1 with 8 supplements
Protein microarray-based approach identified mCpG-dependent DNA-binding activity among human TFs and cofactors.

(A) A competition assay was used to identify proteins that preferentially bind to methylated DNA motifs. SCAPER (S-phase cyclin A-associated protein in the ER) and E2F3 (E2F transcription factor 3) …

https://doi.org/10.7554/eLife.00726.003
Figure 1—figure supplement 1
Data analysis of the protein microarray assays.

(A) Workflow of data normalization. (B) Local normalization (window size 9 × 9). (C) Extrapolation of background noise distribution. Noise distribution of N2 is mirrored from distribution of N1. …

https://doi.org/10.7554/eLife.00726.004
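As a rough illustration of panels B and C, the sketch below scales each spot by the median of its 9 × 9 neighborhood and then estimates a noise spread by reflecting the lower half of a signal distribution about its median. The function names, the use of the median as both the local reference and the mirror center, and the division-based scaling are assumptions made for illustration; the caption states only the window size and that one noise distribution is mirrored from the other.

```python
import numpy as np

def local_normalize(signal, window=9):
    """Divide each spot by the median of its window x window neighborhood.

    Assumed scheme for illustration; expects strictly positive intensities
    and an array at least `window` spots wide in each dimension.
    """
    signal = np.asarray(signal, dtype=float)
    half = window // 2
    # Reflect-pad so that edge spots also get a full-size neighborhood.
    padded = np.pad(signal, half, mode="reflect")
    out = np.empty_like(signal)
    rows, cols = signal.shape
    for r in range(rows):
        for c in range(cols):
            block = padded[r:r + window, c:c + window]
            out[r, c] = signal[r, c] / np.median(block)
    return out

def mirrored_noise_sd(values):
    """Return (center, sd) of a noise model built by mirroring the lower half.

    Values at or below the median are reflected about the median to form a
    symmetric distribution (an assumed reading of the mirroring step).
    """
    values = np.asarray(values, dtype=float).ravel()
    center = np.median(values)
    lower = values[values <= center]
    mirrored = np.concatenate([lower, 2 * center - lower])
    return center, mirrored.std()
```

A spot could then be called a hit when its normalized signal exceeds the center by some multiple of the mirrored standard deviation; the cutoff itself is not given in the caption.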
Figure 1—figure supplement 2
Reproducibility of protein microarray data.

Left panel: signal comparison between duplicate binding assays with motif M303 shows a high correlation, confirming the reproducibility of the assay. Right panel: comparison between two random …

https://doi.org/10.7554/eLife.00726.005
Figure 1—figure supplement 3
Distribution of the number of mCpG-binding TFs/cofactors in a given motif-binding assay.

The median value of TFs/cofactors binding to one methylated CpG-containing motif is 8.

https://doi.org/10.7554/eLife.00726.006
Figure 1—figure supplement 4
Distribution of the number of methylated motifs recognized by a given TF/cofactor.

Most TFs/cofactors bind to very few methylated DNA motifs, whereas 7 TFs bind to more than 77 of the 154 motifs tested in this study.

https://doi.org/10.7554/eLife.00726.007
Figure 1—figure supplement 5
Distribution of TF subfamily members.

(A) Distribution of TF subfamily members that showed mCpG-binding activity. (B) Distribution of all annotated TF subfamily members presented on the TF protein microarrays. Statistical analysis showed …

https://doi.org/10.7554/eLife.00726.008
Figure 1—figure supplement 6
Four additional EMSA assays (A) and competition EMSA assays (B).

The results confirmed specificity of mCpG-dependent DNA-binding activities.

https://doi.org/10.7554/eLife.00726.009
Figure 1—figure supplement 7
Methylation level of the KLF4 and HOXA5 luciferase reporter constructs.

Eight units of KLF4 (TCCCGCCCA) and HOXA5 (AAACGCTGCC) binding motifs were separately cloned into the promoter region of a CpG-free luciferase reporter vector, and methylated with SssI before …

https://doi.org/10.7554/eLife.00726.010
Figure 1—figure supplement 8
Number of unique mCpG-binding TFs/cofactors as a function of the number of tested methylated DNA motifs.

The curve is far from saturation, suggesting that more such TFs/cofactors remain to be discovered.

https://doi.org/10.7554/eLife.00726.011
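A curve of this kind can be sketched as an accumulation curve: motifs are added one at a time in random order and the number of distinct binders seen so far is recorded, averaged over many orderings. The code below is a minimal sketch under that assumption; the function name, the shuffling scheme, and the toy input are hypothetical and are not taken from the study.

```python
import random

def accumulation_curve(binders_by_motif, n_shuffles=100, seed=0):
    """Mean number of unique binders after testing 1..N motifs.

    binders_by_motif maps a motif ID to the set of TFs/cofactors that bound it.
    The mean is taken over n_shuffles random motif orderings.
    """
    motifs = list(binders_by_motif)
    rng = random.Random(seed)
    totals = [0.0] * len(motifs)
    for _ in range(n_shuffles):
        rng.shuffle(motifs)
        seen = set()
        for i, motif in enumerate(motifs):
            seen |= set(binders_by_motif[motif])
            totals[i] += len(seen)
    return [t / n_shuffles for t in totals]

# Toy, made-up input (not data from the paper).
toy = {
    "motif_a": {"TF1", "TF2"},
    "motif_b": {"TF2"},
    "motif_c": {"TF3", "TF4"},
}
curve = accumulation_curve(toy)
```

If the curve is still rising steeply at its last point, additional motifs would be expected to reveal additional binders, which is the reading the caption gives.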
Figure 2 with 4 supplements
A group of 17 TFs can bind to both methylated and unmethylated motifs of distinct sequences.

(A) Our previous PDI dataset was combined with the dataset in this study to generate the binding preferences of the 17 TFs. Methylated consensus motifs of the 17 TFs identified based on the protein …

https://doi.org/10.7554/eLife.00726.012
Figure 2—figure supplement 1
Competition EMSA assays for ARID3B and ZMYM3.

As expected, unlabeled and methylated motif M319 showed dose-dependent competition against the labeled, methylated motif M319; whereas unlabeled and unmethylated motif M47 could readily compete off …

https://doi.org/10.7554/eLife.00726.013
Figure 2—figure supplement 2
Competition EMSA assays for KLF4 and TFAP2A.

Complex formation between KLF4 and methylated mM197 and between KLF4 and unmethylated umM412 is not affected by either umM412 or mM197, respectively. However, when both methylated and non-methylated …

https://doi.org/10.7554/eLife.00726.014
Figure 2—figure supplement 3
Summary of KLF4’s dual-specificity.

Competition EMSA assays confirm KLF4's binding specificity to methylated motif M197 (mM197) and unmethylated motif M412 (umM412).

https://doi.org/10.7554/eLife.00726.015
Figure 2—figure supplement 4
OIRD sensorgrams for three TFs and MBD2b binding to three methylated DNA motifs.

(A) MBD2b with a reported KD value of 330 nM was used as a benchmark in the OIRD system, showing the sensorgrams of MBD2b binding to methylated M203, M213 and M197. (B)–(D) OIRD sensorgrams for …

https://doi.org/10.7554/eLife.00726.016
Figure 3 with 4 supplements
KLF4’s mCpG-dependent binding activity is decoupled from its binding activity to unmethylated motifs.

(A) Simulation of KLF4–DNA interactions predicted that two residues, Arg458 and Asp460, are involved in the interactions with methylated cytosine. Double arrow indicates van der Waals interactions …

https://doi.org/10.7554/eLife.00726.017
Figure 3—figure supplement 1
Architecture of KLF4 DNA-binding domain.

KLF4 encodes two and a half zinc-finger DNA-binding domains at its C-terminus. Residues R458 and D460, which were predicted to interact with the 5-methyl group in the cytosine, are located in the …

https://doi.org/10.7554/eLife.00726.018
Figure 3—figure supplement 2
Known crystal structures of MeCP2 and ZFP57 in complex with methylated DNA.

The pink and blue double arrows represent van der Waals force between the arginine and methyl groups. Red balls are water molecules.

https://doi.org/10.7554/eLife.00726.019
Figure 3—figure supplement 3
EMSA assays to evaluate impacts of KLF4 R458K, R458A::D460A mutations, and Δ432 truncation on its binding activity to motifs M412 and M197.

These results clearly demonstrated that both the single- and double-mutations, as well as the truncation, abolished KLF4's ability to form a complex with methylated motif M197, while neither showed …

https://doi.org/10.7554/eLife.00726.020
Figure 3—figure supplement 4
Western blot analysis of overexpression of KLF4(WT), KLF4(R458A) and KLF4(D460A) proteins in GT1-7 cells.

Using GAPDH as a control, these results demonstrated equal transfection efficiency of the constructs.

https://doi.org/10.7554/eLife.00726.021
Figure 4 with 3 supplements
Endogenous KLF4 binds to methylated loci in human embryonic stem cells (H1) in vivo.

(A) Bioinformatics analysis to derive the logo of the methylated DNA motif bound by KLF4 by integrating KLF4 ChIP-Seq and methylome data in H1 cells. Based on the distribution of methylation level at the …

https://doi.org/10.7554/eLife.00726.022
Figure 4—figure supplement 1
Integration of KLF4 ChIP-seq and methylome data in H1 cells.

KLF4 ChIP-Seq and methylome data in H1 cells were combined to assign methylation levels to KLF4 ChIP'ed segments (upper panel). The lower panel is a schematic plot of KLF4 binding summits. The pink ovals …

https://doi.org/10.7554/eLife.00726.023
Figure 4—figure supplement 2
Five selected KLF4-binding loci for further analyses.

The chromosome positions and KLF4 ChIP-seq peaks (GSM447584) are shown.

https://doi.org/10.7554/eLife.00726.024
Figure 4—figure supplement 3
An example of KLF4 ChIP-bisulfite sequencing assay.

The sequencing results confirmed that KLF4 bound to hyper-methylated loci in the sequence context of CCmCGCC (arrows) in H1 cells. Upper and lower panels represent bisulfite sequencing results of the …

https://doi.org/10.7554/eLife.00726.025

Additional files

Supplementary file 1

(A) The 154 CpG-containing motifs tested on our protein microarray. (B) List of transcription factors and cofactors available on our protein microarray. (C) Transcription factors and cofactors binding to methylated DNA motif(s). (D) Methylated 6-mers with CpG at the center position that are bound by KLF4, obtained by integrating KLF4 ChIP-Seq and methylome data in human H1 cells. (E) Information on the loci (L1–L5) tested in Figure 4C–F: genome locations (hg18), sequences, and ChIP PCR and bisulfite-sequencing primers.

https://doi.org/10.7554/eLife.00726.026
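To make "6-mers with CpG at the center position" in part (D) concrete, the sketch below pulls every such 6-mer out of a DNA sequence, taking two bases on each side of a CG dinucleotide. The function name is hypothetical, and the step that assigns methylation levels from the methylome data is not shown; this is an illustration of the 6-mer definition only, not the authors' pipeline.

```python
def cpg_centered_6mers(sequence):
    """Return all 6-mers of the form NNCGNN found in the sequence, in order."""
    seq = sequence.upper()
    hits = []
    for i in range(2, len(seq) - 3):
        # Positions i and i+1 are the central CpG; two flanking bases per side.
        if seq[i:i + 2] == "CG":
            hits.append(seq[i - 2:i + 4])
    return hits

# The KLF4 and HOXA5 motifs quoted in the Figure 1 supplement 7 caption.
klf4_hits = cpg_centered_6mers("TCCCGCCCA")
hoxa5_hits = cpg_centered_6mers("AAACGCTGCC")
```

For the KLF4 motif this yields the single 6-mer CCCGCC, which is the CCmCGCC context named in the ChIP-bisulfite sequencing caption once the central cytosine is methylated.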
