DNAm: DNA methylation; EPIC: Illumina Infinium MethylationEPIC BeadChip; Epig. Roadmap: Roadmap Epigenomics Consortium; ESC: embryonic stem cell; ESS: epigenetic supersimilarity; GSA: Global …
(A) Relationship between date of conception and date of sample collection for ENID (top) and EMPHASIS (bottom) cohorts. (B) Modelled seasonal change in methylation for 768 SoC-associated loci (false …
(A) Relationship between conception date of modelled methylation maximum measured at 2 and 5–7 yr in the same n = 138 individuals from the ENID cohort. n = 157 SoC-CpGs with a significant SoC …
(A) Date of conception at modelled methylation maxima for 259 SoC-CpGs and 259 corresponding matched and random controls across all three analysed datasets. Green and yellow bands indicate the …
(A) Methylation distribution of SoC-CpGs, matched controls, and array background in pre-gastrulation inner cell mass (ICM) and post-gastrulation embryonic liver, measured in reduced representation …
Chromatin states predicted by ChromHMM (Ernst and Kellis, 2012) from chromatin marks in four cell lines and tissues generated by the Roadmap Epigenomics Consortium et al., 2015. Predicted states for …
(A) Proportion of SoC-CpGs, matched controls, and array background CpGs proximal to ERV1 endogenous retroviral elements (top) and ZFP57 binding sites (bottom), within the specified distance. CpG …
Data comprises n=233 and n=289 individuals from the ENID (2 yr) and EMPHASIS cohorts, respectively. 768 SoC-associated CpGs (false discovery rate [FDR]<5%) identified in the ENID (2 yr) cohort.
Data comprises samples from n=138 ENID participants with methylation data at both 2 yr and 5–7 yr.
Conception dates at methylation minima for loci plotted in Figure 4A. Note that since seasonality is modelled by a single pair of Fourier terms, maxima and minima are 6 months apart.
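The 6-month separation follows directly from the form of the model: a single sine/cosine pair can be rewritten as one cosine wave, which has exactly one maximum and one minimum per year, half a period apart. The sketch below illustrates this under stated assumptions. The function name, the coefficient values, and the 365.25-day period are hypothetical and are not taken from the fitted models.

```python
import math

PERIOD_DAYS = 365.25  # assumed length of the annual cycle


def seasonal_extrema(a_sin, b_cos, period=PERIOD_DAYS):
    """Locate the extrema of a one-harmonic seasonal curve.

    The curve is y(t) = mean + a_sin*sin(2*pi*t/period) + b_cos*cos(2*pi*t/period).
    Returns (day_of_max, day_of_min, peak_to_trough_difference).
    """
    # a*sin(x) + b*cos(x) == r*cos(x - phi), with r = hypot(a, b), phi = atan2(a, b)
    r = math.hypot(a_sin, b_cos)
    phi = math.atan2(a_sin, b_cos)
    day_max = (phi / (2 * math.pi) * period) % period
    # A single cosine wave reaches its minimum half a period after its maximum
    day_min = (day_max + period / 2) % period
    return day_max, day_min, 2 * r


# Hypothetical coefficients, for illustration only
day_max, day_min, peak_to_trough = seasonal_extrema(a_sin=1.5, b_cos=-2.0)
gap = (day_min - day_max) % PERIOD_DAYS
print(f"max at day {day_max:.1f}, min at day {day_min:.1f}, gap {gap:.1f} days")
```

Whatever coefficients are supplied, the gap between the two dates is half the period, so reporting minima adds no information beyond a fixed shift of the maxima.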
Data from Supplementary file 1D.
Top: Conception date at modelled methylation maxima for ENID 2 yr (x-axis) and EMPHASIS (y-axis) cohorts, according to whether locus falls within (n=154; left-red), or outside (n=105; right-blue) of …
Unadjusted methylation beta values are plotted. Boxes represent inter-quartile ranges (IQRs) for season-specific DNAm at each locus. Note that for visualisation purposes Gambian seasons are …
This is the same as Figure 4D, but with a single CpG randomly sampled from each CpG cluster so that CpGs in each set are a minimum distance of 5000 bp apart. This shows that pairwise correlation …
(Left): Methylation stratified according to sperm methylation status reported in Sugden et al., 2020. Sperm hypomethylation is defined as methylation ≤ 25%. (Right) As left but with loci …
Histone marks or combinations thereof are ordered by abundance. H3 marks were generated by the Roadmap Epigenomics Consortium Sanchez-Delgado et al., 2016 and downloaded using the annotatr (v1.10.0) …
(A) Proportion of SoC-CpGs, matched controls, and array background CpGs proximal to ERVK endogenous retroviral elements (top), and ZFP57 and TRIM28 binding sites (bottom), within the specified …
(A) First and second (left) and second and third (right) principal components (PCs) from a principal component analysis (PCA) of genome-wide genetic data from n=294 individuals from the EMPHASIS …
This figure replicates Figure 4A in the main paper for the ENID 2 yr and EMPHASIS cohorts, but with additional adjustment for ethnicity. Here, we adjust for ethnicity in the EMPHASIS cohort using …
Seven SoC-CpGs on the Illumina 450k array are highlighted. This region falls within intron 2 and bears the hallmarks of being a promoter and/or active or poised enhancer in multiple cell lines. The …
Methylation values at each locus are centred to have mean zero to enable comparison across loci. LINE1 and Alu methylation values are predicted using REMP (v1.16.0) (see Materials and methods). …
Matched controls are selected from array background using Kolmogorov-Smirnov (KS) tests to identify CpGs with similar methylation distributions to SoC-CpGs (see Materials and methods). Top: …
Cohort | Sample size | Age | % male | Tissue | Methylation array |
---|---|---|---|---|---|
ENID (2 yr) | 233 | 2 years | 50.6 | Peripheral blood | Illumina Infinium HM450 |
ENID (5–7 yr) | 138 | 5–7 years | 56.5 | Peripheral blood | Illumina Infinium MethylationEPIC |
EMPHASIS | 289 | 7–9 years | 54.3 | Peripheral blood | Illumina Infinium MethylationEPIC |
Note: ENID: Early Nutrition and Immune Development Trial (Moore et al., 2012); EMPHASIS: Epigenetic Mechanisms linking Pre-conceptional nutrition and Health Assessed in India and Sub-Saharan Africa (Chandak et al., 2017). Individuals with ENID longitudinal (5–7 yr) methylation data are a subset of those with methylation at 2 yr. There is no overlap between individuals included in the ENID and EMPHASIS cohorts.
CpG set | Number of CpGs | Notes |
---|---|---|
Array background | 391,814 | Intersection of CpGs on Illumina HM450 (ENID 2 yr) and EPIC (EMPHASIS) cohort arrays, post QC |
SoC-CpGs | 259 | SoC-associated CpGs with SoC effect size (SoC methylation amplitude) ≥ 4% in the ENID 2 yr dataset |
Matched controls | 259 | CpGs with similar methylation distributions to SoC-CpGs in the ENID 2 yr dataset* |
Random controls | 259 | Random sample from array background |
*Matching methylation distributions determined by Kolmogorov-Smirnov tests (see Appendix 1—figure 16). QC: quality control; LRT: likelihood ratio test. See Materials and methods for further details.
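As an illustration of distribution matching with Kolmogorov-Smirnov tests, the sketch below computes the two-sample KS statistic and greedily assigns each target CpG the unused background CpG with the most similar methylation distribution. This is a simplified stand-in and not the procedure described in Materials and methods: the greedy rule, the function names, the CpG identifiers, and the beta values are all assumptions made for the example.

```python
def ks_statistic(x, y):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # Step both empirical CDFs past every value <= v (handles ties)
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def match_controls(targets, background):
    """Greedy matching (assumed rule): each target takes the closest unused control."""
    used = set()
    matches = {}
    for target_id, target_values in targets.items():
        candidates = [cid for cid in background if cid not in used]
        best = min(candidates,
                   key=lambda cid: ks_statistic(target_values, background[cid]))
        used.add(best)
        matches[target_id] = best
    return matches


# Hypothetical methylation beta values across four samples
targets = {"cg_target_1": [0.10, 0.20, 0.30, 0.40]}
background = {
    "cg_background_1": [0.80, 0.85, 0.90, 0.95],
    "cg_background_2": [0.12, 0.22, 0.28, 0.41],
}
print(match_controls(targets, background))
```

A threshold on the KS test p-value could replace the nearest-statistic rule. The greedy version is used here only because it keeps the example short.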
CpG set | Notes |
---|---|
Putative metastable epialleles (MEs) | 1881 ME/SIV/ESS CpGs overlapping array background identified in multi-tissue and MZ/DZ screens in Van Baak et al., 2018 and Kessler et al., 2018. |
Parent-of-origin-specific methylation (PofOm) | 699 Parent-of-origin-specific methylation loci identified in peripheral blood in Zink et al., 2018, overlapping array background. |
Embryo DNAm data | RRBS data for inner cell mass and embryonic liver (<10 weeks’ gestation) from Guo et al., 2014. |
Sperm DNAm data | WGBS data from Okae et al., 2014. |
Germline DMRs (gDMRs) | Regions differentially methylated in sperm and oocytes identified in WGBS data by Sanchez-Delgado et al., 2016. |
Transposons (ERVs) | ERVs determined by RepeatMasker were downloaded from the UCSC hg19 annotations repository. |
Transcription factor ChIP-seq | ZFP57, TRIM28, and CTCF transcription factor binding sites identified from ChIP-seq in human embryonic kidney and hESCs are described in Kessler et al., 2018. |
Chromatin state predictions and histone H3 marks | Chromatin state predictions for H1 ESCs, fetal brain, fetal muscle, and fetal small intestine generated using ChromHMM (Ernst and Kellis, 2012), from Roadmap Epigenomics Consortium et al., 2015. Histone mark data are from the same source. |
ME: metastable epiallele; SIV: systemic interindividual variation; ESS: epigenetic supersimilarity; MZ/DZ: monozygotic/dizygotic twins; PofOm: parent-of-origin-specific methylation; RRBS: reduced representation bisulfite-seq; WGBS: whole-genome bisulfite-seq; DMR: differentially methylated region; ERV: endogenous retrovirus; ESCs: embryonic stem cells. See Materials and methods for further details.
CpG set | Number of CpGs with mQTL | Number of mQTL (cis/trans) | Median number of mQTL per CpG (IQR) | Methylation variance explained* |
---|---|---|---|---|
SoC-CpGs | 130 (50%) | 2771 (2549/222) | 6 (2–30) | 0.09 (0.08–0.15) |
Matched controls | 201 (78%) | 7886 (7417/469) | 15 (4–50) | 0.09 (0.06–0.21) |
Random controls | 50 (19%) | 1512 (1476/36) | 7 (2–35) | 0.1 (0.08–0.18) |
*Delta adjusted R² (see Materials and methods); IQR: inter-quartile range.
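For readers unfamiliar with the term, a delta adjusted R² is commonly computed as the difference in adjusted R² between a model that includes the predictors of interest and a baseline model that omits them. The sketch below uses the standard adjusted R² formula. The baseline/full model split, the function names, and all numbers are assumptions for illustration, and the exact definition used for the table is the one given in Materials and methods.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Standard adjusted R^2: penalises R^2 for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


def delta_adjusted_r2(r2_base, p_base, r2_full, p_full, n_samples):
    """Gain in adjusted R^2 when moving from the baseline to the full model."""
    return (adjusted_r2(r2_full, n_samples, p_full)
            - adjusted_r2(r2_base, n_samples, p_base))


# Hypothetical values: baseline with 4 covariates, full model adds 1 genetic variant
delta = delta_adjusted_r2(r2_base=0.20, p_base=4, r2_full=0.30, p_full=5, n_samples=101)
print(f"delta adjusted R^2 = {delta:.3f}")
```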
Supplementary tables.
(a) ENID SoC-associated CpGs (no amplitude filter applied). (b) Seasonal amplitude tests. (c) Inter-cohort change in SoC amplitude at ENID SoC-associated CpGs. (d) SoC-CpGs (ENID SoC-associated CpGs with SoC methylation amplitude ≥4%). (e) SoC-CpG clusters and singletons. (f) SoC-CpG cluster sizes. (g) SoC-CpG enrichment for MEs and overlap with ME sub-classes. (h) SoC-CpG enrichment for gDMRs and parent-of-origin-specific methylation. (i) SoC-CpGs overlapping maternally methylated germline DMRs. (j) SoC-CpG mQTL and their association with SoC. (k) Candidate genes previously associated with periconception/gestational exposures overlapping array background. (l) Look-up of SoC-CpGs in the EWAS Catalog. (m) EWAS Catalog data grouped by trait. (n) SoC-CpG associations with selected variables measured in the ENID 2 yr dataset. (o) Look-up of SoC-CpGs in the GWAS Catalog. (p) ENID (2 yr): Association test p-values to detect potential residual confounding. (q) EMPHASIS (7–9 yr): Association test p-values to detect potential residual confounding. (r) ENID (5–7 yr): Association test p-values to detect potential residual confounding. (s) ENID (2 yr): Cell count, batch, and village sensitivity analyses.