Maternal smoking DNA methylation risk score associated with health outcomes in offspring of European and South Asian ancestry

  1. Wei Q Deng (corresponding author)
  2. Nathan Cawte
  3. Natalie Campbell
  4. Sandi M Azab
  5. Russell J de Souza
  6. Amel Lamri
  7. Katherine M Morrison
  8. Stephanie A Atkinson
  9. Padmaja Subbarao
  10. Stuart E Turvey
  11. Theo J Moraes
  12. Koon K Teo
  13. Piush J Mandhane
  14. Meghan B Azad
  15. Elinor Simons
  16. Guillaume Paré
  17. Sonia S Anand (corresponding author)
  1. Department of Medicine, Faculty of Health Sciences, McMaster University, Canada
  2. Peter Boris Centre for Addictions Research, St. Joseph’s Healthcare Hamilton, Canada
  3. Department of Psychiatry and Behavioural Neurosciences, McMaster University, Canada
  4. Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Canada
  5. Department of Health Research Methods, Evidence, and Impact, McMaster University, Canada
  6. Department of Pediatrics, McMaster University, Canada
  7. Department of Pediatrics, University of Toronto, Canada
  8. Department of Pediatrics, BC Children’s Hospital, The University of British Columbia, Canada
  9. Program in Translational Medicine, SickKids Research Institute, Canada
  10. Department of Pediatrics, University of Alberta, Canada
  11. Children’s Hospital Research Institute of Manitoba, Department of Pediatrics and Child Health, University of Manitoba, Canada
  12. Section of Allergy and Immunology, Department of Pediatrics and Child Health, University of Manitoba, Canada
  13. Thrombosis and Atherosclerosis Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Canada
  14. Department of Pathology and Molecular Medicine, McMaster University, Michael G. DeGroote School of Medicine, Canada
3 figures, 3 tables and 1 additional file

Figures

Schematic overview of the analytical pipeline for the cord blood DNA methylation (DNAm) maternal smoking score and association study.

(A) shows the epigenome-wide association studies (EWAS) conducted in the European cohorts (CHILD and FAMILY); (B) illustrates the workflow for methylation risk score (MRS) construction, using an external EWAS (Joubert et al., 2016) as the discovery sample and the Canadian Healthy Infant Longitudinal Development (CHILD) study as the external validation study; (C) shows the evaluation of the MRS in two independent cohorts of White European (FAMILY) and South Asian (START) ancestry. The validated MRS was then tested for association with smoking-specific, maternal, and child phenotypes in CHILD, FAMILY, and START, as shown in (D). *Indicates cohort sample size including those with missing smoking history.
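
As a rough illustration of what a methylation risk score is, the sketch below computes a weighted sum of CpG methylation values per sample and then standardizes the scores. It assumes per-CpG weights are already available (for example from a penalized regression); the weights, methylation values, and function names here are hypothetical and are not the article's fitted model.

```python
from statistics import mean, pstdev

def methylation_risk_score(betas, weights):
    """Weighted sum of methylation values over CpGs that have a weight."""
    shared = [cpg for cpg in weights if cpg in betas]
    return sum(weights[cpg] * betas[cpg] for cpg in shared)

def standardize(scores):
    """Center scores at 0 and scale to unit (population) standard deviation."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]

# Hypothetical weights and methylation beta-values (illustrative only).
weights = {"cg09935388": -1.0, "cg12876356": -0.5}
samples = [
    {"cg09935388": 0.80, "cg12876356": 0.70},
    {"cg09935388": 0.60, "cg12876356": 0.50},
    {"cg09935388": 0.70, "cg12876356": 0.60},
]
raw_scores = [methylation_risk_score(s, weights) for s in samples]
z_scores = standardize(raw_scores)
```

With negative weights, samples with lower methylation at these CpGs receive higher scores, which matches the direction of the negative effect estimates reported for GFI1 in Table 2.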

Figure 2 with 8 supplements
Manhattan plots of the meta-analyzed association between cord blood DNA methylation (DNAm) and maternal smoking in Europeans.

Manhattan plots summarized the meta-analyzed association p-values between cord blood DNA methylation levels and current maternal smoking (A; n = 744) or smoking exposure (B; n = 735) at a common set of 2114 cytosine–phosphate–guanine (CpG) sites. The red line marks the smallest -log10(p-value) that still passes the false discovery rate (FDR) correction threshold of 0.05. The red dots represent established associations with maternal smoking reported by Joubert and colleagues (Joubert et al., 2016).
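
The position of such an FDR line can be sketched as follows. The caption does not name the FDR procedure, so the Benjamini-Hochberg method is an assumption here, and the function names and p-values are hypothetical.

```python
import math

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def fdr_line(pvalues, alpha=0.05):
    """-log10 of the largest p-value that passes the FDR threshold, or None."""
    adjusted = bh_adjust(pvalues)
    passing = [p for p, q in zip(pvalues, adjusted) if q <= alpha]
    return -math.log10(max(passing)) if passing else None

# Hypothetical p-values for five CpG sites.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = bh_adjust(pvals)
line_height = fdr_line(pvals)
```

In this toy input only the two smallest p-values pass, so the line sits at -log10(0.008).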

Figure 2—source data 1

Histogram of the smoking exposure across the three cohorts.

https://cdn.elifesciences.org/articles/93260/elife-93260-fig2-data1-v1.pdf
Figure 2—figure supplement 1
Manhattan plots of the meta-analyzed association between cord blood DNA methylation and ever maternal smoking in the combined European cohorts.

The meta-analyzed association p-values for ever maternal smoking (n = 744) and methylation levels at 2114 cytosine–phosphate–guanine (CpG) sites were summarized in the Manhattan plot. Ever maternal smoking compared those who were currently smoking or had quit before or during this pregnancy with those who never smoked. The red line marks the smallest -log10(p-value) that still passes the false discovery rate (FDR) correction threshold of 0.05. The red dots represent established associations with maternal smoking reported by Joubert and colleagues (Joubert et al., 2016).

Figure 2—figure supplement 2
Quantile-quantile plots of the meta-analyzed association between cord blood DNA methylation and maternal smoking history or smoking exposure in the combined European cohorts.

Quantile-quantile plots summarized the association p-values between cord blood DNA methylation levels and current maternal smoking (A; n = 744), ever maternal smoking (B; n = 744), or weekly smoking exposure (C; n = 735) at 2114 cytosine–phosphate–guanine (CpG) sites. The red line (y=x) is the reference line. The genomic inflation factor, calculated as the ratio between the observed median and the theoretical median of the association test statistics, was annotated for each outcome. The horizontal lines (in A and B only) mark the smallest -log10(p-value) that still passes the false discovery rate (FDR) correction threshold of 0.05.
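
The genomic inflation factor described above can be sketched with the standard library alone. This version assumes two-sided p-values that correspond to 1-degree-of-freedom chi-square statistics; the function names and input p-values are hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

def p_to_chisq(p):
    """Convert a two-sided p-value (0 < p <= 1) to a 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    return z * z

def genomic_inflation(pvalues):
    """Ratio of the observed median statistic to the theoretical median."""
    observed = median(p_to_chisq(p) for p in pvalues)
    expected = _STD_NORMAL.inv_cdf(0.75) ** 2  # median of chi-square(1), about 0.455
    return observed / expected

# Evenly spaced p-values mimic a null distribution; squaring them mimics inflation.
null_pvalues = [(i + 0.5) / 1000 for i in range(1000)]
lambda_null = genomic_inflation(null_pvalues)
lambda_inflated = genomic_inflation([p * p for p in null_pvalues])
```

A value near 1 suggests the test statistics follow the expected null distribution, while values well above 1 point to inflation.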

Figure 2—figure supplement 3
Regression diagnostic for association between the top cytosine–phosphate–guanine (CpG) (cg09935388) and smoking exposure (n=339) without data transformation in The Canadian Healthy Infant Longitudinal Development (CHILD).

The Residuals vs. Fitted plot shows the residuals on the y-axis and the fitted values on the x-axis, indicating that the departure from linearity (measured by the distance from the blue line to each point) was quite severe. The Q-Q plot compares the standardized residuals with the theoretical quantiles from a standard normal distribution, showing that non-normality was largely driven by the three extreme points. The Scale-Location plot shows the square root of the standardized residuals vs. the fitted values and indicates considerable heteroskedasticity. The Residuals vs. Leverage plot compares the residuals against the leverage of each observation, showing that the main outlying points corresponded to the tail of the smoking exposure phenotype (>25 hr/week).

Figure 2—figure supplement 4
Regression diagnostic for association between the top cytosine–phosphate–guanine (CpG) (cg09935388) and smoking exposure (n=396) without data transformation in Family Atherosclerosis Monitoring In early life (FAMILY).

The Residuals vs. Fitted plot shows the residuals on the y-axis and the fitted values on the x-axis, indicating that the departure from linearity (measured by the distance from the blue line to each point) was severe. The Q-Q plot compares the standardized residuals with the theoretical quantiles from a standard normal distribution, showing that a large number of data points drove the departure from normality. The Scale-Location plot shows the square root of the standardized residuals vs. the fitted values, which suggests considerable heteroskedasticity. The Residuals vs. Leverage plot compares the residuals against the leverage of each observation, showing points with varying levels of leverage.

Figure 2—figure supplement 5
Regression diagnostic for association between the top cytosine–phosphate–guanine (CpG) (cg09935388) and smoking exposure (n=339) under an inverse normal rank transformation in Canadian Healthy Infant Longitudinal Development (CHILD).

The Residuals vs. Fitted plot shows the residuals on the y-axis and the fitted values on the x-axis, indicating some departure from linearity (measured by the distance from the blue line to each point), which was improved compared with Figure 2—figure supplement 3. The Q-Q plot compares the standardized residuals with the theoretical quantiles from a standard normal distribution, showing some departure from normality. The Scale-Location plot shows the square root of the standardized residuals vs. the fitted values, which suggests that some heteroskedasticity remained. The Residuals vs. Leverage plot compares the residuals against the leverage of each observation, suggesting that influential observations remained but with reduced influence on the model.
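
A minimal sketch of an inverse normal rank transformation is given below. The caption does not state the rank offset or how ties were handled, so the offset of 0.5 and the use of average ranks for ties are assumptions; the exposure values are hypothetical.

```python
from statistics import NormalDist

def inverse_normal_rank(values, offset=0.5):
    """Map values to normal quantiles of their (tie-averaged) ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / (n - 2.0 * offset + 1.0)) for r in ranks]

# Hypothetical weekly smoking exposure (hr/week): mostly zero with a long tail.
exposure = [0, 0, 0, 2, 40]
transformed = inverse_normal_rank(exposure)
```

The extreme value of 40 hr/week is pulled in to about 1.28 standard deviations, which is why the transformation reduces the leverage of the tail observations. The many tied zeros still share a single transformed value, so some departure from normality remains.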

Figure 2—figure supplement 6
Regression diagnostic for association between the top cytosine–phosphate–guanine (CpG) (cg09935388) and smoking exposure (n=396) under an inverse normal rank transformation in Family Atherosclerosis Monitoring In early life (FAMILY).

The Residuals vs. Fitted plot shows the residuals on the y-axis and the fitted values on the x-axis, indicating some departure from linearity (measured by the distance from the blue line to each point), which was improved compared with Figure 2—figure supplement 4. The Q-Q plot compares the standardized residuals with the theoretical quantiles from a standard normal distribution, showing some departure from normality. The Scale-Location plot shows the square root of the standardized residuals vs. the fitted values, which suggests that some heteroskedasticity remained. The Residuals vs. Leverage plot compares the residuals against the leverage of each observation, suggesting that influential observations remained but with reduced influence on the model.

Figure 2—figure supplement 7
Scatterplots of meta-analyzed association effects for maternal smoking history or smoking exposure and reported effects of maternal smoking.

(A) shows the scatterplot of meta-analyzed effects for maternal smoking (n=744) in the combined Canadian Healthy Infant Longitudinal Development (CHILD) and Family Atherosclerosis Monitoring In early life (FAMILY) cohorts (x-axis) vs. reported effects for maternal smoking in Joubert et al., 2016 (y-axis) for all cytosine–phosphate–guanines (CpGs) present in CHILD, FAMILY, and Joubert et al., 2016 (# CpGs = 128); (B) shows the scatterplot of meta-analyzed effects for weekly smoking exposure (n=735) in the combined CHILD and FAMILY cohorts (x-axis) vs. reported effects for maternal smoking in Joubert et al., 2016 (y-axis) for the same set of CpGs (# CpGs = 128). The solid gray line is the ordinary least squares best-fit line for the linear relationship between the effect sizes (95% confidence interval shown as the shaded area), and the dashed gray line is the y=x reference.

Figure 2—figure supplement 8
Manhattan plots of the epigenome-wide associations between cord blood DNA methylation (DNAm) and maternal smoking history or smoking exposure in Canadian Healthy Infant Longitudinal Development (CHILD).

Manhattan plots summarized the association p-values between cord blood DNA methylation levels and current maternal smoking (A; n=347), ever maternal smoking (B; n=347), or weekly smoking exposure (C; n=339) at 200,050 cytosine–phosphate–guanine (CpG) sites. The red line marks the smallest -log10(p-value) that still passes the false discovery rate (FDR) correction threshold of 0.05. The red dots represent established associations with maternal smoking reported by Joubert and colleagues (Joubert et al., 2016).

Figure 3 with 3 supplements
Relationships between maternal smoking methylation risk score (MRS) and maternal smoking history categories for Canadian Healthy Infant Longitudinal Development (CHILD) and Family Atherosclerosis Monitoring In early life (FAMILY).

Maternal smoking methylation score (y-axis) was shown as a function of maternal smoking history (x-axis), ordered by severity of prenatal exposure, for CHILD (A; n=347) and FAMILY (B; n=397). Each severity level was compared to the never-smoking group and the corresponding two-sample t-test p-value was reported. An analysis-of-variance F-test p-value was used to indicate whether a mean difference in methylation score was present among all smoking history categories. The area under the receiver operating characteristic curve (AUC) for each study was shown in the lower panel.
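
For readers unfamiliar with the AUC, the sketch below computes it as the probability that a randomly chosen exposed sample has a higher score than a randomly chosen unexposed one, counting ties as one half. The caption does not say how the AUC was computed or which smoking categories were contrasted, so the grouping and the scores here are hypothetical.

```python
def auc(scores_exposed, scores_unexposed):
    """Rank-based AUC: P(exposed score > unexposed score), ties count 0.5."""
    wins = 0.0
    for x in scores_exposed:
        for y in scores_unexposed:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(scores_exposed) * len(scores_unexposed))

# Hypothetical standardized methylation scores.
current_smokers = [0.9, 0.4, 1.3]
never_smokers = [-0.2, 0.1, 0.4, -1.0]
score_auc = auc(current_smokers, never_smokers)
```

An AUC of 0.5 means the score does not separate the groups, and 1.0 means perfect separation.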

Figure 3—figure supplement 1
A comparison of results for derived and external maternal smoking methylation risk scores (MRSs).

Maternal smoking methylation score (y-axis) was shown as a function of maternal smoking history (x-axis), ordered by severity of prenatal exposure ([0]=never smoked; [1]=quit before this pregnancy; [2]=quit during this pregnancy; [3]=currently smoking), for each study. The scores shown were validated in (1) Canadian Healthy Infant Longitudinal Development (CHILD; n=347), (2) CHILD but restricted to cytosine–phosphate–guanines (CpGs) that were also present on the targeted array, (3) Family Atherosclerosis Monitoring In early life (FAMILY; n=397) using CpGs on the targeted array. Each severity level was compared to the never-smoking group and the corresponding two-sample t-test p-value was reported. An omnibus test p-value was used to indicate whether a mean difference in methylation score was present among all smoking history categories.

Figure 3—figure supplement 2
A heatmap of correlation between derived and external maternal smoking methylation risk scores (MRSs).

This heatmap illustrates the pairwise correlation between MRSs calculated in (A) CHILD (n=352), (B) FAMILY (n=411), and (C) START (n=504). Each cell represents the correlation coefficient, ranging from –1 to 1, indicating the strength and direction of the association. A value of 1 signifies a perfect positive correlation, while –1 indicates a perfect negative correlation. Values closer to 0 suggest no correlation. The color gradient from deep blue (strong negative correlation), through white (no correlation), to deep red (strong positive correlation) visually encodes the strength of these relationships. The scores in the black box were derived using lassosum and internally validated. Note that these sample sizes included those with missing smoking history.

Figure 3—figure supplement 3
Comparison of all methylation scores stratified by study.

The boxplots show the standardized maternal smoking methylation scores (y-axis) stratified by study. The top panels summarized results for all samples in Canadian Healthy Infant Longitudinal Development (CHILD; n=352), Family Atherosclerosis Monitoring In early life (FAMILY; n=411), and SouTh Asian biRth cohorT (START; n=504), while the bottom panels summarized results for only those in CHILD, FAMILY, and START who never smoked. The p-values, from two-sample t-tests, indicate the significance of the mean difference for each pairwise comparison between the HM450K score validated in CHILD and the other scores.

Tables

Table 1
Characteristics of the epigenetic subsample (1267 mother–newborn pairs) from the CHILD, FAMILY, and START cohorts.
| Group | Phenotype | Level or statistic | CHILD (n=352) | FAMILY (n=411) | START (n=504) | p-value for differences (ANOVA F-test or Chi-squared test) |
|---|---|---|---|---|---|---|
| Mother | Smoking History | never smoked | 247 (70.2%) | 253 (61.6%) | 501 (99.4%) | <0.001* |
| | | quit before this pregnancy | 72 (20.5%) | 58 (14.1%) | 1 (0.2%) | |
| | | quit during this pregnancy | 17 (4.8%) | 57 (13.9%) | 1 (0.2%) | |
| | | currently smoking | 11 (3.1%) | 29 (7.1%) | 0 (0%) | |
| | | Missing | 5 (1.4%) | 14 (3.4%) | 1 (0.2%) | |
| | Smoking Exposure (hr/week) | Mean (SD) | 0.97 (±7.64) | 2.52 (±12.83) | 0.33 (±2.67) | <0.001 |
| | | Missing | 12 (3.4%) | 5 (1.2%) | 42 (8.3%) | |
| | Gestational Diabetes Mellitus | YES | 16 (4.5%) | 66 (16.1%) | 183 (36.3%) | <0.001 |
| | | NO | 336 (95.5%) | 345 (83.9%) | 320 (63.5%) | |
| | | Missing | 0 (0%) | 0 (0%) | 1 (0.2%) | |
| | Years of Education | Mean (SD) | 16.96 (±3.08) | 16.85 (±3.39) | 15.81 (±2.41) | <0.001 |
| | | Missing | 7 (2.0%) | 3 (0.7%) | 0 (0%) | |
| | Mother’s Age | Mean (SD) | 32.69 (±4.45) | 31.86 (±5.42) | 30.12 (±3.91) | <0.001 |
| | | Missing | 4 (1.1%) | 0 (0%) | 0 (0%) | |
| | Parity | Mean (SD) | 0.72 (±0.88) | 0.80 (±1.02) | 0.80 (±0.81) | 0.098 |
| | | Missing | 2 (0.6%) | 0 (0%) | 13 (2.6%) | |
| | Pre-pregnancy BMI (kg/m²) | Mean (SD) | 24.78 (±5.42) | 26.46 (±6.38) | 23.71 (±4.45) | <0.001 |
| | | Missing | 132 (37.5%) | 16 (3.9%) | 2 (0.4%) | |
| | Newborn Sex | Male | 194 (55.1%) | 211 (51.3%) | 239 (47.4%) | 0.083 |
| | | Female | 158 (44.9%) | 200 (48.7%) | 265 (52.6%) | |
| | Plant-Based Diet | Mean (SD) | –0.48 (±0.46) | 0.19 (±0.67) | 1.56 (±1.14) | <0.001 |
| | | Missing | 23 (6.5%) | 36 (8.8%) | 16 (3.2%) | |
| | Health Conscious Diet | Mean (SD) | 0.21 (±0.81) | –0.73 (±0.73) | –0.42 (±0.79) | <0.001 |
| | | Missing | 23 (6.5%) | 36 (8.8%) | 16 (3.2%) | |
| | Western Diet | Mean (SD) | –0.15 (±0.63) | 1.06 (±1.20) | –0.51 (±0.65) | <0.001 |
| | | Missing | 23 (6.5%) | 36 (8.8%) | 16 (3.2%) | |
| Newborn | Gestational Age (weeks) | Mean (SD) | 39.53 (±1.38) | 39.44 (±1.47) | 39.20 (±1.32) | <0.001 |
| | | Missing | 4 (1.1%) | 0 (0%) | 0 (0%) | |
| | Birth Length (cm) | Mean (SD) | 51.68 (±2.52) | 50.20 (±2.16) | 51.44 (±2.69) | <0.001 |
| | | Missing | 71 (20.2%) | 10 (2.4%) | 7 (1.4%) | |
| | Birth Weight (kg) | Mean (SD) | 3.50 (±0.49) | 3.53 (±0.50) | 3.26 (±0.46) | <0.001 |
| | | Missing | 6 (1.7%) | 0 (0%) | 1 (0.2%) | |
| | Newborn BMI (kg/m²) | Mean (SD) | 13.11 (±1.41) | 13.94 (±1.29) | 12.31 (±1.39) | <0.001 |
| | | Missing | 72 (20.5%) | 10 (2.4%) | 7 (1.4%) | |
| | Newborn Ponderal Index (kg/m³) | Mean (SD) | 25.45 (±3.14) | 27.79 (±2.55) | 24.02 (±3.17) | <0.001 |
| | | Missing | 72 (20.5%) | 10 (2.4%) | 7 (1.4%) | |
| Estimated cell proportions | CD8T | Mean (SD) | 0.01 (±0.01) | 0.04 (±0.03) | 0.02 (±0.02) | <0.001 |
| | CD4T | Mean (SD) | 0.11 (±0.06) | 0.13 (±0.06) | 0.16 (±0.07) | <0.001 |
| | NK | Mean (SD) | 0.02 (±0.02) | 0.03 (±0.03) | 0.02 (±0.03) | <0.001 |
| | Bcell | Mean (SD) | 0.02 (±0.02) | 0.04 (±0.03) | 0.04 (±0.03) | <0.001 |
| | Mono | Mean (SD) | 0.01 (±0.02) | 0.04 (±0.03) | 0.03 (±0.03) | <0.001 |
| | Gran | Mean (SD) | 0.80 (±0.10) | 0.60 (±0.13) | 0.72 (±0.14) | <0.001 |
| | nRBC | Mean (SD) | 0.08 (±0.08) | 0.12 (±0.11) | 0.07 (±0.11) | <0.001 |
| | MNLR | Mean (SD) | 6.59 (±6.00) | 3.30 (±3.14) | 3.98 (±3.08) | <0.001 |
| | | Missing | 6 (1.7%) | 0 (0%) | 3 (0.6%) | |

* comparison for CHILD and FAMILY only
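
The units in Table 1 imply that newborn BMI is weight divided by length squared and the ponderal index is weight divided by length cubed, with length in meters. The sketch below shows that arithmetic; the function names are hypothetical, and the inputs simply reuse the CHILD mean birth weight and length as an example.

```python
def newborn_bmi(weight_kg, length_cm):
    """Body mass index in kg/m^2 from weight in kg and length in cm."""
    length_m = length_cm / 100.0
    return weight_kg / length_m ** 2

def ponderal_index(weight_kg, length_cm):
    """Ponderal index in kg/m^3 from weight in kg and length in cm."""
    length_m = length_cm / 100.0
    return weight_kg / length_m ** 3

# Example inputs: CHILD mean birth weight (3.50 kg) and birth length (51.68 cm).
example_bmi = newborn_bmi(3.50, 51.68)
example_pi = ponderal_index(3.50, 51.68)
```

Values computed from cohort means will not exactly equal the tabulated means, because the table averages each newborn's own ratio.
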
Table 2
Meta-analysis results of the association between cytosine–phosphate–guanines (CpGs) and maternal smoking and smoking exposure that passed a marginal p<0.05 threshold after the false discovery rate correction in European cohorts.
| Phenotype | CHR | Position | CpG | UCSC reference gene | Meta-analysis fixed effect | Meta-analysis standard error | Meta-analysis association p-value | p-value for effect heterogeneity | FDR-adjusted association p-value | CHILD association p-value | FAMILY association p-value | Reported associations (EWAS catalog) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Maternal Smoking | 1 | 92481269 | cg12876356 | GFI1 | –1.11 | 0.22 | 7.33E-07 | 0.51 | 0.0019 | 0.02 | 9.45E-06 | MS; S; AC; BW |
| | 1 | 92482032 | cg09935388 | GFI1 | –1.15 | 0.24 | 2.26E-06 | 0.52 | 0.0029 | 0.02 | 2.71E-05 | MS; GA; S; AC; BMI; BW |
| | 1 | 92482405 | cg14179389 | GFI1 | –1.48 | 0.32 | 5.03E-06 | 0.73 | 0.0035 | 0.01 | 1.12E-04 | MS; S |
| | 1 | 92481144 | cg18146737 | GFI1 | –0.92 | 0.20 | 5.58E-06 | 0.50 | 0.0035 | 0.04 | 3.95E-05 | MS; S; AC; BW |
| | 1 | 92480576 | cg09662411 | GFI1 | –0.94 | 0.22 | 1.64E-05 | 0.29 | 0.0083 | 0.10 | 3.85E-05 | MS; S |
| | 1 | 92481479 | cg18316974 | GFI1 | –0.74 | 0.18 | 3.58E-05 | 0.33 | 0.0152 | 0.13 | 7.34E-05 | MS; S; AC; BW |
| | 17 | 2494783943 | cg01798813 | | –0.83 | 0.21 | 1.09E-04 | 0.34 | 0.0395 | 0.02 | 0.0016 | A; GA; BMI |
| Smoking Exposure | 1 | 92482032 | cg09935388 | GFI1 | –0.18 | 0.04 | 1.39E-05 | 0.23 | 0.04 | 0.15 | 2.45E-05 | MS; GA; S; AC; BMI; BW |
| | 17 | 2494783943 | cg01798813 | | –0.18 | 0.04 | 3.30E-05 | 0.13 | 0.04 | 0.00035 | 0.013 | A; GA; BMI |
  1. MS: maternal smoking; GA: gestational age; AC: alcohol consumption; BMI: body mass index; T2D: type 2 diabetes; A: age; BW: birth weight.
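
Table 2 reports a fixed effect, its standard error, and a heterogeneity p-value for each CpG. One common way to obtain such quantities from two cohorts is inverse-variance weighting with Cochran's Q; the sketch below assumes that approach, which the table itself does not confirm. The per-cohort estimates are hypothetical, not values from CHILD or FAMILY.

```python
import math
from statistics import NormalDist

def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance pooled estimate, its SE, two-sided p-value, and Cochran's Q."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled / pooled_se
    p_value = 2.0 * NormalDist().cdf(-abs(z))
    # Cochran's Q measures between-cohort heterogeneity.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    return pooled, pooled_se, p_value, q

def q_p_value_two_cohorts(q):
    """Heterogeneity p-value for exactly two cohorts (Q ~ chi-square, 1 df)."""
    return 2.0 * NormalDist().cdf(-math.sqrt(q))

# Hypothetical per-cohort effect estimates and standard errors for one CpG.
pooled, pooled_se, p_value, q = fixed_effect_meta([-1.0, -1.2], [0.4, 0.3])
p_heterogeneity = q_p_value_two_cohorts(q)
```

The cohort with the smaller standard error receives more weight, so the pooled estimate (about –1.13) lies closer to –1.2 than to –1.0.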

Table 3
Significant associations between maternal smoking methylation risk score and phenotypes in CHILD, FAMILY, and START.
CHILD | FAMILY | START
Fixed effect | Standard error | Association p-value | Fixed effect | Standard error | Association p-value | Fixed effect | Standard error | Association p-value
Smoking exposure (hr/week) | 1.64 | 0.40 | 5.40E-05 | 2.58 | 0.60 | 2.34E-05 | 0.07 | 0.12 | 0.58
1 year Smoking exposure (hr/week) | 0.44 | 0.15 | 0.0044
3 year Smoking exposure (hr/week) | 1.15 | 0.39 | 0.0033
Gestational weight gain (kg) | –0.36 | 0.38 | 0.35 | –0.62 | 0.26 | 0.017 | –0.14 | 0.34 | 0.69
Gestational age (weeks) | 1.64 | 0.40 | 6.32E-05 | 2.84 | 0.62 | 5.52E-06 | 0.07 | 0.12 | 0.59
Birth weight (kg) | –0.06 | 0.03 | 0.016 | –0.04 | 0.02 | 0.096 | –0.03 | 0.02 | 0.094
Birth length (cm) | –0.14 | 0.15 | 0.35 | –0.10 | 0.10 | 0.33 | –0.37 | 0.12 | 0.0023
1 year Height (cm) | –0.32 | 0.16 | 0.047 | –0.34 | 0.14 | 0.019 | –0.42 | 0.16 | 0.0079
2 year Height (cm) | –0.13 | 0.35 | 0.72 | –0.26 | 0.17 | 0.14 | –0.57 | 0.21 | 0.0067
5 year Height (cm) | –0.36 | 0.26 | 0.16 | –0.43 | 0.26 | 0.095 | –0.47 | 0.37 | 0.21
3 year Skinfold thickness | 0.48 | 0.19 | 0.014 | 0.94 | 0.26 | 3.46E-04 | 0.24 | 0.27 | 0.38
5 year Skinfold thickness | 0.56 | 0.24 | 0.019 | 0.68 | 0.37 | 0.068 | 0.12 | 0.42 | 0.77

Additional files

Supplementary file 1

Additional tables and summaries of results.

(A) Quality controls for the inclusion/exclusion of samples and methylation probes. (B) Characteristics of the overall sample, including 5176 mother–newborn pairs from the Canadian Healthy Infant Longitudinal Development (CHILD), Family Atherosclerosis Monitoring In early life (FAMILY), and SouTh Asian biRth cohorT (START) cohorts. (C) A summary of available analyses and outcome variables in each cohort. (D) A summary of the DNA methylation (DNAm) maternal smoking score derivation design and results. (E) Characteristics of the epigenetic subsample from CHILD and FAMILY cohorts stratified by smoking status. (F) Score weights for external DNAm maternal smoking scores. (G) Summary of cytosine–phosphate–guanines (CpGs) that contribute to the DNAm maternal smoking scores and their weights. (H) Association between maternal smoking methylation risk score and phenotypes in CHILD, FAMILY, and START. (I) Summary of mean difference in methylation risk scores between studies in overall samples and in those who never smoked.

https://cdn.elifesciences.org/articles/93260/elife-93260-supp1-v1.xlsx

Wei Q Deng, Nathan Cawte, Natalie Campbell, Sandi M Azab, Russell J de Souza, Amel Lamri, Katherine M Morrison, Stephanie A Atkinson, Padmaja Subbarao, Stuart E Turvey, Theo J Moraes, Koon K Teo, Piush J Mandhane, Meghan B Azad, Elinor Simons, Guillaume Paré, Sonia S Anand (2024)
Maternal smoking DNA methylation risk score associated with health outcomes in offspring of European and South Asian ancestry
eLife 13:RP93260.
https://doi.org/10.7554/eLife.93260.4