The genetic risk of gestational diabetes in South Asian women

  1. Amel Lamri
  2. Jayneel Limbachia
  3. Karleen M Schulze
  4. Dipika Desai
  5. Brian Kelly
  6. Russell J de Souza
  7. Guillaume Paré
  8. Deborah A Lawlor
  9. John Wright
  10. Sonia S Anand (corresponding author)
  11. On behalf of the Born in Bradford and START investigators
  1. Department of Medicine, McMaster University, Canada
  2. Population Health Research Institute, Canada
  3. Department of Health Research Methods, Evidence, and Impact, McMaster University, Canada
  4. Bradford Institute for Health Research, Bradford Royal Infirmary, United Kingdom
  5. Department of Pathology and Molecular Medicine, McMaster University, Canada
  6. Population Health Science, Bristol Medical School, University of Bristol, United Kingdom
  7. MRC Integrative Epidemiology Unit, University of Bristol, United Kingdom
  8. Bristol NIHR Biomedical Research Centre, United Kingdom

Abstract

South Asian women are at increased risk of developing gestational diabetes mellitus (GDM). Few studies have investigated the genetic contributions to GDM risk. We investigated the association of a type 2 diabetes (T2D) polygenic risk score (PRS), on its own and together with GDM risk factors, with GDM-related traits using data from two birth cohorts in which South Asian women were enrolled during pregnancy. A total of 837 and 4372 pregnant South Asian women from the SouTh Asian BiRth CohorT (START) and Born in Bradford (BiB) cohort studies, respectively, underwent a 75-g glucose tolerance test. PRSs were derived using genome-wide association study results from an independent multi-ethnic study (~18% South Asians). Associations with fasting plasma glucose (FPG); 2 hr post-load glucose (2hG); area under the curve glucose; and GDM were tested using linear and logistic regressions. The population attributable fraction (PAF) of the PRS was calculated. Every 1 SD increase in the PRS was associated with a 0.085 mmol/L increase in FPG ([95% confidence interval, CI=0.07–0.10], p=2.85×10^−20); a 0.21 mmol/L increase in 2hG ([95% CI=0.16–0.26], p=5.49×10^−16); and a 45% increase in the risk of GDM ([95% CI=32–60%], p=2.27×10^−14), independent of parental history of diabetes and other GDM risk factors. PRS tertile 3 accounted for 12.5% of the population’s GDM alone, and 21.7% when combined with family history. A few weak interactions between the PRS and GDM risk factors, modulating FPG and GDM, were observed. Taken together, these results show that a T2D PRS and family history of diabetes are strongly and independently associated with multiple GDM-related traits in women of South Asian descent, an effect that could be modulated by other environmental factors.

Editor's evaluation

South Asian women have twice the risk of developing Gestational Diabetes Mellitus (GDM) compared with white European women. This clearly presented comprehensive study shows that a T2D polygenic risk score is strongly associated with multiple GDM-related traits in South Asian women and is a significant contributor to the population-attributable fraction of GDM, independently of family history of diabetes. This will be of interest to genetic epidemiologists and clinicians working in this field.

https://doi.org/10.7554/eLife.81498.sa0

Introduction

Gestational diabetes mellitus (GDM) is defined as hyperglycemia first diagnosed during pregnancy. This abnormal increase in blood glucose levels is associated with an increased risk of adverse health outcomes for both the mother and her fetus/child during pregnancy, and later in life (Farrar et al., 2016). It is estimated that 1% to >30% of live births are affected by GDM worldwide. This prevalence has been shown to vary widely depending on the participants’ ethnicity, the country or region, and the diagnostic criteria used (Archambault et al., 2014; McIntyre et al., 2019). South Asian women (whose ancestry derives from the Indian subcontinent) have twofold increased odds of developing GDM compared to white European women (Anand et al., 2016; Cosson et al., 2014; Farrar et al., 2015; McIntyre et al., 2019). The reasons for this disproportionate risk have not been fully characterized.

Gestational diabetes is a complex disorder influenced by multiple genetic and environmental factors such as maternal age, ethnicity, obesity, poor diet quality, and family history of diabetes (Anand et al., 2017; Hedderson et al., 2011; Solomon et al., 1997). Most genetic and environmental GDM risk factors are shared with type 2 diabetes (T2D; Sattar and Greer, 2002; Zhang and Ning, 2011), another condition that is thought to be very closely related to GDM. For example, women with GDM have a higher probability of having at least one parent with T2D, compared to those with normal gestational glycemia (Jang et al., 1998). Furthermore, women with a history of GDM have a tenfold higher risk of subsequently being diagnosed with T2D compared to those without a history of GDM (Vounzoulaki et al., 2020). In terms of genetic architecture, both candidate gene and genome-wide association studies (GWASs) have demonstrated a considerable overlap between GDM and T2D (Hayes et al., 2013; Kwak et al., 2012; Pervjakova et al., 2022). Finally, T2D polygenic risk scores (PRSs) have also been associated with GDM risk (Lamri et al., 2020; Pervjakova et al., 2022).

It has been demonstrated that environmental exposures such as diet and/or physical activity may modulate the effect of T2D loci (such as TCF7L2, PPARG, and CDKAL1) on the risk of T2D (Dietrich et al., 2019). Nevertheless, only a handful of studies have investigated gene×environment interactions on GDM (Chen et al., 2019; Grotenfelt et al., 2016; Popova et al., 2017), and to date, no study has tested the interaction between a genome-wide PRS and other GDM risk factors on the risk of GDM.

The aims of this investigation were to: (i) test the association of a T2D PRS, generated from an external multi-ethnic GWAS (~18% South Asians), with GDM and related traits (fasting plasma glucose [FPG], 2 hr post-load glucose [2hG], and area under the curve glucose [AUCg] levels) in pregnant South Asian women from the SouTh Asian BiRth CohorT (START) and the Born in Bradford (BiB) studies; (ii) estimate the population attributable fraction (PAF) of the PRS on GDM; and (iii) determine whether the effect of the PRS is modulated by other GDM risk factors including age, BMI, diet quality, birth country, education, and parity.

Results

The proportion of women classified with GDM using the IADPSG criteria was 25% in START and 11.2% in BiB, lower than the proportions obtained with the South Asian-specific definition (36.2% and 22.9%, respectively). Notably, the proportion of women with GDM was higher in START than in BiB irrespective of the classification method used.
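To make the IADPSG classification concrete, the sketch below flags GDM when any available OGTT value meets or exceeds its threshold. The thresholds (5.1, 10.0, and 8.5 mmol/L for fasting, 1 hr, and 2 hr glucose) are not stated in this section; they are taken from the published IADPSG recommendations. The function name and the example participants are hypothetical, and the 1 hr value is optional because it was only measured in START.

```python
from typing import Optional

# IADPSG thresholds for a 75-g OGTT, in mmol/L. These are an assumption of
# this sketch (from the published IADPSG recommendations), not from the text.
IADPSG_THRESHOLDS = {"fasting": 5.1, "one_hour": 10.0, "two_hour": 8.5}


def has_gdm_iadpsg(fasting: float, two_hour: float,
                   one_hour: Optional[float] = None) -> bool:
    """Return True if any available glucose value meets or exceeds its threshold."""
    values = {"fasting": fasting, "one_hour": one_hour, "two_hour": two_hour}
    return any(
        value is not None and value >= IADPSG_THRESHOLDS[name]
        for name, value in values.items()
    )


# Hypothetical participants (mmol/L), not study data.
print(has_gdm_iadpsg(fasting=4.3, one_hour=7.3, two_hour=6.0))  # no value above threshold
print(has_gdm_iadpsg(fasting=5.2, two_hour=6.0))                # fasting value above threshold
```

The South Asian-specific definition uses different cut-offs, which are not given in this section and are therefore not sketched here.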

The proportion of women of Indian origin in START and BiB was 71.8% and 5.1%, while the proportion of Pakistani women was 23.4% and 94.3%, respectively. The proportion of participants born in the Indian subcontinent was higher in START (88.6%) than in BiB (55.6%), and the average number of years spent in Canada or the United Kingdom among these participants was lower in START compared to BiB (6.6 vs. 9.7 years, respectively). The proportions of primiparous women (40.9% vs. 31.7%) and women with one prior pregnancy (42.4% vs. 26.9%) were higher in START than in BiB. Conversely, participants with two or more prior pregnancies were more frequent in BiB than START (41.4% vs. 16.6%, respectively). The proportion of vegetarian participants was higher in START than in BiB (36.4% vs. 1.3%). Finally, the proportion of participants with a post-secondary degree/diploma or higher was greater in START than BiB (84.0% vs. 29.0%).

The standardized PRS ranged between –3.23 and 3.12 in START as compared to –3.51 and 4.16 in BiB. The full list of genetic variants included in the PRS as well as their characteristics are shown in Supplementary file 1a.
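As a rough illustration of what a standardized PRS is, the sketch below computes a weighted sum of risk-allele dosages and converts it to z-scores within a sample. This is a generic construction, not the derivation used in this study (which is not described in this section); the weights, dosages, and function names are hypothetical, and the use of the population standard deviation is an arbitrary choice of this sketch.

```python
from statistics import mean, pstdev


def raw_prs(dosages, weights):
    """Weighted sum of risk-allele dosages (0-2) using per-variant GWAS weights."""
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores):
    """Convert raw scores to z-scores (mean 0, SD 1) within one study sample."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]


# Hypothetical weights (log odds ratios) for three variants and four participants.
weights = [0.10, 0.05, 0.20]
dosages = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [2, 2, 2]]
z_scores = standardize([raw_prs(d, weights) for d in dosages])
print([round(z, 2) for z in z_scores])
```

On this scale, a value such as 3.12 means a score about three standard deviations above the sample mean.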

Table 1 shows the baseline characteristics of the South Asian women from START and BiB, stratified by GDM status (GDM vs. no GDM, IADPSG criteria). As expected, women with GDM had higher mean FPG, 2hG, and AUCg levels than non-GDM participants. Participants with GDM were older, had a higher BMI, and were more likely to report a family history of diabetes compared to women without GDM, in both studies. The overall diet quality was lower in participants with GDM compared to non-GDM participants in START (data not available in BiB). Of note, the average difference in BMI between GDM cases and controls was higher in BiB than in START (3.0 and 1.9 kg/m2, respectively) (Table 1). Women with GDM had a higher mean PRS compared to women without GDM. Similarly, women with GDM were more likely to have a PRS in tertile 2 or 3 than in tertile 1 (Table 1).

Table 1
Characteristics of START and BiB study participants included in the analysis.

| Characteristic | START: No GDM | START: GDM | START: p Value | BiB: No GDM | BiB: GDM | BiB: p Value |
|---|---|---|---|---|---|---|
| N (%) | 759 (75) | 253 (25) | | 3809 (88.8) | 481 (11.2) | |
| Age, years | 29.8 (3.8) | 31.6 (4) | 5.55×10^−10 | 27.7 (5) | 30.5 (5.4) | 1.40×10^−22 |
| Height, cm | 162.5 (6.27) | 161.13 (6.01) | 0.002 | 159.9 (5.69) | 158.3 (5.66) | 6.19×10^−08 |
| Weight, kg (a) | 61.7 (11.7) | 65.6 (12.9) | 2.00×10^−05 | 64.8 (14.1) | 71.1 (15.1) | 2.22×10^−16 |
| BMI, kg/m2 (b) | 23.4 (4.3) | 25.3 (4.9) | 4.93×10^−08 | 25.4 (5.2) | 28.4 (5.8) | 5.89×10^−23 |
| Parity, n (%) | | | | | | |
| 0 | 328 (44.3%) | 78 (31.1%) | 0.001 | 1189 (32.2%) | 129 (27.4%) | 5.03×10^−07 |
| 1 | 299 (40.4%) | 122 (48.6%) | | 1026 (27.8%) | 94 (20%) | |
| 2 or more | 114 (15.4%) | 51 (20.3%) | | 1473 (39.9%) | 248 (52.7%) | |
| Post-secondary education, n (%) | 641 (84.6%) | 208 (82.2%) | 0.43 | 952 (29.4%) | 110 (26.3%) | 0.22 |
| Country of origin/ancestry, n (%) | | | | | | |
| India | 567 (74.7%) | 160 (63.2%) | 0.001 | 175 (5.2%) | 19 (4.3%) | 0.37 (c) |
| Pakistan | 163 (21.5%) | 74 (29.2%) | | 3198 (94.2%) | 425 (95.5%) | |
| Other | 29 (3.8%) | 19 (7.5%) | | 23 (0.7%) | 1 (0.2%) | |
| Born in South Asia, n (%) | 671 (88.5%) | 225 (88.9%) | 0.95 | 1836 (54.2%) | 291 (65.7%) | 6.50×10^−06 |
| Years in recruitment country (Canada/UK) (d) | 6.4 (5.8) | 7.4 (5.8) | 0.02 | 9.3 (9) | 12.1 (9.4) | 3.88×10^−06 |
| Parental history of diabetes, n (%) | 282 (37.3%) | 142 (56.1%) | 2.25×10^−07 | 891 (27.4%) | 170 (38.9%) | 8.88×10^−07 |
| Vegetarians, n (%) | 266 (37%) | 84 (34.6%) | 0.54 | 12 (1.3%) | 1 (1.1%) | >0.99 (c) |
| Low diet quality, n (%) | 180 (24) | 88 (35.1) | 8.00×10^−04 | | | |
| Polygenic risk score (z-scores) | −0.11 (1) | 0.347 (0.93) | 1.51×10^−08 | −0.04 (0.99) | 0.32 (1.04) | 4.98×10^−12 |
| Polygenic risk score, n (%) | | | | | | |
| Tertile 1 | 240 (37.7%) | 39 (19.4%) | 2.74×10^−06 | 1309 (34.4%) | 117 (24.3%) | 7.60×10^−10 |
| Tertile 2 | 206 (32.4%) | 73 (36.3%) | | 1291 (33.9%) | 142 (29.5%) | |
| Tertile 3 | 190 (29.9%) | 89 (44.3%) | | 1209 (31.7%) | 222 (46.2%) | |
| Fasting plasma glucose, mmol/L | 4.27 (0.32) | 5.02 (0.83) | 5.51×10^−32 | 4.53 (0.41) | 5.34 (1.14) | 3.18×10^−43 |
| 1 hr post-load glucose, mmol/L | 7.31 (1.38) | 10.26 (2.02) | 6.04×10^−57 | | | |
| 2 hr post-load glucose, mmol/L | 5.96 (1.16) | 8.47 (2.16) | 1.53×10^−42 | 5.49 (1.02) | 9.14 (1.97) | 1.57×10^−155 |
| Area under curve glucose, mmol.hr (e) | 12.43 (1.83) | 17.02 (2.89) | 2.27×10^−63 | 10.02 (1.21) | 14.48 (2.77) | 3.82×10^−133 |

Characteristics of participants with available PRS and GDM (IADPSG), FPG, 1 hr or 2 hr post-load glucose, or AUC glucose data. Presented data are means (standard deviation) unless otherwise indicated. p Values are calculated from Chi-squared tests for categorical variables and independent t-tests for continuous variables.
(a) Pre-pregnancy values in START vs. weight at antenatal clinic (average 12 completed weeks of pregnancy) in BiB.
(b) Derived using height measured at initial visit (in both studies) and pre-pregnancy weights (START) or antenatal clinic weights (BiB).
(c) Approximation may be incorrect due to small counts.
(d) Canada for START samples and UK for BiB.
(e) Derived using fasting, 1 hr, and 2 hr post-load measurements in START vs. fasting and 2 hr post-load measurements in BiB.
Abbreviations: AUC, area under the curve glucose; BiB, Born in Bradford; BMI, body mass index; GDM, gestational diabetes mellitus; IADPSG, International Association of Diabetes and Pregnancy Study Groups; START, SouTh Asian BiRth CohorT; T2D, type 2 diabetes; UK, United Kingdom; vs., versus.
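The different AUC glucose scales in the two cohorts follow from the number of time points used. A minimal sketch, assuming the AUC is computed with the trapezoidal rule (an assumption; the formula is not given in this section), applied to the Table 1 group means for women without GDM. The function name is hypothetical.

```python
def auc_glucose(times_hr, glucose_mmol_l):
    """Trapezoidal area under the glucose curve, in mmol.hr."""
    points = list(zip(times_hr, glucose_mmol_l))
    return sum(
        (t1 - t0) * (g0 + g1) / 2
        for (t0, g0), (t1, g1) in zip(points, points[1:])
    )


# Group means from Table 1 for women without GDM.
start_auc = auc_glucose([0, 1, 2], [4.27, 7.31, 5.96])  # fasting, 1 hr, 2 hr (START)
bib_auc = auc_glucose([0, 2], [4.53, 5.49])             # fasting, 2 hr (BiB)
print(round(start_auc, 3), round(bib_auc, 2))
```

Under this assumption the results agree with the tabulated means of 12.43 and 10.02 mmol.hr up to rounding, which is consistent with, but does not prove, the use of a trapezoidal rule.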

Genetic risk and GDM-related traits in univariate models

The continuous PRS was associated with FPG, 2hG, and AUCg in START and BiB in univariate models. Every 1 SD increase in the PRS was associated with a 0.09 mmol/L increase in FPG (95% confidence interval [CI]=0.07–0.10), 0.23 mmol/L increase in 2hG (95% CI=0.18–0.28), and a 0.17 unit increase in AUCg z-scores (0.14–0.20) in the meta-analyzed results (Supplementary file 1b).

The PRS was also associated with the risk of GDM (IADPSG criteria) in univariate models, whereby a 1 SD increase in PRS was associated with a 47% increase in risk of GDM after meta-analysis (95% CI=35–60%). A similar association was observed using the South Asian-specific definition of GDM, with moderate between-study heterogeneity (Supplementary file 1b).

Overall, the risk of GDM (IADPSG criteria) increased progressively comparing tertile 2 of the PRS to tertile 1, and tertile 3 to tertile 1 (43% and 230%, respectively; Supplementary file 1b). Higher PRS categories were also associated with higher FPG, 2hG, and AUCg levels (Supplementary file 1b).

Multivariable models of GDM risk factors and GDM-related traits

The continuous PRS was strongly and independently associated with FPG, 2hG, and AUCg levels in a multivariable model adjusted for age, BMI, parity, parental history of diabetes, region of birth (South Asia vs. other), education level, diet quality (available in START only), and the first five PCs (Table 2). For example, every 1 SD increase in the PRS was associated with a 0.08 mmol/L increase in FPG and a 0.21 mmol/L increase in 2hG levels (Table 2). The continuous PRS was also associated with a higher risk of GDM in a model with similar adjustments, whereby every 1 SD increase in the PRS was associated with a 45% increase in the risk of GDM (IADPSG criteria; Table 2). Similar association results for GDM using the South Asian-specific criteria were observed and are shown in Supplementary file 1c.

Table 2
Association between GDM risk factors and GDM-related traits: results from multivariate models in START and BiB cohorts.

| Dependent variable | Independent variable | START: Beta/OR [95% CI] | START: p Value | BiB: Beta/OR [95% CI] | BiB: p Value | Meta-analysis: Beta/OR [95% CI] | Meta-analysis: p Value | I2 | QE p value |
|---|---|---|---|---|---|---|---|---|---|
| Fasting glucose | PRS (per 1 SD increase) | 0.083 [0.043–0.123] | 6.00×10^−05 | 0.085 [0.065–0.105] | 1.67×10^−16 | 0.085 [0.067–0.103] | 2.85×10^−20 | 0 | 0.92 |
| | Age (year) | 0.021 [0.01–0.032] | 2.00×10^−04 | 0.014 [0.009–0.019] | 2.49×10^−08 | 0.015 [0.011–0.02] | 4.19×10^−11 | 22 | 0.26 |
| | BMI (kg/m2) | 0.024 [0.014–0.033] | 5.51×10^−07 | 0.032 [0.028–0.036] | 6.53×10^−53 | 0.031 [0.027–0.034] | 7.99×10^−60 | 63 | 0.1 |
| | Born in South Asia (Yes/No) | 0.037 [−0.088 to 0.162] | 0.56 | 0.08 [0.039–0.122] | 2.00×10^−04 | 0.076 [0.037–0.115] | 2.00×10^−04 | 0 | 0.52 |
| | Parental history of T2D (Yes/No) | 0.04 [−0.043 to 0.123] | 0.34 | 0.066 [0.02–0.111] | 0.005 | 0.06 [0.02–0.1] | 0.003 | 0 | 0.6 |
| | Parity | −0.046 [−0.102 to 0.01] | 0.11 | −0.015 [−0.033 to 0.004] | 0.13 | −0.018 [−0.036 to 0] | 0.05 | 9 | 0.29 |
| | Education level (per level) | −0.031 [−0.068 to 0.006] | 0.1 | −0.016 [−0.035 to 0.002] | 0.09 | −0.019 [−0.036 to −0.002] | 0.02 | 0 | 0.49 |
| | Low diet quality (Yes/No) | 0.102 [0.01–0.193] | 0.03 | | | | | | |
| 2 hr post-load glucose | PRS (per 1 SD increase) | 0.189 [0.068–0.311] | 0.002 | 0.211 [0.156–0.266] | 7.87×10^−14 | 0.207 [0.157–0.257] | 5.49×10^−16 | 0 | 0.75 |
| | Age (year) | 0.127 [0.093–0.161] | 8.34×10^−13 | 0.068 [0.055–0.082] | 8.82×10^−23 | 0.076 [0.064–0.089] | 1.49×10^−32 | 90 | 0.002 |
| | BMI (kg/m2) | 0.047 [0.019–0.074] | 0.001 | 0.064 [0.053–0.075] | 1.43×10^−29 | 0.062 [0.051–0.072] | 3.44×10^−32 | 23 | 0.25 |
| | Born in South Asia (Yes/No) | 0.298 [−0.08 to 0.675] | 0.12 | 0.308 [0.195–0.422] | 1.14×10^−07 | 0.308 [0.199–0.416] | 3.09×10^−08 | 0 | 0.96 |
| | Parental history of T2D (Yes/No) | 0.361 [0.109–0.613] | 0.005 | 0.242 [0.117–0.366] | 1.00×10^−04 | 0.265 [0.154–0.377] | 3.23×10^−06 | 0 | 0.4 |
| | Parity | −0.279 [−0.45 to −0.109] | 0.001 | −0.095 [−0.146 to −0.043] | 3.00×10^−04 | −0.11 [−0.16 to −0.061] | 1.00×10^−05 | 76 | 0.04 |
| | Education level (per level) | −0.063 [−0.176 to 0.051] | 0.28 | −0.073 [−0.124 to −0.022] | 0.005 | −0.071 [−0.118 to −0.025] | 0.002 | 0 | 0.87 |
| | Low diet quality (Yes/No) | 0.365 [0.086–0.644] | 0.01 | | | | | | |
| AUC glucose | PRS (per 1 SD increase) | 0.165 [0.099–0.231] | 1.08×10^−06 | 0.152 [0.119–0.185] | 2.50×10^−19 | 0.155 [0.125–0.184] | 7.74×10^−25 | 0 | 0.74 |
| | Age (per year) | 0.068 [0.05–0.087] | 1.16×10^−12 | 0.043 [0.035–0.051] | 1.01×10^−24 | 0.047 [0.039–0.054] | 3.34×10^−35 | 84 | 0.01 |
| | BMI (kg/m2) | 0.047 [0.032–0.062] | 2.12×10^−09 | 0.047 [0.041–0.054] | 8.89×10^−44 | 0.047 [0.041–0.053] | 4.27×10^−53 | 0 | 0.94 |
| | Born in South Asia (Yes/No) | 0.081 [−0.123 to 0.285] | 0.44 | 0.201 [0.133–0.269] | 8.15×10^−09 | 0.189 [0.124–0.253] | 1.01×10^−08 | 17 | 0.27 |
| | Parental history of T2D (Yes/No) | 0.122 [−0.015 to 0.258] | 0.08 | 0.138 [0.063–0.213] | 3.00×10^−04 | 0.134 [0.069–0.2] | 6.00×10^−05 | 0 | 0.83 |
| | Parity | −0.122 [−0.214 to −0.029] | 0.01 | −0.057 [−0.088 to −0.026] | 3.00×10^−04 | −0.063 [−0.093 to −0.034] | 2.00×10^−05 | 41 | 0.19 |
| | Education level (per level) | −0.045 [−0.106 to 0.016] | 0.15 | −0.045 [−0.075 to −0.014] | 0.004 | −0.045 [−0.072 to −0.017] | 0.001 | 0 | 0.99 |
| | Low diet quality (Yes/No) | 0.215 [0.064–0.366] | 0.005 | | | | | | |
| GDM (IADPSG criteria) | PRS (per 1 SD increase) | 1.56 [1.3–1.88] | 2.97×10^−06 | 1.42 [1.27–1.59] | 1.09×10^−09 | 1.45 [1.32–1.6] | 2.27×10^−14 | 0 | 0.4 |
| | Age (year) | 1.13 [1.07–1.19] | 2.50×10^−06 | 1.1 [1.07–1.13] | 1.07×10^−13 | 1.11 [1.08–1.13] | 1.98×10^−18 | 0 | 0.4 |
| | BMI (kg/m2) | 1.08 [1.04–1.12] | 1.00×10^−04 | 1.08 [1.06–1.11] | 1.01×10^−14 | 1.08 [1.06–1.1] | 6.25×10^−18 | 0 | 0.87 |
| | Born in South Asia (Yes/No) | 1.35 [0.78–2.43] | 0.3 | 1.72 [1.35–2.19] | 1.00×10^−05 | 1.65 [1.33–2.06] | 8.37×10^−06 | 0 | 0.44 |
| | Parental history of T2D (Yes/No) | 1.67 [1.17–2.38] | 0.005 | 1.53 [1.21–1.94] | 5.00×10^−04 | 1.57 [1.29–1.92] | 7.06×10^−06 | 0 | 0.69 |
| | Parity | 0.86 [0.68–1.09] | 0.23 | 0.87 [0.79–0.95] | 0.003 | 0.87 [0.79–0.95] | 0.001 | 0 | 0.99 |
| | Education level (per level) | 0.89 [0.76–1.05] | 0.18 | 0.9 [0.81–0.99] | 0.04 | 0.9 [0.82–0.98] | 0.01 | 0 | 0.96 |
| | Low diet quality (Yes/No) | 1.68 [1.14–2.47] | 0.008 | | | | | | |

Values are Beta for continuous dependent variables (fasting, 2 hr post-load, and AUC glucose) and OR for GDM. Models were additionally adjusted for the first five principal components (PCs) of each study. Abbreviations: BiB, Born in Bradford; BMI, Body mass index; CI, Confidence interval; GDM, Gestational diabetes mellitus; IADPSG, International Association of Diabetes and Pregnancy Study Groups; OR, Odds ratio; QE P, P-value from the test for (residual) heterogeneity; SA, South Asia; SD, Standard deviation; START, SouTh Asian BiRth CohorT; T2D, Type 2 diabetes.
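The meta-analysis columns of Table 2 can be approximated from the cohort-level rows. The sketch below assumes a fixed-effect, inverse-variance model with Cochran's Q and I2 as the heterogeneity statistics; the exact meta-analysis model is not described in this section, so this is an assumption. Standard errors are recovered from the published 95% CIs, which introduces small rounding differences. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def se_from_ci(lower, upper):
    """Recover a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z95)


def fixed_effect_meta(estimates, ses):
    """Inverse-variance pooled estimate, its SE, Cochran's Q, and I2 (%)."""
    weights = [1 / se**2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, q, i2


# PRS effect on fasting glucose (Table 2): START first, then BiB.
betas = [0.083, 0.085]
ses = [se_from_ci(0.043, 0.123), se_from_ci(0.065, 0.105)]
pooled, pooled_se, q, i2 = fixed_effect_meta(betas, ses)
low, high = pooled - Z95 * pooled_se, pooled + Z95 * pooled_se
print(f"{pooled:.3f} [{low:.3f}, {high:.3f}], I2 = {i2:.0f}%")
```

The pooled estimate lands close to the tabulated 0.085 [0.067–0.103] with I2 of 0. Applying the same function to the age rows for 2 hr glucose (0.127 [0.093–0.161] and 0.068 [0.055–0.082]) gives an I2 near 90%, in line with the heterogeneity reported for that row.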

When testing tertiles of PRS with similar covariates, our results show that participants in the second and third PRS tertiles have a 37% and 119% increase in the risk of GDM (IADPSG criteria) compared to participants in tertile 1, respectively (Supplementary file 1d). Higher PRS tertiles were also associated with higher FPG, 2hG, and AUCg levels (Supplementary file 1d). The effect sizes associated with tertile 2 were higher in START than BiB across multiple GDM-related traits (2hG, AUCg, and GDM; Supplementary file 1d).

Population attributable fraction and detection rate

In a model adjusted for maternal age, BMI, education, birth in South Asia (yes/no), parental history of diabetes, and diet quality (in START only), PRS tertile 3 accounted for 12.5% of the population’s total GDM cases (IADPSG criteria) overall, and this fraction was higher in START than in BiB (Table 3). The combined effect of PRS and parental history of diabetes on GDM accounted for ~21.7% of the population’s GDM cases in the two studies combined (Table 3).

Table 3
Population attributable fractions of GDM risk factors in mothers from the START and Born in Bradford studies (multivariable models).

| Independent variable | START: AF [95% CI] | START: p Value | BiB: AF [95% CI] | BiB: p Value | Meta-analysis: AF [95% CI] | Meta-analysis: p Value | I2 | QE p value |
|---|---|---|---|---|---|---|---|---|
| Age (29–31 vs. <29 years) | 5.6 [−9.1 to 20.2] | 0.46 | 8.3 [3.5–13] | 6.00×10^−04 | 8 [3.5–12.5] | 5.00×10^−04 | 0 | 0.73 |
| Age (>32 vs. <29 years) | 31.2 [17.1–45.3] | 1.00×10^−05 | 20.2 [14.8–25.7] | 4.72×10^−13 | 21.7 [16.6–26.8] | 9.19×10^−17 | 50 | 0.16 |
| Body mass index (≥23 vs. <23) | 21.8 [8.7–34.9] | 0.001 | 33.8 [25.4–42.2] | 2.47×10^−15 | 30.3 [23.3–37.4] | 3.59×10^−17 | 56 | 0.13 |
| Born in SA (Yes vs. No) | 13.5 [−17.2 to 44.3] | 0.39 | 19.3 [12.6–26] | 1.47×10^−08 | 19 [12.5–25.6] | 1.07×10^−08 | 0 | 0.72 |
| Education (Post-secondary vs. less) | −18.2 [−46.8 to 10.5] | 0.21 | −0.8 [−4.6 to 3.1] | 0.7 | −1.1 [−4.9 to 2.7] | 0.58 | 28 | 0.24 |
| Parental history of T2D (Yes vs. No) | 15.1 [4.4–25.7] | 0.005 | 8.3 [4.1–12.5] | 1.00×10^−04 | 9.2 [5.3–13.1] | 3.54×10^−06 | 26 | 0.24 |
| PRS (Tertile 3 vs. 1+2) | 13.8 [4.9–22.6] | 0.002 | 12.2 [7.8–16.6] | 5.14×10^−08 | 12.5 [8.6–16.5] | 4.47×10^−10 | 0 | 0.76 |
| Low Diet Quality (Yes vs. No) | 8.9 [1.5–16.4] | 0.02 | | | | | | |
| Sum PAF of PRS (T3) and parental history of diabetes | 28.9 | | 20.5 | | 21.7 | | | |

AF values are percentages. GDM status derived using IADPSG criteria. Multivariate models included age, BMI, region of birth (South Asia vs. other), education, parental history of diabetes, parity, principal components 1–5, and diet quality (START only) when applicable. Abbreviations: AF, Attributable fraction; BiB, Born in Bradford; BMI, Body mass index; CI, Confidence interval; GDM, Gestational diabetes mellitus; IADPSG, International Association of Diabetes and Pregnancy Study Groups; PAF, Population attributable fraction; PRS, Polygenic risk score; QE P, P-value from the test for (residual) heterogeneity; SA, South Asia; START, SouTh Asian BiRth CohorT; T2D, Type 2 diabetes.
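For readers unfamiliar with the PAF, the sketch below computes a crude (unadjusted) version with Levin's formula from the BiB tertile counts in Table 1, and then reproduces the last row of Table 3 as a simple sum. Levin's formula is a textbook choice made for illustration; the estimator used for the adjusted values in Table 3 is not described in this section. The crude figure ignores all covariates and is therefore not expected to match the adjusted 12.2% reported for BiB.

```python
def paf_levin(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Crude population attributable fraction from Levin's formula."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed                      # crude risk ratio
    p_exposed = exposed_total / (exposed_total + unexposed_total)
    excess = p_exposed * (rr - 1)
    return excess / (1 + excess)


# BiB counts from Table 1: PRS tertile 3 (exposed) versus tertiles 1 and 2.
t3_cases, t3_total = 222, 222 + 1209
low_cases, low_total = 117 + 142, (117 + 142) + (1309 + 1291)
crude_paf = paf_levin(t3_cases, t3_total, low_cases, low_total)
print(f"Crude PAF: {100 * crude_paf:.1f}%")

# Last row of Table 3: adjusted meta-analysis PAFs for PRS tertile 3 and
# parental history of T2D, added together.
print(round(12.5 + 9.2, 1))
```

Adding two PAFs, as in the last row of Table 3, is itself a simplification, because attributable fractions for correlated exposures are not strictly additive.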

The detection rate associated with the top versus lower PRS tertile was equal to 10% for a 5% false positive rate.
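A detection rate of this size is roughly what a simple Gaussian screening model predicts. The sketch below assumes that the PRS is normally distributed with equal standard deviation in women with and without GDM, so that the log odds ratio per 1 SD equals the separation between the two group means. This is one common way of linking an odds ratio to screening performance and is not necessarily the calculation performed by the authors; the function name is hypothetical.

```python
from math import log
from statistics import NormalDist


def detection_rate(odds_ratio_per_sd, false_positive_rate=0.05):
    """Detection rate at a given false positive rate, assuming the score is
    normally distributed with equal SD in cases and non-cases."""
    std_normal = NormalDist()
    shift = log(odds_ratio_per_sd)  # case mean minus non-case mean, in SD units
    cutoff = std_normal.inv_cdf(1 - false_positive_rate)
    return 1 - std_normal.cdf(cutoff - shift)


dr5 = detection_rate(1.45)  # adjusted meta-analysis OR per 1 SD from Table 2
print(f"{100 * dr5:.1f}%")
```

With an odds ratio of 1.45 per SD, this model gives a detection rate of roughly 10% at a 5% false positive rate, which illustrates why a strongly associated score can still discriminate poorly.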

Interactions between the PRS and GDM risk factors on GDM

No consistent interactions were observed between the PRS and maternal age, parity, or education level modulating FPG, 2hG, AUCg, or GDM in START or BiB (Table 4 and Supplementary file 1e).

Table 4
Interaction effects between GDM risk factors and T2D PRS in START and BiB.

| Dependent variable | Interaction term | START: Beta/OR [95% CI] (a) | START: Pinteraction | BiB: Beta/OR [95% CI] (a) | BiB: Pinteraction | Meta-analysis: Beta/OR [95% CI] (a) | Meta-analysis: Pinteraction | I2 | QE p value |
|---|---|---|---|---|---|---|---|---|---|
| Fasting glucose | PRS×Age | −0.006 [−0.016 to 0.003] | 0.2 | 0.004 [0–0.008] | 0.07 | 0.002 [−0.001 to 0.006] | 0.23 | 72 | 0.06 |
| | PRS×BMI | −0.01 [−0.019 to −0.002] | 0.01 | 0.004 [0–0.008] | 0.05 | 0.001 [−0.002 to 0.005] | 0.42 | 89 | 0.002 |
| | PRS×Born in South Asia | −0.137 [−0.268 to −0.006] | 0.04 | 0.037 [−0.003 to 0.078] | 0.07 | 0.022 [−0.016 to 0.061] | 0.26 | 84 | 0.01 |
| | PRS×Parental history of T2D | −0.059 [−0.139 to 0.022] | 0.15 | 0.016 [−0.028 to 0.061] | 0.48 | −0.001 [−0.04 to 0.037] | 0.94 | 61 | 0.11 |
| | PRS×Parity | −0.014 [−0.062 to 0.034] | 0.56 | 0.004 [−0.01 to 0.018] | 0.6 | 0.002 [−0.011 to 0.016] | 0.73 | 0 | 0.48 |
| | PRS×Education level | −0.014 [−0.05 to 0.022] | 0.45 | −0.005 [−0.022 to 0.013] | 0.6 | −0.006 [−0.022 to 0.009] | 0.42 | 0 | 0.65 |
| | PRS×Low diet quality | 0.141 [0.053–0.228] | 0.002 | | | | | | |
| 2 hr post-load glucose | PRS×Age | 0 [−0.03 to 0.03] | 0.98 | 0.01 [−0.001 to 0.021] | 0.07 | 0.009 [−0.001 to 0.019] | 0.08 | 0 | 0.54 |
| | PRS×BMI | −0.022 [−0.047 to 0.003] | 0.09 | 0 [−0.01 to 0.011] | 0.94 | −0.003 [−0.012 to 0.007] | 0.56 | 61 | 0.11 |
| | PRS×Born in South Asia | −0.191 [−0.586 to 0.205] | 0.34 | 0.072 [−0.039 to 0.182] | 0.2 | 0.053 [−0.054 to 0.159] | 0.33 | 36 | 0.21 |
| | PRS×Parental history of T2D | −0.092 [−0.335 to 0.151] | 0.46 | 0.055 [−0.066 to 0.177] | 0.37 | 0.026 [−0.083 to 0.135] | 0.64 | 11 | 0.29 |
| | PRS×Parity | −0.039 [−0.184 to 0.107] | 0.6 | 0.009 [−0.03 to 0.047] | 0.66 | 0.005 [−0.032 to 0.043] | 0.77 | 0 | 0.54 |
| | PRS×Education level | 0.037 [−0.072 to 0.146] | 0.51 | 0.008 [−0.039 to 0.056] | 0.73 | 0.013 [−0.031 to 0.056] | 0.56 | 0 | 0.64 |
| | PRS×Low diet quality | 0.068 [−0.199 to 0.335] | 0.62 | | | | | | |
| AUC glucose | PRS×Age | −0.007 [−0.023 to 0.009] | 0.41 | 0.004 [−0.002 to 0.011] | 0.19 | 0.003 [−0.003 to 0.009] | 0.36 | 37 | 0.21 |
| | PRS×BMI | −0.014 [−0.027 to 0] | 0.05 | 0.002 [−0.004 to 0.008] | 0.52 | −0.001 [−0.006 to 0.005] | 0.82 | 77 | 0.04 |
| | PRS×Born in South Asia | −0.126 [−0.34 to 0.088] | 0.25 | 0.015 [−0.051 to 0.081] | 0.65 | 0.003 [−0.06 to 0.066] | 0.93 | 35 | 0.22 |
| | PRS×Parental history of T2D | −0.027 [−0.158 to 0.105] | 0.69 | 0.025 [−0.048 to 0.098] | 0.49 | 0.013 [−0.051 to 0.077] | 0.68 | 0 | 0.5 |
| | PRS×Parity | −0.057 [−0.135 to 0.022] | 0.16 | 0.006 [−0.017 to 0.029] | 0.6 | 0.001 [−0.021 to 0.023] | 0.91 | 56 | 0.13 |
| | PRS×Education level | 0.007 [−0.052 to 0.066] | 0.82 | −0.008 [−0.036 to 0.021] | 0.6 | −0.005 [−0.03 to 0.021] | 0.71 | 0 | 0.67 |
| | PRS×Low diet quality | 0.07 [−0.074 to 0.214] | 0.34 | | | | | | |
| GDM (IADPSG criteria) | PRS×Age | 0.99 [0.94–1.03] | 0.59 | 0.99 [0.96–1.01] | 0.17 | 0.99 [0.97–1] | 0.14 | 0 | 0.92 |
| | PRS×BMI | 0.97 [0.94–1.01] | 0.15 | 0.98 [0.96–1] | 0.03 | 0.98 [0.96–0.99] | 0.01 | 0 | 0.76 |
| | PRS×Born in South Asia | 0.65 [0.33–1.23] | 0.2 | 1.04 [0.82–1.31] | 0.76 | 0.98 [0.79–1.23] | 0.89 | 41 | 0.19 |
| | PRS×Parental history of T2D | 0.72 [0.5–1.04] | 0.08 | 1.07 [0.85–1.35] | 0.59 | 0.95 [0.78–1.16] | 0.63 | 67 | 0.08 |
| | PRS×Parity | 0.88 [0.71–1.09] | 0.23 | 0.99 [0.92–1.07] | 0.86 | 0.98 [0.91–1.05] | 0.57 | 15 | 0.28 |
| | PRS×Education level | 1.01 [0.85–1.19] | 0.93 | 0.93 [0.85–1.02] | 0.12 | 0.95 [0.87–1.03] | 0.2 | 0 | 0.4 |
| | PRS×Low diet quality | 1.26 [0.85–1.89] | 0.26 | | | | | | |

Nominally significant results are shown in bold. Results from models adjusted for age, BMI, education level, birth region (South Asia vs. other), parity, parental history of diabetes, and genetic PC axes 1–5.
(a) Values are Beta for continuous dependent variables (fasting, 2 hr, and AUC glucose), and OR for the binary variable (i.e. GDM).
Abbreviations: AUC, area under the curve; BiB, Born in Bradford; BMI, body mass index; CI, confidence interval; GDM, gestational diabetes mellitus; IADPSG, International Association of Diabetes and Pregnancy Study Groups; OR, odds ratio; PRS, polygenic risk score; QE P, P-value from the test for (residual) heterogeneity; START, SouTh Asian BiRth CohorT.

Some nominally significant interactions observed in START were not confirmed in BiB, and vice versa. For FPG, these included the PRS×BMI and the PRS×birth in South Asia (yes/no) interactions, which were nominally significant in START (Pinteraction=0.01 and 0.04, respectively) but not in BiB (Pinteraction PRS×BMI=0.05 and Pinteraction PRS×birth in South Asia=0.07). The effect sizes differed and the directions of effect were opposite between the two studies (Supplementary file 1f), resulting in non-significant meta-analysis results (Pinteraction PRS×BMI=0.42 and Pinteraction PRS×birth in South Asia=0.26). Another interaction between the PRS and BMI modulating the risk of GDM was observed in BiB (Pinteraction=0.03), but not in START (Pinteraction=0.15; Table 4). Given that the overall direction of effect was similar in the two studies, this interaction remained significant after meta-analysis (Pinteraction=0.01). Nevertheless, the result in START could be a false negative given the study’s smaller sample size (with a power of 9.9% to detect an interaction similar to that observed in BiB). Subgroup analysis shows that the impact of a higher PRS on the risk of GDM was stronger in participants in lower BMI categories (Supplementary file 1f, Figure 1). Finally, a PRS×diet quality interaction on FPG was detected in START (Pinteraction=0.002; Table 4), whereby the effect of the PRS appeared to be stronger in participants with a low diet quality (Beta=0.17 [95% CI=0.10–0.24]) than in participants with a medium or high diet quality (Beta=0.05 [95% CI=0.00–0.09]) (Supplementary file 1f and Figure 2). Our analysis shows that we have 90% power to detect such an interaction. The overall diet quality score was not available in BiB; hence, this interaction could not be tested for replication.
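The diet quality interaction can be sanity-checked from the two subgroup estimates alone. The sketch below compares the PRS slopes in the low and medium/high diet quality groups with a two-sided z-test for the difference between two independent estimates, with standard errors recovered from the reported 95% CIs. This is an approximation chosen for illustration, not the product-term regression model behind the Pinteraction values in Table 4, so it will not reproduce the reported p-value exactly. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def se_from_ci(lower, upper):
    """Recover a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z95)


def slope_difference_test(beta_a, se_a, beta_b, se_b):
    """Two-sided z-test for the difference between two independent slopes."""
    z = (beta_a - beta_b) / sqrt(se_a**2 + se_b**2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# PRS effect on FPG in START: low vs. medium/high diet quality (values from the text).
z, p = slope_difference_test(
    0.17, se_from_ci(0.10, 0.24),
    0.05, se_from_ci(0.00, 0.09),
)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The approximate test also yields a p-value well below 0.05, of the same order as the reported Pinteraction of 0.002.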

Figure 1. Predicted probability of GDM (IADPSG criteria) as a function of PRS (continuous), stratified by BMI groups in BiB.

Lines (with 95% confidence limits) represent predicted probabilities of GDM stratified by BMI groups (upper, middle, and lower tertiles). Models are adjusted for maternal age. BiB, Born in Bradford; BMI, body mass index; GDM, gestational diabetes mellitus; IADPSG, International Association of Diabetes and Pregnancy Study Groups; PRS, polygenic risk score; SD, standard deviation.

Figure 2. Multivariable regression of fasting plasma glucose (FPG) on PRS (continuous), stratified by diet quality score in START.

Regression lines (with 95% confidence limits) represent predictions of FPG. Models are adjusted for maternal age and BMI. BMI, body mass index; PRS, polygenic risk score; START, SouTh Asian BiRth CohorT.

Discussion

We demonstrate that a T2D PRS, based on an independent and multi-ethnic GWAS meta-analysis (with ~18% South Asian participants), is strongly associated with GDM and related glucose traits among South Asian pregnant women settled in Canada and the United Kingdom. This association is independent of other known GDM risk factors, including age, BMI, parental history of diabetes, and birth country. The highest PRS tertile had a PAF of 12.5% for GDM. Consistent with a recent trans-ethnic GWAS of GDM, these results support the hypothesis that GDM and T2D are part of the same underlying pathology (Pervjakova et al., 2022).

Family history of T2D is often used as a surrogate marker of the genetic risk of T2D. Our results show that the addition of the PRS to the multivariate models does not nullify the impact of parental history on GDM and vice versa. This suggests that the PRS and family history of diabetes both partially convey independent information. This partial independence could be explained by the fact that the PRS does not entirely capture the genetic association signals with GDM. On the other hand, family history reflects not only genetic similarity, but also shared non-genetic lifestyle factors.

By deriving a T2D PRS and showing its significant association with the risk of GDM, we confirm that the two diseases share a substantial proportion of their genetic background. In their recent publication, Pervjakova et al., 2022 also describe strong genetic similarities between the two traits by comparing the association and effect size of T2D variants to their effect on GDM. This convergence of observations using two different approaches (testing a PRS in our case versus independent loci in Pervjakova et al.) solidifies the hypothesis of a common genetic background between T2D and GDM. It is, however, important to note that, although BiB’s South Asian mothers were included in both analyses, they represented ~1.2% of the total sample size in Pervjakova et al., which suggests that our congruent conclusions are unlikely to have been driven by the sample overlap between the two studies.

Overall, the evidence for modulation of the PRS’s effect on GDM-related traits by other GDM risk factors was weak. Most interactions tested were not significant in either study. This absence of significance should, however, be treated with caution since our power analysis suggests that, given our sample size, we are only able to detect strong interaction effects. Two marginal interactions on FPG (PRS×BMI and PRS×birth in South Asia) were observed. These were close to significance in both studies but did not replicate in terms of either effect size or direction of effect, which argues against a power issue and suggests differences in the effect of these environmental factors between the two studies, or possibly false positive results. Furthermore, these interactions would not pass multiple testing corrections if applied. Two potentially stronger interactions, PRS×diet quality modulating FPG and PRS×BMI modulating GDM, were observed in START and BiB, respectively. However, since it was not possible to replicate these interactions (i.e., no comparable diet data available in BiB, and low power in START), future investigations are required in order to validate these observations. If confirmed, these interactions may help identify a subpopulation who will benefit the most from a targeted intervention for the prevention of GDM. Given the transient nature of GDM, another important research question would be the identification of women at greater risk of developing T2D after developing GDM, and how the genetic risk modulates this progression. This could be done by testing the association of a GDM/T2D PRS with subsequent T2D status in women with prior GDM, which could reveal whether women with prior GDM and a high genetic risk are more likely to develop T2D than women with prior GDM and a low genetic risk. Finally, given the low sensitivity of the PRS on its own, future studies should focus on deriving and estimating the predictive value of a composite score which combines the GDM/T2D PRS, family history of diabetes, prior GDM status, and diet quality score in order to improve the identification of women at higher risk of developing T2D.

The overall clinical implications of our findings should be carefully considered. At present, the use of laboratory-derived genetic information in the clinical setting remains expensive and is not implemented for complex diseases like GDM or T2D. Furthermore, our results show that, despite a strong association, the PRS has a low discriminatory value (detection rate of 10% for a 5% false positive rate) regarding GDM cases. This is in line with the observations of Wald and Old, 2019, who state that most polygenic scores of complex traits derived to date would perform poorly as screening tests in a clinical setting.

Our study has been considerably strengthened by the use of a PRS optimized for a large population of South Asians from two independent cohorts, as well as by the fact that GDM status was determined using objective OGTT measures. Nevertheless, there are some limitations to our analysis that should be considered: (i) the weights attributed to the genetic variants included in the PRS are derived from a T2D study. Overall, evidence points to a strong correlation between top variants from T2D and GDM GWASs. However, variants at some common loci (e.g., MTNR1B) might have significantly different effect size depending on the phenotype studied (Pervjakova et al., 2022). In addition, variants in at least one locus (HKDC1) have been strongly associated to GDM but not T2D (Pervjakova et al., 2022). More GDM-specific loci, or loci with a different magnitude of effect between GDM and T2D might be identified from future, larger studies. These observations suggest that future PRSs based on a GDM GWAS may have more power to detect gene×environment interactions. (ii) Second, some differences in measurements exist between START and BiB studies, including the timing of weight measurements, and the number of data points included in the calculation of AUCg. However, since data were standardized in both studies, we do not expect that AUCg measurements differences had a major impact on the results. (iii) Finally, the comparison of genetic data between START and BiB revealed the existence of slight genetic heterogeneity, both between and within the samples of these two cohorts. It is our assumption that these differences can be explained by the difference of sample size (START being smaller than BiB), as well as by historical differences in migration patterns from South Asia to Canada and the United Kingdom. 
For example, most START participants were first-generation migrants from India, whereas the majority of South Asians in BiB are descendants of Pakistani migrants who have been settled in the United Kingdom for several generations. In order to account for this genetic heterogeneity, we derived our T2D PRS by combining samples from the two studies. This PRS should be more generalizable to other South Asian studies. Another measure implemented to reduce the effect of population stratification was the adjustment for the PC axes in our analysis. Given the absence of heterogeneity in our PC-adjusted models for FPG, 2hG, or GDM (IADPSG definition), we consider that population stratification effects have been accounted for.

Conclusion

A T2D-derived PRS is strongly associated with the risk of GDM in pregnant women of South Asian descent, independent of parental history of diabetes, and other GDM risk factors.

Methods

Study design and participants

START is a prospective cohort study designed to evaluate the environmental and genetic determinants of cardio-metabolic traits among South Asian women and their offspring living in Canada (Anand et al., 2013). In brief, 1012 South Asian pregnant women, aged between 18 and 40 years old, were recruited during their second trimester of pregnancy from the Peel Region (Ontario, Canada) through physician referrals between 2011 and 2015. All START participants provided informed consent, and the study was approved by local ethics committees (Hamilton Integrated Research Ethics Board [ID:10-640], William Osler Health System [ID:11-0001], and Trillium Health Partners [RCC:11-018, ID:492]).

BiB is a prospective, longitudinal family cohort study designed to investigate the causes of illness, and develop interventions to improve health in a deprived multi-ethnic population in Bradford, England, UK (Wright et al., 2013). Between 2007 and 2011, 12,453 women of various ethnic backgrounds (~46% South Asian origin) were recruited between their 24th and 28th week of pregnancy. Detailed information on socio-economic characteristics, ethnicity, family history, environmental, and physical risk factors has been collected (Farrar et al., 2015; Wright et al., 2013). Ethical approval for all aspects of the research was granted by Bradford Research Ethics Committee [Ref 07/H1302/112].

Measurements and questionnaires

SouTh Asian BiRth CohorT

A detailed description of the maternal measurements has been published previously (Anand et al., 2017). Briefly, weight and height were measured using standard procedures, and information about pre-pregnancy weight, family, and personal medical history was collected using questionnaires. Parental history of diabetes was derived from baseline questionnaires and categorized as neither parent, either one, or both parents having a history of diabetes. Birth country, number of years spent in Canada, and education-related variables were self-reported. Participants’ highest level of education was coded as a five-category ordinal variable: 1: less than high school; 2: high school completed; 3: diploma or certificate from trade, technical, or vocational school; 4: Bachelor’s or undergraduate degree, or teacher’s college; and 5: Master’s, Doctorate, or professional degree. A binary ‘born in South Asia’ variable distinguished participants born in South Asia (India, Pakistan, Sri Lanka, or Bangladesh) from participants born in any other country. A validated ethnic-specific food frequency questionnaire (FFQ) was used to collect dietary information (Kelemen et al., 2003). The following steps were implemented in order to calculate the diet quality of each participant: (i) for each of the following four food groups (green leafy vegetables; raw vegetables; other cooked vegetables; and fruits), 1 point was given for consuming ≥ the study population median (vs. 0 points if intake < population median); (ii) for each of the following two food groups (fried foods/fast food/snacks; and meat/poultry), 1 point was given for consuming < the study population median (vs. 0 points if intake ≥ population median); (iii) the points attributed to each of the six food groups mentioned above were summed in order to derive a continuous food score (ranging from 0 to 6 points), which was subsequently divided into three categories (low diet quality if food score=1 or 2; medium diet quality if food score=3 or 4; and high diet quality if food score=5 or 6); (iv) a binary diet quality variable used in our analysis was coded as follows: low diet quality if food score=1 or 2; medium or high quality if food score≥3 (Anand et al., 2017).
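The scoring steps above can be illustrated with a minimal Python sketch. The food group keys and the function names are hypothetical labels chosen for this example, not names from the START data set. The text does not say how a food score of 0 is classified, so the sketch assumes it falls in the low-quality category.

```python
from statistics import median

# Hypothetical keys for the six food groups described in the text.
HEALTHY = ("green_leafy_vegetables", "raw_vegetables",
           "other_cooked_vegetables", "fruits")
LESS_HEALTHY = ("fried_fast_snack_foods", "meat_poultry")


def diet_scores(intakes):
    """Return a 0-6 food score per participant.

    intakes: list of dicts mapping each food group to a numeric intake.
    Medians are computed within the supplied study population.
    """
    medians = {g: median(p[g] for p in intakes) for g in HEALTHY + LESS_HEALTHY}
    scores = []
    for p in intakes:
        # 1 point per healthy group consumed at or above the median.
        score = sum(p[g] >= medians[g] for g in HEALTHY)
        # 1 point per less healthy group consumed below the median.
        score += sum(p[g] < medians[g] for g in LESS_HEALTHY)
        scores.append(score)
    return scores


def binary_diet_quality(score):
    """Binary variable; a score of 0 is assumed to count as low quality."""
    return "medium_or_high" if score >= 3 else "low"
```

Because each threshold is the median of the study population itself, a participant's score depends on who else is in the sample, so scores are not directly comparable across cohorts.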

Born in Bradford

Maternal height was measured during the recruitment visit (24th–28th weeks of pregnancy) using standard procedures. In the absence of pre-pregnancy weight data, weight from the first antenatal clinic visit (average 12 weeks of pregnancy) was used to calculate BMI. Ethnicity of participants and years spent in the United Kingdom were self-reported at recruitment through an interviewer-administered questionnaire; missing ethnicity data were backfilled from primary care data when available. The South Asian ethnicity of all participants included in this analysis was validated using genetic data. Parental history of diabetes and ‘born in South Asia’ variables were derived from the baseline questionnaire data and coded as in START. Since only a very small proportion of BiB’s participants completed an FFQ that included information about fruit and vegetable intake, the diet quality score could not be derived in BiB. Data regarding the participant’s highest educational qualification were equalized (using UK standards) and recoded into the following categories: 1: fewer than 5 General Certificate of Secondary Education (GCSE) equivalent; 2: 5 GCSE equivalent; 3: A-level equivalent; and 4: higher than A-level. Data for unclassifiable foreign degrees were considered as missing.

Outcomes

Study participants without prior T2D were invited to undertake a 75-g oral glucose tolerance test (OGTT) in both START and BiB, and FPG and 2hG levels were measured (1 hr post-load glucose was measured in START only). AUCg was calculated using the FPG and 2hG glucose levels in BiB, and using the FPG, 1 hr post-load glucose, and 2hG levels in START (Anand et al., 2017). Given the difference in the number of data points included in the calculation of AUCg between the two studies and the skewness of the distributions, values were log-transformed, winsorized, and standardized in each study before analysis. Gestational diabetes status of women without pre-existing T2D was primarily defined based on OGTT results in both studies using the International Association of Diabetes and Pregnancy Study Groups (IADPSG) GDM criteria (FPG≥5.1 mmol/L, or 1hG≥10.8 mmol/L, or 2hG≥8.5 mmol/L) (Metzger et al., 2010). Our secondary outcome was GDM using BiB’s South Asian-specific definition (FPG≥5.2 mmol/L, or 2hG≥7.2 mmol/L) (Farrar et al., 2015), which will be referred to as the South Asian-specific definition hereafter. Self-reported GDM status or data from the birth chart were used to determine GDM status if OGTT measures were unavailable (N=65 and 31 in START and BiB, respectively). Women with pre-existing diabetes at baseline were not included in this analysis. Pre-pregnancy diabetes status was determined using maternal self-reported data (about diabetes diagnosis, diabetes medication, and/or insulin intake prior to pregnancy) in START. In BiB, information on pre-pregnancy diabetes was backfilled from electronic medical records.
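As an illustration of the two OGTT-based definitions, the following Python sketch applies the thresholds exactly as they are printed in the paragraph above. The function names are hypothetical, and treating a missing measurement as "criterion not met" is an assumption of this sketch rather than a rule stated in the text.

```python
def _meets(value, threshold):
    """True if a glucose value (mmol/L) is available and at or above threshold."""
    return value is not None and value >= threshold


def gdm_iadpsg(fpg, two_hr, one_hr=None):
    """IADPSG-based definition, thresholds copied from the text above.

    one_hr is optional because 1 hr glucose was measured in START only.
    """
    return _meets(fpg, 5.1) or _meets(one_hr, 10.8) or _meets(two_hr, 8.5)


def gdm_south_asian_specific(fpg, two_hr):
    """BiB's South Asian-specific definition (secondary outcome)."""
    return _meets(fpg, 5.2) or _meets(two_hr, 7.2)
```

For example, a woman with FPG of 4.8 mmol/L and 2hG of 7.5 mmol/L would be a control under the IADPSG-based definition but a case under the South Asian-specific one, because the latter uses a lower 2hG threshold.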

In order to keep a single pregnancy (and a single GDM status) per mother in BiB, only pregnancies with no missing data for GDM were included. For mothers with available data at multiple pregnancies at this stage, pregnancies with no missing data across all covariates (age, BMI, family history, birth country, parity, and education level) were prioritized. Next, only pregnancies with the least amount of missing data across all covariates were kept. The following two additional filtering approaches were then applied for mothers with multiple pregnancies remaining: (i) if GDM was not diagnosed at any of the pregnancies, phenotype data at the latest available time point was kept (i.e., keep older GDM controls) and (ii) if GDM was diagnosed during any of the pregnancies included in the study, the earliest time point where GDM was diagnosed was kept (i.e., keep younger GDM cases).
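The selection rules above can be read as a small filtering procedure. The Python sketch below is one possible reading, under stated assumptions: each pregnancy is a dict with a hypothetical `order` field (time index), a `gdm` field (True, False, or None when missing), and the six covariates under hypothetical key names. It is not the authors' code.

```python
# Hypothetical covariate keys, following the list given in the text.
COVARIATES = ("age", "bmi", "family_history", "birth_country",
              "parity", "education")


def n_missing(pregnancy):
    """Number of covariates that are missing (None) for one pregnancy."""
    return sum(pregnancy.get(c) is None for c in COVARIATES)


def select_pregnancy(pregnancies):
    """Pick a single pregnancy for one mother, or None if none qualifies."""
    # Step 1: keep only pregnancies with a known GDM status.
    candidates = [p for p in pregnancies if p.get("gdm") is not None]
    if not candidates:
        return None
    # Steps 2-3: keep pregnancies with the fewest missing covariates
    # (complete records, when present, are therefore always preferred).
    fewest = min(n_missing(p) for p in candidates)
    candidates = [p for p in candidates if n_missing(p) == fewest]
    # Step 4: earliest GDM pregnancy if any case remains, else latest control.
    cases = [p for p in candidates if p["gdm"]]
    if cases:
        return min(cases, key=lambda p: p["order"])
    return max(candidates, key=lambda p: p["order"])
```

Note that, in this reading, completeness is applied before the case/control rule, so a GDM pregnancy with missing covariates can be dropped in favor of a complete control pregnancy.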

DNA extraction, genotyping, imputation, and filtering

SouTh Asian BiRth CohorT

DNA was extracted and genotyped for 867 mothers using the Illumina Human CoreExome-24 and Infinium CoreExome-24 arrays (Illumina, San Diego, CA). A total of 837 samples passed standard quality control procedures (Anderson et al., 2010). Genotype data were handled using PLINK v1.90b6.8 (Chang et al., 2015). Genotypes were phased and imputed using SHAPEIT v2.12 (Delaneau et al., 2014) and IMPUTE v2.3.2 (Howie et al., 2009), respectively, using the 1000 Genomes (phase 3) data as a reference panel (Auton et al., 2015). Variants with an info score <0.7 were removed from analysis. In total, 837 START participants with both genotypes and available GDM status, FPG, 1hG, and/or 2hG levels were included in the analysis.

Born in Bradford

DNA was extracted and genotyped for 16,267 and 3663 BiB participants using the Illumina HumanCoreExome (12v1.0, 12v1.1, or 24v1.0) and Infinium Global Screening Array (24v2.0) arrays, respectively (Illumina, San Diego, CA). A total of 4372 South Asian mothers passed genotyping quality controls, had GDM status, FPG, and/or 2hG levels available, and were included in our analysis. Genotype data were handled using PLINK v1.90b6.8 (Chang et al., 2015).

Deriving the PRS

Given the absence of publicly available South Asian-specific T2D or GDM GWAS data at the time of the analysis, weights were derived from DIAGRAM’s 2014 multi-ethnic T2D GWAS meta-analysis, which included over 18% South Asians (~63% European and 19% other ethnic backgrounds) (Mahajan et al., 2014). A grid search approach was used to identify the optimal parameters (17 p value thresholds tested, ranging from 5×10⁻⁸ to 1 with 0.1 increase; 4 heritability values tested: 0.023, 0.06, 0.08, and 0.12). START and BiB genotypes were pooled. About 70% of the samples’ data were used for training and 30% for validation (random sampling stratified by study) in order to minimize the impact of population stratification. The PRS was derived using LDpred2 (Privé et al., 2020). The best PRS (i.e., the one that maximized the AUC) was characterized by a p value≤0.0014 and an h²=0.08 (number of SNVs=6492). The PRS was standardized (mean=0, standard deviation=1) in both studies before analysis.
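The tuning logic described above (a split stratified by study, then selection of the parameter pair with the highest AUC) can be sketched in plain Python. This is not LDpred2, which is an R tool; the sketch assumes that candidate scores for each (p value threshold, h²) pair have already been computed elsewhere, and all function and field names are hypothetical.

```python
import random


def stratified_split(samples, train_frac=0.7, seed=0):
    """Randomly split samples into training/validation sets within each study.

    samples: list of dicts that each carry a 'study' key (e.g. 'START', 'BiB').
    """
    rng = random.Random(seed)
    by_study = {}
    for s in samples:
        by_study.setdefault(s["study"], []).append(s)
    train, valid = [], []
    for group in by_study.values():
        group = group[:]
        rng.shuffle(group)
        cut = round(train_frac * len(group))
        train.extend(group[:cut])
        valid.extend(group[cut:])
    return train, valid


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties = 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y]
    controls = [s for s, y in zip(scores, labels) if not y]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def best_parameters(candidate_scores, labels):
    """candidate_scores maps (p_threshold, h2) to a list of per-sample scores."""
    return max(candidate_scores, key=lambda key: auc(candidate_scores[key], labels))
```

Splitting within each study keeps the START-to-BiB ratio the same in both subsets, which is the stated reason for stratifying.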

Principal component analysis of genetic data

A principal component analysis (PCA) was performed using the PC-Air function from the GENESIS R package (v2.20.0) (Conomos et al., 2015a; Conomos et al., 2015b). Kinship matrices (required to derive PCs with PC-Air) were derived using KING (v2.2.5) (Manichaikul et al., 2010a; Manichaikul et al., 2010b).

Statistical analysis

Regression models

The statistical analysis was conducted using R (v3.6.3) (R Core Team, 2016). Linear regression models were used to test the association between the PRS and FPG, 2hG, and AUCg. PRS and GDM associations were tested using logistic regression. Both univariate and multivariate models were constructed; the latter were adjusted for GDM risk factors (age, BMI, parity, birth in South Asia [yes vs. no], education level, and diet quality [in START only]) and the first five PCs (in order to minimize the effect of population stratification). Interactions between the PRS and each risk factor were also tested. Interaction plots were produced using the interactions R package (v1.2.0.9000) (Long, 2021).
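Because the PRS is standardized, a logistic regression coefficient for the PRS is a log-odds change per 1 SD, which is how a result such as "a 45% increase in the risk of GDM" per SD is obtained. The Python sketch below shows this conversion. The standard error used in the usage note is an illustrative value back-calculated from the confidence interval reported in the abstract, not a figure taken from the paper's tables.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its SE to an OR with a 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def percent_increase(odds_ratio):
    """Express an odds ratio as a percentage increase in odds."""
    return (odds_ratio - 1.0) * 100.0
```

For instance, `odds_ratio_ci(math.log(1.45), 0.049)` returns an odds ratio of 1.45 with an interval close to the reported 1.32 to 1.60.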

Population attributable fractions

The estimated PAFs and their corresponding standard errors were calculated using the AF R package (v.0.1.5) (Dahlqwist and Sjolander, 2019). To this end, continuous variables were recoded into categorical variables: age was divided into two categories ([29–31, 32–43] vs. 19–28 years); BMI was stratified into a two-category variable using South Asian obesity cutoff points suggested by Gray et al., 2011 (<23 vs. ≥23 kg/m²); the PRS was divided into two categories (tertiles 1+2 versus tertile 3); parity was divided into two categories (primiparity versus 1 pregnancy or more); education level was divided into two categories (completed high school or lower versus higher degree, diploma, or certificate in START; and A-level equivalent or lower versus higher than A-level in BiB).
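To give an intuition for what a PAF measures with a binary exposure such as "PRS tertile 3 versus tertiles 1+2", the sketch below uses Levin's classical formula. This is a deliberately simpler, unadjusted estimator swapped in for illustration; the AF package cited above uses model-based estimation, so the numbers from this sketch would not be expected to reproduce the paper's results. The relative risk in the usage note is hypothetical.

```python
def levin_paf(exposure_prevalence, relative_risk):
    """Levin's population attributable fraction for a binary exposure.

    PAF = p(RR - 1) / (1 + p(RR - 1)), where p is the proportion of the
    population exposed and RR is the relative risk in the exposed.
    """
    excess = exposure_prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)
```

With one third of women in the top tertile and a hypothetical relative risk of 1.5, `levin_paf(1/3, 1.5)` gives about 0.14, meaning roughly 14% of cases would be attributed to the exposure under this formula.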

Detection and false positive rates

Detection rate (sensitivity) and false positive rate (1-specificity) for the OR of association of PRS tertile 3 versus 1 were estimated using the risk-screening converter tool developed by Wald and Morris, 2011.
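The link between an odds ratio and screening performance can be sketched with a simple Gaussian model. This is not the Wald and Morris converter itself: the sketch assumes the standardized PRS is normally distributed with equal variance in cases and controls, in which case the mean difference in SD units equals the log odds ratio per SD, and it takes the per-SD odds ratio of 1.45 from the abstract as input rather than the tertile 3 versus 1 odds ratio.

```python
import math
from statistics import NormalDist


def detection_rate(or_per_sd, false_positive_rate=0.05):
    """Detection rate at a given false positive rate under a Gaussian model.

    Controls: PRS ~ N(0, 1). Cases: PRS ~ N(ln(OR per SD), 1).
    """
    shift = math.log(or_per_sd)
    # Cutoff above which the chosen fraction of controls screens positive.
    cutoff = NormalDist().inv_cdf(1.0 - false_positive_rate)
    # Fraction of cases whose score exceeds that cutoff.
    return 1.0 - NormalDist(mu=shift, sigma=1.0).cdf(cutoff)
```

Under these assumptions, `detection_rate(1.45)` is close to 0.10, which is consistent with the detection rate of 10% for a 5% false positive rate quoted in the discussion.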

Power analysis for interactions

Power to detect interactions was estimated using the InteractionPoweR R package (v0.1.1) (Baranger et al., 2022). Monte Carlo simulation was performed with 10,000 simulations and an alpha of 0.05.
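The idea behind simulation-based power for an interaction term can be sketched in plain Python: simulate data with a known interaction effect, fit a linear model, and count how often the interaction is significant. This is a generic illustration and not the InteractionPoweR implementation. The main-effect sizes (0.2), the independent standard normal predictors, and the use of a normal critical value (1.96) are all assumptions of this sketch.

```python
import math
import random


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]


def interaction_significant(n, b_int, rng, z_crit=1.96):
    """Simulate one data set and test the x1*x2 coefficient by OLS."""
    rows, ys = [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append((1.0, x1, x2, x1 * x2))
        ys.append(0.2 * x1 + 0.2 * x2 + b_int * x1 * x2 + rng.gauss(0, 1))
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    rss = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, y in zip(rows, ys))
    se = math.sqrt(rss / (n - k) * inv[3][3])
    return abs(beta[3] / se) > z_crit


def interaction_power(n, b_int, n_sims=200, seed=1):
    """Fraction of simulated data sets with a significant interaction."""
    rng = random.Random(seed)
    hits = sum(interaction_significant(n, b_int, rng) for _ in range(n_sims))
    return hits / n_sims
```

Calling `interaction_power(n, 0.0)` should return a value near the alpha level, which is a useful sanity check before estimating power for non-zero effects.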

Data availability

Data from START are not publicly available, since the study is bound by consent which indicates the data will not be used by an outside group. Requests for collaboration or replication will be considered for research purposes only (no commercial use allowed, as per the study's informed consent). Requests should be addressed to the study's principal investigator (Sonia Anand, anands@mcmaster.ca) via a form which will be provided upon request by emailing natcampb@mcmaster.ca. The request will be evaluated by PIs and co-investigators, and projects deemed of scientific interest will be further evaluated/validated by the local REB chair. Born in Bradford data are available for research purposes only by sending an expression of interest form downloadable from https://borninbradford.nhs.uk/wp-content/uploads/BiB_EoI_v3.1_10.05.21.doct to borninbradford@bthft.nhs.uk. The proposal will be reviewed by BiB's executive team. If the request is approved, the requester will be asked to sign a Data Sharing Contract and a Data Sharing Agreement. Full details on how to access data and forms can be found at https://borninbradford.nhs.uk/research/how-to-access-data/. The code used to analyze the data is available at https://github.com/AmelLamri/Paper_T2dPrsGdm_StartBiB (copy archived at swh:1:rev:78a26e8d3c4088325572b8a79e132dca65b7a67f). All shareable processed versions of the datasets used in the manuscript are made available as supplementary material or at https://github.com/AmelLamri/Paper_T2dPrsGdm_StartBiB.

References

    1. Archambault C
    2. Arel R
    3. Filion KB
    (2014)
    Gestational diabetes and risk of cardiovascular disease: a scoping review
    Open Medicine 8:e1–e9.
    1. Mahajan A
    2. Go MJ
    3. Zhang W
    4. Below JE
    5. Gaulton KJ
    6. Ferreira T
    7. Horikoshi M
    8. Johnson AD
    9. Ng MCY
    10. Prokopenko I
    11. Saleheen D
    12. Wang X
    13. Zeggini E
    14. Abecasis GR
    15. Adair LS
    16. Almgren P
    17. Atalay M
    18. Aung T
    19. Baldassarre D
    20. Balkau B
    21. Bao Y
    22. Barnett AH
    23. Barroso I
    24. Basit A
    25. Been LF
    26. Beilby J
    27. Bell GI
    28. Benediktsson R
    29. Bergman RN
    30. Boehm BO
    31. Boerwinkle E
    32. Bonnycastle LL
    33. Burtt N
    34. Cai Q
    35. Campbell H
    36. Carey J
    37. Cauchi S
    38. Caulfield M
    39. Chan JCN
    40. Chang L-C
    41. Chang T-J
    42. Chang Y-C
    43. Charpentier G
    44. Chen C-H
    45. Chen H
    46. Chen Y-T
    47. Chia K-S
    48. Chidambaram M
    49. Chines PS
    50. Cho NH
    51. Cho YM
    52. Chuang L-M
    53. Collins FS
    54. Cornelis MC
    55. Couper DJ
    56. Crenshaw AT
    57. van Dam RM
    58. Danesh J
    59. Das D
    60. de Faire U
    61. Dedoussis G
    62. Deloukas P
    63. Dimas AS
    64. Dina C
    65. Doney AS
    66. Donnelly PJ
    67. Dorkhan M
    68. van Duijn C
    69. Dupuis J
    70. Edkins S
    71. Elliott P
    72. Emilsson V
    73. Erbel R
    74. Eriksson JG
    75. Escobedo J
    76. Esko T
    77. Eury E
    78. Florez JC
    79. Fontanillas P
    80. Forouhi NG
    81. Forsen T
    82. Fox C
    83. Fraser RM
    84. Frayling TM
    85. Froguel P
    86. Frossard P
    87. Gao Y
    88. Gertow K
    89. Gieger C
    90. Gigante B
    91. Grallert H
    92. Grant GB
    93. Groop LC
    94. Groves CJ
    95. Grundberg E
    96. Guiducci C
    97. Hamsten A
    98. Han B-G
    99. Hara K
    100. Hassanali N
    101. Hattersley AT
    102. Hayward C
    103. Hedman AK
    104. Herder C
    105. Hofman A
    106. Holmen OL
    107. Hovingh K
    108. Hreidarsson AB
    109. Hu C
    110. Hu FB
    111. Hui J
    112. Humphries SE
    113. Hunt SE
    114. Hunter DJ
    115. Hveem K
    116. Hydrie ZI
    117. Ikegami H
    118. Illig T
    119. Ingelsson E
    120. Islam M
    121. Isomaa B
    122. Jackson AU
    123. Jafar T
    124. James A
    125. Jia W
    126. Jöckel K-H
    127. Jonsson A
    128. Jowett JBM
    129. Kadowaki T
    130. Kang HM
    131. Kanoni S
    132. Kao WHL
    133. Kathiresan S
    134. Kato N
    135. Katulanda P
    136. Keinanen-Kiukaanniemi KM
    137. Kelly AM
    138. Khan H
    139. Khaw K-T
    140. Khor C-C
    141. Kim H-L
    142. Kim S
    143. Kim YJ
    144. Kinnunen L
    145. Klopp N
    146. Kong A
    147. Korpi-Hyövälti E
    148. Kowlessur S
    149. Kraft P
    150. Kravic J
    151. Kristensen MM
    152. Krithika S
    153. Kumar A
    154. Kumate J
    155. Kuusisto J
    156. Kwak SH
    157. Laakso M
    158. Lagou V
    159. Lakka TA
    160. Langenberg C
    161. Langford C
    162. Lawrence R
    163. Leander K
    164. Lee J-M
    165. Lee NR
    166. Li M
    167. Li X
    168. Li Y
    169. Liang J
    170. Liju S
    171. Lim W-Y
    172. Lind L
    173. Lindgren CM
    174. Lindholm E
    175. Liu C-T
    176. Liu JJ
    177. Lobbens S
    178. Long J
    179. Loos RJF
    180. Lu W
    181. Luan J
    182. Lyssenko V
    183. Ma RCW
    184. Maeda S
    185. Mägi R
    186. Männisto S
    187. Matthews DR
    188. Meigs JB
    189. Melander O
    190. Metspalu A
    191. Meyer J
    192. Mirza G
    193. Mihailov E
    194. Moebus S
    195. Mohan V
    196. Mohlke KL
    197. Morris AD
    198. Mühleisen TW
    199. Müller-Nurasyid M
    200. Musk B
    201. Nakamura J
    202. Nakashima E
    203. Navarro P
    204. Ng P-K
    205. Nica AC
    206. Nilsson PM
    207. Njølstad I
    208. Nöthen MM
    209. Ohnaka K
    210. Ong TH
    211. Owen KR
    212. Palmer CNA
    213. Pankow JS
    214. Park KS
    215. Parkin M
    216. Pechlivanis S
    217. Pedersen NL
    218. Peltonen L
    219. Perry JRB
    220. Peters A
    221. Pinidiyapathirage JM
    222. Platou CG
    223. Potter S
    224. Price JF
    225. Qi L
    226. Radha V
    227. Rallidis L
    228. Rasheed A
    229. Rathman W
    230. Rauramaa R
    231. Raychaudhuri S
    232. Rayner NW
    233. Rees SD
    234. Rehnberg E
    235. Ripatti S
    236. Robertson N
    237. Roden M
    238. Rossin EJ
    239. Rudan I
    240. Rybin D
    241. Saaristo TE
    242. Salomaa V
    243. Saltevo J
    244. Samuel M
    245. Sanghera DK
    246. Saramies J
    247. Scott J
    248. Scott LJ
    249. Scott RA
    250. Segrè AV
    251. Sehmi J
    252. Sennblad B
    253. Shah N
    254. Shah S
    255. Shera AS
    256. Shu XO
    257. Shuldiner AR
    258. Sigurđsson G
    259. Sijbrands E
    260. Silveira A
    261. Sim X
    262. Sivapalaratnam S
    263. Small KS
    264. So WY
    265. Stančáková A
    266. Stefansson K
    267. Steinbach G
    268. Steinthorsdottir V
    269. Stirrups K
    270. Strawbridge RJ
    271. Stringham HM
    272. Sun Q
    273. Suo C
    274. Syvänen A-C
    275. Takayanagi R
    276. Takeuchi F
    277. Tay WT
    278. Teslovich TM
    279. Thorand B
    280. Thorleifsson G
    281. Thorsteinsdottir U
    282. Tikkanen E
    283. Trakalo J
    284. Tremoli E
    285. Trip MD
    286. Tsai FJ
    287. Tuomi T
    288. Tuomilehto J
    289. Uitterlinden AG
    290. Valladares-Salgado A
    291. Vedantam S
    292. Veglia F
    293. Voight BF
    294. Wang C
    295. Wareham NJ
    296. Wennauer R
    297. Wickremasinghe AR
    298. Wilsgaard T
    299. Wilson JF
    300. Wiltshire S
    301. Winckler W
    302. Wong TY
    303. Wood AR
    304. Wu J-Y
    305. Wu Y
    306. Yamamoto K
    307. Yamauchi T
    308. Yang M
    309. Yengo L
    310. Yokota M
    311. Young R
    312. Zabaneh D
    313. Zhang F
    314. Zhang R
    315. Zheng W
    316. Zimmet PZ
    317. Altshuler D
    318. Bowden DW
    319. Cho YS
    320. Cox NJ
    321. Cruz M
    322. Hanis CL
    323. Kooner J
    324. Lee J-Y
    325. Seielstad M
    326. Teo YY
    327. Boehnke M
    328. Parra EJ
    329. Chambers JC
    330. Tai ES
    331. McCarthy MI
    332. Morris AP
    333. DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) Consortium
    334. Asian Genetic Epidemiology Network Type 2 Diabetes (AGEN-T2D) Consortium
    335. South Asian Type 2 Diabetes (SAT2D) Consortium
    336. Mexican American Type 2 Diabetes (MAT2D) Consortium
    337. Type 2 Diabetes Genetic Exploration by Next-generation sequencing in multi-Ethnic Samples (T2D-GENES) Consortium
    (2014)
    Genome-wide trans-ancestry meta-analysis provides insight into the genetic architecture of type 2 diabetes susceptibility
    Nature Genetics 46:234–244.
    https://doi.org/10.1038/ng.2897
    1. Solomon CG
    2. Willett WC
    3. Carey VJ
    4. Rich-Edwards J
    5. Hunter DJ
    6. Colditz GA
    7. Stampfer MJ
    8. Speizer FE
    9. Spiegelman D
    10. Manson JE
    (1997)
    A prospective study of pregravid determinants of gestational diabetes mellitus
    JAMA 278:1078–1083.

Decision letter

  1. Edward D Janus
    Reviewing Editor; University of Melbourne, Australia
  2. Ricardo Azziz
    Senior Editor; University at Albany, SUNY, United States
  3. Edward D Janus
    Reviewer; University of Melbourne, Australia

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "The Genetic Risk of Gestational Diabetes in South Asian women" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, including Edward D Janus as Reviewing Editor and Reviewer #1, and the evaluation has been overseen by Ricardo Azziz as the Senior Editor.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1) Given that after delivery GDM resolves to normoglycaemia in most cases there is considerable interest in finding ways to determine which women with GDM (rather than all) should be targeted for diet and lifestyle intervention to prevent development of subsequent T2DM. While this is not the objective of your study you could nevertheless note this in your discussion and suggest how this might be studied to further understanding of this issue.

2) Are the 'Born in Bradford' participants in this study also in the Pervjakova paper? – Pervjakova et al. 2022, PMID: 35220425 which is referenced in the manuscript (as a medRxiv – now published). Please make clear what the degree of overlap between BIB participants is between these papers if any. Please be more explicit about what this paper adds to the Pervjakova analysis in the Discussion. This will be useful to readers, because that previous paper reached similar conclusions about the overlap between T2D and GDM genetic susceptibilities, and most of the statistical power here comes from the BIB cohort.

3) The support for the hypothesis from the newly reported START cohort is strong and important. It seems unlikely that the interaction analyses would have had adequate power for the negative result to be interpreted as a strong negative. Can the authors comment on this, and possibly quantify the power they had?

4) There is a debate about the utility of PRS in screening which goes beyond cost (p16); as the authors will know Wald and others have argued that for most diseases PRS, even with a high aetiological fraction, would lack sensitivity and specificity if used for screening. Readers might be interested to know what sort of sensitivity and specificity for GDM might result with the OR of 1.47 found here. The authors do use the phrase "enhanced predictive value" on p16 so I believe asking for that to be quantified is fair.

https://doi.org/10.7554/eLife.81498.sa1

Author response

Essential revisions:

1) Given that after delivery GDM resolves to normoglycaemia in most cases there is considerable interest in finding ways to determine which women with GDM (rather than all) should be targeted for diet and lifestyle intervention to prevent development of subsequent T2DM. While this is not the objective of your study you could nevertheless note this in your discussion and suggest how this might be studied to further understanding of this issue.

Thank you for this suggestion. This point is now addressed in the discussion page 11 line 167 of the revised .docx manuscript (line 168 in the .pdf file) where we state that:

“Given the transient nature of GDM, another important research question would be the identification of women at greater risk of developing T2D after developing GDM, and how the genetic risk modulates this progression. This could be done by testing the interactions between a GDM/T2D PRS and T2D status in women with prior GDM. This could reveal whether women with prior GDM and a high genetic risk are more likely to develop T2D than women with prior GDM and a low genetic risk. Finally, given the low sensitivity of the PRS themselves, future studies should focus on deriving and estimating the predictive value of a composite score which combines the GDM/T2D PRS, family history of diabetes, prior GDM status, and diet quality score in order to improve the identification of women at higher risk of developing T2D.”

2) Are the 'Born in Bradford' participants in this study also in the Pervjakova paper? – Pervjakova et al. 2022, PMID: 35220425 which is referenced in the manuscript (as a medRxiv – now published). Please make clear what the degree of overlap between BIB participants is between these papers if any. Please be more explicit about what this paper adds to the Pervjakova analysis in the Discussion. This will be useful to readers, because that previous paper reached similar conclusions about the overlap between T2D and GDM genetic susceptibilities, and most of the statistical power here comes from the BIB cohort.

This is indeed an important point, which we now discuss on page 10 line 142 of the revised .docx manuscript (line 143 in .pdf). In brief: We mention the overlap between the BiB samples in the Pervjakova paper and ours, but state that it probably didn’t have a major impact given that BiB’s South Asian participants represent ~1.2% of Pervjakova’s total sample size. As for the added value of our manuscript, we state that the two studies reached a similar conclusion, but by using two different approaches (PRS in our case, locus by locus in Pervjakova), which helps strengthen the claim of a common genetic background.

3) The support for the hypothesis from the newly reported START cohort is strong and important. It seems unlikely that the interaction analyses would have had adequate power for the negative result to be interpreted as a strong negative. Can the authors comment on this, and possibly quantify the power they had?

Lack of power is indeed an issue, especially for interaction tests. We now acknowledge this in the discussion page 11 line 153 (line 154 in .pdf) by stating that:

“Most interactions tested were not significant in both studies. This absence of significance should however be treated with caution since our power analysis suggests that, given our sample size, we are only able to detect strong interaction effects for the majority of our tests”.

We also show the power estimations for the two strongest interactions (FPG ~ PRS x Diet and GDM ~ PRS x BMI) in the Results section page 19 lines 117 and 123 (lines 119 and 125 in .pdf version). Although we did calculate it, we do not show power test results for the FPG ~ PRS x BMI and FPG ~ PRS x Birth in South Asia (y/n) interactions, since these interactions are significant (or close to significance) in both studies but have opposite directions of effect, which, as mentioned on page 11 line 158 (159 in the .pdf) “precludes a power issue, and suggests differences in the effect of these environmental factors between the two studies, or possibly false positive results.”

4) There is a debate about the utility of PRS in screening which goes beyond cost (p16); as the authors will know Wald and others have argued that for most diseases PRS, even with a high aetiological fraction, would lack sensitivity and specificity if used for screening. Readers might be interested to know what sort of sensitivity and specificity for GDM might result with the OR of 1.47 found here. The authors do use the phrase "enhanced predictive value" on p16 so I believe asking for that to be quantified is fair.

Thank you to the reviewer for this comment. We now estimate the sensitivity and specificity of the PRS as suggested on page 8 line 101 (102 in .pdf file), and discuss the point by stating that “despite a strong association, the PRS has a low discriminatory value (detection rate of 10% for a 5% false positive rate) regarding GDM cases. This is in line with the observations of Wald et al. stating that most polygenic scores of complex traits derived to date would perform poorly as screening tests in a clinical setting (Wald and Old, 2019).” (Page 12, line 178 of the revised .docx manuscript, and line 179 in the .pdf file).

https://doi.org/10.7554/eLife.81498.sa2

Article and author information

Author details

  1. Amel Lamri

    1. Department of Medicine, McMaster University, Hamilton, Canada
    2. Population Health Research Institute, Hamilton, Canada
    Contribution
    Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing - original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-7182-0661
  2. Jayneel Limbachia

    1. Population Health Research Institute, Hamilton, Canada
    2. Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
    Contribution
    Formal analysis, Writing - original draft, Writing – review and editing
    Competing interests
    No competing interests declared
  3. Karleen M Schulze

    Population Health Research Institute, Hamilton, Canada
    Contribution
    Data curation, Writing – review and editing
    Competing interests
    No competing interests declared
  4. Dipika Desai

    Population Health Research Institute, Hamilton, Canada
    Contribution
    Conceptualization, Project administration, Writing – review and editing
    Competing interests
    No competing interests declared
  5. Brian Kelly

    Bradford Institute for Health Research, Bradford Royal Infirmary, Bradford, United Kingdom
    Contribution
    Data curation, Writing – review and editing
    Competing interests
    No competing interests declared
  6. Russell J de Souza

    1. Population Health Research Institute, Hamilton, Canada
    2. Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
    Contribution
    Conceptualization, Writing – review and editing
    Competing interests
    No competing interests declared
  7. Guillaume Paré

    1. Population Health Research Institute, Hamilton, Canada
    2. Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
    3. Department of Pathology and Molecular Medicine, McMaster University, Hamilton, Canada
    Contribution
    Supervision, Validation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-6795-4760
  8. Deborah A Lawlor

    1. Population Health Science, Bristol Medical School, University of Bristol, Bristol, United Kingdom
    2. MRC Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
    3. Bristol NIHR Biomedical Research Centre, Bristol, United Kingdom
    Contribution
    Resources, Funding acquisition, Writing – review and editing
    Competing interests
    Has received support from Medtronic Ltd and Roche Diagnostics for research unrelated to that presented here. No financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.
  9. John Wright

    Bradford Institute for Health Research, Bradford Royal Infirmary, Bradford, United Kingdom
    Contribution
    Resources, Funding acquisition, Writing – review and editing
    Competing interests
    No competing interests declared
  10. Sonia S Anand

    1. Department of Medicine, McMaster University, Hamilton, Canada
    2. Population Health Research Institute, Hamilton, Canada
    3. Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    anands@mcmaster.ca
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-3692-7441

Funding

Canadian Institutes of Health Research (298104)

  • Sonia S Anand

Canadian Institutes of Health Research (FDN-143255)

  • Sonia S Anand

Bristol NIHR Biomedical Research Centre

  • Deborah A Lawlor

UK Medical Research Council (MC_UU_00011/6)

  • Deborah A Lawlor

British Heart Foundation (CH/F/20/90003)

  • Deborah A Lawlor

Canada Research Chairs

  • Sonia S Anand

Heart and Stroke Foundation (Michael G. DeGroote Chair)

  • Sonia S Anand

Wellcome (WT101597MA)

  • Deborah A Lawlor
  • John Wright

Medical Research Council (MR/N024397/1)

  • Deborah A Lawlor
  • John Wright

Economic and Social Research Council (MR/N024397/1)

  • Deborah A Lawlor
  • John Wright

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.

Acknowledgements

Research projects in START and Born in Bradford are only possible because of the enthusiasm and commitment of the parents and children involved in these two studies. The authors are grateful to all the participants, teachers, school staff, health professionals, researchers, and other contributors who have made these studies happen. Studies: The South Asian Birth Cohort (START) study data were collected as part of a program funded by the Indian Council of Medical Research in Canada and by the Canadian Institutes of Health Research (Grant INC-109205), and the Heart and Stroke Foundation (Grant NA7283) with founding principal investigators: Sonia S Anand, Anil Vasudevan, Milan Gupta, Katherine Morrison, Anura Kurpad, Koon K Teo, and Krishnamachari Srinivasan. The Born in Bradford (BiB) cohort is funded by the National Institute for Health Research Collaboration for Applied Health Research and Care (NIHR CLAHRC) and the Programme Grants for Applied Research funding scheme (RP-PG-0407-10044). The study also receives funding from the Wellcome Trust (WT101597MA), a joint grant from the UK Medical Research Council (MRC) and Economic and Social Research Council (ESRC) (MR/N024397/1) and the British Heart Foundation (CS/16/4/32482). DNA extraction was funded by the UK Medical Research Council via the Integrative Epidemiology Unit (MRC IEU; MC_UU_12013/5) and genotyping via the MRC IEU and a National Institute for Health Research Senior Investigator Award to DAL (NF-0616-10102). Research associate (AL) and graduate student (JL) costs were covered by two Canadian Institutes of Health Research Grants [Project grant number: 298104, Foundation Scheme grant number: FDN-143255, Study grant numbers: INC 109205, NA 7283] awarded to SSA; DAL’s contribution to this study is supported by the Bristol NIHR Biomedical Research Centre, the UK Medical Research Council (MC_UU_00011/6) and the British Heart Foundation (CH/F/20/90003).
SSA is supported by a Tier 1 Canada Research Chair in Ethnic Diversity and Cardiovascular Disease, and a Heart and Stroke Foundation/Michael G DeGroote Chair in Population Health Research at McMaster University.

Ethics

Human subjects: All START and BiB participants provided informed consent. The START study was approved by local ethics committees (Hamilton Integrated Research Ethics Board [ID:10-640], William Osler Health System [ID:11-0001], and Trillium Health Partners [RCC:11-018, ID:492]). Ethical approval for all aspects of the research was granted by Bradford Research Ethics Committee [Ref 07/H1302/112].

Senior Editor

  1. Ricardo Azziz, University at Albany, SUNY, United States

Reviewing Editor

  1. Edward D Janus, University of Melbourne, Australia

Reviewer

  1. Edward D Janus, University of Melbourne, Australia

Publication history

  1. Received: June 30, 2022
  2. Preprint posted: July 16, 2022
  3. Accepted: November 2, 2022
  4. Accepted Manuscript published: November 22, 2022 (version 1)
  5. Version of Record published: November 23, 2022 (version 2)

Copyright

© 2022, Lamri et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Amel Lamri
  2. Jayneel Limbachia
  3. Karleen M Schulze
  4. Dipika Desai
  5. Brian Kelly
  6. Russell J de Souza
  7. Guillaume Paré
  8. Deborah A Lawlor
  9. John Wright
  10. Sonia S Anand
  11. On behalf of the Born in Bradford and START investigators
(2022)
The genetic risk of gestational diabetes in South Asian women
eLife 11:e81498.
https://doi.org/10.7554/eLife.81498
