Polymorphisms in intron 1 of HLA-DRA differentially associate with type 1 diabetes and celiac disease and implicate involvement of complement system genes C4A and C4B
Figures
Cox Proportional Hazard (PH) regression results for all tested outcomes.
Log hazard ratios (log(HR)) and 95% confidence intervals (CI) of the tri-single-nucleotide polymorphism (SNP) 101 haplotype, as well as of the known risk factors first-degree relative (FDR) and sex, from the Cox PH models for the outcomes type 1 diabetes (T1D) (A), islet autoimmunity (IA) (B), insulin autoantibody (IAA)-first (C), glutamic acid decarboxylase autoantibody (GADA)-first (D), celiac disease (CD) (E), and celiac disease autoimmunity (CDA) (F), using the entire cohort (left) or only the DR3-DQ2 homozygous individuals (right). The dashed vertical line at 0 marks an HR of 1 (log(HR) = 0), i.e., no effect on risk; values to its left indicate reduced risk and values to its right indicate increased risk. Whiskers indicate the 95% CI. Each model assesses the independent risk or protection afforded by each covariate. For the categorical covariates FDR and sex, the baselines are having no FDR and female sex, respectively. Tri-SNP 101 is modeled numerically, so the reported HR is per additional 101 allele.
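Because the tri-SNP 101 haplotype is modeled numerically, a per-allele log(HR) scales linearly with allele count. The following is a minimal Python sketch of how a plotted log(HR) and its interval could be converted back to the HR scale for one or two alleles. The function name `hr_from_log` and the input values are hypothetical and are not taken from the study's results.

```python
import math

def hr_from_log(log_hr, ci_low, ci_high, n_alleles=1):
    """Convert a per-allele log(HR) and its 95% CI bounds to the HR scale.

    Under additive (numeric) coding, carrying n_alleles copies multiplies
    log(HR) by n_alleles, so the HR becomes exp(n_alleles * log_hr).
    """
    def to_hr(x):
        return math.exp(n_alleles * x)
    return to_hr(log_hr), to_hr(ci_low), to_hr(ci_high)

# Hypothetical per-allele estimate (illustration only, not a study result).
log_hr, ci_low, ci_high = -0.5, -0.8, -0.2

hr_one = hr_from_log(log_hr, ci_low, ci_high)               # one 101 allele
hr_two = hr_from_log(log_hr, ci_low, ci_high, n_alleles=2)  # two 101 alleles

print("HR per allele:", hr_one)
print("HR for two alleles:", hr_two)
```

An interval whose upper bound stays below 1 on the HR scale (below 0 on the log scale) lies entirely to the left of the dashed line in the plot.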
Complete model output for type 1 diabetes (T1D) outcome.
HR values and 95% CI are plotted for each covariate included in the model. The vertical line marks HR = 1 (no change in risk). Number of observations = 7638; number of events = 395.
Complete model output for type 1 diabetes (T1D) outcome using only DR3-DQ2 homozygotes.
HR values and 95% CI are plotted for each covariate included in the model. The vertical line marks HR = 1 (no change in risk). Number of observations = 1589; number of events = 36.
Complete model output for islet autoimmunity (IA) outcome using all samples.
HR values and 95% CI are plotted for each covariate included in the model. The vertical line marks HR = 1 (no change in risk). Number of observations = 7614; number of events = 855.
Complete model output for islet autoimmunity (IA) outcome using only DR3-DQ2 homozygotes.
HR values and 95% CI are plotted for each covariate included in the model. The vertical line marks HR = 1 (no change in risk). Number of observations = 1585; number of events = 121.
Complete model output for glutamic acid decarboxylase autoantibody (GADA)-first outcome.
HR values and 95% CI are plotted for each covariate included in the model. The vertical line marks HR = 1 (no change in risk). Number of observations = 7614; number of events = 382.
Complete model output for glutamic acid decarboxylase autoantibody (GADA)-first outcome using only DR3-DQ2 homozygotes.
HR values and 95% CI are plotted for each covariate included in the model. The vertical line marks HR = 1 (no change in risk). Number of observations = 1585; number of events = 85.
Complete model output for insulin autoantibody (IAA)-first outcome.
HR values and 95% CI are plotted for each covariate included in the model. The vertical line marks HR = 1 (no change in risk). Number of observations = 7614; number of events = 313.
Complete model output for insulin autoantibody (IAA)-first outcome using only DR3-DQ2 homozygotes.
HR values and 95% CI are plotted for each covariate included in the model. The vertical line marks HR = 1 (no change in risk). Number of observations = 1570; number of events = 29.
Complete model output for celiac disease (CD) outcome.
log(HR) values and 95% CI are plotted for each covariate included in the model. The vertical line marks log(HR) = 0 (no change in risk). Number of observations = 6530; number of events = 608.
Complete model output for celiac disease (CD) outcome using only DR3-DQ2 homozygotes.
log(HR) values and 95% CI are plotted for each covariate included in the model. The vertical line marks log(HR) = 0 (no change in risk). Number of observations = 1364; number of events = 298.
Complete model output for celiac disease autoimmunity (CDA) outcome.
log(HR) values and 95% CI are plotted for each covariate included in the model. The vertical line marks log(HR) = 0 (no change in risk). Number of observations = 6557; number of events = 1282.
Complete model output for celiac disease autoimmunity (CDA) outcome using only DR3-DQ2 homozygotes.
log(HR) values and 95% CI are plotted for each covariate included in the model. The vertical line marks log(HR) = 0 (no change in risk). Number of observations = 1370; number of events = 526.
C4 gene expression values with respect to tri-single-nucleotide polymorphism (SNP).
Counts per million (CPM) values in 129 DR3 homozygous individuals, showing decreasing C4A and increasing C4B gene expression as the tri-SNP 101 allele count increases. Each point represents the median CPM value of multiple samples from one individual. Boxes represent the interquartile range (IQR) and midlines mark the median value.
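To make the plotted quantity concrete, here is a minimal Python sketch of how a per-individual median CPM could be computed from raw counts. It assumes the standard CPM definition (gene count divided by library size, times one million). The function names and the counts are hypothetical and do not come from the study's pipeline or data.

```python
from statistics import median

def cpm(gene_count, library_size):
    """Counts per million: reads for one gene scaled to a library of 1e6 reads."""
    return gene_count * 1_000_000 / library_size

def per_individual_median_cpm(samples):
    """Median CPM across several samples from the same individual.

    samples: iterable of (gene_count, library_size) pairs.
    """
    return median(cpm(count, size) for count, size in samples)

# Hypothetical counts for one gene in three samples from one individual.
samples = [(50, 1_000_000), (120, 2_000_000), (90, 1_000_000)]
point = per_individual_median_cpm(samples)
print(point)
```

Each value produced this way would correspond to one point in the figure; grouping the points by tri-SNP 101 allele count gives the boxes.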
Unique sequence read coverage in C4 region and copy number calls.
Uniquely mapping read coverage from whole genome sequencing (WGS) data of 188 homozygous DR3-DQ2 individuals. C4A and C4B genes share extensive sequence identity along the genes except a ~3 kilobase region indicated with boxes. Reads mapping to these regions were used to estimate C4A (left column) and C4B (right column) copy numbers per sample. Samples were sorted based on C4A copy numbers. A maximum value of 4 was used for the heatmap to moderate high outlier values.
Sequence read coverage in C4 region.
(A) Normalized read coverage from whole genome sequencing (WGS) data of 188 homozygous DR3-DQ2 individuals. Reduced coverage of C4 genes relative to the flanking regions indicates the presence of frequent gene deletions. Total C4 copy numbers per sample, indicated on the left side of the heatmap, were estimated from the average coverage per sample in the C4 region demarcated with blue boxes below the heatmap, including both C4A and C4B but excluding reads mapping to the intronic HERV insertion. A maximum value of 4 was used for the heatmap to moderate high outlier values. (B) Histograms and kernel density estimates (KDE) of the sample distribution based on average read coverage for C4 (blue boxes below heatmap) and copy-number-invariant flanking regions (gray boxes below heatmap). Flanking regions show a tight distribution with a peak around 2, indicating normal/diploid status. The C4 coverage distribution shows three peaks around 1, 1.5, and 2, most likely corresponding to 2, 3, and 4 total copies. (C) Distribution of unique coverage corresponding to C4A and C4B genes (see Figure 3). The major C4A peak at 0 coverage indicates C4A null samples, and the peaks around 0.3 and 0.6 likely correspond to 1 and 2 copies of C4A, respectively. The C4B distribution shows a single null sample as well as two peaks around 0.3 and 0.6.
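The peak positions described in this legend suggest a simple mapping from coverage to copy number. The Python sketch below illustrates that mapping under two assumptions read off the legend: total C4 copies are about twice the normalized coverage (peaks at 1, 1.5, and 2 for 2, 3, and 4 copies), and each C4A or C4B copy contributes about 0.3 units of unique coverage. The function names and the `per_copy` parameter are hypothetical; this is not the authors' calling procedure.

```python
import math

def nearest_int(x):
    """Round half up (avoids Python's round-half-to-even behaviour)."""
    return math.floor(x + 0.5)

def total_c4_copies(avg_coverage):
    """Assumed mapping: normalized coverage of 1, 1.5, 2 -> 2, 3, 4 copies."""
    return nearest_int(2 * avg_coverage)

def unique_gene_copies(unique_coverage, per_copy=0.3):
    """Assumed mapping: unique coverage of 0, 0.3, 0.6 -> 0, 1, 2 copies."""
    return nearest_int(unique_coverage / per_copy)

# Hypothetical coverage values for three samples (illustration only).
for total_cov, c4a_cov in [(1.02, 0.0), (1.47, 0.31), (2.05, 0.58)]:
    print(total_c4_copies(total_cov), unique_gene_copies(c4a_cov))
```

A fixed threshold like this is only a rough stand-in; samples falling between peaks would need the distribution-based judgment the legend describes.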
Tables
Distribution of human leukocyte antigen (HLA) and Tri-single-nucleotide polymorphism (SNP) haplotypes among The Environmental Determinants of Diabetes in the Young (TEDDY) samples.
| HLA (rows) / Tri-SNP genotype (columns) | 010/010 | 010/101 | 101/101 | 000/010 | 001/010 | 000/101 | 000/000 | 000/001 | 001/101 | 011/101 | 010/011 | All |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DR3/DR3 | 62 | 348 | 1177 | 13 | 2 | 11 | 0 | 0 | 2 | 1 | 0 | 1616 (20.8%) |
| DR3/DR4 | 408 | 2550 | 3 | 19 | 8 | 30 | 0 | 1 | 1 | 1 | 1 | 3022 (38.9%) |
| DR4/DR4 | 1480 | 5 | 3 | 39 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 1531 (19.7%) |
| DR4/DR8 | 14 | 0 | 0 | 1297 | 5 | 0 | 17 | 1 | 0 | 0 | 0 | 1334 (17.2%) |
| DR1/DR4 | 0 | 0 | 0 | 15 | 143 | 0 | 0 | 2 | 0 | 0 | 0 | 160 (2.1%) |
| DR4/DR13 | 0 | 0 | 0 | 0 | 56 | 0 | 0 | 1 | 0 | 0 | 0 | 57 (0.7%) |
| DR3/DR9 | 2 | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 18 (0.2%) |
| DR4/DR9 | 13 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 (0.2%) |
| DR4/DR4*030 X/020 X | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 (0.1%) |
| DR4/DR4*030 X/0304 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 (0.0%) |
| All | 1986 (25.6%) | 2920 (37.6%) | 1183 (15.2%) | 1383 (17.8%) | 217 (2.8%) | 42 (0.5%) | 17 (0.2%) | 5 (0.1%) | 3 (0.0%) | 2 (0.0%) | 1 (0.0%) | 7759 (100%) |
Descriptive characteristics of children with respect to the islet autoimmunity (IA) outcome in number (percentage).
| Characteristic | IA: No | IA: Yes | All |
|---|---|---|---|
| Sex | |||
| Female | 3388 (49.5) | 390 (45.5) | 3778 (49.0) |
| Male | 3457 (50.5) | 468 (54.5) | 3925 (51.0) |
| POP | |||
| EUR | 6122 (89.4) | 796 (92.8) | 6918 (89.8) |
| AMR | 647 (9.5) | 56 (6.5) | 703 (9.1) |
| AFR | 76 (1.1) | 6 (0.7) | 82 (1.1) |
| HLA Type | |||
| DR1/DR4 | 142 (2.1) | 18 (2.1) | 160 (2.1) |
| DR3/DR3 | 1486 (21.7) | 121 (14.1) | 1607 (20.9) |
| DR3/DR4 | 2599 (38.0) | 418 (48.7) | 3017 (39.2) |
| DR4/DR13 | 47 (0.7) | 10 (1.2) | 57 (0.7) |
| DR4/DR4 | 1371 (20.0) | 158 (18.4) | 1529 (19.8) |
| DR4/DR8 | 1200 (17.5) | 133 (15.5) | 1333 (17.3) |
| Country | |||
| US | 2885 (42.1) | 297 (34.6) | 3182 (41.3) |
| SWE | 2019 (29.5) | 290 (33.8) | 2309 (30.0) |
| FIN | 1482 (21.7) | 214 (24.9) | 1696 (22.0) |
| GER | 459 (6.7) | 57 (6.6) | 516 (6.7) |
| FDR | |||
| 0 | 6155 (89.9) | 691 (80.5) | 6846 (88.9) |
| 1 | 690 (10.1) | 167 (19.5) | 857 (11.1) |
| All | 6845 (88.9) | 858 (11.1) | 7703 (100) |
Descriptive characteristics of children with respect to type 1 diabetes diagnosis (T1D) outcome in number (percentage).
| Characteristic | T1D: No | T1D: Yes | All |
|---|---|---|---|
| Sex | |||
| Female | 3597 (49.2) | 181 (45.6) | 3778 (49.0) |
| Male | 3709 (50.8) | 216 (54.4) | 3925 (51.0) |
| POP | |||
| EUR | 6552 (89.7) | 366 (92.2) | 6918 (89.8) |
| AMR | 674 (9.2) | 29 (7.3) | 703 (9.1) |
| AFR | 80 (1.1) | 2 (0.5) | 82 (1.1) |
| HLA Type | |||
| DR1/DR4 | 146 (2.0) | 14 (3.5) | 160 (2.1) |
| DR3/DR3 | 1571 (21.5) | 36 (9.1) | 1607 (20.9) |
| DR3/DR4 | 2797 (38.3) | 220 (55.4) | 3017 (39.2) |
| DR4/DR13 | 51 (0.7) | 6 (1.5) | 57 (0.7) |
| DR4/DR4 | 1457 (19.9) | 72 (18.1) | 1529 (19.8) |
| DR4/DR8 | 1284 (17.6) | 49 (12.3) | 1333 (17.3) |
| Country | |||
| US | 3034 (41.5) | 148 (37.3) | 3182 (41.3) |
| SWE | 2204 (30.2) | 105 (26.4) | 2309 (30.0) |
| FIN | 1591 (21.8) | 105 (26.4) | 1696 (22.0) |
| GER | 477 (6.5) | 39 (9.8) | 516 (6.7) |
| FDR | |||
| 0 | 6560 (89.8) | 286 (72.0) | 6846 (88.9) |
| 1 | 746 (10.2) | 111 (28.0) | 857 (11.1) |
| All | 7306 (94.8) | 397 (5.2) | 7703 (100) |
Descriptive characteristics of children with respect to glutamic acid decarboxylase autoantibody (GADA)-first appearing antibody outcome in number (percentage).
| Characteristic | GADA first: No | GADA first: Yes | All |
|---|---|---|---|
| Sex | |||
| Female | 3601 (49.2) | 177 (46.3) | 3778 (49.0) |
| Male | 3720 (50.8) | 205 (53.7) | 3925 (51.0) |
| POP | |||
| EUR | 6565 (89.7) | 353 (92.4) | 6918 (89.8) |
| AMR | 678 (9.3) | 25 (6.5) | 703 (9.1) |
| AFR | 78 (1.1) | 4 (1.0) | 82 (1.1) |
| HLA Type | |||
| DR1/DR4 | 157 (2.1) | 3 (0.8) | 160 (2.1) |
| DR3/DR3 | 1522 (20.8) | 85 (22.3) | 1607 (20.9) |
| DR3/DR4 | 2828 (38.6) | 189 (49.5) | 3017 (39.2) |
| DR4/DR13 | 55 (0.8) | 2 (0.5) | 57 (0.7) |
| DR4/DR4 | 1472 (20.1) | 57 (14.9) | 1529 (19.8) |
| DR4/DR8 | 1287 (17.6) | 46 (12.0) | 1333 (17.3) |
| Country | |||
| US | 3036 (41.5) | 146 (38.2) | 3182 (41.3) |
| SWE | 2164 (29.6) | 145 (38.0) | 2309 (30.0) |
| FIN | 1622 (22.2) | 74 (19.4) | 1696 (22.0) |
| GER | 499 (6.8) | 17 (4.5) | 516 (6.7) |
| FDR | |||
| No | 6531 (89.2) | 315 (82.5) | 6846 (88.9) |
| Yes | 790 (10.8) | 67 (17.5) | 857 (11.1) |
| All | 7321 (95.0) | 382 (5.0) | 7703 (100) |
Descriptive characteristics of children with respect to insulin autoantibody (IAA)-first appearing antibody outcome in number (percentage).
| Characteristic | IAA first: No | IAA first: Yes | All |
|---|---|---|---|
| Sex | |||
| Female | 3636 (49.2) | 142 (45.4) | 3778 (49.0) |
| Male | 3754 (50.8) | 171 (54.6) | 3925 (51.0) |
| POP | |||
| EUR | 6624 (89.6) | 294 (93.9) | 6918 (89.8) |
| AMR | 685 (9.3) | 18 (5.8) | 703 (9.1) |
| AFR | 81 (1.1) | 1 (0.3) | 82 (1.1) |
| HLA Type | |||
| DR1/DR4 | 151 (2.0) | 9 (2.9) | 160 (2.1) |
| DR3/DR3 | 1578 (21.4) | 29 (9.3) | 1607 (20.9) |
| DR3/DR4 | 2870 (38.8) | 147 (47.0) | 3017 (39.2) |
| DR4/DR13 | 52 (0.7) | 5 (1.6) | 57 (0.7) |
| DR4/DR4 | 1471 (19.9) | 58 (18.5) | 1529 (19.8) |
| DR4/DR8 | 1268 (17.2) | 65 (20.8) | 1333 (17.3) |
| Country | |||
| US | 3082 (41.7) | 100 (31.9) | 3182 (41.3) |
| SWE | 2215 (30.0) | 94 (30.0) | 2309 (30.0) |
| FIN | 1597 (21.6) | 99 (31.6) | 1696 (22.0) |
| GER | 496 (6.7) | 20 (6.4) | 516 (6.7) |
| FDR | |||
| No | 6600 (89.3) | 246 (78.6) | 6846 (88.9) |
| Yes | 790 (10.7) | 67 (21.4) | 857 (11.1) |
| All | 7390 (95.9) | 313 (4.1) | 7703 (100) |
Descriptive characteristics of children with respect to celiac disease diagnosis (CD) outcome in number (percentage).
| Characteristic | CD: No | CD: Yes | All |
|---|---|---|---|
| Sex | |||
| Female | 3413 (48.2) | 365 (59.2) | 3778 (49.0) |
| Male | 3673 (51.8) | 252 (40.8) | 3925 (51.0) |
| POP | |||
| EUR | 6325 (89.3) | 593 (96.1) | 6918 (89.8) |
| AMR | 681 (9.6) | 22 (3.6) | 703 (9.1) |
| AFR | 80 (1.1) | 2 (0.3) | 82 (1.1) |
| HLA Type | |||
| DR1/DR4 | 155 (2.2) | 5 (0.8) | 160 (2.1) |
| DR3/DR3 | 1308 (18.5) | 299 (48.5) | 1607 (20.9) |
| DR3/DR4 | 2805 (39.6) | 212 (34.4) | 3017 (39.2) |
| DR4/DR13 | 54 (0.8) | 3 (0.5) | 57 (0.7) |
| DR4/DR4 | 1444 (20.4) | 85 (13.8) | 1529 (19.8) |
| DR4/DR8 | 1320 (18.6) | 13 (2.1) | 1333 (17.3) |
| Country | |||
| US | 2955 (41.7) | 227 (36.8) | 3182 (41.3) |
| SWE | 2049 (28.9) | 260 (42.1) | 2309 (30.0) |
| FIN | 1594 (22.5) | 102 (16.5) | 1696 (22.0) |
| GER | 488 (6.9) | 28 (4.5) | 516 (6.7) |
| FDR | |||
| No | 6447 (96.4) | 496 (80.7) | 6943 (95.1) |
| Yes | 242 (3.6) | 119 (19.3) | 361 (4.9) |
| All | 7086 (92.0) | 617 (8.0) | 7703 (100) |
Descriptive characteristics of children with respect to celiac disease autoimmunity (CDA) outcome in number (percentage).
| Characteristic | CDA: No | CDA: Yes | All |
|---|---|---|---|
| Sex | |||
| Female | 2555 (47.2) | 735 (56.8) | 3290 (49.0) |
| Male | 2860 (52.8) | 559 (43.2) | 3419 (51.0) |
| POP | |||
| EUR | 4855 (89.7) | 1237 (95.6) | 6092 (90.8) |
| AMR | 515 (9.5) | 55 (4.3) | 570 (8.5) |
| AFR | 45 (0.8) | 2 (0.2) | 47 (0.7) |
| HLA Type | |||
| DR1/DR4 | 134 (2.5) | 8 (0.6) | 142 (2.1) |
| DR3/DR3 | 870 (16.1) | 530 (41.0) | 1400 (20.9) |
| DR3/DR4 | 2146 (39.6) | 503 (38.9) | 2649 (39.5) |
| DR4/DR13 | 46 (0.8) | 4 (0.3) | 50 (0.7) |
| DR4/DR4 | 1130 (20.9) | 193 (14.9) | 1323 (19.7) |
| DR4/DR8 | 1089 (20.1) | 56 (4.3) | 1145 (17.1) |
| Country | |||
| US | 2232 (41.2) | 471 (36.4) | 2703 (40.3) |
| SWE | 1588 (29.3) | 479 (37.0) | 2067 (30.8) |
| FIN | 1262 (23.3) | 272 (21.0) | 1534 (22.9) |
| GER | 333 (6.1) | 72 (5.6) | 405 (6.0) |
| FDR | |||
| No | 5118 (96.5) | 1119 (86.9) | 6237 (94.6) |
| Yes | 186 (3.5) | 169 (13.1) | 355 (5.4) |
| All | 5415 (80.7) | 1294 (19.3) | 6709 (100) |
Previously published genome-wide association study (GWAS) associations.
Single-nucleotide polymorphisms (SNPs) used as covariates in our CoxPH analysis. The celiac disease (CD), celiac disease autoimmunity (CDA), type 1 diabetes (T1D), and islet autoimmunity (IA) columns indicate whether the SNP has been shown to be associated with that outcome and was hence used in the model (yes) or not (no). Statistically significant hazard ratios for the associated outcome are also provided under the HR columns, with protective associations in bold.
| SNP | Locus | CD | CDA | T1D | IA | Publication | HR CD | HR CDA | HR T1D | HR IA |
|---|---|---|---|---|---|---|---|---|---|---|
| rs4851575 | IL18R1, IL18RAP | yes | no | no | no | Sharma et al., 2016 | 1.45 | | | |
| rs114569351 | PLEK, FBXO48 | yes | no | no | no | Sharma et al., 2016 | 2.64 | | | |
| rs12493471 | CCR9, LZTFL1, CXCR6 | yes | no | no | no | Sharma et al., 2016 | 1.40 | | | |
| rs1054091 | RSPH3, TAGAP | yes | no | no | no | Sharma et al., 2016 | 1.59 | | | |
| rs72704176 | ASH1L | yes | no | no | no | Sharma et al., 2016 | 2.26 | | | |
| rs3771689 | BAZ2B | yes | no | no | no | Sharma et al., 2016 | **0.56** | | | |
| rs13014907 | ZNF804A | yes | no | no | no | Sharma et al., 2016 | 2.46 | | | |
| rs11739460 | TCOF1 | yes | no | no | no | Sharma et al., 2016 | 1.41 | | | |
| rs77532435 | GRB10 | yes | no | no | no | Sharma et al., 2016 | 2.05 | | | |
| rs6967298 | AUTS2 | yes | no | no | no | Sharma et al., 2016 | **0.61** | | | |
| rs61751041 | LAMB1 | yes | no | no | no | Sharma et al., 2016 | 2.23 | | | |
| rs2409747 | XKR6 | yes | yes | no | no | Sharma et al., 2016 | 1.58 | 1.37 | | |
| rs12990970 | NPM1P33, CTLA4 | no | yes | no | no | Sharma et al., 2016 | | **0.76** | | |
| rs11709472 | LPP | no | yes | no | no | Sharma et al., 2016 | | **0.80** | | |
| rs72717025 | FCGR2A | no | yes | no | no | Sharma et al., 2016 | | 1.84 | | |
| rs114157400 | BANK1 | no | yes | no | no | Sharma et al., 2016 | | 1.62 | | |
| rs117561283 | IFNG | no | yes | no | no | Sharma et al., 2016 | | 1.81 | | |
| rs8013918 | FOS | no | yes | no | no | Sharma et al., 2016 | | **0.80** | | |
| rs73043122 | RNASET2, MIR3939 | no | no | yes | no | Sharma et al., 2018 | | | 3.35 | |
| rs113306148 | PLEKHA1, MIR3941 | no | no | yes | no | Sharma et al., 2018 | | | 3.06 | |
| rs428595 | PPIL2 | no | no | yes | yes | Sharma et al., 2018 | | | 3.42 | 2.46 |
| rs1004446 | INS | no | no | yes | yes | Krischer et al., 2017 | | | **0.55** | **0.67** |
| rs2476601 | PTPN22 | no | no | yes | yes | Krischer et al., 2017 | | | 1.91 | 1.73 |
| rs2292239 | ERBB3 | no | no | yes | yes | Krischer et al., 2017 | | | 1.68 | 1.45 |
| rs3184504 | SH2B3 | no | no | no | yes | Krischer et al., 2017 | | | | 1.40 |
| rs9934817 | RBFOX1 | no | no | no | yes | Sharma et al., 2018 | | | | 2.66 |
| rs11705721 | PXK, PDHB | no | no | no | yes | Sharma et al., 2018 | | | | 1.41 |
Estimated C4A gene copy number with respect to tri-single-nucleotide polymorphism (SNP) haplotype.
| C4A copy number | tri-SNP 101: 0 | tri-SNP 101: 1 | tri-SNP 101: 2 |
|---|---|---|---|
| 0 | 0 | 0 | 107 |
| 1 | 0 | 44 | 10 |
| 2 | 9 | 9 | 7 |
| 3 | 1 | 0 | 1 |
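
As a quick check on how the C4A table can be read, the Python sketch below tallies the counts per tri-SNP 101 allele count and computes the share of individuals with two 101 alleles who have an estimated C4A copy number of 0. The counts are copied from the table above; the variable names are hypothetical.

```python
# Rows: estimated C4A copy number (0 to 3).
# Columns: number of tri-SNP 101 alleles (0, 1, 2), as in the table above.
c4a_by_trisnp = {
    0: (0, 0, 107),
    1: (0, 44, 10),
    2: (9, 9, 7),
    3: (1, 0, 1),
}

# Number of individuals carrying 0, 1, or 2 copies of the 101 haplotype.
col_totals = [sum(row[j] for row in c4a_by_trisnp.values()) for j in range(3)]
print(col_totals)  # [10, 53, 125]

# Share of individuals with two 101 alleles whose estimated C4A copy number is 0.
null_share = c4a_by_trisnp[0][2] / col_totals[2]
print(round(null_share, 3))  # 0.856
```

The column totals sum to 188, matching the number of sequenced DR3-DQ2 homozygous individuals reported in the figure legends.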
Estimated C4B gene copy number with respect to tri-single-nucleotide polymorphism (SNP) haplotype.
| C4B copy number | tri-SNP 101: 0 | tri-SNP 101: 1 | tri-SNP 101: 2 |
|---|---|---|---|
| 0 | 1 | 0 | 0 |
| 1 | 5 | 34 | 4 |
| 2 | 4 | 17 | 119 |
| 3 | 0 | 2 | 2 |
Additional files
- MDAR checklist: https://cdn.elifesciences.org/articles/89068/elife-89068-mdarchecklist1-v1.docx
- Supplementary file 1: Numerical results from all CoxPH analyses and the differential gene expression analysis. https://cdn.elifesciences.org/articles/89068/elife-89068-supp1-v1.xlsx