Chromatin accessibility variation provides insights into missing regulation underlying immune-mediated diseases

  1. Raehoon Jeong
  2. Martha L Bulyk  Is a corresponding author
  1. Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, United States
  2. Bioinformatics and Integrative Genomics Graduate Program, Harvard University, United States
  3. Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, United States

eLife Assessment

This paper addresses a significant question regarding the low overlap between genetic discoveries for human complex diseases and those for gene expression by emphasizing the contribution of cell-type-specific chromatin accessibility QTLs. The analyses supporting the main claims are convincing, and the key conclusions are valuable and of interest to readers in the fields of human genetics and functional genomics.

https://doi.org/10.7554/eLife.98289.3.sa0

Abstract

Most genetic loci associated with complex traits and diseases through genome-wide association studies (GWAS) are noncoding, suggesting that the causal variants likely have gene regulatory effects. However, only a small number of loci have been linked to currently detected expression quantitative trait loci (eQTLs). To better understand the potential reasons for many trait-associated loci lacking eQTL colocalization, we investigated whether chromatin accessibility QTLs (caQTLs) in lymphoblastoid cell lines (LCLs) explain immune-mediated disease associations that eQTLs in LCLs did not. The power to detect caQTLs was greater than that for eQTLs and was less affected by the distance from the transcription start site of the associated gene. Meta-analyzing LCL eQTL data to increase the sample size to over a thousand led to additional loci with eQTL colocalization, demonstrating that insufficient statistical power is still likely to be a factor. Moreover, further eQTL colocalization loci were uncovered by surveying eQTLs of other immune cell types. Altogether, insufficient power and context specificity of eQTLs both contribute to the ‘missing regulation’.

Introduction

More than a decade of genome-wide association studies (GWAS) has revealed several properties of the genetic architecture of complex traits and diseases (Visscher et al., 2017; Claussnitzer et al., 2020). Most (~93%) of the genetic associations are detected in the noncoding portion of the genome (Maurano et al., 2012), and disease heritability is concentrated in putative regulatory regions (Gusev et al., 2014). Expression quantitative trait loci (eQTLs), which are loci associated with gene expression levels, are enriched for trait associations (Nicolae et al., 2010; Hormozdiari et al., 2018). Complex traits are also characterized by their extreme polygenicity, where individual genetic association has only a small effect on the trait (O’Connor et al., 2019). Altogether, these observations have led to a prevalent theory that causal genetic variants affect regulation of key genes across the genome, where each gene explains a modest proportion of trait variation (Boyle et al., 2017). There are experimental strategies aimed at nominating putative causal genes at noncoding GWAS loci (Nasser et al., 2021; Weeks et al., 2023; Morris et al., 2023). As an alternate approach, an eQTL signal colocalizing with the GWAS signal illustrates the effect of the causal variant on gene expression and suggests that the affected gene contributes to the trait. Detection of disease-associated eQTLs thus can identify putative disease genes, helping to elucidate disease mechanisms and develop therapeutics targeting them (Plenge et al., 2013).

Although it has become expected that eQTLs will be discovered in most noncoding GWAS loci, only a minority of trait-associated loci have been explained by eQTLs (Chun et al., 2017; Barbeira et al., 2021; Connally et al., 2022; Yao et al., 2020). The Genotype-Tissue Expression (GTEx) eQTL study across 49 human tissues found that, for a typical complex trait, about 20% of GWAS loci contained a colocalized eQTL in the cis region (i.e. 1 Mb) around the gene (i.e. cis-eQTL) (Barbeira et al., 2021). Even when focusing just on putatively causal genes, the rate of colocalization was very low (8%) (Connally et al., 2022). Furthermore, the proportion of trait heritability mediated by cis-eQTLs (h2med/h2SNP) of assayed gene expression was estimated to be only about 11% on average (Yao et al., 2020). Following Connally et al., 2022, we refer to this missing link between genetic associations with traits and the regulatory function of the associated noncoding variants as ‘missing regulation’. To detect eQTLs in the unexplained disease-associated loci, a better understanding of why they have been missed is essential.

There are many possible explanations for why disease-associated loci are missing colocalized eQTLs (Connally et al., 2022; Umans et al., 2021; Mostafavi et al., 2023; Hukku et al., 2021) and for why h2med/h2SNP estimates for eQTLs are relatively low (Yao et al., 2020). First, statistical power to detect disease-associated eQTLs may be insufficient (Hukku et al., 2021). For example, negative selection against gene expression variation may lead to challenges in detecting eQTLs for trait-relevant genes (Mostafavi et al., 2023; Glassberg et al., 2019). Since disease genes are likely to be dosage sensitive, there would be selection against their having large eQTL effects (Glassberg et al., 2019). Consequently, the negative selection induces a ‘flattening’ effect, in which weak eQTL variants, often in regions distal to the gene’s promoter (Dimas et al., 2009), may reach high enough frequency, whereas strong eQTL variants remain at low frequency (O’Connor et al., 2019). In fact, eQTL-mediated heritability was enriched in genes showing mutational constraint and those with lower cis-heritability (Yao et al., 2020). These weaker or low-allele-frequency eQTL effects would require larger sample sizes to be detected with statistical significance and to show colocalization (Hukku et al., 2021). Second, causal eQTL effects may be specific to cell types that have not been assayed (Umans et al., 2021). For example, immune cell eQTLs are highly cell-type specific, and eQTL effects specific to some immune cell types may mediate immune disease risk (Schmiedel et al., 2018). Specificity of eQTL effects can also be limited to specific cell states (Alasoo et al., 2018; Nathan et al., 2022). Detecting cell-type or cell-state-specific eQTL effects requires the necessary gene expression datasets from the relevant cell types and states (Umans et al., 2021), which has been a limiting resource for such analyses.

To investigate why disease-associated eQTL signals have been missing, we focused on immune-mediated diseases (IMDs) as a model set of complex traits. We aimed to collect IMD-associated loci that are expected to show eQTL signals in some cell type. Since active regulatory elements coordinate target gene expression (Field and Adelman, 2020), we reasoned that variants that affect chromatin phenotypes at regulatory elements, such as transcription factor (TF) binding (Kasowski et al., 2010; Kilpinen et al., 2013; Waszak et al., 2015) and chromatin accessibility (Degner et al., 2012; Kumasaka et al., 2019), have the potential to impact gene expression (Albert and Kruglyak, 2015). These chromatin phenotypes may show detectable genetic effects even when an eQTL effect in the same cell type was not identified in the locus (Wu et al., 2023). For example, only about 20% of lymphoblastoid cell lines’ (LCLs’) PU.1 binding QTLs (bQTLs) that colocalized with blood cell traits’ association showed an eQTL effect for a nearby gene in LCLs (Jeong and Bulyk, 2023).

Here, we analyzed genetic and functional genomic (i.e. ATAC-seq and RNA-seq) data in LCLs. LCLs are derived from B lymphocytes, and their cis-regulatory elements were enriched for variants associated with some IMDs (Kundaje et al., 2015; Farh et al., 2015). We evaluated whether chromatin accessibility QTLs (caQTLs) in LCLs potentially explain IMD associations using mediated heritability analysis (Yao et al., 2020) and colocalization (Giambartolomei et al., 2014; Pickrell et al., 2016). Then, we searched for disease-associated loci that were significant caQTLs, but not eQTLs.

We examined whether the various potential reasons for missing eQTLs can account for IMD-associated loci that are explained by caQTLs but not eQTLs. First, we explored the extent to which eQTLs may have been missed because of limited statistical power. We compared cis-heritability of colocalized caQTLs and eQTLs stratified by distance between the accessible region and the transcription start site (TSS) of the associated gene. We also investigated whether meta-analysis of published LCL eQTL summary statistics, in order to effectively increase the sample size, can uncover previously missed eQTLs. Second, we surveyed whether cell-type specificity of regulatory variant effect may account for the missing regulation. We surveyed various immune cell eQTL data to identify loci with which they colocalize even if LCL eQTLs did not colocalize with those loci.

Through this study, we present how regulatory QTLs beyond eQTLs, such as caQTLs, can be effective in detecting the potential molecular consequences of disease-associated variants. Moreover, results from inspecting disease-associated loci where genetic effects are detected on chromatin accessibility but not on expression suggest reasons why the effects on gene expression may have been missed. These results provide insights on which strategies may be effective in uncovering more genes that underlie diseases.

Results

Accessible chromatin in LCLs explains a significant proportion of immune-mediated disease heritability

We aimed to evaluate whether variants that alter chromatin accessibility in LCLs may explain genetic associations to IMD. First, we verified whether accessible regions in LCLs are enriched for IMD heritability. We reanalyzed 100 LCL ATAC-seq samples (Kumasaka et al., 2019) to define accessible regions in this cell type. With stratified LD score regression (S-LDSC) (Finucane et al., 2015), we estimated their heritability enrichment across 13 IMDs, including 11 autoimmune diseases – autoimmune thyroid disease (ATD) (Cordell et al., 2021), celiac disease (CEL) (Dubois et al., 2010), Crohn’s disease (CD) (de Lange et al., 2017), inflammatory bowel disease (IBD) (de Lange et al., 2017), juvenile idiopathic arthritis (JIA) (López-Isac et al., 2021), multiple sclerosis (MS) (Patsopoulos, 2019; https://imsgc.net/), primary biliary cholangitis (PBC) (Cordell et al., 2021), rheumatoid arthritis (RA) (Ishigaki et al., 2022), systemic lupus erythematosus (SLE) (Bentham et al., 2015), ulcerative colitis (UC) (de Lange et al., 2017), and vitiligo (VIT) (Jin et al., 2016) – and 2 allergic diseases – allergy (ALL) (Loh et al., 2018) and asthma (AST) (Loh et al., 2018). We also analyzed genome-wide association study (GWAS) data for 3 non-immune diseases – type 2 diabetes (T2D) (Morris et al., 2012), coronary artery disease (CAD) (Schunkert et al., 2011), and schizophrenia (SCZ) (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014) – for comparison. Single nucleotide polymorphisms (SNPs) in accessible regions in LCLs were significantly enriched for IMD heritability (p<0.003125 [Bonferroni-corrected threshold], S-LDSC; Figure 1A) and there was no significant enrichment for nonimmune diseases (p>0.05, S-LDSC). These results indicate that accessible regions in LCLs harbor many variants specifically associated with IMDs, and therefore that LCLs share IMD-associated accessible regions with those of the causal cell type(s).
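The Bonferroni-corrected threshold quoted above is consistent with dividing a conventional significance level across all diseases tested. The sketch below reproduces that arithmetic; the family-wise level of 0.05 is an assumption (the text states only the resulting threshold), and the function name is illustrative.

```python
# Assumption: a family-wise significance level of 0.05 split evenly
# across the 13 immune-mediated and 3 nonimmune diseases analyzed.
N_IMD = 13
N_NONIMMUNE = 3
ALPHA = 0.05


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_IMD + N_NONIMMUNE)
print(f"per-disease threshold: {threshold}")
```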

Figure 1 with 3 supplements
Immune disease heritability mediated by chromatin accessibility in lymphoblastoid cell lines (LCLs).

(A–B) Heritability enrichment in accessible regions in LCLs, based on (A) stratified linkage disequilibrium (LD) score regression (S-LDSC) and (B) the proportion of heritability mediated by chromatin accessibility quantitative trait loci (caQTLs) in LCLs based on mediated expression score regression (MESC). For both, error bars represent jackknife standard errors of the mean. The color of the bars indicates disease type. *: p<0.003125 (Bonferroni-corrected). (C) Mediated heritability enrichment of accessible regions by peak strength quintile. The strongest peaks are in the 1st quintile, and the weakest peaks are in the 5th quintile. Only enrichment values with FDR < 5% (based on q-value) are shown. Stronger and more significant enrichment indicates that mediated heritability is concentrated in that subset. (D) Mediated heritability enrichment of accessible regions by histone mark annotation. The percentages in parentheses represent the proportion of accessible regions with the indicated histone mark. Only enrichment values with FDR < 5% (based on q-value) are shown. Color and size of the points are on the same scale as in (C). PBC, primary biliary cholangitis; MS, multiple sclerosis; CEL, celiac disease; RA, rheumatoid arthritis; JIA, juvenile idiopathic arthritis; SLE, systemic lupus erythematosus; CD, Crohn’s disease; UC, ulcerative colitis; IBD, inflammatory bowel disease; ATD, autoimmune thyroid disease; VIT, vitiligo; AST, asthma; ALL, allergies; SCZ, schizophrenia; T2D, type 2 diabetes; CAD, coronary artery disease.

Next, we applied mediated expression score regression (MESC) (Yao et al., 2020) to investigate the causal relationship between caQTLs in LCLs and IMD associations (Figure 1—figure supplement 1A). Whereas S-LDSC analysis tests for heritability enrichment of SNPs with some functional annotation (e.g. accessible regions), MESC analysis specifically estimates the heritability that is mediated (i.e. h2med) by the SNPs’ cis-effects on a molecular phenotype (e.g. caQTLs). We estimated that caQTLs in LCLs mediate 16.3–42.7% of autoimmune disease heritability and 8.5–9.4% of allergic disease heritability (Figure 1B). For nonimmune diseases, the estimates were lower and not significant (p>0.003125 [Bonferroni-corrected threshold], MESC). Interestingly, SCZ showed a nominally significant proportion of caQTL-mediated heritability in LCLs (p<0.05, MESC), consistent with the hypothesis that B cells may play some role in SCZ pathogenesis (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014; van Mierlo et al., 2019). Our results indicate that LCLs are a valid cell type in which to search for caQTLs that mediate genetic risk for autoimmune diseases, but not for allergic diseases. In subsequent analyses, we focused on the 7 IMDs – CD, IBD, MS, PBC, RA, SLE, and UC – that showed significant caQTL-mediated heritability (p<0.003125 [Bonferroni-corrected threshold], MESC).

Regions with higher levels of accessibility and active histone marks explain most of caQTL-mediated heritability

To understand which features characterize accessible regions that mediate IMD heritability, we estimated h2med enrichment (proportion of h2med/proportion of peaks) (Yao et al., 2020) in specific sets of accessible regions. We found that peaks with a larger number of nonredundant sequencing reads (i.e., ‘stronger’ peaks) in LCLs showed stronger h2med enrichment (Figure 1C) and thus likely affect IMD-relevant gene expression more than ‘weaker’ peaks do. This observation is consistent with the ‘Activity-by-Contact’ model (Nasser et al., 2021), in which peaks with greater chromatin accessibility and H3K27ac ChIP-seq signal are predicted to have proportional effects on target gene expression.
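The enrichment ratio defined above (proportion of h2med divided by proportion of peaks) can be written out as a short sketch. The function name and the example numbers are hypothetical and are not values from this study.

```python
def h2med_enrichment(h2med_subset: float, h2med_total: float,
                     n_peaks_subset: int, n_peaks_total: int) -> float:
    """Mediated heritability enrichment of a peak subset.

    Computed as (share of h2med explained by the subset) divided by
    (share of all peaks that fall in the subset). A value above 1
    means the subset explains more h2med than its size would suggest.
    """
    prop_h2med = h2med_subset / h2med_total
    prop_peaks = n_peaks_subset / n_peaks_total
    return prop_h2med / prop_peaks


# Hypothetical example: one quintile (20% of peaks) explaining
# half of the caQTL-mediated heritability.
example = h2med_enrichment(h2med_subset=0.15, h2med_total=0.30,
                           n_peaks_subset=20_000, n_peaks_total=100_000)
print(f"enrichment: {example:.2f}")
```

Under this definition, a peak set that explains no mediated heritability has an enrichment of 0, and a set that explains heritability in proportion to its size has an enrichment of 1.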

Next, we considered peaks with the active histone marks H3K27ac, H3K4me1, or H3K4me3 (Waszak et al., 2015; Delaneau et al., 2019). Consistent with prior observations that putative cell-type-specific regulatory elements marked with H3K27ac and H3K4me1 are enriched for relevant disease associations (Kundaje et al., 2015; Farh et al., 2015), we found that caQTLs with H3K27ac and H3K4me1 marks in LCLs were significantly enriched for mediated IMD heritability (q-value <0.05, MESC; Figure 1D). Strikingly, both peak sets explained almost all of caQTL-mediated IMD heritability (Figure 1—figure supplement 2). Peaks with H3K4me3 marks, representative of promoters (Heintzman et al., 2007), also showed significant h2med enrichment for most IMDs (q-value <0.05, MESC). Peaks with promoter-like signatures (i.e. H3K27ac and H3K4me3) (Abascal et al., 2020) and those with enhancer-like signatures (i.e. H3K27ac, but no H3K4me3) (Abascal et al., 2020) were also enriched for all IMD heritability (q-value <0.05, MESC). Conversely, peaks without any of the three active histone marks were completely depleted of caQTL-mediated IMD heritability (Figure 1—figure supplement 2). These ‘ATAC-only’ peaks were shorter, weaker, and further away from the TSS compared to peaks with active histone marks (Figure 1—figure supplement 3). Altogether, these results indicate that peaks characterized as putatively active regulatory elements explain nearly all of caQTL-mediated IMD heritability.

caQTLs share IMD heritability with eQTLs and explain more of IMD heritability than do eQTLs

The model that gene regulatory activity explains a significant fraction of noncoding genetic associations to IMDs is supported by our findings that caQTLs mediate a significant proportion of IMD heritability and that those with active histone marks show strong h2med enrichment. This is in contrast with the relatively low average h2med/h2SNP estimates (~11%) previously observed for eQTLs across 48 human tissues in GTEx and various human traits (Yao et al., 2020). To directly compare the proportion of IMD heritability mediated by caQTLs and eQTLs in the same cell type (i.e. LCLs), we additionally applied MESC to gene expression data from LCLs (i.e. Geuvadis data, Figure 1—figure supplement 1B; Lappalainen et al., 2013).

Across the seven autoimmune diseases, the estimated proportion of heritability mediated by eQTLs (h2med; eQTL/h2SNP) ranged from 9% to 22% (Figure 2A). For all seven diseases, we estimated that eQTLs mediated less heritability than did caQTLs, even though the caQTLs’ smaller sample size would potentially bias the estimates toward zero (Yao et al., 2020). A possible explanation is that some IMD-associated regulatory variants may show detectable effects on chromatin accessibility, but not on gene expression, in LCLs at the current sample size (n=373); such loci may account for the missing regulation.

Figure 2 with 4 supplements
Immune-mediated diseases (IMD) heritability mediated by chromatin accessibility quantitative trait loci (caQTLs) and expression quantitative trait loci (eQTLs).

(A) h2med/h2SNP estimates of various IMDs for caQTLs, eQTLs, and their union. The error bars represent jackknife standard errors of the mean. AID: autoimmune disease. Disease abbreviations along the x-axis are as in Figure 1. (B) Schema of the potential causal relationships between genetic variants, caQTLs, eQTLs, and IMD risk. The two diagrams depict possible h2med/h2SNP trends depending on the causal relationship between caQTLs and eQTLs. (C) Number of IMD-associated loci colocalized with caQTLs or eQTLs. The proportion is out of the total number of IMD-associated (p<10⁻⁶) loci. (D) ELMO1 locus plot showing association to rheumatoid arthritis (RA), primary biliary cholangitis (PBC), multiple sclerosis (MS), chromatin accessibility, and ELMO1 expression in lymphoblastoid cell lines (LCLs). Purple shading in the gene plot at the bottom indicates the caQTL peak, and the purple diamond is the lead variant (rs60600003) that is within that peak. The other variants are colored by the degree of linkage disequilibrium (LD) with the annotated variant. (E) Enrichment of the Biological Process Gene Ontology (GO) terms of genes in proximity to IMD-colocalized caQTLs without eQTL colocalization.

We anticipated that IMD-associated variants that affect gene expression in cis do so by modulating regulatory element activity. Therefore, we investigated whether eQTL-mediated IMD heritability is shared by caQTL-mediated signals (Figure 2B). We performed MESC on both caQTLs and eQTLs together to estimate the amount of IMD heritability mediated by both collectively (see ‘Mediated heritability estimation of QTLs’). For the 7 IMDs, the combined h2med; caQTL ∪ eQTL/h2SNP was only slightly higher (by 2.2–9.0%) than the estimates for just caQTLs (h2med; just caQTL/h2SNP; Figure 2A), suggesting that approximately 56–82% of eQTL-mediated heritability is shared with caQTL-mediated heritability (i.e. h2med; caQTL ∩ eQTL/h2med; eQTL; Figure 2—figure supplement 1). These estimates are consistent with substantial sharing of caQTL- and eQTL-mediated IMD heritability. Nevertheless, 9–27% of IMD heritability is explained just by caQTLs, while only 2–9% of IMD heritability is explained just by eQTLs.
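One way to read the relationship among these quantities is through inclusion-exclusion, assuming that mediated heritability behaves additively over the caQTL and eQTL sets. The identities below are a sketch under that assumption and are not a restatement of the exact estimator used in the analysis.

```latex
\begin{align}
h^2_{\mathrm{med};\,\mathrm{caQTL} \cap \mathrm{eQTL}}
  &= h^2_{\mathrm{med};\,\mathrm{caQTL}}
   + h^2_{\mathrm{med};\,\mathrm{eQTL}}
   - h^2_{\mathrm{med};\,\mathrm{caQTL} \cup \mathrm{eQTL}} \\
\text{shared fraction of eQTL-mediated heritability}
  &= \frac{h^2_{\mathrm{med};\,\mathrm{caQTL} \cap \mathrm{eQTL}}}
          {h^2_{\mathrm{med};\,\mathrm{eQTL}}} \\
h^2_{\mathrm{med};\,\text{just eQTL}}
  &= h^2_{\mathrm{med};\,\mathrm{caQTL} \cup \mathrm{eQTL}}
   - h^2_{\mathrm{med};\,\mathrm{caQTL}}
\end{align}
```

As a purely hypothetical illustration (not values from this study), if caQTLs mediated 30% of heritability, eQTLs 15%, and their union 35%, then the shared component would be 10%, the shared fraction of eQTL-mediated heritability would be about 67%, and 5% would be explained just by eQTLs.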

Many IMD-associated loci show colocalization with caQTLs but not with eQTLs

We applied colocalization analysis (Pickrell et al., 2016) to identify IMD-associated loci that share genetic signals with caQTLs or eQTLs in LCLs. We selected candidate loci of 200 kb windows for each IMD with the following conditions: (1) lead IMD association at p<10⁻⁶, (2) lead caQTL or eQTL association at p<10⁻⁴, and (3) at least one variant simultaneously showed a caQTL or eQTL χ² statistic greater than 0.8 × the lead χ² statistic for the caQTL or eQTL, respectively, and an IMD association χ² statistic greater than 0.8 × the χ² statistic of the IMD lead variant in the locus. We applied gwas-pw (Pickrell et al., 2016) and considered loci with posterior probability of colocalization (PPA3) >0.98 to be colocalized (Kundu et al., 2022). Some loci colocalized with only either a caQTL or an eQTL, while others colocalized with both (Figure 2C and Supplementary file 1A and B).
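The three candidate-locus conditions and the colocalization cutoff can be sketched as a simple filter. The data structure, field names, and example values below are hypothetical; the posterior probability itself would come from gwas-pw, which is not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    """Per-variant summary statistics within one 200 kb window (hypothetical layout)."""
    p_gwas: float     # IMD association p-value
    chi2_gwas: float  # IMD association chi-square statistic
    p_qtl: float      # caQTL or eQTL p-value
    chi2_qtl: float   # caQTL or eQTL chi-square statistic


def is_candidate_locus(variants: List[Variant],
                       p_gwas_max: float = 1e-6,
                       p_qtl_max: float = 1e-4,
                       lead_fraction: float = 0.8) -> bool:
    """Apply conditions (1)-(3) described in the text to one window."""
    if not variants:
        return False
    lead_gwas_chi2 = max(v.chi2_gwas for v in variants)
    lead_qtl_chi2 = max(v.chi2_qtl for v in variants)
    cond1 = min(v.p_gwas for v in variants) < p_gwas_max
    cond2 = min(v.p_qtl for v in variants) < p_qtl_max
    # (3) a single variant must be close to the lead in BOTH association signals
    cond3 = any(v.chi2_qtl > lead_fraction * lead_qtl_chi2
                and v.chi2_gwas > lead_fraction * lead_gwas_chi2
                for v in variants)
    return cond1 and cond2 and cond3


def is_colocalized(ppa3: float, threshold: float = 0.98) -> bool:
    """Call colocalization from a posterior probability (PPA3) computed elsewhere."""
    return ppa3 > threshold


# Illustrative windows: signals on the same variant vs. on different variants.
shared = [Variant(4e-8, 30.0, 6e-7, 25.0), Variant(1e-3, 10.0, 2e-2, 5.0)]
distinct = [Variant(4e-8, 30.0, 0.15, 2.0), Variant(0.08, 3.0, 6e-7, 25.0)]
print(is_candidate_locus(shared), is_candidate_locus(distinct))
```

Condition (3) acts as a cheap pre-filter: windows where the strongest disease and QTL signals sit on unrelated variants are dropped before the more expensive colocalization step.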

We investigated which proteins might be interacting with the colocalized caQTL peaks using Cistrome (Liu et al., 2011). We tested for overlap of the colocalized caQTL peak regions with ChIP-seq peaks detecting diverse proteins in immune-related cells (Supplementary file 1C). Consistent with the enriched mediated heritability in accessible regions with active histone marks (Figure 1D and Figure 1—figure supplement 2), proteins that are most often detected at the colocalized accessible regions are those related to RNA transcription (POLR2A and MED1), chromatin remodeling (EP300, BRD4, SMARCA4, and MTA2), or immune cell transcription factors (IKZF1, RUNX3, SPI1, RELA, RUNX1, and EBF1) (Figure 2—figure supplement 2). Interestingly, TRIM28, which functions as a repressor, was one of the most frequently overlapping protein factors.

To confirm that the colocalized genes are relevant to IMD, we tested for their enrichment of Gene Ontology (GO) annotation terms for specific biological processes (Thomas et al., 2022). Considering all genes within 500 kb of the IMD GWAS lead variants at colocalized loci as background, the genes that showed eQTL colocalization for any IMD were enriched for various immune responses and signaling processes, such as ‘positive regulation of immune system process’ and ‘regulation of lymphocyte activation’ (Figure 2—figure supplement 3), indicating that the colocalized genes in LCLs are involved in immune function. For example, IL6R and IL12A encode direct or indirect targets of approved drugs – Tocilizumab and Ustekinumab – for autoimmune diseases like RA (Sanmartí et al., 2018) and CD (Khanna and Feagan, 2013). These two genes showed colocalization with both caQTLs and eQTLs in CD and PBC GWAS, respectively (Figure 2—figure supplement 4A and B). Increased IL6R expression was associated with higher risk for CD, and increased IL12A expression was associated with lower risk for PBC and SLE. The former observation is in line with Tocilizumab, a monoclonal antibody to IL-6 receptor, showing efficacy in CD patients (Sanmartí et al., 2018), although it is not pursued for approval because of potential side effects (Monemi et al., 2016). Interestingly, IL6R and IL12A eQTLs did not colocalize with the association signals of RA and CD, respectively, which are the diseases for which these drugs are approved (Figure 2—figure supplement 4C and D). Moreover, ELMO1, which previously had not been associated with autoimmune diseases, showed eQTL colocalization with RA, PBC, and MS association signals (Figure 2D). In all three, decreased ELMO1 expression was associated with increased disease risk. In mice, Elmo1 was required for polarization and migration of B and T lymphocytes (Stevenson et al., 2014).

Across the IMDs, there were many loci that colocalized with a caQTL but not with an eQTL (Figure 2C). These ‘caQTL-only’ loci showed enrichment for immune response genes in cis compared to all accessible regions in LCLs (McLean et al., 2010; Tanigawa et al., 2022), even though the colocalized eQTLs were enriched for immune response genes as well (Figure 2E and Figure 2—figure supplement 3), indicating that IMD-relevant genes without eQTL colocalization in Geuvadis LCL data (Lappalainen et al., 2013) are likely found in these loci.

Distance to TSSs affects eQTLs but not caQTLs

Why might there be loci with caQTL colocalization only, despite the caQTL data having fewer samples than the eQTL data (100 vs 373)? Limited statistical power can prevent some eQTLs from being detected and showing significant colocalization (Hukku et al., 2021). As hypothesized by Mostafavi and colleagues, disease-relevant eQTLs may be weaker and more distal (Mostafavi et al., 2023). To understand the extent to which this effect may result in many loci showing colocalization only with caQTLs, we compared the cis-heritability (h2cis) of caQTLs and eQTLs depending on the distance from the ATAC peak to the TSS of the gene (i.e. peak-to-TSS distance). We considered all caQTLs and eQTLs regardless of disease association.

We identified pairs of caQTLs and eQTLs that colocalized with each other (Pickrell et al., 2016), which implies that the regulatory variant modulating chromatin accessibility also affects gene expression. For such pairs, the distance between the ATAC peak and the TSS of the eQTL gene (i.e. eGene) is the distance between a regulatory element and its target gene’s TSS. We stratified the pairs into peak-to-TSS distance quintiles and compared the eQTL h2cis distribution of the first quintile (i.e. closest pairs) with that of the later quintiles. We observed that eQTL h2cis decreased with increasing distance of the paired ATAC peaks from the TSS (p=1.0 × 10⁻⁴, 2.1 × 10⁻¹⁰, 1.6 × 10⁻¹⁵, and 1.7 × 10⁻²⁰ for the second through fifth quintiles, respectively, one-sided Wilcoxon rank-sum test; Figure 3A), consistent with the negative relationship between promoter-enhancer genomic distance and impact on gene expression (Fulco et al., 2019; Zuin et al., 2022). This result also explains why discovered eQTLs are concentrated near the promoter, where the variants are more likely to show stronger effects (Võsa et al., 2021). In contrast, caQTL h2cis distribution was similar across peak-to-TSS distances (p>0.05, one-sided Wilcoxon rank-sum test; Figure 3A). For all distance quintiles, caQTL h2cis was significantly higher than that of the paired eQTLs (p=4.0 × 10⁻²³, 1.3 × 10⁻²⁴, 1.5 × 10⁻¹⁹, 1.1 × 10⁻²⁸, and 2.4 × 10⁻⁴⁹, respectively, one-sided Wilcoxon signed-rank test; Figure 3A), and the contrast between them was greater at more distant quintiles, suggesting that the statistical power to detect and colocalize eQTLs is increasingly lower than that for caQTLs for regulatory effects far from the TSS.
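The stratified comparison described above can be sketched with the standard library alone. The rank-based quintile assignment and the normal approximation to the rank-sum statistic (without tie correction of the variance) are assumptions made for brevity; the original analysis may have used a statistics package with exact or tie-corrected p-values, and the function names here are illustrative.

```python
import math
from typing import List, Sequence


def quintile_labels(distances: Sequence[float]) -> List[int]:
    """Assign each pair a quintile 1..5 by distance rank (1 = closest to the TSS)."""
    n = len(distances)
    order = sorted(range(n), key=lambda i: distances[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // n + 1
    return labels


def rank_sum_greater(x: Sequence[float], y: Sequence[float]) -> float:
    """One-sided p-value that values in x tend to exceed those in y.

    Uses the Mann-Whitney U form of the rank-sum test with a normal
    approximation; tied values receive their average rank.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability


# Toy data: cis-heritability of the nearest quintile vs. a more distal one.
near_h2 = [0.30, 0.25, 0.40, 0.35]
far_h2 = [0.10, 0.05, 0.15, 0.20]
print(rank_sum_greater(near_h2, far_h2))
```

In this sketch, comparing the first quintile against each later quintile would amount to grouping the h2cis values by `quintile_labels` and calling `rank_sum_greater` once per later quintile.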

Effect of peak-to-TSS distance on cis-heritability of chromatin accessibility quantitative trait loci (caQTLs) and expression quantitative trait loci (eQTLs) and immune-mediated disease (IMD) heritability.

(A) Distribution of cis-heritability (h2cis) of caQTLs and eQTLs by peak-to-TSS distance quintiles. The ranges of peak-to-TSS distance are shown in parentheses. The comparisons shown on the top (in respective colors) are between the nearest and each subsequent quintile of the respective QTL h2cis distribution (i.e. one-sided Wilcoxon rank-sum test). The comparisons shown on the bottom (in black) are between caQTL and eQTL h2cis distribution (one-sided Wilcoxon signed-rank test). *: p<10⁻⁴, **: p<10⁻¹⁰, ***: p<10⁻²⁰, and ns: p>0.05. (B) Regression estimates and their standard errors of the linear regression model testing the effects of caQTL h2cis and peak-to-TSS distance on eQTL h2cis. Peak-to-TSS distance was expressed in units of 100 kb to neatly visualize the effect size estimates. The error bars represent standard errors of the regression estimate. SE: standard error. (C) Proportion of caQTL-mediated IMD heritability explained by ATAC peaks within various TSS windows. Percentage for each TSS window denotes the proportion of ATAC peaks in that window. The error bars represent jackknife standard errors of the mean. Disease abbreviations along the x-axis are as in Figure 1. (D) A model of the relationship between peak-to-TSS distance and power to detect a corresponding caQTL or eQTL. The thickness of the arrows indicates the variant effect size on chromatin accessibility (yellow) or gene expression (blue). TSS, transcription start site; CRE, cis-regulatory element.

Overall, caQTL h2cis had a significant positive effect (p=6.6 × 10⁻¹¹, linear regression) and peak-to-TSS distance had a negative effect (p=2.4 × 10⁻²³, linear regression) on eQTL h2cis (Figure 3B). Thus, for regulatory variants that showed both caQTL and eQTL signals, those with larger effects on chromatin accessibility tended to exhibit larger effects on gene expression, but their eQTL effects diminished with increasing distance from TSSs.
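The regression described here can be written in a generic form. The specification below is an assumed sketch: it takes one observation per colocalized caQTL-eQTL pair $i$, with $d_i$ the peak-to-TSS distance in units of 100 kb, and does not account for any transformations or additional covariates the original model may have included.

```latex
\begin{equation}
h^2_{\mathrm{cis};\,\mathrm{eQTL},\,i}
  = \beta_0
  + \beta_1 \, h^2_{\mathrm{cis};\,\mathrm{caQTL},\,i}
  + \beta_2 \, d_i
  + \varepsilon_i
\end{equation}
```

In this notation, the reported results correspond to $\beta_1 > 0$ (larger effects on accessibility accompany larger effects on expression) and $\beta_2 < 0$ (eQTL cis-heritability declines with distance from the TSS).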

Next, we investigated how caQTL-mediated IMD heritability is distributed with respect to TSS. If caQTLs beyond the typical cis-eQTL window of 1 megabase (Mb) around the genes’ TSS explain some proportion of IMD heritability, then cis-eQTL analyses might require a wider window to detect disease-associated eQTLs. Across the seven diseases, caQTLs within 500 kb of the TSS of expressed genes explained almost all of the caQTL-mediated IMD heritability (92–100%; Figure 3C), indicating that regulatory variants are most likely within 500 kb of the target gene’s TSS and supporting the use of a 1 Mb window for cis-eQTL analyses. Depending on the disease, 41–66% of the caQTL-mediated IMD heritability was detected in distal peaks further than 10 kb from the TSS of expressed genes, further supporting the analysis of regulatory variants beyond promoter regions.

In sum, the power to detect eQTLs diminishes with increasing distance of the variant from the TSS, but the power to detect caQTLs is largely invariant regardless of peak-to-TSS distance (Figure 3D). Since h2med; caQTL is distributed mostly within 500 kb of genes’ TSSs, the IMD loci colocalizing only with caQTLs could still be weak, undetected eQTLs. Under this model, we predicted that increasing the power to detect eQTLs in LCLs, for example by increasing sample size, may lead to further eQTL colocalizations in loci in which we observed only caQTL colocalization.

Increasing the sample size reveals some eQTL colocalization

For genetic association studies, increasing the sample size is a way to increase statistical power. Therefore, we meta-analyzed four sets of LCL eQTL summary statistics (Aguet et al., 2020; Lappalainen et al., 2013; Gutierrez-Arcelus et al., 2013; Buil et al., 2015), leading to a total sample size of 1128 individuals. We performed colocalization analysis using the meta-analyzed summary statistics to evaluate whether effectively increasing the sample size would uncover more disease-associated eQTLs in LCLs, especially in loci where a caQTL already showed IMD colocalization. Up to six additional loci showing eQTL colocalization were thus detected for each IMD (Figure 4A and Supplementary file 1D). For example, CIITA encodes the class II major histocompatibility complex transactivator, whose dysfunction causes severe immunodeficiency (Dziembowska et al., 2002). The CIITA locus shows an association with IBD that falls just short of genome-wide significance (rs10445003, p=7.5 × 10⁻⁸), and it colocalized with a caQTL signal, but initially not with any eQTL (Figure 4—figure supplement 1A). However, the meta-analyzed statistics showed a stronger association to CIITA expression (p=6.7 × 10⁻⁸) than without meta-analysis (p=5.4 × 10⁻⁴) and exhibited a significant colocalization. Interestingly, two of the causal CD genes that previously lacked colocalized eQTLs (Connally et al., 2022), CARD9 and ATG16L1, showed significant colocalization in the meta-analyzed LCL eQTL data (Figure 4—figure supplement 1B and C).
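To illustrate why combining cohorts strengthens an association, the sketch below applies a fixed-effects, inverse-variance-weighted meta-analysis to per-study effect estimates for one variant-gene pair. The choice of this particular method is an assumption, since the text here does not specify how the summary statistics were combined, and the input values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def ivw_meta(betas: Sequence[float],
             ses: Sequence[float]) -> Tuple[float, float, float, float]:
    """Fixed-effects inverse-variance-weighted meta-analysis.

    Returns the combined effect estimate, its standard error,
    the z-score, and the two-sided normal p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return beta, se, z, p


# Hypothetical per-study estimates for the same eQTL in four cohorts.
study_betas = [0.20, 0.18, 0.25, 0.15]
study_ses = [0.08, 0.10, 0.12, 0.09]

single = ivw_meta(study_betas[:1], study_ses[:1])
combined = ivw_meta(study_betas, study_ses)
print(f"single-study p = {single[3]:.3g}, meta-analysis p = {combined[3]:.3g}")
```

With consistent effect directions across studies, the combined standard error shrinks as studies are added, so an association that is modest in any single cohort can become strong enough to support colocalization.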

Figure 4 with 2 supplements
Additional colocalization of chromatin accessibility quantitative trait locus (caQTL)-colocalized immune-mediated disease (IMD) loci with meta-analyzed lymphoblastoid cell line (LCL) expression quantitative trait locus (eQTL) data.

(A) Number of caQTL-colocalized IMD loci that showed eQTL colocalization in LCLs. Disease abbreviations along the x-axis are as in Figure 1. (B) Distribution of peak-to-TSS distance of all caQTL-eQTL pairs and of those colocalized with IMD association. The number of loci in each category is shown in parentheses. *: p<0.01, ns: p>0.05. (C) POU3F1 eQTL that became significantly colocalized with rheumatoid arthritis (RA) association by meta-analyzing LCL eQTL data. Purple shading in the gene plot at the bottom indicates the caQTL peak, and the purple diamond is the lead variant (rs60600003) that is within that peak. The other variants are colored by the degree of linkage disequilibrium (LD) with the annotated variant.

We hypothesized that increased sample size would improve the power to detect weaker and distal eQTL colocalization. Comparison of the accessibility peak’s distance to the paired eQTL gene’s (eGene) TSS showed that the newly detected eQTLs tended to be more distal (p=0.06, one-sided Wilcoxon rank-sum test; Figure 4B). However, compared to the distribution of the peak-to-TSS distance for all caQTL-eQTL pairs showing colocalization, IMD-associated loci that showed caQTL and eQTL colocalization had greater peak-to-TSS distance on average (p=0.002 for Geuvadis and p=5.9 × 10–6 for the meta-analyzed data; Figure 4B). For example, an RA-associated locus near POU3F1, a neuronal transcription factor that is also induced by interferon (Hofmann et al., 2010), colocalized with a distal eQTL located about 126 kb upstream of its promoter, after meta-analysis strengthened the eQTL association (p<10–10; Figure 4C). These results suggest that additional IMD-associated loci with distal, weaker eQTLs in LCLs might be found if eQTL data were generated for a larger number of individuals.

IMD loci that colocalized with caQTLs but not eQTLs showed lower levels of active histone marks

Despite uncovering more eQTL colocalizations through meta-analysis, more than 40% of the caQTL-colocalized loci nevertheless showed no colocalization with an eQTL in LCLs (Figure 4B). We investigated whether these ‘caQTL-only’ peaks might be inactive regulatory elements. We quantified the active histone mark levels for H3K27ac, H3K4me1, and H3K4me3 at colocalized caQTL peaks in LCLs and then compared their levels between the ‘caQTL and eQTL’ and ‘caQTL only’ loci. On average, H3K27ac marks were stronger at ‘caQTL and eQTL’ peaks, supporting that the corresponding regulatory elements might be more active (Figure 4—figure supplement 2). H3K4me3 marks were detected more often at ‘caQTL and eQTL’ peaks, leading to stronger average signal. In contrast, H3K4me1 levels were highly similar between the two sets of peaks. Although ‘caQTL-only’ peaks generally showed lower levels of active histone marks, several individual ‘caQTL-only’ peaks showed comparable levels. These peaks could be inactive cis-regulatory elements in LCLs that affect gene expression in a different cellular context. Therefore, we next examined whether those caQTLs appear as eQTLs in other immune cell types.

Various immune cell types exhibit eQTL colocalization, where LCLs did not

We downloaded eQTL summary statistics generated from 26 naïve and stimulated immune cell types (Schmiedel et al., 2018; Alasoo et al., 2018; Chen et al., 2016; Soskic et al., 2022; Bossini-Castillo et al., 2022) to search for eQTLs that may correspond to the remaining, IMD-colocalized caQTLs (Supplementary file 1E). The profiled cell types range from B cells and monocytes to subtypes of T cells, as well as stimulated T cells and macrophages. We tested for colocalization of these eQTLs to IMD associations.

 25–42% of the caQTL-colocalized IMD loci that were missing eQTL colocalizations in LCLs showed eQTL colocalizations in at least one immune cell type (Figure 5A and Supplementary file 1F). The overlap of LCL caQTLs with non-LCL immune cell eQTLs was greater than expected by chance for 5 of the 7 IMDs (p<0.00714 [Bonferroni-corrected threshold], Fisher’s exact test, for CD, IBD, PBC, RA, and UC; Supplementary file 1G), suggesting that the caQTLs found in LCLs may also show regulatory function in those immune cell types. Comparing across the datasets, we found that the number of loci with eQTL colocalization varied depending on the cell type, but that the effect of the sample size was more profound (r2=0.60–0.79; Figure 5B and Supplementary file 1H). We meta-analyzed eQTL data of three immune cell types with multiple sources – naïve CD4+ T cell, monocyte, and memory regulatory T cell (Treg) – and this also increased the number of loci with significant eQTL colocalization (Supplementary file 1I). Altogether, these results suggest that although generating eQTL data in more cell types and cell states uncovers context-specific eQTLs, increasing the sample size should also be a priority to ensure sufficient statistical power.

Figure 5 with 1 supplement
Added utility of various immune cell expression quantitative trait locus (eQTL) data.

(A) Number of loci that additionally colocalized with eQTLs by lymphoblastoid cell line (LCL) meta-analysis (orange) and immune cell data (cyan) compared to the original analysis with Geuvadis LCL eQTL data (purple). The height of the bar is the proportion of loci with eQTL colocalization out of the total immune-mediated disease (IMD) loci with chromatin accessibility quantitative trait locus (caQTL) colocalization in the earlier analysis. Disease abbreviations along the x-axis are as in Figure 1. (B) Relationship between the number of multiple sclerosis (MS) genome-wide association study (GWAS) loci with eQTL colocalization and sample size for each eQTL dataset. Meta-analyzed eQTL data are labeled with their cell types. NK cell, natural killer cell; Treg, regulatory T cell.

We investigated the potential reasons why IMD loci that colocalized with caQTLs in LCLs showed eQTLs not in LCLs but in other immune cells. First, LCL caQTLs may correspond to gene regulatory elements that exert their effects on gene expression in a different cellular context. For instance, monocyte H3K27ac levels in the ‘caQTL-only’ loci where monocyte eQTLs colocalized were higher than those with no monocyte eQTLs (Figure 5—figure supplement 1). Second, some examples, such as for TNFSF15, were due to cell-type-specific gene expression (Figure 6): despite a significant colocalization of IBD with a caQTL in LCLs in the TNFSF15 locus, disease-associated eQTL signal was detected only in monocytes (Figure 6A). Tumor necrosis factor-like cytokine 1A (TL1A), the protein encoded by TNFSF15, is secreted by monocytes and many other cells to activate helper T cells, Tregs, and B cells (Xu et al., 2022). TNFSF15 expression was low in LCLs (mean transcripts per million [TPM] = 0.30), but higher in monocytes (mean TPM = 2.23). Of the genes that showed exclusively monocyte eQTL colocalization, those with low expression (mean TPM <1) in LCLs generally showed higher expression in monocytes (Figure 6B and C). However, low expression in LCLs was likely not the explanation for most cases of ‘monocyte-only’ eQTLs (blue points in Figure 6B) because most ‘monocyte-only’ eQTL genes were expressed at a level higher than 1 TPM in LCLs.

Immune-mediated disease (IMD) loci that colocalized with an expression quantitative trait locus (eQTL) in monocytes, but not in lymphoblastoid cell lines (LCLs), even though they colocalized with chromatin accessibility quantitative trait loci (caQTLs) in LCLs.

(A) TNFSF15 locus plot showing genetic association to inflammatory bowel disease (IBD), chromatin accessibility in LCLs, and TNFSF15 expression in LCLs and monocytes. Purple shading in the gene plot at the bottom indicates the caQTL peak, and the purple diamond shows a strongly associated variant (rs7848647) that is within that peak. The other variants are colored by the degree of linkage disequilibrium (LD) with the annotated variant. (B) Expression levels of genes in LCLs and monocytes colored according to their eQTL colocalization outcome. TPM: transcripts per million. (C) Number of genes with eQTL colocalization in monocytes, but not in LCLs, separated by the gene’s expression level in LCLs (column) and whether it is lower or higher than that in monocytes (row).

Overall, expanding the eQTL search to various immune cell types increased the number of eQTL-colocalized loci among those that previously colocalized only with caQTLs (Figure 5A). On average, approximately 75% of the caQTL-colocalized loci ultimately showed eQTL colocalization. These results highlight the utility of eQTL data across a range of immune cell types for discovery of IMD-associated eQTLs.

Discussion

The lack of a link between many noncoding GWAS loci and the gene regulatory effects of the associated variants has posed challenges in understanding their genetic mechanisms (Connally et al., 2022). Various hypotheses have been proposed, including more complex gene regulation of disease genes (Wang and Goldstein, 2020), context specificity of gene regulation (Umans et al., 2021), and a combination of both due to selective constraints against damaging eQTLs (Mostafavi et al., 2023). A better evaluation of the potential reasons for the missing regulation will guide future data generation projects to elucidate disease-associated loci. To determine why some loci might lack colocalized eQTLs, we focused on chromatin accessibility, a molecular phenotype that regulatory variants affect more directly than gene expression. We approached this question with mediated heritability analysis (Yao et al., 2020) and colocalization analysis (Giambartolomei et al., 2014; Pickrell et al., 2016).

We found that caQTLs in LCLs mediate a significant proportion of heritability for many autoimmune diseases. In contrast, LCLs did not appear to be an effective cell type to model gene regulatory effects in allergic diseases. The h2med/h2SNP estimates for caQTLs were higher than those of eQTLs in most autoimmune diseases, even though the smaller sample size of caQTL data (i.e. 100 vs 373) could bias caQTLs’ estimates toward zero (Yao et al., 2020). We also showed that disease-associated chromatin accessibility effects often share the genetic signal with gene expression effects but that there are also many loci without an eQTL detected in LCLs.

By focusing on disease-associated caQTLs that lacked significant eQTLs, we explored how additional colocalized eQTL effects could be uncovered. First, increasing the sample sizes for the eQTL statistics via meta-analysis demonstrated that more eQTL colocalizations can be detected with increased statistical power and robustness (Hukku et al., 2021). These results are consistent with the hypothesis that disease-associated eQTLs are typically weaker and distal due to negative selection against large expression changes for causal genes (Mostafavi et al., 2023). Second, many caQTLs in LCLs without eQTLs in LCLs showed eQTL colocalization in other immune cell types. Context specificity of eQTLs has been widely considered to be the primary explanation for the difficulty of pinpointing disease-associated eQTLs (Umans et al., 2021; Schmiedel et al., 2018; Alasoo et al., 2018; Soskic et al., 2022). Our observation that many IMD-colocalized caQTLs in LCLs show eQTLs in other immune cell types suggests that caQTL effects may be shared across cell types, whereas eQTL effects are more context-specific (Alasoo et al., 2018). If a shared set of transcription factors is expressed in similar yet distinct cell types, like the immune cells, genetic variants affecting their DNA binding would affect chromatin accessibility similarly. On the other hand, regulation of gene expression is a result of multiple regulatory elements, each binding multiple transcription factors (Kim and Wysocka, 2023), so the measured effect of a regulatory variant on gene expression likely depends much more on the cellular context. In such cases, eQTL data for the specific cellular contexts need to be generated in future studies to uncover genetic signal shared with a complex trait or disease. All in all, our results suggest that both increasing the sample size and generating gene expression data from more relevant cellular contexts would be useful strategies for discovering more disease-associated eQTLs.

Finally, we demonstrated that caQTLs can reveal the regulatory variant effect of disease-associated variants that may have been difficult to detect with eQTLs, particularly in TSS-distal regions. Although caQTLs cannot directly identify the target gene or the causal cellular context, we anticipate that integrated analyses can improve the power to detect weaker eQTL signals, as multi-trait GWAS analyses have shown (Turley et al., 2018). Moreover, integrating multiple molecular QTL data, like transcription factor bQTLs and histone mark QTLs, may highlight the regulatory elements associated with the GWAS phenotype, which may ultimately contribute to identifying the causal gene (Jeong and Bulyk, 2023). Therefore, we anticipate the generation of data across the various molecular phenotypes upstream of gene expression for QTL analyses will be informative.

Limitations of the study

The h2med/h2SNP estimates can be biased because of insufficient sample size of the QTL data. Thus, the proportion of mediated IMD heritability could change based on the specific caQTL and eQTL data. Colocalization analysis tests whether the QTL and GWAS data likely share genetic signal, and such shared signal could arise from either causal mediation or pleiotropy. Therefore, further experiments are needed to establish causality of colocalized eQTL genes. Lastly, we hypothesized that caQTL effects are often shared across cell types, but we did not have access to relevant data to test this hypothesis.

Methods

ATAC-seq data processing and peak calling

We downloaded LCL ATAC-seq data of British (GBR) samples (n=100) from the European Nucleotide Archive (ENA) under accession ERP110508 (Kumasaka et al., 2019). The available files were CRAM alignment files mapped to the b37 reference genome, so we extracted unique read pairs using SAMtools (Danecek et al., 2021) and the bamtofastq command from bedtools (Quinlan and Hall, 2010). The reads were paired-end and each 75 base pairs (bp) long. The data contained reads with Nextera transposase adapters, so we removed the adapter sequences and bases of poor quality at the 3’ end using cutadapt. Read pairs in which both trimmed reads were shorter than 20 bp were discarded. The command was ‘cutadapt -a file:${forward} -A file:${reverse} -e 0.25 -j 2 -q 15 --pair-filter=both -m 20’ (${forward} and ${reverse} files contain the forward and reverse Nextera transposase adapters). We mapped the reads to the GRCh38 reference genome using Bowtie 2 (Langmead and Salzberg, 2012) with the ‘GRCh38_noalt_decoy_as’ index provided on the tool’s website. The command was ‘bowtie2 --very-sensitive --no-mixed --no-discordant -I 20 -X 2000’. We kept only read alignments with mapping quality greater than 1. We also removed reads aligning to the mitochondrial genome, those overlapping ENCODE exclusion regions (file ID: ENCFF356LFX) (Dunham, 2012) and potential PCR duplicates using scripts from WASP (van de Geijn et al., 2015).

To represent peaks across the samples, we subsampled 3 million read pairs from each and pooled them. Then, we used MACS2 (Zhang et al., 2008) with the BAMPE option for peak calling. The command was ‘macs2 callpeak -f BAMPE -g hs -q 0.05’. We further used the ‘bdgcmp’ and ‘bdgpeakcall’ subcommands to find peaks that are at least 100 bp long (-l) and merge those that are less than 100 bp apart (-g). We also merged peaks similarly derived from individual samples using the ‘merge’ command in bedtools.
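As an illustration of the length and gap criteria described above, the following minimal Python sketch merges intervals that are less than 100 bp apart and then keeps those at least 100 bp long. It is not the MACS2 implementation; the function name `merge_and_filter`, the half-open coordinate convention, and the order of the two steps are assumptions made for this example.

```python
def merge_and_filter(intervals, max_gap=100, min_length=100):
    """Merge intervals separated by less than `max_gap` bp, then keep
    merged intervals that are at least `min_length` bp long.

    `intervals` is a list of (start, end) tuples on one chromosome,
    using half-open coordinates (an assumption of this sketch).
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] < max_gap:
            # Close enough to the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged if e - s >= min_length]
```

For example, under these assumptions two 60–80 bp regions separated by 60 bp would be reported as a single peak, whereas an isolated 50 bp region would be dropped.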

Furthermore, for the sake of comprehensiveness, we repeated these steps with LCL ATAC-seq data of Yoruban (YRI) samples (Banovich et al., 2018) and merged the peaks with the earlier peak set derived from GBR samples. In total, there were 443,403 peaks genome-wide.

RNA expression data preparation

We downloaded RNA expression level data of the LCL samples (Lappalainen et al., 2013) from the Expression Atlas (Papatheodorou et al., 2020). This data consisted of TPM values of genes as processed by the Expression Atlas. We retained TPM values of protein-coding and long noncoding RNAs in European samples for downstream analyses, including QTL analysis, mediated heritability analysis, and colocalization.

LCL samples’ genotype preparation

To utilize the genotype calls of the highest quality, we downloaded the high-coverage 1000 Genomes (1kG) Project data (Byrska-Bishop et al., 2022). Of the European samples with ATAC-seq or RNA-seq data, 14 samples had genotypes derived from microarrays (Auton et al., 2015), and the remaining samples had genotypes derived from the high-coverage whole-genome sequencing data. The samples with only microarray-based genotypes that needed imputation are listed in Supplementary file 1J. We lifted over the microarray data based on the hg19 reference genome to GRCh38 and filtered for variants present in the high-coverage 1kG data. We first imputed microarray genotype data using the TOPMed imputation server (Das et al., 2016) and extracted SNPs with imputation R2 ≥0.5 and imputed the rest of the variants, including short indels, using high-coverage data with Beagle5.2 (Browning et al., 2018; Browning et al., 2021). After keeping only variants with DR2 ≥0.7, we merged the imputed genotypes with the high-coverage 1kG genotypes.

Curation of IMD GWAS data

We downloaded GWAS summary statistics for 13 IMDs, including 11 autoimmune diseases (ATD [Cordell et al., 2021], CEL [Dubois et al., 2010], CD [de Lange et al., 2017], IBD [de Lange et al., 2017], JIA [López-Isac et al., 2021], MS [Patsopoulos, 2019], PBC [Cordell et al., 2021], RA [Ishigaki et al., 2022], SLE [Bentham et al., 2015], UC [de Lange et al., 2017], and VIT [Jin et al., 2016]) and 2 allergic diseases (ALL [Loh et al., 2018] and AST [Loh et al., 2018]). For each disease, we searched for more recent studies with larger sample sizes and prioritized those with genome-wide statistics, rather than those with only Immunochip variants (Cortes and Brown, 2011). To compare with nonimmune diseases, we also downloaded summary statistics for 3 nonimmune diseases (T2D [Morris et al., 2012], CAD [Schunkert et al., 2011], and SCZ [Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014]). In this study, we analyzed only those GWAS summary statistics derived from cohorts of individuals with European ancestries.

Since the LCL samples’ genotypes are based on the GRCh38 reference genome, we lifted over any GWAS data based on the b37 reference genome to the GRCh38 genomic coordinates. Briefly, we formatted the summary statistics as bed files and used the liftOver tool (Kent et al., 2002) to convert them to GRCh38 genomic coordinates. Then, to ensure that the reference alleles match the sequences of the GRCh38 reference genome, we used the gwas2vcf tool (Lyon et al., 2021).

caQTLs in LCLs

First, we quantified the chromatin accessibility levels at ATAC-seq peaks identified earlier. We counted the number of read fragments overlapping each peak using featureCounts (Liao et al., 2014). For each sample, the read counts were normalized for library size using trimmed mean of M-values (Robinson and Oshlack, 2010) so that the values are comparable across the samples. Then, the phenotype values were further normalized to follow a standard normal distribution across the samples, using quantile normalization. Peaks with counts per million (CPM)<0.8 or counts <10 for more than 20% of the samples were discarded.
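The peak filter and the transformation to a standard normal distribution can be sketched as follows. This is a minimal Python illustration, not the code used in the study: the function names, the reading of the filter as "a sample fails if CPM<0.8 or count<10", and the (rank − 0.5)/n offset in the rank-based inverse normal transformation are all assumptions. Ties are broken by input order in this sketch.

```python
from statistics import NormalDist


def keep_peak(counts, cpms, min_cpm=0.8, min_count=10, max_fail_frac=0.2):
    """Return False if more than `max_fail_frac` of the samples have
    CPM < min_cpm or raw count < min_count for this peak."""
    failing = sum(1 for c, m in zip(counts, cpms)
                  if m < min_cpm or c < min_count)
    return failing <= max_fail_frac * len(counts)


def inverse_normal_transform(values):
    """Map one peak's values across samples to standard normal quantiles
    by rank (assumed offset: (rank - 0.5) / n)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    transformed = [0.0] * n
    standard_normal = NormalDist()
    for rank, i in enumerate(order, start=1):
        transformed[i] = standard_normal.inv_cdf((rank - 0.5) / n)
    return transformed
```

After this transformation, every retained peak has the same set of phenotype values across samples, differing only in which sample receives which quantile.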

Next, we performed a principal component analysis (PCA) on the phenotype matrix to derive potential latent covariates. We selected the number of principal components (PCs) to incorporate in the regression model based on the Buja and Eyuboglu algorithm (Buja and Eyuboglu, 1992) that is implemented in PCAForQTL (Zhou et al., 2022). Ultimately, we accounted for sex, library size, 3 genotype PCs, and 13 phenotype PCs in the QTL analysis. We performed genetic association tests on variants within 200 kilobases (kb) of the peak using tensorQTL (Taylor-Weiner et al., 2019). We discarded variants with minor allele frequency less than 5%.

IMD heritability enrichment in accessible regions of LCLs

We evaluated the relevance of accessible regions in LCLs to IMD heritability using S-LDSC (Finucane et al., 2015). We used the baselineLD v2.2 annotation in hg38 and the European LD reference from the 1000 Genomes Project (downloaded from the S-LDSC website, https://alkesgroup.broadinstitute.org/LDSCORE/GRCh38/). We used the set of filtered ATAC-seq peaks that we tested for QTL associations. We accounted for 16 diseases (13 IMDs and 3 non-IMDs) for the Bonferroni-corrected p-value threshold.
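As a worked example of the multiple-testing correction, the short Python sketch below computes a Bonferroni-corrected threshold. The family-wise error rate of 0.05 is an assumption here; it is consistent with the threshold of 0.00714 reported in the Results for seven IMDs.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value threshold for `n_tests` tests at family-wise
    error rate `alpha` (alpha = 0.05 is an assumption of this sketch)."""
    return alpha / n_tests


# 16 diseases (13 IMDs and 3 non-IMDs) tested with S-LDSC.
threshold_16 = bonferroni_threshold(16)
```

Under this assumption, the threshold for the 16 diseases would be 0.05/16 = 0.003125.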

Mediated heritability estimation of QTLs

We estimated the heritability mediated by QTLs (h2med) using MESC (Yao et al., 2020). We denote heritability of tested SNPs as h2SNP.

caQTL-mediated heritability

We estimated the ‘expression scores’ for chromatin accessibility in LCLs using individual genotypes and phenotypes. We analyzed the same set of peaks and accounted for the same covariates as we did for the QTL analysis above. For mediated heritability estimation, we accounted for baseline LD v2.2 annotation in hg38 without the QTL annotations, as they could be redundant. The estimand of interest is the proportion of heritability mediated by caQTLs (h2med/h2SNP).

To evaluate whether certain peak sets are enriched for mediated heritability, we utilized the gene set analysis functionality. For peak strength, we considered the 95th percentile CPM value of each peak and stratified the peaks into quintiles. For histone marks, we first generated histone ChIP-seq peaks using data from Delaneau et al., 2019. Similar to calling ATAC-seq peaks, we sampled 3 million reads per sample and merged them before applying MACS2 (Zhang et al., 2008). We downloaded three control ChIP-seq datasets from ENCODE to use as input (File IDs: ENCFF066RCS, ENCFF159XTB, and ENCFF850RIE) (Dunham, 2012). Then, we curated sets of ATAC-seq peaks that overlapped H3K27ac, H3K4me1, and H3K4me3 ChIP-seq peaks by at least 1 bp. ATAC-seq peaks that overlapped H3K27ac but not H3K4me3 regions were labeled as ‘enhancer-like signature’, while those that overlapped H3K27ac and H3K4me3 regions were labeled as ‘promoter-like signature’. The estimand of interest is the proportion of mediated heritability explained by the peak set (peak set h2med/total h2med).
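The labeling rule for enhancer-like and promoter-like signatures can be written compactly. The Python sketch below is illustrative only: the function names and the half-open (start, end) interval representation are assumptions, and all intervals are taken to lie on the same chromosome.

```python
def overlaps(a, b):
    """True if half-open intervals a and b share at least 1 bp."""
    return min(a[1], b[1]) - max(a[0], b[0]) >= 1


def regulatory_signature(atac_peak, h3k27ac_peaks, h3k4me3_peaks):
    """Label an ATAC-seq peak from its overlap with histone ChIP-seq peaks.

    Returns 'promoter-like signature' for H3K27ac and H3K4me3 overlap,
    'enhancer-like signature' for H3K27ac without H3K4me3, else None.
    """
    has_k27ac = any(overlaps(atac_peak, p) for p in h3k27ac_peaks)
    has_k4me3 = any(overlaps(atac_peak, p) for p in h3k4me3_peaks)
    if has_k27ac and has_k4me3:
        return "promoter-like signature"
    if has_k27ac:
        return "enhancer-like signature"
    return None
```

Peaks that overlap H3K4me3 but not H3K27ac receive no label in this sketch, because the text defines neither signature for that case.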

Comparison with eQTL-mediated heritability

First, we estimated h2med/h2SNP of eQTLs (i.e. h2med; eQTL/h2SNP) the same way as we did for that of caQTLs. Then, we also estimated h2med/h2SNP of caQTLs and eQTLs together (i.e. h2med; caQTL ∪ eQTL) with MESC meta-analysis (Yao et al., 2020). caQTLs and eQTLs were also stratified as separate sets to account for potential differences in the relationship of QTL cis-heritability and h2med. This meta-analyzed estimate is effectively the amount of heritability mediated by either caQTLs or eQTLs in LCLs. This estimate reveals the overall relationship between heritability mediated by caQTLs and eQTLs. For instance, the estimate of the heritability mediated exclusively by caQTLs would be h2med; just caQTL = h2med; caQTL ∪ eQTL – h2med; eQTL. The estimate of mediated heritability shared by caQTLs and eQTLs is h2med; caQTL ∩ eQTL = (h2med; caQTL + h2med; eQTL) – h2med; caQTL ∪ eQTL.
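The partition of mediated heritability follows from inclusion-exclusion on the three estimates. A minimal Python sketch is given below; the function name and the example values in the note are hypothetical and are not estimates from this study. The heritability mediated exclusively by one QTL type is the union estimate minus the estimate for the other QTL type.

```python
def partition_mediated_h2(h2med_caqtl, h2med_eqtl, h2med_union):
    """Split mediated heritability into shared and exclusive components.

    Inputs are the caQTL-only analysis, the eQTL-only analysis, and the
    joint (caQTL or eQTL) analysis, all on the same scale.
    """
    shared = (h2med_caqtl + h2med_eqtl) - h2med_union
    just_caqtl = h2med_union - h2med_eqtl
    just_eqtl = h2med_union - h2med_caqtl
    return {"just_caQTL": just_caqtl,
            "just_eQTL": just_eqtl,
            "shared": shared}
```

With hypothetical inputs of 0.3 (caQTL), 0.2 (eQTL), and 0.4 (union), the sketch returns 0.2 exclusive to caQTLs, 0.1 exclusive to eQTLs, and 0.1 shared, which sum to the union estimate.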

Colocalization analyses

caQTL and eQTL colocalization with IMD GWAS

First, we selected candidate colocalization loci by filtering for overlapping ‘significant’ variants. The candidate loci met the following conditions: (1) lead IMD association at p<10–6, (2) lead caQTL or eQTL association at p<10–4, and (3) at least one variant simultaneously showed caQTL or eQTL χ2 statistics greater than 0.8×lead χ2 statistics for the caQTL or eQTL and IMD association χ2 statistics greater than 0.8 × χ2 statistics for the IMD lead variant in the locus. Then, we applied gwas-pw (Pickrell et al., 2016) on the variants within 100 kb of the lead variant. We considered loci with posterior probability of colocalization (PP3)>0.98 to be colocalized (Kundu et al., 2022).
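The three conditions for selecting candidate loci can be expressed as a simple filter. The Python sketch below is an illustration under assumptions: the per-variant dictionary keys (`gwas_p`, `gwas_chi2`, `qtl_p`, `qtl_chi2`) and the function name are hypothetical, and the input is assumed to contain the variants of one locus with both GWAS and QTL statistics.

```python
def is_candidate_locus(variants, gwas_p_max=1e-6, qtl_p_max=1e-4, frac=0.8):
    """Check the three candidate-locus conditions for one locus.

    `variants` is a list of dicts with keys 'gwas_p', 'gwas_chi2',
    'qtl_p', and 'qtl_chi2' (hypothetical field names).
    """
    if not variants:
        return False
    # (1) lead IMD association and (2) lead QTL association
    if min(v["gwas_p"] for v in variants) >= gwas_p_max:
        return False
    if min(v["qtl_p"] for v in variants) >= qtl_p_max:
        return False
    # (3) at least one variant is near the lead statistic in both datasets
    lead_gwas = max(v["gwas_chi2"] for v in variants)
    lead_qtl = max(v["qtl_chi2"] for v in variants)
    return any(v["gwas_chi2"] > frac * lead_gwas and
               v["qtl_chi2"] > frac * lead_qtl
               for v in variants)
```

The third condition screens out loci where the strongest GWAS and QTL variants are clearly distinct, before the more expensive colocalization model is run.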

Colocalization of caQTL with eQTL

We performed a colocalization analysis of caQTLs and eQTLs to curate a set of loci where the same genetic signal likely explains both associations. The pairs of caQTLs and eQTLs reveal the distance between the regulatory element and the target gene’s TSS. We selected candidate colocalization loci with: (1) IMD association at p<10–6, (2) QTL association at p<10–4, and (3) the two lead variants showed LD r2>0.8. We applied gwas-pw (Pickrell et al., 2016) on the variants within 200 kb of the tested caQTL peak. We considered loci with posterior probability of colocalization (PP3)>0.98 to be colocalized (Kundu et al., 2022).

Overlap of protein factor ChIP-seq and colocalized caQTL peaks

We searched for any protein factors detected at the colocalized caQTL peaks using the Cistrome database (Liu et al., 2011), which we accessed on July 24, 2024. We considered only ChIP-seq data from immune cell types, progenitors, and stem cells that can differentiate into immune cells. For each ChIP-seq peak, we searched for those that show overlap of more than 50% with one of 305 caQTL peaks that colocalized with IMD GWAS signals (Supplementary file 1A). Each protein factor detected in the caQTL peaks and the cell type used to generate the ChIP-seq data are listed in Supplementary file 1C.

Enrichment of biological processes

To test whether colocalized genes are likely relevant to autoimmune diseases, we surveyed which Biological Process GO terms were overrepresented compared to all the genes within 500 kb of each IMD association signal. Enrichment of biological processes was evaluated using Protein Analysis Through Evolutionary Relationships (PANTHER) (Mi et al., 2019). The foreground list comprised all of the genes whose eQTL signal colocalized with one of the seven autoimmune diseases (CD, IBD, MS, PBC, RA, SLE, and UC). The background list of genes was all of the genes within 500 kb of each IMD lead variant for which we observed colocalization. Moreover, we tested whether colocalized caQTLs without eQTLs are closer to genes related to immune processes than expected by chance. For this, we used Genomic Regions Enrichment of Annotations Tool (GREAT) (McLean et al., 2010). The foreground list comprised ATAC-seq peaks at IMD loci showing colocalization only with caQTLs and not with eQTLs. The background list is all ATAC-seq peaks identified in LCLs that we tested for caQTL association.

Relationship between peak-to-TSS distance and cis-heritability of caQTL and eQTL

The distance between the colocalized caQTL peak and the eGene’s TSS (i.e. peak-to-TSS distance) was determined to be the shortest distance from one end of the caQTL peak to the TSS. The pairs of caQTLs and eQTLs were split into quintiles based on their peak-to-TSS distance from closest to farthest.
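The distance definition and the quintile split can be sketched in a few lines of Python. The function names are hypothetical, and the equal-count, rank-based split (with ties broken by input order) is an assumption about how the quintiles were formed. The distance function follows the definition literally, so a TSS inside a peak is assigned its distance to the nearer peak end.

```python
def peak_to_tss_distance(peak_start, peak_end, tss):
    """Shortest distance from either end of the peak to the TSS."""
    return min(abs(peak_start - tss), abs(peak_end - tss))


def assign_quintiles(distances):
    """Assign each caQTL-eQTL pair to a quintile (1 = closest, 5 = farthest)
    by the rank of its peak-to-TSS distance."""
    n = len(distances)
    order = sorted(range(n), key=lambda i: distances[i])
    quintiles = [0] * n
    for rank, i in enumerate(order):
        quintiles[i] = rank * 5 // n + 1
    return quintiles
```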

MESC analysis uses REML implemented in GCTA (Yang et al., 2011) to estimate QTL cis-heritability. We referred to its output and compared the cis-heritability of caQTLs and eQTLs based on peak-to-TSS distance. To visualize the distribution of cis-heritability estimates, we grouped pairs of colocalized caQTLs and eQTLs based on peak-to-TSS distance quintiles.

LCL eQTL meta-analysis

We downloaded LCL eQTL data from three studies through eQTL Catalogue release 6 (Kerimov et al., 2021). The sample sizes were 190, 147, and 418 for GENCORD (Gutierrez-Arcelus et al., 2013), GTEx (Aguet et al., 2020), and TwinsUK (Buil et al., 2015), respectively. We meta-analyzed the summary statistics using the inverse variance weighted fixed effects model. If a variant was missing in a subset of the studies, then only the available statistics were meta-analyzed. We used these meta-analyzed statistics to perform colocalization the same way as earlier colocalization analyses.
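For reference, the inverse variance weighted fixed effects model combines per-study effect sizes with weights equal to the reciprocal of their squared standard errors. The Python sketch below illustrates this for a single variant-gene pair; the function name, the use of `None` for a study in which the variant is missing, and the normal-approximation p-value are assumptions of this example rather than details of the analysis code.

```python
import math


def ivw_fixed_effects(estimates):
    """Combine per-study (beta, se) pairs for one variant-gene pair.

    `estimates` has one entry per study; None marks a study in which the
    variant is missing, so only the available statistics are combined.
    Returns (beta, se, z, p) or None if no study has the variant.
    """
    available = [e for e in estimates if e is not None]
    if not available:
        return None
    weights = [1.0 / se ** 2 for _, se in available]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, available)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value under a standard normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p
```

Because the combined standard error shrinks as studies are added, an association that is modest in each study can become much stronger after meta-analysis, which is the behavior described for CIITA in the Results.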

Colocalization analysis with immune cell eQTL data

We downloaded eQTL data for various immune cell types from the eQTL Catalogue release 6 (Kerimov et al., 2021). The source studies from the eQTL Catalogue include BLUEPRINT (Chen et al., 2016), DICE (Schmiedel et al., 2018), Alasoo et al., 2018, and Bossini-Castillo et al., 2022. The represented immune cell types include T cell subtypes, B cells, neutrophils, and monocytes. We also separately downloaded data for CD4+ T cells with and without stimulation (Soskic et al., 2022). The selection of candidate loci and colocalization analysis on them followed the same procedure as that for other QTLs.

 We also evaluated whether meta-analysis of eQTL data can increase the number of loci with eQTL colocalization. We meta-analyzed eQTL data for three immune cell types with multiple sources – naïve CD4+ T cell, monocyte, and memory Treg – using the inverse variance weighted fixed effects model. Specifically, we meta-analyzed naïve CD4+ T cell eQTL summary statistics from Soskic et al., 2022, BLUEPRINT (Chen et al., 2016), and DICE (Schmiedel et al., 2018). We meta-analyzed monocyte eQTL summary statistics from BLUEPRINT (Chen et al., 2016) and DICE (Schmiedel et al., 2018). We meta-analyzed memory Treg eQTL summary statistics from Bossini-Castillo et al., 2022, and DICE (Schmiedel et al., 2018). We used these meta-analyzed statistics to perform colocalization the same way as earlier colocalization analyses.

Data availability

Processed data and code for generating the figures presented in the manuscript are available at https://github.com/BulykLab/IMD-colocalization-manuscript-figures (copy archived at Jeong, 2024).

The following previously published data sets were used
    1. Byrska-Bishop M, Evani US, Zhao X (2022) The International Genome Sample Resource. ID 30x-grch38. High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios.
    2. Kumasaka N, Knights AJ, Gaffney DJ (2018) European Nucleotide Archive. ID PRJEB28318. High-resolution genetic mapping of putative causal interactions between regions of open chromatin.
    3. Lappalainen T, Sammeth M, Friedländer MR (2013) Expression atlas. ID E-GEUV-1. Transcriptome and genome sequencing uncovers functional variation in humans.
    4. Cordell HJ, Fryett JJ, Ueno K (2021) GWAS Catalog. ID GCST90061440. An international genome-wide meta-analysis of primary biliary cholangitis: Novel risk loci and candidate drugs.
    5. Dubois PC, Trynka G, Franke L (2010) GWAS Catalog. ID GCST000612. Multiple common variants for celiac disease influencing immune gene expression.
    6. de Lange KM, Moutsianas L, Lee JC (2017) GWAS Catalog. ID GCST004131. Genome-wide association study implicates immune activation of multiple integrin genes in inflammatory bowel disease.
    7. López-Isac E, Smith SL, Marion MC (2021) GWAS Catalog. ID GCST90010715. Combined genetic analysis of juvenile idiopathic arthritis clinical subtypes identifies novel risk loci, target genes and key regulatory mechanisms.
    8. Ishigaki K, Sakaue S, Terao C (2022) GWAS Catalog. ID GCST90132223. Multi-ancestry genome-wide association analyses identify novel genetic mechanisms in rheumatoid arthritis.
    9. Bentham J, Morris DL, Graham DSC (2015) GWAS Catalog. ID GCST003156. Genetic association analyses implicate aberrant regulation of innate and adaptive immunity genes in the pathogenesis of systemic lupus erythematosus.
    10. Jin Y, Andersen G, Yorgov D (2016) GWAS Catalog. ID GCST004785. Genome-wide association studies of autoimmune vitiligo identify 23 new risk loci and highlight key pathways and regulatory variants.
    11. Schmiedel BJ, Singh D, Madrigal A (2018) eQTL Catalogue. ID QTS000026. Impact of Genetic Polymorphisms on Human Immune Cell Gene Expression.
    12. Chen L, Ge B, Casale FP (2016) eQTL Catalogue. ID QTS000002. Genetic Drivers of Epigenetic and Transcriptional Variation in Human Immune Cells.
    13. Bossini-Castillo L, Glinos DA, Kunowska N (2022) eQTL Catalogue. ID QTS000003. Immune disease variants modulate gene expression in regulatory CD4+ T cells.
    14. Soskic B, Cano-Gamez E, Smyth DJ (2022) Zenodo. Immune disease risk variants regulate gene expression dynamics during CD4+ T cell activation. https://doi.org/10.5281/zenodo.6006795
    15. Alasoo K, Rodrigues J, Mukhopadhyay S (2018) eQTL Catalogue. ID QTS000001. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response.
    16. de Lange KM, Moutsianas L, Lee JC (2017) GWAS Catalog. ID GCST004132. Genome-wide association study implicates immune activation of multiple integrin genes in inflammatory bowel disease.
    17. de Lange KM, Moutsianas L, Lee JC (2017) GWAS Catalog. ID GCST004133. Genome-wide association study implicates immune activation of multiple integrin genes in inflammatory bowel disease.

Article and author information

Author details

  1. Raehoon Jeong

    1. Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, United States
    2. Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, United States
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-9840-4692
  2. Martha L Bulyk

    1. Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, United States
    2. Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, United States
    3. Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, Boston, United States
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Methodology, Project administration, Writing – review and editing
    For correspondence
    mlbulyk@genetics.med.harvard.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-3456-4555

Funding

National Human Genome Research Institute (R01 HG010501)

  • Martha L Bulyk

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Vijay Sankaran, Shamil Sunyaev, Alexander Gusev, and members of the Raychaudhuri lab for helpful discussion. This work was funded by NIH grant R01 HG010501.

Version history

  1. Preprint posted:
  2. Sent for peer review:
  3. Reviewed Preprint version 1:
  4. Reviewed Preprint version 2:
  5. Version of Record published:

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.98289. This DOI represents all versions, and will always resolve to the latest one.

Copyright

© 2024, Jeong and Bulyk

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 2,017 views
  • 80 downloads
  • 5 citations

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Cite this article

  1. Raehoon Jeong
  2. Martha L Bulyk
(2025)
Chromatin accessibility variation provides insights into missing regulation underlying immune-mediated diseases
eLife 13:RP98289.
https://doi.org/10.7554/eLife.98289.3
