Chromatin accessibility variation provides insights into missing regulation underlying immune-mediated diseases

  1. Raehoon Jeong
  2. Martha L Bulyk (corresponding author)
  1. Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, United States
  2. Bioinformatics and Integrative Genomics Graduate Program, Harvard University, United States
  3. Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, United States
6 figures and 2 additional files

Figures

Figure 1 with 3 supplements
Immune disease heritability mediated by chromatin accessibility in lymphoblastoid cell lines (LCLs).

(A–B) Heritability enrichment in accessible regions in LCLs, based on (A) stratified linkage disequilibrium (LD) score regression (S-LDSC) and (B) the proportion of heritability mediated by chromatin accessibility quantitative trait loci (caQTLs) in LCLs based on mediated expression score regression (MESC). For both, error bars represent jackknife standard errors of the mean. The color of the bars indicates disease type. *: p<0.003125 (Bonferroni-corrected). (C) Mediated heritability enrichment of accessible regions by peak strength quintile. The strongest peaks are in the 1st quintile, and the weakest peaks are in the 5th quintile. Only enrichment values with FDR < 5% (based on q-value) are shown. Stronger and more significant enrichment indicates that mediated heritability is concentrated in that subset. (D) Mediated heritability enrichment of accessible regions by histone mark annotation. The percentages in parentheses represent the proportion of accessible regions with the indicated histone mark. Only enrichment values with FDR < 5% (based on q-value) are shown. Color and size of the points are on the same scale as in (C). PBC, primary biliary cholangitis; MS, multiple sclerosis; CEL, celiac disease; RA, rheumatoid arthritis; JIA, juvenile idiopathic arthritis; SLE, systemic lupus erythematosus; CD, Crohn’s disease; UC, ulcerative colitis; IBD, inflammatory bowel disease; ATD, autoimmune thyroid disease; VIT, vitiligo; AST, asthma; ALL, allergies; SCZ, schizophrenia; T2D, type 2 diabetes; CAD, coronary artery disease.
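The Bonferroni threshold quoted in the caption is consistent with dividing a family-wise significance level by the number of traits tested. The sketch below reproduces that arithmetic. The significance level of 0.05 and the count of 16 tests are assumptions inferred from the 16 trait abbreviations listed in the caption; the caption itself does not state them.

```python
# Assumption: one test per trait abbreviation listed in the Figure 1 caption.
TRAITS = [
    "PBC", "MS", "CEL", "RA", "JIA", "SLE", "CD", "UC",
    "IBD", "ATD", "VIT", "AST", "ALL", "SCZ", "T2D", "CAD",
]


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumption: a family-wise alpha of 0.05 (not stated in the caption).
threshold = bonferroni_threshold(0.05, len(TRAITS))
print(f"{len(TRAITS)} tests, per-test threshold = {threshold}")
```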

Figure 1—figure supplement 1
Analysis workflow in this study.

(A) caQTL analysis workflow. (B) eQTL analysis workflow.

Figure 1—figure supplement 2
Proportion of chromatin accessibility quantitative trait locus (caQTL)-mediated immune-mediated disease (IMD) heritability explained by ATAC peaks with various histone marks.

The ‘Only ATAC’ peak set includes peaks without any of the three histone marks (H3K27ac, H3K4me1, H3K4me3). Percentages for each peak set denote the proportion of ATAC peaks in that set. The error bars represent jackknife standard errors of the mean. ELS, enhancer-like signature; PLS, promoter-like signature.
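Several captions report jackknife standard errors. As a rough illustration of what such an error bar summarizes, the sketch below computes a generic delete-one jackknife standard error from a list of leave-one-out estimates. Heritability tools commonly apply the same idea to blocks of the genome rather than to single observations; the function name and the toy data are hypothetical, and this is not the authors' implementation.

```python
import math
from typing import List


def jackknife_se(leave_one_out: List[float]) -> float:
    """Delete-one jackknife standard error.

    leave_one_out[i] is the estimate recomputed with unit (or block) i removed.
    SE = sqrt((n - 1) / n * sum((theta_i - mean(theta))^2)).
    """
    n = len(leave_one_out)
    if n < 2:
        raise ValueError("need at least two leave-one-out estimates")
    mean = sum(leave_one_out) / n
    squared_dev = sum((theta - mean) ** 2 for theta in leave_one_out)
    return math.sqrt((n - 1) / n * squared_dev)


# Toy example (hypothetical data): the estimator is the sample mean.
data = [2.0, 4.0, 6.0, 8.0]
loo_means = [(sum(data) - x) / (len(data) - 1) for x in data]
se = jackknife_se(loo_means)
print(f"jackknife SE of the mean = {se:.4f}")
```

For the sample mean, the jackknife standard error coincides with the usual standard error of the mean, which gives a quick sanity check on the formula.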

Figure 1—figure supplement 3
Properties of ATAC peaks with various histone marks.

(A) Size of ATAC peaks with various histone marks. Peaks with size greater than 10,000 bp are discarded. *: p<2.2 × 10⁻¹⁶, Wilcoxon rank-sum test. (B) Proportion of ATAC peaks that are within the given transcription start site (TSS) window around genes expressed in lymphoblastoid cell lines (LCLs). Percentages for each TSS window denote the proportion of ATAC peaks in that window. The ‘Only ATAC’ peak set includes peaks without any of the three histone marks (H3K27ac, H3K4me1, H3K4me3).

Figure 2 with 4 supplements
Immune-mediated disease (IMD) heritability mediated by chromatin accessibility quantitative trait loci (caQTLs) and expression quantitative trait loci (eQTLs).

(A) h²_med/h²_SNP estimates of various IMDs for caQTLs, eQTLs, and their union. The error bars represent jackknife standard errors of the mean. AID: autoimmune disease. Disease abbreviations along the x-axis are as in Figure 1. (B) Schema of the potential causal relationships between genetic variants, caQTLs, eQTLs, and IMD risk. The two diagrams depict possible h²_med/h²_SNP trends depending on the causal relationship between caQTLs and eQTLs. (C) Number of IMD-associated loci colocalized with caQTLs or eQTLs. The proportion is out of the total number of IMD-associated (p<10⁻⁶) loci. (D) ELMO1 locus plot showing association to rheumatoid arthritis (RA), primary biliary cholangitis (PBC), multiple sclerosis (MS), chromatin accessibility, and ELMO1 expression in lymphoblastoid cell lines (LCLs). Purple shading in the gene plot at the bottom indicates the caQTL peak, and the purple diamond is the lead variant (rs60600003) that is within that peak. The other variants are colored by the degree of linkage disequilibrium (LD) with the annotated variant. (E) Enrichment of the Biological Process Gene Ontology (GO) terms of genes in proximity to IMD-colocalized caQTLs without eQTL colocalization.

Figure 2—figure supplement 1
Relationship between immune-mediated disease (IMD) heritability mediated by chromatin accessibility quantitative trait loci (caQTLs) and expression quantitative trait loci (eQTLs).

The numbers in the bars denote the estimated proportion of h²_med/h²_SNP by each subset. They are derived from h²_med;caQTL/h²_SNP, h²_med;eQTL/h²_SNP, and h²_med;caQTL ∪ eQTL/h²_SNP as described in the Methods.
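One way three such estimates could be turned into per-subset proportions is simple inclusion-exclusion over the caQTL and eQTL sets. The sketch below assumes that approach; the Methods are not part of this section, so the exact derivation used by the authors may differ. The function name and the input values are hypothetical and are not results from the paper.

```python
from typing import Dict


def decompose_mediated_h2(ca: float, e: float, union: float) -> Dict[str, float]:
    """Split mediated-heritability proportions by inclusion-exclusion (assumed).

    ca    : proportion mediated by caQTLs
    e     : proportion mediated by eQTLs
    union : proportion mediated by the union of caQTLs and eQTLs
    """
    return {
        "caQTL_only": union - e,    # mediated by caQTLs but not by eQTLs
        "eQTL_only": union - ca,    # mediated by eQTLs but not by caQTLs
        "shared": ca + e - union,   # mediated by both
    }


# Hypothetical inputs chosen only to illustrate the arithmetic.
parts = decompose_mediated_h2(ca=0.30, e=0.20, union=0.40)
for name, value in parts.items():
    print(f"{name}: {value:.2f}")
```

Because each input is a noisy estimate, a component computed this way can come out slightly negative even when the true value is not.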

Figure 2—figure supplement 2
Protein factors detected at colocalized chromatin accessibility quantitative trait locus (caQTL) by ChIP-seq.

Number of unique ATAC-seq peaks overlapping ChIP-seq peaks of the corresponding protein factors in Cistrome data. In total, 305 unique ATAC-seq peaks were considered. An overlap was counted only if it covered at least 50% of the ChIP-seq peak. Only the top 20 factors are shown.
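The overlap rule can be sketched as follows, assuming that the 50% refers to the fraction of the ChIP-seq peak length covered by an ATAC-seq peak. The peak coordinates, the half-open interval convention, and the function names are illustrative assumptions rather than details taken from the paper.

```python
from typing import Iterable, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def chip_overlap_fraction(chip: Peak, atac: Peak) -> float:
    """Fraction of the ChIP-seq peak length that the ATAC-seq peak covers."""
    if chip[0] != atac[0]:
        return 0.0
    overlap = min(chip[2], atac[2]) - max(chip[1], atac[1])
    return max(0, overlap) / (chip[2] - chip[1])


def count_atac_peaks_with_factor(
    atac_peaks: Iterable[Peak],
    chip_peaks: Iterable[Peak],
    min_fraction: float = 0.5,
) -> int:
    """Count unique ATAC-seq peaks hit by at least one qualifying ChIP-seq peak."""
    chip_list = list(chip_peaks)
    return sum(
        1
        for atac in set(atac_peaks)
        if any(chip_overlap_fraction(chip, atac) >= min_fraction for chip in chip_list)
    )


# Hypothetical peaks for one protein factor.
atac = [("chr1", 100, 600), ("chr1", 1000, 1500), ("chr2", 100, 600)]
chip = [("chr1", 400, 800), ("chr1", 1100, 1300)]
n_hits = count_atac_peaks_with_factor(atac, chip)
print(n_hits)
```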

Figure 2—figure supplement 3
Biological processes enriched in immune-mediated disease (IMD)-colocalized expression quantitative trait loci (eQTLs).

Top 20 enriched Gene Ontology (GO) biological processes are shown. The p-values were calculated using a binomial test in the Protein Analysis Through Evolutionary Relationships (PANTHER) (Mi et al., 2019) tool.

Figure 2—figure supplement 4
Immune-mediated disease (IMD) genome-wide association study (GWAS) colocalization with autoimmune disease drug target gene expression quantitative trait locus (eQTL) in lymphoblastoid cell lines (LCLs).

(A–D) Association plots for IMD GWAS, LCL chromatin accessibility quantitative trait loci (caQTLs), and IL6R (A and C) and IL12A (B and D) eQTLs. Panels A and B show significant eQTL colocalization to Crohn’s disease (CD) and primary biliary cholangitis (PBC), respectively. Panels C and D show lack of colocalization to rheumatoid arthritis (RA) and Crohn’s disease (CD), respectively, which are the diseases that the respective drugs treat. In each panel, the yellow shade signifies the caQTL peak, and the purple diamond shows a strongly associated variant that is within that peak. The other variants are colored by the degree of LD with the annotated variant. PBC GWAS data in (B) did not have the highlighted variant (rs4679867), so the purple diamond is missing.

Figure 3
Effect of peak-to-TSS distance on cis-heritability of chromatin accessibility quantitative trait loci (caQTLs) and expression quantitative trait loci (eQTLs) and immune-mediated disease (IMD) heritability.

(A) Distribution of cis-heritability (h²_cis) of caQTLs and eQTLs by peak-to-TSS distance quintiles. The ranges of peak-to-TSS distance are shown in parentheses. The comparisons shown on the top (in respective colors) are between the nearest and each subsequent quintile of the respective QTL h²_cis distribution (i.e. one-sided Wilcoxon rank-sum test). The comparisons shown on the bottom (in black) are between caQTL and eQTL h²_cis distribution (one-sided paired Wilcoxon rank-sum test). *: p<10⁻⁴, **: p<10⁻¹⁰, ***: p<10⁻²⁰, and ns: p>0.05. (B) Regression estimates and their standard errors of the linear regression model testing the effects of caQTL h²_cis and peak-to-TSS distance on eQTL h²_cis. Peak-to-TSS distance was expressed in units of 100 kb to neatly visualize the effect size estimates. The error bars represent standard errors of the regression estimate. SE: standard error. (C) Proportion of caQTL-mediated IMD heritability explained by ATAC peaks within various TSS windows. Percentage for each TSS window denotes the proportion of ATAC peaks in that window. The error bars represent jackknife standard errors of the mean. Disease abbreviations along the x-axis are as in Figure 1. (D) A model of the relationship between peak-to-TSS distance and power to detect a corresponding caQTL or eQTL. The thickness of the arrows indicates the variant effect size on chromatin accessibility (yellow) or gene expression (blue). TSS, transcription start site; CRE, cis-regulatory element.
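Panels (A) and (B) rely on two simple transformations of peak-to-TSS distance: grouping pairs into distance quintiles and rescaling distance to units of 100 kb. The sketch below shows one plausible way to do both. The rank-based quintile rule, the function names, and the distances are assumptions for illustration; the caption does not specify how ties or uneven group sizes were handled.

```python
from typing import List


def assign_quintiles(distances: List[float]) -> List[int]:
    """Rank-based quintile labels: 1 = nearest fifth, 5 = farthest fifth."""
    n = len(distances)
    order = sorted(range(n), key=lambda i: distances[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // n + 1
    return labels


def to_100kb_units(distance_bp: float) -> float:
    """Rescale a distance in base pairs to units of 100 kb."""
    return distance_bp / 100_000


# Hypothetical peak-to-TSS distances in base pairs.
distances_bp = [10, 500, 20, 400, 30, 300, 40, 200, 50, 100]
quintiles = assign_quintiles(distances_bp)
print(quintiles)
print(to_100kb_units(250_000))
```

Rescaling the predictor this way multiplies its regression coefficient by 100,000 without changing the fit, which is why it mainly serves readability of the plotted estimates.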

Figure 4 with 2 supplements
Additional colocalization of chromatin accessibility quantitative trait locus (caQTL)-colocalized immune-mediated disease (IMD) loci with meta-analyzed lymphoblastoid cell line (LCL) expression quantitative trait locus (eQTL) data.

(A) Number of caQTL-colocalized IMD loci that showed eQTL colocalization in LCLs. Disease abbreviations along the x-axis are as in Figure 1. (B) Distribution of peak-to-TSS distance of all caQTL-eQTL pairs and of those colocalized with IMD association. The number of loci in each category is shown in parentheses. *: p<0.01, ns: p>0.05. (C) POU3F1 eQTL that became significantly colocalized with rheumatoid arthritis (RA) association by meta-analyzing LCL eQTL data. Purple shading in the gene plot at the bottom indicates the caQTL peak, and the purple diamond is the lead variant (rs60600003) that is within that peak. The other variants are colored by the degree of linkage disequilibrium (LD) with the annotated variant.

Figure 4—figure supplement 1
Immune-mediated disease (IMD) genome-wide association study (GWAS) colocalization with meta-analyzed lymphoblastoid cell line (LCL) expression quantitative trait loci (eQTLs).

(A–C) Association plots for IMD GWAS, chromatin accessibility quantitative trait loci (caQTLs), and CIITA (A), ATG16L1 (B), and CARD9 (C) eQTLs. GEUVADIS eQTLs did not colocalize with the GWAS signals, but meta-analyzed eQTLs did. In each panel, the yellow shade signifies the caQTL peak, and the purple diamond shows a strongly associated variant that is within that peak. The other variants are colored by the degree of linkage disequilibrium (LD) with the annotated variant.

Figure 4—figure supplement 2
Chromatin accessibility and histone mark levels at immune-mediated disease (IMD)-associated lymphoblastoid cell line (LCL) chromatin accessibility quantitative trait loci (caQTLs).

IMD-associated caQTLs are separated by whether expression quantitative trait locus (eQTL) also colocalized with IMD association. (Top) Average profile of fold enrichment values of each assay in the 4 kb window centered at the caQTL peak center. (Bottom) Heatmap of fold enrichment values of each assay in the 4 kb window centered at the caQTL peak center.

Figure 5 with 1 supplement
Added utility of various immune cell expression quantitative trait locus (eQTL) data.

(A) Number of loci that additionally colocalized with eQTLs by lymphoblastoid cell line (LCL) meta-analysis (orange) and immune cell data (cyan) compared to the original analysis with Geuvadis LCL eQTL data (purple). The height of the bar is the proportion of loci with eQTL colocalization out of the total immune-mediated disease (IMD) loci with chromatin accessibility quantitative trait locus (caQTL) colocalization in the earlier analysis. Disease abbreviations along the x-axis are as in Figure 1. (B) Relationship between the number of multiple sclerosis (MS) genome-wide association study (GWAS) loci with eQTL colocalization and sample size for each eQTL dataset. Meta-analyzed eQTL data are labeled with their cell types. NK cell, natural killer cell; Treg, regulatory T cell.

Figure 5—figure supplement 1
Chromatin accessibility and H3K27ac mark levels at immune-mediated disease (IMD)-associated chromatin accessibility quantitative trait loci (caQTLs) with respect to monocyte expression quantitative trait locus (eQTL) colocalization.

IMD-associated caQTLs without lymphoblastoid cell line (LCL) eQTL colocalization are separated by whether monocyte eQTL colocalized with IMD association. (Top) Average profile of fold enrichment values of each assay in the 3 kb window centered at the caQTL peak center. (Bottom) Heatmap of fold enrichment values of each assay in the 3 kb window centered at the caQTL peak center.

Figure 6
Immune-mediated disease (IMD) loci that colocalized with an expression quantitative trait locus (eQTL) in monocytes, but not in lymphoblastoid cell lines (LCLs), even though they colocalized with chromatin accessibility quantitative trait loci (caQTLs) in LCLs.

(A) TNFSF15 locus plot showing genetic association to inflammatory bowel disease (IBD), chromatin accessibility in LCLs, and TNFSF15 expression in LCLs and monocytes. Purple shading in the gene plot at the bottom indicates the caQTL peak, and the purple diamond shows a strongly associated variant (rs7848647) that is within that peak. The other variants are colored by the degree of linkage disequilibrium (LD) with the annotated variant. (B) Expression levels of genes in LCLs and monocytes colored according to their eQTL colocalization outcome. TPM: transcripts per million. (C) Number of genes with eQTL colocalization in monocytes, but not in LCLs, separated by the gene’s expression level in LCLs (column) and whether it is lower or higher than that in monocytes (row).
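Panel (B) reports expression in transcripts per million (TPM). For readers unfamiliar with the unit, the sketch below applies the standard TPM definition: read counts are divided by transcript length, then scaled so that the values in a sample sum to one million. The counts and lengths are made up, and the function is a generic illustration rather than the quantification pipeline used in the study.

```python
from typing import List


def tpm(counts: List[float], lengths_kb: List[float]) -> List[float]:
    """Transcripts per million from read counts and transcript lengths (in kb)."""
    if len(counts) != len(lengths_kb):
        raise ValueError("counts and lengths_kb must have the same length")
    # Length-normalized rate for each transcript (reads per kilobase).
    rates = [count / length for count, length in zip(counts, lengths_kb)]
    total = sum(rates)
    if total == 0:
        raise ValueError("all counts are zero")
    # Scale so that the values sum to one million within the sample.
    return [rate / total * 1_000_000 for rate in rates]


# Hypothetical genes: longer genes collect more reads at equal abundance.
values = tpm(counts=[10, 20, 30], lengths_kb=[1.0, 2.0, 3.0])
print([round(v, 1) for v in values])
```

Because TPM values sum to the same total in every sample, they are convenient for comparing a gene's relative expression between cell types such as LCLs and monocytes.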

Cite this article

  1. Raehoon Jeong
  2. Martha L Bulyk
(2025)
Chromatin accessibility variation provides insights into missing regulation underlying immune-mediated diseases
eLife 13:RP98289.
https://doi.org/10.7554/eLife.98289.3