STAT3-mediated allelic imbalance of novel genetic variant Rs1047643 and B-cell-specific super-enhancer in association with systemic lupus erythematosus

  1. Yanfeng Zhang (corresponding author)
  2. Kenneth Day
  3. Devin M Absher (corresponding author)

  1. HudsonAlpha Institute for Biotechnology, United States
  2. Zymo Research Corp, United States
7 figures, 1 table and 6 additional files

Figures

Schematic of the study design.

A two-stage study was designed on the basis of the features of the functional genomic data. A summary of the data sets is available in Supplementary files 1-2.

Figure 2 with 3 supplements
Change of allelic chromatin accessibility and expression in B cell subtypes from SLE patients and controls.

(A) Forest plot showing AI of allelic chromatin state of SNP rs1047643 in both resting naive (rN) and activated (Non-rN) B cells in patients with SLE compared with healthy controls. The p-value per study and the combined p-value (summary) are calculated based on the linear regression model and Fisher’s method, respectively. The plot in the right panel displays the 95% confidence interval of the beta-value. (B–C) Boxplots showing allelic expression of SNP rs1047643 in both rN and activated B cells in patients with SLE as compared with healthy individuals. All raw data are available in Figure 2—source data 1.
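The caption for panel A names Fisher’s method as the way per-study p-values are combined into a summary p-value. The following is a minimal sketch of that general technique, not the authors' code: the function name `fisher_combined_p` and the example p-values are hypothetical, and the per-study tests are assumed to be independent.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, where k = len(p_values).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # For even degrees of freedom (2k) the chi-square survival function has
    # a closed form: exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical per-study p-values, for illustration only.
example = [0.04, 0.20, 0.01]
combined = fisher_combined_p(example)
```

With a single study the combined value equals the input p-value, which is a quick sanity check on the closed-form survival function used here.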

Figure 2—source data 1

Source files for presenting results in Figure 2.

This zip archive contains all source data used for the quantitative analyses shown in Figure 2.

https://cdn.elifesciences.org/articles/72837/elife-72837-fig2-data1-v2.zip
Figure 2—figure supplement 1
Change of allelic chromatin accessibility in B cell subtypes from SLE patients and controls.

Forest plots showing AI of allelic chromatin state of SNP rs246367 (A) and rs72642993 (B) in both rN and activated (Non-rN) B cells in patients with SLE compared with healthy controls. The plots in the right panel display the 95% confidence interval of the beta-value.

Figure 2—figure supplement 2
Expression pattern of FDFT1 and BLK across B cell subtypes in patients with SLE and healthy controls.

The data showing expression profiles for FDFT1 (A) and BLK (B) in B cell subtypes were from a case-control study (Accession ID: GSE118254).

Figure 2—figure supplement 3
Expression pattern of FDFT1 and BLK across B cell subtypes in patients with SLE and healthy controls.

The data showing expression profiles for FDFT1 (A) and BLK (B) in B cell subtypes were from a case-control study (Accession ID: GSE92387).

Figure 3 with 2 supplements
Association analysis and functional prediction of SNP rs1047643.

(A) Association results for the SNP rs1047643 with SLE risk in single marker analyses. MAF, minor allele frequency; OR, odds ratio; CI, confidence interval. Adjusted p-trend: after adjustment for 12 GWAS index SNPs (shown in E) in a logistic regression model. (B) Haplotype analyses of the two SNPs (SNP1: GWAS index SNP rs17807624; SNP2: rs1047643) in relation to SLE risk. Baseline (the reference haplotype) represents the alleles associated with a reduced risk in two SNPs. (C) Barplot showing the genomic length of chromHMM-annotated enhancer state on the super-enhancer region (blue highlighted in 3 C) in 43 epigenomes. (D) Plot showing the eQTL result of SNP rs1047643 in whole blood or B cells from three databases (shown on the y-axis). (E) Genomic annotations of the SNP rs1047643. The three tracks show locations of 13 GWAS index SNPs, gene annotation and 15-state chromatin segments in CD19+ B cells at the 8p23 locus, respectively. Vertical blue and purple lines represent the locations of the super-enhancer and SNP rs1047643, respectively. (F) Long-range interaction between a super-enhancer and SNP rs1047643. The two tracks show chromatin interactions from two independent studies using whole-genome Hi-C and capture Hi-C technologies, respectively. Orange curves show the interactions between the super-enhancer and the SNP rs1047643. (G) Heatmaps showing the 3D DNA interactions at the 8p23.1 locus in eight cell lines. The rectangle represents interactions between the super-enhancer and the SNP rs1047643. All raw data are available in Figure 3—source data 1.

Figure 3—source data 1

Source files for presenting results in Figure 3.

This zip archive contains all source data used for the quantitative analyses shown in Figure 3.

https://cdn.elifesciences.org/articles/72837/elife-72837-fig3-data1-v2.zip
Figure 3—figure supplement 1
Chromatin interactions with the FDFT1 promoter region (marked by a green arrow) at the 8p23 locus from CHi-C data with duplicates in two types of normal T cells.

The orange arrow represents the location of the super-enhancer identified in this study.

Figure 3—figure supplement 2
Heatmaps of long-range chromatin interactions from Hi-C data at the 8p23 locus at 10 kb (or 20 kb) resolution in a panel of human tissues (n = 9) from the 3D Genome Browser.

The circles on the heatmaps mark the interaction density between SNP rs1047643 and the SE region.

Figure 4 with 3 supplements
Aberration of super-enhancer and FDFT1 promoter region in B cell subtypes from SLE patients.

(A) Empirical cumulative distribution of TPM values per 50 bp window across the 7 kb SE region in B cell subsets for disease and control groups. (B) Plots showing the TPM values at the third quartile (Q3) across B cell subtypes as a comparison between SLE and controls. (C) Empirical cumulative distribution of TPM values on the SE region (same as shown in A) in a comparison between two groups across four B cell subtypes. (D) Boxplots showing the TPM values per 50 bp window at the FDFT1 promoter region in B cell subtypes for SLE and controls. The black lines and grey areas represent the linear regression results along B cell development from T3 to DN stages, and the 95% CI. (E) Plots showing the correlation between super-enhancer and FDFT1 promoter regions based on mean TPM values with respect to B cell subtypes in SLE and controls. (F) Wiggle plot showing the enrichment of open chromatin states at the 8p23.1 locus in B cell subtypes for two individuals (a healthy individual in the upper panel, and a patient with SLE in the lower panel). Purple and green vertical lines represent the locations of the super-enhancer and FDFT1 promoter, respectively. (G–H) Quantitative comparison of chromatin accessibility states in SE (G) and FDFT1 promoter regions (H) with respect to B cell subtypes. All raw data are available in Figure 4—source data 1.
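Panels A to C summarize per-window TPM values with an empirical cumulative distribution and its third quartile (Q3). The sketch below illustrates both summaries on made-up numbers; the helper names, the example values, and the choice of the "inclusive" quantile convention are assumptions for illustration and are not taken from the study's code.

```python
import bisect
import statistics


def ecdf(values):
    """Return a function F(x) giving the fraction of values <= x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one value")

    def cdf(x):
        # bisect_right counts how many sorted values are <= x
        return bisect.bisect_right(ordered, x) / n

    return cdf


def third_quartile(values):
    """Q3 under the 'inclusive' quantile convention (an assumption here)."""
    return statistics.quantiles(values, n=4, method="inclusive")[2]


# Hypothetical TPM values for consecutive 50 bp windows (illustration only).
tpm_windows = [1.0, 2.0, 3.0, 4.0, 5.0]
cdf = ecdf(tpm_windows)
q3 = third_quartile(tpm_windows)
```

Comparing `cdf` curves or `q3` values between two groups of windows mirrors the kind of SLE versus control comparison the panels describe, although the exact quantile convention used in the figure is not stated in the caption.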

Figure 4—source data 1

This txt file contains source data used for the quantitative analyses shown in Figure 4.

https://cdn.elifesciences.org/articles/72837/elife-72837-fig4-data1-v2.txt
Figure 4—figure supplement 1
Genome-wide background analysis of ATAC-seq data.

Left panel: empirical cumulative distribution of TPM values per 50 bp window across randomly selected regions (n = 2,000) in B cell subsets for disease and control groups. Right panel: plots showing the TPM values at the third quartile (Q3) across B cell subtypes as a comparison between SLE and controls.

Figure 4—figure supplement 2
Aberration of super-enhancer in resting naive B cell subtypes from SLE patients in relation to healthy controls.

(A) Wiggle plot showing the enrichment of open chromatin states at the 8p23.1 locus in resting naive B cells from eight individuals. Blue and purple vertical lines represent the locations of SE and FDFT1 promoter, respectively. (B–C) Quantitative comparison of chromatin accessibility states in the SE and FDFT1 promoter regions in naive B cells in a comparison between SLE and controls.

Figure 4—figure supplement 3
Super-enhancer activity in T cells and neutrophils from SLE patients and controls.

(A–B) Empirical cumulative distribution of TPM values per 50 bp window and enrichment of ATAC-seq reads (TPM value) across the SE region in neutrophil cell subsets from SLE patients and controls. (C–D) Empirical cumulative distribution of TPM values per 50 bp window and enrichment of ATAC-seq reads (TPM value) across the SE region in two T cell subsets from SLE patients. (E) Wiggle plot showing the enrichment of open chromatin states at 8p23.1 locus in neutrophils and T cells. Blue and purple vertical lines represent the locations of SE and FDFT1 promoter, respectively.

Figure 5 with 1 supplement
Hypomethylation in super-enhancer region in B cell subtypes from SLE patients.

(A) Boxplots showing the CpG methylation levels per 50 bp window in 7 kb SE region in B cell subtypes for SLE and control groups. The black and red lines represent the linear regression results towards the B cell development from rN to DN stages for SLE and controls, respectively. (B) Plots showing the correlation between TPM values (y-axis) and DNA methylation levels (x-axis) averaged over each B cell type in SLE and controls. All raw data are available in Figure 5—source data 1.

Figure 5—source data 1

This txt file contains source data used for the quantitative analyses shown in Figure 5.

https://cdn.elifesciences.org/articles/72837/elife-72837-fig5-data1-v2.txt
Figure 5—figure supplement 1
DNA methylation comparison across randomly selected regions in B cell subtypes between patients with SLE and controls.

Boxplots showing the CpG methylation levels per 50 bp window in 2000 randomly selected regions in B cell subtypes for SLE and control groups. The black and red lines represent the linear regression results towards the B cell development from rN to DN stages for SLE and controls, respectively.

Figure 6 with 3 supplements
Contribution of STAT3 modulates the enhancer activity and SNP-residing locus in cultured GM11997 cells.

(A) ChIP-qPCR for H3K27ac (left lower panel), H3K4me1 (right lower panel) and pSTAT3 (B) at the 8p23 super-enhancer region following 40 μM S3I-201 treatment for 24 hr. Upper panel: UCSC genome browser showing the location of two pairs of qPCR primers (SE5 and SE3) on the SE region (yellow). Two tracks shown below are the enrichment of H3K27ac and H3K4me1 across the SE region. (C) Allelic ChIP-qPCR for pSTAT3 binding on rs1047643 (T vs C alleles) following S3I-201 treatment for 24 hr. (D–E) ChIP-qPCR for H3K27ac (D), and pSTAT3 (E) at the 8p23 super-enhancer region following 100 nM ML115 treatment for 6 hr. (F) Allelic ChIP-qPCR for pSTAT3 binding on rs1047643 in cells that have been challenged with ML115 for 6 hr as indicated. Note: the fold changes for the rs1047643-associated BLK and FDFT1 genes in response to small molecules compared to vehicle (0.1% DMSO) as control, which was set as one in all cases, are presented. NS, not significant; *, p < 0.05; **, p < 0.01; ***, p < 0.005.

Figure 6—figure supplement 1
Quality control of ChIP experiments in GM11997 cells.

Plots showing ChIP-qPCR results for H3K27ac (A and D), H3K4me1 (B and E) and pSTAT3 (C and F) at a negative control (NC) region with the treatment of S3I-201 and ML115, respectively.

Figure 6—figure supplement 2
Genotyping of SNP rs1047643 in GM11997 genomic DNA using allelic qPCR analysis.

Amplification plots are presented for two alleles.

Figure 6—figure supplement 3
Validation of STAT3-mediated allelic binding in GM11997 cells.

Plots showing ChIP-qPCR results for pSTAT3 at rs1047643 following 1 μM Cucurbitacin I treatment for 24 hr (A) and 50 ng/ml IL-6 treatment for 1 hr (B), respectively. NS, not significant; *, p < 0.05.

Figure 7 with 1 supplement
Expression of two alleles on SNP rs1047643 and its linked genes in cultured cells.

Left panel: allelic RT-qPCR on SNP rs1047643 (T vs C alleles) following S3I-201 (A) and ML115 (C) treatment for 24 hr, respectively. Right panel: RT-qPCR analysis showing the fold changes for the rs1047643-associated BLK and FDFT1 genes in response to different concentrations of S3I-201 (B) and ML115 (D) compared to vehicle (0.1% DMSO) as control, which was set as one in all cases. *, p < 0.05; **, p < 0.01; ***, p < 0.005.

Figure 7—figure supplement 1
Expression of two alleles on SNP rs1047643 in B-lymphoblastic cells.

Allelic RT-qPCR in GM11997 cells that have been challenged with S3I-201 for 24 hr (A) and ML115 for 6 hr (B) as indicated, respectively.

Tables

Key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
| --- | --- | --- | --- | --- |
| Chemical compound, drug | ML115 | Cayman Chemical | Cayman Chemical: 15178 | Madoux et al., 2010 |
| Chemical compound, drug | S3I-201 | Sigma-Aldrich | Sigma-Aldrich: SML0330 | |
| Chemical compound, drug | Cucurbitacin I | Sigma-Aldrich | Sigma-Aldrich: C4493 | |
| Chemical compound, drug | Recombinant human IL-6 | Cell Guidance Systems | Cell Guidance Systems: GFH10AF | |
| Antibody | Phospho-STAT3 (Ser727) | Thermo Fisher Scientific | Thermo Fisher Scientific Cat# PA5-17876; RRID:AB_10980044 | |
| Antibody | Anti-Histone H3 (acetyl K27) | Abcam | Abcam Cat# ab4729; RRID:AB_2118291 | |
| Antibody | H3K4me1 Recombinant Polyclonal Antibody | Thermo Fisher Scientific | Thermo Fisher Scientific Cat# 710795; RRID:AB_2532764 | |
| Antibody | normal mouse IgG | Santa Cruz Biotechnology | Santa Cruz Biotechnology Cat# sc-2025; RRID:AB_737182 | |
| Antibody | normal rabbit IgG | Santa Cruz Biotechnology | Santa Cruz Biotechnology Cat# sc-2027; RRID:AB_737197 | |
| Cell line (H. sapiens) | GM11997 | Coriell | Coriell Cat# GM11997; RRID:CVCL_5C55 | |
| Sequence-based reagent | ChIP-qPCR primers | This paper | | See Supplementary file 5 |
| Sequence-based reagent | RT-qPCR primers | This paper | | See Supplementary file 5 |
| Sequence-based reagent | Allelic qPCR primers | This paper | | See Supplementary file 5 |
| Software, algorithm | R | R Foundation | https://www.r-project.org | Version 4.0.2 |
| Software, algorithm | Hisat2 | Kim et al., 2019 | | Version 2 |
| Software, algorithm | Allelic imbalance analysis and plots | This paper (Zhang, 2021) | | The R code used for the AI analysis can be accessed via GitHub at https://github.com/youngorchuang/Allelic-imbalance-analysis (copy archived at swh:1:rev:f0db42af8fed130ebbfe0b46abf992300dadddd6) |
| Software, algorithm | HiCUP | Wingett et al., 2015 | | |
| Commercial assay or kit | Mycoplasma detection kit | Sigma-Aldrich | Sigma-Aldrich: MP0025 | |
| Commercial assay or kit | SuperScript III reverse transcriptase | Thermo Fisher Scientific | Thermo Fisher Scientific: 18080044 | |
| Commercial assay or kit | Luna Universal qPCR Master Mix | New England Biolabs | New England Biolabs: M3003X | |

Additional files

Supplementary file 1

Summary of data sets used in the study.

Functional genomics data sets, including ATAC-seq, RNA-seq and RRBS-seq data sets from seven SLE case-control studies (Supplementary file 2), and Hi-C data sets in multiple cell lines, and a SNP microarray data set from a lupus GWAS study.

https://cdn.elifesciences.org/articles/72837/elife-72837-supp1-v2.xlsx
Supplementary file 2

List of data sets from seven SLE case-control studies.

https://cdn.elifesciences.org/articles/72837/elife-72837-supp2-v2.xlsx
Supplementary file 3

Association results for the SNP rs1047643 with SLE risk in European population.

https://cdn.elifesciences.org/articles/72837/elife-72837-supp3-v2.xlsx
Supplementary file 4

LD score (r2) between SNP rs1047643 and 12 GWAS tag SNPs in the European population.

https://cdn.elifesciences.org/articles/72837/elife-72837-supp4-v2.xlsx
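Supplementary file 4 reports linkage disequilibrium as r2 between SNP pairs. As a reminder of what that quantity measures, here is a minimal sketch of the standard r2 definition for two biallelic loci, computed from a haplotype frequency and the two allele frequencies. The function name `ld_r2` and the input values are hypothetical, and nothing here reflects how the file's values were actually computed.

```python
def ld_r2(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    for name, p in (("p_a", p_a), ("p_b", p_b)):
        if not (0.0 < p < 1.0):
            raise ValueError(f"{name} must lie strictly between 0 and 1")

    # D is the deviation of the observed haplotype frequency from the
    # frequency expected if the two loci were independent.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies, for illustration only.
independent = ld_r2(0.25, 0.5, 0.5)  # haplotype at its expected frequency
complete = ld_r2(0.5, 0.5, 0.5)      # alleles always travel together
```

Values near 0 indicate that the two SNPs are inherited nearly independently, and values near 1 indicate that one SNP almost fully tags the other.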
Supplementary file 5

List of primers used in this study.

https://cdn.elifesciences.org/articles/72837/elife-72837-supp5-v2.xlsx
Transparent reporting form
https://cdn.elifesciences.org/articles/72837/elife-72837-transrepform1-v2.docx

Cite this article

  1. Yanfeng Zhang
  2. Kenneth Day
  3. Devin M Absher
(2022)
STAT3-mediated allelic imbalance of novel genetic variant Rs1047643 and B-cell-specific super-enhancer in association with systemic lupus erythematosus
eLife 11:e72837.
https://doi.org/10.7554/eLife.72837