Epigenetic analysis of Paget’s disease of bone identifies differentially methylated loci that predict disease status

  1. Ilhame Diboun
  2. Sachin Wani
  3. Stuart H Ralston
  4. Omar ME Albagha (corresponding author)
  1. Division of Genomic and Translational Biomedicine, College of Health and Life Sciences, Hamad Bin Khalifa University, Qatar
  2. Centre for Genomic and Experimental Medicine, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, United Kingdom
5 figures, 4 tables and 8 additional files

Figures

Figure 1
Study design and analysis workflow.

Differentially methylated sites (DMS) and differentially methylated regions (DMR) were analyzed in the discovery set using the general and the generalized linear model, respectively. Those reaching FDR < 0.05 were tested in the cross-validation set to identify DMS/DMR that replicate at the same significance level. The DMS and the important sites within DMR were pooled to form the Pooled sites (refer to Materials and methods); from these, a best PDB-discriminatory subset (the Best subset) was obtained using Lasso and Elastic-Net regression. Multivariate classifiers based on the discovery measurements of the Pooled sites and of the Best subset sites yielded AUC values of 92.8% and 82.5%, respectively, when tested in the cross-validation set.
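The two-stage filter described in this caption (FDR < 0.05 in discovery, then replication at the same level in cross-validation) can be sketched as follows. This is only an illustration under stated assumptions: the caption does not name the FDR procedure, so Benjamini-Hochberg is assumed here, and the function names, site names, and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the caption does not say which FDR method was used;
    Benjamini-Hochberg is a common choice and is used for illustration.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def replicated_sites(discovery_p, validation_p, fdr=0.05):
    """Keep sites passing the FDR cut-off in discovery and again in validation.

    Assumption: the validation-stage correction is applied only across the
    sites carried forward from discovery.
    """
    names = list(discovery_p)
    disc_adj = benjamini_hochberg([discovery_p[n] for n in names])
    candidates = [n for n, q in zip(names, disc_adj) if q < fdr]
    val_adj = benjamini_hochberg([validation_p[n] for n in candidates])
    return [n for n, q in zip(candidates, val_adj) if q < fdr]


# Hypothetical p-values, not taken from the study.
discovery = {"site_a": 0.0001, "site_b": 0.002, "site_c": 0.6, "site_d": 0.04}
validation = {"site_a": 0.01, "site_b": 0.3, "site_c": 0.5, "site_d": 0.02}
replicated = replicated_sites(discovery, validation)
```

In this toy input, only `site_a` and `site_b` pass the discovery stage, and only `site_a` survives the second correction.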

Figure 2 with 1 supplement
Differential methylation analysis comparing controls to PDB patients (n = 246).

(A) Site analysis: a Manhattan plot showing chromosomal position (x-axis) versus −log10(p) of significant DMS and adjacent sites. For the Bonferroni-significant sites, however, the meta-analysis p-values are shown instead and highlighted in color. The horizontal dashed line indicates the Bonferroni-corrected significance threshold (p < 1.17 × 10⁻⁷). (B, C) Region analysis, showing the many significantly hyper-methylated (red) and hypo-methylated (blue) sites from LTB (Bonferroni replicated from island analysis) and HSPA13 (Bonferroni replicated from gene body analysis). The dashed lines represent the FDR < 0.05 threshold for each region, which depends on the number of sites within the region (refer to Materials and methods).
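As a quick arithmetic check on the dashed line in panel A: a Bonferroni threshold is the family-wise alpha divided by the number of tests. The sketch below assumes alpha = 0.05 (consistent with the Table 2 footnote); the number of tested sites is not stated in this caption, so it is back-calculated from the threshold rather than taken from the paper.

```python
import math

ALPHA = 0.05          # family-wise error rate (assumed, per the Table 2 footnote)
THRESHOLD = 1.17e-7   # Bonferroni threshold quoted in the caption


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Number of tests implied by the quoted threshold (roughly 4.27e5 sites).
implied_tests = ALPHA / THRESHOLD

# Height of the dashed line on the -log10(p) axis (roughly 6.93).
line_height = -math.log10(THRESHOLD)
```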

Figure 2—figure supplement 1
QQ plots of expected versus observed –log10 p-values from site differential methylation analysis.

The genomic inflation factor is 1.23.
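For readers unfamiliar with the statistic, the genomic inflation factor is conventionally the median of the observed 1-df chi-square statistics divided by the median expected under the null (about 0.455). The sketch below uses that standard definition; the caption does not say how the value of 1.23 was computed, and the function name is hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
EXPECTED_MEDIAN_CHI2 = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation_factor(p_values):
    """Median observed chi-square (1 df) over the expected null median.

    Each two-sided p-value is converted back to a 1-df chi-square
    statistic via the squared normal quantile. P-values must lie in (0, 1].
    """
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / EXPECTED_MEDIAN_CHI2


# Evenly spaced p-values mimic a null (uniform) distribution: lambda is 1.
null_like = [(i + 0.5) / 1001 for i in range(1001)]
lambda_null = genomic_inflation_factor(null_like)

# Squaring shifts p-values toward zero, which inflates lambda above 1.
lambda_inflated = genomic_inflation_factor([p ** 2 for p in null_like])
```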

Figure 3
Translating the methylation data into functional networks.

Nodes are functional, cellular, molecular, and sub-cellular keywords from GO annotations enriched amongst the Pooled sites. An edge between two nodes indicates that differentially methylated genes associated with the keyword in the first node are significantly partially correlated with their counterparts from the second node more often than can be accounted for by chance.

Figure 4
Orthogonal partial least squares-discriminant analysis (OPLS-DA) performed using the Pooled sites identified from the discovery set (n = 246).

(A) Classifier trained on all 2847 Pooled sites (FDR < 0.05) from the discovery set. (B) Testing the classifier on the replication (or cross-validation) set. (C) ROC curve analysis yielded an overall sensitivity of 0.84, specificity of 0.81, and AUC of 0.928. (D) Classifier trained on the Best subset sites from Glmnet analysis (n = 95) using the discovery set. (E) Testing the classifier on the replication (or cross-validation) set. (F) ROC curve analysis showed an overall sensitivity of 0.77, specificity of 0.74, and AUC of 0.825. The scatter plots show the predictive component that discriminates PDB cases from controls (x-axis) versus the orthogonal component representing a multivariate confounding effect that is independent of PDB (y-axis).
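The sensitivity, specificity, and AUC reported in panels C and F are standard summaries of how well classifier scores separate cases from controls. The sketch below shows how such values are computed from labels and scores in general; it is not the OPLS-DA model itself, and the labels, scores, and 0.5 cut-off are hypothetical.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called cases."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical data: 1 = PDB case, 0 = control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
auc = roc_auc(labels, scores)
```

With these toy values, three of four cases and three of four controls are classified correctly at the 0.5 cut-off, and 13 of the 16 case-control pairs are ranked correctly.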

Figure 5
Functions of genes mapped near the Best subset of differentially methylated sites identified through the elastic-net regularization extension of the generalized linear model.

(A) An IPA-based network showing a subset of these genes with functional interactions (edges) or mapping to one of three functional classes: immune, viral, and bone homeostasis. (B) An overview of GO biological processes significantly enriched amongst the Best subset, together with their beta values from the Glmnet R package, which implements the elastic-net extension of the generalized linear model.
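For orientation, the elastic-net objective as documented for the Glmnet package can be written as below. The symbols follow the package documentation rather than this article: N is the number of samples, ℓ is the negative log-likelihood contribution of sample i, λ is the overall penalty strength, and α mixes the lasso (α = 1) and ridge (α = 0) penalties. The values of α and λ used in the study are not given in this caption.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{N}\sum_{i=1}^{N} \ell\!\left(y_i,\; \beta_0 + x_i^{\top}\beta\right)
\;+\;
\lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2} + \alpha\,\lVert\beta\rVert_1\right]
```

The lasso component can shrink coefficients exactly to zero, which is how a sparse subset of sites with non-zero beta values can emerge from a larger pool.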

Tables

Table 1
Descriptive statistics of the study cohort.
Characteristic | Discovery: PDB case | Discovery: Control | Cross-validation: PDB case | Cross-validation: Control
Number | 116 | 130 | 116 | 130
Age (years), mean ± SD | 72.1 ± 7.5* | 70.0 ± 7.4* | 72.5 ± 8.7 | 72.3 ± 8.2
Male, n (%) | 65 (56.0)* | 48 (36.9)* | 59 (50.9) | 53 (40.8)
Female, n (%) | 51 (44.0)* | 82 (63.1)* | 57 (49.1) | 77 (59.2)
SQSTM1 mutation, n (%) | 16 (13.8) | 0 (0) | 17 (14.6) | 0 (0)

*P < 0.05 comparing Paget’s disease (PDB) cases to controls.

Table 2
Differentially methylated CpG sites (DMS) in Paget’s disease of bone.
Probe ID | Chr | Position | Discovery Δ Beta* | Discovery p-value | Cross-validation Δ Beta* | Cross-validation p-value | Meta-analysis Δ Beta* | Meta-analysis p-value | Nearest gene
cg10290814 | 17 | 7284330 | −0.018 | 1.2 × 10⁻⁶ | −0.015 | 1.4 × 10⁻⁴ | −0.017 | 2.3 × 10⁻¹⁰ | TNK1
cg19361865 | 1 | 220922163 | −0.014 | 5.4 × 10⁻⁶ | −0.012 | 9.7 × 10⁻⁵ | −0.013 | 7.6 × 10⁻¹⁰ | MOSC2
cg09152582 | 1 | 88928362 | −0.021 | 2.1 × 10⁻⁵ | −0.018 | 3.5 × 10⁻⁵ | −0.019 | 1.1 × 10⁻⁹ | PKN2-AS1
cg09260089 | 10 | 134599860 | −0.024 | 4.6 × 10⁻⁵ | −0.024 | 1.2 × 10⁻⁴ | −0.024 | 9.5 × 10⁻⁹ | NKX6-2
cg24879273 | 10 | 102989645 | −0.026 | 4.9 × 10⁻⁵ | −0.016 | 1.7 × 10⁻⁴ | −0.021 | 1.4 × 10⁻⁸ | LBX1
cg03839709 | 13 | 96743492 | −0.014 | 2.7 × 10⁻⁴ | −0.014 | 3.4 × 10⁻⁵ | −0.014 | 1.8 × 10⁻⁸ | HS6ST3
cg16419235 | 8 | 57360613 | −0.036 | 1.9 × 10⁻⁴ | −0.029 | 8.3 × 10⁻⁵ | −0.032 | 3.1 × 10⁻⁸ | PENK
cg04317962 | 16 | 79623625 | −0.017 | 1.4 × 10⁻⁶ | −0.019 | 2.9 × 10⁻³ | −0.018 | 3.1 × 10⁻⁸ | MAF
cg01429039 | 4 | 52918065 | −0.023 | 1.8 × 10⁻⁴ | −0.020 | 1.1 × 10⁻⁴ | −0.021 | 3.5 × 10⁻⁸ | SPATA18
cg03885399 | 1 | 47691550 | −0.020 | 4.4 × 10⁻⁶ | −0.014 | 3.6 × 10⁻³ | −0.017 | 4.7 × 10⁻⁸ | TAL1
cg04738965 | 3 | 147127662 | −0.037 | 4.0 × 10⁻⁵ | −0.028 | 7.1 × 10⁻⁴ | −0.033 | 6.2 × 10⁻⁸ | ZIC1
cg10954182 | 12 | 104532377 | −0.016 | 1.9 × 10⁻⁴ | −0.009 | 2.1 × 10⁻⁴ | −0.013 | 7.8 × 10⁻⁸ | NFYB
cg10964367 | 8 | 1771973 | −0.025 | 1.3 × 10⁻⁴ | −0.019 | 3.8 × 10⁻⁴ | −0.022 | 9.4 × 10⁻⁸ | ARHGEF10
cg12739454 | 1164290833 (Chr and position run together; split not recoverable) | −0.018 | 2.4 × 10⁻⁴ | −0.012 | 2.4 × 10⁻⁴ | −0.015 | 1.1 × 10⁻⁷ | -

*Δ Beta represents the difference in DNA methylation in cases as compared to controls (Beta Control − Beta PDB). Position is in base pairs with reference to human genome build 37 (GRCh37). Chr, chromosome; CpG, cytosine-phosphate-guanine. All p-values are genome-wide significant based on a Bonferroni-corrected p-value < 0.05.

Table 3
Differentially methylated regions (DMR) in Paget’s disease of bone.
Region | Chr | Number of sites | Discovery p-value* | Cross-validation p-value* | Gene
Island | 6 | 53 | 1.40 × 10⁻² | 3.25 × 10⁻⁴ | LTB
Island | 6 | 59 | 4.11 × 10⁻³ | 2.47 × 10⁻³ | SKIV2L;RDBP
Island | 10 | 49 | 2.65 × 10⁻³ | 4.72 × 10⁻³ | EBF3
Island | 11 | 49 | 3.57 × 10⁻³ | 9.52 × 10⁻³ | CCND1
Gene Body | 1 | 52 | 2.01 × 10⁻⁵ | 3.14 × 10⁻⁵ | SDCCAG8
Gene Body | 9 | 36 | 6.09 × 10⁻³ | 1.20 × 10⁻² | CACNA1B
Gene Body | 8 | 51 | 2.49 × 10⁻² | 4.39 × 10⁻³ | RBPMS
Gene Body | 21 | 5 | 3.19 × 10⁻² | 2.88 × 10⁻³ | HSPA13
Gene Body | 2 | 52 | 3.80 × 10⁻² | 2.39 × 10⁻³ | PARD3B
Gene Body | 22 | 34 | 4.49 × 10⁻² | 7.10 × 10⁻³ | BRD1

*P-values are adjusted for multiple testing using the Bonferroni method.

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers / Additional information
Other | Infinium HumanMethylation450 BeadChip | Illumina, USA | DNA Methylation array
Software, algorithm | RnBeads | R | Version 1.10.8
Software, algorithm | SIMCA | Umetrics, Sweden | Version 15
Software, algorithm | IPA | Qiagen, Germany |
Software, algorithm | GGM | R | Version 2.4
Software, algorithm | topGO | R | Version 2.4

Additional files

Supplementary file 1

List of replicated differentially methylated sites with FDR < 0.05.

https://cdn.elifesciences.org/articles/65715/elife-65715-supp1-v3.xlsx
Supplementary file 2

List of replicated DMR at islands with FDR < 0.05.

https://cdn.elifesciences.org/articles/65715/elife-65715-supp2-v3.xlsx
Supplementary file 3

List of replicated DMR at gene bodies with FDR < 0.05.

https://cdn.elifesciences.org/articles/65715/elife-65715-supp3-v3.xlsx
Supplementary file 4

List of replicated DMR at promoters with FDR < 0.05.

https://cdn.elifesciences.org/articles/65715/elife-65715-supp4-v3.xlsx
Supplementary file 5

List of Best subset sites from Glmnet analysis.

https://cdn.elifesciences.org/articles/65715/elife-65715-supp5-v3.xlsx
Supplementary file 6

List of CpGs reported as correlated between bone and blood in Ebrahimi et al. (PMID: 32692944) that map to the same genes as our significant DMS and DMR in Paget’s disease.

https://cdn.elifesciences.org/articles/65715/elife-65715-supp6-v3.xlsx
Supplementary file 7

List of expression quantitative trait-methylation (eQTM) sites from the Pooled sites.

Highlighted in bold are the 8 CpGs belonging to the Best subset of sites (the subset of sites that best explains PDB status).

https://cdn.elifesciences.org/articles/65715/elife-65715-supp7-v3.xlsx
Transparent reporting form
https://cdn.elifesciences.org/articles/65715/elife-65715-transrepform-v3.docx

Ilhame Diboun, Sachin Wani, Stuart H Ralston, Omar ME Albagha (2021) Epigenetic analysis of Paget’s disease of bone identifies differentially methylated loci that predict disease status. eLife 10:e65715. https://doi.org/10.7554/eLife.65715