Research Article

Genetics and Genomics

Discovering non-additive heritability using additive GWAS summary statistics

Center for Computational Molecular Biology, Brown University, United States
Department of Ecology and Evolutionary Biology, Brown University, United States
Department of Integrative Biology, The University of Texas at Austin, United States
Department of Population Health, The University of Texas at Austin, United States
Institute for Computational and Experimental Research in Mathematics, Brown University, United States
Department of Biostatistics, Brown University, United States
Data Science Institute, Brown University, United States
Microsoft, United States

Jun 24, 2024

Open access
Copyright information

Abstract
Editor's evaluation
Introduction
Results
Discussion
Materials and methods
Data availability
References
Article and author information
Metrics

Abstract

LD score regression (LDSC) is a method to estimate narrow-sense heritability from genome-wide association study (GWAS) summary statistics alone, making it a fast and popular approach. In this work, we present interaction-LD score (i-LDSC) regression: an extension of the original LDSC framework that accounts for interactions between genetic variants. By studying a wide range of generative models in simulations, and by re-analyzing 25 well-studied quantitative phenotypes from 349,468 individuals in the UK Biobank and up to 159,095 individuals in BioBank Japan, we show that the inclusion of a cis-interaction score (i.e. interactions between a focal variant and proximal variants) recovers genetic variance that is not captured by LDSC. For each of the 25 traits analyzed in the UK Biobank and BioBank Japan, i-LDSC detects additional variation contributed by genetic interactions. The i-LDSC software and its application to these biobanks represent a step towards resolving further genetic contributions of sources of non-additive genetic effects to complex trait variation.

Editor's evaluation

This study provides a valuable investigation into whether phenotypic variance due to interactions between genetic variants can be measured using genome-wide association summary statistics. The authors present a convincing method, i-LDSC, that uses statistics on the correlations between genotypes at different loci (linkage disequilibrium) to estimate the phenotypic variance explained by both additive genetic effects and pairwise interactions.

https://doi.org/10.7554/eLife.90459.sa0

Introduction

Heritability is defined as the proportion of phenotypic trait variation that can be explained by genetic effects (Bulik-Sullivan et al., 2015b, Bulik-Sullivan et al., 2015a, Shi et al., 2016). Until recently, studies of heritability in humans have been reliant on typically small sized family studies with known relatedness structures among individuals (Zaitlen et al., 2013; Polderman et al., 2015). Due to advances in genomic sequencing and the steady development of statistical tools, it is now possible to obtain reliable heritability estimates from biobank-scale data sets of unrelated individuals (Bulik-Sullivan et al., 2015b; Shi et al., 2016; Hou et al., 2019; Pazokitoroudi et al., 2020). Computational and privacy considerations with genome-wide association studies (GWAS) in these larger cohorts have motivated a recent trend to estimate heritability using summary statistics (i.e. estimated effect sizes and their corresponding standard errors). In the GWAS framework, additive effect sizes and standard errors for individual single nucleotide polymorphisms (SNPs) are estimated by regressing phenotype measurements onto the allele counts of each SNP independently. Through the application of this approach over the last two decades, it has become clear that many traits have a complex and polygenic basis—that is, hundreds to thousands of individual genetic loci across the genome often contribute to the genetic basis of variation in a single trait (Yengo et al., 2018).

Many statistical methods have been developed to improve the estimation of heritability from GWAS summary statistics (Bulik-Sullivan et al., 2015b, Shi et al., 2016, Speed and Balding, 2019, Song et al., 2022). The most widely used of these approaches is linkage disequilibrium (LD) score regression and the corresponding LDSC software (Bulik-Sullivan et al., 2015b), which corrects for inflation in GWAS summary statistics by modeling the relationship between the variance of SNP-level effect sizes and the sum of correlation coefficients between focal SNPs and their genomic neighbors (i.e. the LD score of each variant). The formulation of the LDSC framework relies on the fact that the expected relationship between chi-square test statistics (i.e. the squared magnitude of GWAS allelic effect estimates) and LD scores holds when complex traits are generated under the infinitesimal (or polygenic) model which assumes: (i) all causal variants have the same expected contribution to phenotypic variation and (ii) causal variants are uniformly distributed along the genome. Initial simulations in Bulik-Sullivan et al. showed that violations of these assumptions can be tolerated to a point, but begin to affect the estimation of narrow-sense heritability once a certain proportion of variants have nonzero effects. Importantly, the estimand of the LDSC model is the proportion of phenotypic variance attributable to additive effects of genotyped SNPs. The main motivation behind the LDSC model is that, for polygenic traits, many marker SNPs tag nonzero effects. This may simply arise because some of these SNPS are in LD with causal variants (Bulik-Sullivan et al., 2015b) or because their statistical association is the product of a confounding factor such as population stratification.

As of late, there have been many efforts to build upon and improve the LDSC framework. For example, recent work has shown that it is possible to estimate the proportion of phenotypic variation explained by dominance effects (Palmer et al., 2023) and local ancestry (Chan et al., 2023) using extensions of the LDSC model. One limitation of LDSC is that, in practice, it only uses the diagonal elements of the squared LD matrix in its formulation which, while computationally efficient, does not account for information about trait architecture that is captured by the off-diagonal elements. This tradeoff helps LDSC to scale genome-wide, but it has also been shown to lead to heritability estimates with large standard error (Ning et al., 2020, Zhang et al., 2021, Song et al., 2022). Recently, newer approaches have attempted to reformulate the LDSC model by using the eigenvalues of the LD matrix to leverage more of the information present in the correlation structure between SNPs (Shi et al., 2016, Song et al., 2022).

In this paper, we show that the LDSC framework can be extended to estimate greater proportions of genetic variance in complex traits (i.e. beyond the variance that is attributable to additive effects) when a subset of causal variants is involved in a gene-by-gene (G×G) interaction. Indeed, recent association mapping studies have shown that G×G interactions can drive heterogeneity of causal variant effect sizes (Patel et al., 2022). Importantly, non-additive genetic effects have been proposed as one of the main factors that explains ‘missing’ heritability—the proportion of heritability not explained by the additive effects of variants (Eichler et al., 2010).

The key insight we highlight in this manuscript is that SNP-level GWAS summary statistics can provide evidence of non-additive genetic effects contributing to trait architecture if there is a nonzero correlation between individual-level genotypes and their statistical interactions. We present the ‘interaction-LD score’ regression model or i-LDSC: an extension of the LDSC framework which recovers ‘missing’ heritability by leveraging this ‘tagged’ relationship between linear and nonlinear genetic effects. To validate the performance of i-LDSC in simulation studies, we focus on synthetic trait architectures that have been generated with contributions stemming from second-order and cis-acting statistical SNP-by-SNP interaction effects; however, note that the general concept underlying i-LDSC can easily be extended to other sources of non-additive genetic effects (e.g. gene-by-environment interactions). The main difference between i-LDSC and LDSC is that the i-LDSC model includes an additional set of ‘cis-interaction’ LD scores in its regression model. These scores measure the amount of phenoytpic variation contributed by genetic interactions that can be explained by additive effects. In practice, these additional scores are efficient to compute and require nothing more than access to a representative pairwise LD map, same as the input required for LD score regression.

Through extensive simulations, we show that i-LDSC recovers substantial non-additive heritability that is not captured by LDSC when genetic interactions are indeed present in the generative model for a given complex trait. More importantly, i-LDSC has a calibrated type I error rate and does not overestimate contributions of genetic interactions to trait variation in simulated data when only additive effects are present. While analyzing 25 complex traits in the UK Biobank and BioBank Japan, we illustrate that pairwise interactions are a source of ‘missing’ heritability captured by additive GWAS summary statistics—suggesting that phenotypic variation due to non-additive genetic effects is more pervasive in human phenotypes than previously reported. Specifically, we find evidence of tagged genetic interaction effects contributing to heritability estimates in all of the 25 traits in the UK Biobank, and 23 of the 25 traits we analyzed in the BioBank Japan. We believe that i-LDSC, with our development of a new cis-interaction score, represents a significant step towards resolving the true contribution of genetic interactions.

Results

Overview of the interaction-LD score regression model

Interaction-LD score regression (i-LDSC) is a statistical framework for estimating heritability (i.e. the proportion of trait variance attributable to genetic variance). Here, we will give an overview of the i-LDSC method and its corresponding software, as well as detail how its underlying model differs from that of LDSC (Bulik-Sullivan et al., 2015b). We will assume that we are analyzing a GWAS dats set $𝒟 = {𝐗, 𝐲}$ where $𝐗$ is an $N \times J$ matrix of genotypes with $J$ denoting the number of SNPs (each of which is encoded as {0, 1, 2} copies of a reference allele at each locus $j$ ) and $𝐲$ is an $N$ -dimensional vector of measurements of a quantitative trait. The i-LDSC framework only requires summary statistics of individual-level data: namely, marginal effect size estimates for each SNP $\hat{𝜷}$ and a sample LD matrix $𝐑$ (which can be provided via reference panel data).

We begin by considering the following generative linear model for complex traits

y = b_{0} + X β + W θ + ε, ε \sim N (0, (1 - H^{2}) I),

where $b_{0}$ is an intercept term; $𝜷 = (β_{1}, \dots, β_{J})$ is a $J$ -dimensional vector containing the true additive effect sizes for an additional copy of the reference allele at each locus on $y$ ; $W$ is an $N \times M$ matrix of (pairwise) cis-acting SNP-by-SNP statistical interactions between some subset of causal SNPs, where columns of this matrix are assumed to be the Hadamard (element-wise) product between genotypic vectors of the form $𝐱_{j} \circ 𝐱_{k}$ for the $j$ -th and $k$ -th variants; $𝜽 = (θ_{1}, \dots, θ_{M})$ is an $M$ -dimensional vector containing the interaction effect sizes; $𝜺$ is a normally distributed error term with mean zero and variance scaled according to the proportion of phenotypic variation not explained by genetic effects (Bulik-Sullivan et al., 2015b), which we will refer to as the broad-sense heritability of the trait denoted by $H^{2}$ ; and $𝐈$ denotes an $N \times N$ identity matrix. For convenience, we will assume that the genotype matrix (column-wise) and the trait of interest have been mean-centered and standardized (Strandén and Christensen, 2011; de Los Campos et al., 2013; Zhou et al., 2013). Lastly, we will let the intercept term b₀ be a fixed parameter and we will assume that the effect sizes are each normally distributed with variances proportional to their individual contributions to trait heritability (Yang et al., 2010; Wu et al., 2011; Zhou et al., 2013; Crawford et al., 2017)

β_{j} \sim N (0, φ_{β}^{2} / J), θ_{m} \sim N (0, φ_{θ}^{2} / M) .

Effectively, we say that $𝕍 [𝐗 𝜷] = φ_{β}^{2}$ is the proportion of phenotypic variation contributed by additive SNP effects under the generative model, while $𝕍 [𝐖 𝜽] = φ_{θ}^{2}$ makes up the proportion of phenotypic variation contributed by genetic interactions. While the appropriateness of treating genetic effects as random variables in analytical derivations has been questioned (de Los Campos et al., 2015), later, we will justify the theory presented here with simulation results showing that i-LDSC accurately recovers non-additive genetic variance in Equation 1 under a broad range of conditions.

There are two key takeaways from the generative model specified above. First, Equation 2 implies that the additive and non-additive components in Equation 1 are orthogonal to each other. In other words, $𝔼 [𝜷^{⊺} 𝐗^{⊺} 𝐖 𝜽] = 𝔼 [𝜷^{⊺}] 𝐗^{⊺} 𝐖 𝔼 [𝜽] = 𝟎$ . This is important because it means that there is a unique partitioning of genetic variance when studying a trait of interest. The second key takeaway is that the genotype matrix $𝐗$ and the matrix of genetic interactions $𝐖$ themselves are correlated despite being linearly independent (see Materials and methods). This property stems from the fact that the pairwise interaction between two SNPs is encoded as the Hadamard product of two genotypic vectors in the form $𝐰_{m} = 𝐱_{j} \circ 𝐱_{k}$ (which is a nonlinear function of the genotypes).

A central objective in GWAS studies is to infer how much phenotypic variation can be explained by genetic effects. To achieve that objective, a key consideration involves incorporating the possibility of non-additive sources of genetic variation to be explained by additive effect size estimates obtained from GWAS analyses (Hill et al., 2008). If we assume that the genotype and interaction matrices are correlated, then $X$ and $𝐖$ are not completely orthogonal (i.e. such that $X^{⊺} W \neq 0$ ) and the following relationship between the moment matrix $X^{⊺} y$ , the observed marginal GWAS summary statistics $\hat{β}$ , and the true coefficient values $β$ from the generative model in Equation 1 holds in expectation (see Materials and methods)

E [X^{⊺} y] = (X^{⊺} X) β + (X^{⊺} W) θ \overset{\approx}{⟺} E [\hat{β}] = R β + V θ

where $𝐑$ is a sample estimate of the LD matrix, and $𝐕$ represents a sample estimate of the correlation between the individual-level genotypes $𝐗$ and the span of genetic interactions between causal SNPs in $𝐖$ . Intuitively, the term $V θ$ can be interpreted as the subset of pairwise interaction effects that are tagged by the additive effect estimates from the GWAS study. Note that, when (i) non-additive genetic effects do not contribute to the overall architecture of a trait (i.e. such that $θ = 0$ ) or (ii) the genotype and interaction matrices $𝐗$ and $𝐖$ are uncorrelated, the equation above simplifies to a relationship between LD and summary statistics that is assumed in many GWAS studies and methods (Hormozdiari et al., 2014; Nakka et al., 2016; Zhu and Stephens, 2017; Zhang et al., 2018; Zhu and Stephens, 2018; Cheng et al., 2020; Demetci et al., 2021).

The goal of i-LDSC is to increase estimates of genetic variance by accounting for sources of non-additive genetic effects that can be explained by additive GWAS summary statistics. To do this, we extend the LD score regression framework and the corresponding LDSC software (Bulik-Sullivan et al., 2015b). Here, according to Equation 3, we note that $\hat{β} \sim N (R β + V θ, λ R)$ where $λ$ is a scale variance term due to uncontrolled confounding effects (Guan and Stephens, 2011; Song et al., 2022). Next, we condition on $Θ = (β, θ)$ and take the expectation of chi-square statistics $χ^{2} = N \hat{β} {\hat{β}}^{⊺}$ to yield

\begin{array}{ll} E [\hat{β} {\hat{β}}^{⊺}] = E [E [\hat{β} {\hat{β}}^{⊺} | Θ]] & = E [V [\hat{β} | Θ] + E [\hat{β} | Θ] E {[\hat{β} | Θ]}^{⊺}] \\ = E [λ R + (R β + V θ) (R β + V θ)^{⊺}] \\ = E [λ R + R β β^{⊺} R + 2 R β θ^{⊺} V^{⊺} + V θ θ^{⊺} V^{⊺}] \\ = λ R + (\frac{φ_{β}^{2}}{J}) R^{2} + (\frac{φ_{θ}^{2}}{M}) V^{2} . \end{array}

We define $ℓ_{j} = \sum_{k} r_{j k}^{2}$ as the LD score for the additive effect of the $j$ -th variant (Bulik-Sullivan et al., 2015b), and $f_{j} = \sum_{m} v_{j m}^{2}$ represents the ‘cis-interaction’ LD score which encodes the pairwise interaction between the $j$ -th variant and all other variants within a genomic window that is a pre-specified number of SNPs wide (Crawford et al., 2017), respectively. By considering only the diagonal elements of LD matrix in the first term, similar to the original LDSC approach (Bulik-Sullivan et al., 2015b; Song et al., 2022), we get the following simplified regression model

E [χ^{2}] \propto 1 + ℓ τ + f ϑ

where $χ^{2} = (χ_{1}^{2}, \dots, χ_{J}^{2})$ is a $J$ -dimensional vector of chi-square summary statistics, and $ℓ = (ℓ_{1}, \dots, ℓ_{J})$ and $f = (f_{1}, \dots, f_{J})$ are $J$ -dimensional vectors of additive and cis-interaction LD scores, respectively. Furthermore, we define the variance components $τ = N φ_{β}^{2} / J$ and $ϑ = N φ_{θ}^{2} / M$ as the additive and non-additive regression coefficients of the model, and 1 is the intercept meant to model the bias factor due to uncontrolled confounding effects (e.g. cryptic relatedness structure). In practice, we efficiently compute the cis-interaction LD scores by considering only a subset of interactions between each $j$ -th focal SNP and SNPs within a cis-proximal window around the $j$ -th SNP. In our validation studies and applications, we base the width of this window on the observation that LD decays outside of a window of 1 centimorgan (cM); therefore, SNPs outside the 1 cM window centered on the $j$ -th SNP will not significantly contribute to its LD scores. Note that the width of this window can be relaxed in the i-LDSC software when appropriate. We fit the i-LDSC model using weighted least squares to estimate regression parameters and derive p-values for identifying traits that have significant statistical evidence of tagged cis-interaction effects by testing the null hypothesis $H_{0} : ϑ = 0$ . Importantly, under the null model of a trait being generated by only additive effects, the i-LDSC model in Equation 5 reduces to an infinitesimal model (Fisher, 1999) or, in the case some variants have no effect on the trait, a polygenic model.

Lastly, we want to note the empirical observation that the additive ( $ℓ$ ) and interaction $(𝒇$ ) LD scores are lowly correlated. This is important because it indicates that the presence of cis-interaction LD scores in the model specified in Equation 5 has little-to-no influence over the estimate for the additive coefficient $τ$ . Instead, the inclusion of $𝒇$ creates a multivariate model that can identify the proportion of variance explained by both additive and non-additive effects in summary statistics. In other words, we can interpret $\hat{ϑ}$ as an estimate of the phenotypic variation explained by tagged cis-acting interaction effects. The concept of additive genetic effects partially explaining non-additive variation has also described in various studies from quantitative genetics (Hill et al., 2008; Hivert et al., 2021; Mäki-Tanila and Hill, 2014). Under Hardy-Weinberg equilibrium, it can be shown that the additive variance explained by $J$ SNPs takes on the following form (Materials and methods) (Falconer and Mackay, 1983)

σ_{A}^{2} = \sum_{j = 1}^{J} 2 p_{j} (1 - p_{j}) {[β_{j} + 2 \sum_{k \neq j}^{J} p_{k} θ_{j k}]}^{2} .

The expression for the additive variance $σ_{A}^{2}$ in Equation 6 is important because it represents the theoretical upper bound on the proportion of total phenotypic variance that can be recovered from GWAS summary statistics using the i-LDSC framework. As a result, we use the sum of coefficient estimates $\hat{τ} + \hat{ϑ} \leq σ_{A}^{2}$ to construct i-LDSC heritability estimates. A full derivation of the cis-interaction regression framework and details about its corresponding implementation in our software i-LDSC can be found in Materials and Methods.

Detection of tagged pairwise interaction effects using i-LDSC in simulations

We illustrate the power of i-LDSC across different genetic trait architectures via extensive simulation studies (Materials and methods). We generate synthetic phenotypes using real genome-wide genotype data from individuals of self-identified European ancestry in the UK Biobank. To do so, we first assume that traits have a polygenic architecture where all SNPs have a nonzero additive effect. Next, we randomly select a set of causal cis-interaction variants and divide them into two interacting groups (Materials and methods). One may interpret the SNPs in group #1 as being the ‘hubs’ in an interaction map (Crawford et al., 2017), whereas SNPs in group #2 are selected to be variants within some kilobase (kb) window around each SNP in group #1. We assume a wide range of simulation scenarios by varying the following parameters:

heritability: H² = 0.3 and 0.6;
proportion of phenotypic variation that is generated by additive effects: $ρ =$ 0.5, 0.8, and 1;
percentage of SNPs selected to be in group #1: 1%, 5%, and 10%;
genomic window used to assign SNPs to group #2: ± 10 and ± 100 kb.

We also varied the correlation between SNP effect size and minor allele frequency (MAF; as discussed in Schoech et al., 2019). All results presented in this section are based on 100 different simulated phenotypes for each parameter combination.

Figure 1 demonstrates that i-LDSC robustly detects significant tagged non-additive genetic variance, regardless of the total number of causal interactions genome-wide. Instead, the power of i-LDSC depends on the proportion of phenotypic variation that is generated by additive versus interaction effects (ρ), and its power tends to scale with the window size used to compute the cis-interaction LD scores (see Materials and methods). i-LDSC shows a similar performance for detecting tagged cis-interaction effects when the effect sizes of causal SNPs depend on their minor allele frequency and when we varied the number of SNPs assigned to be in group #2 within 10 kb and 100 kb windows, respectively (Figure 1—figure supplements 1–5).

Figure 1 with 5 supplements see all

Download asset Open asset

Power of the i-LDSC framework to detect tagged pairwise genetic interaction effects on simulated data.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank. All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select two groups of interacting variants and divide them into two groups. The group #1 SNPs are chosen to be 1%, 5%, and 10% of the total number of SNPs genome-wide (see the x-axis in each panel). These interact with the group #2 SNPs which are selected to be variants within a ± 10 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with no minor allele frequency dependency $α = 0$ (see Materials and methods). Panels (A) and (B) are results with simulations using a heritability $H^{2} = 0.3$ , while panels (C) and (D) were generated with $H^{2} = 0.6$ . We also varied the proportion of heritability contributed by additive effects to (**A, C**) $ρ = 0.5$ and (**B, D**) $ρ = 0.8$ , respectively. Here, we are blind to the parameter settings used in generative model and run i-LDSC while computing the *cis*-interaction LD scores using different estimating windows of ± 5 (green), ± 10 (orange), ± 25 (purple), and ± 50 (pink) SNPs. Results are based on 100 simulations per parameter combination and the horizontal bars represent standard errors. Generally, the performance of i-LDSC increases with larger heritability and lower proportions of additive variation. Note that LDSC is not shown here because it does not search for tagged interaction effects in summary statistics.

Importantly, i-LDSC does not falsely identify putative non-additive genetic effects in GWAS summary statistics when the synthetic phenotype was generated by only additive effects ( $ρ = 1$ ). Figure 2 illustrates the performance of i-LDSC under the null hypothesis $H_{0} : ϑ = 0$ , with the type I error rates for different estimation window sizes of the cis-interaction LD scores highlighted in panel A. Here, we also show that, when no genetic interaction effects are present, i-LDSC unbiasedly estimates the cis-interaction coefficient in the regression model to be $\hat{ϑ} = 0$ (Figure 2B), robustly estimates the heritability (Figure 2C), and provides well-calibrated p-values when assessed over many traits (Figure 2D). This behavior is consistent across different MAF-dependent effect size distributions, and p-value calibration is not sensitive to misspecification of the estimation windows used to generate the cis-interaction LD scores (Figure 2—figure supplements 1–2).

Figure 2 with 2 supplements see all

Download asset Open asset

The i-LDSC framework is well-calibrated under the null hypothesis and does not identify evidence of tagged non-additive effects when polygenic traits are generated by only additive effects.

In these simulations, synthetic trait architecture is made up of only additive genetic variation (i.e. $ρ = 1$ ). Coefficients for additive and interaction effects were simulated with no minor allele frequency dependency $α = 0$ (see Materials and methods). Here, we are blind to the parameter settings used in generative model and run i-LDSC while computing the *cis*-interaction LD scores using different estimating windows of ± 5 (green), ± 10 (orange), ± 25 (purple), and ± 50 (pink) SNPs. (A) Mean type I error rate using the i-LDSC framework across an array of estimation window sizes for the *cis*-interaction LD scores. This is determined by assessing the p-value of the *cis*-interaction coefficient ( $ϑ$ ) in the i-LDSC regression model and checking whether p < 0.05. (B) Estimates of the *cis*-interaction coefficient ( $ϑ$ ). Since traits were simulated with only additive effects, these estimates should be centered around zero. (C) Estimates of the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) where the true additive variance is set to $H^{2} ρ = 0.6$ . (D) QQ-plot of the p-values for the *cis*-interaction coefficient ( $ϑ$ ) in i-LDSC. Results are based on 100 simulations per parameter combination and the horizontal bars represent standard errors.

One of the innovations that i-LDSC offers over the traditional LDSC framework is increased heritability estimates after the identification of non-additive genetic effects that are tagged by GWAS summary statistics. Here, we applied both methods to the same set of simulations in order to understand how LDSC behaves for traits generated with cis-interaction effects. Figure 3 depicts boxplots of the heritability estimates for each approach and shows that, across an array of different synthetic phenotype architectures, LDSC captures less of phenotypic variance explained by all genetic effects. It is important to note that i-LDSC can yield upwardly biased heritability estimates when the cis-interaction scores are computed over genomic window sizes that are too small; however, these estimates become more accurate for larger window size choices (Figure 3—figure supplement 1). In contrast to LDSC, which aims to capture phenotypic variance attributable to the additive effects of genotyped SNPs, i-LDSC accurately partitions genetic effects into additive versus cis-interacting components, which in turn generally leads the ability of i-LDSC to capture more genetic variance. The mean absolute error between the true generative heritability and heritability estimates produced by i-LDSC and LDSC are shown in Supplementary files 1 and 2, respectively. Generally, the error in heritability estimates is higher for LDSC than it is for i-LDSC across each of the scenarios that we consider.

Figure 3 with 13 supplements see all

Download asset Open asset

i-LDSC robustly and accurately estimates the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) in simulations in polygenic traits, compared to LDSC, due to our accounting for interaction effects tagged in additive GWAS summary statistics.

Synthetic trait architecture was simulated using real genotype data from individuals of self-identified European ancestry in the UK Biobank (Materials and Methods). All SNPs were considered to have at least an additive effect (i.e. creating a polygenic trait architecture). Next, we randomly select two groups of interacting variants and divide them into two groups. The group #1 SNPs are chosen to be 10% of the total number of SNPs genome-wide. These interact with the group #2 SNPs which are selected to be variants within a ± 100 kilobase (kb) window around each SNP in group #1. Coefficients for additive and interaction effects were simulated with no minor allele frequency dependency $α = 0$ (see Materials and methods). Here, we assume a heritability (A) $H^{2} = 0.3$ or (B) $H^{2} = 0.6$ (marked by the black dotted lines, respectively), and we vary the proportion contributed by additive effects with $ρ = {0.2, 0.4, 0.6, 0.8}$ . The grey dotted lines represent the total contribution of additive effects in the generative model for the synthetic traits ( $H^{2} ρ)$ . i-LDSC outperforms LDSC in recovering heritability across each scenario. Results are based on 100 simulations per parameter combination.

Next, we perform an additional set of simulations where we explore other common generative models for complex trait architecture that involve non-additive genetic effects. Specifically, we compare heritability estimates from LDSC and i-LDSC in the presence of additive effects, cis-acting interactions, and a third source of genetic variance stemming from either gene-by-environment (G×E) or or gene-by-ancestry (G×Ancestry) effects. Details on how these components were generated can be found in Materials and Methods. In general, i-LDSC underestimates overall heritability when additive effects and cis-acting interactions are present alongside G×E (Figure 3—figure supplement 2) and/or G×Ancestry effects when PCs are included as covariates (Figure 3—figure supplement 3). Notably, when PCs are not included to correct for residual stratification, both LDSC and i-LDSC can yield unbounded heritability estimates greater than 1 (Figure 3—figure supplement 4). Also interestingly, when we omit cis-interactions from the generative model (i.e. the genetic architecture of simulated traits is only made up of additive and G×E or G×Ancestry effects), i-LDSC will still estimate a nonzero genetic variance component with the cis-interaction LD scores (Figure 3—figure supplements 5–7). Collectively, these results empirically show the important point that cis-interaction scores are not enough to recover missing genetic variation for all types of trait architectures; however, they are helpful in recovering phenotypic variation explained by statistical interaction effects. Recall that the linear relationship between (expected) $χ^{2}$ test statistics and LD scores proposed by the LDSC framework holds when complex traits are generated under the polygenic model where all causal variants have the same expected contribution to phenotypic variation. When cis-interactions affect genetic architecture (e.g. in our earlier simulations in Figure 3), these assumptions are violated in LDSC, but the inclusion of the additional nonlinear scores in i-LDSC help recover the relationship between the expectation of $χ^{2}$ test statistics and LD.

As a further demonstration of how i-LDSC performs when assumptions of the original LD score model are violated, we also generated synthetic phenotypes with sparse architectures using the spike-and-slab model (Zhou et al., 2013). Here, traits were simulated with solely additive effects, but this time only variants with the top or bottom ${1, 5, 10, 25, 50, 100}$ percentile of LD scores were given nonzero effects (see Materials and methods). Breaking the relationship assumed under the LDSC framework between LD scores and chi-squared statistics (i.e. that they are generally positively correlated) led to unbounded estimates of heritability in all but the (polygenic) scenario when 100% of SNPs contributed to the phenotypic variation (Figure 3—figure supplement 8).

Finally, we performed a set of polygenic simulations to assess if i-LDSC estimates of non-additive genetic variance could be spuriously inflated due to either (i) unobserved additive effects (see, for example, Hemani et al., 2014), (ii) unobserved SNPs that are involved in genetic interactions, or by (iii) nonzero correlation between the additive and interaction effect sizes in the generative model (i.e. breaking the independence assumption in Equation 2). In the first setting, we observed that, across a range of both minor allele frequencies and effect sizes, the omission of causal haplotypes had a negligible effect on the estimated value of the coefficients in i-LDSC (Figure 3—figure supplement 9). We hypothesize this is due to the fact that the simulations were done for polygenic architectures where all SNPs have at least an additive effect. As a result, not observing a small subset of SNPs does not hinder the ability of i-LDSC to estimate genetic variance because the effect size of each SNP is small. If these simulations were conducted for sparse architectures, we would have likely seen a greater impact on i-LDSC; although, we have already shown the LD score regression framework to be uncalibrated for traits with sparse genetic architectures (again see Figure 3—figure supplement 8). In the second setting, we observed that the i-LDSC framework protects against the false discovery of non-additive genetic effects and underestimates the variance component $ϑ$ when causal variants involved in pairwise interactions were unobserved (Figure 3—figure supplements 10 and 11). As a direct comparison, estimates of the additive variance component $τ$ in i-LDSC were not affected by the unobserved interacting variants. Lastly, in the third setting, we observed that the mean estimate of the genetic variance in both LDSC and i-LDSC had a slight upward bias as the correlation between additive and interaction effect sizes in the generative model increased; however, the median of these bias estimates was still near zero across all simulated scenarios and their corresponding replicates (Figure 3—figure supplements 12 and 13).

Application of i-LDSC to the UK Biobank and BioBank Japan

To assess whether pairwise interaction genetic effects are significantly affecting estimates of heritability in empirical biobank data, we applied i-LDSC to 25 continuous quantitative traits from the UK Biobank and BioBank Japan (Supplementary file 3). Protocols for computing GWAS summary statistics for the UK Biobank are described in the Materials and methods; while pre-computed summary statistics for BioBank Japan were downloaded directly from the consortium website (https://pheweb.jp/downloads). We release the cis-acting SNP-by-SNP interaction LD scores used in our analyses on the i-LDSC GitHub repository from two reference groups in the 1000 Genomes: 489 individuals from the European superpopulation (EUR) and 504 individuals from the East Asian (EAS) superpopulation (see also Supplementary files 4 and 5).

In each of the 25 traits, we analyzed in the UK Biobank, we detected significant proportions of estimated genetic variation stemming from tagged pairwise cis-interactions (Table 1). This includes many canonical traits of interest in heritability analyses: height, cholesterol levels, urate levels, and both systolic and diastolic blood pressure. Our findings in Table 1 are supported by multiple published studies identifying evidence of non-additive effects playing a role in the architectures of different traits of interest. For example, Li et al., 2020 found evidence for genetic interactions that contributed to the pathogenesis of coronary artery disease. It was also recently shown that non-additive genetic effects plays a significant role in body mass index (Song et al., 2022). Generally, we find that the traditional LDSC produces lower estimates of trait heritability because it does not consider the additional sources of genetic signal that i-LDSC does (Table 1). In BioBank Japan, 23 of the 25 traits analyzed had a significant nonlinear component detected by i-LDSC — with HDL and triglyceride levels being the only exceptions.

Table 1

i-LDSC heritability estimates and p-values highlighting statistically significant contributions of tagged pairwise genetic interaction effects for 25 traits in the UK Biobank and BioBank Japan.

Here, LDSC heritability estimates are included as a baseline. The difference between the approaches is that the i-LDSC heritability estimates include proportions of phenotypic variation that are explained by tagged non-additive variation (see columns with estimates of $ϑ$ ). Note that all 25 traits analyzed in the UK Biobank and 23 of the 25 traits analyzed in BioBank Japan have a statistically significant amount of tagged non-additive genetic effects as detected by the cis-interaction LD score (p < 0.05). The two traits without significant tagged non-additive genetic effects in BioBank Japan were HDL (p = 0.081) and Triglyceride (p = 0.110). These traits are indicated by *. The i-LDSC p-values are related to the estimates of the $ϑ$ coefficients which are also displayed in Figure 4.

Trait	UKB (LDSC)	UKB (i-LDSC)	UKB $\hat{ϑ}$	UKB p-value	BBJ (LDSC)	BBJ (i-LDSC)	BBJ $\hat{ϑ}$	BBJ p-value
Basophil	0.0250	0.0315	0.0065	1.572× 10⁻¹²	0.0684	0.1548	0.0864	0.025
BMI	0.1757	0.2349	0.0592	3.083× 10⁻⁸⁴	0.1667	0.2656	0.0989	2.438× 10⁻¹⁸
Cholesterol	0.0954	0.0974	0.0020	1.821× 10⁻¹⁶	0.0629	0.1268	0.0639	2.740× 10⁻⁴
CRP	0.0354	0.0414	0.0060	9.845× 10⁻¹²	0.0202	0.1625	0.1423	0.020
DBP	0.0940	0.1203	0.0263	1.118× 10⁻⁶⁵	0.0605	0.1267	0.0662	1.675× 10⁻⁷
EGFR	0.1521	0.1999	0.0478	1.187× 10⁻⁴⁶	0.1010	0.1225	0.0215	4.232× 10⁻⁵
Eosinophil	0.1055	0.1375	0.0320	1.230× 10⁻¹⁸	0.0785	0.1973	0.1188	0.001
HBA1C	0.0906	0.1083	0.0177	1.578× 10⁻²⁶	0.1057	0.1308	0.0251	0.031
HDL*	0.1599	0.1768	0.0169	9.636× 10⁻³⁷	0.1590	0.1838	0.0248	0.081
Height	0.3675	0.4815	0.1140	1.038× 10⁻⁶⁴	0.3941	0.7336	0.3395	7.433× 10⁻³³
Hematocrit	0.1078	0.1352	0.0274	2.479× 10⁻²⁵	0.0752	0.0928	0.0176	3.689× 10⁻⁵
Hemoglobin	0.1177	0.1433	0.0256	4.284× 10⁻²⁷	0.0702	0.0752	0.0050	9.037× 10⁻⁴
LDL	0.0802	0.0859	0.0057	5.087× 10⁻¹³	0.0745	0.1438	0.0693	0.018
Lymphocyte	0.0402	0.0501	0.0099	4.906× 10⁻¹⁹	0.0844	0.1757	0.0913	5.479× 10⁻⁵
MCH	0.1361	0.1597	0.0236	1.785× 10⁻²⁵	0.1536	0.2831	0.1295	1.042× 10⁻⁵
MCHC	0.0317	0.0364	0.0047	3.730× 10⁻¹²	0.0571	0.0650	0.0079	0.027
MCV	0.1630	0.1902	0.0272	1.180× 10⁻²⁹	0.1530	0.2818	0.1288	1.042× 10⁻⁵
Monocyte	0.0788	0.0955	0.0167	5.257× 10⁻¹⁸	0.0888	0.1549	0.0661	0.004
Neutrophil	0.1102	0.1391	0.0289	1.777× 10⁻³³	0.1191	0.2114	0.0923	5.050× 10⁻⁵
Platelet	0.1992	0.2447	0.0455	2.303× 10⁻³⁷	0.1565	0.2436	0.0871	7.724× 10⁻⁹
RBC	0.1574	0.1933	0.0359	3.292× 10⁻³¹	0.1203	0.2068	0.0865	5.972× 10⁻⁸
SBP	0.0954	0.1201	0.0247	8.660× 10⁻⁷⁵	0.0769	0.1604	0.0835	9.075× 10⁻¹⁰
Triglycerides*	0.1061	0.1204	0.0143	1.410× 10⁻²⁶	0.1171	0.2670	0.1499	0.110
Urate	0.1217	0.1550	0.0333	9.642× 10⁻³⁸	0.1395	0.3462	0.2067	0.015
WBC	0.0962	0.1250	0.0288	9.866× 10⁻³⁴	0.1024	0.2266	0.1242	1.346× 10⁻⁸

For each of the 25 traits that we analyzed, we found that the i-LDSC heritability estimates are significantly correlated with corresponding estimates from LDSC in both the UK Biobank ( $r^{2} = 0.988$ , $P = 5.936 \times 10^{- 24}$ ) and BioBank Japan ( $r^{2} = 0.849$ , $P = 6.061 \times 10^{- 11}$ ) as shown in Figure 4A. Additionally, we found that the heritability estimates for the same traits between the two biobanks are highly correlated according to both LDSC ( $r^{2} = 0.848$ , $P = 7.166 \times 10^{- 11}$ ) and i-LDSC ( $r^{2} = 0.666$ , $P = 6.551 \times 10^{- 7}$ ) analyses as shown in Figure 4B. After comparing the i-LDSC heritability estimates to LDSC, we then assessed whether there was significant difference in the amount of phenotypic variation explained by the non-additive genetic effect component in the GWAS summary statistics derived from the the UK Biobank and BioBank Japan (i.e. comparing the estimates of $ϑ$ ; see Figure 4—figure supplement 1A). We show that, while heterogeneous between traits, the phenotypic variation explained by genetic interactions is relatively of the same magnitude for both biobanks ( $r^{2} = 0.372$ , $P = 0.0119$ ). Notably, the trait with the most significant evidence of tagged cis-interaction effects in GWAS summary statistics is height which is known to have a highly polygenic architecture.

Figure 4 with 1 supplement see all

Download asset Open asset

The i-LDSC framework recovers heritability and provides estimates of tagged *cis*-interactions in GWAS summary statistics ( $ϑ$ ) for 25 quantitiative traits in the UK Biobank and BioBank Japan.

(A) In both the UK Biobank (green) and BioBank Japan (purple), estimates of phenotypic variance explained (PVE) by genetic effects from i-LDSC and LDSC are highly correlated for 25 different complex traits. The Spearman correlation coefficient between heritability estimates from LDSC and i-LDSC for the UK Biobank and BioBank Japan are $r^{2} = 0.989$ and $r^{2} = 0.850$ , respectively. The $y = x$ dotted line represents the values at which estimates from both approaches are the same. (B) PVE estimates from the UK Biobank are better correlated with those from the BioBank Japan across 25 traits using LDSC (Spearman $r^{2} = 0.848$ ) than i-LDSC (Spearman $r^{2} = 0.666$ ). (C) Both the original and stratified LDSC models recover the same amount of PVE when the *cis*-interaction LD score is included as an additional component in the UK Biobank analysis (Spearman $r^{2} = 0.989$ ). These models are listed as i-LDSC and s+i-LDSC, respectively. For s+i-LDSC, we included 97 functional annotations from Gazal et al. to estimate heritability. (D) Estimates of non-additive variance components in i-LDSC versus s+i-LDSC (Spearmen $r^{2} = 0.184$ ). While not statistically significant in the stratified analysis with the additional annotations, the non-additive component still makes nonzero contributions to the PVE estimation for all 25 traits in the UK Biobank (see Tables 1 and 2).

The intercepts estimated by LDSC and i-LDSC are also highly correlated in both the UK Biobank and the BioBank Japan (Figure 4—figure supplement 1B). Recall that these intercept estimates represent the confounding factor due to uncontrolled effects. For LDSC, this does include phenotypic variation that is due to unaccounted for pairwise statistical genetic interactions. The i-LDSC intercept estimates tend to be correlated with, but are generally different than, those computed with LDSC — empirically indicating that non-additive genetic variation is partitioned away and is missed when using the standard LD score alone. This result shows similar patterns in both the UK Biobank ( $r^{2} = 0.888$ , $P = 1.962 \times 10^{- 12}$ ) and BioBank Japan ( $r^{2} = 0.813$ , $P = 7.814 \times 10^{- 10}$ ).

Lastly, we performed an additional analysis in the UK Biobank where the cis-interaction scores are included as an annotation alongside 97 other functional categories in the stratified-LD score regression framework and its software s-LDSC (Gazal et al., 2017; Materials and methods). Here, s-LDSC heritability estimates still showed an increase with the interaction scores versus when the publicly available functional categories were analyzed alone, but albeit at a much smaller magnitude (Table 2). The contributions from the pairwise interaction component to the overall estimate of genetic variance ranged from 0.005 for MCHC ( $P = 0.373$ ) to 0.055 for HDL ( $P = 0.575$ ; Figure 4C and D). Furthermore, in this analysis, the estimates of the non-additive components were no longer statistically significant for any of the traits in the UK Biobank (Table 2). Despite this, these results highlight the ability of the i-LDSC framework to identify sources of ‘missing’ phenotypic variance explained in heritability estimation. Importantly, moving forward, we suggest using the cis-interaction scores with additional annotations whenever they are available as it provides more conservative estimates of the role of non-additive effects on trait architecture.

Table 2

Comparison of s-LDSC and i-LDSC estimates of phenotypic variance explained (PVE) by genetic effects for 25 complex traits in the UK Biobank.

Here, we use stratified LD score regression (s-LDSC) to partition heritability across different genomic elements (Finucane et al., 2015). We used 97 functional annotations from Gazal et al. to estimate heritability in 25 traits. We then appended cis-interaction LD scores as an additional annotation to obtain heritability estimates (this method is referred to as s+i-LDSC in the table). p-values for the s+i-LDSC model detailing the contributions of tagged non-additive genetic effects for 25 traits are provided in the last column. Note that, while not statistically significant in this stratified analysis with the additional annotations, the non-additive component still makes nonzero contributions to the PVE estimation for all 25 traits.

Trait	UKB PVE (s-LDSC)	UKB PVE (s+i-LDSC)	s+i-LDSC p-value
Basophil	0.0363	0.0375	0.4728
BMI	0.2100	0.2482	0.8126
Cholesterol	0.1042	0.1358	0.6202
CRP	0.0452	0.0524	0.6483
DBP	0.1228	0.1441	0.6125
EGFR	0.1826	0.2105	0.8507
Eosinophil	0.1403	0.1578	0.1867
HBA1C	0.1040	0.1275	0.6917
HDL	0.1820	0.2373	0.5754
Height	0.4315	0.4726	0.5224
Hematocrit	0.1416	0.1646	0.3956
Hemoglobin	0.1504	0.1795	0.2299
LDL	0.0858	0.1131	0.8812
Lymphocyte	0.0545	0.0651	0.1453
MCH	0.1497	0.1545	0.0968
MCHC	0.0450	0.0496	0.3728
MCV	0.1814	0.1930	0.1530
Monocyte	0.1085	0.1431	0.5421
Neutrophil	0.1320	0.1599	0.2499
Platelet	0.2317	0.2628	0.7371
RBC	0.1933	0.2223	0.3197
SBP	0.1206	0.1419	0.1100
Triglycerides	0.1335	0.1621	0.5301
Urate	0.1530	0.1736	0.1177
WBC	0.1221	0.1482	0.5155

Discussion

In this paper, we present i-LDSC, an extension of the LD score regression framework which aims to recover missing heritability from GWAS summary statistics by incorporating an additional score that measures the non-additive genetic variation that is tagged by genotyped SNPs. Here, we demonstrate how i-LDSC builds upon the original LDSC model through the development of new ‘cis-interaction’ LD scores which help to investigate signals of cis-acting SNP-by-SNP interactions (Figure 1 and Figure 1—figure supplements 1–5). Through extensive simulations, we show that i-LDSC is well-calibrated under the null model when polygenic traits are generated only by additive effects (Figure 2 and Figure 2—figure supplements 1–2), we highlight that i-LDSC provides greater heritability estimates over LDSC when traits are indeed generated with cis-acting SNP-by-SNP interaction effects (Figure 3 and Figure 3—figure supplement 1, and Supplementary files 1 and 2), and we tested the robustness of i-LDSC on phenotypes where assumptions of the original LD score model are violated (Figure 3—figure supplements 2–13). Finally, in real data, we show examples of many traits with estimated GWAS summary statistics that tag cis-interaction effects in the UK Biobank and BioBank Japan (Figure 4 and Figure 4—figure supplement 1, Tables 1 and 2, and Supplementary files 3-5). We have made i-LDSC a publicly available command line tool that requires minimal updates to the computing environment used to run the original implementation of LD score regression. In addition, we provide pre-computed cis-interaction LD scores calculated from the European (EUR) and East Asian (EAS) reference populations in the 1000 Genomes phase 3 data (see Data and Software Availability under Materials and Methods).

The current implementation of the i-LDSC framework offers many directions for future development and applications. First, an area of future work would be to explore how the relationship between cis-interaction LD scores and interaction effect sizes from the generative model of complex traits might bias heritability estimates provided by i-LDSC (e.g., similar to the relationship we explored between the standard LD scores and linear effect sizes in Figure 3—figure supplement 8). Second, as we showed with our simulation studies (Figure 3—figure supplements 2–8), the cis-interaction LD scores that we propose are not always enough to recover explainable non-additive genetic effects for all types of trait architectures. While we focus on pairwise cis-acting SNP-by-SNP statistical interactions in this work, the theoretical concepts underlying i-LDSC can easily be adapted to other types of interactions as well. Third, in our analysis of the UK Biobank and BioBank Japan, we showed that the inclusion of additional categories via frameworks such as stratified LD score regression (Finucane et al., 2015) can be used to provide more refined heritability estimates from GWAS summary statistics while accounting for linkage (see results in Table 1 versus Table 2). A key part of our future work is to continue to explore whether considering functional annotation groups would also improve our ability to identify tagged non-additive genetic effects. Lastly, we have only focused on analyzing one phenotype at a time in this study. However, many previous studies have extensively shown that modeling multiple phenotypes can often dramatically increase power (Runcie et al., 2020; Stamp et al., 2022). Therefore, it would be interesting to extend the i-LDSC framework to multiple traits to study nonlinear genetic correlations in the same way that LDSC was recently extended to uncover additive genetic correlation maps across traits (Naqvi et al., 2021).

	$𝐱_{j}^{2} = 0$	$𝐱_{j}^{2} = 1$	$𝐱_{j}^{2} = 4$
$𝐱_{k} = 0$	$u_{j k}^{2}$	$2 u_{j k} (1 - p_{k} - u_{j k})$	${(1 - p_{k} - u_{j k})}^{2}$
$𝐱_{k} = 1$	$2 u_{j k} (1 - p_{j} - u_{j k})$	$\begin{array}{ll} 2 u_{j k} (u_{j k} + p_{j} + p_{k} - 1) + \\ 2 (1 - p_{j} - u_{j k}) (1 - p_{k} - u_{j k}) \end{array}$	$2 (u_{j k} + p_{j} + p_{k} - 1) (1 - p_{k} - u_{j k})$
$𝐱_{k} = 2$	${(1 - p_{j} - u_{j k})}^{2}$	$2 (u_{j k} + p_{j} + p_{k} - 1) (1 - p_{j} - u_{j k})$	${(u_{j k} + p_{j} + p_{k} - 1)}^{2}$

Share this article

Cite this article

Power of the i-LDSC framework to detect tagged pairwise genetic interaction effects on simulated data.

The i-LDSC framework is well-calibrated under the null hypothesis and does not identify evidence of tagged non-additive effects when polygenic traits are generated by only additive effects.

i-LDSC robustly and accurately estimates the proportions of phenotypic variance explained (PVE) by genetic effects (i.e. estimated heritability) in simulations in polygenic traits, compared to LDSC, due to our accounting for interaction effects tagged in additive GWAS summary statistics.

i-LDSC heritability estimates and p-values highlighting statistically significant contributions of tagged pairwise genetic interaction effects for 25 traits in the UK Biobank and BioBank Japan.

The i-LDSC framework recovers heritability and provides estimates of tagged cis-interactions in GWAS summary statistics (ϑ) for 25 quantitiative traits in the UK Biobank and BioBank Japan.

Comparison of s-LDSC and i-LDSC estimates of phenotypic variance explained (PVE) by genetic effects for 25 complex traits in the UK Biobank.

Author details

Samuel Pattillo Smith

Contribution

Contributed equally with

Competing interests

Gregory Darnell

Contribution

Contributed equally with

Competing interests

Dana Udwin

Contribution

Competing interests

Julian Stamp

Contribution

Competing interests

Arbel Harpak

Contribution

Competing interests

Sohini Ramachandran

Contribution

Contributed equally with

Competing interests

Lorin Crawford

Contribution

Contributed equally with

For correspondence

Competing interests

Citations by DOI

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Categories and tags

Research organism

The i-LDSC framework recovers heritability and provides estimates of tagged cis-interactions in GWAS summary statistics ( $ϑ$ ) for 25 quantitiative traits in the UK Biobank and BioBank Japan.