A Mendelian randomization study of the role of lipoprotein subfractions in coronary artery disease
Abstract
Recent genetic data can offer important insights into the roles of lipoprotein subfractions and particle sizes in preventing coronary artery disease (CAD), as previous observational studies have often reported conflicting results. We used the LD score regression to estimate the genetic correlation of 77 subfraction traits with traditional lipid profile and identified 27 traits that may represent distinct genetic mechanisms. We then used Mendelian randomization (MR) to estimate the causal effect of these traits on the risk of CAD. In univariable MR, the concentration and content of medium high-density lipoprotein (HDL) particles showed a protective effect against CAD. The effect was not attenuated in multivariable analyses. Multivariable MR analyses also found that small HDL particles and smaller mean HDL particle diameter may have a protective effect. We identified four genetic markers for HDL particle size and CAD. Further investigations are needed to fully understand the role of HDL particle size.
Introduction
Lipoprotein subfractions have been increasingly studied in epidemiological research and used in clinical practice to predict the risk of cardiovascular diseases (CVD) (Rankin et al., 2014; Mora et al., 2009; China Kadoorie Biobank Collaborative Group et al., 2018). Several studies have identified potentially novel subfraction predictors for CVD (Mora et al., 2009; Hoogeveen et al., 2014; Williams et al., 2014; Ditah et al., 2016; Lawler et al., 2017; Fischer et al., 2014) and demonstrated that the addition of subfraction measurements can significantly improve the risk prediction for CVD (Würtz et al., 2012; van Schalkwijk et al., 2014; McGarrah et al., 2016; Rankin et al., 2014). However, these observational studies often provide conflicting evidence on the precise roles of the lipoprotein subfractions. For example, while some studies suggested that small, dense low-density lipoprotein (LDL) particles may be more atherogenic (Lamarche et al., 1997; Hoogeveen et al., 2014), others found that larger LDL size is associated with higher CVD risk (Campos et al., 2001; Mora, 2009). Some recent observational studies found that the inverse association of CVD outcomes with smaller high-density lipoprotein (HDL) particles is stronger than the association with larger HDL particles (Ditah et al., 2016; Kim et al., 2016; McGarrah et al., 2016; Silbernagel et al., 2017), but other studies reached the opposite conclusion in different cohorts (Li et al., 2016; Arsenault et al., 2009). Currently, the utility of lipoprotein subfractions or particle sizes in routine clinical practice remains controversial (Superko, 2009; Mora, 2009; Davidson et al., 2011; Bays et al., 2016), as there is still a great uncertainty about their causal roles in CVD, largely due to a lack of intervention data (Bays et al., 2016).
Mendelian randomization (MR) is a useful causal inference method that avoids many common pitfalls of observational cohort studies (Smith and Ebrahim, 2003). By using genetic variation as instrumental variables, MR asks if the genetic predisposition to a higher level of the exposure (in this case, lipoprotein subfractions) is associated with higher occurrences of the disease outcome (Didelez and Sheehan, 2007). A positive association suggests a causally harmful effect of the exposure if the genetic variants satisfy the instrumental variable assumptions (Didelez and Sheehan, 2007; Davey Smith and Hemani, 2014). Since MR can provide unbiased causal estimates even when there are unmeasured confounders, it is generally considered more credible than other non-randomized designs and is quickly gaining popularity in epidemiological research (Gidding et al., 2012; Davies et al., 2018). MR has been used to estimate the effect of several metabolites on CVD, but most prior studies are limited to just one or a few risk exposures at a time (Emdin et al., 2016; Ference et al., 2017).
In this study, we will use recent genetic data to investigate the roles of lipid and lipoprotein traits in the occurrence of coronary artery disease (CAD) and myocardial infarction (MI). In particular, we are interested in discovering lipoprotein subfractions that may be causal risk factors for CAD and MI in addition to the traditional lipid profile (LDL cholesterol, HDL cholesterol, and triglyceride levels). To this end, we will first estimate the genetic correlation of the lipoprotein subfractions and particle sizes with the traditional risk factors and remove the traits that have a high genetic correlation. We will then use MR to estimate the causal effects of the selected lipoprotein subfractions and particle sizes on CAD and MI. Finally, we will explore potential genetic markers for the identified lipoprotein and subfraction traits.
Materials and methods
GWAS summary datasets and lipoprotein particle measurements
Table 1 describes all GWAS summary datasets used in this study, including two GWAS of the traditional lipid risk factors (Willer et al., 2013; Hoffmann et al., 2018), two recent GWAS of the human lipidome (Kettunen et al., 2016; Davis et al., 2017), and three GWAS of CAD or MI (Nikpay et al., 2015; Nelson et al., 2017; Abbott et al., 2018). In the two GWAS of the lipidome (Kettunen et al., 2016; Davis et al., 2017), high-throughput nuclear magnetic resonance (NMR) spectroscopy was used to measure the circulating lipid and lipoprotein traits (Soininen et al., 2009). We investigated the 82 lipid and lipoprotein traits measured in these studies that are related to very-low-density lipoprotein (VLDL), LDL, intermediate-density lipoprotein (IDL), and HDL subfractions and particle sizes. All the subfraction traits are named with three components that are separated by hyphens: the first component indicates the size (XS, S, M, L, XL, XXL); the second component indicates the fraction according to the lipoprotein density (VLDL, LDL, IDL, HDL); the third component indicates the measurement (C for total cholesterol, CE for cholesterol esters, FC for free cholesterol, L for total lipids, P for particle concentration, PL for phospholipids, TG for triglycerides). For example, M-HDL-P refers to the concentration of medium HDL particles.
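The naming convention above can be made concrete with a small parser. This is an illustrative sketch only: the function name `parse_trait` and the `SubfractionTrait` record are hypothetical, and the handling of two-component names such as IDL-FC (treated as having no size prefix) is an assumption rather than something stated in the text.

```python
from typing import NamedTuple, Optional

# Codes taken from the naming convention described in the text.
SIZES = {"XS", "S", "M", "L", "XL", "XXL"}
FRACTIONS = {"VLDL", "LDL", "IDL", "HDL"}
MEASUREMENTS = {
    "C": "total cholesterol",
    "CE": "cholesterol esters",
    "FC": "free cholesterol",
    "L": "total lipids",
    "P": "particle concentration",
    "PL": "phospholipids",
    "TG": "triglycerides",
}


class SubfractionTrait(NamedTuple):
    size: Optional[str]   # None when the name has no size prefix
    fraction: str
    measurement: str      # human-readable description


def parse_trait(name: str) -> SubfractionTrait:
    """Split a trait name such as 'M-HDL-P' into its components."""
    parts = name.split("-")
    if len(parts) == 3:
        size, fraction, code = parts
    elif len(parts) == 2:
        # Assumption: names like 'IDL-FC' omit the size component.
        size = None
        fraction, code = parts
    else:
        raise ValueError(f"unrecognized trait name: {name!r}")
    if size is not None and size not in SIZES:
        raise ValueError(f"unknown size in {name!r}")
    if fraction not in FRACTIONS:
        raise ValueError(f"unknown fraction in {name!r}")
    if code not in MEASUREMENTS:
        raise ValueError(f"unknown measurement in {name!r}")
    return SubfractionTrait(size, fraction, MEASUREMENTS[code])


print(parse_trait("M-HDL-P"))
```

Names that do not follow the subfraction convention, such as the mean-diameter trait HDL-D, are rejected by this sketch with a `ValueError`.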
Aside from the concentration and content of lipoprotein subfractions, the two lipidome GWAS also measured the traditional lipid traits (TG, LDL-C, HDL-C), the average diameter of the fractions (VLDL-D, LDL-D, HDL-D) and the concentration of apolipoprotein A1 (ApoA1) and apolipoprotein B (ApoB). A full list of the lipoprotein measurements investigated in this article can be found in Appendix 1.
Genetic correlation and phenotypic screening
Genetic correlation is a measure of association between the genetic determinants of two phenotypes. It is conceptually different from epidemiological correlation that can be directly estimated from cross-sectional data. In this study, we applied LD score regression (Bulik-Sullivan et al., 2015) to the lipidome GWAS (Kettunen et al., 2016; Davis et al., 2017) to estimate the genetic correlations between the lipoprotein subfractions, particle sizes, and traditional risk factors. We then removed lipoprotein subfractions and particle sizes that are strongly correlated with the traditional risk factors, defined as an estimated genetic correlation > 0.8 with TG, LDL-C, HDL-C, ApoB, or ApoA1 in the GWAS published by Davis et al., 2017. Because these traits are largely co-determined with the traditional risk factors, they do not represent independent biological mechanisms and may lead to multicollinearity issues in multivariable MR analyses. Finally, we obtained an independent estimate of the genetic correlations between the selected traits by applying LD score regression to the GWAS published by Kettunen et al., 2016. We used Bonferroni's procedure to correct for multiple testing (familywise error rate at 0.05).
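The screening rule can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names are hypothetical, the input is assumed to be a table of already-estimated genetic correlations, and the threshold is applied to the signed estimate exactly as the rule is worded above.

```python
TRADITIONAL = ("TG", "LDL-C", "HDL-C", "ApoB", "ApoA1")


def screen_traits(rg, threshold=0.8):
    """Keep traits whose genetic correlation with every traditional
    risk factor is at most `threshold`.

    rg maps trait name -> {traditional factor -> estimated correlation}.
    Assumption: the rule uses the signed estimate, as worded in the text.
    """
    kept = []
    for trait, by_factor in rg.items():
        if all(by_factor[f] <= threshold for f in TRADITIONAL):
            kept.append(trait)
    return kept


def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests that pass Bonferroni correction at familywise level alpha."""
    m = len(p_values)
    return {name: p <= alpha / m for name, p in p_values.items()}


# Toy numbers for illustration only; they are not results from the study.
toy_rg = {
    "L-VLDL-P": {"TG": 0.98, "LDL-C": 0.30, "HDL-C": -0.50, "ApoB": 0.60, "ApoA1": -0.30},
    "M-HDL-P": {"TG": -0.10, "LDL-C": 0.05, "HDL-C": 0.55, "ApoB": 0.00, "ApoA1": 0.60},
}
print(screen_traits(toy_rg))
```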
Three-sample Mendelian randomization design
For MR, we employed a three-sample design (Zhao et al., 2019b) in which one GWAS was used to select independent genetic instruments that are associated with one or several lipoprotein measures. The other two GWAS were then used to obtain summary associations of the selected SNPs with the exposure and the outcome, as in a typical two-sample MR design (Pierce and Burgess, 2013; Hemani et al., 2016). More specifically, the selection GWAS was used to create a set of SNPs that are in linkage equilibrium with each other in a reference panel (distance >10 megabase pairs, ). This was done by ordering the SNPs by the p-values of their association with the trait(s) under investigation and then selecting them greedily using the linkage-disequilibrium (LD) clumping function in the PLINK software package (Purcell et al., 2007). To avoid winner's curse, we required the other two GWAS to have no overlapping sample with the selection GWAS.
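The greedy selection step can be illustrated with a short sketch. It is not PLINK's implementation, only a simplified picture of the idea: the `Snp` record, the `greedy_clump` function, and the user-supplied `r2` lookup are hypothetical names, and the LD cutoff `r2_max` is deliberately left as a required argument because its value is not given here.

```python
from typing import Callable, List, NamedTuple


class Snp(NamedTuple):
    rsid: str
    chrom: int
    pos: int      # base-pair position
    pval: float   # p-value in the selection GWAS


def greedy_clump(
    snps: List[Snp],
    r2: Callable[[str, str], float],  # LD (r^2) between two SNPs in a reference panel
    r2_max: float,                    # LD cutoff; no default, value not assumed
    window_bp: int = 10_000_000,      # 10 megabase pairs, as in the text
) -> List[Snp]:
    """Visit SNPs from smallest to largest p-value and keep a SNP unless it
    is close to, and in LD with, a SNP that was already kept."""
    selected: List[Snp] = []
    for snp in sorted(snps, key=lambda s: s.pval):
        clash = any(
            snp.chrom == kept.chrom
            and abs(snp.pos - kept.pos) <= window_bp
            and r2(snp.rsid, kept.rsid) >= r2_max
            for kept in selected
        )
        if not clash:
            selected.append(snp)
    return selected
```

Because the SNPs are visited in order of increasing p-value, the strongest signal in each LD block is the one retained.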
As the GWAS published by Davis et al., 2017 has a smaller sample size, we used it to select the genetic instruments so the larger dataset can be used for statistical estimation. In univariable MR, associations of the selected SNPs with the exposure trait (a lipoprotein subfraction or a particle size trait) were obtained from the GWAS published by Kettunen et al., 2016 and the associations with MI were obtained using summary data from an interim release of UK Biobank (Abbott et al., 2018). To maximize the statistical power, we used the so-called ‘genome-wide MR’ design: independent SNPs were selected by LD clumping, but the list of SNPs was not truncated by p-value. More details about this design can be found in a previous methodological article (Zhao et al., 2019b).
To control for potential pleiotropic effects via the traditional risk factors, we performed two multivariable MR analyses for each lipoprotein subfraction or particle size under investigation. The first multivariable MR analysis considers four exposures: TG, LDL-C, HDL-C, and the lipoprotein measurement under investigation. The second multivariable MR analysis replaces LDL-C and HDL-C with ApoB and ApoA1, in accordance with some recent studies (Richardson et al., 2020). SNPs were ranked by their minimum p-values with the four exposures and were selected as instruments only if they were associated with at least one of the four exposures (p-value ). Both multivariable MR analyses used the Davis (Davis et al., 2017) and GERA (Hoffmann et al., 2018) datasets for instrument selection, the Kettunen (Kettunen et al., 2016) and GLGC (Willer et al., 2013) datasets for the associations of the instruments with the exposures, and the CARDIoGRAMplusC4D + UK Biobank (Nelson et al., 2017) dataset for the associations with CAD.
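The ranking rule for multivariable instrument selection might look like the sketch below. The function name `rank_by_min_p` is hypothetical, and the p-value cutoff is a required argument because its value is not stated here; LD clumping would still be applied to the ranked list afterwards.

```python
def rank_by_min_p(pvals, threshold):
    """Rank SNPs by their smallest p-value across the exposures and keep
    those associated with at least one exposure.

    pvals maps rsid -> {exposure name -> p-value in the selection GWAS}.
    threshold is the p-value cutoff; its value is not assumed here.
    """
    min_p = {rsid: min(by_exposure.values()) for rsid, by_exposure in pvals.items()}
    eligible = [rsid for rsid, p in min_p.items() if p < threshold]
    return sorted(eligible, key=min_p.get)


# Toy p-values for illustration only.
toy = {
    "rs_a": {"TG": 1e-9, "LDL-C": 0.20, "HDL-C": 0.40, "M-HDL-P": 0.30},
    "rs_b": {"TG": 0.50, "LDL-C": 0.10, "HDL-C": 0.08, "M-HDL-P": 0.60},
    "rs_c": {"TG": 0.70, "LDL-C": 0.90, "HDL-C": 1e-5, "M-HDL-P": 1e-12},
}
print(rank_by_min_p(toy, threshold=1e-4))
```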
Statistical estimation
For univariable MR, we used the robust adjusted profile score (RAPS) because it is more efficient and robust than many conventional methods (Zhao et al., 2020; Zhao et al., 2019b). RAPS can consistently estimate the causal effect even when some of the genetic variants violate instrumental variables assumptions. For multivariable MR, we used an extension to RAPS called GRAPPLE to obtain the causal effect estimates of multiple exposures (Wang et al., 2020). GRAPPLE also allows the exposure GWAS to have overlapping sample with the outcome GWAS, while the original RAPS does not. We assessed the strength of the instruments using the modified Cochran's Q statistic (Sanderson et al., 2019). Because many lipoprotein subfraction traits were analyzed simultaneously, we used the Benjamini-Hochberg procedure to correct for multiple testing (Benjamini and Hochberg, 1995) and the false discovery rate was set to be 0.05. More detail about the statistical methods can be found in Appendix 3.
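For readers unfamiliar with the Benjamini-Hochberg step-up procedure, a minimal sketch is given below. The function name is hypothetical and the code is a plain textbook implementation, not the software used in the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    by the Benjamini-Hochberg step-up procedure at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * fdr / m.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * fdr / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))
```

Note that the procedure rejects every hypothesis up to the largest qualifying rank, so a p-value that misses its own threshold can still be rejected when a larger p-value passes.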
Genetic markers for lipoprotein subfractions and CAD
To obtain genetic markers, we selected SNPs that are associated with the lipoprotein measurements identified in the MR (p-value ) and CAD (p-value ) but are not associated with LDL-C or ApoB (p-value ). To maximize the power of this exploratory analysis, we meta-analyzed the results of the two lipidome GWAS (Kettunen et al., 2016; Davis et al., 2017) by inverse-variance weighting. For the associations with LDL-C and CAD, we used the GWAS summary data reported by the GLGC (Willer et al., 2013) and CARDIoGRAMplusC4D (Nelson et al., 2017) consortia. We used LD clumping to obtain independent markers (Purcell et al., 2007) and then validated the markers using tissue-specific gene expression data from the GTEx project.
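Inverse-variance weighted meta-analysis of a single SNP across studies can be sketched as follows. This is the standard fixed-effect formula; the function names are hypothetical, and the sketch assumes the per-study effects are on the same scale and aligned to the same effect allele.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted meta-analysis.

    betas, ses: per-study effect estimates and standard errors for one SNP.
    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def two_sided_p(beta, se):
    """Two-sided p-value from a normal approximation to beta / se."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))


# Toy numbers for illustration only.
pooled_beta, pooled_se = ivw_meta([0.10, 0.30], [0.10, 0.10])
print(pooled_beta, pooled_se, two_sided_p(pooled_beta, pooled_se))
```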
Sensitivity analysis and replicability
Because we had multiple GWAS summary datasets for the lipoprotein subfractions and CAD/MI (Table 1), we swapped the roles of the GWAS datasets in the three-sample MR design whenever permitted by the statistical methods to obtain multiple statistical estimates. These estimates are not completely independent of the primary results, but they can nonetheless be used to assess replicability. As a sensitivity analysis, we further analyzed univariable MR using inverse-variance weighting (IVW) (Burgess et al., 2013) and weighted median (Bowden et al., 2016) and compared the results with the primary results obtained by RAPS. We also assessed the assumptions made by RAPS using some diagnostic plots suggested in previous methodological articles (Zhao et al., 2019b).
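As a point of reference for the sensitivity analysis, the summary-data IVW estimator can be written in a few lines. This is a sketch of the usual fixed-effect form, which ignores measurement error in the SNP-exposure associations; the function name `ivw_mr` is hypothetical, and RAPS itself is not reproduced here.

```python
import math


def ivw_mr(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect from summary data.

    beta_exposure: SNP-exposure associations
    beta_outcome:  SNP-outcome associations
    se_outcome:    standard errors of the SNP-outcome associations
    """
    num = sum(bx * by / s ** 2
              for bx, by, s in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / s ** 2
              for bx, s in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


# Toy data in which every SNP-outcome effect is exactly half the
# SNP-exposure effect, so the estimate is close to 0.5.
est, se = ivw_mr([0.1, 0.2, 0.3], [0.05, 0.10, 0.15], [0.02, 0.02, 0.02])
print(est, se)
```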
Results
Genetic correlations and phenotypic screening
We obtained the genetic correlations of the lipoprotein subfractions and particle sizes with the traditional lipid risk factors: TG, LDL-C, HDL-C, ApoB, and ApoA1 (Appendix 2—table 1). We found that almost all VLDL subfraction traits (besides those related to the very small VLDL subfraction) and the mean VLDL particle diameter have an estimated genetic correlation with TG very close to 1. Most traits related to the large and very large HDL subfractions also have a high genetic correlation with HDL-C and ApoA1.
After removing traits that are strongly correlated with the traditional risk factors, we obtained 27 traits that may involve independent genetic mechanisms. Figure 1 shows the genetic correlation matrix for these traits and the traditional lipid factors. The selected traits can be divided into two groups based on whether they are related to VLDL/LDL/IDL particles or HDL particles. Within each group, most traits were strongly correlated with the others. In the first group, most traits had a positive genetic correlation with LDL-C and ApoB, while in the second group, most traits had a positive genetic correlation with HDL-C and ApoA1. Exceptions include LDL-D, which had a negative but statistically non-significant genetic correlation with LDL-C and ApoB, and S-HDL-P and S-HDL-L, which showed no or weak genetic correlation with HDL-C and ApoA1.
Mendelian randomization
Figure 2 shows the estimated causal effect of the selected lipoprotein measurements on MI or CAD that are statistically significant (false discovery rate = 0.05). The unfiltered results can be found in Appendix 3, which also contains results of the sensitivity and replicability analyses.
The concentration and lipid content of VLDL, LDL, and IDL subfractions showed harmful and nearly uniform effects on MI in univariable MR. However, after adjusting for the traditional lipid risk factors, the effects of these ApoB-related subfractions become close to zero (besides IDL-FC in one multivariable analysis). The mean diameter of LDL particles (LDL-D) showed a harmful effect on MI in univariable MR, though the effect was smaller than those of the LDL subfractions in univariable MR. The estimated effect of LDL-D was attenuated in the multivariable MR analyses.
The concentration and content of medium HDL particles showed protective effects in univariable and multivariable MR analyses. In particular, adjusting for the traditional lipid risk factors did not attenuate the effect of traits related to medium HDL. The concentration of and total lipid in small HDL particles showed protective effects in multivariable MR analyses, though the effect sizes were smaller than those of the medium HDL traits. The mean diameter of HDL particles (HDL-D) had almost no effect on MI in the univariable MR analysis, but after adjusting for the traditional lipid risk factors, it showed a harmful effect.
Table 2 reports the estimated effects of M-HDL-P, S-HDL-P, HDL-D, and traditional lipid traits (TG, LDL-C, HDL-C, ApoB, ApoA1) in the multivariable MR analyses. To better understand the role of HDL subfractions and particle sizes, we also included in the table the results of the multivariable MR analyses for the traditional lipid risk factors only. Those baseline analyses suggested that HDL-C/ApoA1 had a weak, non-significant protective effect on CAD, which is consistent with prior studies (Holmes et al., 2015; Wang et al., 2020). Adding S-HDL-P to the MR analysis did not substantially alter the estimated effects of the traditional lipid traits. However, when M-HDL-P or HDL-D was included in the model, the estimated effects of HDL-C and ApoA1 changed substantially. In particular, when M-HDL-P was included in the multivariable MR analyses, HDL-C/ApoA1 showed a harmful effect on CAD. When HDL-D was included, HDL-C/ApoA1 showed a protective effect.
Genetic markers associated with HDL subfractions and CAD
We identified four genetic variants that are associated with S-HDL-P, M-HDL-P, or HDL-D, not associated with LDL-C or ApoB, and associated with CAD: rs838880 (SCARB1), rs737337 (DOCK6), rs2943641 (IRS1), and rs6065904 (PLTP) (Figure 3). These SNP-cis gene pairs are also supported by examining expression quantitative trait loci (eQTL) in the tissue-specific GTEx data (Appendix 4). The first three variants were not associated with S-HDL-P. However, they had uniformly positive associations with M-HDL-P, L-HDL-P, XL-HDL-P, HDL-D, ApoA1, and HDL-C, and a negative association with CAD. The last variant rs6065904 had positive associations with S-HDL-P and M-HDL-P, negative associations with L-HDL-P, XL-HDL-P, HDL-D, negative but smaller associations with ApoA1 and HDL-C, and a negative association with CAD.
Sensitivity and replicability analysis
We also investigated the effects of lipoprotein subfractions and particle sizes on MI/CAD using multiple GWAS datasets, MR designs and statistical methods. The results are provided in Appendix 3 and are generally in agreement with the primary results reported above. The diagnostic plots for S-HDL-P and M-HDL-P did not suggest evidence of violations of the instrument strength independent of direct effect (InSIDE) assumption (Bowden et al., 2015) made by RAPS and GRAPPLE (Appendix 4).
Discussion
By using recent genetic data and MR, this study examines whether some lipoprotein subfractions and particle sizes, beyond the traditional lipid risk factors, may play a role in coronary artery disease. We find that VLDL subfractions have extremely high genetic correlations with blood triglyceride level and thus offer little extra value. We find some weak evidence that larger LDL particle size may have a small harmful effect on myocardial infarction and coronary artery disease.
Our main finding is that the size of HDL particles may play an important and previously undiscovered role. Although the concentration and lipid content of small and medium HDL particles appear to be positively correlated with HDL cholesterol and ApoA1, their genetic correlations are much smaller than 1, indicating possible independent biological pathway(s). Moreover, the MR analyses suggested that the small and medium HDL particles may have protective effects on CAD. We also find that larger HDL mean particle diameter may have a harmful effect on CAD. Finally, we identified four potential genetic markers for HDL particle size that are independent of LDL cholesterol and ApoB.
There has been a heated debate on the role of HDL particles in CAD in recent years following the failure of several trials for CETP inhibitors (Barter et al., 2007; Schwartz et al., 2012; Lincoff et al., 2017) and recombinant ApoA1 (Nicholls et al., 2018) targeting HDL cholesterol. Observational epidemiology studies have long demonstrated strong inverse association between HDL cholesterol and the risk of CAD or MI (Miller and Miller, 1975; Lewington et al., 2007; Di Angelantonio et al., 2009), but conflicting evidence has been found in MR studies. In an influential study, Voight and collaborators found that the genetic variants associated with HDL cholesterol had varied associations with CAD and that almost all variants suggesting a protective effect of HDL cholesterol were also associated with LDL cholesterol or triglycerides (Voight et al., 2012). Other MR studies also found that the effect of HDL cholesterol on CAD is heterogeneous (Zhao et al., 2019b) or attenuated after adjusting for LDL cholesterol and triglycerides (Holmes et al., 2017; White et al., 2016).
Notice that the harmful effect of larger HDL particle diameter found in this study relies on including HDL-C or ApoA1 in the multivariable MR analysis. Thus, the role of HDL particles in preventing CAD may be more complicated than, for example, that of LDL cholesterol or ApoB. It is possible that HDL cholesterol, HDL subfractions, and HDL particle size are all phenotypic markers for some underlying causal mechanism. A related theory is the HDL function hypothesis (Rader and Hovingh, 2014). Cholesterol efflux capacity, a measure of HDL function, has been documented as superior to HDL-C in predicting CVD risk (Rohatgi et al., 2014; Saleheen et al., 2015). Recent epidemiologic studies found that HDL particle size is positively associated with cholesterol efflux capacity in post-menopausal women (El Khoudary et al., 2016) and in an asymptomatic older cohort (Mutharasan et al., 2017). However, mechanistic efflux studies showed that small HDL particles actually mediate more cholesterol efflux (Favari et al., 2009; Du et al., 2015). A likely explanation of this seeming contradiction is that a high concentration of small HDL particles in the serum may mark a block in maturation of small HDL particles (Mutharasan et al., 2017). This can also partly explain our finding that small HDL traits have a smaller effect than medium HDL traits, as increased medium HDL might indicate successful maturation of small HDL particles.
Among the reported genetic markers, SCARB1 and PLTP have established relations to HDL metabolism and CAD. SCARB1 encodes a plasma membrane receptor for HDL and is involved in hepatic uptake of cholesterol from peripheral tissues. Recently, a rare mutation (P376L) of SCARB1 was reported to raise HDL-C level and increase CAD risk (Zanoni et al., 2016; Samadi et al., 2019). This direction is opposite to the conventional belief that HDL-C is protective and could be explained by HDL dysfunction. PLTP encodes the phospholipid transfer protein and mediates the transfer of phospholipid and cholesterol from LDL and VLDL to HDL. As a result, PLTP plays a complex but pivotal role in HDL particle size and composition. Several studies have suggested that high PLTP activity is a risk factor for CAD (Schlitt et al., 2003; Schlitt et al., 2009; Zhao et al., 2019a).
Our study should be viewed in the context of its limitations, in particular, the inherent limitations of the summary-data MR design. Any causal inference from non-experimental data makes unverifiable assumptions, and so does our study. Conventional MR studies assume that the genetic variants are valid instrumental variables. The statistical methods we used make less stringent assumptions about the instrumental variables, but those assumptions could still be violated even though our model diagnosis does not suggest evidence against the InSIDE assumption. Our study did not adjust for other risk factors for CAD such as body mass index, blood pressure, and smoking. All the GWAS datasets used in this study are from the European population, so the same conclusions might not generalize to other populations. Furthermore, our study used GWAS datasets from heterogeneous subpopulations, which may also introduce bias (Zhao et al., 2019c). We also did not use more than one subfraction trait as an exposure in multivariable MR because of their high genetic correlations. Alternative statistical methods could be used to select the best causal risk factor from high-throughput experiments (Zuber et al., 2019). Finally, as pointed out by reviewers, triglycerides have a greater intra-individual biological variability than HDL particle size. It is likely that triglycerides and HDL size represent a gene/environment interaction with a very large environmental component. Further investigations are needed to fully understand this mechanism.
Recently, an NMR spectroscopy method has been developed to estimate HDL cholesterol efflux capacity from serum (Kuusisto et al., 2019). That method can form the basis of a genetic analysis of HDL cholesterol efflux capacity and may complement the results here. We believe more laboratory and epidemiological research is needed to clarify the roles of HDL subfractions and particle size in cardiovascular diseases.
Appendix 1
Lipid and lipoprotein traits
Two published GWAS of the human lipidome (Kettunen et al., 2016; Davis et al., 2017) measured lipoprotein subfractions and particle sizes using NMR spectroscopy. We investigated the 82 lipid and lipoprotein traits measured in these studies that are related to very-low-density lipoprotein (VLDL), LDL, intermediate-density lipoprotein (IDL), and HDL subfractions and particle sizes. All the subfraction traits are named using three components separated by hyphens: the first indicates the size (XS, S, M, L, XL, XXL); the second indicates the category according to the lipoprotein density (VLDL, LDL, IDL, HDL); the third indicates the measurement (C for total cholesterol, CE for cholesterol esters, FC for free cholesterol, L for total lipids, P for particle concentration, PL for phospholipids, TG for triglycerides). A full list of lipid and lipoprotein traits used in our study can be found in Appendix 1—table 1 below.
Appendix 2
Genetic correlations
We estimated the genetic correlation between lipoprotein subfractions, particle sizes, and traditional lipid risk factors using LD score regression (Bulik-Sullivan et al., 2015). Appendix 2—figures 1–3 show the estimated genetic correlation matrix between selected traits using different datasets. Below the figures, Appendix 2—table 1 shows the estimated genetic correlations of the lipoprotein subfractions with the traditional lipid risk factors using the Davis GWAS. The results in Appendix 2—table 1 were then used to screen the traits as described in Materials and methods.
Appendix 3
Mendelian randomization
We implemented several Mendelian randomization (MR) designs and statistical methods to estimate the causal effect of lipoprotein subfractions and particle sizes on coronary artery disease. In general, we adopted the three-sample summary-data MR design described in Zhao et al., 2019b and Wang et al., 2020, and we swapped the roles of the GWAS datasets whenever permitted by the statistical methods. More specifically, the statistical methods we used for univariable MR (RAPS, IVW, weighted median) require that the GWAS datasets for obtaining instruments, SNP effects on the exposure, and SNP effects on the outcome have no overlapping sample. The multivariable MR method we used (GRAPPLE) allows the exposure and outcome GWAS to be dependent and estimates the proportion of overlapping sample. However, GRAPPLE still requires that the selection GWAS uses a non-overlapping sample.
The MR designs we implemented in this study are summarized in Appendix 3—table 1. We considered two ways of instrument selection for univariable MR. In ‘traditional selection’, the traditional lipid traits were used to select the instruments for the corresponding subfraction traits. That is, HDL-C was used to select SNPs for HDL subfractions and particle size, LDL-C for IDL and LDL subfractions and particle size, and TG for VLDL subfractions and particle size. This tends to select more instruments because the GWAS for traditional lipid traits had a larger sample size. In ‘subfraction selection’, the instrumental SNPs were selected for each lipoprotein subfraction and particle size using the same or closest trait in the selection GWAS. For example, if the exposure under investigation is S-HDL-L but it is not measured in the Davis GWAS (if it is used for selection), S-HDL-P is used instead for instrument selection.
For multivariable MR, we considered two models with different sets of exposures: TG, LDL-C, HDL-C, and the subfraction/particle size under investigation; TG, ApoB, ApoA1, and the subfraction/particle size under investigation. SNPs were selected as potential instruments if they were associated (p-value ) with at least one of the four exposures. LD clumping was then used to obtain independent instruments, as described in Materials and Methods.
We briefly comment on the statistical methods used in univariable MR. All three methods we used (RAPS, IVW, weighted median) require that the exposure GWAS and outcome GWAS have non-overlapping samples. RAPS and weighted median can provide consistent estimates of the causal effect even when some of the genetic variants are not valid instruments, provided that the direct effects of the genetic variants are independent of the strength of their associations with the exposure. The last condition is called the Instrument Strength Independent of Direct Effect (InSIDE) assumption in the MR literature (Bowden et al., 2015). RAPS is also robust to idiosyncratically large direct effects (Bowden et al., 2015). Because IVW and weighted median can be severely biased by weak instruments (Zhao et al., 2020), we only used them with the set of SNPs that have genome-wide significant association (p-value ) with the exposure. In comparison, RAPS does not suffer from weak instrument bias and we used it with all the SNPs obtained by LD clumping without any p-value threshold.
Below, Appendix 3—figure 1 shows the MR results for the 27 lipoprotein measurements selected in phenotypic screening. Estimates that are statistically significant at a false discovery rate of 0.05 are shown in Figure 2 of the main paper. Appendix 3—table 2 shows the estimated effect of all the lipoprotein subfractions and particle sizes on myocardial infarction or coronary artery disease in various MR designs. Full results of the multivariable MR analyses, including the estimated effects of the traditional lipid risk factors, can be found in Appendix 3—tables 5 and 6. The results of the univariable MR analyses using IVW and weighted median estimators can be found in Appendix 3—tables 3 and 4.
Pooled results
In the tables below, red indicates that the p-value is significant (at level 0.05) after Bonferroni correction for all the results in the corresponding table, and blue indicates p-value ≤ 0.05.
Univariable MR results
Multivariable MR results
Q-statistics for multivariable Mendelian randomization
Here we provide the list of modified Cochran's Q-statistics for the multivariable MR analyses (Appendix 3—tables 7 and 8).
Appendix 4
Diagnostic plots and the genetic markers
As mentioned above, RAPS is more robust against invalid instruments than other statistical methods for univariable MR, but it still needs the InSIDE assumption to be approximately satisfied. Zhao et al., 2019b described two diagnostic plots for RAPS that check whether there is clear evidence that the InSIDE assumption is violated. Here, we report these plots for S-HDL-P and M-HDL-P in different studies (Appendix 4—figures 1 and 2). Notice that a lack of evidence to falsify the InSIDE assumption does not mean that it is true.
S-HDL-P
M-HDL-P
Genetic markers for M-HDL-P and S-HDL-P
We can further assess the validity of the InSIDE assumption for M-HDL-P and S-HDL-P by examining the associations of their genetic instruments with the traditional lipid risk factors and other subfraction traits. We meta-analyzed the summary results in the two lipidome GWAS (Davis and Kettunen) and obtained SNPs that are associated with S-HDL-P and M-HDL-P (p-value ; the results are LD-clumped). The next two tables show some information about these genetic markers and their associations with other traits (Appendix 4—tables 1 and 2).
Appendix 4—figures 3 and 4 shows how adjusting for LDL-C and TG changes the effects of the selected SNPs for S-HDL-P and M-HDL-P on CAD. The adjusted effect on CAD is obtained by original effect on CAD – 0.45 * effect on LDL-C – 0.25 * effect on TG. After the adjustment, the associations of the genetic variants with CAD generally became closer to the fitted lines that correspond to the estimated effects of S-HDL-P and M-HDL-P.
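The adjustment described above is a simple linear correction applied to each SNP. A minimal sketch follows, assuming per-SNP effect estimates are available as plain numbers; the coefficients 0.45 and 0.25 are taken from the text, while the function name and the example effect sizes are hypothetical and serve only as illustration.

```python
# Coefficients for LDL-C and TG as stated in the text.
LDL_C_COEF = 0.45
TG_COEF = 0.25


def adjusted_cad_effect(effect_cad: float, effect_ldl_c: float, effect_tg: float) -> float:
    """Adjust a SNP's effect on CAD for its effects on LDL-C and TG.

    adjusted = effect on CAD - 0.45 * effect on LDL-C - 0.25 * effect on TG
    """
    return effect_cad - LDL_C_COEF * effect_ldl_c - TG_COEF * effect_tg


# Hypothetical SNP effects (not values from the paper).
example = adjusted_cad_effect(effect_cad=0.05, effect_ldl_c=0.04, effect_tg=0.02)
```

For these made-up inputs the correction removes 0.018 for LDL-C and 0.005 for TG, leaving an adjusted effect of about 0.027.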
Gene expression
Here we provide evidence of variant-gene associations from quantitative trait locus (QTL) analyses in the GTEx project (Appendix 4—table 3).
Data availability
The GWAS data used in this study are publicly available. Details can be found in Table 1.
References
- Website: Round 2 GWAS results of thousands of phenotype in the UK BioBank. Accessed August 31, 2018.
- Effects of torcetrapib in patients at high risk for coronary events. New England Journal of Medicine 357:2109–2122. https://doi.org/10.1056/NEJMoa0706628
- National lipid association annual summary of clinical lipidology 2016. Journal of Clinical Lipidology 10:S1–S43. https://doi.org/10.1016/j.jacl.2015.08.002
- Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B 57:289–300. https://doi.org/10.1111/j.2517-6161.1995.tb02031.x
- Mendelian randomization with invalid instruments: effect estimation and Bias detection through egger regression. International Journal of Epidemiology 44:512–525. https://doi.org/10.1093/ije/dyv080
- Mendelian randomization analysis with multiple genetic variants using summarized data. Genetic Epidemiology 37:658–665. https://doi.org/10.1002/gepi.21758
- Lipids, lipoproteins, and metabolites and Risk of Myocardial Infarction and Stroke. Journal of the American College of Cardiology 71:620–632. https://doi.org/10.1016/j.jacc.2017.12.006
- Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Human Molecular Genetics 23:R89–R98. https://doi.org/10.1093/hmg/ddu328
- Mendelian randomization as an instrumental variable approach to causal inference. Statistical Methods in Medical Research 16:309–330. https://doi.org/10.1177/0962280206077743
- Cholesterol efflux capacity and subclasses of HDL particles in healthy women transitioning through menopause. The Journal of Clinical Endocrinology & Metabolism 101:3419–3428. https://doi.org/10.1210/jc.2016-2144
- Phenotypic characterization of genetically Lowered Human Lipoprotein(a) Levels. Journal of the American College of Cardiology 68:2761–2772. https://doi.org/10.1016/j.jacc.2016.10.033
- Mendelian randomization of blood lipids for coronary heart disease. European Heart Journal 36:539–550. https://doi.org/10.1093/eurheartj/eht571
- Mendelian randomization in Cardiometabolic disease: challenges in evaluating causality. Nature Reviews Cardiology 14:577–590. https://doi.org/10.1038/nrcardio.2017.78
- Small dense low-density lipoprotein-cholesterol concentrations predict risk for coronary heart disease: the atherosclerosis risk in communities (ARIC) study. Arteriosclerosis, Thrombosis, and Vascular Biology 34:1069–1077. https://doi.org/10.1161/ATVBAHA.114.303284
- Direct estimation of HDL-Mediated cholesterol efflux capacity from serum. Clinical Chemistry 65:1042–1050. https://doi.org/10.1373/clinchem.2018.299222
- Residual risk of atherosclerotic cardiovascular events in relation to reductions in Very-Low-Density lipoproteins. Journal of the American Heart Association 6:e007402. https://doi.org/10.1161/JAHA.117.007402
- Evacetrapib and cardiovascular outcomes in High-Risk vascular disease. New England Journal of Medicine 376:1933–1942. https://doi.org/10.1056/NEJMoa1609581
- Efficient design for mendelian randomization studies: subsample and 2-sample instrumental variable estimators. American Journal of Epidemiology 178:1177–1184. https://doi.org/10.1093/aje/kwt084
- PLINK: a tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics 81:559–575. https://doi.org/10.1086/519795
- HDL and cardiovascular disease. The Lancet 384:618–625. https://doi.org/10.1016/S0140-6736(14)61217-4
- HDL cholesterol efflux capacity and incident cardiovascular events. New England Journal of Medicine 371:2383–2393. https://doi.org/10.1056/NEJMoa1409065
- Association of HDL cholesterol efflux capacity with incident coronary heart disease events: a prospective case-control study. The Lancet Diabetes & Endocrinology 3:507–513. https://doi.org/10.1016/S2213-8587(15)00126-6
- An examination of multivariable mendelian randomization in the single-sample and two-sample summary data settings. International Journal of Epidemiology 48:713–727. https://doi.org/10.1093/ije/dyy262
- High plasma phospholipid transfer protein levels as a risk factor for coronary artery disease. Arteriosclerosis, Thrombosis, and Vascular Biology 23:1857–1862. https://doi.org/10.1161/01.ATV.0000094433.98445.7F
- Effects of dalcetrapib in patients with a recent acute coronary syndrome. New England Journal of Medicine 367:2089–2099. https://doi.org/10.1056/NEJMoa1206797
- 'Mendelian randomization': can genetic epidemiology contribute to understanding environmental determinants of disease? International Journal of Epidemiology 32:1–22. https://doi.org/10.1093/ije/dyg070
- Discovery and refinement of loci associated with lipid levels. Nature Genetics 45:1274. https://doi.org/10.1038/ng.2797
- Comparative proteome analysis of epicardial and subcutaneous adipose tissues from patients with or without coronary artery disease. International Journal of Endocrinology 2019:6976712. https://doi.org/10.1155/2019/6976712
- Powerful three-sample genome-wide design and robust statistical inference in summary-data mendelian randomization. International Journal of Epidemiology 48:1478–1492. https://doi.org/10.1093/ije/dyz142
- Two-Sample instrumental variable analyses using heterogeneous samples. Statistical Science 34:317–333. https://doi.org/10.1214/18-STS692
- Statistical inference in two-sample summary-data mendelian randomization using robust adjusted profile score. The Annals of Statistics 48:1742–1769. https://doi.org/10.1214/19-AOS1866
Article and author information
Author details
Funding
No external funding was received for this work.
Copyright
© 2021, Zhao et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.