Background

Alzheimer’s disease (AD) is a progressive degenerative disease of the central nervous system, characterized by cognitive impairment, reduced functional capacity for daily living, and behavioral changes. It can be divided into two types: early-onset AD (EOAD, age of onset ≤ 65 years) and late-onset AD (LOAD, age of onset > 65 years). LOAD accounts for approximately 95% of patients with AD and has a stronger genetic predisposition than EOAD[1–3]. According to the latest data from the World Health Organization (WHO), more than 50 million people worldwide currently live with AD, and this number is expected to rise to 115 million by 2050[4,5]. With the increasing aging population, the incidence of AD continues to rise, making AD the fifth leading cause of death worldwide. AD is a chronic, complex disorder involving multiple pathophysiological changes and is likely caused by the joint action of various factors. This complexity contributes to the current challenges in its diagnosis and treatment, such as low consultation rates, high rates of misdiagnosis at initial consultations, and low rates of long-term standardized treatment[6]. Consequently, examining the pathogenic mechanisms of AD, identifying its risk factors, and conducting timely and effective early screening and diagnosis are of utmost importance.

Traditional epidemiological studies have reported common risk factors for AD. Some metabolic comorbidities are highly associated with AD, such as cardiovascular disease[7,8], obesity[9,10], and diabetes[11,12]. Serological parameters such as C-reactive protein[13], lipids[14,15], and vitamin levels[16–18] have been previously reported as potential biomarkers for AD. In addition, some factors related to lifestyle, family history, education, economic level, and environment correlate with AD[19–22]. However, most epidemiological studies are insufficient to draw definitive conclusions on causal association because of the potential for reverse causality and confounding bias.

Mendelian randomization (MR) analysis is an emerging method for exploring causal associations between AD and various factors[23–25]. Because alleles segregate and assort independently when passed from parents to offspring, MR reduces confounding and reverse causality[26]. In the absence of pleiotropy (that is, genetic variants affecting the disease through pathways other than the exposure) and population stratification, MR can provide a clear estimate of disease risk[27,28]. MR analysis is increasingly used to determine causal relationships between potentially modifiable risk factors and outcomes[29]. These advantages make MR a valuable tool for elucidating the potential risk or protective factors for AD.

Chen et al. [30] used MR analysis to reveal the causal relationship between AD and factors including sociodemographic and early life status. However, that study was restricted to the variables available in the UK Biobank (UKB) database, so variables such as air pollution and blood glucose measures were not included. In addition, AD subtypes are highly heterogeneous, with different biological and genetic characteristics. Thus, previous studies cannot offer a systematic and complete viewpoint. Our study uses the MRC IEU OpenGWAS database as the sample source for MR analysis to address these limitations. The MRC IEU OpenGWAS database, the largest open GWAS database globally, has compiled 42,335 GWAS summary datasets from sources such as the UK Biobank, FinnGen Biobank, and Biobank Japan. Analyzing datasets at this scale opens new ground for MR research on AD.

MR requires a combination of background knowledge in biology, computer science, software studies, and statistics. This often creates a dilemma: biologists are not well versed in computing and statistics, while computer science experts struggle to adopt a medical and biological mindset. Consequently, the vast majority of available GWAS data have not been effectively utilized through MR. The construction of a multi-level data platform specifically for AD, based on MR analysis of massive GWAS data, is therefore of great strategic significance, and it will help researchers and clinicians worldwide to obtain risk factors that are causally associated with AD conveniently and rapidly.

In summary, in this work we attempt to identify risk or protective factors causally associated with AD from a holistic and systematic perspective, thereby providing new ideas for understanding AD pathogenesis, achieving early diagnosis, and developing clinical drugs. First, this study uses a hypothesis-free data mining approach based on Mendelian randomization (MR) to study the possible etiology of Alzheimer’s disease, with specific attention to different AD subtypes (EOAD and LOAD). Based on this, we developed an open, integrated online platform, MRAD (Mendelian randomization for Alzheimer’s disease, https://gwasmrad.com/mrad/). The platform was further enriched with information on related targets, such as functions and pathways, retrieved from the public database UniProt. To our knowledge, the platform is the first multi-dimensional, integrated, shared, and interactive comprehensive platform for AD MR research to date.

Methods

Database and software

The following databases and software packages were used in this study: MRC IEU OpenGWAS[31] (https://gwas.mrcieu.ac.uk/), UniProt[32] (https://www.uniprot.org/), EVenn[33] (http://www.ehbio.com/test/venn/%23/), and R (version 4.1.2)[34].

MR design for AD (Figure 1)

Data sources

Exposure traits

Inclusion criteria: datasets of the European population.

Exclusion criteria: (i) eQTL-related datasets; (ii) AD-related datasets.

Figure 1. Study design.

In this study, the GWAS datasets were selected from the 42,335 GWAS datasets in the public database (MRC IEU OpenGWAS, https://gwas.mrcieu.ac.uk/). Based on the above inclusion and exclusion criteria, 19,942 eQTL-related datasets were excluded first, leaving 22,393 GWAS datasets. Next, only datasets from the European population were retained, giving 18,117 GWAS datasets. Finally, 20 AD-related datasets were excluded, leaving 18,097 GWAS datasets as the exposure traits of this study (see Table S1 for basic information).
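The filtering steps above can be checked with a few lines of arithmetic. The sketch below only restates the counts reported in the text; the variable names are illustrative, and the number of non-European datasets removed is implied by the reported totals rather than stated by the authors.

```python
# Counts reported in the text for the exposure-trait selection.
total_datasets = 42_335        # all GWAS datasets in MRC IEU OpenGWAS
eqtl_excluded = 19_942         # eQTL-related datasets removed first
after_eqtl = total_datasets - eqtl_excluded

european_only = 18_117         # reported count after keeping European datasets
ad_related_excluded = 20       # AD-related datasets removed last
exposure_traits = european_only - ad_related_excluded

# Implied by the reported totals; not stated explicitly in the text.
non_european_removed = after_eqtl - european_only

print(after_eqtl, exposure_traits, non_european_removed)
```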

Outcome traits

Inclusion criteria: (i) datasets of patients with AD with complete information and clear data sources; (ii) datasets of the European population.

Exclusion criteria: (i) Number of SNPs <1 million; (ii) datasets with unspecified sex; (iii) datasets with a family history of AD; (iv) datasets with dementia.

Based on the above criteria, 16 GWAS datasets of outcome traits were selected from the MRC IEU OpenGWAS database. Three were used as the main outcome traits for AD: (i) AD from the Alzheimer Disease Genetics Consortium (ADGC), Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium (CHARGE), The European Alzheimer’s Disease Initiative (EADI), and Genetic and Environmental Risk in AD/Defining Genetic, Polygenic and Environmental Risk for Alzheimer’s Disease Consortium (GERAD/PERADES) 2019 (ieu-b-2); (ii) AD from Benjamin Woolf 2022 (ieu-b-5067); and (iii) AD from the International Genomics of Alzheimer’s Project (IGAP) 2013 (ieu-a-297). The remaining 13 datasets, from the FinnGen biobank 2021, correspond to various AD subtypes and are referred to as AD-finn subtypes (Figure 2).

Figure 2. Basic information of the 16 outcome traits in MRC IEU OpenGWAS.

Selection of instrumental variables

SNPs serve as instrumental variables (IVs) for MR research. In this study, SNPs were selected from the GWAS data for each of the 18,097 exposure traits (see Exposure traits). The selected SNPs fulfilled the following requirements: (i) a genome-wide significant association with the exposure (p < 5×10⁻⁸); (ii) independence from one another (that is, a linkage disequilibrium (LD) r² of less than 0.001 within a 10,000-kb distance, based on the European 1000 Genomes Project reference panel), to avoid potential biases caused by LD between SNPs in the analysis.
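The two selection rules can be illustrated with a small greedy clumping routine. This is a minimal sketch under stated assumptions, not the procedure the authors ran: the `Snp` record, the `r2_lookup` dictionary of pairwise LD values, and the function name are all hypothetical, and a real analysis would take r² from a reference panel rather than a hand-built dictionary.

```python
from dataclasses import dataclass

P_THRESHOLD = 5e-8      # genome-wide significance, as stated in the text
R2_THRESHOLD = 0.001    # LD threshold, as stated in the text
WINDOW_KB = 10_000      # clumping window, as stated in the text


@dataclass(frozen=True)
class Snp:
    """Hypothetical minimal record for one variant."""
    rsid: str
    chrom: str
    pos_bp: int
    pval: float


def select_instruments(snps, r2_lookup):
    """Keep significant SNPs, then greedily drop SNPs in LD with a kept one.

    r2_lookup maps frozenset({rsid_a, rsid_b}) to pairwise r^2 (assumed input);
    pairs missing from the lookup are treated as r^2 = 0.
    """
    candidates = sorted(
        (s for s in snps if s.pval < P_THRESHOLD), key=lambda s: s.pval
    )
    kept = []
    for snp in candidates:
        in_ld_with_kept = False
        for lead in kept:
            same_window = (
                snp.chrom == lead.chrom
                and abs(snp.pos_bp - lead.pos_bp) <= WINDOW_KB * 1000
            )
            r2 = r2_lookup.get(frozenset((snp.rsid, lead.rsid)), 0.0)
            if same_window and r2 >= R2_THRESHOLD:
                in_ld_with_kept = True
                break
        if not in_ld_with_kept:
            kept.append(snp)
    return kept


example_snps = [
    Snp("rs1", "1", 1_000, 1e-10),
    Snp("rs2", "1", 5_000, 1e-9),   # in LD with rs1, so it is dropped
    Snp("rs3", "2", 1_000, 1e-9),   # different chromosome, so it is kept
    Snp("rs4", "1", 9_000, 1e-3),   # not genome-wide significant
]
example_r2 = {frozenset(("rs1", "rs2")): 0.8}
selected = [s.rsid for s in select_instruments(example_snps, example_r2)]
print(selected)
```

Processing candidates in order of increasing p-value means that, within each LD block, the most strongly associated SNP is the one retained.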

Statistical models for causal effect inference

A random-effects inverse variance weighted (IVW) model was used in this study as the major analysis method to uncover potential risk or protective factors for AD. The random-effects IVW model is regarded as the gold standard for MR studies. Its principle is to use the inverse of the variance of each IV as its weight, assuming that all IVs are valid. The regression does not include an intercept term, and the final result is the weighted average of the effect estimates from all IVs [35]. This model assumes that the true effect values may vary across IVs because of both sampling error and heterogeneity of the true effect. The weight of each IV is jointly determined by its inverse variance and the estimated heterogeneity variance. Thus, as long as there is no pleiotropy, this method remains the preferred MR model even when there is significant heterogeneity (p < 0.05).
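The weighted-average idea described above can be sketched from summary statistics. The code below is an illustrative implementation, not the authors' code: it fits a weighted regression of SNP-outcome effects on SNP-exposure effects through the origin, then inflates the standard error by the residual heterogeneity (a multiplicative random-effects variant). Whether this matches the exact random-effects variant used in the study is an assumption, and the function and argument names are hypothetical.

```python
import math


def ivw_estimate(beta_exp, beta_out, se_out):
    """Inverse variance weighted estimate from per-SNP summary statistics.

    beta_exp: SNP-exposure effects; beta_out: SNP-outcome effects;
    se_out: standard errors of the SNP-outcome effects.
    Returns (beta, se_fixed, se_random, q).
    """
    weights = [1.0 / s ** 2 for s in se_out]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_exp, beta_out))
    den = sum(w * bx * bx for w, bx in zip(weights, beta_exp))
    beta = num / den                      # regression through the origin
    se_fixed = math.sqrt(1.0 / den)

    # Cochran's Q: weighted squared residuals around the IVW line.
    q = sum(
        w * (by - beta * bx) ** 2
        for w, bx, by in zip(weights, beta_exp, beta_out)
    )
    df = len(beta_exp) - 1
    # Multiplicative random effects: never shrink the SE below the fixed-effect SE.
    phi = max(1.0, q / df) if df > 0 else 1.0
    se_random = se_fixed * math.sqrt(phi)
    return beta, se_fixed, se_random, q


# Homogeneous toy data: every SNP implies a causal effect of 0.5.
b_hom, se_f_hom, se_r_hom, q_hom = ivw_estimate(
    [0.1, 0.2, 0.4], [0.05, 0.10, 0.20], [0.01, 0.01, 0.02]
)
# Heterogeneous toy data: the random-effects SE is wider than the fixed-effect SE.
b_het, se_f_het, se_r_het, q_het = ivw_estimate(
    [0.1, 0.2, 0.4], [0.05, 0.30, 0.10], [0.01, 0.01, 0.02]
)
```

With homogeneous instruments the two standard errors coincide; with heterogeneous instruments only the standard error changes, while the point estimate is the same under both models.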

To assess the robustness of the IVW results, sensitivity analysis was performed using six additional models:

(i) MR-Egger: Unlike IVW, MR-Egger includes an intercept term in the regression to evaluate bias caused by horizontal pleiotropy. The intercept represents the magnitude of horizontal pleiotropy, with a value close to 0 indicating minimal pleiotropy. Its primary purpose is to detect and correct for horizontal pleiotropy. Thus, when significant horizontal pleiotropy is observed (p < 0.05), this method is preferred [36,37].

(ii) Weighted median: The weighted median method estimates the causal effect as the weighted median of the per-SNP causal estimates. If at least 50% of the weight comes from SNPs that are valid IVs (the “majority validity” assumption), the estimate tends toward the true causal effect, so the method provides a consistent estimate under this assumption [38].

(iii) Simple mode: The causal effect is estimated as the mode of the smoothed empirical distribution of the per-SNP causal estimates, with every SNP weighted equally.

(iv) Weighted mode: The weighted mode method also combines the per-SNP Mendelian randomization estimates, but it assigns a weight to the causal effect estimate of each genetic variant and then takes the weighted mode as the final estimate of the causal effect. This can decrease bias caused by outlying variants.

(v) Maximum likelihood: The SNP-exposure and SNP-outcome associations are assumed to follow a particular probability distribution whose parameters are unknown, and the causal effect is estimated as the parameter value that makes the observed associations most likely [39].

(vi) Penalized weighted median: An enhanced version of the weighted median estimate that provides a consistent estimate of the causal effect.

Heterogeneity and horizontal pleiotropy were assessed using heterogeneity tests [40] and Egger intercept tests [41], respectively.
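Two of the sensitivity models listed above are simple enough to sketch directly. The code below is an illustrative, simplified version and not the TwoSampleMR implementation: `egger_regression` fits a weighted regression with an intercept, and `weighted_median` interpolates the weighted empirical distribution of per-SNP ratio estimates at the 50th percentile. Function names, argument names, and the toy inputs are hypothetical, and standard errors and p-values are omitted.

```python
def egger_regression(beta_exp, beta_out, se_out):
    """Weighted regression of outcome on exposure effects with an intercept.

    Returns (intercept, slope). The intercept estimates average horizontal
    pleiotropy; the slope is the pleiotropy-adjusted causal estimate.
    """
    # Orient every SNP so that its exposure effect is positive.
    bx = [abs(b) for b in beta_exp]
    by = [o if e >= 0 else -o for e, o in zip(beta_exp, beta_out)]
    w = [1.0 / s ** 2 for s in se_out]
    sw = sum(w)
    mean_x = sum(wi * x for wi, x in zip(w, bx)) / sw
    mean_y = sum(wi * y for wi, y in zip(w, by)) / sw
    sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, bx))
    sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, bx, by))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def weighted_median(ratios, weights):
    """Weighted median of per-SNP ratio estimates (beta_out / beta_exp)."""
    if len(ratios) < 2:
        raise ValueError("at least two SNPs are required")
    order = sorted(range(len(ratios)), key=lambda i: ratios[i])
    r = [ratios[i] for i in order]
    w = [weights[i] for i in order]
    total = sum(w)
    # Cumulative weight up to the midpoint of each SNP's own weight.
    cum, running = [], 0.0
    for wi in w:
        running += wi
        cum.append((running - 0.5 * wi) / total)
    below = max(i for i, c in enumerate(cum) if c < 0.5)
    # Linear interpolation between the two estimates that straddle 50%.
    frac = (0.5 - cum[below]) / (cum[below + 1] - cum[below])
    return r[below] + (r[below + 1] - r[below]) * frac


# Toy data generated as beta_out = 0.02 + 0.5 * beta_exp.
egger_intercept, egger_slope = egger_regression(
    [0.1, 0.2, 0.4], [0.07, 0.12, 0.22], [0.01, 0.01, 0.02]
)
# Three concordant SNPs and one outlier: the median ignores the outlier.
wm = weighted_median([0.5, 0.5, 0.5, 3.0], [1.0, 1.0, 1.0, 1.0])
```

In the toy example the Egger intercept recovers the constant pleiotropic offset of 0.02, and the weighted median stays at 0.5 despite the outlying estimate of 3.0.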

The above analyses were performed using the TwoSampleMR[42] package in R (version 4.1.2). The association of exposures with outcomes was assessed using the odds ratio (OR) and 95% confidence interval (95% CI), with OR > 1 indicating a positive association (risk factor) and 0 < OR < 1 indicating a negative association (protective factor). Differences with a two-sided p < 0.05 were considered statistically significant. Furthermore, owing to the relatively large number of exposure and outcome traits included in this study, Bonferroni correction for multiple testing was applied to identify significant hits. The Bonferroni-corrected threshold was 0.05 divided by 289,552 tests (18,097 exposure traits × 16 outcome traits), that is, p < 1.727e-07.
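The reporting conventions above reduce to a few formulas. The sketch below assumes that each MR estimate is a log odds ratio with a standard error, which is the usual form of output for a binary outcome; the function names are illustrative, and the count of 289,552 tests is taken from the text and happens to equal 18,097 × 16.

```python
import math

N_EXPOSURES = 18_097     # exposure traits, from the text
N_OUTCOMES = 16          # outcome traits, from the text
ALPHA = 0.05

n_tests = N_EXPOSURES * N_OUTCOMES
bonferroni_threshold = ALPHA / n_tests


def odds_ratio_ci(beta, se, z=1.959964):
    """Convert a log odds ratio and its SE to (OR, lower 95% CI, upper 95% CI)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def classify(beta, pval):
    """Label an association using the nominal and Bonferroni thresholds."""
    if pval >= ALPHA:
        return "not significant"
    direction = "risk factor" if beta > 0 else "protective factor"
    if pval < bonferroni_threshold:
        return direction + " (Bonferroni significant)"
    return direction


print(n_tests, f"{bonferroni_threshold:.3e}")
print(odds_ratio_ci(0.0, 0.1)[0])
print(classify(-0.2, 1e-9))
```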

Building the MRAD platform

In this study, the online MRAD platform was developed using the Shiny package[43] in R (version 4.1.2) and hosted on an Ubuntu 20.04 server. By leveraging Shiny, we combined the computational capabilities of R with modern web technologies, allowing us to build an interactive user interface.

Results

Results of hypothesis-free Mendelian randomization analysis for Alzheimer’s disease

The hypothesis-free Mendelian randomization analysis for Alzheimer’s disease generated a total of 400,274 records. The major analysis method, the IVW model, produced 73,129 records covering 4,840 exposure traits. These records fall into 10 categories: Disease (n=17,168), Medical laboratory science (n=15,416), Imaging (n=4,896), Anthropometric (n=4,478), Treatment (n=4,546), Molecular trait (n=17,757), Gut microbiota (n=48), Past history (n=668), Family history (n=1,114), and Lifestyle trait (n=7,038), as shown in Figure 3. To assess the robustness of the IVW results, sensitivity analysis was performed using six other models: MR-Egger (50,804 records), Weighted median (50,804 records), Simple mode (50,804 records), Weighted mode (50,804 records), Maximum likelihood (73,125 records), and Penalized weighted median (50,804 records).

Figure 3. Categories of the exposure traits identified by the IVW model.
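The record counts reported in this subsection are internally consistent, which can be verified with a short tally. The dictionaries below simply transcribe the numbers from the text; the variable names are illustrative.

```python
# Records per MR model, as reported in the text.
records_by_model = {
    "IVW": 73_129,
    "MR-Egger": 50_804,
    "Weighted median": 50_804,
    "Simple mode": 50_804,
    "Weighted mode": 50_804,
    "Maximum likelihood": 73_125,
    "Penalized weighted median": 50_804,
}

# IVW records per exposure category, as reported in the text.
ivw_records_by_category = {
    "Disease": 17_168,
    "Medical laboratory science": 15_416,
    "Imaging": 4_896,
    "Anthropometric": 4_478,
    "Treatment": 4_546,
    "Molecular trait": 17_757,
    "Gut microbiota": 48,
    "Past history": 668,
    "Family history": 1_114,
    "Lifestyle trait": 7_038,
}

total_records = sum(records_by_model.values())
ivw_category_total = sum(ivw_records_by_category.values())
print(total_records, ivw_category_total)
```

The seven model totals add up to the 400,274 records quoted for the platform, and the ten category counts add up to the 73,129 IVW records.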

MRAD platform integration

Based on the 400,274 records described above, we created an online data analysis platform for identifying risk or protective factors for AD, called MRAD (Mendelian randomization for Alzheimer’s disease, https://gwasmrad.com/mrad/). It contains six modules: (i) Home; (ii) Study Design; (iii) IVW interactive; (iv) IVW static; (v) Sensitivity analysis interactive; and (vi) Sensitivity analysis static. The platform provides a user-friendly search interface, allowing users to search, interactively visualize, analyze, and download the results (see the MRAD User Guide in the Supplementary Material for details). In our view, as the first interactive comprehensive platform for AD MR research to date, this online platform would benefit AD research in several ways. On the one hand, it would allow researchers to quickly identify risk or protective factors relevant to their own research and generate novel hypotheses regarding the molecular mechanisms of AD. On the other hand, it would allow researchers with complementary expertise to provide multiple characterizations of the same data. Because the platform is hosted on a server and accessed through a web interface, it is compatible with multiple types of devices, which could broaden access for potential users.

MRAD utility data mining

To demonstrate the utility and performance of the MRAD platform, we focus on the exposure traits identified by the IVW model that have significant and consistent effects across the three main outcome traits of AD. Detailed investigation and reporting of other factors will be carried out in future research.

In this study, MR analysis was first performed on the three main outcome traits of AD to explore their potential risk or protective factors, leading to the identification of a total of 80 exposure traits (p<0.05), which fell into five Classification I categories: Medical laboratory science (n=51), Family history (n=10), Disease (n=9), Molecular trait (n=7), and Lifestyle trait (n=3). A total of 63 exposure traits (risk factors) were positively associated with all three main outcome traits, while 16 exposure traits (protective factors) were negatively associated with all three. The remaining trait, ulcerative colitis (ebi-a-GCST000964), was negatively associated with the AD outcome traits ieu-b-2 and ieu-a-297 and positively associated with the AD outcome trait ieu-b-5067. MR analysis was then performed on the outcome traits of the 13 AD-finn subtypes to further examine the causal association between the above-identified key common exposure traits and different subtypes of AD. The results are provided below in detail.

Causal association between medical laboratory science and the main outcome traits of AD

In this study, the 51 medical laboratory science items that each had a causal effect on the main outcome traits of AD were grouped into three Classification II categories (blood lipids and lipoproteins (n=36), immunological tests (n=12), and plasma protein tests (n=3)).

1 Blood lipids and lipoproteins

A total of 36 blood lipid and lipoprotein items as exposure traits had effects on the main outcome traits of AD. (1) Thirty-two were positively associated with the main outcome traits. Of these, 7 (e.g., apolipoprotein B (ieu-b-108)) were positively associated with both EOAD (finn-b-AD_EO) and LOAD (finn-b-AD_LO); free cholesterol in IDL (met-c-868) was positively associated with EOAD (finn-b-AD_EO); and 4 (e.g., phospholipids in small LDL (met-d-S_LDL_PL)) were positively associated with LOAD (finn-b-AD_LO), as shown in Figure 4A. The corresponding sensitivity analysis and Bonferroni correction results are shown in Figure S1 and Table S2. (2) Four were negatively associated with the main outcome traits. Of these, apolipoprotein A-I (ieu-b-107) was negatively associated with both EOAD (finn-b-AD_EO) and LOAD (finn-b-AD_LO), and the negative causal association was slightly stronger for EOAD than for LOAD; phospholipids to total lipids ratio in chylomicrons and extremely large VLDL (met-d-XXL_VLDL_PL_pct) was negatively associated with LOAD (finn-b-AD_LO). These findings are illustrated in Figure 4B. The corresponding sensitivity analysis and Bonferroni correction results are shown in Figure S2 and Table S2.

Figure 4. Eighty exposure traits with causal effects on the main outcome traits of AD, based on the major analysis method (random-effects IVW model).

Figure 4A. Thirty-two blood lipids and lipoproteins items that were positively associated with the main outcome traits of AD. Figure 4B. Four blood lipids and lipoproteins items that were negatively associated with the main outcome traits of AD. Figure 4C. Twelve immunological test items that were positively associated with the main outcome traits of AD. Figure 4D. Three plasma protein tests items that were negatively associated with the main outcome traits of AD. Figure 4E. Ten family history items with causal effects on the main outcome traits of AD. Figure 4F. Nine diseases items with causal effects on the main outcome traits of AD. Figure 4G. Seven molecular trait items with causal effects on the main outcome traits of AD. Figure 4H. Three lifestyle trait items with causal effects on the main outcome traits of AD.

Note: The pink dots in the figure represent positive association, the blue dots in the figure represent negative association, with the color depth of the dots being positively proportional to the OR value (the darker the color, the larger the OR value), and the size of the dots being inversely proportional to the p-value (the smaller the p-value, the larger the dots). The gray dots represent no significant causal association (p>0.05).

2 Immunological tests

A total of 12 immunological test items as exposure traits had positive effects on the main outcome traits of AD. Six of them, e.g., CD33 on Monocytic Myeloid-Derived Suppressor Cells (ebi-a-GCST90001952), were positively associated with LOAD (finn-b-AD_LO), as shown in Figure 4C. The corresponding sensitivity analysis and Bonferroni correction results are shown in Figure S3 and Table S2.

3 Plasma protein tests

A total of three plasma protein test items as exposure traits had negative effects on the main outcome traits of AD. All three were measures of C-reactive protein (ukb-d-30710_raw, ukb-d-30710_irnt, and ieu-b-4764), and all three were negatively associated with EOAD (finn-b-AD_EO) and LOAD (finn-b-AD_LO), as shown in Figure 4D. The corresponding sensitivity analysis and Bonferroni correction results are shown in Figure S4 and Table S2.

Causal association between family history and the main outcome traits of AD

A total of 10 family history items as exposure traits had causal effects on the main outcome traits of AD. In particular, a parental or family history of AD increased the overall risk of developing AD, and was positively associated with both EOAD (finn-b-AD_EO) and LOAD (finn-b-AD_LO), as shown in Figure 4E. The corresponding sensitivity analysis and Bonferroni correction results are shown in Figure S5 and Table S2.

Causal association between diseases and the main outcome traits of AD

In this study, the nine disease items that each had a causal effect on the main outcome traits of AD were grouped into four Classification II categories (dementia (n=5), neurodegenerative diseases (n=2), mental disorders associated with neurological diseases (n=1), and digestive system diseases (n=1)). Their causal effects on the main outcome traits of AD and on the outcome traits of EOAD (finn-b-AD_EO) and LOAD (finn-b-AD_LO) are shown in Figure 4F. The corresponding sensitivity analysis and Bonferroni correction results are shown in Figure S6 and Table S2.

Causal association of molecular traits with the main outcome traits of AD

A total of seven molecular trait items as exposure traits had causal effects on the main outcome traits of AD. Among them, Myeloid cell surface antigen CD33 (prot-a-439) was positively associated with the main outcome traits of AD, as well as with both EOAD (finn-b-AD_EO) and LOAD (finn-b-AD_LO). The remaining six were all negatively associated with the main outcome traits of AD, and their causal effects on the outcome traits of the 13 AD-finn subtypes were as follows: (i) tubulin-specific chaperone A (TBCA; prot-a-2930) and vacuolar protein sorting-associated protein 29 (VPS29; prot-a-3203) were negatively associated with both EOAD (finn-b-AD_EO) and LOAD (finn-b-AD_LO); (ii) guanine nucleotide-binding protein G(k) subunit alpha (GNAI3; prot-a-1226) and proteasome activator complex subunit 1 (PSME1; prot-a-2420) were negatively associated with LOAD (finn-b-AD_LO), but had no significant causal association with EOAD (finn-b-AD_EO) (p>0.05); and (iii) neither glutamine (met-c-860) nor glutamine (met-d-Gln) had a significant causal association with EOAD (finn-b-AD_EO) or LOAD (finn-b-AD_LO) (p>0.05), as shown in Figure 4G. The corresponding sensitivity analysis and Bonferroni correction results are shown in Figure 5 and Table S2.

Figure 5. Results of the statistical models for causal effect inference for the seven molecular trait items with causal effects on the main outcome traits of AD.

Note:

(i) For the columns Inverse variance weighted, MR Egger, Weighted median, Simple mode, Weighted mode, Maximum likelihood, and Penalized weighted median: the pink dots represent positive association and the blue dots represent negative association, with the color depth of the dots being positively proportional to the OR value (the darker the color, the larger the OR value), and the size of the dots being inversely proportional to the p-value (the smaller the p-value, the larger the dots). The gray dots represent no significant causal association (p>0.05). The star mark (✪) indicates significance at the Bonferroni threshold (p<1.727e-07).

(ii) For the column Heterogeneity test: the pink dots indicate that the effect of heterogeneity was considered negligible (heterogeneity_pval>0.05). The gray dots indicate significant heterogeneity (p<0.05).

(iii) For the column Egger intercept test: the pink dots indicate that there was no significant difference between the Egger intercept and 0, indicating no horizontal pleiotropy (Horizontal_pval>0.05). The gray dots indicate significant horizontal pleiotropy (p<0.05). The dark gray dots indicate that the test was not applicable because fewer than 3 SNPs were available.

Causal association of lifestyle traits with the main outcome traits of AD

A total of three lifestyle trait items as exposure traits had causal effects on the main outcome traits of AD. Their causal effects on the main outcome traits of AD and on the outcome traits of EOAD (finn-b-AD_EO) and LOAD (finn-b-AD_LO) are shown in Figure 4H. The corresponding sensitivity analysis and Bonferroni correction results are shown in Figure S7 and Table S2.

Discussion

Despite decades of research on AD, controversy remains regarding which factors play an important role in its pathogenesis. This study carried out a hypothesis-free Mendelian randomization analysis for Alzheimer’s disease, which provided a thorough and comprehensive evaluation of risk or protective factors for AD. This MR study covers most exposure traits that are causally associated with AD outcome traits, including diseases, medical laboratory science items, imaging items, anthropometric items, treatments, molecular traits, gut microbiota, past histories, family histories, and lifestyle traits, and reveals the causal associations between these exposure traits and different AD subtypes.

Based on this, for convenience of display and operation, we built a user-friendly online platform called MRAD. MRAD provides a one-stop online analysis service for researchers worldwide, including data retrieval → visualization → personalized analysis → data download. Users can obtain the results of different MR models (the main IVW model and six sensitivity analysis models) on 18,097 exposure traits and 16 AD outcome traits, totaling 400,274 records, and can set personalized parameters to meet different analysis needs. Additionally, MRAD provides interactive visualization interfaces and download functions for these results.

The MRAD platform provides a unique resource for systematically identifying risk or protective factors of AD, which facilitates early identification, diagnosis, prevention, and treatment, with significant clinical and social value. It has several potential strengths:

(i) The current methods for identifying AD mainly rely on assessment scales, cerebrospinal fluid (CSF) examinations, and brain PET/MRI. However, assessment scales can be biased by factors such as the anxiety and nervousness of the subjects. CSF examinations require an invasive lumbar puncture, leading to low patient acceptance. PET/MRI scans are expensive, and access to the equipment is limited. These limitations restrict early AD identification. Thus, there is a pressing clinical need for readily available, time- and cost-effective, and accurate detection methods. The medical laboratory science items and molecular traits identified in this study could be less expensive, faster to detect, easier to operate, and more accessible for widespread adoption. They hold great value for early AD identification and have the potential to become crucial tools for identifying AD in the future.

(ii) Imaging is a powerful assistive tool for diagnosing Alzheimer’s disease. Traditional imaging examinations mainly depict changes in the brain’s macroscopic structure, while research on microstructural changes in disease-related areas is relatively limited. Studies have demonstrated that microstructural neurodegenerative processes are extensive and pronounced during AD progression. Our results cover traditional macroscopic neuroimaging findings and reveal numerous potential causal relationships between brain microstructure and AD. The combination of macroscopic and microstructural insights will provide more valuable information for clinical diagnosis.

(iii) Clarifying a patient’s diseases, past history, and family history can aid in preventing AD at an early stage, and prevention of AD could be supported by monitoring anthropometric indicators, improving gut microbiota, and adjusting lifestyle traits.

(iv) Currently, the development of new drugs for AD mainly focuses on inhibitors of Aβ, tau, and other targets. Since 2000, global pharmaceutical companies have invested hundreds of billions of dollars in the development of new drugs for AD, yet these drugs have not yielded successful results. AD drug development has thus been perceived as having the highest failure rate of all drug research, reaching 99.6%. Hence, further research on molecular traits to find new targets and develop new drugs for these targets will provide new pathways for AD treatment.

To briefly demonstrate the performance of MRAD, we explored the exposure traits identified by the IVW model that had significant and consistent effects across all three main outcome traits of AD.

The association of lipids and lipoproteins, C-reactive protein, family histories, neurological disorders, glutamine, and education level with AD has been widely reported[23,44–67] and is consistent with the results of this study. Moreover, given that LOAD accounts for about 95% of patients with AD and has a stronger genetic predisposition than EOAD[1–3], identifying new risk genes for LOAD is crucial for understanding its potential etiology. Therefore, this study further explored the relationships between these traits and different AD subtypes, leading to the following findings: (i) the following traits were all positively associated with LOAD: apolipoprotein B; cholesterol, total; LDL cholesterol; low density lipoprotein cholesterol levels; total cholesterol in LDL; total cholesterol in medium LDL; cholesterol to total lipids ratio in large LDL; free cholesterol in large LDL; free cholesterol in LDL; phospholipids in small LDL; parental or family history of AD; parental longevity (mother’s attained age); dementia; vascular dementia; dementia with Lewy bodies; other degenerative diseases of the nervous system; and organic, including symptomatic, mental disorders; (ii) the following traits were all negatively associated with LOAD: apolipoprotein A-I; phospholipids to total lipids ratio in chylomicrons and extremely large VLDL; C-reactive protein; parental longevity (both parents in top 10%); and qualifications: A levels/AS levels or equivalent. These findings suggest that the above traits may have critical impacts on LOAD.

Moreover, some novel potential therapeutic targets of AD were identified. CD33 on Monocytic Myeloid-Derived Suppressor Cells, CD33 on CD33+ HLA DR+ CD14dim, CD33 on CD33+ HLA DR+, CD33 on CD33+ HLA DR+ CD14-, CD33 on CD33dim HLA DR-, CD33 on CD33dim HLA DR+ CD11b-, and Myeloid cell surface antigen CD33 were positively associated with all three main outcome traits of AD and with the risk of LOAD. CD33 has been reported to be a 67 kDa glycosylated transmembrane protein and a member of the sialic acid-binding immunoglobulin-like lectin (SIGLEC) family. It is an important receptor for cell growth and survival, as well as a critical receptor for the clathrin-independent endocytosis pathway and for innate and adaptive immune system functions. CD33 is mainly expressed in microglia, which are a type of glial cell in the central nervous system[68]. Meanwhile, the splicing efficiency of CD33 affects microglia activation[69]. Several genome-wide association studies have demonstrated that CD33 is a high-risk gene for AD[70,71]. In animal models, knockdown of CD33 significantly reduced amyloid plaque levels, and knockout mice did not exhibit other health defects. Sialylated glycoproteins and glycolipids on amyloid plaques bind to CD33, which is most likely the cause of the amyloid “immune escape”[72]. Furthermore, polymorphisms in CD33 can increase the risk of AD by causing neuronal degeneration in the hippocampal and parahippocampal regions of the brain[73]. Downregulation of the sialic acid-binding domain of CD33 can reduce the risk of developing AD. Therefore, inhibiting CD33 may be an effective approach to slowing the development of AD, and the sialic acid-binding site on CD33 is a promising pharmacophore[74].

Tubulin-specific chaperone A (TBCA) was negatively associated with all three main outcome traits of AD, as well as with EOAD and LOAD (p-values significant at the Bonferroni threshold). TBCA is an important member of the tubulin-specific chaperone (TBC) family. Tian et al. and Nolasco et al. demonstrated that TBCA can regulate the proportion of α- and β-tubulin, enabling them to assemble correctly into cellular microtubules[75]. Cellular microtubules play important roles in many biological functions, especially in cell movement, cell division, intracellular transport, and cell structure. After TBCA is silenced, abnormal microtubule aggregation occurs in mammalian cells, and the cells cannot grow and divide normally, ultimately leading to apoptosis[76,77]. Moreover, studies have shown that TBCA plays a crucial role in correct β-tubulin folding and α/β-tubulin heterodimer formation[78]. Protein misfolding can lead to many diseases, including neurodegenerative diseases. Additionally, higher levels of TBCA are significantly associated with lower AD risk[79]. These findings suggest that TBCA may serve as a potential protective factor against AD.

Vacuolar protein sorting-associated protein 29 (VPS29) was negatively associated with all three main outcome traits of AD, as well as with EOAD and LOAD (P values significant at the Bonferroni-corrected threshold). VPS29 is a component of the retromer complex and is highly expressed in the brain, heart, and kidneys, playing an essential role in retromer functions such as synaptic transmission, survival, and movement[80]. Retromer mainly consists of the VPS26-VPS29-VPS35 trimer and sorting nexins (SNXs), and its defects are closely related to various human diseases, including neurodegenerative diseases[80]. Studies have reported that VPS29 knockdown leads to reduced levels of VPS35 and VPS26[81,82], and that VPS29 regulates the localization of retromer within neurons and is essential for the aging nervous system[80]. The retromer complex has been found to regulate the transport of a variety of substances, including amyloid precursor protein (APP), β-secretase, and phagocytic receptors on microglia. By regulating the transport of the relevant carrier proteins, the retromer complex regulates the production of amyloid-β (Aβ) and thus plays a role in AD[83]. When the retromer complex malfunctions, the retrograde transport of APP and β-secretase to the trans-Golgi network is disrupted, which increases the production of Aβ and accelerates the pathological process of AD[84]. Meanwhile, the reduction of phagocytic receptors on the surface of microglia weakens their clearance and protective functions. Recent studies have shown that stabilizing the retromer complex through chaperone proteins can limit the amyloidogenic processing of APP and reduce the production of Aβ[83]. These findings suggest that the retromer complex can serve as a new therapeutic target for intervening in the pathological progression of AD.
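
The Bonferroni criterion cited for TBCA and VPS29 can be illustrated with a minimal sketch. The family-wise significance level (0.05) and the number of tests used here (one test per exposure trait, 4,840) are assumptions made for illustration only; the correction actually applied in the analysis is not specified in this section, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold alpha / n_tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def passes_bonferroni(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Assumed values for illustration, not taken from the analysis itself.
ASSUMED_ALPHA = 0.05
ASSUMED_N_TESTS = 4840  # assumption: one test per exposure trait

threshold = bonferroni_threshold(ASSUMED_ALPHA, ASSUMED_N_TESTS)
print(f"Bonferroni-corrected threshold: {threshold:.3e}")
```

Under these assumptions, an association would be called significant only if its P value fell below roughly 1e-5, which is far stricter than the nominal 0.05 level.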

Guanine nucleotide-binding protein G(k) subunit alpha (GNAI3) was negatively associated with all three main outcome traits of AD and with the risk of LOAD. G proteins are a class of signal transduction proteins that bind guanosine diphosphate (GDP) and have guanosine triphosphate (GTP) hydrolysis activity. More than 40 types are known, each consisting of alpha, beta, and gamma subunits with a total molecular weight of about 100 kDa; the alpha subunit shows the greatest variation and determines the specificity of the G protein[85]. G proteins are intracellular membrane proteins that shuttle between receptors and effector proteins, acting as signal transducers and playing a dominant role in transmembrane cell signaling in the body. All cellular activities are related to signals: signals are the initiating factors of cell activity, while physiological responses are the final results of signals acting on cells. After receiving external stimuli, cells respond through a set of specific signal transduction mechanisms that ultimately regulate the expression of specific genes, and the whole process is referred to as a cellular signaling pathway. In the pathogenesis of AD, the abnormal content and distribution of multiple signaling molecules, as well as abnormalities in signal transmission pathways, play an important role in AD pathological changes[86], suggesting that insights into signal transduction mechanisms may provide a new route for exploring the pathogenesis of AD.

Proteasome activator complex subunit 1 (PSME1) was negatively associated with all three main outcome traits of AD and with the risk of LOAD. PSME1 is the gene encoding the 11S proteasome activator subunit (also known as PA28α) and is located on human chromosome 14q11.2. PA28α is an activator of the proteasome that mainly increases the protein degradation activity of the 20S proteasome and participates in MHC-I (major histocompatibility complex I) restricted antigen presentation[87]. Studies have shown that PA28α overexpression in the brain of female mice can effectively prevent protein aggregation in the hippocampus, thereby reducing depression-like behavior and enhancing learning and memory ability[88]. Related studies have shown that proteasome function and PA28α expression are inhibited in the brain of diabetic rats[88], and that PA28 expression in the diabetic brain has some regulatory effect on the changes in protein metabolism caused by oxidative damage[88]. Taken together, these findings suggest that PSME1 may be a new potential therapeutic target for AD and deserves further investigation.

Conclusions

To the best of our knowledge, this is one of the most comprehensive studies to provide important insight into the genetic etiology underlying AD based on hypothesis-free Mendelian randomization analysis. We also developed the first MR platform for AD, which is of great clinical and scientific significance because it provides a thorough and comprehensive evaluation of risk and protective factors for AD. It also provides physicians and scientists with a convenient, free, and user-friendly tool for further scientific investigation. Notably, we identified CD33, TBCA, VPS29, GNAI3, and PSME1 as novel potential therapeutic targets for AD that deserve more detailed investigation. However, this study currently has limitations in terms of population. The GWAS datasets for both the exposure and the outcome traits (AD) were obtained from a public database (MRC IEU OpenGWAS), in which the GWAS datasets for AD cover only European populations, and the TwoSampleMR package that we used requires the exposure and outcome traits to come from the same population so that population is held constant as a control variable. To address these limitations, we have initiated a Mendelian randomization study on AD at clinical hospitals in China, which is currently in the sample collection stage. In the future, we will integrate data from more populations and continuously incorporate new advances in AD research to explore potential differences between populations.

Declarations

Ethics approval and consent to participate

Not applicable. All data in this study are sourced from publicly available datasets.

Consent for publication

Not applicable. All data in this study are sourced from publicly available datasets.

Competing interests

We have no conflict of interest to declare. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Funding

This work was supported by the National Natural Science Foundation of China (No. 82302872) and the Changchun Science and Technology Planning Project (No. 21ZY18).

Authors’ contributions

Zhao TY: Methodology, formal analysis, data curation, visualization, and writing—original draft preparation; Li H: software; Zhang MS, Xu Y and Zhang M: writing—editing; Chen L: conceptualization and supervision. All authors have read and approved the published version of the manuscript.

Acknowledgements

We would like to thank Taylor & Francis (www.tandfeditingservices.com) for English language editing.

Availability of data and material

Data availability

Publicly available datasets were analyzed in this study. These data can be found in MRC IEU OpenGWAS (https://gwas.mrcieu.ac.uk/) and UniProt (https://www.uniprot.org/). The database searches were completed on January 30, 2023.

Code availability

The MRAD platform can be freely accessed online at https://gwasmrad.com/mrad/.

The main project development repository is available at https://github.com/ZhaoTianyu-zty/MRAD.

Abbreviations

  • AD: Alzheimer’s disease

  • APP: amyloid precursor protein

  • Aβ: amyloid-β

  • CI: confidence interval

  • EOAD: early-onset Alzheimer’s disease

  • eQTL: expression quantitative trait loci

  • G protein: guanine nucleotide-binding protein

  • GNAI3: guanine nucleotide-binding protein G(k) subunit alpha

  • GDP: guanosine diphosphate

  • Gln: glutamine

  • GTP: guanosine triphosphate

  • GWAS: genome-wide association study

  • HLA: human leukocyte antigen

  • IVW: inverse variance weighted

  • LD: linkage disequilibrium

  • LDL: low-density lipoprotein

  • LOAD: late-onset Alzheimer’s disease

  • MR: Mendelian randomization

  • MRAD: Mendelian randomization for Alzheimer’s disease

  • OR: odds ratio

  • PSME1: proteasome activator complex subunit 1

  • RCT: randomized controlled trial

  • SIGLECS: sialic acid-binding immunoglobulin-like lectins

  • SNPs: single nucleotide polymorphisms

  • SNXs: sorting nexins

  • TBCA: tubulin-specific chaperone A

  • TBCs: tubulin-specific chaperones

  • VLDL: very low-density lipoprotein

  • VPS29: vacuolar protein sorting-associated protein 29

  • WHO: World Health Organization

Highlights

(1) To the best of our knowledge, this is one of the most comprehensive studies to provide important insight into the genetic etiology underlying AD based on hypothesis-free Mendelian randomization analysis. We generated 400,274 data entries in total. Of these, the main analysis method, the IVW model, accounts for 73,129 records covering 4,840 exposure traits, which fall into 10 categories: Disease (n=17,168), Medical laboratory science (n=15,416), Imaging (n=4,896), Anthropometric (n=4,478), Treatment (n=4,546), Molecular trait (n=17,757), Gut microbiota (n=48), Past history (n=668), Family history (n=1,114), and Lifestyle trait (n=7,038).
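
As a quick consistency check on the figures in Highlight (1), the sketch below sums the ten category counts and compares the result with the reported number of IVW records. The counts are copied from the text; the variable names are illustrative only.

```python
# Per-category IVW record counts as reported in Highlight (1).
CATEGORY_COUNTS = {
    "Disease": 17_168,
    "Medical laboratory science": 15_416,
    "Imaging": 4_896,
    "Anthropometric": 4_478,
    "Treatment": 4_546,
    "Molecular trait": 17_757,
    "Gut microbiota": 48,
    "Past history": 668,
    "Family history": 1_114,
    "Lifestyle trait": 7_038,
}

REPORTED_IVW_TOTAL = 73_129  # total IVW records stated in the text

total = sum(CATEGORY_COUNTS.values())
print(f"Sum of category counts: {total:,}")
print(f"Matches reported total: {total == REPORTED_IVW_TOTAL}")

# Share of IVW records contributed by each category, largest first.
for name, count in sorted(CATEGORY_COUNTS.items(), key=lambda kv: -kv[1]):
    print(f"{name:<28}{count:>8,}{count / total:>8.1%}")
```

The category counts add up to the reported 73,129 IVW records, with Molecular trait and Disease being the two largest categories.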

(2) We developed the first MR platform for AD, which is of great clinical and scientific significance because it provides a thorough and comprehensive evaluation of risk and protective factors for AD. It also provides physicians and scientists with a convenient, free, and user-friendly tool for further scientific investigation. The overall method used to construct this platform can be applied to research on the etiology of other diseases.

(3) We identified CD33, TBCA, VPS29, GNAI3, and PSME1 as novel potential therapeutic targets that might be promising drug targets for AD and warrant further clinical investigation, especially TBCA and VPS29.