Mendelian randomization analysis provides causality of smoking on the expression of ACE2, a putative SARS-CoV-2 receptor

  1. Hui Liu
  2. Junyi Xin
  3. Sheng Cai
  4. Xia Jiang  Is a corresponding author
  1. Biomedical Research Center, Zhejiang Provincial Key Laboratory of Laparoscopic Technology, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, China
  2. Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, China
  3. Institute of Drug Metabolism and Pharmaceutical Analysis, Zhejiang Province Key Laboratory of Anti-Cancer Drug Research, Zhejiang University, China
  4. Department of Clinical Neuroscience, Center for Molecular Medicine, Karolinska Institute, Sweden

Abstract

Background:

To understand a causal role of modifiable lifestyle factors in angiotensin-converting enzyme 2 (ACE2) expression (a putative severe acute respiratory syndrome coronavirus 2 [SARS-CoV-2] receptor) across 44 human tissues/organs, and in coronavirus disease 2019 (COVID-19) susceptibility and severity, we conducted a phenome-wide two-sample Mendelian randomization (MR) study.

Methods:

More than 500 genetic variants were used as instrumental variables to predict smoking and alcohol consumption. The inverse-variance weighted approach was adopted as the primary method to estimate causal associations, while MR-Egger regression, weighted median, and MR pleiotropy residual sum and outlier (MR-PRESSO) were performed to identify potential horizontal pleiotropy.

Results:

We found that genetically predicted smoking intensity significantly increased ACE2 expression in thyroid (β=1.468, p=1.8×10−4), and increased ACE2 expression in adipose, brain, colon, and liver with nominal significance. Additionally, genetically predicted smoking initiation significantly increased the risk of COVID-19 onset (odds ratio=1.15, p=8.7×10−5). No statistically significant result was observed for alcohol consumption.

Conclusions:

Our work demonstrates an important role of smoking, measured by both status and intensity, in the susceptibility to COVID-19.

Funding:

XJ is supported by research grants from the Swedish Research Council (VR-2018–02247) and Swedish Research Council for Health, Working Life and Welfare (FORTE-2020–00884).

Introduction

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has led to a worldwide pandemic of coronavirus disease 2019 (COVID-19) (Coronaviridae Study Group of the International Committee on Taxonomy of Viruses, 2020; World Health Organization, 2020). Angiotensin-converting enzyme 2 (ACE2) is a host receptor of SARS-CoV-2, and its expression level has been found to influence both the risk and severity of infection (Hoffmann et al., 2020; Wrapp et al., 2020; Zhou et al., 2020a; Li et al., 2003; Li et al., 2005). Moreover, a growing body of evidence from epidemiological investigations has demonstrated a substantial disparity in the susceptibility to infection (Guan et al., 2020; Hu et al., 2020; Mehra et al., 2020; Patanavanich and Glantz, 2020). For example, a multi-center study involving 8910 COVID-19 cases from 169 hospitals in Europe and North America identified an increased risk of in-hospital death among current smokers (odds ratio [OR]=1.79; 95% CI: 1.29–2.47) compared with former smokers or non-smokers (Mehra et al., 2020).

Consistent with findings from large-scale population-based observational studies, a laboratory-based study involving 131 RNA-sequenced human lung cancer tissues (54 samples from individuals of European ancestry and 77 samples from individuals of Asian ancestry) found that smokers expressed a significantly higher level of ACE2 than non-smokers in both populations, suggesting a potentially heightened susceptibility to SARS-CoV-2 infection (Cai, 2020). Furthermore, when two additional DNA microarray datasets of lung cancer were incorporated, the significant smoking-ACE2 association observed in a total of 224 samples did not change after adjusting for age, sex, race, and platform. Nevertheless, these samples were derived from lung cancer patients, restricting the generalizability of the findings to normal lung tissue and to the general population. In a related work, Rao et al. conducted a phenome-wide Mendelian randomization (MR) study examining a large number of diseases, traits, and blood proteins and identified several ‘exposures’, including diabetes, breast cancer, lung cancer, inflammatory bowel disease, and smoking, that increased ACE2 expression in normal lung tissue (Rao et al., 2020). This analysis, despite its substantially augmented number of exposures (N=3948), has several limitations. First, disease statuses such as diabetes and cancer are difficult to modify; at the population level, it is more important to discover and intervene on modifiable risk factors such as smoking and alcohol consumption. Second, regarding smoking, only three single nucleotide polymorphisms (SNPs) were used as instrumental variables (IVs) by Rao et al., which explained negligible phenotypic variation and did not accurately predict smoking status. By contrast, the hitherto largest genome-wide association study (GWAS) of tobacco use was conducted in a total of 1.2 million individuals and discovered over 400 genetic variants associated with smoking initiation and intensity (Liu et al., 2019).
Last but not least, Rao et al. focused on lung tissue instead of systematically examining all human tissues. Although the lungs are the organ most relevant and vulnerable to COVID-19, a respiratory syndrome, recent studies have identified the involvement of other human tissues (e.g. gastrointestinal tract) in SARS-CoV-2 infection (Zou et al., 2020).

Motivated by these findings, we aim to explore whether genetic predisposition to common modifiable human behaviours, including smoking and alcohol consumption, could lead to increased ACE2 expression, which subsequently leads to increased susceptibility to and severity of COVID-19. Here, we conduct a phenome-wide MR analysis by incorporating ACE2 expression from a broad spectrum of tissues/organs available in the GTEx database. As one of the hitherto largest databases with concomitant information on DNA genotype and RNA expression, GTEx collects a large variety of tissues (N=44) from a healthy population (deceased donors) (Gamazon et al., 2018). In addition, data on COVID-19 susceptibility and severity were obtained from the COVID-19 Host Genetics Initiative, a global project that aims to understand the role of the host genome in COVID-19 outcomes (COVID-19 Host Genetics Initiative, 2020). Hundreds of genetic variants identified by a large-scale GWAS of tobacco use and alcohol consumption were used as IVs; incorporating additional loci greatly enhances the strength of genetic instruments as well as both the accuracy and precision of MR estimates (Liu et al., 2019).

Materials and methods

Data on IV-exposure

IV-exposure associations were extracted from the hitherto largest GWAS conducted by the GSCAN consortium (GWAS and Sequencing Consortium of Alcohol and Nicotine use) for tobacco use and alcohol consumption, totalling 1.2 million individuals of European ancestry (Liu et al., 2019). This GWAS first meta-analysed summary-level data from each participating cohort and identified independent SNPs passing genome-wide significance (p<5×10−8) based on linkage disequilibrium. After that, additional independent and genome-wide significant SNPs were selected using a conditional analysis within each significant locus, defined as a 1 Mb region surrounding the sentinel variant (the variant in the locus with the lowest p-value). We used all conditionally independent biallelic SNPs as IVs.

In our analysis, we included two smoking phenotypes and one drinking phenotype: smoking initiation as reflected by never vs. ever smoking (IV=378, N=1,232,091), smoking intensity as reflected by cigarettes per day (IV=55, N=337,334), and common (as opposed to excessive or harmful) alcohol drinking behaviour defined as drinks per week (IV=99, N=941,280). The proportion of phenotypic variance explained by the IVs was 2.3% for smoking initiation, 1.1% for cigarettes per day, and 0.2% for drinks per week. Detailed information regarding the IVs for each exposure is shown in Supplementary file 1a-1c.

A strong instrumental variable is a basic requirement for a valid MR result. The strength of the IVs was verified by calculating F-statistics using the formula $F = \frac{R^2 (n - 1 - k)}{(1 - R^2)\,k}$, where $R^2$ is the proportion of variance explained by the IVs, $k$ refers to the number of IVs, and $n$ indicates the sample size (Pierce et al., 2011). The F-statistics for smoking initiation, smoking intensity (cigarettes per day), and alcohol consumption (drinks per week) were 77.2, 67.4, and 17.8, respectively, indicating strong IVs (F-statistics > 10) for each of our exposures of interest.
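As an illustration of the F-statistic formula above, the following minimal sketch plugs in the variance explained, sample sizes, and numbers of IVs reported in the text. The function name `f_statistic` and the dictionary `EXPOSURES` are hypothetical names introduced here. Because the published R² values are rounded to one decimal place, the computed values are only expected to approximate the reported F-statistics, not to reproduce them exactly.

```python
def f_statistic(r2: float, n: int, k: int) -> float:
    """Return F = R^2 (n - 1 - k) / ((1 - R^2) k) for k instruments."""
    if not 0.0 < r2 < 1.0:
        raise ValueError("r2 must lie strictly between 0 and 1")
    if k < 1 or n <= k + 1:
        raise ValueError("need k >= 1 and n > k + 1")
    return r2 * (n - 1 - k) / ((1.0 - r2) * k)


# (R^2, sample size n, number of IVs k) taken from the text above.
# The R^2 values are rounded, so the results are approximate.
EXPOSURES = {
    "smoking initiation": (0.023, 1_232_091, 378),
    "cigarettes per day": (0.011, 337_334, 55),
    "drinks per week": (0.002, 941_280, 99),
}

for name, (r2, n, k) in EXPOSURES.items():
    f_value = f_statistic(r2, n, k)
    label = "strong" if f_value > 10 else "weak"
    print(f"{name}: F = {f_value:.1f} ({label} by the F > 10 rule)")
```

All three exposures clear the conventional F > 10 threshold under these inputs, in line with the conclusion stated in the text.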

Data on IV-outcome

Associations of genetic variants with ACE2 expression were extracted from the GTEx database release version 8 available at the GTEx Portal (http://www.gtexportal.org). It is one of the largest databases with concomitant information on genotype and expression data for a large variety of non-diseased tissues collected from >1000 human donors. Out of the total 54 tissues/organs, we focused on ACE2 expression in 44 tissues/organs with a decent sample size involving at least 100 individuals to ensure statistical power (Supplementary file 1d). Specifically, the associations of genotype with ACE2 expression in adipose tissue, artery, brain, colon, oesophagus, heart, liver, lung, minor salivary gland, muscle, nerve, ovary, pancreas, pituitary, prostate, skin, small intestine, stomach, testis, thyroid, uterus, and vagina were included.

Unlike most existing MR studies that consider disease status as outcomes, this MR study treats gene expression levels as outcomes. GTEx identifies expression quantitative trait loci (eQTL) by associating genetic variations called from GWAS with gene expression levels obtained from RNA-sequencing. Expression values for each gene were inverse quantile normalized to a standard normal distribution across samples as previously described (Gamazon et al., 2018). Please note, here we studied ACE2 RNA expression instead of its actual protein expression. Since ACE2 RNA and protein quantities have been found to be well correlated in tissues such as lung and kidney (Wang et al., 2020), ACE2 RNA expression acts as a good proxy for ACE2 protein expression.

In addition to ACE2 expression, associations of genetic variants with COVID-19 susceptibility and severity were extracted from the COVID-19 Host Genetics Initiative (https://www.covid19hg.org). It is a global genetics collaboration aiming to explore the genetic determinants of COVID-19 susceptibility and severity. We used the summary statistics from the data freeze 5 of GWAS meta-analysis, which was released publicly on January 18, 2021. Three COVID-19 outcomes, including susceptibility to COVID-19, hospitalized COVID-19, and very severe respiratory confirmed COVID-19, were used in our analysis.

Briefly, cases for the COVID-19 susceptibility outcome were defined as individuals with a laboratory-confirmed SARS-CoV-2 infection (via nucleic acid amplification test or serological test), a clinician diagnosis, health record evidence by ICD coding, or a self-report (Ncase=38,984 vs. Ncontrol=1,644,784). Hospitalized COVID-19 cases were defined as individuals hospitalized due to COVID-19 related symptoms with a laboratory-confirmed SARS-CoV-2 infection (Ncase=9986 vs. Ncontrol=1,877,672). Very severe respiratory confirmed COVID-19 cases were defined as hospitalized individuals with a laboratory-confirmed SARS-CoV-2 infection who needed respiratory support beyond simple oxygen supplementation or who died due to COVID-19 (Ncase=5101 vs. Ncontrol=1,383,241).

Statistical analysis

MR uses SNPs as proxies for exposure(s), assuming that SNPs are randomly allotted at conception, mirroring a randomized procedure, and that SNPs always precede disease onset, which eliminates reverse causality. Three essential model assumptions need to be fulfilled to guarantee valid IVs (Zheng et al., 2017): IVs are associated with the exposure (relevance assumption); there is no association between IVs and any confounders of the exposure-outcome relationship (independence assumption); and IVs are associated with the outcome only through the studied exposure (exclusion restriction assumption). If all three model assumptions are satisfied, a causal relationship can be inferred from the observed IV-exposure and IV-outcome associations.

We conducted a two-sample MR, where IV-exposure and IV-outcome associations were estimated in two non-overlapping samples. The inverse-variance weighted (IVW) approach was applied as the primary method to estimate the causal link between exposures (smoking and alcohol consumption) and outcomes (ACE2 expression and COVID-19 related adverse outcomes) (Burgess et al., 2015). For each IV, a ratio estimate is calculated by dividing the IV-outcome association by the IV-exposure association; these ratio estimates are then combined across IVs, weighted by the reciprocal of an approximate expression for their asymptotic variance. To evaluate potential heterogeneity among causal effects of different genetic variants, Cochran’s Q test was performed, and p<0.05 was taken to indicate heterogeneity (Greco M et al., 2015).
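The IVW calculation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a fixed-effect IVW estimator with first-order weights (squared IV-exposure association divided by the squared standard error of the IV-outcome association), and the function names and the toy summary statistics are hypothetical. Only the Python standard library is used, so Cochran's Q is returned as a statistic to be compared against a chi-square distribution with k−1 degrees of freedom rather than converted to a p-value.

```python
import math


def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate, its standard error, and Cochran's Q.

    Each variant contributes a ratio estimate beta_out / beta_exp with
    first-order weight (beta_exp / se_out)^2, i.e. the reciprocal of the
    approximate variance of that ratio.
    """
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exp, se_out)]
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations of the ratio estimates.
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    return beta, se, q


def two_sided_p(beta, se):
    """Two-sided p-value under a normal approximation."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


# Hypothetical per-variant summary statistics (not taken from the study).
beta_exp = [0.10, 0.20, 0.15, 0.05]
beta_out = [0.06, 0.09, 0.08, 0.02]
se_out = [0.02, 0.02, 0.03, 0.02]

beta, se, q = ivw_estimate(beta_exp, beta_out, se_out)
print(f"IVW beta = {beta:.3f}, SE = {se:.3f}, p = {two_sided_p(beta, se):.3g}")
print(f"Cochran's Q = {q:.3f} on {len(beta_exp) - 1} degrees of freedom")
```

A large Q relative to its degrees of freedom would suggest that the variant-specific ratio estimates disagree more than sampling error alone would predict.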

One major concern of MR is horizontal pleiotropy, meaning that genetic variants influence the outcome through pathways other than the exposure. A series of sensitivity analyses were conducted to detect and correct for such a scenario. First, MR-Egger regression was adopted to examine the presence of potential pleiotropy, as well as to complement results from the main analysis (IVW) (Bowden et al., 2015). When the instrument strength independent of direct effect (InSIDE) assumption holds, an MR-Egger regression intercept that differs from zero indicates the presence of horizontal pleiotropy. In addition to the IVW approach and the MR-Egger regression, the weighted median method and the MR-pleiotropy residual sum and outlier (MR-PRESSO) test were also used to detect potential horizontal pleiotropy. The weighted median estimate is calculated as the median of the weighted empirical distribution function of the ratio IV estimates evaluated using each genetic variant individually (Bowden et al., 2016). A p-value of the MR-PRESSO global test less than 1.0×10−6 indicates the presence of horizontal pleiotropy. MR-PRESSO not only identifies horizontal pleiotropy by comparing the observed residual sum of squares to the expected residual sum of squares, but also corrects for horizontal pleiotropy by removing outliers (Verbanck et al., 2018). Second, we excluded palindromic SNPs with ambiguous strand identification and performed the IVW method on the remaining SNPs. Subsequently, we removed SNPs associated with potential confounding traits as confirmed by the GWAS Catalog (basically any trait other than our exposure of interest). Moreover, leave-one-out analysis was performed, in which we excluded one SNP at a time and conducted IVW on the remaining SNPs to identify the potential influence of outlying variants on the overall estimates.
Results were considered significant only if they reached statistical significance (p<0.05) in the IVW approach or the MR-Egger regression and remained directionally consistent in the weighted median and MR-PRESSO methods across both primary and sensitivity analyses. We additionally set a corrected p-value threshold by dividing 0.05 by the number of outcomes (44 tissues/organs and 3 COVID-19 outcomes): 0.05/(44+3)≈1.0×10−3.
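Two of the sensitivity estimators named above, MR-Egger regression and the weighted median, can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' implementation: variants are oriented so that the IV-exposure association is positive before the MR-Egger fit, MR-Egger weights are the inverse variances of the IV-outcome associations, and the weighted median uses first-order weights with linear interpolation between adjacent ratio estimates. The function names and toy data are hypothetical, and standard errors (usually obtained by regression theory or bootstrapping) are omitted for brevity.

```python
def mr_egger(beta_exp, beta_out, se_out):
    """Weighted regression of IV-outcome on IV-exposure with an intercept.

    Returns (intercept, slope). A non-zero intercept suggests directional
    horizontal pleiotropy; the slope is the MR-Egger causal estimate.
    """
    # Orient every variant so that its exposure association is positive.
    bx = [abs(b) for b in beta_exp]
    by = [y if b >= 0 else -y for b, y in zip(beta_exp, beta_out)]
    w = [1.0 / s ** 2 for s in se_out]
    sw = sum(w)
    mean_x = sum(wi * x for wi, x in zip(w, bx)) / sw
    mean_y = sum(wi * y for wi, y in zip(w, by)) / sw
    sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, bx))
    sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, bx, by))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def weighted_median(beta_exp, beta_out, se_out):
    """Median of the weighted empirical distribution of ratio estimates."""
    if len(beta_exp) < 2:
        raise ValueError("need at least two variants")
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exp, se_out)]
    order = sorted(range(len(ratios)), key=lambda i: ratios[i])
    r = [ratios[i] for i in order]
    w = [weights[i] for i in order]
    total = sum(w)
    # Position of each ordered estimate on the cumulative weight scale.
    positions, cumulative = [], 0.0
    for wi in w:
        cumulative += wi
        positions.append((cumulative - 0.5 * wi) / total)
    below = max(i for i, p in enumerate(positions) if p < 0.5)
    frac = (0.5 - positions[below]) / (positions[below + 1] - positions[below])
    return r[below] + (r[below + 1] - r[below]) * frac


# Hypothetical summary statistics in which the last variant is an outlier.
beta_exp = [0.10, 0.20, 0.15, 0.10]
beta_out = [0.05, 0.10, 0.075, 0.15]
se_out = [0.02, 0.02, 0.02, 0.02]

intercept, slope = mr_egger(beta_exp, beta_out, se_out)
print(f"MR-Egger intercept = {intercept:.3f}, slope = {slope:.3f}")
print(f"Weighted median = {weighted_median(beta_exp, beta_out, se_out):.3f}")
```

The weighted median is useful precisely because a minority of invalid instruments, such as the outlying variant in the toy data, has limited influence on it.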

Finally, the power of current study was estimated according to a method suggested by Brion et al., 2013. To guarantee statistical power, we only included tissues/organs with at least 100 samples in the GTEx database. Under the current sample size, given 1–2% of the phenotypic variance of smoking and alcohol consumption explained by IVs, our study had sufficient power (>80%) to detect a causal effect of 0.74–2.66 in ACE2 expression, and to detect an OR ranging from 1.11 to 1.39 for COVID-19 related outcomes (Supplementary file 1e).

Results

We extracted a total of 532 independent SNPs that achieved genome-wide significance for smoking initiation (N=378), cigarettes per day (N=55), and drinks per week (N=99) from the GSCAN consortium and used them as IVs (Supplementary file 1a-c). We were able to map 80–95% of those IVs to the GTEx database and to the COVID-19 Host Genetics Initiative database, which represents high coverage. The flowchart of IV selection is shown in Figure 1.

Figure 1
Flowchart of the selection of instrumental variables.

We found that genetically instrumented smoking initiation was associated with significantly increased ACE2 expression in brain putamen basal ganglia (β=1.117, p=0.006), in brain hypothalamus (β=0.848, p=0.022), and in subcutaneous adipose tissue (β=0.285, p=0.016) using the IVW approach (Table 1). Results remained directionally consistent in the MR-Egger regression, although with greater statistical uncertainty (β=1.667, p=0.334 in brain putamen basal ganglia; β=0.963, p=0.545 in brain hypothalamus; β=0.663, p=0.205 in subcutaneous adipose tissue). Additionally, increased ACE2 expression in two colon tissues was observed only through the MR-Egger regression (transverse colon [β=1.129, p=0.017] and sigmoid colon [β=1.925, p=0.042]), and the direction of estimates remained consistent in the IVW approach. For all these associations, we did not find apparent heterogeneity as indicated by Cochran’s Q statistics (all p>0.05) or horizontal pleiotropy as indicated by the MR-Egger intercept (all p>0.05) and the MR-PRESSO global test (all p>1.0×10−6), except that horizontal pleiotropy was observed in transverse colon by the MR-Egger intercept (p=0.04). Although significant associations of smoking initiation with ACE2 expression in brain caudate basal ganglia and in cerebellar hemisphere were found using the IVW approach, the direction of estimates was opposite in the MR-Egger regression (Supplementary file 1f). These associations were therefore not considered informative.

Table 1
Causal association of smoking initiation and angiotensin-converting enzyme 2 (ACE2) expression.
| Organ/tissue | Method | N of IVs | Beta | SE | p | p* |
| --- | --- | --- | --- | --- | --- | --- |
| Adipose – subcutaneous | Inverse-variance weighted | 358 | 0.285 | 0.118 | 0.016 | 0.834 |
| | MR-Egger | 358 | 0.663 | 0.522 | 0.205 | 0.456 |
| | Weighted median | 358 | 0.142 | 0.186 | 0.444 | |
| | MR-PRESSO | 358 | 0.285 | 0.118 | 0.017 | 0.846 |
| Brain – hypothalamus | Inverse-variance weighted | 356 | 0.848 | 0.369 | 0.022 | 0.614 |
| | MR-Egger | 356 | 0.963 | 1.588 | 0.545 | 0.941 |
| | Weighted median | 356 | 0.549 | 0.566 | 0.332 | |
| | MR-PRESSO | 356 | 0.848 | 0.369 | 0.022 | 0.655 |
| Brain – putamen (basal ganglia) | Inverse-variance weighted | 357 | 1.117 | 0.406 | 0.006 | 0.334 |
| | MR-Egger | 357 | 1.667 | 1.724 | 0.334 | 0.743 |
| | Weighted median | 357 | 1.256 | 0.607 | 0.039 | |
| | MR-PRESSO | 357 | 1.117 | 0.406 | 0.006 | 0.321 |
| Colon – sigmoid | Inverse-variance weighted | 359 | 0.314 | 0.214 | 0.143 | 0.887 |
| | MR-Egger | 359 | 1.925 | 0.945 | 0.042 | 0.080 |
| | Weighted median | 359 | 0.473 | 0.334 | 0.156 | |
| | MR-PRESSO | 359 | 0.314 | 0.214 | 0.144 | 0.904 |
| Colon – transverse | Inverse-variance weighted | 359 | 0.193 | 0.113 | 0.088 | 0.348 |
| | MR-Egger | 359 | 1.129 | 0.471 | 0.017 | 0.041 |
| | Weighted median | 359 | 0.262 | 0.165 | 0.114 | |
| | MR-PRESSO | 359 | 0.193 | 0.113 | 0.089 | 0.362 |

*p indicates the p-value for heterogeneity from the inverse-variance weighted (IVW) approach, the p-value of the intercept from MR-Egger regression, or the p-value from the Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) global test.

We further identified that genetically predicted smoking intensity, as reflected by cigarettes per day, was associated with significantly elevated ACE2 expression in thyroid (β=1.468, p=1.8×10−4), in liver (β=1.216, p=0.009), in brain hypothalamus (β=1.789, p=0.014), and in ovary (β=1.545, p=0.026) using the IVW method (Table 2). Results remained directionally consistent in the MR-Egger regression (β=1.739, p=0.062 in thyroid; β=1.132, p=0.226 in liver; β=0.041, p=0.983 in brain hypothalamus; β=1.658, p=0.347 in ovary). In contrast, levels of ACE2 expression decreased with genetically instrumented smoking intensity in sigmoid colon tissue (β=−1.971, p=0.019) and in vagina tissue (β=−3.271, p=0.043) using the MR-Egger regression. For all these associations, no apparent horizontal pleiotropy or heterogeneity was found (Supplementary file 1g).

Table 2
Causal association of cigarettes per day and angiotensin-converting enzyme 2 (ACE2) expression.
| Organ/tissue | Method | N of IVs | Beta | SE | p | p* |
| --- | --- | --- | --- | --- | --- | --- |
| Brain – hypothalamus | Inverse-variance weighted | 48 | 1.789 | 0.730 | 0.014 | 0.959 |
| | MR-Egger | 48 | 0.041 | 1.920 | 0.983 | 0.310 |
| | Weighted median | 48 | 2.182 | 1.347 | 0.105 | |
| | MR-PRESSO | 48 | 1.789 | 0.730 | 0.018 | 0.964 |
| Colon – sigmoid | Inverse-variance weighted | 47 | −0.832 | 0.439 | 0.058 | 0.652 |
| | MR-Egger | 47 | −1.971 | 0.807 | 0.019 | 0.092 |
| | Weighted median | 47 | −1.182 | 0.747 | 0.114 | |
| | MR-PRESSO | 47 | −0.832 | 0.439 | 0.064 | 0.701 |
| Liver | Inverse-variance weighted | 47 | 1.216 | 0.468 | 0.009 | 0.807 |
| | MR-Egger | 47 | 1.132 | 0.922 | 0.226 | 0.913 |
| | Weighted median | 47 | 1.058 | 0.850 | 0.213 | |
| | MR-PRESSO | 47 | 1.216 | 0.468 | 0.012 | 0.843 |
| Ovary | Inverse-variance weighted | 48 | 1.545 | 0.693 | 0.026 | 0.837 |
| | MR-Egger | 48 | 1.658 | 1.745 | 0.347 | 0.943 |
| | Weighted median | 48 | 2.545 | 1.217 | 0.037 | |
| | MR-PRESSO | 48 | 1.545 | 0.693 | 0.031 | 0.844 |
| Thyroid | Inverse-variance weighted | 47 | 1.468 | 0.392 | 1.8×10−4 | 0.604 |
| | MR-Egger | 47 | 1.739 | 0.907 | 0.062 | 0.739 |
| | Weighted median | 47 | 1.435 | 0.641 | 0.025 | |
| | MR-PRESSO | 47 | 1.468 | 0.392 | 5.0×10−4 | 0.670 |
| Vagina | Inverse-variance weighted | 48 | −1.150 | 0.688 | 0.094 | 0.055 |
| | MR-Egger | 48 | −3.271 | 1.574 | 0.043 | 0.142 |
| | Weighted median | 48 | −2.644 | 0.916 | 0.004 | |
| | MR-PRESSO | 48 | −1.150 | 0.688 | 0.101 | 0.057 |

*p indicates the p-value for heterogeneity from the inverse-variance weighted (IVW) approach, the p-value of the intercept from MR-Egger regression, or the p-value from the Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) global test.
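As a quick consistency check on the IVW rows of Table 2, each p-value can be approximately recovered from the tabulated beta and standard error. The sketch below assumes a two-sided Wald test under a normal approximation, which is an assumption made here for illustration (the article does not state the reference distribution); the names `wald_p` and `IVW_ROWS` are hypothetical.

```python
import math


def wald_p(beta: float, se: float) -> float:
    """Two-sided p-value for z = beta / se under a normal approximation."""
    return math.erfc(abs(beta) / se / math.sqrt(2.0))


# (beta, SE) for the IVW rows of Table 2 with positive estimates.
IVW_ROWS = {
    "thyroid": (1.468, 0.392),
    "liver": (1.216, 0.468),
    "brain hypothalamus": (1.789, 0.730),
    "ovary": (1.545, 0.693),
}

for tissue, (beta, se) in IVW_ROWS.items():
    print(f"{tissue}: z = {beta / se:.2f}, p = {wald_p(beta, se):.2g}")
```

Under this assumption the thyroid estimate yields a p-value on the order of 10−4, which agrees with the value tabulated for the IVW method.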

For alcohol consumption defined as drinks per week, only one suggestive association with ACE2 expression in tibial nerve was observed using the IVW approach (β=−1.462, p=0.006). However, the direction of effect from the MR-Egger regression was opposite (β=0.336, p=0.721). We therefore considered an overall null association as our main finding with alcohol consumption (Supplementary file 1h).

In the sensitivity analyses where we excluded palindromic or pleiotropic SNPs, results remained largely consistent with our primary findings (full results shown in Supplementary file 1i-1n). We sequentially excluded proxy SNPs to identify random error introduced by imperfect proxies. In the leave-one-out analyses where we iteratively removed one SNP each time and performed the IVW approach using the remaining SNPs, results were again concordant with our primary findings, indicating an absence of outlying SNPs (Appendix 1—figure 1).
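The leave-one-out procedure described above can be sketched as follows. This is a minimal illustration assuming the same fixed-effect IVW estimator with first-order weights; the helper names and the toy data (in which the last variant is a deliberate outlier) are hypothetical and are not taken from the study.

```python
def ivw_beta(beta_exp, beta_out, se_out):
    """Fixed-effect IVW point estimate with first-order weights."""
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exp, se_out)]
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


def leave_one_out(beta_exp, beta_out, se_out):
    """IVW estimate recomputed with each variant removed in turn."""
    n = len(beta_exp)
    estimates = []
    for dropped in range(n):
        keep = [j for j in range(n) if j != dropped]
        estimates.append(
            ivw_beta(
                [beta_exp[j] for j in keep],
                [beta_out[j] for j in keep],
                [se_out[j] for j in keep],
            )
        )
    return estimates


# Hypothetical data: three concordant variants and one outlier.
beta_exp = [0.10, 0.10, 0.10, 0.10]
beta_out = [0.05, 0.05, 0.05, 0.15]
se_out = [0.02, 0.02, 0.02, 0.02]

full = ivw_beta(beta_exp, beta_out, se_out)
for index, estimate in enumerate(leave_one_out(beta_exp, beta_out, se_out)):
    print(f"without SNP {index}: beta = {estimate:.3f} (all SNPs: {full:.3f})")
```

An estimate that shifts markedly when a single variant is dropped, as happens for the last variant in the toy data, flags that variant as influential.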

Finally, complementing the findings on ACE2 expression, we tested a putative causal link between smoking status, smoking intensity, alcohol consumption, and the risk of COVID-19 related adverse outcomes. We found that smoking initiation significantly increased the risk of COVID-19 onset (IVW: OR=1.15, 95% CI: 1.07–1.23, p=8.7×10−5) even after taking into account multiple comparisons (Table 3). Results remained significant in the weighted median and MR-PRESSO methods but showed larger statistical uncertainty in the MR-Egger regression. In addition, smoking initiation increased the risk of very severe respiratory confirmed COVID-19 and hospitalized COVID-19 using the IVW approach, but the results were not supported by the MR-Egger regression. Smoking intensity (cigarettes per day) increased only the risk of very severe respiratory confirmed COVID-19, and only in the MR-Egger regression (OR=5.99, 95% CI: 1.57–22.84, p=0.012) (Supplementary file 1o). In contrast, we did not find a causal link between alcohol consumption (drinks per week) and the risk of COVID-19 adverse outcomes (Supplementary file 1p). Findings did not change in the sensitivity analyses (Supplementary file 1q-r and Appendix 1—figure 4).

Table 3
Causal link of smoking initiation with the risk of coronavirus disease 2019 (COVID-19) related adverse outcomes.
| Outcome | Method | N of IVs | OR (95% CI) | p | p* |
| --- | --- | --- | --- | --- | --- |
| COVID-19 susceptibility | Inverse-variance weighted | 352 | 1.15 (1.07–1.23) | 8.7×10−5 | 6.7×10−5 |
| | MR-Egger | 352 | 1.11 (0.83–1.49) | 0.489 | 0.821 |
| | Weighted median | 352 | 1.18 (1.08–1.29) | 2.9×10−4 | |
| | MR-PRESSO | 352 | 1.15 (1.07–1.23) | 1.0×10−4 | 5.0×10−5 |
| Hospitalized COVID-19 | Inverse-variance weighted | 351 | 1.32 (1.16–1.50) | 3.7×10−5 | 0.009 |
| | MR-Egger | 351 | 0.78 (0.44–1.36) | 0.383 | 0.059 |
| | Weighted median | 351 | 1.37 (1.14–1.66) | 0.001 | |
| | MR-PRESSO | 351 | 1.32 (1.16–1.50) | 4.6×10−5 | 0.011 |
| Very severe respiratory confirmed COVID-19 | Inverse-variance weighted | 352 | 1.25 (1.03–1.51) | 0.025 | 0.114 |
| | MR-Egger | 352 | 0.83 (0.37–1.89) | 0.658 | 0.318 |
| | Weighted median | 352 | 1.17 (0.89–1.55) | 0.267 | |
| | MR-PRESSO | 352 | 1.25 (1.03–1.51) | 0.025 | 0.119 |

*p indicates the p-value for heterogeneity from the inverse-variance weighted (IVW) approach, the p-value of the intercept from MR-Egger regression, or the p-value from the Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) global test.
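For binary outcomes such as those in Table 3, MR estimates are obtained on the log-odds scale and then exponentiated. The sketch below shows that conversion and, in reverse, how a standard error can be approximately back-calculated from a published confidence interval. It assumes a symmetric 95% interval on the log scale (a Wald interval), which is an assumption made here for illustration; the function names are hypothetical, and because the published OR and limits are rounded, the recovered p-value is only approximate.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(log_or: float, se: float):
    """Return (OR, lower, upper) from a log-odds estimate and its SE."""
    return (
        math.exp(log_or),
        math.exp(log_or - Z_95 * se),
        math.exp(log_or + Z_95 * se),
    )


def se_from_ci(lower: float, upper: float) -> float:
    """Approximate SE of the log-odds from published 95% CI limits."""
    return (math.log(upper) - math.log(lower)) / (2.0 * Z_95)


def two_sided_p(log_or: float, se: float) -> float:
    """Two-sided p-value under a normal approximation."""
    return math.erfc(abs(log_or / se) / math.sqrt(2.0))


# IVW row for COVID-19 susceptibility: OR = 1.15 (1.07-1.23).
log_or = math.log(1.15)
se = se_from_ci(1.07, 1.23)
estimate, lower, upper = odds_ratio_ci(log_or, se)
print(f"log-OR = {log_or:.4f}, SE = {se:.4f}")
print(f"OR = {estimate:.2f} ({lower:.2f}-{upper:.2f}), p = {two_sided_p(log_or, se):.1e}")
```

The p-value recovered this way is of the same order of magnitude as the tabulated one, with the remaining difference attributable to rounding of the published OR and interval.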

Discussion

We conducted a large-scale genetic analysis to understand the role of cigarette smoking and alcohol consumption in ACE2 expression across multiple tissues/organs, and thereby their potential role in the prevention of COVID-19. Strong IVs were constructed using hundreds of SNPs associated with smoking and alcohol consumption. We capitalized on the summary statistics of the largest tissue-specific eQTL analysis conducted for ACE2 expression levels and the most up-to-date GWAS data on COVID-19 related adverse outcomes. We found a putative causal relationship between smoking-related phenotypes and increased ACE2 expression in multiple tissues, as well as increased susceptibility to and severity of COVID-19.

Our findings are supported by previous epidemiological studies which have demonstrated a significant association between smoking and COVID-19 disease progression or death (Guan et al., 2020; Hu et al., 2020; Mehra et al., 2020). For example, a study involving 214 patients with laboratory-confirmed COVID-19 from Wenzhou, China found that, compared to non-severe cases, patients with severe disease were more likely to be smokers (26.3% vs. 6.4%, p=0.038) (Zheng et al., 2020). Another study recruiting 78 patients with COVID-19 in Wuhan also found a higher proportion of ever-smokers in the COVID-19 progression group than in the improvement/stabilization group (OR=14.28, 95% CI: 1.58–25.00, p=0.018) (Liu et al., 2020). Consistent with these findings, a meta-analysis of a total of 11,590 COVID-19 cases demonstrated a higher proportion of smokers among the 2133 patients who experienced disease progression, suggesting that smoking aggravated COVID-19 progression (OR=1.91, 95% CI: 1.42–2.59) (Patanavanich and Glantz, 2020). In contrast, a few small-scale studies with sample sizes ranging from 44 to 191 conducted in Wuhan, China did not report a notable association of smoking with COVID-19 severity or progression (Huang et al., 2020; Yang et al., 2020; Zhang et al., 2020; Zhou et al., 2020b). For instance, a retrospective study including 191 patients, of whom 137 survived and 54 did not, found a comparable proportion of smokers among non-survivors and survivors (9% vs. 4%, p=0.21) (Zhou et al., 2020b). A meta-analysis including five studies (four in Wuhan and one across 30 provinces in Mainland China) (Guan et al., 2020; Liu et al., 2020; Huang et al., 2020; Yang et al., 2020; Zhang et al., 2020) involving 1399 individuals with COVID-19 revealed no significant association between smoking and disease severity (OR=1.69, 95% CI: 0.41–6.92) (Lippi and Henry, 2020).
In contrast to the findings from Asian populations, evidence from the Veterans Affairs Birth Cohort in a US population showed that current smoking was associated with lower COVID-19 susceptibility (OR=0.45, 95% CI: 0.35–0.57) (Rentsch et al., 2020). The contradictory results in those small-scale studies might be due to insufficient power, a low proportion of smokers, and limited representativeness of the study populations. For example, compared with the high proportion of smokers in China (an average prevalence of 26.6% in the general population), only 1.4% of patients were current smokers in the study conducted by Zhang et al., 2020 and 12.6% in the study conducted by Guan et al., 2020. Given these discrepancies, additional studies are warranted to confirm the role of smoking in both the onset and progression of COVID-19.

In addition to clinical observational studies, laboratory examination has demonstrated the importance of tissue specificity. For example, Cai et al. identified that smoking was associated with elevated expression of ACE2 in the lung, providing biological evidence linking smoking to increased susceptibility to or severity of SARS-CoV-2 infection (Cai, 2020). Moreover, Rao et al. conducted a phenome-wide MR study incorporating 3948 traits, diseases, and blood proteins and identified a nominally significant association between tobacco use and ACE2 expression in the lung (IVW: β=0.918, p=0.016) (Rao et al., 2020). However, this association did not survive correction for multiple comparisons (p-value for FDR was 0.51), which is consistent with our MR results using a greatly augmented number of IVs (for smoking status IV=378).

The biological mechanisms underlying smoking and tissue-specific ACE2 expression remain to be elucidated. ACE2 is considered the target receptor for SARS-CoV-2 entry into host cells, and increased expression of ACE2 appears to raise both the susceptibility to and severity of COVID-19. As a type I transmembrane metallocarboxypeptidase homologous to ACE, ACE2 is known to be expressed in a variety of tissues, including the respiratory tract, cardio-renal tissues, and gastrointestinal tissues (Harmer et al., 2002). Our study found that smoking increased ACE2 expression in multiple tissues, including brain and colon tissues. While SARS-CoV-2 mainly spreads via the respiratory tract, enrichment of SARS-CoV-2 in the gastrointestinal tract has been confirmed by testing viral RNA in stool from 71 patients with COVID-19, suggesting the importance of gastrointestinal involvement in the infection (Xiao et al., 2020). Furthermore, nearly one-fifth of COVID-19 patients remained SARS-CoV-2 RNA-positive in their stool despite negative results in their respiratory samples. In those patients, ACE2 was abundantly expressed in the gastrointestinal epithelia but rarely expressed in the esophageal epithelia. In addition, findings from public databases (GEO, GTEx, and HPA) have demonstrated a higher expression level of ACE2 in the gastrointestinal tract (colon, rectum, and small intestine) and liver than in the lung (Burgueño et al., 2020; Li et al., 2020; Pirola and Sookoian, 2020). Taking these pieces of evidence together, we could reasonably assume that the biological mechanisms underlying the link between ACE2 expression and COVID-19 susceptibility are complicated, involving multiple organs other than the lung.
Together with our findings linking smoking to increased ACE2 expression in both the transverse colon and the sigmoid colon, these results suggest that smoking influences ACE2 expression in the gastrointestinal tract, which may in turn affect the susceptibility to and severity of COVID-19.

We also found that smoking influenced ACE2 expression in the brain tissue, especially in putamen basal ganglia and hypothalamus. Smoking may promote cellular uptake of SARS-CoV-2 through nicotinic acetylcholine receptor (nAChR) signalling (Russo et al., 2020). It is worth noting that nAChR and ACE2 are known to be co-expressed on many sites such as cortex, striatum, and hypothalamus within human brain (Dani and Bertrand, 2007; Jones et al., 2009). Nicotine stimulation of the nAChR increased ACE2 expression within neural cells, indicating a likelihood that smokers are more vulnerable to COVID-19 (Olds and Kabbani, 2020).

To the best of our knowledge, we performed a large-scale phenome-wide MR study to understand a causal role of smoking and alcohol consumption in ACE2 expression, as well as in COVID-19 related outcomes, totalling 532 genetic associations from 1.2 million individuals of European ancestry and covering almost all tissues/organs of the human body (N=44). In addition, we rigorously selected proxy SNPs and performed a series of sensitivity analyses to address the MR model assumptions: we addressed the ‘relevance’ assumption by using GWAS-identified significant SNPs as IVs, and we assessed the ‘independence’ assumption and the ‘exclusion restriction’ assumption by performing several important sensitivity analyses. However, limitations need to be acknowledged. Although hundreds of SNPs were used as proxies for cigarette smoking and alcohol consumption, these GWAS-identified SNPs explained only a small fraction (1–2%) of phenotypic variance. In addition, the minimum sample size of 114 in brain substantia nigra tissue further limited the statistical power of the MR analysis, even though the GTEx database is so far the largest available database with both genotype and expression data. Moreover, data on the genotypes, tissue expression, and COVID-19 related outcomes in our analyses were all based on European ancestry populations, restricting the generalizability of our findings to other ethnicities. On the other hand, because the exposure and outcome GWAS(s) (or eQTLs) were conducted using individuals from the same underlying population (all European ancestry), bias from population stratification is reduced and the MR model assumption is better satisfied; for a two-sample MR to be valid, the two samples should preferably be from the same ethnicity. Finally, we have to acknowledge that few of our significant associations passed the stringent Bonferroni corrections.
Our results provide a comprehensive picture for the causal relationship of smoking with ACE2 expression in various tissues and with COVID-19 susceptibility and severity, yet we stress caution for interpretation and extra analyses are needed to replicate these findings.
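
As an illustration of the Bonferroni correction mentioned above, the following minimal Python sketch computes a per-test threshold and checks p-values against it. The significance level of 0.05 and the choice of 44 tests (one per tissue) are assumptions made for illustration only; the exact correction factor is not stated in this section. The thyroid p-value is the one reported in the abstract, and the second p-value is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Return the per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_bonferroni(p_value, alpha, n_tests):
    """Return True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


ALPHA = 0.05    # assumption: conventional family-wise significance level
N_TESTS = 44    # assumption: one test per tissue, for illustration only

threshold = bonferroni_threshold(ALPHA, N_TESTS)

# p-value reported in the abstract for smoking intensity and thyroid ACE2 expression
thyroid_passes = passes_bonferroni(1.8e-8, ALPHA, N_TESTS)

# hypothetical p-value standing in for a nominally significant association
nominal_passes = passes_bonferroni(0.03, ALPHA, N_TESTS)

print(f"threshold={threshold:.2e}, thyroid={thyroid_passes}, nominal={nominal_passes}")
```

Under these assumptions, an association with p<0.05 can be nominally significant yet still fail the corrected threshold, which is the distinction drawn in the Results between 'significant' and 'nominal significance'.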

In conclusion, genetically instrumented smoking phenotypes, reflected by both smoking initiation and smoking intensity, are significantly associated with higher expression of ACE2 in multiple tissues/organs, subsequently increasing the susceptibility to and severity of COVID-19. Our results have important clinical implications, suggesting that smokers might be more vulnerable to SARS-CoV-2 infection or severe disease. At the population level, smoking cessation is also an important actionable prevention strategy against COVID-19. Further studies are needed to confirm or refute our findings.

Appendix 1

Appendix 1—figure 1
Leave-one-out analysis of causal association of smoking initiation with angiotensin-converting enzyme 2 (ACE2) expression.

Each boxplot represents the central tendency of effect sizes (beta coefficients) of smoking initiation on tissue-specific ACE2 expression, based on a leave-one-out analysis in which we excluded one single nucleotide polymorphism (SNP) at a time and performed inverse-variance weighted (IVW) analysis using the remaining SNPs.

Appendix 1—figure 2
Leave-one-out analysis of causal association of cigarettes per day with angiotensin-converting enzyme 2 (ACE2) expression.

Each boxplot represents the central tendency of effect sizes (beta coefficients) of cigarettes per day on tissue-specific ACE2 expression, based on a leave-one-out analysis in which we excluded one single nucleotide polymorphism (SNP) at a time and performed inverse-variance weighted (IVW) analysis using the remaining SNPs.

Appendix 1—figure 3
Leave-one-out analysis of causal association of drinks per week with angiotensin-converting enzyme 2 (ACE2) expression.

Each boxplot represents the central tendency of effect sizes (beta coefficients) of drinks per week on tissue-specific ACE2 expression, based on a leave-one-out analysis in which we excluded one single nucleotide polymorphism (SNP) at a time and performed inverse-variance weighted (IVW) analysis using the remaining SNPs.

Appendix 1—figure 4
Leave-one-out analysis of causal association of smoking and alcohol consumption with coronavirus disease 2019 (COVID-19) related adverse outcomes.

(A) Leave-one-out analysis of causal association of smoking initiation with COVID-19 related adverse outcomes. (B) Leave-one-out analysis of causal association of cigarettes per day with COVID-19 related adverse outcomes. (C) Leave-one-out analysis of causal association of drinks per week with COVID-19 related adverse outcomes. Each boxplot represents the central tendency of effect sizes (beta coefficients) of the exposure (smoking or alcohol consumption) on COVID-19 related adverse outcomes, based on a leave-one-out analysis in which we excluded one single nucleotide polymorphism (SNP) at a time and performed inverse-variance weighted (IVW) analysis using the remaining SNPs. 'COVID-19' indicates susceptibility to COVID-19. 'Severe COVID-19' indicates very severe respiratory confirmed COVID-19.
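
The leave-one-out procedure behind the Appendix 1 figures can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes a fixed-effect IVW estimator built from per-SNP Wald ratios, and the function names and the toy summary statistics are hypothetical. In practice, such analyses are usually run with dedicated MR software.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    Each SNP contributes a Wald ratio (beta_outcome / beta_exposure),
    weighted by beta_exposure**2 / se_outcome**2.
    Assumes every beta_exposure value is non-zero.
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return beta, se


def leave_one_out_ivw(beta_exposure, beta_outcome, se_outcome):
    """Recompute the IVW estimate once per SNP, each time omitting that SNP."""
    n_snps = len(beta_exposure)
    results = []
    for omitted in range(n_snps):
        keep = [j for j in range(n_snps) if j != omitted]
        results.append(
            ivw_estimate(
                [beta_exposure[j] for j in keep],
                [beta_outcome[j] for j in keep],
                [se_outcome[j] for j in keep],
            )
        )
    return results


# Hypothetical toy data: four SNPs whose Wald ratios all equal 2
bx = [0.10, 0.20, 0.15, 0.05]   # SNP effects on the exposure
by = [0.20, 0.40, 0.30, 0.10]   # SNP effects on the outcome
se = [0.05, 0.04, 0.06, 0.05]   # standard errors of the outcome effects

loo = leave_one_out_ivw(bx, by, se)
for omitted, (beta, beta_se) in enumerate(loo):
    print(f"omit SNP {omitted}: beta={beta:.3f}, se={beta_se:.3f}")
```

If the estimates stay close to one another when each SNP is dropped in turn, no single variant is driving the overall result, which is what the boxplots summarize.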

Data availability

All data generated or analysed during this study are publicly available. Data from the GSCAN consortium, the GTEx project, and the COVID-19 Host Genetics Initiative can be accessed at https://genome.psych.umn.edu/index.php/GSCAN, https://www.gtexportal.org, and https://www.covid19hg.org, respectively. Furthermore, the data and main annotated programming code have been uploaded to GitHub and made publicly available at https://github.com/hye-hz/MR_Smoke_COVID19.git (copy archived at https://archive.softwareheritage.org/swh:1:rev:1a2038517d8f2c7c772e69c9c5abab7713add9bb).

References

    Liu M, Jiang Y, Wedow R, Li Y, Brazel DM, Chen F, Datta G, Davila-Velderrain J, McGuire D, Tian C, Zhan X, Choquet H, Docherty AR, Faul JD, Foerster JR, Fritsche LG, Gabrielsen ME, Gordon SD, Haessler J, Hottenga JJ, Huang H, Jang SK, Jansen PR, Ling Y, Mägi R, Matoba N, McMahon G, Mulas A, Orrù V, Palviainen T, Pandit A, Reginsson GW, Skogholt AH, Smith JA, Taylor AE, Turman C, Willemsen G, Young H, Young KA, Zajac GJM, Zhao W, Zhou W, Bjornsdottir G, Boardman JD, Boehnke M, Boomsma DI, Chen C, Cucca F, Davies GE, Eaton CB, Ehringer MA, Esko T, Fiorillo E, Gillespie NA, Gudbjartsson DF, Haller T, Harris KM, Heath AC, Hewitt JK, Hickie IB, Hokanson JE, Hopfer CJ, Hunter DJ, Iacono WG, Johnson EO, Kamatani Y, Kardia SLR, Keller MC, Kellis M, Kooperberg C, Kraft P, Krauter KS, Laakso M, Lind PA, Loukola A, Lutz SM, Madden PAF, Martin NG, McGue M, McQueen MB, Medland SE, Metspalu A, Mohlke KL, Nielsen JB, Okada Y, Peters U, Polderman TJC, Posthuma D, Reiner AP, Rice JP, Rimm E, Rose RJ, Runarsdottir V, Stallings MC, Stančáková A, Stefansson H, Thai KK, Tindle HA, Tyrfingsson T, Wall TL, Weir DR, Weisner C, Whitfield JB, Winsvold BS, Yin J, Zuccolo L, Bierut LJ, Hveem K, Lee JJ, Munafò MR, Saccone NL, Willer CJ, Cornelis MC, David SP, Hinds DA, Jorgenson E, Kaprio J, Stitzel JA, Stefansson K, Thorgeirsson TE, Abecasis G, Liu DJ, Vrieze S, 23andMe Research Team, HUNT All-In Psychiatry
    (2019) Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use
    Nature Genetics 51:237–244.
    https://doi.org/10.1038/s41588-018-0307-5

Article and author information

Author details

  1. Hui Liu

    Biomedical Research Center, Zhejiang Provincial Key Laboratory of Laparoscopic Technology, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, China
    Contribution
    Data curation, Writing - original draft, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-5531-3640
  2. Junyi Xin

    Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, China
    Contribution
    Data curation, Formal analysis, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-6677-3936
  3. Sheng Cai

    Institute of Drug Metabolism and Pharmaceutical Analysis, Zhejiang Province Key Laboratory of Anti-Cancer Drug Research, Zhejiang University, Hangzhou, China
    Contribution
    Data curation, Formal analysis, Writing - review and editing
    Competing interests
    No competing interests declared
  4. Xia Jiang

    Department of Clinical Neuroscience, Center for Molecular Medicine, Karolinska Institute, Stockholm, Sweden
    Contribution
    Conceptualization, Supervision, Writing - original draft, Writing - review and editing
    For correspondence
    xia.jiang@ki.se
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-5878-8986

Funding

Swedish Research Council (VR-2018-02247)

  • Xia Jiang

Swedish Research Council for Health, Working Life and Welfare (FORTE-2020-00884)

  • Xia Jiang

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We would like to thank the GSCAN consortium, the GTEx Program, and the COVID-19 Host Genetics Initiative for the release of their data.

Copyright

© 2021, Liu et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Hui Liu
  2. Junyi Xin
  3. Sheng Cai
  4. Xia Jiang
(2021)
Mendelian randomization analysis provides causality of smoking on the expression of ACE2, a putative SARS-CoV-2 receptor
eLife 10:e64188.
https://doi.org/10.7554/eLife.64188
