
Mendelian randomization analysis provides causality of smoking on the expression of ACE2, a putative SARS-CoV-2 receptor

  1. Hui Liu
  2. Junyi Xin
  3. Sheng Cai
  4. Xia Jiang (corresponding author)
  1. Biomedical Research Center, Zhejiang Provincial Key Laboratory of Laparoscopic Technology, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, China
  2. Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, China
  3. Institute of Drug Metabolism and Pharmaceutical Analysis, Zhejiang Province Key Laboratory of Anti-Cancer Drug Research, Zhejiang University, China
  4. Department of Clinical Neuroscience, Center for Molecular Medicine, Karolinska Institute, Sweden
Research Article
Cite this article as: eLife 2021;10:e64188 doi: 10.7554/eLife.64188

Abstract

Background:

To understand a causal role of modifiable lifestyle factors in angiotensin-converting enzyme 2 (ACE2) expression (a putative severe acute respiratory syndrome coronavirus 2 [SARS-CoV-2] receptor) across 44 human tissues/organs, and in coronavirus disease 2019 (COVID-19) susceptibility and severity, we conducted a phenome-wide two-sample Mendelian randomization (MR) study.

Methods:

More than 500 genetic variants were used as instrumental variables to predict smoking and alcohol consumption. The inverse-variance weighted approach was adopted as the primary method to estimate causal associations, while MR-Egger regression, weighted median, and MR pleiotropy residual sum and outlier (MR-PRESSO) were performed to identify potential horizontal pleiotropy.

Results:

We found that genetically predicted smoking intensity significantly increased ACE2 expression in thyroid (β=1.468, p=1.8×10−4), and increased ACE2 expression in adipose, brain, colon, and liver with nominal significance. Additionally, genetically predicted smoking initiation significantly increased the risk of COVID-19 onset (odds ratio=1.15, p=8.7×10−5). No statistically significant result was observed for alcohol consumption.

Conclusions:

Our work demonstrates an important role of smoking, measured by both status and intensity, in the susceptibility to COVID-19.

Funding:

XJ is supported by research grants from the Swedish Research Council (VR-2018–02247) and Swedish Research Council for Health, Working Life and Welfare (FORTE-2020–00884).

Introduction

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has led to a worldwide pandemic of coronavirus disease 2019 (COVID-19) (Coronaviridae Study Group of the International Committee on Taxonomy of Viruses, 2020; World Health Organization, 2020). The expression level of angiotensin-converting enzyme 2 (ACE2), a host receptor of SARS-CoV-2, has been found to influence both the risk and severity of infection (Hoffmann et al., 2020; Wrapp et al., 2020; Zhou et al., 2020a; Li et al., 2003; Li et al., 2005). Moreover, a growing body of evidence from epidemiological investigations has demonstrated a substantial disparity in the susceptibility to infection (Guan et al., 2020; Hu et al., 2020; Mehra et al., 2020; Patanavanich and Glantz, 2020). For example, a multi-center study involving 8910 COVID-19 cases from 169 hospitals in Europe and North America identified an increased risk of in-hospital death among current smokers (odds ratio [OR]=1.79; 95% CI: 1.29–2.47) compared with former smokers or non-smokers (Mehra et al., 2020).

Consistent with findings from large-scale population-based observational studies, a laboratory-based study involving 131 RNA-sequenced human lung cancer tissues (54 samples from individuals of European ancestry and 77 samples from individuals of Asian ancestry) found that smokers expressed a significantly higher level of ACE2 than non-smokers in both populations, potentially heightening susceptibility to SARS-CoV-2 infection (Cai, 2020). Furthermore, after incorporating two additional DNA microarray datasets of lung cancer, the significant smoking-ACE2 association observed in a total of 224 samples did not change after adjusting for age, sex, race, and platform. Nevertheless, these samples were derived from lung cancer patients, restricting the generalizability of the findings to normal lung tissues and to the general population. In a related work, Rao et al. conducted a phenome-wide Mendelian randomization (MR) study examining an extensive number of diseases, traits, and blood proteins and identified several ‘exposures’, including diabetes, breast cancer, lung cancer, inflammatory bowel disease, and smoking, that increase ACE2 expression in normal lung tissue (Rao et al., 2020). This analysis, despite its large number of exposures (N=3948), has several limitations. First, disease statuses such as diabetes and cancer are difficult to modify; at the population level, it is more important to discover and intervene on modifiable risk factors such as smoking and alcohol consumption. Second, regarding smoking, only three single nucleotide polymorphisms (SNPs) were used as instrumental variables (IVs) by Rao et al., which explained negligible phenotypic variation and did not accurately predict smoking status. By contrast, the hitherto largest genome-wide association study (GWAS) of tobacco use was conducted in a total of 1.2 million individuals and discovered over 400 genetic variants associated with smoking initiation and intensity (Liu et al., 2019). Last but not least, Rao et al. focused on lung tissue instead of systematically examining all human tissues. Although the lungs are the organ most relevant and vulnerable to a respiratory disease such as COVID-19, recent studies have identified the involvement of other human tissues (e.g. gastrointestinal tract) in SARS-CoV-2 infection (Zou et al., 2020).

Motivated by these findings, we aim to explore whether genetic predisposition to common modifiable human behaviours, including smoking and alcohol consumption, could lead to increased ACE2 expression, which subsequently leads to increased susceptibility to and severity of COVID-19. Here, we conduct a phenome-wide MR analysis by incorporating ACE2 expression from a broad spectrum of tissues/organs available in the GTEx database. As one of the hitherto largest databases with concomitant information on DNA genotype and RNA expression, GTEx collects a large variety of tissues (N=44) from a healthy population (deceased donors) (Gamazon et al., 2018). In addition, data on COVID-19 susceptibility and severity were obtained from the COVID-19 Host Genetics Initiative, a global project that aims to understand the role of the host genome in COVID-19 outcomes (COVID-19 Host Genetics Initiative, 2020). Hundreds of genetic variants identified by a large-scale GWAS of tobacco use and alcohol consumption were used as IVs; incorporating additional loci greatly enhances the strength of the genetic instruments as well as both the accuracy and precision of MR estimates (Liu et al., 2019).

Materials and methods

Data on IV-exposure


IV-exposure associations were extracted from the hitherto largest GWAS conducted by the GSCAN consortium (GWAS and Sequencing Consortium of Alcohol and Nicotine use) for tobacco use and alcohol consumption, totalling 1.2 million individuals of European ancestry (Liu et al., 2019). This GWAS first meta-analysed summary-level data from each participating cohort and identified independent SNPs passing genome-wide significance (p<5×10−8) based on linkage disequilibrium. Additional independent and genome-wide significant SNPs were then selected using a conditional analysis within each significant locus, defined as a 1 Mb region surrounding the sentinel variant (the variant in the locus with the lowest p-value). We used all conditionally independent biallelic SNPs as IVs.

In our analysis, we included two smoking phenotypes and one drinking phenotype: smoking initiation as reflected by never vs. ever smoking (IV=378, N=1,232,091), smoking intensity as reflected by cigarettes per day (IV=55, N=337,334), and common (as opposed to excessive or harmful) alcohol drinking behaviour defined as drinks per week (IV=99, N=941,280). The proportion of phenotypic variance explained by the IVs was 2.3% for smoking initiation, 1.1% for cigarettes per day, and 0.2% for drinks per week. Detailed information regarding the IVs for each exposure is shown in Supplementary file 1a-1c.

A strong instrumental variable is a basic requirement for a valid MR result. The strength of the IVs was verified by calculating the F-statistic using the formula $F = \frac{R^2 (n - 1 - k)}{(1 - R^2)\,k}$, where $R^2$ is the proportion of variance explained by the IVs, $k$ is the number of IVs, and $n$ is the sample size (Pierce et al., 2011). The F-statistics for smoking initiation, smoking intensity (cigarettes per day), and alcohol consumption (drinks per week) were 77.2, 67.4, and 17.8, respectively, indicating strong IVs (F-statistic > 10) for each of our exposures of interest.
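
As a rough illustration of the instrument-strength check, the Python sketch below evaluates F = R²(n − 1 − k) / ((1 − R²)k) for one exposure. The function name `f_statistic` is hypothetical, and the inputs are the rounded figures quoted in this section (R² = 2.3%, n = 1,232,091, k = 378), so the result should only approximate the reported value of 77.2.

```python
def f_statistic(r2: float, n: int, k: int) -> float:
    """Instrument-strength F-statistic: R^2 (n - 1 - k) / ((1 - R^2) k).

    r2 : proportion of exposure variance explained by the instruments
    n  : sample size of the exposure GWAS
    k  : number of instruments
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    if k <= 0 or n <= k + 1:
        raise ValueError("need k > 0 and n > k + 1")
    return r2 * (n - 1 - k) / ((1.0 - r2) * k)


if __name__ == "__main__":
    # Rounded inputs for smoking initiation, as quoted in the text (assumption:
    # the published F-statistic was computed from unrounded R^2).
    f = f_statistic(r2=0.023, n=1_232_091, k=378)
    print(f"F-statistic for smoking initiation: {f:.1f}")
    print("strong instrument" if f > 10 else "weak instrument")
```

Because R² enters the formula almost linearly when it is small, rounding R² to one decimal place can shift F by a few units, which is one plausible reason the sketch does not reproduce the reported values exactly.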

Data on IV-outcome


Associations of genetic variants with ACE2 expression were extracted from the GTEx database release version 8 available at the GTEx Portal (http://www.gtexportal.org). It is one of the largest databases with concomitant information on genotype and expression data for a large variety of non-diseased tissues collected from >1000 human donors. Out of the total 54 tissues/organs, we focused on ACE2 expression in 44 tissues/organs with a decent sample size involving at least 100 individuals to ensure statistical power (Supplementary file 1d). Specifically, the associations of genotype with ACE2 expression in adipose tissue, artery, brain, colon, oesophagus, heart, liver, lung, minor salivary gland, muscle, nerve, ovary, pancreas, pituitary, prostate, skin, small intestine, stomach, testis, thyroid, uterus, and vagina were included.

Unlike most existing MR studies that consider disease status as outcomes, this MR study treats gene expression levels as outcomes. GTEx identifies expression quantitative trait loci (eQTL) by associating genetic variations called from GWAS with gene expression levels obtained from RNA-sequencing. Expression values for each gene were inverse quantile normalized to a standard normal distribution across samples as previously described (Gamazon et al., 2018). Please note, here we studied ACE2 RNA expression instead of its actual protein expression. Since ACE2 RNA and protein quantities have been found to be well correlated in tissues such as lung and kidney (Wang et al., 2020), ACE2 RNA expression acts as a good proxy for ACE2 protein expression.

In addition to ACE2 expression, associations of genetic variants with COVID-19 susceptibility and severity were extracted from the COVID-19 Host Genetics Initiative (https://www.covid19hg.org). It is a global genetics collaboration aiming to explore the genetic determinants of COVID-19 susceptibility and severity. We used the summary statistics from the data freeze 5 of GWAS meta-analysis, which was released publicly on January 18, 2021. Three COVID-19 outcomes, including susceptibility to COVID-19, hospitalized COVID-19, and very severe respiratory confirmed COVID-19, were used in our analysis.

Briefly, COVID-19 susceptibility cases were defined as individuals with laboratory-confirmed SARS-CoV-2 infection (via nucleic acid amplification test or serological test), clinician diagnosis, health record evidence by ICD coding, or self-report (Ncase=38,984 vs. Ncontrol=1,644,784). Hospitalized COVID-19 cases were defined as individuals hospitalized due to COVID-19 related symptoms with laboratory-confirmed SARS-CoV-2 infection (Ncase=9986 vs. Ncontrol=1,877,672). Very severe respiratory confirmed COVID-19 cases were defined as hospitalized individuals with laboratory-confirmed SARS-CoV-2 infection who needed respiratory support beyond simple oxygen supplementation or who died due to COVID-19 (Ncase=5101 vs. Ncontrol=1,383,241).

Statistical analysis


MR uses SNPs as proxies for exposure(s), assuming that SNPs are randomly allotted at conception, mirroring a randomized procedure, and that SNPs always precede disease onset, which eliminates reverse causality. Three essential model assumptions need to be fulfilled to guarantee valid IVs (Zheng et al., 2017): IVs are associated with the exposure (relevance assumption); there is no association between IVs and any confounders of the exposure-outcome relationship (independence assumption); and IVs are associated with the outcome only through the studied exposure (exclusion restriction assumption). If all three model assumptions are satisfied, a causal relationship can be inferred from the observed IV-exposure and IV-outcome associations.

We conducted a two-sample MR, where IV-exposure and IV-outcome associations were estimated in two non-overlapping samples. The inverse-variance weighted (IVW) approach was applied as the primary method to estimate the causal link between exposures (smoking and alcohol consumption) and outcomes (ACE2 expression and COVID-19 related adverse outcomes) (Burgess et al., 2015). For each IV, the causal estimate is calculated as the ratio of the IV-outcome association to the IV-exposure association; these ratio estimates are then combined across IVs, weighted by the reciprocal of an approximate expression for their asymptotic variance. To evaluate potential heterogeneity among the causal effects of different genetic variants, Cochran’s Q test was performed, and p<0.05 was taken to indicate heterogeneity (Greco M et al., 2015).
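
The IVW calculation described above can be sketched in a few lines of Python. This is a minimal fixed-effect version under the common first-order approximation in which the variance of each ratio estimate is se_Y² / β_X²; the function name `ivw_estimate` and the toy inputs are hypothetical, and the sketch is not the authors' actual pipeline.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect inverse-variance weighted (IVW) MR estimate.

    beta_x : per-SNP associations with the exposure
    beta_y : per-SNP associations with the outcome
    se_y   : standard errors of the SNP-outcome associations

    Returns (estimate, standard_error, cochran_q).
    """
    # Per-SNP ratio (Wald) estimates: outcome effect / exposure effect.
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    # First-order weights: inverse variance of each ratio estimate.
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    standard_error = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations of ratios from the pooled
    # estimate; compared against chi-square with (number of SNPs - 1) df.
    cochran_q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    return estimate, standard_error, cochran_q


if __name__ == "__main__":
    # Toy data (not from the study): every SNP implies the same ratio of 0.5.
    est, se, q = ivw_estimate(
        beta_x=[0.1, 0.2, 0.4],
        beta_y=[0.05, 0.10, 0.20],
        se_y=[0.01, 0.01, 0.01],
    )
    print(est, se, q)
```

When all ratio estimates agree, as in the toy data, Cochran's Q is zero; large Q relative to its degrees of freedom signals heterogeneity among instruments.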

One major concern of MR is horizontal pleiotropy, meaning that genetic variants influence the outcome through pathways other than the exposure. A series of sensitivity analyses were conducted to detect and correct for such a scenario. First, MR-Egger regression was adopted to examine the presence of potential pleiotropy, as well as to complement results from the main analysis (IVW) (Bowden et al., 2015). When the instrument strength independent of direct effect assumption holds, an MR-Egger intercept that differs from zero indicates the presence of horizontal pleiotropy. In addition to the IVW approach and the MR-Egger regression, the weighted median method and the MR pleiotropy residual sum and outlier (MR-PRESSO) test were also used to detect potential horizontal pleiotropy. The weighted median estimate is calculated as the median of the weighted empirical distribution function of the ratio IV estimates obtained from each genetic variant individually (Bowden et al., 2016). An MR-PRESSO global test p-value below 1.0×10−6 indicates the presence of horizontal pleiotropy. MR-PRESSO not only identifies horizontal pleiotropy by comparing the observed residual sum of squares to the expected residual sum of squares, but also corrects for horizontal pleiotropy by removing outliers (Verbanck et al., 2018). Second, we excluded palindromic SNPs with ambiguous strand identification and performed the IVW method on the remaining SNPs. Subsequently, we removed SNPs associated with potential confounding traits as recorded in the GWAS Catalog (essentially any trait other than our exposure of interest). Moreover, leave-one-out analysis was performed, where we excluded one SNP at a time and conducted IVW on the remaining SNPs to identify the potential influence of outlying variants on the overall estimates. Results were considered significant only if they passed statistical significance (p<0.05) in the IVW approach or the MR-Egger regression and remained directionally consistent in the weighted median and MR-PRESSO methods across both primary and sensitivity analyses. We additionally set a corrected p threshold by dividing 0.05 by the number of outcomes, 0.05/(44+3)≈1.0×10−3 (44 tissues/organs and 3 COVID-19 outcomes).
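
To make the MR-Egger step concrete, the sketch below fits a weighted linear regression of SNP-outcome effects on SNP-exposure effects with a free intercept, after orienting every SNP so that its exposure effect is positive. The function name `mr_egger` and the toy data are hypothetical; standard errors and p-values are omitted for brevity, so this is an illustration of the point estimates only, not a replacement for an established MR package.

```python
def mr_egger(beta_x, beta_y, se_y):
    """MR-Egger point estimates via weighted least squares.

    Returns (intercept, slope). The slope is the causal estimate; an
    intercept away from zero suggests directional horizontal pleiotropy.
    """
    if len(beta_x) < 3:
        raise ValueError("MR-Egger needs at least three instruments")

    # Orient each SNP so the exposure association is non-negative,
    # flipping the outcome association with it.
    xs, ys = [], []
    for bx, by in zip(beta_x, beta_y):
        sign = -1.0 if bx < 0 else 1.0
        xs.append(sign * bx)
        ys.append(sign * by)

    # Weights are the inverse variances of the SNP-outcome associations.
    w = [1.0 / s ** 2 for s in se_y]
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, xs)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, ys)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, xs))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


if __name__ == "__main__":
    # Toy data (not from the study): outcome effect = 0.02 + 0.5 * exposure
    # effect, i.e. a causal slope of 0.5 plus a pleiotropic offset of 0.02.
    intercept, slope = mr_egger(
        beta_x=[0.1, -0.2, 0.3, 0.4],
        beta_y=[0.07, -0.12, 0.17, 0.22],
        se_y=[0.01, 0.01, 0.01, 0.01],
    )
    print(intercept, slope)
```

Forcing the intercept to zero in the same regression recovers the IVW estimate, which is why a clearly non-zero intercept is read as evidence that the IVW result may be biased by pleiotropy.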

Finally, the power of the current study was estimated according to a method suggested by Brion et al., 2013. To guarantee statistical power, we only included tissues/organs with at least 100 samples in the GTEx database. Under the current sample size, given that 1–2% of the phenotypic variance of smoking and alcohol consumption was explained by the IVs, our study had sufficient power (>80%) to detect a causal effect of 0.74–2.66 in ACE2 expression, and to detect an OR ranging from 1.11 to 1.39 for COVID-19 related outcomes (Supplementary file 1e).

Results

We extracted a total of 532 independent SNPs that achieved genome-wide significance for smoking initiation (N=378), cigarettes per day (N=55), and drinks per week (N=99) from the GSCAN consortium and used them as IVs (Supplementary file 1a-c). We were able to map 80–95% of those IVs to the GTEx database and to the COVID-19 Host Genetics Initiative database, a virtually complete coverage. The flowchart of IV selection is shown in Figure 1.

Figure 1. Flowchart of the selection of instrumental variables.

We found that genetically instrumented smoking initiation was associated with significantly increased ACE2 expression in brain putamen basal ganglia (β=1.117, p=0.006), in brain hypothalamus (β=0.848, p=0.022), and in subcutaneous adipose tissue (β=0.285, p=0.016) using the IVW approach (Table 1). Results remained directionally consistent in the MR-Egger regression, although with increased statistical uncertainty (β=1.667, p=0.334 in brain putamen basal ganglia; β=0.963, p=0.545 in brain hypothalamus; β=0.663, p=0.205 in subcutaneous adipose tissue). Additionally, increased ACE2 expression in two colon tissues was observed only through the MR-Egger regression (transverse colon [β=1.129, p=0.017] and sigmoid colon [β=1.925, p=0.042]), and the direction of estimates remained consistent in the IVW approach. For all these associations, we did not find apparent heterogeneity as indicated by Cochran’s Q statistics (all p>0.05) or horizontal pleiotropy as indicated by the MR-Egger intercept (all p>0.05) and MR-PRESSO global test (all p>1.0×10−6), except that horizontal pleiotropy was observed in transverse colon by the MR-Egger intercept (p=0.04). Although significant associations of smoking initiation with ACE2 expression in brain caudate basal ganglia and in cerebellar hemisphere were found using the IVW approach, the direction of estimates was opposite in the MR-Egger regression (Supplementary file 1f). These associations were therefore not considered informative.

Table 1
Causal association of smoking initiation and angiotensin-converting enzyme 2 (ACE2) expression.

| Organ/tissue | Method | N of IVs | Beta | SE | p | p* |
|---|---|---|---|---|---|---|
| Adipose – subcutaneous | Inverse-variance weighted | 358 | 0.285 | 0.118 | 0.016 | 0.834 |
| | MR-Egger | 358 | 0.663 | 0.522 | 0.205 | 0.456 |
| | Weighted median | 358 | 0.142 | 0.186 | 0.444 | |
| | MR-PRESSO | 358 | 0.285 | 0.118 | 0.017 | 0.846 |
| Brain – hypothalamus | Inverse-variance weighted | 356 | 0.848 | 0.369 | 0.022 | 0.614 |
| | MR-Egger | 356 | 0.963 | 1.588 | 0.545 | 0.941 |
| | Weighted median | 356 | 0.549 | 0.566 | 0.332 | |
| | MR-PRESSO | 356 | 0.848 | 0.369 | 0.022 | 0.655 |
| Brain – putamen (basal ganglia) | Inverse-variance weighted | 357 | 1.117 | 0.406 | 0.006 | 0.334 |
| | MR-Egger | 357 | 1.667 | 1.724 | 0.334 | 0.743 |
| | Weighted median | 357 | 1.256 | 0.607 | 0.039 | |
| | MR-PRESSO | 357 | 1.117 | 0.406 | 0.006 | 0.321 |
| Colon – sigmoid | Inverse-variance weighted | 359 | 0.314 | 0.214 | 0.143 | 0.887 |
| | MR-Egger | 359 | 1.925 | 0.945 | 0.042 | 0.080 |
| | Weighted median | 359 | 0.473 | 0.334 | 0.156 | |
| | MR-PRESSO | 359 | 0.314 | 0.214 | 0.144 | 0.904 |
| Colon – transverse | Inverse-variance weighted | 359 | 0.193 | 0.113 | 0.088 | 0.348 |
| | MR-Egger | 359 | 1.129 | 0.471 | 0.017 | 0.041 |
| | Weighted median | 359 | 0.262 | 0.165 | 0.114 | |
| | MR-PRESSO | 359 | 0.193 | 0.113 | 0.089 | 0.362 |

*p indicates the p-value for heterogeneity from the inverse-variance weighted (IVW) approach, the p-value of the intercept from MR-Egger regression, or the p-value from the Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) global test.

We further identified that genetically predicted smoking intensity, as reflected by cigarettes per day, was associated with significantly elevated ACE2 expression in thyroid (β=1.468, p=1.8×10−4), in liver (β=1.216, p=0.009), in brain hypothalamus (β=1.789, p=0.014), and in ovary (β=1.545, p=0.026) using the IVW method (Table 2). Results remained directionally consistent in the MR-Egger regression (β=1.739, p=0.062 in thyroid; β=1.132, p=0.226 in liver; β=0.041, p=0.983 in brain hypothalamus; β=1.658, p=0.347 in ovary). In contrast, levels of ACE2 expression decreased with genetically instrumented smoking intensity in sigmoid colon tissue (β=−1.971, p=0.019) and in vagina tissue (β=−3.271, p=0.043) using the MR-Egger regression. For all these associations, no apparent horizontal pleiotropy or heterogeneity was found (Supplementary file 1g).

Table 2
Causal association of cigarettes per day and angiotensin-converting enzyme 2 (ACE2) expression.

| Organ/tissue | Method | N of IVs | Beta | SE | p | p* |
|---|---|---|---|---|---|---|
| Brain – hypothalamus | Inverse-variance weighted | 48 | 1.789 | 0.730 | 0.014 | 0.959 |
| | MR-Egger | 48 | 0.041 | 1.920 | 0.983 | 0.310 |
| | Weighted median | 48 | 2.182 | 1.347 | 0.105 | |
| | MR-PRESSO | 48 | 1.789 | 0.730 | 0.018 | 0.964 |
| Colon – sigmoid | Inverse-variance weighted | 47 | −0.832 | 0.439 | 0.058 | 0.652 |
| | MR-Egger | 47 | −1.971 | 0.807 | 0.019 | 0.092 |
| | Weighted median | 47 | −1.182 | 0.747 | 0.114 | |
| | MR-PRESSO | 47 | −0.832 | 0.439 | 0.064 | 0.701 |
| Liver | Inverse-variance weighted | 47 | 1.216 | 0.468 | 0.009 | 0.807 |
| | MR-Egger | 47 | 1.132 | 0.922 | 0.226 | 0.913 |
| | Weighted median | 47 | 1.058 | 0.850 | 0.213 | |
| | MR-PRESSO | 47 | 1.216 | 0.468 | 0.012 | 0.843 |
| Ovary | Inverse-variance weighted | 48 | 1.545 | 0.693 | 0.026 | 0.837 |
| | MR-Egger | 48 | 1.658 | 1.745 | 0.347 | 0.943 |
| | Weighted median | 48 | 2.545 | 1.217 | 0.037 | |
| | MR-PRESSO | 48 | 1.545 | 0.693 | 0.031 | 0.844 |
| Thyroid | Inverse-variance weighted | 47 | 1.468 | 0.392 | 1.8×10−4 | 0.604 |
| | MR-Egger | 47 | 1.739 | 0.907 | 0.062 | 0.739 |
| | Weighted median | 47 | 1.435 | 0.641 | 0.025 | |
| | MR-PRESSO | 47 | 1.468 | 0.392 | 5.0×10−4 | 0.670 |
| Vagina | Inverse-variance weighted | 48 | −1.150 | 0.688 | 0.094 | 0.055 |
| | MR-Egger | 48 | −3.271 | 1.574 | 0.043 | 0.142 |
| | Weighted median | 48 | −2.644 | 0.916 | 0.004 | |
| | MR-PRESSO | 48 | −1.150 | 0.688 | 0.101 | 0.057 |

*p indicates the p-value for heterogeneity from the inverse-variance weighted (IVW) approach, the p-value of the intercept from MR-Egger regression, or the p-value from the Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) global test.
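
The Beta, SE, and p columns of Table 2 are linked by a simple relationship that can be checked by hand. The sketch below, which assumes a two-sided Wald test under a standard normal approximation (our assumption; the paper does not state the exact test), converts an estimate and its standard error into a p-value. The function name `two_sided_p` is hypothetical.

```python
import math


def two_sided_p(beta: float, se: float) -> float:
    """Two-sided p-value for H0: beta = 0 under a normal approximation."""
    z = beta / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # IVW rows of Table 2: (tissue, beta, se)
    rows = [
        ("Thyroid", 1.468, 0.392),
        ("Liver", 1.216, 0.468),
        ("Brain - hypothalamus", 1.789, 0.730),
        ("Ovary", 1.545, 0.693),
    ]
    for tissue, beta, se in rows:
        print(f"{tissue}: z = {beta / se:.2f}, p = {two_sided_p(beta, se):.2g}")
```

For the thyroid row, β=1.468 with SE=0.392 corresponds to z≈3.7 and a p-value on the order of 10−4, in line with the tabulated 1.8×10−4 and below the corrected threshold of roughly 1.0×10−3.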

For alcohol consumption defined as drinks per week, only one suggestive association with ACE2 expression in tibial nerve was observed using the IVW approach (β=−1.462, p=0.006). However, the direction of effect from the MR-Egger regression was opposite (β=0.336, p=0.721). We therefore considered an overall null association as our main finding with alcohol consumption (Supplementary file 1h).

In the sensitivity analyses where we excluded palindromic or pleiotropic SNPs, results remained largely consistent with our primary findings (full results shown in Supplementary file 1i-1n). We sequentially excluded proxy SNPs to identify random error introduced by imperfect proxies. In the leave-one-out analyses where we iteratively removed one SNP each time and performed the IVW approach using the remaining SNPs, results were again concordant with our primary findings, indicating an absence of outlying SNPs (Appendix 1—figure 1).
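
A leave-one-out analysis of the kind described above can be sketched as follows: the IVW estimate is recomputed once per SNP with that SNP removed, and a large shift flags an influential variant. The helper names `ivw` and `leave_one_out` and the toy data are hypothetical; the weights use the same first-order approximation (β_X² / se_Y²) assumed for a basic fixed-effect IVW estimate.

```python
def ivw(beta_x, beta_y, se_y):
    """Fixed-effect IVW estimate from per-SNP summary statistics."""
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y)]
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


def leave_one_out(beta_x, beta_y, se_y):
    """Return the IVW estimate obtained after dropping each SNP in turn."""
    n = len(beta_x)
    estimates = []
    for dropped in range(n):
        keep = [j for j in range(n) if j != dropped]
        estimates.append(
            ivw(
                [beta_x[j] for j in keep],
                [beta_y[j] for j in keep],
                [se_y[j] for j in keep],
            )
        )
    return estimates


if __name__ == "__main__":
    # Toy data (not from the study): the third SNP implies a ratio of 1.0,
    # whereas the first two imply 0.5, so dropping it changes the estimate.
    bx = [0.1, 0.2, 0.4]
    by = [0.05, 0.10, 0.40]
    se = [0.01, 0.01, 0.01]
    print("full:", ivw(bx, by, se))
    for i, est in enumerate(leave_one_out(bx, by, se)):
        print(f"without SNP {i}: {est:.3f}")
```

In the toy example the estimate falls to 0.5 only when the third SNP is removed, which is the pattern an outlying variant would produce; concordant leave-one-out estimates, as reported here, argue against any single SNP driving the result.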

Finally, complementing the findings on ACE2 expression, we tested for a putative causal link between smoking status, smoking intensity, or alcohol consumption and the risk of COVID-19 related adverse outcomes. We found that smoking initiation significantly increased the risk of COVID-19 onset (IVW: OR=1.15, 95% CI: 1.07–1.23, p=8.7×10−5) even after taking into account multiple comparisons (Table 3). Results remained significant in the weighted median and MR-PRESSO methods but showed larger statistical uncertainty in the MR-Egger regression. In addition, smoking initiation increased the risk of very severe respiratory confirmed COVID-19 and hospitalized COVID-19 using the IVW approach, but the results were not supported by the MR-Egger regression. Smoking intensity (cigarettes per day) increased only the risk of very severe respiratory confirmed COVID-19, as shown in the MR-Egger regression (OR=5.99, 95% CI: 1.57–22.84, p=0.012) (Supplementary file 1o). In contrast, we did not find a causal link between alcohol consumption (drinks per week) and the risk of COVID-19 adverse outcomes (Supplementary file 1p). Findings did not change in the sensitivity analyses (Supplementary file 1q-r and Appendix 1—figure 4).

Table 3
Causal link of smoking initiation with the risk of coronavirus disease 2019 (COVID-19) related adverse outcomes.

| Outcome | Method | N of IVs | OR (95% CI) | p | p* |
|---|---|---|---|---|---|
| COVID-19 susceptibility | Inverse-variance weighted | 352 | 1.15 (1.07–1.23) | 8.7×10−5 | 6.7×10−5 |
| | MR-Egger | 352 | 1.11 (0.83–1.49) | 0.489 | 0.821 |
| | Weighted median | 352 | 1.18 (1.08–1.29) | 2.9×10−4 | |
| | MR-PRESSO | 352 | 1.15 (1.07–1.23) | 1.0×10−4 | 5.0×10−5 |
| Hospitalized COVID-19 | Inverse-variance weighted | 351 | 1.32 (1.16–1.50) | 3.7×10−5 | 0.009 |
| | MR-Egger | 351 | 0.78 (0.44–1.36) | 0.383 | 0.059 |
| | Weighted median | 351 | 1.37 (1.14–1.66) | 0.001 | |
| | MR-PRESSO | 351 | 1.32 (1.16–1.50) | 4.6×10−5 | 0.011 |
| Very severe respiratory confirmed COVID-19 | Inverse-variance weighted | 352 | 1.25 (1.03–1.51) | 0.025 | 0.114 |
| | MR-Egger | 352 | 0.83 (0.37–1.89) | 0.658 | 0.318 |
| | Weighted median | 352 | 1.17 (0.89–1.55) | 0.267 | |
| | MR-PRESSO | 352 | 1.25 (1.03–1.51) | 0.025 | 0.119 |

*p indicates the p-value for heterogeneity from the inverse-variance weighted (IVW) approach, the p-value of the intercept from MR-Egger regression, or the p-value from the Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) global test.
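
Odds ratios and their confidence intervals in Table 3 can be related back to a log-odds estimate and standard error. The sketch below assumes a symmetric 95% interval on the log-odds scale with a normal critical value of about 1.96 (our assumption about how the intervals were formed); the function name `p_from_or_ci` is hypothetical, and the rounded table entries limit the precision of the result.

```python
import math


def p_from_or_ci(or_point: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.959964):
    """Recover (se, z, p) on the log-odds scale from an OR and its 95% CI."""
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(or_point) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return se, z, p


if __name__ == "__main__":
    # IVW row for COVID-19 susceptibility: OR = 1.15 (1.07-1.23).
    se, z, p = p_from_or_ci(1.15, 1.07, 1.23)
    print(f"se(log OR) = {se:.4f}, z = {z:.2f}, p = {p:.2g}")
```

Applied to the IVW susceptibility row, this gives a p-value just under 10−4, close to the tabulated 8.7×10−5; small differences are expected because the OR and interval bounds are rounded to two decimals.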

Discussion

We conducted a large-scale genetic analysis to understand the role of cigarette smoking and alcohol consumption in ACE2 expression across multiple tissues/organs and, in turn, in the prevention of COVID-19. Strong IVs were constructed using hundreds of SNPs associated with smoking and alcohol consumption. We capitalized on summary statistics from the largest tissue-specific eQTL study of ACE2 expression levels and the most up-to-date GWAS data on COVID-19 related adverse outcomes. We found a putative causal relationship between smoking-related phenotypes and increased ACE2 expression in multiple tissues, as well as increased susceptibility to and severity of COVID-19.

Our findings are supported by previous epidemiological studies that have demonstrated a significant association between smoking and COVID-19 disease progression or death (Guan et al., 2020; Hu et al., 2020; Mehra et al., 2020). For example, a study involving 214 patients with laboratory-confirmed COVID-19 from Wenzhou, China, found that, compared to non-severe cases, patients with severe disease were more likely to be smokers (26.3% vs. 6.4%, p = 0.038) (Zheng et al., 2020). Another study recruiting 78 patients with COVID-19 in Wuhan also found a higher proportion of ever-smokers in the COVID-19 progression group than in the improvement/stabilization group (OR=14.28, 95% CI: 1.58–25.00, p=0.018) (Liu et al., 2020). Consistent with these findings, a meta-analysis of a total of 11,590 COVID-19 cases demonstrated a higher proportion of smokers among the 2133 patients who experienced disease progression, suggesting that smoking aggravated COVID-19 progression (OR=1.91, 95% CI: 1.42–2.59) (Patanavanich and Glantz, 2020). In contrast, a few small-scale studies with sample sizes ranging from 44 to 191 conducted in Wuhan, China, did not report a notable association of smoking with COVID-19 severity or progression (Huang et al., 2020; Yang et al., 2020; Zhang et al., 2020; Zhou et al., 2020b). For instance, a retrospective study including 191 patients, of whom 137 survived and 54 did not, found a comparable proportion of smokers in survivors and non-survivors (9% vs. 4%, p=0.21) (Zhou et al., 2020b). A meta-analysis including five studies (four in Wuhan and one across 30 provinces in Mainland China) (Guan et al., 2020; Liu et al., 2020; Huang et al., 2020; Yang et al., 2020; Zhang et al., 2020) involving 1399 individuals with COVID-19 revealed no significant association between smoking and disease severity (OR=1.69, 95% CI: 0.41–6.92) (Lippi and Henry, 2020). In contrast to the findings from Asian populations, evidence from the Veterans Affairs Birth Cohort in a US population found that current smoking was associated with a lower risk of COVID-19 susceptibility (OR=0.45, 95% CI: 0.35–0.57) (Rentsch et al., 2020). The contradictory results in those small-scale studies might be due to insufficient power, a low proportion of smokers, and limited representativeness of the study population. For example, compared with the high proportion of smokers in China (an average 26.6% prevalence in the general population), only 1.4% of patients were current smokers in the study conducted by Zhang et al., 2020 and 12.6% in the study conducted by Guan et al., 2020. Given these discrepancies, additional studies are warranted to confirm the role of smoking in both the onset and progression of COVID-19.

In addition to clinical observational studies, laboratory examination has demonstrated the importance of tissue specificity. For example, Cai et al. identified that smoking was associated with elevated expression of ACE2 in the lung, providing biological evidence linking smoking to increased susceptibility to, or severity of, SARS-CoV-2 infection (Cai, 2020). Moreover, Rao et al. conducted a phenome-wide MR study incorporating 3948 traits, diseases, and blood proteins and identified a nominally significant association between tobacco use and ACE2 expression in the lung (IVW: β=0.918, p=0.016) (Rao et al., 2020). However, this association did not survive correction for multiple comparisons (FDR-adjusted p-value of 0.51), which is consistent with our MR results using a greatly augmented number of IVs (for smoking status, IV=378).

The biological mechanisms underlying smoking and tissue-specific ACE2 expression remain to be elucidated. ACE2 is considered the target receptor for SARS-CoV-2 entry into host cells, and increased expression of ACE2 appears to raise both the susceptibility to and severity of COVID-19. As a type I transmembrane metallocarboxypeptidase homologous to ACE, ACE2 is known to be expressed in a variety of tissues, including the respiratory tract, cardio-renal tissues, and gastrointestinal tissues (Harmer et al., 2002). Our study found that smoking increased ACE2 expression in multiple tissues, including brain and colon tissues. While SARS-CoV-2 mainly spreads via the respiratory tract, enrichment of SARS-CoV-2 in the gastrointestinal tract has been confirmed by testing viral RNA in stool from 71 patients with COVID-19, suggesting the importance of gastrointestinal involvement in the infection (Xiao et al., 2020). Furthermore, nearly one-fifth of COVID-19 patients remained SARS-CoV-2 RNA-positive in their stool despite negative results in their respiratory samples. Among those 71 patients, ACE2 was abundantly expressed in the gastrointestinal epithelia but rarely expressed in the esophageal epithelia. In addition, findings from public databases (GEO, GTEx, and HPA) have demonstrated a higher expression level of ACE2 in the gastrointestinal tract (colon, rectum, and small intestine) and liver than in the lung (Burgueño et al., 2020; Li et al., 2020; Pirola and Sookoian, 2020). Taking these pieces of evidence together, we could reasonably assume that the biological mechanisms underlying the link between ACE2 expression and COVID-19 susceptibility are complicated, involving multiple organs other than the lung. Together with our findings of a link between smoking and increased ACE2 expression in both transverse colon and sigmoid colon, these results collectively suggest that smoking modulates ACE2 expression in the gastrointestinal tract, subsequently influencing the susceptibility to and severity of COVID-19.

We also found that smoking influenced ACE2 expression in the brain tissue, especially in putamen basal ganglia and hypothalamus. Smoking may promote cellular uptake of SARS-CoV-2 through nicotinic acetylcholine receptor (nAChR) signalling (Russo et al., 2020). It is worth noting that nAChR and ACE2 are known to be co-expressed on many sites such as cortex, striatum, and hypothalamus within human brain (Dani and Bertrand, 2007; Jones et al., 2009). Nicotine stimulation of the nAChR increased ACE2 expression within neural cells, indicating a likelihood that smokers are more vulnerable to COVID-19 (Olds and Kabbani, 2020).

To the best of our knowledge, we performed a large-scale phenome-wide MR study to understand a causal role of smoking and alcohol consumption in ACE2 expression, as well as in COVID-19 related outcomes, totalling 532 genetic associations from 1.2 million individuals of European ancestry and covering almost all tissues/organs of the human body (N=44). In addition, we rigorously selected proxy SNPs and performed a series of sensitivity analyses to satisfy the MR model assumptions. For example, we satisfied the ‘relevance’ assumption by using GWAS-identified significant SNPs as IVs, and we ensured the ‘independence’ assumption and the ‘exclusion restriction’ assumption by performing several important sensitivity analyses. However, limitations need to be acknowledged. Although hundreds of SNPs were used as proxies for cigarette smoking and alcohol consumption, these GWAS-identified SNPs explained only a small fraction (1–2%) of phenotypic variance. In addition, the minimum sample size of 114 in brain substantia nigra tissue further limited the statistical power of the MR analysis, although the GTEx database is so far the largest available database with both genotype and expression data. Moreover, data on the genotypes, tissue expression, and COVID-19 related outcomes in our analyses were all based on European ancestry populations, restricting the generalizability of our findings to other ethnicities. On the other hand, our data, with exposure and outcome GWAS(s) (or eQTLs) conducted using individuals of the same underlying populations (all European ancestry), greatly reduce population stratification and satisfy the MR model assumption: for a two-sample MR to be valid, the two samples should preferably be from the same ethnicity. Finally, we have to acknowledge that few of our significant associations passed the stringent Bonferroni correction. Our results provide a comprehensive picture of the causal relationship of smoking with ACE2 expression in various tissues and with COVID-19 susceptibility and severity, yet we urge caution in interpretation, and additional analyses are needed to replicate these findings.

In conclusion, genetically instrumented smoking phenotypes, reflected by both smoking initiation and smoking intensity, are significantly associated with a high expression level of ACE2 in multiple tissues/organs, subsequently increasing the susceptibility to and severity of COVID-19. Our results have important clinical implications, suggesting that smokers might be more vulnerable to SARS-CoV-2 infection or severe disease. At the population level, smoking cessation is also an important actionable prevention strategy for COVID-19. Further studies are needed to confirm or refute our findings.

Appendix 1

Appendix 1—figure 1
Leave-one-out analysis of causal association of smoking initiation with angiotensin-converting enzyme 2 (ACE2) expression.

Each boxplot represents the central tendency of effect sizes (beta coefficients) of smoking initiation on tissue-specific ACE2 expression based on the results of leave-one-out analysis, in which we excluded one single nucleotide polymorphism (SNP) at a time and performed inverse-variance weighted (IVW) analysis using the remaining SNPs.

Appendix 1—figure 2
Leave-one-out analysis of causal association of cigarettes per day with angiotensin-converting enzyme 2 (ACE2) expression.

Each boxplot represents the central tendency of effect sizes (beta coefficients) of cigarettes per day on tissue-specific ACE2 expression based on the results of leave-one-out analysis, in which we excluded one single nucleotide polymorphism (SNP) at a time and performed inverse-variance weighted (IVW) analysis using the remaining SNPs.

Appendix 1—figure 3
Leave-one-out analysis of causal association of drinks per week with angiotensin-converting enzyme 2 (ACE2) expression.

Each boxplot represents the central tendency of effect sizes (beta coefficients) of drinks per week on tissue-specific ACE2 expression based on the results of leave-one-out analysis, in which we excluded one single nucleotide polymorphism (SNP) at a time and performed inverse-variance weighted (IVW) analysis using the remaining SNPs.

Appendix 1—figure 4
Leave-one-out analysis of causal association of smoking and alcohol consumption with coronavirus disease 2019 (COVID-19) related adverse outcomes.

(A) Leave-one-out analysis of causal association of smoking initiation with COVID-19 related adverse outcomes. (B) Leave-one-out analysis of causal association of cigarettes per day with COVID-19 related adverse outcomes. (C) Leave-one-out analysis of causal association of drinks per week with COVID-19 related adverse outcomes. Each boxplot represents the central tendency of effect sizes (beta coefficients) of exposure (including smoking and alcohol consumption) on COVID-19 related adverse outcomes based on the results of leave-one-out analysis, in which we excluded one single nucleotide polymorphism (SNP) at a time and performed inverse-variance weighted (IVW) analysis using the remaining SNPs. COVID-19 indicates susceptibility to COVID-19. Severe COVID-19 indicates very severe respiratory confirmed COVID-19.
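
The leave-one-out procedure described in these captions can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' code (their annotated scripts are in the GitHub repository listed under Data availability). It assumes a fixed-effect IVW estimator with first-order weights; the function names `ivw_estimate` and `leave_one_out` and the toy summary statistics are hypothetical.

```python
import numpy as np

def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    Equivalent to a weighted regression of SNP-outcome effects on
    SNP-exposure effects through the origin, with weights 1 / se_out**2.
    """
    beta_exp = np.asarray(beta_exp, dtype=float)
    beta_out = np.asarray(beta_out, dtype=float)
    weights = 1.0 / np.asarray(se_out, dtype=float) ** 2
    denom = np.sum(weights * beta_exp ** 2)
    estimate = np.sum(weights * beta_exp * beta_out) / denom
    std_err = np.sqrt(1.0 / denom)
    return estimate, std_err

def leave_one_out(beta_exp, beta_out, se_out):
    """Re-run IVW once per SNP, each time dropping that SNP."""
    beta_exp = np.asarray(beta_exp, dtype=float)
    beta_out = np.asarray(beta_out, dtype=float)
    se_out = np.asarray(se_out, dtype=float)
    n_snps = len(beta_exp)
    estimates = []
    for i in range(n_snps):
        keep = np.arange(n_snps) != i  # boolean mask excluding SNP i
        est, _ = ivw_estimate(beta_exp[keep], beta_out[keep], se_out[keep])
        estimates.append(est)
    return np.array(estimates)

# Made-up summary statistics: every SNP implies a causal effect of 2.0.
toy_beta_exp = np.array([0.10, 0.08, 0.12, 0.05])
toy_beta_out = np.array([0.20, 0.16, 0.24, 0.10])
toy_se_out = np.array([0.05, 0.04, 0.06, 0.05])

full_estimate, full_se = ivw_estimate(toy_beta_exp, toy_beta_out, toy_se_out)
loo_estimates = leave_one_out(toy_beta_exp, toy_beta_out, toy_se_out)
```

A boxplot of `loo_estimates` per tissue would correspond to one box in the figures above; a single SNP driving the association would show up as an outlying leave-one-out estimate.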

Data availability

All data generated or analysed during this study are publicly available. Data of the GSCAN consortium, the GTEx project, and the COVID-19 Host Genetics Initiative can be accessed at https://genome.psych.umn.edu/index.php/GSCAN, https://www.gtexportal.org, and https://www.covid19hg.org, respectively. Furthermore, data and main programming codes with annotations have been uploaded to GitHub and made publicly available at https://github.com/hye-hz/MR_Smoke_COVID19.git (copy archived at https://archive.softwareheritage.org/swh:1:rev:1a2038517d8f2c7c772e69c9c5abab7713add9bb).

References

    Liu M, Jiang Y, Wedow R, Li Y, Brazel DM, Chen F, Datta G, Davila-Velderrain J, McGuire D, Tian C, Zhan X, Choquet H, Docherty AR, Faul JD, Foerster JR, Fritsche LG, Gabrielsen ME, Gordon SD, Haessler J, Hottenga JJ, Huang H, Jang SK, Jansen PR, Ling Y, Mägi R, Matoba N, McMahon G, Mulas A, Orrù V, Palviainen T, Pandit A, Reginsson GW, Skogholt AH, Smith JA, Taylor AE, Turman C, Willemsen G, Young H, Young KA, Zajac GJM, Zhao W, Zhou W, Bjornsdottir G, Boardman JD, Boehnke M, Boomsma DI, Chen C, Cucca F, Davies GE, Eaton CB, Ehringer MA, Esko T, Fiorillo E, Gillespie NA, Gudbjartsson DF, Haller T, Harris KM, Heath AC, Hewitt JK, Hickie IB, Hokanson JE, Hopfer CJ, Hunter DJ, Iacono WG, Johnson EO, Kamatani Y, Kardia SLR, Keller MC, Kellis M, Kooperberg C, Kraft P, Krauter KS, Laakso M, Lind PA, Loukola A, Lutz SM, Madden PAF, Martin NG, McGue M, McQueen MB, Medland SE, Metspalu A, Mohlke KL, Nielsen JB, Okada Y, Peters U, Polderman TJC, Posthuma D, Reiner AP, Rice JP, Rimm E, Rose RJ, Runarsdottir V, Stallings MC, Stančáková A, Stefansson H, Thai KK, Tindle HA, Tyrfingsson T, Wall TL, Weir DR, Weisner C, Whitfield JB, Winsvold BS, Yin J, Zuccolo L, Bierut LJ, Hveem K, Lee JJ, Munafò MR, Saccone NL, Willer CJ, Cornelis MC, David SP, Hinds DA, Jorgenson E, Kaprio J, Stitzel JA, Stefansson K, Thorgeirsson TE, Abecasis G, Liu DJ, Vrieze S, 23andMe Research Team, HUNT All-In Psychiatry
    (2019) Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use
    Nature Genetics 51:237–244.
    https://doi.org/10.1038/s41588-018-0307-5

Decision letter

  1. Eduardo Franco
    Senior Editor; McGill University, Canada
  2. M Dawn Teare
    Reviewing Editor; Newcastle University, United Kingdom
  3. Houfeng Zheng
    Reviewer; Westlake University, China
  4. Derrick Bennett
    Reviewer; Nuffield Department of Population Health, United Kingdom

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

This work presents a two-sample Mendelian Randomization (MR) analysis of smoking and alcohol consumption with ACE2 expression in multiple organs. The MR approach allows exploration of a causal role of these modifiable lifestyle factors in ACE2 expression in 44 tissues/organs using data from the GSCAN consortium and GTEx. The MR analysis finds interesting associations with smoking status and intensity and increased levels of ACE2 expression in organs that may go on to modify susceptibility to COVID-19. However, no evidence for an effect of alcohol was seen.

Decision letter after peer review:

Thank you for submitting your article "Mendelian randomization analysis provides causality of smoking on the expression of ACE2, a putative SARS-CoV-2 receptor" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Houfeng Zheng (Reviewer #1); Derrick Bennett (Reviewer #2).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

We would like to draw your attention to changes in our policy on revisions we have made in response to COVID-19 (https://elifesciences.org/articles/57162). Specifically, when editors judge that a submitted work as a whole belongs in eLife but that some conclusions require a modest amount of additional new data, as they do with your paper, we are asking that the manuscript be revised to either limit claims to those supported by data in hand, or to explicitly state that the relevant conclusions require additional supporting data. Our expectation is that, where possible, the authors will eventually carry out the additional work and report on how they affect the relevant conclusions either in a preprint on bioRxiv or medRxiv, or if appropriate, as a Research Advance in eLife, either of which would be linked to the original paper.

Summary:

This paper presents a two-sample Mendelian Randomization analysis of smoking and alcohol consumption with ACE2 expression in multiple organs using tissue samples made available through the GTEx dataset. The MR analysis found promising associations with smoking status and intensity and increased levels of ACE2 expression in organs that may go on to modify susceptibility to COVID-19. While the research is novel, the methods have been applied to only one data resource and some of the conclusions are not warranted by the data and analysis. In particular the causal inferences relating to COVID-19 susceptibility require additional data. Many conclusions are too strong based on the analyses performed.

Essential revisions:

1. The authors must include a detailed flowchart of how SNPs were selected/excluded for each IV. The information must be reported in sufficient detail so that the IVs could be recreated and the whole MR analysis could be replicated. If this requires substantial programming the annotated code could be made available through GitHub for example.

2. The authors report "Our results provide important clinical implications on that smokers might be more vulnerable to SARS-CoV-2 infection or severe disease." They have not directly assessed the relationship of their IV for smoking to SARS-CoV-2 infection so their conclusions need to be toned down. Did the authors consider obtaining SARS-COV-2 outcome data from the COVID-19 Host Genetics Initiative (https://doi.org/10.1038/s41431-020-0636-6)? This would greatly strengthen the report.

3. The MR studies have been conducted in a European population so are these results generalizable to other populations? While the resource used is impressive can this analysis be replicated in an independent data set? Even if some of the signals could be replicated this would add enormous value to the results.

4. Can you offer an explanation why ACE2 was highly expressed in brain, colon, liver, etc., but not in respiratory tract and lung tissue? Is higher expression of ACE2 really a susceptibility factor for COVID-19? What is the evidence?

5. The Discussion section seems to focus on evidence from China but the results from China may be affected by the sex-differences in smoking and alcohol prevalence. The patterns of smoking and alcohol in East Asian populations is very different from Western populations. Typically very few women smoke or drink in East Asia. The authors should comment on this.

6. Did the authors consider performing analyses separately for men and women in this study?

7. Table 2 shows the detectable difference with a fixed power of 80% and a significance level of 0.05. Should the significance level be modified to deal with multiple testing with sample from different organs? If not why not? The sample size calculation requires further clarification. It is preferable to mention the a priori power calculations in the methods section of the report not the results. Are the effect sizes detectable clinically relevant? How was this ascertained?

8. The authors need to report the associated F-statistics for their instrumental variables.

9. The authors mention that "Expression values for each gene were inverse quantile normalized to a standard normal distribution across samples". So this suggests that the results are based on a per standard deviation change but this is not clear from the results.

10. Only MR-Egger was used to assess horizontal pleiotropy. There are several other approaches that make different assumptions to MR-Egger that should be considered in order to triangulate the findings.

11. In the MR results, the significant results from the IVW approach were not replicated in MR-Egger regression, and vice versa. Can we believe these are real causal associations? Could you explain?

12. The abstract should communicate the size of the dataset (ie the number of samples) and report an effect size, 95% confidence interval and p-value for each signal reported in the abstract. Stating 'significant' or 'non-significant' is not appropriate for an abstract.

13. In Figures 1, 2 and 3 the x-axis needs clearer labelling. Isn't this a plot of β values per 1 SD change in ACE2 expression?

https://doi.org/10.7554/eLife.64188.sa1

Author response

Summary:

This paper presents a two-sample Mendelian Randomization analysis of smoking and alcohol consumption with ACE2 expression in multiple organs using tissue samples made available through the GTEx dataset. The MR analysis found promising associations with smoking status and intensity and increased levels of ACE2 expression in organs that may go on to modify susceptibility to COVID-19. While the research is novel, the methods have been applied to only one data resource and some of the conclusions are not warranted by the data and analysis. In particular the causal inferences relating to COVID-19 susceptibility require additional data. Many conclusions are too strong based on the analyses performed.

We thank the reviewer for a nice summary of our paper as well as for pointing out the potential limitations, all of which are well founded and with which we fully agree. In our revised manuscript, we have added extra analyses leveraging GWAS summary statistics of COVID-19 related outcomes to bring in an additional source of data. We have also revised the whole manuscript to avoid possible overstatement of results. We feel that our manuscript has been greatly improved and hope it now meets the criteria for publication in eLife.

Essential revisions:

1. The authors must include a detailed flowchart of how SNPs were selected/excluded for each IV. The information must be reported in sufficient detail so that the IVs could be recreated and the whole MR analysis could be replicated. If this requires substantial programming the annotated code could be made available through GitHub for example.

We thank the reviewer for pointing this issue out. Indeed, selection of IVs serves as an important basis for MR analysis.

Following the reviewer’s suggestion, we have added a flowchart of IV selection as Figure 1. Specific information regarding each IV, including its rsID, genomic coordinates, effect size, allele frequency, etc., can be found in our Supplementary File 1a-1c. Data and main programming codes with annotations have been uploaded to GitHub and made publicly available at https://github.com/hye-hz/MR_Smoke_COVID19.git.

2. The authors report "Our results provide important clinical implications on that smokers might be more vulnerable to SARS-CoV-2 infection or severe disease." They have not directly assessed the relationship of their IV for smoking to SARS-CoV-2 infection so their conclusions need to be toned down. Did the authors consider obtaining SARS-COV-2 outcome data from the COVID-19 Host Genetics Initiative (https://doi.org/10.1038/s41431-020-0636-6)? This would greatly strengthen the report.

We thank the reviewer for this extremely constructive comment as well as for pointing us to the COVID-19 outcome-related GWAS data. Incorporating those data greatly enhances the validity of our current results. Following the reviewer’s suggestions, we have estimated the putative causal effect of smoking and alcohol consumption on COVID-19 susceptibility and severity. Consistent with our ACE2 expression results, both smoking initiation and intensity were associated with an increased risk of COVID-19 related outcomes, while alcohol consumption did not seem to influence the risk.

We have presented those results in additional tables (Table 3 and Supplementary File 1o-1r) and added them to our manuscript; the relevant text reads:

“Finally, complementing to findings of ACE2 expression, we tested a putative causal link between smoking status, smoking intensity, alcohol consumption and the risk of COVID-19 related adverse outcomes. We found that smoking initiation significantly increased the risk of COVID-19 onset (IVW: OR=1.15, 95%CI: 1.07-1.23, p=8.7×10−5) even after taking into account multiple comparisons (Table 3). Results remained significant in the weighted median and MR-PRESSO methods, however, showed larger statistical uncertainties in the MR-Egger regression. In addition, smoking initiation increased the risk of very severe respiratory confirmed COVID-19 and hospitalized COVID-19 using the IVW approach, but the results were not supported by the MR-Egger regression. Smoking intensity (cigarettes per day) only increased the risk of very severe respiratory confirmed COVID-19 as shown in the MR-Egger regression (OR=5.99, 95%CI: 1.57-22.84, p=0.012) (Supplementary File 1o). On the contrary, we did not find a causal link between alcohol consumption (drinks per week) and the risk of COVID-19 adverse outcomes (Supplementary File 1p). Findings did not alter in the sensitivity analyses (Supplementary File 1q-1r and Appendix 1-Figure 4).”
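
As a rough consistency check on how an IVW log-odds estimate maps to an odds ratio, confidence interval, and p value, the following Python sketch back-calculates the standard error from the rounded 95% CI quoted above (OR=1.15, 1.07 to 1.23) under a normal approximation. The helper names are hypothetical, and because the inputs are rounded, the recovered p value is only approximate.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal

def or_with_ci(log_or, se):
    """Convert a log-odds estimate and its SE into (OR, lower, upper)."""
    return (
        math.exp(log_or),
        math.exp(log_or - Z95 * se),
        math.exp(log_or + Z95 * se),
    )

def se_from_ci(lower, upper):
    """Recover the SE of the log odds ratio from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)

def two_sided_p(log_or, se):
    """Two-sided p value from a normal (Wald) test of log_or = 0."""
    z = abs(log_or) / se
    return math.erfc(z / math.sqrt(2))

# Rounded values taken from the quoted passage.
log_or = math.log(1.15)
se = se_from_ci(1.07, 1.23)
odds_ratio, ci_lower, ci_upper = or_with_ci(log_or, se)
p_value = two_sided_p(log_or, se)
```

Under these assumptions the recovered p value is of the same order of magnitude as the reported 8.7×10−5; exact agreement is not expected from rounded inputs.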

3. The MR studies have been conducted in a European population so are these results generalizable to other populations? While the resource used is impressive can this analysis be replicated in an independent data set? Even if some of the signals could be replicated this would add enormous value to the results.

We thank the reviewer for raising such an important point regarding the generalizability of results. Indeed, only data of European ancestry populations were used in our study and therefore our results had limited generalizability to other ethnicities. We have acknowledged this limitation in our Discussion, it reads:

“Moreover, data on the genotypes, tissue expression and COVID-19 related outcomes of our analyses were all based on European ancestry populations, restricting its generalizability to other ethnicities. On the other hand, our data, with exposure and outcome GWAS(s) (or eQTLs) conducted using individuals of the same underlying populations (all European ancestry) greatly reduces the population stratification as well as satisfies the MR model assumption – for a two-sample MR to be valid, the two samples have to be preferably from the same ethnicity.”

Regarding the replication of results, we unfortunately could not find suitable GWAS conducted for our exposures of interest (smoking, alcohol consumption) or outcomes (ACE2 expression, COVID-19 adverse outcomes) in populations other than those of European ancestry. Following the reviewer’s suggestion, we have incorporated COVID-19 outcome-related GWAS, which serve as an additional source of data and corroborate our ACE2 expression findings, adding value to our original results. Please see our response to question #2.

4. Can you offer an explanation why ACE2 was highly expressed in brain, colon, liver, etc., but not in respiratory tract and lung tissue? Is higher expression of ACE2 really a susceptibility factor for COVID-19? What is the evidence?

We shared the same concerns as the reviewer regarding the interpretation of our results. We tried to offer explanation from three aspects as shown below. The reviewer is very welcome to provide further comments and/or suggestions based on our explanations.

First of all, our results were consistent with the study conducted by Rao et al., which examined the effect of an extensive number of diseases, traits and blood proteins (N=3948) on ACE2 expression in the lung [1]. For smoking, 3 SNPs associated with tobacco use were extracted from the UK Biobank (ukb-b-5115) and used as IVs; data on ACE2 expression in lung tissue were extracted from the GTEx database and used as the outcome. Despite a nominally significant association of tobacco use with ACE2 expression in the lung (IVW: β=0.92, SE=0.38, p=0.016), this result did not survive correction for multiple comparisons (p value for FDR was 0.51). Our results, with 100 times as many IVs and largely increased statistical power, did not find a causal relationship between genetically predicted smoking and ACE2 expression in the lung.

Secondly, as we mentioned on Page 9-10 in our Discussion section:

“While SARS-CoV-2 mainly spreads via respiratory tract, enrichment of SARS-CoV-2 in gastrointestinal tract has been confirmed by testing viral RNA in stool from 71 patients with COVID-19, suggesting the importance of gastrointestinal involvement in the infection [2]. Furthermore, nearly one fifth of COVID-19 patients remained SARS-CoV-2 RNA positive in their stool, despite their negative results in their respiratory samples. Of those 71 patients, ACE2 was abundantly expressed in the gastrointestinal epithelia, but rarely expressed in the esophageal epithelium. In addition, findings from public databases (GEO, GTEx and HPA) have demonstrated a higher expression level of ACE2 in the gastrointestinal tract (colon, rectal, and small intestine) and liver than in the lung [3-5]. Taking these pieces of evidence together, we could reasonably assume that the biological mechanisms underlying the link between ACE2 expression and COVID-19 susceptibility is complicated, involving multiple organs other than the lung.”

Last but not least, regarding the reviewer’s original question, “Is higher expression of ACE2 really a susceptible factor for COVID-19? what is the evidence?”, we have to admit that there is a lack of strong evidence supporting an absolute causal role of ACE2 in the susceptibility to COVID-19. This infection, like all other complex traits or diseases, has a multifactorial etiology involving multiple factors, such as a non-trivial role of environmental exposures (here, virus load for example) and various immunological/physiological molecular events other than the expression of the ACE2 receptor. Our study, which focused on ACE2 expression based on a reasonable hypothesis (the most relevant receptor for the virus) and the availability of data, contributes a preliminary first step towards understanding the etiology of COVID-19. Downstream experimental data and future large-scale analyses are needed to dispute or support our findings, as well as to explore molecules other than the ACE2 receptor.

5. The Discussion section seems to focus on evidence from China but the results from China may be affected by the sex-differences in smoking and alcohol prevalence. The patterns of smoking and alcohol in East Asian populations is very different from Western populations. Typically very few women smoke or drink in East Asia. The authors should comment on this.

We agree with the reviewer; indeed, the patterns of smoking and alcohol consumption in East Asian populations differ from those of Western populations, particularly among women. Since the prevalence of smoking among COVID-19 patients is in general low (ranging from 1.4% to 21.9%), and the patient population consists mainly of males (ranging from 50% to 73%), the sex difference may have a minimal effect on the outcome. Nevertheless, following the reviewer’s suggestion, we have cited results from Western populations in our Discussion section; it reads:

“Opposite to the findings from Asian population, evidence from the Veterans Affairs Birth Cohort in a United States population found that current smoking was associated with a lower risk of COVID-19 susceptibility (OR=0.45, 95%CI: 0.35-0.57) [6].”

6. Did the authors consider performing analyses separately for men and women in this study?

Unfortunately, we were unable to perform such an analysis. Although most current GWAS(s) were carried out including both men and women, stratified analysis was still performed only to a small extent. None of our exposure or outcome GWAS data (both ACE2 expression and COVID-19 adverse outcomes) were stratified by sex. This analysis could be an important future direction to be focused on when data becomes available.

7. Table 2 shows the detectable difference with a fixed power of 80% and a significance level of 0.05. Should the significance level be modified to deal with multiple testing with sample from different organs? If not why not? The sample size calculation requires further clarification. It is preferable to mention the a priori power calculations in the methods section of the report not the results. Are the effect sizes detectable clinically relevant? How was this ascertained?

We thank the reviewer for raising this point. Indeed, Bonferroni correction should be used to account for multiple comparisons. We set our corrected p threshold by dividing 0.05 by the number of outcomes: 0.05/(44+3)=1.0×10−3 (including 44 tissues/organs and three COVID-19 related adverse outcomes).

However, because 1) the main aim of this work was to evaluate the evidence for a putative causal role of smoking and alcohol consumption in ACE2 expression levels across a range of human tissues/organs (N=44) rather than to identify a specific tissue, and 2) tissues and organs are not totally independent of each other, we applied a marginal significance level of p < 0.05 in our study. For all our results, we clearly reported the p values to ensure an accurate interpretation. We added explanatory sentences on the reasons for using a marginal significance threshold and highlighted all the results that passed Bonferroni correction. For example, we found that smoking initiation significantly augmented the risk of COVID-19 susceptibility (IVW: OR=1.15, 95%CI: 1.07-1.23, p=8.7×10−5) even after taking into account multiple comparisons, indicating that smokers had a higher risk of developing COVID-19 compared with non-smokers.
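
The threshold arithmetic described in this response can be written out as a short Python sketch. The variable and function names are hypothetical; the two p values checked are those quoted elsewhere in this response for smoking initiation (IVW) and smoking intensity (MR-Egger).

```python
N_TISSUES = 44          # tissues/organs tested for ACE2 expression
N_COVID_OUTCOMES = 3    # COVID-19 related adverse outcomes
ALPHA = 0.05            # nominal (uncorrected) significance level

# Bonferroni-corrected threshold: alpha divided by the number of outcomes.
bonferroni_threshold = ALPHA / (N_TISSUES + N_COVID_OUTCOMES)

def classify(p_value):
    """Label a p value as Bonferroni-significant, nominal, or neither."""
    if p_value < bonferroni_threshold:
        return "passes Bonferroni"
    if p_value < ALPHA:
        return "nominally significant"
    return "not significant"

smoking_initiation_label = classify(8.7e-5)
smoking_intensity_label = classify(0.012)
```

The unrounded threshold is slightly above 1.0×10−3, so the distinction only matters for p values very close to that boundary.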

Following the reviewer’s suggestion, we have moved the sample size calculation to the Methods section and added explanatory sentences; it reads:

“To guarantee statistical power, we only included tissues/organs with at least 100 samples in the GTEx database. Under the current sample size, given 1–2% of the phenotypic variance of smoking and alcohol consumption explained by IVs, our study had sufficient power (>80%) to detect a causal effect of 0.74 to 2.66 in ACE2 expression, and to detect an OR ranging from 1.11 to 1.39 for COVID-19 related outcomes (Supplementary File 1e).”

Lastly, regarding the reviewer’s original question, “Are the effect sizes detectable clinically relevant? How was this ascertained?”, we need to admit that effect size in itself cannot give a clear interpretation of clinical relevance. Specifically, Dr. Burgess underscores that for a binary exposure (e.g. smoking initiation), MR is most suitable for testing a causal link (whether there is a causal relationship) rather than for estimating a causal effect (how strong the causal relationship is and how much the risk of the outcome could be reduced if the exposure were “blocked”) [7]. This is because causal estimation for a binary exposure assumes that the causal effect is a step function at the point of dichotomization, whereas MR estimation relies on parametric assumptions. Caution is therefore needed when translating a causal estimate for a binary exposure into clinical relevance.

8. The authors need to report the associated F-statistics for their instrumental variables.

We thank the reviewer for raising this point. A strong instrumental variable is a basic requirement for a valid MR result. Following the reviewer’s suggestion, we verified the strength of the instrumental variables by calculating F-statistics using the formula $F = \frac{R^2 (n - 1 - k)}{(1 - R^2)\,k}$, where $R^2$ is the proportion of variance explained by the instrumental variables, k refers to the number of IVs, and n indicates the sample size [8]. The F-statistics for smoking initiation, smoking intensity (cigarettes per day) and alcohol consumption (drinks per week) were 77.2, 67.4, and 17.8, respectively, indicating strong IVs (F-statistics > 10) for each of our exposures of interest.
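
For readers who want to apply the instrument-strength check themselves, here is a minimal Python sketch of the F-statistic, read as F = R²(n − 1 − k) / ((1 − R²)k). The function name `f_statistic` and the example inputs are hypothetical; they are not the study's values and are not intended to reproduce the F-statistics reported above.

```python
def f_statistic(r_squared, n_samples, n_instruments):
    """F-statistic for instrument strength.

    r_squared     : proportion of exposure variance explained by the IVs
    n_samples     : sample size of the exposure GWAS (n)
    n_instruments : number of instrumental variables (k)
    """
    if not 0.0 < r_squared < 1.0:
        raise ValueError("r_squared must lie strictly between 0 and 1")
    if n_samples - 1 - n_instruments <= 0:
        raise ValueError("n_samples must exceed n_instruments + 1")
    numerator = r_squared * (n_samples - 1 - n_instruments)
    denominator = (1.0 - r_squared) * n_instruments
    return numerator / denominator

# Hypothetical inputs chosen only to exercise the formula.
weak_example = f_statistic(r_squared=0.02, n_samples=10_000, n_instruments=100)
strong_example = f_statistic(r_squared=0.02, n_samples=1_000_000, n_instruments=100)

# Conventional rule of thumb mentioned in the response: F > 10.
weak_is_strong = weak_example > 10
strong_is_strong = strong_example > 10
```

The two examples differ only in sample size, which shows why a set of instruments explaining a small share of variance can still be strong in a very large GWAS.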

9. The authors mention that "Expression values for each gene were inverse quantile normalized to a standard normal distribution across samples". So this suggests that the results are based on a per standard deviation change but this is not clear from the results.

We appreciate the opportunity to make further clarifications. The statement “Expression values for each gene were inverse quantile normalized to a standard normal distribution across samples” refers to the ACE2 expression level (our outcome). For our exposures, three phenotypes were included in the analyses, that is, smoking initiation (ever vs. never), smoking intensity (cigarettes per day) and alcohol consumption (drinks per week). For the latter two exposures, we indeed calculated results based on a per-SD change in the exposure, while for smoking initiation, which is a binary exposure, we calculated results based on a per-unit change in the exposure on the log odds scale.

10. Only MR-Egger was used to assess horizontal pleiotropy. There are several other approaches that make different assumptions to MR-Egger that should be considered in order to triangulate the findings.

Following the reviewer’s suggestion, we have tested horizontal pleiotropy using the MR-PRESSO approach, results of which have been added to Tables 1-3, to Supplementary File 1f-1h, and to Supplementary File 1o-1p. There was no evidence for the existence of horizontal pleiotropy according to the global test (all p>1.0×10−6).

11. In the MR results, the significant results from the IVW approach were not replicated in MR-Egger regression, and vice versa. Can we believe these are real causal associations? Could you explain?

We appreciate the opportunity to clarify this point.

First, it is not surprising that the significant IVW results were not replicated in the MR-Egger regression, as this method yields standard errors twice as large as those of IVW and therefore wider 95% confidence intervals.

Secondly, as mentioned in our Methods section, “Results were considered significant only if they passed statistical significance (p<0.05) in the IVW approach or the MR-Egger regression and remained directional consistent in the weighted median and MR-PRESSO methods across both primary and sensitivity analyses.” The IVW approach is the most conventional method and was applied as the primary method to estimate causal links between exposures (smoking and alcohol consumption) and outcomes (ACE2 expression and COVID-19-related outcomes), while MR-Egger regression and several other approaches were used mainly to identify potential horizontal pleiotropy.
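
The relationship between the IVW and MR-Egger standard errors can be illustrated with a minimal sketch. It assumes fixed-effect weighted regression with weights 1/se², and the per-SNP summary statistics are made up for illustration; it is not the code or data used in our analysis. The function names `ivw` and `mr_egger` are hypothetical.

```python
import math


def ivw(bx, by, se_y):
    """Fixed-effect IVW: weighted regression of by on bx through the origin."""
    w = [1.0 / s ** 2 for s in se_y]
    sxx = sum(wi * x * x for wi, x in zip(w, bx))
    sxy = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    return sxy / sxx, math.sqrt(1.0 / sxx)


def mr_egger(bx, by, se_y):
    """Fixed-effect MR-Egger: the same weighted regression plus an intercept."""
    # Orient every SNP so that its association with the exposure is positive.
    sign = [1.0 if x >= 0 else -1.0 for x in bx]
    x = [s * xi for s, xi in zip(sign, bx)]
    y = [s * yi for s, yi in zip(sign, by)]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx
    slope = (sw * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det
    return slope, math.sqrt(sw / det), intercept


# Made-up SNP-exposure (bx) and SNP-outcome (by) associations.
bx = [0.10, 0.15, 0.20, 0.25, 0.30]
by = [0.05, 0.08, 0.09, 0.13, 0.16]
se_y = [0.02] * 5

beta_ivw, se_ivw = ivw(bx, by, se_y)
beta_egger, se_egger, egger_intercept = mr_egger(bx, by, se_y)
print(f"IVW:      beta={beta_ivw:.3f}, se={se_ivw:.3f}")
print(f"MR-Egger: beta={beta_egger:.3f}, se={se_egger:.3f}")
```

Because MR-Egger estimates an additional intercept term, its slope standard error under these formulas can never be smaller than the IVW standard error computed from the same SNPs. How much larger it is depends on the spread of the SNP-exposure associations.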

12. The abstract should communicate the size of the dataset (ie the number of samples) and report an effect size, 95% confidence interval and p-value for each signal reported in the abstract. Stating 'significant' or 'non-significant' is not appropriate for an abstract.

We thank the reviewers for this constructive suggestion and have comprehensively revised our abstract, which now reads:

“Background: To understand a causal role of modifiable lifestyle factors in ACE2 expression (a putative SARS-CoV-2 receptor) across 44 human tissues/organs, and in COVID-19 susceptibility and severity, we conducted a phenome-wide two-sample Mendelian randomization (MR) study.

Methods: More than 500 genetic variants were used as instrumental variables to predict smoking and alcohol consumption. Inverse-variance weighted approach was adopted as the primary method to estimate a causal association, while MR-Egger regression, weighted median and MR-PRESSO were performed to identify potential horizontal pleiotropy.

Results: We found that genetically predicted smoking intensity significantly increased ACE2 expression in thyroid (β=1.468, p=1.8×10−8); and increased ACE2 expression in adipose, brain, colon and liver with nominal significance. Additionally, genetically predicted smoking initiation significantly increased the risk of COVID-19 onset (odds ratio=1.14, p=8.7×10−5). No statistically significant result was observed for alcohol consumption.

Conclusions: Our work demonstrates an important role of smoking, measured by both status and intensity, in the susceptibility to COVID-19.”

13. In Figures 1, 2 and 3 the x-axis needs clearer labelling. Isn't this a plot of β values per 1 SD change in ACE2 expression?

Following the reviewer’s suggestion, we have revised the footnotes of each figure accordingly.

Taking Appendix 1-Figure 1 as an example, the footnote now reads: “Each boxplot represents the centralized tendency of effect sizes (β coefficients) of smoking initiation on tissue-specific ACE2 expression based on the results of leave-one-out analysis where we excluded one SNP at a time and performed IVW using the remaining SNPs.”

These boxplots (or the leave-one-out analysis) aim to identify outlying SNPs which could potentially bias our causal estimates. For example, if the result was driven by one or a few SNPs with larger effects, then we would expect a drastic change in the estimate when that particular outlying SNP was removed. In our case, all estimates centred around the expected value (from the main analysis using all IVs) indicating an absence of outlying SNPs.
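
The leave-one-out procedure described above can be sketched as follows. This is a minimal illustration assuming a fixed-effect IVW estimator and made-up summary statistics in which the last SNP is a deliberate outlier; it is not the code or data used in our analysis, and the function names are hypothetical.

```python
def ivw_estimate(bx, by, se_y):
    """Fixed-effect IVW estimate: weighted regression through the origin."""
    w = [1.0 / s ** 2 for s in se_y]
    sxx = sum(wi * x * x for wi, x in zip(w, bx))
    sxy = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    return sxy / sxx


def leave_one_out(bx, by, se_y):
    """Re-estimate the IVW effect with each SNP excluded in turn."""
    return [
        ivw_estimate(bx[:i] + bx[i + 1:], by[:i] + by[i + 1:], se_y[:i] + se_y[i + 1:])
        for i in range(len(bx))
    ]


# Made-up data: the first three SNPs follow by = 0.5 * bx, the last is an outlier.
bx = [0.1, 0.2, 0.3, 0.4]
by = [0.05, 0.10, 0.15, 0.60]
se_y = [0.02] * 4

full = ivw_estimate(bx, by, se_y)
loo = leave_one_out(bx, by, se_y)

# The SNP whose removal shifts the estimate most is the candidate outlier.
shifts = [abs(est - full) for est in loo]
most_influential = shifts.index(max(shifts))
print(f"all SNPs: {full:.3f}")
for i, est in enumerate(loo):
    print(f"without SNP {i}: {est:.3f}")
```

In this toy example, removing the outlying SNP changes the estimate drastically, which is the pattern the boxplots are designed to reveal. When no such SNP exists, the leave-one-out estimates cluster around the estimate from all SNPs.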

References:

1. Rao S, Lau A, So HC. Exploring Diseases/Traits and Blood Proteins Causally Related to Expression of ACE2, the Putative Receptor of SARS-CoV-2: A Mendelian Randomization Analysis Highlights Tentative Relevance of Diabetes-Related Traits. Diabetes Care 2020;43(7):1416-1426.

2. Xiao F, Tang M, Zheng X, Liu Y, Li X, Shan H. Evidence for Gastrointestinal Infection of SARS-CoV-2. Gastroenterology 2020;158(6):1831-1833 e3.

3. Pirola CJ, Sookoian S. COVID-19 and ACE2 in the Liver and Gastrointestinal Tract: Putative Biological Explanations of Sexual Dimorphism. Gastroenterology 2020;159(4):1620-1621.

4. Li MY, Li L, Zhang Y, Wang XS. Expression of the SARS-CoV-2 cell receptor gene ACE2 in a wide variety of human tissues. Infect Dis Poverty 2020;9(1):45.

5. Burgueno JF, Reich A, Hazime H, Quintero MA, Fernandez I, Fritsch J, Santander AM, Brito N, Damas OM, Deshpande A, Kerman DH, Zhang L, Gao Z, Ban Y, Wang L, Pignac-Kobinger J, Abreu MT. Expression of SARS-CoV-2 Entry Molecules ACE2 and TMPRSS2 in the Gut of Patients With IBD. Inflamm Bowel Dis 2020;26(6):797-808.

6. Rentsch CT, Kidwai-Khan F, Tate JP, Park LS, King JT, Skanderson M, Hauser RG, Schultze A, Jarvis CI, Holodniy M, Re VL, Akgün KM, Crothers K, Taddei TH, Freiberg MS, Justice AC. Covid-19 Testing, Hospital Admission, and Intensive Care Among 2,026,227 United States Veterans Aged 54-75 Years. medRxiv 2020:2020.04.09.20059964.

7. Burgess S, Labrecque JA. Mendelian randomization with a binary exposure variable: interpretation and presentation of causal estimates. Eur J Epidemiol 2018;33(10):947-952.

8. Pierce BL, Ahsan H, Vanderweele TJ. Power and instrument strength requirements for Mendelian randomization studies using multiple genetic variants. Int J Epidemiol 2011;40(3):740-52.

9. Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator. Genet Epidemiol 2016;40(4):304-14.

10. Verbanck M, Chen CY, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet 2018;50(5):693-698.

https://doi.org/10.7554/eLife.64188.sa2

Article and author information

Author details

  1. Hui Liu

    Biomedical Research Center, Zhejiang Provincial Key Laboratory of Laparoscopic Technology, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, China
    Contribution
    Data curation, Writing - original draft, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-5531-3640
  2. Junyi Xin

    Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, China
    Contribution
    Data curation, Formal analysis, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-6677-3936
  3. Sheng Cai

    Institute of Drug Metabolism and Pharmaceutical Analysis, Zhejiang Province Key Laboratory of Anti-Cancer Drug Research, Zhejiang University, Hangzhou, China
    Contribution
    Data curation, Formal analysis, Writing - review and editing
    Competing interests
    No competing interests declared
  4. Xia Jiang

    Department of Clinical Neuroscience, Center for Molecular Medicine, Karolinska Institute, Stockholm, Sweden
    Contribution
    Conceptualization, Supervision, Writing - original draft, Writing - review and editing
    For correspondence
    xia.jiang@ki.se
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-5878-8986

Funding

Swedish Research Council (VR-2018-02247)

  • Xia Jiang

Swedish Research Council for Health, Working Life and Welfare (FORTE-2020-00884)

  • Xia Jiang

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We would like to thank the GSCAN consortium, the GTEx Program, and the COVID-19 Host Genetics Initiative for the release of their data.

Senior Editor

  1. Eduardo Franco, McGill University, Canada

Reviewing Editor

  1. M Dawn Teare, Newcastle University, United Kingdom

Reviewers

  1. Houfeng Zheng, Westlake University, China
  2. Derrick Bennett, Nuffield Department of Population Health, United Kingdom

Publication history

  1. Received: October 20, 2020
  2. Accepted: June 19, 2021
  3. Accepted Manuscript published: July 6, 2021 (version 1)
  4. Version of Record published: July 15, 2021 (version 2)

Copyright

© 2021, Liu et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • Page views: 301
  • Downloads: 49
  • Citations: 0

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

