1. Genetics and Genomics
Download icon

Integration of genomics and transcriptomics predicts diabetic retinopathy susceptibility genes

  1. Andrew D Skol
  2. Segun C Jung
  3. Ana Marija Sokovic
  4. Siquan Chen
  5. Sarah Fazal
  6. Olukayode Sosina
  7. Poulami P Borkar
  8. Amy Lin
  9. Maria Sverdlov
  10. Dingcai Cao
  11. Anand Swaroop
  12. Ionut Bebu
  13. DCCT/EDIC Study group
  14. Barbara E Stranger  Is a corresponding author
  15. Michael A Grassi  Is a corresponding author
  1. Department of Pathology and Laboratory Medicine, Ann and Robert H. Lurie Children's Hospital of Chicago, United States
  2. Research and Development, NeoGenomics Laboratories, United States
  3. University of Illinois at Chicago, United States
  4. Cellular Screening Center, Office of Shared Research Facilities, The University of Chicago, United States
  5. Department of Biostatistics, Johns Hopkins University, United States
  6. National Eye Institute, National Institutes of Health (NIH), United States
  7. The George Washington University, Biostatistics Center, United States
  8. Department of Pharmacology, Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, United States
Research Article
  • Cited 1
  • Views 1,636
  • Annotations
Cite this article as: eLife 2020;9:e59980 doi: 10.7554/eLife.59980

Abstract

We determined differential gene expression in response to high glucose in lymphoblastoid cell lines derived from matched individuals with type 1 diabetes with and without retinopathy. Those genes exhibiting the largest difference in glucose response were assessed for association with diabetic retinopathy in a genome-wide association study meta-analysis. Expression quantitative trait loci (eQTLs) of the glucose response genes were tested for association with diabetic retinopathy. We detected an enrichment of the eQTLs from the glucose response genes among small association p-values and identified folliculin (FLCN) as a susceptibility gene for diabetic retinopathy. Expression of FLCN in response to glucose was greater in individuals with diabetic retinopathy. Independent cohorts of individuals with diabetes revealed an association of FLCN eQTLs with diabetic retinopathy. Mendelian randomization confirmed a direct positive effect of increased FLCN expression on retinopathy. Integrating genetic association with gene expression implicated FLCN as a disease gene for diabetic retinopathy.

eLife digest

One of the side effects of diabetes is loss of vision from diabetic retinopathy, which is caused by injury to the light sensing tissue in the eye, the retina. Almost all individuals with diabetes develop diabetic retinopathy to some extent, and it is the leading cause of irreversible vision loss in working-age adults in the United States. How long a person has been living with diabetes, the extent of increased blood sugars and genetics all contribute to the risk and severity of diabetic retinopathy. Unfortunately, virtually no genes associated with diabetic retinopathy have yet been identified.

When a gene is activated, it produces messenger molecules known as mRNA that are used by cells as instructions to produce proteins. The analysis of mRNA molecules, as well as genes themselves, can reveal the role of certain genes in disease. The studies of all genes and their associated mRNAs are respectively called genomics and transcriptomics. Genomics reveals what genes are present, while transcriptomics shows how active genes are in different cells.

Skol et al. developed methods to study genomics and transcriptomics together to help discover genes that cause diabetic retinopathy. Genes involved in how cells respond to high blood sugar were first identified using cells grown in the lab. By comparing the activity of these genes in people with and without retinopathy the study identified genes associated with an increased risk of retinopathy in diabetes. In people with retinopathy, the activity of the folliculin gene (FLCN) increased more in response to high blood sugar. This was further verified with independent groups of people and using computer models to estimate the effect of different versions of the folliculin gene.

The methods used here could be applied to understand complex genetics in other diseases. The results provide new understanding of the effects of diabetes. They may also help in the development of new treatments for diabetic retinopathy, which are likely to improve on the current approach of using laser surgery or injections into the eye.

Introduction

Almost all individuals with diabetes will develop some form of diabetic retinopathy over time (National Diabetes Fact Sheet, 2011). In the United States diabetic retinopathy is the most frequent cause of blindness among working age individuals (Centers for Disease Control and Prevention, 2018). Interindividual variation contributes significantly to susceptibility of the severe manifestations of diabetic retinopathy, which results in vision impairment. Epidemiological studies suggest that phenotypic variation is influenced by two primary risk factors: the duration of diabetes and an individual’s level of glycemia (HbA1c) (DCCT/EDIC Research Group et al., 2017). However, these two factors do not completely explain an individual’s risk for developing diabetic retinopathy. For instance, a common anecdotal clinical experience is the comparison of patients with similar durations of diabetes and similar levels of glycemic control who have tremendously disparate clinical outcomes for diabetic retinopathy. Moreover, some individuals with diabetes develop very minimal retinopathy (Sun et al., 2011), whereas others clearly seem to have a predisposition for severe retinopathy (Gao et al., 2014).

Together, these observations in conjunction with the high concordance of diabetic retinopathy between family members support an underlying genetic mechanism. Familial aggregation and twin studies estimate that genetic factors account for 25–50% of an individual’s risk of developing severe diabetic retinopathy (Arar et al., 2008; Hietala et al., 2008). Unfortunately, little is known about the genetic architecture that contributes to susceptibility for diabetic retinopathy. Genetic studies suggest that it is a highly polygenic trait influenced by multiple genetic variants of small effect. Our group and others have performed genome-wide association studies to better delineate the molecular factors that predispose to diabetic retinopathy (Grassi et al., 2011; Grassi et al., 2012; Pollack et al., 2019). However, these studies have had limited success, likely due to insufficient study sample sizes and the phenotypic heterogeneity of diabetic retinopathy.

Notably, like other complex disease traits including age-related macular degeneration (Fritsche et al., 2014; Fritsche et al., 2016), a majority of genetic variants nominally associated with diabetic retinopathy are located in intronic or inter-genic regions (Risch and Merikangas, 1996). Most of these variants appear to play critically important functional roles in regulating gene expression. In fact, several of the top associated SNPs identified in our meta-GWAS of diabetic retinopathy (Grassi et al., 2011) are present in DNase hypersensitivity sites and affect gene expression levels by altering the allelic chromatin state or the binding sites of transcription factors (Maurano et al., 2012).

The observation that disease-associated genetic loci often influence gene expression levels (Gamazon et al., 2018) led us to postulate that integrating gene expression with genetic association would be a powerful approach to identify susceptibility genes for diabetic retinopathy. We hypothesized that cell lines derived from individuals with diabetes with and without retinopathy could be used to uncover genetic variation that explain individual differences in the response to diabetes. Culturing two sets of cell lines under controlled, identical conditions from individuals with diabetes who did and those who did not develop retinopathy could unmask molecular differences in how these groups respond to glucose (Grassi et al., 2016; Grassi et al., 2014). We presumed that a portion of those differences would have a genetic basis.

In this article, we identify genes whose expression responds differently to glucose in cells derived from T1D individuals with and without diabetic retinopathy. We show that one of these genes, folliculin (FLCN), is causally implicated in diabetic retinopathy based on results from genetic association testing and Mendelian randomization.

Results

Individuals with retinopathy (PDR) show differences in diabetes duration and level of glycemia compared to individuals without retinopathy (nDR)

Matched DCCT/EDIC participants (for age, sex, treatment group, cohort, and diabetes duration) from whom the gene expression profiling was obtained are detailed in Supplementary file 1a. All individuals had T1D and were Caucasian, and 60% were female. As anticipated, notable differences were observed between individuals with and without retinopathy (PDR vs. nDR) for mean duration of diabetes (53 ± 43.4 months vs. 27 ± 13.4 months). as it was also not possible to completely match participant pairs for this covariate or for level of glycemia (HbA1c).mean HbA1c (9.71 ± 2.37 vs. 7.62 ± 1.07) given their significant impact on retinopathy.

Interindividual variation is evident in the transcriptional response to glucose

We quantified gene expression levels from lymphoblastoid cell lines (LCLs) of all study individuals (nDM, PDR, and nDR) in both standard glucose (SG) and high glucose (HG) conditions and determined the genome-wide transcriptional response to glucose for each individual (RGall). We observed that 22% of 11,548 examined genes demonstrated a differential response in expression between the two conditions (true positive rate; π1 = 0.22) (Storey and Tibshirani, 2003Figure 1—figure supplement 1), with 299 of those at an FDR < 0.05 (Figure 1a), supporting a significant impact of glucose on the LCL transcriptome. We confirmed that interindividual transcriptome response to HG is greater than the intraindividual response (p=2×10−16) (Figure 1—figure supplement 2, Figure 1—figure supplement 3, Figure 1—figure supplement 4). Interestingly, TXNIP, the most highly glucose-inducible gene in multiple cell types (Devi et al., 2017; Chen et al., 2016), exhibited the largest (log2(FC) difference = 0.2) and most significant (p=3.2×10−12, FDR = 5.1×10−8) transcriptional response to glucose. Pathway analysis using gene set enrichment analysis (GSEA) revealed dramatic upregulation of genes involved in structural changes to DNA (DNA packaging, FDR < 0.0001) and in genes such as transcription factors that modulate the cellular response to environmental stimuli (protein DNA complex, FDR < 0.0001) (Figure 1b). Conversely, genes that modulate the cellular response to infection were considerably downregulated (type 1 interferon, FDR < 0.0001; gamma interferon, FDR < 0.0001; leukocyte chemotaxis genes, FDR < 0.0001) potentially supporting earlier work that chronic glucose exposure depresses cellular immune responsiveness (Delamaire et al., 1997; Al-Mashat et al., 2006).

Figure 1 with 4 supplements see all
Response to glucose.

(a) Volcano plot summarizing transcriptional response to glucose for all 22 individuals (RGAll consisting of nDM, nDR, and PDR individuals). Each point represents a single gene. Red indicates genes showing a differential response (FDR < 0.05; log10 >1.3 represented by the dotted line) and an absolute log2FC >0.17. Adj p-value is false discovery rate (FDR). FC indicates expression fold change with positive values indicating higher expression in the high glucose condition relative to the standard condition. Source and code files for this plot are available in Figure 1—source data 1 and Source code 1. An additional source file can be found in Gene Expression Omnibus (GEO) at https://www.ncbi.nlm.nih.gov/geo/ under accession code GSE146615. (b) QQ (quantile-quantile) plot plot summarizing GSEA of transcriptional response to glucose in all 22 individuals. Pathways are classified as upregulated (red) or downregulated (blue) in response to glucose. Only significant GO categories (FDR < 0.1%) are labeled. Red line indicates the null expectation. Source and code files for this plot are available in Figure 1—source data 2, 3, 4, and 5 and Source code 2.

Individuals with diabetic retinopathy exhibit a differential transcriptional response to glucose

We observed differences in the transcriptional response to glucose between matched individuals with and without diabetic retinopathy (RGpdr–ndr). Principal component analysis (PCA) demonstrated that the observed interindividual variance is dominated by randomized DCCT treatment (intensive vs. conventional) group effects based on retinopathy status (p=3×10−6) (Figure 2—figure supplement 1) and is not confounded by LCL growth rate (p>0.05) or EBV-(Epstein Barr virus) copy number (p>0.05). Using a gene-wise analysis we identified 103 genes exhibiting a differential glucose response between individuals with and without retinopathy (p<0.01) (Figure 2; Supplementary file 1b). Some of these genes and pathways have previously been shown to play a role in diabetic retinopathy. One of the top differential response genes was IL1B (p=0.008, log2(FC) response difference = 0.289). Expression of IL1B has been previously reported to be induced by HG (Shanmugam et al., 2003). Additionally, the expression of IL1B is upregulated in the diabetic retina and has been implicated in the pathogenesis of diabetic retinopathy (Liu et al., 2012). Likewise, the top GSEA pathway has also previously been implicated in the pathogenesis of diabetic retinopathy. We identified PDGF signaling as the most significant differential response pathway (FDR = 0.012) (Figure 2—figure supplement 2). Elevated levels of PDGF are present in the vitreous of individuals with proliferative diabetic retinopathy (PDR) compared to individuals without diabetes (Freyberger et al., 2000). As PDGF is required for normal blood vessel maintenance, it is thought to contribute to the pericyte loss, microaneurysms, and acellular capillaries that are key features of the diabetic retina (Hammes et al., 2002). Interestingly, despite our model utilizing lymphoblastoid cells, it was able to reveal the upregulation of PDGF which is primarily a vascular factor that also plays a key role in neuronal tissue.

Figure 2 with 2 supplements see all
Differential transcriptional response to glucose among individuals with diabetes with and without retinopathy.

Volcano plot summarizing genes exhibiting a differential response to glucose between individuals with diabetes with and without retinopathy (RGPDR–nDR). The difference in FC between groups is represented on the X-axis and p-value of this difference on the Y-axis. Red indicates the 103 genes showing the most differential expression between individuals with and without retinopathy (p<0.01). FC, fold change. Source and code files for this plot are available in Figure 2—source data 1 and Source code 5. An additional source file for this plot can be found in Gene Expression Omnibus (GEO) at https://www.ncbi.nlm.nih.gov/geo/ under accession code GSE146615.

Genetic association reveals that some genes with differential response to glucose play a role in susceptibility to diabetic retinopathy

We sought to assess whether the most significant differential response genes (RGpdr–ndr) could yield novel insights into diabetic retinopathy. An overview of our approach is presented in Figure 3a. First, we selected the top 103 genes (p<0.01) that showed the largest difference in gene expression response to glucose between individuals with diabetes with and without retinopathy. We next identified all of the significant expression quantitative trait loci (eQTLs) for these genes in GTEx (version 7) (GTEx Consortium, 2015). In total, we found 7253 unique eQTL SNPs (hereafter referred to as eSNPs) in at least one of the 48 tissues investigated by GTEx. Differential response genes are more likely to harbor eSNPs, and hence be eGenes, compared to the genome-wide average (p=2.0×10−16) (Figure 3—figure supplement 1). This suggests that differential response genes are more likely to be genetically regulated and may contribute to interindividual differences in the development of diabetic retinopathy. To test if the eSNPs for the 103 differential response genes were more associated with diabetic retinopathy than expected, we evaluated the association between the 7253 differential response gene eSNPs and diabetic retinopathy using our published GWAS of diabetic retinopathy (Grassi et al., 2011). The 7253 eSNPs from the differential response genes are enriched for association with diabetic retinopathy (FDR < 0.05) (Figure 3b). To further assess the significance of this enrichment, we performed permutation testing of eSNPs from random sets of 103 genes which demonstrated that less than 1% contained the same proportion of similarly skewed GWAS p-values (Figure 3—figure supplement 2). The eSNPs for differential response genes were enriched among diabetic retinopathy meta-GWAS p-values relative to all eSNPs (p=0.0012) and all SNPs (p=0.0023) (Figure 3c). Thus, some of the genes exhibiting a differential response to glucose (RGpdr–ndr) are associated with the development of severe diabetic retinopathy.

Figure 3 with 2 supplements see all
Association of glucose differential response genes (RGpdr–ndr) with diabetic retinopathy.

(a) Workflow of analytical steps integrating glucose differential response genes with genetic association with diabetic retinopathy. Flow chart showing key experimental steps based on stepwise findings. (b) QQ plot revealing a skew away from the null and above the FDR 0.05 threshold suggests that expression of some of the glucose response genes may be causally related to diabetic retinopathy. 7253 GTEx eSNPs were generated from the 103 differential response genes and tested for their association with diabetic retinopathy in a GWAS. Observed vs. expected p-values are plotted. The null hypothesis of no difference between the observed and expected p-values is represented by the red line. No influence of population structure or other design factors was observed (genomic control inflation estimate λGC = 1.005) (Devlin and Roeder, 1999). Source and code files for this plot are available in Figure 3—source data 1 and Source code 7. (c) Bar plot comparing frequency of p-values <0.05 in diabetic retinopathy GWAS of: all eSNPs, all SNPs, and eSNPs from the 103 differential response genes. An excess of GWAS p-values of <0.05 is observed in the eSNPs from the glucose differential response genes (p=0.0012 vs. all eSNPs and p=0.0023 vs. all SNPs). The proportion of SNPs with p<0.05 in the all SNPs, all eSNPs, and 103 differential response gene eSNPs are 0.0505, 0.0499, and 0.0571, respectively. Source and code files for this plot are available in Figure 3—source data 2 and Source code 8.

FLCN is a putative diabetic retinopathy disease gene

The most significant retinopathy-associated eSNP among the set of 7253 eSNPs tested is rs11867934 (Figure 4a); FDR < 0.05; meta-GWAS p=6.7×10−6<Bonferroni adjusted p-value of 6.9 × 10−6; OR = 0.86, 95% confidence interval (CI) = 0.71,1.00; minor allele frequency = 0.22. rs11867934 is an intergenic eSNP for FLCN in multiple biologically relevant tissues including artery and nerve. We confirmed FLCN expression in the retina of human donor eyes (Figure 4—figure supplement 1). In the LCLs derived from individuals with diabetes, FLCN was upregulated in response to glucose to a greater extent in individuals with diabetic retinopathy than in individuals with diabetes without retinopathy (log2FC difference = 0.27, p=2.5×10−3) (Figure 2, Supplementary file 1b, and Figure 4—figure supplement 2). eQTLs in retina have recently been mapped (Ratnapriya et al., 2019). We determined that at least 43% of retina eQTLs are also eQTLs in GTEx LCLs. Examining the genome-wide association signal for a disease from eQTLs in aggregate can be a more powerful strategy to discern a heterogenous genetic signal than testing each of these SNPs individually. We collated all the eSNPs for FLCN in the retina. We assessed the aggregated association of FLCN eSNPs (n = 272 eSNPs significant in the retina and 20 or more GTEx tissues) with diabetic retinopathy in the meta-GWAS and observed an enrichment for association with diabetic retinopathy (π1 = 0.9; Figure 4b, Figure 4—figure supplement 3). We then validated the FLCN association with diabetic retinopathy in a third cohort, the UK Biobank (UKBB) (Supplementary file 1c), and found that the FLCN eSNPs were enriched for association with diabetic retinopathy in the UKBB (π1 = 0.73) (Figure 4—figure supplement 4).

Figure 4 with 4 supplements see all
Diabetic retinopathy meta-GWAS for eSNPs of differential response genes to glucose.

(a) Manhattan plot of the results of the meta-GWAS for diabetic retinopathy showing association signals for the eSNPs from the differential response genes to glucose for individuals with and without retinopathy (RGPDR–nDR). Threshold lines represent Bonferroni correction (blue) and FDR < 0.05 (black). Association testing for diabetic retinopathy performed with 7253 eSNPs representing 103 differential response genes to glucose. Source and code files for this plot are available in Figure 4—source data 1 and Source code 11. (b) Bar plot comparing the true positive rate (π1), TPR, for association of diabetic retinopathy with all SNPs, all eSNPs, eSNPs from the 103 differential response genes to glucose (n = 7253), and eSNPs found in retina and >20 GTEx tissues for folliculin (FLCN) (n = 272). TPR is an estimate of the proportion of tests that are true under the alternative hypothesis. Plot reveals significant enrichment for glucose response gene eSNPs in general and for FLCN eSNPs (π1 = 0.9) in particular. Source and code files for this plot are available in Figure 4—source data 2, 3, and 4, and Source code 12.

We applied Mendelian randomization to assess whether the level of FLCN expression affects the development of diabetic retinopathy. We first imputed retinal FLCN expression in the UKBB, and then estimated the effects of the estimated FLCN expression on diabetic retinopathy using summary data-based Mendelian randomization analysis (Zhu et al., 2016) (SMR). Mendelian randomization treats the genotype as an instrumental variable. A one standard deviation (SD) increase in the predicted retinal expression of FLCN increases the risk of diabetic retinopathy by 0.15 SD (95% CI: 0.02–0.29, standard error 0.07, p=0.024). Individuals with diabetes with high predicted retinal FLCN expression have increased odds of developing retinopathy (1.3 OR increase per SD increase in FLCN expression) (Chinn, 2000). We did not observe any evidence of horizontal pleiotropy (in which FLCN eSNPs are independently associated with both FLCN expression and diabetic retinopathy) confounding the analysis [HEIDI p>0.05 (p=0.2)] (Zhu et al., 2016). We detected an aggregated effect of 14 independent FLCN eQTLs (r2 < 0.2) on the development of diabetic retinopathy through FLCN expression using multi-SNP Mendelian randomization (p=0.04) (Wu et al., 2018a). Together, these findings support the presence of genetic variation at the FLCN locus affecting both FLCN expression and the development of diabetic retinopathy through the expression of FLCN.

Discussion

The cellular response to elevated glucose is an increasingly important pathway to understand in light of the emerging epidemic levels of diabetes worldwide (National Diabetes Fact Sheet, 2011). Variations in the cellular response to glucose at a molecular level have not been well characterized between cell types, and to an even lesser degree between individuals. In prior work, we characterized robust, repeatable interindividual differences in transcriptional response to glucose in LCLs of individuals with diabetic retinopathy (Grassi et al., 2016). As an LCL generated from each individual is genetically unique, it follows that the gene expression response to glucose between individuals should be phenotypically heterogeneous and that a portion of the interindividual variability will be genetically determined. We hypothesized that interindividual variation in the cellular response to glucose may reveal clues to the genetic basis of diabetic retinopathy, thereby providing insights into its predisposition.

We demonstrated that different individual-derived cell lines treated under identical culture conditions reveal an individual-specific transcriptional response to glucose and this signal far exceeds accompanying experimental noise. Transformation and multiple freeze/thaw passages do not homogenize the individualized response to HG-induced gene expression in LCLs. Analyzing the individual glucose-stimulated transcriptional response revealed several insights into the pathophysiology of the diabetic state and how it relates to the development of retinopathy. For instance, TXNIP was identified as the top differential response gene to glucose in all individuals (RGall). TXNIP is a key marker of oxidative stress. It is upregulated in the diabetic retina where it induces Muller cell activation (Devi et al., 2017). HG treatment has been shown to increase TXNIP expression (Chen et al., 2016). TXNIP is a glucose sensor whose expression has been strongly associated with both hyperglycemia and diabetic complications. Specifically, the TXNIP locus was differentially methylated in the primary leukocytes of EDIC cases and controls (Chen et al., 2016). A key mechanism by which cells respond to stress is through changes in genome configuration. Conformational alterations in DNA packaging influence the accessibility of DNA for transcription. Structural changes in DNA conformation facilitate cellular adaptation and response to stimuli which can enable transcriptional changes. The GSEA showed that the cellular response to chronic glucose stress involves alterations in DNA accessibility which facilitates the gene expression response to this environmental stimulus (Smith and Workman, 2012). The transcriptional response to glucose in part manifests as diminished immune responsiveness, a well-characterized feature of diabetes (Shanmugam et al., 2003; Mowat and Baum, 1971).

Further, we considered that the genetic component of an individual’s response to glucose may influence their susceptibility to diabetic complications like retinopathy. Cell lines from individuals with diabetes with and without retinopathy reveal differences in the response to glucose at a molecular level. In addition, not only were some of these differential response genes biologically relevant to diabetic retinopathy as exemplified by IL1B and PDGF, but also many had a genetic basis for their differential response. By integrating the gene expression findings with GWAS data, we implicated FLCN as a putative disease gene in diabetic retinopathy. Mendelian randomization provided evidence that genetic variation affects diabetic retinopathy through alterations in FLCN expression thereby suggesting that FLCN expression is a mediator of diabetic retinopathy. FLCN is a biologically plausible diabetic retinopathy disease gene since its expression is present in both neuronal and vascular cells of the retina. Current evidence suggests that FLCN is a negative regulator of AMPK which helps to modulate the energy sensing ability of AMPK and plays a role in responding to cellular stress (Hasumi et al., 2012). AMPK plays an important role in providing resistance to cellular stresses by regulating autophagy and cellular bioenergetics to avoid apoptosis. Loss of FLCN results in constitutive activation of AMPK. Higher levels of FLCN would suggest less cellular capacity to deal with stress (Possik et al., 2014). Interestingly, the protective effect of agents such as metformin and fenofibrate on diabetic retinopathy might be mediated through AMPK (Kim et al., 2007; Joe et al., 2015).

Our study design had several advantages over prior approaches aimed at revealing the genetic basis of diabetic retinopathy. First, we utilized white blood cells which are readily accessible from the peripheral circulation of human patients (Epidemiology of Diabetes Interventions and Complications (EDIC) Research Group, 1999) and can reveal differential molecular characteristics depending on the stage of diabetic retinopathy (Tang and Kern, 2011; Gubitosi-Klug et al., 2008; Kern, 2007). LCLs are derived from white blood cells making them a relevant cellular population to study for diabetic retinopathy. LCLs have been shown to be a powerful model system for functional genetic studies in humans (Tang and Kern, 2011; Kern, 2007). Second, an LCL was generated for every individual enrolled in the landmark DCCT/EDIC study. DCCT/EDIC is the best-characterized prospective interventional cohort ever created to follow systemic complications of long-standing diabetes. DCCT/EDIC allows for detailed stratification of individuals, each of whom has had extensive prospective clinical phenotyping. Third, glucose was employed to elicit a provocative response in LCLs. By focusing on a secondary sequela of diabetes like retinopathy, the cellular response to glucose stimulation through transcription became a meaningful and directly relevant reflection of the stress each cell in the body encounters from diabetes. Insights into glucose-stimulated gene expression in LCLs have broad applicability to multiple tissues of interest for diabetic complications (even in the retina as we have shown) due to significant evidence supporting a shared framework for gene regulation among tissues (GTEx Consortium, 2015). Finally, disease-associated eQTL provide functional insights into the pathogenesis of a condition. We show that altering the levels of FLCN expression impacts risk of diabetic retinopathy. Aggregating independent eQTLs for the same gene (that are not in high linkage disequilibrium) revealed an enriched association that may otherwise have been missed by a conventional GWAS approach (Wu et al., 2018b). Treating the associated eQTL as an instrumental variable, Mendelian randomization supported the potential causality of FLCN in the pathogenesis of the disease. Inherently, this approach yielded all three M’s of target modulation: mechanism, magnitude, and markers (Plenge, 2019).

The present work had inherent limitations. First, LCLs are not primary cells but rather a transformed cell line. The Multiple Tissue Human Expression Resource (MuTHER) LCL study revealed a large impact of common environmental exposure, stemming from shared sample handling, on gene expression in twin LCLs (Wright et al., 2014). The significant correlation of these extrinsic factors on LCL gene expression emphasizes the importance of randomization and biological replicates which we implemented in this study. Moreover, as a cell line, heterogeneous genomic alterations have been identified in lymphoblastoid cells that increase with passaging, thereby raising the concern that this can lead to variability in their transcriptome (Ben-David et al., 2018). Importantly, the EDIC cell lines employed in this study were only passaged once previously. Additionally, genomic changes have only a minor effect on genotypic frequencies with a 99.63% genotype concordance between lymphoblastoid cells and their parent leukocytes. Mendelian error rates in levels of heterozygosity are not significantly different between LCLs and their primary B-lymphocyte cells of origin (McCarthy et al., 2016). Second, it is not possible to delineate cause from effect in gene expression studies. Gene expression changes may be causal, epiphenomena, or due to reverse causality (the disease causing the gene expression changes rather than the other way around). In this study, by integrating genetic analyses with gene expression and recognizing that variation in the underlying genome precedes disease onset and can therefore be considered an instrumental variable, we identified through Mendelian randomization potentially causal gene expression changes in FLCN that act as a mediator for retinopathy thereby avoiding the trap of reverse causality. Finally, eQTL found in LCLs may not be relevant to diabetic retinopathy. As noted previously we found 43% of retina eQTL are shared with LCLs. We demonstrated that independent FLCN eQTLs found both in the retina and GTEx tissues showed an enriched association with diabetic retinopathy, a finding that was replicated in a large independent cohort from the UKBB. For complex trait associations in general and for those specifically in the retina, eQTL that are shared between tissues explain a greater proportion of associations than tissue-specific eQTL (Gamazon et al., 2018). For instance, shared tissue eQTL are enriched among genetic associations with age-related macular degeneration, another common retinal disease, despite the high tissue specificity of the disease (Ratnapriya et al., 2019; Unlu et al., 2019).

In summary, integration of gene expression from a relevant cellular model with genetic association data provided insights into the functional relevance of genetic risk for a complex disease. Using disease-associated differential gene and eQTL-based genome-wide association testing, we identified possible causal genetic pathways for diabetic retinopathy. Specifically, our studies implicated FLCN as a putative diabetic retinopathy susceptibility gene. Future work that incorporates more extensive molecular profiling of the cellular response to glucose in conjunction with a greater number of cell lines may yield further insights into the underlying genetic basis of diabetic retinopathy.

Materials and methods

Overview

In this study we profiled the transcriptomes of cell lines derived from 22 individuals (seven individuals with no diabetes [nDM], eight with T1D with PDR, and seven with T1D with no retinopathy [nDR]) utilizing gene expression microarrays to characterize the transcriptional response to glucose. Specifically, we cultured LCLs derived from each individual in SG and HG medium and measured gene expression for each gene in each sample, as well as the difference (Δ = response to glucose [RG]) in each gene’s expression for each individual (Figure 5a). We compared the differential response in gene expression to glucose for all individuals with and without proliferative retinopathy. ‘Differential response’ in gene expression refers to the difference in gene expression response to glucose between groups. Specifically, we identified genes with a significant differential response in expression between individuals with diabetes with and without PDR (RGpdr–ndr). We followed up genes showing differential response using the results of both a prior genome-wide association study (GWAS) meta-analysis of diabetic retinopathy (in the GoKinD and EDIC cohorts) (Grassi et al., 2011) and the results of a multi-tissue eQTL analysis from GTEx (GTEx Consortium, 2015) to identify potential diabetic retinopathy susceptibility genes (Figure 5b).

Experimental design.

(a) Schematic representation of the experimental design for transcriptomic profiling. Lymphoblastoid cell lines (LCLs) from 22 individuals were cultured under both standard glucose (SG) and high glucose (HG) conditions. Gene expression was quantified using microarrays for three biological replicates of each LCL in each condition. The response to glucose was determined for all genes on a per-individual basis, by comparing expression in SG and HG conditions. The cell lines were derived from individuals with diabetes and no retinopathy (7), individuals with diabetes and proliferative diabetic retinopathy (8), and individuals without diabetes (7). (b) We identified 15 individuals based on retinopathy status from the Epidemiology of Diabetes Interventions and Complications (EDIC) cohort. We compared the differential response in gene expression to glucose for individuals with and without proliferative retinopathy (RGpdr–ndr). Expression quantitative trait loci (eQTL) for those genes that showed the greatest differential response between individuals with and without retinopathy were tested for their genetic association with diabetic retinopathy.

Participant safety and confidentiality issues

Request a detailed protocol

All cell lines were de-identified prior to their arrival at the University of Illinois at Chicago; therefore, this proposal qualified as non-human subjects research according to the guidelines set forth by the institutional review board at the University of Illinois at Chicago. As the data were analyzed anonymously, no participant consent was required. DCCT participants previously provided consent for their samples to be used for research. Matching of participants was performed at George Washington University Biostatistics Center and did not involve protected health information as the phenotypic data were de-identified. The George Washington University institutional review board has approved all analyses of EDIC data of this nature. All protocols used for this portion of the study are in accordance with federal regulations and the principles expressed in the Declaration of Helsinki. Specific approval of the study design and plan was obtained from the EDIC Research Review Committee.

Cell lines

Request a detailed protocol

Twenty-two LCLs were used in the study as described previously (Grassi et al., 2016). Briefly, we included 15 of the 1441 LCLs generated from individuals with type 1 diabetes from the DCCT/EDIC cohort (The Writing Team for the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications Research Group, 2002; Epidemiology of Diabetes Interventions and Complications (EDIC) Research Group, 1999), consisting of eight matched cases with PDR and seven without retinopathy (nDR) as the controls (Grassi et al., 2016Supplementary file 1d). Whole blood samples were ascertained from DCCT study participants between 1991 and 1993. White blood cells from the samples were transformed into LCLs in the early 2000s. The 15 LCLs from individuals with diabetes consisted of matched cases and controls. Cases were defined by the development of PDR by EDIC Year 10 (2004), whereas controls had no retinopathy through EDIC Year 10 (2004). Retinopathy grade was determined by seven-field stereoscopic photos. Control participants had an ETDRS (Early Treatment Diabetes Retinopathy Score) of 10 and case participants had an ETDRS score of ≥61. All eight pairs were matched by age, sex, treatment group (intensive vs. conventional), cohort (primary vs. secondary), and diabetes duration (The DCCT Research Group, 1986; The diabetes control and complications trial, 1995), except one pair that was matched by age, sex, and treatment group only. Diabetes duration was defined as the number of months since the diagnosis of diabetes at DCCT baseline which was the time at participant enrollment (1983–1989). For the seven pairs matched on duration, four pairs were matched by duration quartiles (baseline duration 0–4 years, 4–8 years, 8–12 years, or 12–15 years) and three pairs were matched by duration halves (<8 years vs. ≥ 8 years). Matching by age was done similar to duration: four pairs by quartile (<21 years, 21–25 years, 26–31 years, and ≥31 years) and the remaining four by halves (<26 years vs. ≥26 years). The remaining seven LCLs were purchased from the Coriell Institute for Medical Research NIGMS Human Genetic Cell Repository (http://ccr.coriell.org/) (GM14581, GM14569, GM14381, GM07012, GM14520, GM11985, and GM07344). None of these individuals had a history of diabetes (nDM). The covariates available for these seven individuals were age and sex; male and female individuals were included. All of these individuals without diabetes were unrelated and of European ancestry (Grassi et al., 2016; Grassi et al., 2014Supplementary file 1d). All 22 cell lines underwent Hoechst staining to ensure they were free from mycoplasma contamination.

Culture conditions

Request a detailed protocol

All LCLs were maintained in conventional lymphocyte cell culture conditions of RPMI 1640 with 10% FBS in a 25 cm2 cell culture flask. The cells were incubated at 37°C in 5% CO2 and the media was changed twice each week. Prior to the experiments (below), lymphoblastoid cells following serum starvation were passaged for a minimum of 1 week in either SG RPMI 1640 (11 mM glucose) or HG RPMI media (30 mM glucose) (Caramori et al., 2012).

Gene expression profiling

Request a detailed protocol

Quality control from RNA extraction was performed using the Agilent bio-analyzer, processed using the Illumina TotalPrep−96 RNA Amplification Kit (ThermoFisher 4393543), hybridized to Illumina HT12v4 microarrays (Catalog number: 4393543), and scanned on an Illumina HiScan scanner (Du et al., 2008; Lin et al., 2008). For each of the 22 individuals, three biological replicates were profiled, with each sample assessed at both SG conditions (11 mM of glucose) and HG conditions (30 mM of glucose). Biological replicates were split from the same mother flask; cells were grown in separate flasks and run on different microarray plates on different days. Each biological replicate was generated from a separate frozen aliquot of that cell line. The gene expression profiling was performed in a masked fashion for both the case/control (PDR, nDR, and nDM) status of the individual and the glucose treatment (SG and HG) of the sample in order to reduce any bias.

Relative EBV copy number

Request a detailed protocol

Standard TaqMan qPCR was performed using EBV and NRF1 probes and primers (Choy et al., 2008). To calculate real-time PCR efficiencies a standard curve of 10 points of twofold dilution of 156.7 ng of gDNA was used from the Raji cell line (ATCC CCL-86). Probes were designed for the target, EBV, and a reference gene, NRF1. Final concentrations of the probes and primers were 657 nM and 250 nM, respectively. EBV probe: 5′6FAM-CCACCTCCACGTGGATCACGA-MGBNFGQ3′; EBV forward primer: 5′ GAGCGATCTTGGCAATCTCT; EBV reverse primer: 5′ AGTAGCCAGGCACCTACTGG; NRF1 probe: 5′VIC-CACTGCATGTGCTTCTATGGTAGCCA-MGBNFQ3′; NRF1 forward primer: 5′ ATGGAGGAACACGGAGTGAC; NRF1 reverse primer: 5′ CATCAGCTGCTGTGGAGTTG. Cycle number of crossing point versus DNA concentration were plotted to calculate the slope. The real-time PCR efficiency (E) was calculated according to the equation: E = 10 (−1/slope). Triplicates were done for each data point. Genomic DNA (78.3 ng) from each LCL was used in a standard TaqMan qPCR reaction with EBV as target gene and NRF1 as reference gene. The sequences and concentrations of the probes and primers were as shown above.

Growth rate measurement

Request a detailed protocol

LCLs were thawed and cultured in RPMI and 10% FBS until they reached over 85% cell viability. Cells were seeded in a T25 flask. Two replicates were performed per cell line. Cells were counted every day or every other day for 5–10 days and recorded.

Quality control for gene expression

Request a detailed protocol

The gene expression data comprised a total of 144 samples from 22 individuals (three replicates per individual and treatment, except for three individuals with five replicates). Gene expression was assessed in two conditions, SG and HG, and generated in four batch runs that were carefully designed to minimize potential batch effects. BeadChip data were extracted using GenomeStudio (version GSGX 1.9.0) and the raw expression and control probe data from the four different batches were preprocessed using a lumiExpresso function in the lumi R package version 2.38.0 (Grassi et al., 2011; Grassi et al., 2012) in three steps: (i) background correction (lumiB function with the bgAdjust method); (ii) variance stabilizing transformation (lumiT function with the log2 option); and (iii) normalization (lumiN function with the robust spline normalization [rsn] algorithm that is a mixture of quantile and loess normalization). To remove unexpressed probes, we applied a detection filter to retain probes with strong true signal by applying Illumina BeadArrays detection p-values <0.01 followed by removing probes that did not have annotated genes, resulting in a total of 15,591 probes.

Gene expression analysis

Request a detailed protocol

The study design is portrayed in Figure 5a. For a given individual Si (i = 1,…,22) and gene Gk (k = 1,...,15591), we calculated ∆i,k = HGi,k− SGi,k, where ∆ is the individual’s response to glucose (RG), HG is gene expression in high glucose culture, and SG is gene expression in standard glucose culture. All replicate data were fit using a mixed model that accounted for the correlation between repeated measures within individuals. The design matrix was constructed and analysis performed using the R version 3.5.1 package limma (Ritchie et al., 2015). We built a design matrix using the model.matrix function, and accounted for correlation between biological triplicates using limma’s duplicate correlation function. A mixed linear model was then fit that incorporates this correlation and ∆i,k using the lmFit function. PCA of gene expression was run with the prcomp function in R (Becker, 1988). For each gene, we calculated moderated t- and f-statistics and log-odds of expression by empirical Bayes moderation of the standard errors toward a common value. Differential response reflects fold change (FC) differences between matched individuals in the two groups in their paired response to glucose. The power to detect a 2 FC difference in gene expression between the two retinopathy groups (retinopathy vs. no retinopathy) (RGpdr–ndr) with a paired analysis given our sample size and using a type I error rate of 0.05 is 95% (as supported by our prior work Grassi et al., 2016).

Gene set enrichment analysis

Request a detailed protocol

GSEA was performed using pre-ranked gene lists (Subramanian et al., 2005). We ranked all analyzed genes based on sign (fold change) × (–log10(p-value)) (Grassi et al., 2012). Duplicated genes were removed. The gene ranking resulted in the inclusion of 11,579 genes. Enrichment statistics were calculated using rank weighting and the significance of enrichment was determined using permutations performed by gene set. The gene sets included c2.all.v6.0 and c5.all.v6.0. The minimum gene set size was 15 and the maximum gene set size was 500. GSEA was used to identify significant gene sets for the response to glucose in all study participants (RGall: nDM + PDR + nDR).

Expression quantitative trait loci

Request a detailed protocol

To determine if the genes showing a differential response in gene expression (RGpdr–ndr) is driven by germline genetic variation, we tested if the eQTLs for these genes are enriched for small diabetic retinopathy GWAS p-values (Grassi et al., 2011). We use the term ‘differential response gene’ for those genes identified by the RGpdr–ndr analysis. All statistically significant eSNPs (false discovery rate [FDR] threshold of ≤0.05) (single nucleotide polymorphisms, SNPs, corresponding to cis-eQTLs from the GTEx and EyeGEx datasets) were collated for the glucose response genes in any of the 48 GTEx (version 7) tissues and the retina (GTEx Consortium, 2015; Ratnapriya et al., 2019). We use the term eGene for any gene with at least one significant eSNP in any tissue.

Genome-wide association study

Request a detailed protocol

Meta-analysis p-values were ascertained from our prior GWAS for diabetic retinopathy (Grassi et al., 2011). The study assessed the genetic risk of sight threatening complications of diabetic retinopathy as defined by the presence of diabetic macular edema or PDR (cases) compared to those without (controls) in two large type 1 diabetes cohorts of 2829 total individuals (973 cases and 1856 controls) taken from the Genetics of Kidney in Diabetes (GoKinD) and the Epidemiology of Diabetes Interventions and Complications study (EDIC) cohorts.

We sought to determine whether there is enrichment of small p-values for diabetic retinopathy meta-GWAS among the significant eQTLs for the glucose response genes that show a significant differential glucose response between individuals with and without retinopathy (RGpdr–ndr). We used Benjamini–Hochberg adjusted p-values (FDR) to account for multiple testing given the high level of linkage disequilibrium between many eSNPs within an eQTL. SNPs from the three studies (expression, eQTL, and GWAS) were matched by mapping all SNPs to dbSNP v.147 (Grassi et al., 2013). We determined the corresponding FDR for each glucose response gene’s eSNPs in the diabetic retinopathy meta-GWAS. The Bonferroni correction was used to establish the threshold for significance. To assess enrichment, we first determined the observed proportion of meta-GWAS FDR values <0.05 among the statistically significant eQTLs of the glucose response genes (RGpdr–ndr). Next, we took 10,000 random samples of 103 GTEx eGenes (genes with an eQTL in any GTEx tissue) and identified corresponding eSNPs across all GTEx tissues. We calculated the GWAS FDR for these eSNPs and recorded the proportion of FDR values <0.05.

Validation for the association of glucose response gene eSNPs with diabetic retinopathy was performed in the UKBB GWAS (Supplementary file 1c; Sudlow et al., 2015). Only individuals of northern European ancestry were analyzed. Quality control excluded individuals who were outliers based on relatedness, exhibited an excess of missing genotype calls, had more heterozygosity than expected, or had sex chromosome aneuploidy. A total of 337,147 individuals were available for analysis. Case participants were defined as those who answered ‘yes’ to questionnaire data eyesight field 6148 ‘Diabetes related eye disease’ (n = 2332). Our prior work validated the utility of self-report for the presence of severe diabetic retinopathy (Grassi et al., 2013; Grassi et al., 2009). Control participants were defined as those who answered ‘yes’ to data field 2443 ‘Diabetes diagnosed by doctor’ (n = 14,680), excluding case participants. SNPs were excluded according to the following: minor allele frequency <0.004; missing rate >0.015; HWE p<1×10−10; INFO score <0.8. We performed logistic regression as implemented in Plink2 (Chang et al., 2015) on this set of cases and controls. The logistic regression, including the following covariates: first 10 genotype-based principal components, chromosomal sex (as defined by XX, XY status), age, type of diabetes, HbA1c, and genotyping array type.

Mendelian randomization

Request a detailed protocol

To explore a possible causal effect of increased FLCN expression on diabetic retinopathy, we employed Mendelian randomization (Davies et al., 2019). Effects were estimated with summary data-based Mendelian randomization analysis (Zhu et al., 2016) (SMR). We estimated the effect of increasing levels of FLCN expression on diabetic retinopathy in the UKBB GWAS for diabetic retinopathy (described above) utilizing 272 SNPs that were significant cis-eSNPs (FDR ≤ 0.05) for FLCN in retina and also in at least 20 GTEx tissues. A total of 246 SNPs remained after removing those SNPs or their proxies (r2 > 0.8) not genotyped in the UKBB. For each individual, the exposure was based on the genetically predicted gene expression of FLCN in retina and the outcome was the likelihood of having diabetic retinopathy. Heterogeneity in dependent instruments (HEIDI) (Zhu et al., 2016) was used to investigate the possibility of confounding bias from horizontal pleiotropy with 14 independent (r2 < 0.2) FLCN eQTLs. As multiple independent (r2 < 0.2, n = 14) FLCN eQTLs exist, we also employed multi-SNP Mendelian randomization to assess for an aggregated effect (Wu et al., 2018a) of the eQTLs on diabetic retinopathy mediated through FLCN expression.

FLCN localization in human donor eye retina

Request a detailed protocol

A whole eye from a 69-year-old Caucasian female post-mortem donor without diabetes was obtained from National Disease Research Interchange (NDRI). Findings were replicated in an additional five post-mortem donors without diabetes from the NDRI. The eye was cut in half in a horizontal plane, and each half was placed in an individual cassette. Samples were processed on ASP300 S automated tissue processor (Leica Biosystems) using a standard overnight processing protocol and embedded into paraffin blocks. Tissue was sectioned at 5 µm, and sections were de-paraffinized and stained on BOND RX autostainer (Leica Biosystems) following a preset protocol. In brief, sections were subjected to EDTA-based (BOND ER2 solution, pH9) antigen retrieval for 40 min at 100°C, washed, and incubated with protein block (Background Sniper, Biocare Medical, BS966) for 30 min at room temperature. For immunofluorescence (IF), sequential staining with rabbit polyclonal anti-FLCN antibody (1:50, Abcam #ab93196) and mouse monoclonal anti-CD31 antibody (1:50, DAKO, M0823) was conducted using goat-anti-rabbit Alexa-488 and goat-anti-mouse Alexa-555 secondary antibodies (Molecular Probes) for detection. DAPI (Invitrogen, #D3571) was used to stain nuclei. The slides were mounted with ProLong Diamond Antifade mounting media (ThermoFisher, #P36961). Images were taken at 20× magnification on Vectra three multispectral imaging system (Akoya Biosciences). A spectral library acquired from mono stains for each fluorophore (Alexa-488, Alexa-594), DAPI, and human retina background fluorescence slide was used to spectrally unmix images in InForm software (Akoya Biosciences) for visualization of each color.

Data availability

Source files and code for all the figures and tables have been provided, except for drawings, flowcharts and histopathology findings. We have also included links and references where appropriate. Figure 3 source data 5 and 6 are available on Dryad at https://doi.org/10.5061/dryad.zkh18938j. Additional data files can be found here: microarray expression data at Gene Expression Omnibus (GEO) under accession code GSE146615 and diabetic retinopathy GWAS data at UKBB archive (https://biobank.ctsu.ox.ac.uk/crystal/docs.cgi?id=1).

The following data sets were generated
    1. Skol A
    2. Jung SC
    3. Sokovic AM
    4. Chen S
    5. Fazal S
    6. Sosina O
    7. Borkar PP
    8. Lin A
    9. Sverdlov M
    10. Cao D
    11. Swaroop A
    12. Bebu I
    13. Stranger BE
    14. Grassi MA
    (2020) Dryad Digital Repository
    Data from: Integration of genomics and transcriptomics predicts diabetic retinopathy susceptibility genes.
    https://doi.org/10.5061/dryad.zkh18938j
    1. Sokovic AM
    2. Grassi MA
    (2020) NCBI Gene Expression Omnibus
    ID GSE146615. Mendelian randomization identifies FLCN expression as a mediator of diabetic retinopathy.

References

  1. Book
    1. Becker RA
    (1988)
    The New S Language
    Wadsworth & Brooks/Cole.
    1. Fritsche LG
    2. Igl W
    3. Bailey JN
    4. Grassmann F
    5. Sengupta S
    6. Bragg-Gresham JL
    7. Burdon KP
    8. Hebbring SJ
    9. Wen C
    10. Gorski M
    11. Kim IK
    12. Cho D
    13. Zack D
    14. Souied E
    15. Scholl HP
    16. Bala E
    17. Lee KE
    18. Hunter DJ
    19. Sardell RJ
    20. Mitchell P
    21. Merriam JE
    22. Cipriani V
    23. Hoffman JD
    24. Schick T
    25. Lechanteur YT
    26. Guymer RH
    27. Johnson MP
    28. Jiang Y
    29. Stanton CM
    30. Buitendijk GH
    31. Zhan X
    32. Kwong AM
    33. Boleda A
    34. Brooks M
    35. Gieser L
    36. Ratnapriya R
    37. Branham KE
    38. Foerster JR
    39. Heckenlively JR
    40. Othman MI
    41. Vote BJ
    42. Liang HH
    43. Souzeau E
    44. McAllister IL
    45. Isaacs T
    46. Hall J
    47. Lake S
    48. Mackey DA
    49. Constable IJ
    50. Craig JE
    51. Kitchner TE
    52. Yang Z
    53. Su Z
    54. Luo H
    55. Chen D
    56. Ouyang H
    57. Flagg K
    58. Lin D
    59. Mao G
    60. Ferreyra H
    61. Stark K
    62. von Strachwitz CN
    63. Wolf A
    64. Brandl C
    65. Rudolph G
    66. Olden M
    67. Morrison MA
    68. Morgan DJ
    69. Schu M
    70. Ahn J
    71. Silvestri G
    72. Tsironi EE
    73. Park KH
    74. Farrer LA
    75. Orlin A
    76. Brucker A
    77. Li M
    78. Curcio CA
    79. Mohand-Saïd S
    80. Sahel JA
    81. Audo I
    82. Benchaboune M
    83. Cree AJ
    84. Rennie CA
    85. Goverdhan SV
    86. Grunin M
    87. Hagbi-Levi S
    88. Campochiaro P
    89. Katsanis N
    90. Holz FG
    91. Blond F
    92. Blanché H
    93. Deleuze JF
    94. Igo RP
    95. Truitt B
    96. Peachey NS
    97. Meuer SM
    98. Myers CE
    99. Moore EL
    100. Klein R
    101. Hauser MA
    102. Postel EA
    103. Courtenay MD
    104. Schwartz SG
    105. Kovach JL
    106. Scott WK
    107. Liew G
    108. Tan AG
    109. Gopinath B
    110. Merriam JC
    111. Smith RT
    112. Khan JC
    113. Shahid H
    114. Moore AT
    115. McGrath JA
    116. Laux R
    117. Brantley MA
    118. Agarwal A
    119. Ersoy L
    120. Caramoy A
    121. Langmann T
    122. Saksens NT
    123. de Jong EK
    124. Hoyng CB
    125. Cain MS
    126. Richardson AJ
    127. Martin TM
    128. Blangero J
    129. Weeks DE
    130. Dhillon B
    131. van Duijn CM
    132. Doheny KF
    133. Romm J
    134. Klaver CC
    135. Hayward C
    136. Gorin MB
    137. Klein ML
    138. Baird PN
    139. den Hollander AI
    140. Fauser S
    141. Yates JR
    142. Allikmets R
    143. Wang JJ
    144. Schaumberg DA
    145. Klein BE
    146. Hagstrom SA
    147. Chowers I
    148. Lotery AJ
    149. Léveillard T
    150. Zhang K
    151. Brilliant MH
    152. Hewitt AW
    153. Swaroop A
    154. Chew EY
    155. Pericak-Vance MA
    156. DeAngelis M
    157. Stambolian D
    158. Haines JL
    159. Iyengar SK
    160. Weber BH
    161. Abecasis GR
    162. Heid IM
    (2016) A large genome-wide association study of age-related macular degeneration highlights contributions of rare and common variants
    Nature Genetics 48:134–143.
    https://doi.org/10.1038/ng.3448
    1. Wu L
    2. Shi W
    3. Long J
    4. Guo X
    5. Michailidou K
    6. Beesley J
    7. Bolla MK
    8. Shu XO
    9. Lu Y
    10. Cai Q
    11. Al-Ejeh F
    12. Rozali E
    13. Wang Q
    14. Dennis J
    15. Li B
    16. Zeng C
    17. Feng H
    18. Gusev A
    19. Barfield RT
    20. Andrulis IL
    21. Anton-Culver H
    22. Arndt V
    23. Aronson KJ
    24. Auer PL
    25. Barrdahl M
    26. Baynes C
    27. Beckmann MW
    28. Benitez J
    29. Bermisheva M
    30. Blomqvist C
    31. Bogdanova NV
    32. Bojesen SE
    33. Brauch H
    34. Brenner H
    35. Brinton L
    36. Broberg P
    37. Brucker SY
    38. Burwinkel B
    39. Caldés T
    40. Canzian F
    41. Carter BD
    42. Castelao JE
    43. Chang-Claude J
    44. Chen X
    45. Cheng TD
    46. Christiansen H
    47. Clarke CL
    48. Collée M
    49. Cornelissen S
    50. Couch FJ
    51. Cox D
    52. Cox A
    53. Cross SS
    54. Cunningham JM
    55. Czene K
    56. Daly MB
    57. Devilee P
    58. Doheny KF
    59. Dörk T
    60. Dos-Santos-Silva I
    61. Dumont M
    62. Dwek M
    63. Eccles DM
    64. Eilber U
    65. Eliassen AH
    66. Engel C
    67. Eriksson M
    68. Fachal L
    69. Fasching PA
    70. Figueroa J
    71. Flesch-Janys D
    72. Fletcher O
    73. Flyger H
    74. Fritschi L
    75. Gabrielson M
    76. Gago-Dominguez M
    77. Gapstur SM
    78. García-Closas M
    79. Gaudet MM
    80. Ghoussaini M
    81. Giles GG
    82. Goldberg MS
    83. Goldgar DE
    84. González-Neira A
    85. Guénel P
    86. Hahnen E
    87. Haiman CA
    88. Håkansson N
    89. Hall P
    90. Hallberg E
    91. Hamann U
    92. Harrington P
    93. Hein A
    94. Hicks B
    95. Hillemanns P
    96. Hollestelle A
    97. Hoover RN
    98. Hopper JL
    99. Huang G
    100. Humphreys K
    101. Hunter DJ
    102. Jakubowska A
    103. Janni W
    104. John EM
    105. Johnson N
    106. Jones K
    107. Jones ME
    108. Jung A
    109. Kaaks R
    110. Kerin MJ
    111. Khusnutdinova E
    112. Kosma VM
    113. Kristensen VN
    114. Lambrechts D
    115. Le Marchand L
    116. Li J
    117. Lindström S
    118. Lissowska J
    119. Lo WY
    120. Loibl S
    121. Lubinski J
    122. Luccarini C
    123. Lux MP
    124. MacInnis RJ
    125. Maishman T
    126. Kostovska IM
    127. Mannermaa A
    128. Manson JE
    129. Margolin S
    130. Mavroudis D
    131. Meijers-Heijboer H
    132. Meindl A
    133. Menon U
    134. Meyer J
    135. Mulligan AM
    136. Neuhausen SL
    137. Nevanlinna H
    138. Neven P
    139. Nielsen SF
    140. Nordestgaard BG
    141. Olopade OI
    142. Olson JE
    143. Olsson H
    144. Peterlongo P
    145. Peto J
    146. Plaseska-Karanfilska D
    147. Prentice R
    148. Presneau N
    149. Pylkäs K
    150. Rack B
    151. Radice P
    152. Rahman N
    153. Rennert G
    154. Rennert HS
    155. Rhenius V
    156. Romero A
    157. Romm J
    158. Rudolph A
    159. Saloustros E
    160. Sandler DP
    161. Sawyer EJ
    162. Schmidt MK
    163. Schmutzler RK
    164. Schneeweiss A
    165. Scott RJ
    166. Scott CG
    167. Seal S
    168. Shah M
    169. Shrubsole MJ
    170. Smeets A
    171. Southey MC
    172. Spinelli JJ
    173. Stone J
    174. Surowy H
    175. Swerdlow AJ
    176. Tamimi RM
    177. Tapper W
    178. Taylor JA
    179. Terry MB
    180. Tessier DC
    181. Thomas A
    182. Thöne K
    183. Tollenaar R
    184. Torres D
    185. Truong T
    186. Untch M
    187. Vachon C
    188. Van Den Berg D
    189. Vincent D
    190. Waisfisz Q
    191. Weinberg CR
    192. Wendt C
    193. Whittemore AS
    194. Wildiers H
    195. Willett WC
    196. Winqvist R
    197. Wolk A
    198. Xia L
    199. Yang XR
    200. Ziogas A
    201. Ziv E
    202. Dunning AM
    203. Pharoah PDP
    204. Simard J
    205. Milne RL
    206. Edwards SL
    207. Kraft P
    208. Easton DF
    209. Chenevix-Trench G
    210. Zheng W
    211. NBCS Collaborators
    212. kConFab/AOCS Investigators
    (2018b) A transcriptome-wide association study of 229,000 women identifies new candidate susceptibility genes for breast Cancer
    Nature Genetics 50:968–978.
    https://doi.org/10.1038/s41588-018-0132-x

Decision letter

  1. David E James
    Senior and Reviewing Editor; The University of Sydney, Australia

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

This study addresses the genetic underpinnings of diabetic retinopathy one of the major complications of diabetes in humans. The approach used here was considered innovative. Rather than using traditional genetic approaches, such as genome wide association, the examination of glucose induced changes in gene expression in cell lines from people with type 1 diabetes either with or without retinopathy provided new insights into the disease. The identification of SNPs associated with such changes – so called eQTLs – followed by validation in independent human cohorts took the study one step beyond many others in the field. The further confirmation of FLCN as a mediator of diabetic retinopathy using Mendelian Randomization provided further confidence in the method. This study was considered to provide advances in our understanding of the mechanisms that lead to diabetic retinopathy in humans.

Decision letter after peer review:

Thank you for submitting your article "Mendelian randomization identifies folliculin expression as a mediator of diabetic retinopathy." for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by David James as the Senior and Reviewing Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

As the editors have judged that your manuscript is of interest, but as described below that additional experiments are required before it is published, we would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). First, because many researchers have temporarily lost access to the labs, we will give authors as much time as they need to submit revised manuscripts. We are also offering, if you choose, to post the manuscript to bioRxiv (if it is not already there) along with this decision letter and a formal designation that the manuscript is "in revision at eLife". Please let us know if you would like to pursue this option. (If your work is more suitable for medRxiv, you will need to post the preprint yourself, as the mechanisms for us to do so are still in development.)

While your study was well performed, several rather serious issues were raised by the reviewers as detailed below and all of these points need to be addressed prior to further consideration by eLife. Also one reviewer noted that the title only focused on "Mendelian randomization", which is an overstatement of what is essentially only a gene expression study. In addition stating that RM " identifies folliculin expression as a mediator of diabetic retinopathy" is also an overstatement as the mediator effect is not shown. So it will be important to reword the title if you decide to undertake a substantial revision.

Summary:

This study investigates gene expression profiling related to diabetic retinopathy using several strategies including differential gene expression associated with response to glucose by comparing lymphoblastoid cell lines (LCLs) between cases (with retinopathy) and controls (without retinopathy) with type 1 diabetes. The study identified significant eQTLs from gene expression analysis and public gene expression databases and then tested significant eSNPs by the meta-analysis GWAS using independent cohorts. The expression of one gene, FLCN, to be a mediator of diabetic retinopathy by the Mendelian Randomization method was confirmed.

Essential revisions:

1) The whole paper and its conclusions are based on a very small number of samples and not supported by strong experimental data about causality. Overall, the small group of studied subjects present huge differences in duration of diabetes and glucose control the 2 main factors of retinopathy. Thus, it is unclear how to differentiate the biological effects of long term high glucose and their impact on retinopathy.

2) Based on the transcriptome analysis the conclusion "This finding suggests that chronic glucose exposure depresses cellular immune responsiveness and may explain in part the increased risk of infection found in patients with diabetes" is not based on evidence as authors selected transcripts of their choice and also causality is not shown.

"Individuals with diabetic retinopathy exhibit a differential transcriptional response to glucose". Note that the level of association shown (especially for PDGF) is somewhat marginal.

3) "Genes with differential response to glucose are implicated in the pathogenesis of diabetic retinopathy." This part is the most intriguing and original but it is based on expression in many tissues and thus the title is also overstated: it shows some kind of association but certainly not that the 103 genes "are implicated" in retinopathy.

4) "Folliculin (FLCN) is a putative diabetic retinopathy disease gene" this part is also interesting (and includes some in vivo experiments) but the original whole genome gene expression study did not detect FLNC as differentially expressed in the cell blood of the patients with retinopathy. Can the authors comment on this? One of the reviewers also mentioned that no SNPs near FLCN have been identified in diabetes (or complications) GWAS and this is somewhat worrying.

5) It is confusing that the authors used different selection criteria for gene identifications. In Results (subsection “Individuals with diabetic retinopathy exhibit a differential transcriptional response to glucose”), they identified 19 differentially response genes (P <0.05) between retinopathy cases and controls. However, they have selected the top 103 genes with P<0.01 (Results, subsection “Genes with differential response to glucose are implicated in the pathogenesis of diabetic retinopathy”) for further investigations. What is the gap between these two gene sets selection? I assume that the FLCN gene is in the top 103 gene set but not in the above 19 gene set. Explanations are needed for including specific genes for different analysis purposes.

6) The authors selected LCLs from individuals of 3 groups, non-diabetes (nDM), type 1 diabetes without retinopathy (nDR) and type 1 diabetes with proliferative diabetic retinopathy (PDR). The benefits of utilizing nDM samples in the analysis was not clear. Although both gene expression and GSEA methods were conducted, the results were not relevant to diabetic retinopathy. What is the purpose of including these samples? Similarly, it is not clear what the purpose of using the gene set enrichment analysis (GSEA) was because it seems that the authors performed most analyses to identify genetic components by gene-based or SNP-based methods in the manuscript.

7) The authors tested gene expression profile and associations using data from type 1 diabetic retinopathy. However, for the confirmation with UK BioBank (UKBB) data, they included all samples with both type 1 and type 2 diabetes. Did you perform the analysis stratified by the type of diabetes? Do you have any explanations of possible differences?

https://doi.org/10.7554/eLife.59980.sa1

Author response

Essential revisions:

1) The whole paper and its conclusions are based on a very small number of samples and not supported by strong experimental data about causality.

The gene expression analyses conducted in our work were performed for the purposes of discovery and hypothesis generation, which were then to be subsequently validated in the genomic studies. While we agree with the reviewer’s observation that the transcriptomic studies were conducted in a small group of individuals, it is important to note that the conclusions of the manuscript are not based on the gene expression findings themselves, but rather on the genomic analyses which incorporated close to twenty thousand individuals from three independent cohorts.

Genetics offers an excellent way to differentiate the biologic effects of long-term high glucose and its impact on retinopathy thereby disentangling causality, epiphenomenon and reverse causality (Pearl, J, Cambridge University Press, 2009; Davies et al., 2019). Recognizing that variation in the underlying genome precedes disease onset and can therefore be considered an instrumental variable, we integrated genetic analyses with gene expression and identified, through Mendelian randomization, potentially causal gene expression changes for retinopathy that avoid potential confounding factors such as duration of diabetes and level of glycemia.

Mendelian randomization provides support for causality in this manuscript. We agree with the reviewer’s point that further corroborating functional evidence from the standpoint of molecular and cell biological assays in vitro and in vivo is lacking. We have changed the language throughout the manuscript to reflect this point. We have also amended the title accordingly to “Integration of genomics and transcriptomics predicts diabetic retinopathy susceptibility genes”.

Overall, the small group of studied subjects present huge differences in duration of diabetes and glucose control the 2 main factors of retinopathy. Thus, it is unclear how to differentiate the biological effects of long term high glucose and their impact on retinopathy.

As expected, differences existed between the cases and controls used in the gene expression studies in the primary risk factors for the development of complications including the level of glycemia and duration of diabetes. As the reviewer notes, it would be unclear how to distinguish fundamentally the role of any identified gene expression differences in this situation as being potentially causal for the development of retinopathy as opposed to purely being an epiphenomenon or, potentially worse, a change that is due to reverse causality from the disease itself (e.g. elevated HbA1c on the gene expression profile). Therein, in our opinion, lies a major strength of this paper through its incorporation of a completely independent, orthogonal approach of genomic analysis. By itself, the gene expression study provides only an association which could be significantly confounded or biased from uncontrolled covariates. The strength of genetic analysis in this setting is that a DNA nucleotide and its association to a disease cannot be confounded by duration of diabetes or the level of glycemia. While gene expression can be altered by both of these covariates, the genome sequence cannot.

To control for potential confounding, a paired analysis compared gene expression between individuals with and without retinopathy who were matched based on age, sex, treatment group, cohort and diabetes duration. Due to the dominant role in the development of retinopathy played by glycemia, we were not able to match individuals between the two groups based on this covariate. The genetic analyses did however account for differences in level of glycemia by controlling for HbA1c. Likewise, given the key role that duration of diabetes plays in the development of retinopathy, this factor also was not able to be completely controlled through matching between the two groups for the gene expression experiments.

In response to the reviewer’s comment, we revised the text of the manuscript further emphasizing (particularly in the Materials and methods and the Results sections) the use of a paired study design with individuals in the gene expression studies matched for age, sex, treatment group cohort, and diabetes duration. For example, we specifically made the following revisions to the manuscript text:

a) We added the following to the description of matching in the “Cell Lines” subsection of the Materials and methods: “For the seven pairs matched on duration: 4 pairs were matched by duration quartiles (baseline duration 0-4 years, 4-8 years, 8-12 years, or 12-15 years) and 3 pairs were matched by duration halves (<8 years vs. >=8 years). Matching by age was done similarly to duration: four pairs by quartile (<21 years, 21-25 years, 26-31 years, >=31 years) and the remaining four by halves (<26 years vs. >=26 years).”

b) We revised the sentence “Differential expression is described using fold change (FC) while differential response reflects fold change (FC) differences between groups” to “Differential response reflects fold change (FC) differences between matched individuals in the two groups in their paired response to glucose”.

c) We have changed the sentence “As anticipated, notable differences were observed between individuals with and without retinopathy (PDR vs. nDR) for duration of diabetes (53 +/- 43.4 months vs. 27 +/- 13.4 months) and mean HbA1c (9.71 +/- 2.37 vs. 7.62 +/- 1.07), respectively, given their significant impact on retinopathy“ to “As anticipated, notable differences were observed between individuals with and without retinopathy (PDR vs. nDR) for mean duration of diabetes (53 ±43.4 months vs. 27 ± 13.4 months) as it was also not possible to completely match participant pairs for this covariate or for level of glycemia (HbA1c), mean HbA1c (9.71 ± 2.37 vs. 7.62 ± 1.07) given their significant impact on retinopathy”.

2) Based on the transcriptome analysis the conclusion "This finding suggests that chronic glucose exposure depresses cellular immune responsiveness and may explain in part the increased risk of infection found in patients with diabetes" is not based on evidence as authors selected transcripts of their choice and also causality is not shown.

"Individuals with diabetic retinopathy exhibit a differential transcriptional response to glucose". Note that the level of association shown (especially for PDGF) is somewhat marginal.

As noted by the reviewer, the transcriptome analysis is underpowered. We agree with the reviewer that the underpowered nature of the transcriptional analyses are insufficient in themselves, given the marginal levels of significance, to suggest more than a limited association, which is why we employed a completely independent orthogonal approach of genomic analysis to validate the gene expression findings. We have accordingly amended the manuscript so as to not overstate the significance of the gene expression findings:

We have changed the sentence “Conversely, genes that modulate the cellular response to infection were considerably down-regulated (type 1 Interferon, FDR < 0.0001; gamma Interferon, FDR < 0.0001; leukocyte chemotaxis genes, FDR < 0.0001) and may explain in part the increased risk of infection found in patients with diabetes…” to “Conversely, genes that modulate the cellular response to infection were considerably down-regulated (type 1 Interferon, FDR < 0.0001; gamma Interferon, FDR < 0.0001; leukocyte chemotaxis genes, FDR < 0.0001) potentially supporting earlier work that chronic glucose exposure depresses cellular immune responsiveness”.

The goal of the gene expression analysis was not to demonstrate causality, but rather as a cross-check to support the biological plausibility of the response to glucose gene expression assay. All validation was done through the genomic analyses. The preliminary gene expression studies were purely hypothesis-generating. Prior to undertaking the genomic analyses, we investigated whether assessing the gene expression response to glucose in lymphoblastoid cell lines provided biologically plausible findings. Assessment for biological plausibility was conducted at both the gene level, as well as the pathway level. The gene level analysis identified TXNIP, a highly glucose-inducible gene in multiple cell types. On the pathway level we performed gene set enrichment analyses and identified several pathways that were down-regulated at an FDR < or = 0.0001 including type 1 gamma interferon response and leukocyte chemotaxis. The identification of these pathways was done through an analysis of the entire transcriptome in a completely unbiased, agnostic fashion with GSEA. Associated pathways at an FDR threshold < or = 0.0001 were highlighted. We felt that the pathway analysis, in addition to the single-gene analysis, supported the biological plausibility of the lymphoblastoid cell line gene expression response to glucose assay, given the pre-existing literature support for depressed cellular responsiveness in diabetes.

3) "Genes with differential response to glucose are implicated in the pathogenesis of diabetic retinopathy." This part is the most intriguing and original but it is based on expression in many tissues and thus the title is also overstated: it shows some kind of association but certainly not that the 103 genes "are implicated" in retinopathy.

We agree that not all 103 genes are likely associated with retinopathy. We have changed the title of this subsection to “Genetic association reveals that some genes with differential response to glucose play a role in susceptibility to diabetic retinopathy” so as not to overstate the nature of the findings. We used eQTLs from multiple tissues from the GTEx Project, because eQTLs that are shared between tissues show stronger associations, specifically for complex-trait associations in the retina. Retina eQTLs are not currently present in GTEx, although 43% of retina eQTLs are shared with LCLs.

4) "Folliculin (FLCN) is a putative diabetic retinopathy disease gene" this part is also interesting (and includes some in vivo experiments) but the original whole genome gene expression study did not detect FLCN as differentially expressed in the cell blood of the patients with retinopathy. Can the authors comment on this?

In fact, the original whole genome expression study did detect differential expression of FLCN between individuals with proliferative diabetic retinopathy and individuals without diabetic retinopathy (log2FC difference = 0.27, P = 2.5x10-3). We agree with the reviewer that this point could be made clearer in the manuscript. Accordingly, we have added text to this section indicating that the differential FLCN expression can also be seen in Figure 2 and Supplementary file 1B of the manuscript in addition to Figure 4—figure supplement 2.

One of the reviewers also mentioned that no SNPs near FLCN have been identified in diabetes (or complications) GWAS and this is somewhat worrying.

In our earlier GWAS published in Human Molecular Genetics (Grassi M.A. et al., 2011) we identified an association with diabetic retinopathy for the SNP rs11867934. rs11867934 is an eSNP for FLCN that has shown an association with diabetic retinopathy in two independent cohorts: GoKinD (P=6.1 X 10-4), EDIC (P=3.3 X 10-3), and their meta-analysis (P= 7 X 10-6). In the present manuscript, Figure 4—figure supplement 3 and Figure 4B reveal an enrichment of small GWAS p-values associated for the eSNPs associated with FLCN expression. Figure 4—figure supplement 4 demonstrates replication of these findings in a third cohort, the UK Biobank. Together, these analyses show the validity and reproducibility of these findings in separate, independent large cohorts of individuals with diabetic retinopathy.

GWASes for diabetic retinopathy have been historically underpowered. Applying a gene-based eQTL approach that aggregates the effects of multiple variants to a single testing unit, the gene, increases study power to identify a novel disease-associated locus by reducing the multiple testing burden by at least two orders of magnitude. This has allowed us, in an innovative and novel fashion, to identify signals that have previously not been identified for diabetic retinopathy. Examining the genome-wide association signal for disease from eQTLs for a gene in aggregate can be a more powerful strategy to discern heterogeneous genetic signals than testing each of these SNPs individually. Collating all of the eSNPs for FLCN, we assessed the aggregate association of FLCN eSNPs to diabetic retinopathy and observed an enrichment with a true positive rate of 0.9 in the GoKinD EDIC meta-analysis and a true positive rate of 0.7 in the UK Biobank. Such an approach has never been done before for diabetic retinopathy and speaks to a strength of the study and the reason for the novelty of the findings.

5) It is confusing that the authors used different selection criteria for gene identifications. In Results (subsection “Individuals with diabetic retinopathy exhibit a differential transcriptional response to glucose”), they identified 19 differentially response genes (P <0.05) between retinopathy cases and controls. However, they have selected the top 103 genes with P<0.01 (Results, subsection “Genes with differential response to glucose are implicated in the pathogenesis of diabetic retinopathy”) for further investigations. What is the gap between these two gene sets selection? I assume that the FLCN gene is in the top 103 gene set but not in the above 19 gene set. Explanations are needed for including specific genes for different analysis purposes.

We agree with the reviewer. In order to address this point we have changed the volcano plot for Figure 2 to highlight all 103 genes (P < 0.01) (red dots) instead of the 19 originally highlighted and updated Supplementary file 1B to reflect this point by including a list of these 103 genes. Both Figure 2 and Supplementary file 1B include FLCN.

The 19 and 103 gene sets come from the same list of differential glucose response genes and rather reflect different parameters. One set simply reflects the 103 genes with P < 0.01, whereas the other set of 19 genes had P < 0.05 and a fold-change difference of > or =0.26 (thresholds which were solely chosen to highlight their position on the volcano plot as well as the pre-existing literature support for some of these genes in terms of their role in retinopathy and their biological plausibility). Both the 19 and 103 gene sets contain FLCN and were derived from the differential response to glucose gene expression analysis between individuals with and without retinopathy. The entire data set listing all of these genes can be found in the source files for the manuscript.

6) The authors selected LCLs from individuals of 3 groups, non-diabetes (nDM), type 1 diabetes without retinopathy (nDR) and type 1 diabetes with proliferative diabetic retinopathy (PDR). The benefits of utilizing nDM samples in the analysis was not clear. What is the purpose of including these samples?

We agree with the reviewer that the nDM individuals were not central to the main message of the study. We initially included the nDM individuals because prior to this experiment we did not know whether we would see any difference between the gene expression profiles of individuals with and without retinopathy and how this might differ simply between individuals with and without diabetes itself. Our earlier work (Grassi et al., 2016) that interrogated expression differences for a small subset of candidate genes in these groups suggested we might not see a difference between individuals with and without retinopathy. If a difference was indeed present in the individuals with retinopathy, it was also unclear whether that would be due to a result of the differences in diabetes severity (from the standpoint of known differences in levels of glycemia and duration of diabetes between individuals with and without retinopathy), or due to underlying genetic differences. For this reason, we decided it would be prudent to include a group with no diabetes (nDM) for comparison purposes. It was only after we performed the multidimensional scaling analysis in which we saw clustering based on gene expression, revealing differences in individuals with and without retinopathy (Figure 2—figure supplement 1), and saw that these differences were grounded in known biology of diabetes and complications (Figure 2—figure supplement 2), that we then narrowed our focus to disentangling and better discerning the genetic basis for these differences. Ultimately, we decided to keep the nDM samples in the manuscript because their inclusion was relevant to the first question of simply whether there is interindividual variation in the transcriptional response (RGall). Including the nDM individuals increased the power of the RGall analysis by increasing the sample size substantially. All of the findings from the analysis of the nDM group itself and in comparison to the other two groups can be found in the source files. We felt that including these specific analyses for the nDM group in the manuscript distracted from the primary findings presented for retinopathy.

Although both gene expression and GSEA methods were conducted, the results were not relevant to diabetic retinopathy. Similarly, it is not clear what the purpose of using the gene set enrichment analysis (GSEA) was because it seems that the authors performed most analyses to identify genetic components by gene-based or SNP-based methods in the manuscript.

GSEA was performed to support the biological plausibility of assessing glucose response in LCLs. The initial analyses assessing the response to glucose in the entire cohort (RGall, which included nDM, nDR and PDR) investigated whether there was any inter-individual variation in the transcriptional response to glucose (Figure 1—figure supplements 1-4), and whether the genes showing an inter-individual response to glucose were biologically plausible for diabetes and diabetic complications at both gene-level (e.g. TXNIP) (Figure 1A) and pathway-level (e.g. PDGF) (Figure 1B). Including the nDM individuals in this portion of the analysis increased the power of the study to identify these changes by increasing the sample size almost 50%.

7) The authors tested gene expression profile and associations using data from type 1 diabetic retinopathy. However, for the confirmation with UK BioBank (UKBB) data, they included all samples with both type 1 and type 2 diabetes. Did you perform the analysis stratified by the type of diabetes? Do you have any explanations of possible differences?

We decided to study individuals with both type 1 and type 2 diabetes from the UKBB to increase the study power as has been done in other consortia in which we have participated (Pollack et al., 2019; Meng, W et al., Acta Ophthalmologica, 2018). In order to account for the discrepancy between analyzing type 1 individuals (in GoKinD and EDIC) and type 2 individuals (in UKBB), we controlled for type of diabetes by including it as a covariate in our model for the GWAS in the UK Biobank. In fact, controlling for type of diabetes diminished the power of the UKBB GWAS (TPR 0.76 without vs. 0.73 with type of diabetes as a covariate) likely explaining some its decreased true positive rate compared to the GoKinD/EDIC meta-analysis. We did not perform a stratified analysis based on type of diabetes as the vast majority of individuals in the UK Biobank have type 2 diabetes. For our analysis, likely over 90% of cases had type 2 diabetes (8% had type 1 diabetes, but 15% were unspecified – the vast majority of which one would assume is likely type 2 diabetes); for controls, this number was even more pronounced with only 3% having type 1 diabetes. Hence, almost all of our association signals were found in individuals with type 2 diabetes. We were not sufficiently powered to reliably assess this signal only in individuals with type 1 diabetes. From a clinical and biological standpoint, we were most interested in identifying genes that predispose to retinopathy in the setting of diabetes, whether it be type 1, type 2 or unspecified.

https://doi.org/10.7554/eLife.59980.sa2

Article and author information

Author details

  1. Andrew D Skol

    Department of Pathology and Laboratory Medicine, Ann and Robert H. Lurie Children's Hospital of Chicago, Chicago, United States
    Contribution
    Data curation, Software, Formal analysis, Validation, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
  2. Segun C Jung

    Research and Development, NeoGenomics Laboratories, Aliso Viejo, United States
    Contribution
    Data curation, Software, Formal analysis, Validation, Writing - review and editing
    Competing interests
    No competing interests declared
  3. Ana Marija Sokovic

    University of Illinois at Chicago, Chicago, United States
    Contribution
    Data curation, Software, Formal analysis, Validation
    Competing interests
    No competing interests declared
  4. Siquan Chen

    Cellular Screening Center, Office of Shared Research Facilities, The University of Chicago, Chicago, United States
    Contribution
    Investigation, Methodology
    Competing interests
    No competing interests declared
  5. Sarah Fazal

    Cellular Screening Center, Office of Shared Research Facilities, The University of Chicago, Chicago, United States
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  6. Olukayode Sosina

    1. Department of Biostatistics, Johns Hopkins University, Baltimore, United States
    2. National Eye Institute, National Institutes of Health (NIH), Bethesda, United States
    Contribution
    Data curation, Formal analysis
    Competing interests
    No competing interests declared
  7. Poulami P Borkar

    University of Illinois at Chicago, Chicago, United States
    Contribution
    Resources, Investigation, Project administration, Writing - review and editing
    Competing interests
    No competing interests declared
  8. Amy Lin

    University of Illinois at Chicago, Chicago, United States
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
  9. Maria Sverdlov

    University of Illinois at Chicago, Chicago, United States
    Contribution
    Investigation, Methodology
    Competing interests
    No competing interests declared
  10. Dingcai Cao

    University of Illinois at Chicago, Chicago, United States
    Contribution
    Software, Formal analysis
    Competing interests
    No competing interests declared
  11. Anand Swaroop

    National Eye Institute, National Institutes of Health (NIH), Bethesda, United States
    Contribution
    Formal analysis, Supervision, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-1975-1141
  12. Ionut Bebu

    The George Washington University, Biostatistics Center, Rockville, United States
    Contribution
    Resources, Software, Formal analysis
    Competing interests
    No competing interests declared
  13. DCCT/EDIC Study group

    Contribution
    Resources
  14. Barbara E Stranger

    Department of Pharmacology, Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Supervision, Validation, Visualization, Methodology, Writing - review and editing
    For correspondence
    barbara.stranger@northwestern.edu
    Competing interests
    No competing interests declared
  15. Michael A Grassi

    University of Illinois at Chicago, Chicago, United States
    Contribution
    Conceptualization, Resources, Formal analysis, Supervision, Funding acquisition, Investigation, Visualization, Methodology, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    grassim@uic.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-8467-3223

Funding

National Eye Institute (R01EY023644)

  • Michael A Grassi

National Eye Institute (ZIAEY000546)

  • Anand Swaroop

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This research has been conducted using the UK Biobank Resource under Application Number 44316. We acknowledge the guidance and assistance provided by the members of the DCCT/EDIC committee at the time of this publication. A complete list of investigators and members of the Research Group appears in DCCT/EDIC Research Group et al., 2017. We thank the DNA Services Facility and the Research Histology and Tissue Imaging Core at UIC Research Resources Center for assistance in histological techniques and image acquisition. We thank Andrew D Paterson, MD, for helpful input and comments.

Senior and Reviewing Editor

  1. David E James, The University of Sydney, Australia

Publication history

  1. Received: June 13, 2020
  2. Accepted: November 6, 2020
  3. Accepted Manuscript published: November 9, 2020 (version 1)
  4. Version of Record published: December 10, 2020 (version 2)

Copyright

© 2020, Skol et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,636
    Page views
  • 171
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Computational and Systems Biology
    2. Genetics and Genomics
    Yekaterina Shulgina, Sean R Eddy
    Research Article Updated

    The genetic code has been proposed to be a ‘frozen accident,’ but the discovery of alternative genetic codes over the past four decades has shown that it can evolve to some degree. Since most examples were found anecdotally, it is difficult to draw general conclusions about the evolutionary trajectories of codon reassignment and why some codons are affected more frequently. To fill in the diversity of genetic codes, we developed Codetta, a computational method to predict the amino acid decoding of each codon from nucleotide sequence data. We surveyed the genetic code usage of over 250,000 bacterial and archaeal genome sequences in GenBank and discovered five new reassignments of arginine codons (AGG, CGA, and CGG), representing the first sense codon changes in bacteria. In a clade of uncultivated Bacilli, the reassignment of AGG to become the dominant methionine codon likely evolved by a change in the amino acid charging of an arginine tRNA. The reassignments of CGA and/or CGG were found in genomes with low GC content, an evolutionary force that likely helped drive these codons to low frequency and enable their reassignment.

    1. Genetics and Genomics
    2. Immunology and Inflammation
    Ilana Fox-Fisher et al.
    Research Article

    Blood cell counts often fail to report on immune processes occurring in remote tissues. Here we use immune cell type-specific methylation patterns in circulating cell-free DNA (cfDNA) for studying human immune cell dynamics. We characterized cfDNA released from specific immune cell types in healthy individuals (N=242), cross sectionally and longitudinally. Immune cfDNA levels had no individual steady state as opposed to blood cell counts, suggesting that cfDNA concentration reflects adjustment of cell survival to maintain homeostatic cell numbers. We also observed selective elevation of immune-derived cfDNA upon perturbations of immune homeostasis. Following influenza vaccination (N=92), B-cell-derived cfDNA levels increased prior to elevated B-cell counts and predicted efficacy of antibody production. Patients with Eosinophilic Esophagitis (N=21) and B-cell lymphoma (N=27) showed selective elevation of eosinophil and B-cell cfDNA respectively, which were undetectable by cell counts in blood. Immune-derived cfDNA provides a novel biomarker for monitoring immune responses to physiological and pathological processes that are not accessible using conventional methods.