A conditional gene-based association framework integrating isoform-level eQTL data reveals new susceptibility genes for schizophrenia

  1. Xiangyi Li
  2. Lin Jiang
  3. Chao Xue
  4. Mulin Jun Li
  5. Miaoxin Li (corresponding author)
  1. Program in Bioinformatics, Zhongshan School of Medicine and The Fifth Affiliated Hospital, Sun Yat-sen University, China
  2. Key Laboratory of Tropical Disease Control (Sun Yat-sen University), Ministry of Education, China
  3. Center for Precision Medicine, Sun Yat-sen University, China
  4. Research Center of Medical Sciences, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, China
  5. The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, China
  6. Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital, Sun Yat-sen University, China
11 figures, 6 tables and 2 additional files

Figures

Figure 1
The advantages of eDESE over ECS and DESE.

First, we proposed a new ECS and chose the best exponent c between 1 and 2 to properly control the type I error. Second, we adopted three strategies to map SNPs to genes and performed the unconditional gene-based association analysis with the improved ECS. The results of the unconditional analysis were then passed to the conditional gene-based association analysis and the subsequent iterative procedure. 5 kb: 5000 base pairs. 1 Mb: 10^6 base pairs.
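
One way to picture a distance-based SNP-to-gene mapping is sketched below. This is a minimal illustration, not the eDESE implementation: the gene and SNP coordinates, the function name `map_snps_to_genes`, and the rule "gene boundaries extended by a window on both sides" are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical gene boundaries: name -> (chromosome, start, end), in base pairs.
GENES: Dict[str, Tuple[str, int, int]] = {
    "GENE_A": ("1", 100_000, 120_000),
    "GENE_B": ("1", 123_000, 150_000),
}

# Hypothetical SNP positions: id -> (chromosome, position).
SNPS: Dict[str, Tuple[str, int]] = {
    "rs1": ("1", 96_000),
    "rs2": ("1", 121_500),
    "rs3": ("1", 160_000),
    "rs4": ("2", 110_000),
}


def map_snps_to_genes(
    snps: Dict[str, Tuple[str, int]],
    genes: Dict[str, Tuple[str, int, int]],
    window: int = 5_000,
) -> Dict[str, List[str]]:
    """Assign each SNP to every gene whose boundaries, extended by
    `window` bp on both sides, contain the SNP position."""
    mapping: Dict[str, List[str]] = {name: [] for name in genes}
    for snp_id, (chrom, pos) in snps.items():
        for name, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start - window <= pos <= end + window:
                mapping[name].append(snp_id)
    return mapping


# 5 kb window (assumed default); a SNP may map to more than one gene.
result = map_snps_to_genes(SNPS, GENES)
```

With the 5 kb window, `rs2` falls inside the extended boundaries of both hypothetical genes, which is the kind of overlap a conditional analysis has to account for. Passing `window=1_000_000` gives the 1 Mb variant of the same rule.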

Figure 2
Q-Q plots of the p-values of the ECS test under the null hypothesis based on the two extreme exponents (i.e., 1 and 2).

(a), (b), and (c) correspond to variant numbers of 50, 100, and 500, respectively.
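
For readers unfamiliar with Q-Q plots of p-values, the coordinates can be computed as in the sketch below. It assumes the common convention of plotting observed against expected -log10(p), with the i-th smallest of n p-values expected at i / (n + 1) under the null hypothesis; the function name `qq_points` and the simulated p-values are illustrative only.

```python
import math
import random
from typing import List, Tuple


def qq_points(p_values: List[float]) -> List[Tuple[float, float]]:
    """Return (expected, observed) -log10(p) pairs for a Q-Q plot.

    Under the null hypothesis p-values are Uniform(0, 1), so the i-th
    smallest of n p-values is expected to be about i / (n + 1).
    """
    n = len(p_values)
    observed = sorted(p_values)
    return [
        (-math.log10(i / (n + 1)), -math.log10(p))
        for i, p in enumerate(observed, start=1)
    ]


# Simulated null p-values in (0, 1]; a well-calibrated test should give
# points close to the diagonal expected == observed.
random.seed(1)
null_p = [1.0 - random.random() for _ in range(10_000)]
points = qq_points(null_p)
```

Points that rise above the diagonal indicate inflated (too small) p-values, and points below it indicate a conservative test.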

Figure 3
The boxplots of the favorable c values in different simulation scenarios.

(a) Binary and continuous phenotypes; (b) different sample sizes; (c) different variant numbers.

Figure 4
Q-Q plots of the conditional and unconditional gene-based association tests and the likelihood-ratio test under the null hypothesis.

(a) and (d): two gene-variant pairs with similar variant numbers (SIPA1L2 with 29 variants and LOC729336 with 30 variants). (b) and (e): two gene-variant pairs with different variant numbers, the first larger than the second (CACHD1 with 41 variants and RAVER2 with eight variants). (c) and (f): two gene-variant pairs with different variant numbers, the second larger than the first (LOC647132 with five variants and FAM5C with 48 variants). (a), (b), and (c): the former gene has no QTL, and the QTL in the latter gene explained 0.5% of heritability. (d), (e), and (f): the former gene has no QTL, and the QTL in the latter gene explained 1% of heritability. Ten thousand phenotype datasets were simulated for each scenario. Unconditional Eff. Chi. (red) represents the unconditional association analysis at the former gene by the improved ECS. Conditional Eff. Chi. (blue) represents the conditional association analysis at the former gene, conditioning on the latter gene, by the improved ECS. The likelihood-ratio test (yellow) was conducted based on nested linear regression models.
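
A likelihood-ratio test between nested linear regression models can be sketched as follows. This is a simplified illustration with one conditioning predictor `z` and one tested predictor `x`, whereas the legend refers to genes with many variants; the function names and the simulated data are assumptions. For Gaussian linear models the statistic is n * ln(RSS_null / RSS_full), compared against a chi-square distribution with one degree of freedom.

```python
import math
import random
from typing import List, Tuple


def residualize(v: List[float], z: List[float]) -> List[float]:
    """Residuals of v after least-squares regression on an intercept and z."""
    n = len(v)
    v_mean, z_mean = sum(v) / n, sum(z) / n
    szz = sum((zi - z_mean) ** 2 for zi in z)
    slope = sum((zi - z_mean) * (vi - v_mean) for zi, vi in zip(z, v)) / szz
    return [(vi - v_mean) - slope * (zi - z_mean) for vi, zi in zip(v, z)]


def nested_lrt(y: List[float], x: List[float], z: List[float]) -> Tuple[float, float]:
    """LRT of the full model y ~ 1 + z + x against the null model y ~ 1 + z."""
    ry, rx = residualize(y, z), residualize(x, z)
    rss_null = sum(r * r for r in ry)
    # Partialling out z from both y and x gives the full-model RSS directly.
    sxy = sum(a * b for a, b in zip(rx, ry))
    sxx = sum(a * a for a in rx)
    rss_full = rss_null - sxy * sxy / sxx
    stat = len(y) * math.log(rss_null / rss_full)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, 1 df
    return stat, p_value


# Simulated example: genotypes coded 0/1/2, with x having a true effect.
random.seed(0)
n = 500
z = [float(sum(random.random() < 0.3 for _ in range(2))) for _ in range(n)]
x = [float(sum(random.random() < 0.4 for _ in range(2))) for _ in range(n)]
y = [0.5 * zi + 1.0 * xi + random.gauss(0.0, 1.0) for zi, xi in zip(z, x)]
stat, p_value = nested_lrt(y, x, z)
```

Testing a gene with several variants would add one degree of freedom per added variant and would need a general least-squares solver rather than the single-predictor shortcut used here.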

Figure 5
Q-Q plots of the conditional gene-based association test and the likelihood-ratio test at different representative gene-variant pairs.

The variant numbers of the two gene-variant pairs in (a)-(f) are the same as in the Figure 4 legend. The difference is that in (a)-(c), the QTL in each gene (former and latter) explained 0.25% of heritability, and in (d)-(f), the QTL in each gene explained 0.5% of heritability. One thousand phenotype datasets were simulated for each scenario.

Figure 6
The comparison of the potential susceptibility genes for schizophrenia identified by MAGMA and eDESE:dist.

(a) The Venn diagram shows the overlapping and unique genes identified by MAGMA and eDESE:dist. (b) The bar plot shows the top GO enrichment terms of the overlapping genes. MF: Molecular Function terms of GO. BP: Biological Process terms of GO. CC: Cellular Component terms of GO. The x-axis shows the top (≤10) significant GO enrichment terms (MF, BP, and CC). The y-axis shows the negative log10 of the adjusted p-value of each term. See also Figure 6—source data 1, Figure 6—source data 2 and Figure 6—source data 3.

Figure 6—source data 1

Significant genes identified by MAGMA.

https://cdn.elifesciences.org/articles/70779/elife-70779-fig6-data1-v1.xlsx
Figure 6—source data 2

Significant genes identified by eDESE:dist.

https://cdn.elifesciences.org/articles/70779/elife-70779-fig6-data2-v1.xlsx
Figure 6—source data 3

Enrichment results of the 105 genes identified by both MAGMA and eDESE:dist.

https://cdn.elifesciences.org/articles/70779/elife-70779-fig6-data3-v1.xlsx
Figure 7
Comparison of the potential susceptibility genes identified by S-PrediXcan, eDESE:gene, and eDESE:isoform.

(a) The bar plot shows the count of potential susceptibility genes in each of the five optimized brain regions. (b) The Venn diagram shows the count of the overlapping and unique genes identified by S-PrediXcan, eDESE:gene, and eDESE:isoform in the five optimized brain regions. See also Figure 7—source data 1 and Figure 7—source data 2.

Figure 7—source data 1

The count of potential susceptibility genes in each of the five optimized brain regions.

https://cdn.elifesciences.org/articles/70779/elife-70779-fig7-data1-v1.xlsx
Figure 7—source data 2

The potential susceptibility genes identified by S-PrediXcan, eDESE:gene and eDESE:isoform in the five optimized brain regions.

https://cdn.elifesciences.org/articles/70779/elife-70779-fig7-data2-v1.xlsx
Figure 8
The GO enrichment terms of the genes in the consensus module (colored "turquoise").

The x-axis shows the top (≤10) significant GO enrichment terms (in MF, BP, and CC). The y-axis shows the negative log10 of the adjusted p-value of each term. See also Figure 8—source data 1 and Figure 8—source data 2.

Figure 8—source data 1

Genes in the consensus module (colored "turquoise").

https://cdn.elifesciences.org/articles/70779/elife-70779-fig8-data1-v1.xlsx
Figure 8—source data 2

Enrichment results of the genes in the consensus module (colored "turquoise").

https://cdn.elifesciences.org/articles/70779/elife-70779-fig8-data2-v1.xlsx
Figure 9
The influence of variant effect sizes on the correlation of chi-squares.

(a) Under the additive model; (b) under the multiplicative model. The allele frequencies are assigned randomly.

Author response image 1
The tissue importance generated by eDESE:gene based on the tissue-specific eQTLs.

(a) Brain-Frontal Cortex (BA9); (b) Brain-Cerebellum. The red dotted lines denote the significance threshold.

Author response image 2
Q-Q plots of the conditional and unconditional gene-based association tests and the likelihood-ratio test under the null hypothesis (c=1.8).

(a) and (d): two gene-variant pairs with similar variant numbers (SIPA1L2 with 29 variants and LOC729336 with 30 variants). (b) and (e): two gene-variant pairs with different variant numbers, the first larger than the second (CACHD1 with 41 variants and RAVER2 with eight variants). (c) and (f): two gene-variant pairs with different variant numbers, the second larger than the first (LOC647132 with five variants and FAM5C with 48 variants). (a), (b), and (c): the former gene has no QTL, and the QTL in the latter gene explained 0.5% of heritability. (d), (e), and (f): the former gene has no QTL, and the QTL in the latter gene explained 1% of heritability. Ten thousand phenotype datasets were simulated for each scenario. Unconditional Eff. Chi. (red) represents the unconditional association analysis at the former gene by the improved ECS. Conditional Eff. Chi. (blue) represents the conditional association analysis at the former gene, conditioning on the latter gene, by the improved ECS. The likelihood-ratio test (yellow) was conducted based on nested linear regression models.

Tables

Table 1
Type I error and power of different simulation scenarios in association analysis.
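Scenario | Eg | Vgp | Vge | Binary: AllVar | Binary: IsoeQTL | Binary: GeneQTL | Binary: Gen3eQTL | Binary: Gen6eQTL | Continuous: AllVar | Continuous: IsoeQTL | Continuous: GeneQTL | Continuous: Gen3eQTL | Continuous: Gen6eQTL
Type I error
1 | 0 | 0 | 0.05 | 0 | 0 | 0 | 0 | 0.002 | 0 | 0 | 0.001 | 0.002 | 0.003
2 | 0 | 0 | 0.15 | 0 | 0.002 | 0.001 | 0 | 0 | 0 | 0 | 0 | 0.001 | 0
3 | 0 | 0 | 0.3 | 0 | 0 | 0 | 0.002 | 0 | 0 | 0.001 | 0.001 | 0.002 | 0.002
Power
4 | 0 | 0.005 | 0.05 | 0.251 | 0.036 | 0.022 | 0.034 | 0.031 | 0.246 | 0.032 | 0.019 | 0.032 | 0.038
5 | 0 | 0.005 | 0.15 | 0.219 | 0.021 | 0.013 | 0.023 | 0.032 | 0.301 | 0.025 | 0.017 | 0.037 | 0.043
6 | 0 | 0.005 | 0.3 | 0.229 | 0.028 | 0.017 | 0.021 | 0.034 | 0.282 | 0.024 | 0.017 | 0.025 | 0.039
7 | 0.1 | 0 | 0.05 | 0 | 0.017 | 0.019 | 0.006 | 0.001 | 0 | 0.017 | 0.027 | 0.009 | 0.001
8 | 0.1 | 0 | 0.15 | 0 | 0.213 | 0.221 | 0.113 | 0.054 | 0.002 | 0.245 | 0.245 | 0.132 | 0.068
9 | 0.1 | 0 | 0.3 | 0.018 | 0.704 | 0.659 | 0.581 | 0.388 | 0.027 | 0.72 | 0.686 | 0.607 | 0.446
10 | 0.1 | 0.005 | 0.05 | 0.288 | 0.052 | 0.076 | 0.043 | 0.043 | 0.313 | 0.063 | 0.091 | 0.05 | 0.041
11 | 0.1 | 0.005 | 0.15 | 0.403 | 0.33 | 0.302 | 0.199 | 0.134 | 0.46 | 0.357 | 0.334 | 0.229 | 0.136
12 | 0.1 | 0.005 | 0.3 | 0.569 | 0.778 | 0.738 | 0.677 | 0.485 | 0.62 | 0.805 | 0.774 | 0.712 | 0.512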
  1. Eg denotes the effect size of gene expression on phenotype. Vgp denotes phenotype variance explained by all variants. Vge denotes gene expression variance explained by all variants.
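
Empirical type I error and power, as reported in Table 1, are both rejection rates over simulated datasets: the former when no true effect was simulated, the latter when one was. The sketch below shows that calculation; the function name, the toy p-values, and the significance threshold `alpha` are assumptions, since the threshold used for Table 1 is not stated in this section.

```python
from typing import List


def rejection_rate(p_values: List[float], alpha: float) -> float:
    """Fraction of simulated datasets whose p-value falls below alpha.

    Computed on null simulations this estimates the type I error;
    computed on simulations with a true effect it estimates power.
    """
    if not p_values:
        raise ValueError("p_values must not be empty")
    return sum(p < alpha for p in p_values) / len(p_values)


# Toy p-values (hypothetical) and an assumed threshold of 0.05.
null_p_values = [0.62, 0.31, 0.04, 0.88, 0.47, 0.15, 0.93, 0.29]
alt_p_values = [0.001, 0.20, 0.03, 0.0004, 0.07, 0.01, 0.002, 0.45]

type_one_error = rejection_rate(null_p_values, alpha=0.05)
power = rejection_rate(alt_p_values, alpha=0.05)
```

With a stricter threshold, such as a genome-wide gene-level cutoff, both rates shrink, which is why small values in the type I error rows are informative only relative to the threshold that was applied.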

Table 2
Whether the brain was estimated as a schizophrenia-associated tissue based on the gene-level or isoform-level eQTLs of each brain region.
Brain region | Gene-level eQTL | Isoform-level eQTL
Brain-Anterior cingulate cortex (BA24)* | Yes | Yes
Brain-Cerebellum* | Yes | Yes
Brain-Frontal Cortex (BA9)* | Yes | Yes
Brain-Hippocampus* | Yes | Yes
Brain-Spinal cord (cervical c-1)* | Yes | Yes
Brain-Amygdala | Yes | No
Brain-Caudate (basal ganglia) | Yes | No
Brain-Cerebellar Hemisphere | Yes | No
Brain-Cortex | Yes | No
Brain-Hypothalamus | Yes | No
Brain-Nucleus accumbens (basal ganglia) | Yes | No
Brain-Putamen (basal ganglia) | No | No
Brain-Substantia nigra | Yes | No
  1. “Yes” denotes that the brain (i.e., all thirteen brain tissues) was estimated as a significantly schizophrenia-associated tissue based on the gene-level or isoform-level eQTLs of that region. “No” denotes the contrary. The optimized brain regions are marked with an asterisk (*). See also Table 2—source data 1, Table 2—source data 2, Table 2—source data 3 and Table 2—source data 4.

Table 2—source data 1

Tissue significance estimated by eDESE:dist based on the gene-level expression profiles.

https://cdn.elifesciences.org/articles/70779/elife-70779-table2-data1-v1.xlsx
Table 2—source data 2

Tissue significance estimated by eDESE:dist based on the isoform-level expression profiles.

https://cdn.elifesciences.org/articles/70779/elife-70779-table2-data2-v1.xlsx
Table 2—source data 3

Tissue significance estimated by eDESE:gene based on the gene-level eQTLs of each brain region.

https://cdn.elifesciences.org/articles/70779/elife-70779-table2-data3-v1.xlsx
Table 2—source data 4

Tissue significance estimated by eDESE:isoform based on the isoform-level eQTLs of each brain region.

https://cdn.elifesciences.org/articles/70779/elife-70779-table2-data4-v1.xlsx
Table 3
The important examples of potential susceptibility genes exclusively predicted by eDESE:isoform.
Gene name | # of hits in PubMed
RGS4 | > 100
TCF4 | > 100
RANGAP1 | > 100
GRIA1 | 80
GRM3 | 76
TSPO | 39
TPH2 | 35
FEZ1 | 31
ZDHHC8 | 24
VRK2 | 23
KCNN3 | 20
NCAM1 | 20
MIP | 15
SLC39A8 | 14
DLG1 | 14
BDNF-AS | 13
FGA | 13
ADRA1A | 12
MAPT | 10
Table 3—source data 1

The PubMed search hits of the unique potential susceptibility genes of schizophrenia identified by eDESE:isoform (compared with S-PrediXcan and eDESE:gene).

https://cdn.elifesciences.org/articles/70779/elife-70779-table3-data1-v1.xlsx
Table 4
The GO enrichment terms of the potential susceptibility genes in each optimized brain region identified by S-PrediXcan, eDESE:gene and eDESE:isoform.
Tissue name* | S-PrediXcan | eDESE:gene | eDESE:isoform
Brain-Anterior cingulate cortex (BA24) | - | Regulation of gap junction assembly (BP) | Potassium ion transmembrane transporter activity (MF); potassium:chloride symporter activity (MF); nervous system development (BP); generation of neurons (BP); neurogenesis (BP)
Brain-Cerebellum | Ion binding (MF); cation binding (MF); metal ion binding (MF); intracellular organelle (CC) | Dendrite (CC); dendritic tree (CC); neuron projection (CC); postsynapse (CC); synapse (CC) | Voltage-gated calcium channel activity involved in cardiac muscle cell action potential (MF); nervous system development (BP)
Brain-Frontal Cortex (BA9) | Intracellular organelle (CC); intracellular membrane-bounded organelle (CC) | Somatodendritic compartment (CC); dendrite (CC); dendritic tree (CC); synaptic vesicle membrane (CC); exocytic vesicle membrane (CC) | -
Brain-Hippocampus | Ion binding (MF) | - | Postsynaptic density (CC); asymmetric synapse (CC)
Brain-Spinal cord (cervical c-1) | - | High voltage-gated calcium channel activity (MF); voltage-gated calcium channel activity involved in AV node cell action potential (MF); regulation of B cell tolerance induction (BP); positive regulation of B cell tolerance induction (BP); L-type voltage-gated calcium channel complex (CC) | Nitrogen compound transport (BP); organelle (CC)
  * MF: Molecular Function terms of GO. BP: Biological Process terms of GO. CC: Cellular Component terms of GO.

Table 5
The target genes of the antipsychotics predicted as the potential susceptibility genes by MAGMA, S-PrediXcan, and eDESE.
Target gene | Models
DRD2 | eDESE:dist & eDESE:gene & eDESE:isoform & MAGMA
ADRA1A | eDESE:isoform
CHRM3 | eDESE:gene & eDESE:isoform
CHRM4 | eDESE:gene & eDESE:isoform & MAGMA
OPRD1 | eDESE:dist
GABRD | eDESE:isoform
CYP2D6 | eDESE:gene & eDESE:isoform & MAGMA & S-PrediXcan
Table 6
The enrichment of drug-gene interaction terms in DGIdb for the susceptibility genes identified by MAGMA, S-PrediXcan and eDESE.
Models | # of antipsychotics-gene interaction terms | # of total drug-gene interaction terms | Enrichment p*
MAGMA | 57 | 937 | 1.62e-11
S-PrediXcan | 12 | 452 | 0.33
eDESE:dist | 34 | 279 | 1.56e-15
eDESE:gene | 56 | 968 | 1.65e-10
eDESE:isoform | 70 | 1,104 | 8.74e-15
  * Enrichment p denotes the p-value of the hypergeometric distribution test. See also Table 6—source data 1, Table 6—source data 2, Table 6—source data 3, Table 6—source data 4, and Table 6—source data 5.
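
An upper-tail hypergeometric test of the kind named in the Table 6 footnote can be sketched as below. The background counts needed to reproduce the reported p-values (the total number of drug-gene interaction terms in DGIdb and how many of them involve antipsychotics) are not given in this section, so the function name, parameter names, and numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for a hypergeometric variable X.

    N: size of the background (e.g. all drug-gene interaction terms)
    K: background items of interest (e.g. antipsychotic-gene terms)
    n: items drawn (e.g. interaction terms hit by a gene set)
    k: drawn items of interest
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")
    lo = max(k, 0, n - (N - K))  # smallest feasible count at or above k
    hi = min(n, K)               # largest feasible count
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(lo, hi + 1))
    return tail / total


# Hypothetical toy numbers: 3 of 10 drawn terms are of interest,
# against a background where 5 of 100 terms are of interest.
p_enrichment = hypergeom_upper_tail(k=3, n=10, K=5, N=100)
```

Exact integer arithmetic via `math.comb` avoids the rounding problems that floating-point factorials would cause for backgrounds with tens of thousands of terms.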

Table 6—source data 1

The drug-gene interaction term results of the potential susceptibility genes of schizophrenia identified by MAGMA in DGIdb.

https://cdn.elifesciences.org/articles/70779/elife-70779-table6-data1-v1.xlsx
Table 6—source data 2

The drug-gene interaction term results of the potential susceptibility genes of schizophrenia identified by S-PrediXcan in DGIdb.

https://cdn.elifesciences.org/articles/70779/elife-70779-table6-data2-v1.xlsx
Table 6—source data 3

The drug-gene interaction term results of the potential susceptibility genes of schizophrenia identified by eDESE:dist in DGIdb.

https://cdn.elifesciences.org/articles/70779/elife-70779-table6-data3-v1.xlsx
Table 6—source data 4

The drug-gene interaction term results of the potential susceptibility genes of schizophrenia identified by eDESE:gene in DGIdb.

https://cdn.elifesciences.org/articles/70779/elife-70779-table6-data4-v1.xlsx
Table 6—source data 5

The drug-gene interaction term results of the potential susceptibility genes of schizophrenia identified by eDESE:isoform in DGIdb.

https://cdn.elifesciences.org/articles/70779/elife-70779-table6-data5-v1.xlsx

Additional files

Supplementary file 1

For eDESE analyses.

(a) The schizophrenia potential susceptibility genes identified by eDESE:dist. (b) The GO enrichment results based on the overlapped susceptibility genes identified by MAGMA and eDESE:dist. (c) The PubMed search hits of the schizophrenia potential susceptibility genes identified by MAGMA and eDESE:dist. (d) Tissue significances generated by eDESE:dist based on the gene-level and isoform-level expression profiles. (e) Tissue significances generated by eDESE:gene/isoform using the gene-level and isoform-level eQTLs of Muscle Skeletal and Skin Sun Exposed Lower Leg. (f-h) The schizophrenia potential susceptibility genes identified by S-PrediXcan, eDESE:gene and eDESE:isoform, respectively. (i-k) The PubMed search hits of the combined susceptibility gene set of schizophrenia identified by S-PrediXcan, eDESE:gene and eDESE:isoform, respectively. (l) The GO enrichment results based on the susceptibility genes exclusively predicted by eDESE:isoform. (m) The genes in the consensus module (colored "turquoise") of the brain weighted gene co-expression network. (n) The potential susceptibility isoforms of the 55 overlapped genes. (o) The FDA-approved antipsychotics included in the DGIdb (v4.2.0).

https://cdn.elifesciences.org/articles/70779/elife-70779-supp1-v1.xlsx
Transparent reporting form
https://cdn.elifesciences.org/articles/70779/elife-70779-transrepform1-v1.docx

Cite this article

Xiangyi Li, Lin Jiang, Chao Xue, Mulin Jun Li, Miaoxin Li (2022)
A conditional gene-based association framework integrating isoform-level eQTL data reveals new susceptibility genes for schizophrenia
eLife 11:e70779.
https://doi.org/10.7554/eLife.70779