Abstract

There is growing recognition that co-morbidity and co-occurrence of disease traits are often determined by shared genetic and molecular mechanisms. In most cases, however, the specific mechanisms that lead to such trait–trait relationships are yet unknown. Here we present an analysis of a broad spectrum of behavioral and physiological traits together with gene-expression measurements across genetically diverse mouse strains. We develop an unbiased methodology that constructs potentially overlapping groups of traits and resolves their underlying combination of genetic loci and molecular mechanisms. For example, our method predicts that genetic variation in the Klf7 gene may influence gene transcripts in bone marrow-derived myeloid cells, which in turn affect 17 behavioral traits following morphine injection; this predicted effect of Klf7 is consistent with an in vitro perturbation of Klf7 in bone marrow cells. Our analysis demonstrates the utility of studying hidden causative mechanisms that lead to relationships between complex traits.

https://doi.org/10.7554/eLife.04346.001

eLife digest

Many individuals who have diabetes also have other diseases that affect the heart and blood vessels. It is not uncommon for human diseases to occur together like this; and understanding the relationships between diseases and other traits can make it easier to diagnose conditions. Furthermore, it can also help researchers develop treatments that are more precisely targeted to each condition and cause fewer side effects.

Two conditions or traits tend to occur together if they are caused by mutations in the same gene or genes; or if they involve processes within cells that share the same proteins and other molecules. However, in most cases the genes and molecular mechanisms involved are not yet known so it is more difficult to work out how the traits are connected.

Computing techniques make it possible to assess the relationships between hundreds or thousands of traits at the same time. These high volume analyses can also allow scientists to identify less obvious relationships that might be missed in more traditional types of study.

Here, Oren et al. created a new computer algorithm to identify related traits, their shared genetic basis, and the molecular mechanisms behind them. The algorithm is called GEMOT and uses a three-step approach to sift through a large amount of data. Oren et al. tested GEMOT using a database of 1738 documented traits—including diseases and behaviors—in laboratory mice.

Oren et al. identified many clusters of traits in the mice and the underlying genetic and molecular mechanisms that link them. For example, they found that a mutation in a gene called Klf7 affected the expression of other genes that are involved in making new cells in the bone marrow. In turn, these changes influenced 17 different behaviors in the mice after they were injected with the painkiller morphine.

In humans, the same genes that underlie behaviors related to morphine treatment have been linked to the survival rate of patients with a form of brain cancer. This suggests that—alongside providing pain-relief—morphine may influence how the tumor grows. The algorithm developed by Oren et al. can now be used to further explore the impact of the environment on the relationships between traits.

https://doi.org/10.7554/eLife.04346.002

Introduction

Epidemiological and clinical research has identified a profusion of correlated physiological traits, as well as a remarkably high incidence of co-occurrence and comorbidity among disorders. Various studies have shown that such connections among diseases are typically attributable to a common underlying genetic or molecular mechanism (Rzhetsky et al., 2007; Oti et al., 2008; Barabási et al., 2011; Cotsapas et al., 2011; Lee et al., 2012; Cross-Disorder Group of the Psychiatric Genomics Consortium, 2013). Disclosure of unexpected relationships among disease phenotypes and understanding of their common mechanisms opens the way to improved disease classification and treatment. In particular, it may allow a drug approved for one disease to be used for the treatment of another disease, and provide us with the means to anticipate undesired off-target effects (e.g., Ashburn and Thor, 2004; Dudley et al., 2011).

Advanced computational methods have made it possible to study the mechanisms underlying trait connections in an unbiased manner. One approach is to derive trait connectivity based on trait–trait comorbidity, co-occurrence, and correlations (Hidalgo et al., 2009; Shi et al., 2010; Blednov et al., 2012). As an example, Figure 1—figure supplement 1A illustrates a sample network and Figure 1—figure supplement 1B depicts a group of correlated traits in this network. Relying entirely on trait information, however, makes it difficult to identify the shared mechanisms and to distinguish shared molecular mechanisms from shared environmental influences. Alternatively, a common way to improve predictions is by integrating relationships between genes and traits, using gene–trait correlations, associations, or causal mutations (Rzhetsky et al., 2007; Cotsapas et al., 2011; Baker et al., 2012; Hwang et al., 2012; Gat-Viks et al., 2013). Such pairwise gene–trait connections were used to construct two-layer clusters (‘biclusters’) consisting of groups of traits linked to the same group of genes. For example, Figure 1—figure supplement 1C depicts a bicluster for the sample network in Figure 1—figure supplement 1A. Notably, although such ‘gene-based’ approaches provide a list of putative non-environmental mechanisms, their utilization has two major drawbacks. First, these approaches assign a single mechanistic layer whereas in fact what is affected by genetic variation is a number of molecular components (such as transcripts), which affect the physiological traits secondarily; thus, the model should include a series of mechanistic layers and the propagation of information between them. Secondly, gene-based approaches do not distinguish between reactive, independent, and causative relationships, whereas molecular components should be related causatively to the group of traits. For example, although transcript g2 is reactive to the p4 trait but does not cause it (Figure 1—figure supplement 1A), it is still part of a two-layer model (Figure 1—figure supplement 1C). Thus, a valid and reliable methodology for understanding similarities among distinct traits should identify a series of layers and causative relationships.

We have developed GEMOT, a methodology for constructing a causative model of trait–trait connections. GEMOT addresses the above challenges by constructing three-layer modules in which each module consists of a group of molecular mechanisms (here, gene transcripts within a ‘transcripts layer’) translating between a genetic layer and a layer of traits. In particular, a GEMOT module is focused specifically on causative, non-environmental relationships rather than on relationships of any kind, and accordingly it includes only transcripts that are part of causative relationships (referred to as ‘driver transcripts’, see Figure 1A, Figure 1—figure supplement 1D). Naively, systematic identification of GEMOT modules could be obtained by a detailed reconstruction of all relationships among variants, traits and transcripts (e.g., Neto et al., 2010; Hageman et al., 2011; Wang and van Eeuwijk, 2014). However, such a detailed reconstruction is a major computational and statistical burden, especially considering the large number of components. This problem is avoided in GEMOT by the use of a stepwise heuristic approach that drastically reduces the search space and allows scalability to large networks.

Figure 1 with 2 supplements see all
Schematic illustration of the GEMOT algorithm.

(A) An overview of GEMOT for the scenario depicted in Figure 1—figure supplement 1A. GEMOT incorporates 3 stages (I, II and III) that are schematically described in BD, EF and GI, respectively. Stage I: (B) High and low link-potential scores for pairs of variants and traits. A typical calculation for variant–trait pairs (left: v1p1, right: v1p2). Shown are variant–transcript associations (y-axis) and transcript–trait correlations (x-axis) for each transcript (black dots). GEMOT uses a threshold (horizontal dashed line) to compare the distribution of transcript–trait correlation in high and low transcript–variant associations scores. Link potential is high (left) when the distributions of correlation differ between the high and low association range more than expected by chance; link potential is low (right) where no difference is observed. Notably, a high link potential reflects the potential that some transcripts may translate between a variant and a trait, although such transcripts are not yet specified. (C) Bipartite graph construction. GEMOT constructs a graph whose two parts are variants (squares) and traits (diamonds); edge weights are the link-potential scores (solid lines, high; dashed lines, low). (D) Bipartite module identification. GEMOT identifies ‘heavy’ biclusters in the bipartite graph (in this example, 1 module). Stage II: (E) Transcript link score. The input is provided by all calculated correlation and association scores (such as the three plots on the left). On the right: given a transcript, GEMOT aggregates and ranks all its scores in a horizontal track (red, correlations; blue, associations) and uses the distribution of ranks to score the transcript for significance. (F) Tripartite module construction. GEMOT adds high-scoring transcripts from E, referred to as ‘candidate transcripts’ (circles), to the module. Stage III: (G) Causality p value scores. GEMOT uses a statistical score to assess causative relationships (blue, significant; white, non-significant) for each transcript (row) and trait (column) in a module. Non-causative relationships attain non-significant scores (cartoon examples on the left). (H) Module refinement. Starting with the causality p value scores for the tripartite module (from G), GEMOT eliminates non-causative transcripts (left) and non-affected traits (right) in an iterative manner. (I) The resulting GEMOT module (arrows, causative relationships).

https://doi.org/10.7554/eLife.04346.003

We applied GEMOT to a large dataset of 1738 traits spanning a broad spectrum of physiological, biochemical, clinical, and behavioral traits that were measured across the genotyped BXD mouse recombinant inbred strains (a cross between the C57BL/6J and DBA/2J strains [Wang et al., 2003]). We used measurements of transcript levels in bone marrow-derived myeloid cells (Gerrits et al., 2009). The modules were used to uncover shared driver transcripts underlying collections of closely related or distinct traits. Notably, many of the findings were supported by independent knowledge or data. We also demonstrated the tissue specificity of modules, based on a post-processing analysis of gene expression in additional tissues. Our study highlights the power of causative reconstruction combined with grouping of complex traits to reveal a comprehensive picture of phenome connections.

Results

The GEMOT algorithm

Global identification of traits that share common causal transcripts and genetic mechanisms requires a reliable reconstruction of a global causal network among variants, transcripts and traits—a notoriously difficult problem, particularly in the case of a large collection of traits and high throughput data. We designed GEMOT, a scalable algorithm for the systematic construction of three-layer modules, each consisting of a group of traits, their shared causal driver transcripts and a genetic variant. GEMOT is based on the notion of ‘linked relationships’ between a variant, a transcript, and a trait. Such relationships incorporate a transcript that is both associated with a variant and correlated with a trait, without a direct evaluation of the causative relationship between the three components. In particular, it relies on the observation that causative relationships are expected also to be linked relationships (but not vice versa). It is therefore possible to start by constructing candidate modules based on the potential of variants and traits to be linked through transcripts. The internal structure is then constructed within each of these modules. Notably, the linked relationships are exploited to avoid global reconstruction of the particular types of relationships, which are then confirmed only at the validation stage.

GEMOT's input is a collection of traits, genotyping, and molecular data across a population of individuals. GEMOT consists of three main stages (see ‘Materials and methods’, Figure 1A). In stage I, GEMOT identifies ‘bipartite modules’ consisting of a single genomic interval and multiple traits that are connected by linked relationships through certain transcripts (e.g., traits p1, p3, p4 and variant v1 in Figure 1A, left). In stage II, candidate transcripts are added to the modules solely on the basis of their linked relationships (e.g., candidate transcripts g1g3, Figure 1A, middle); the resulting ‘tripartite modules’, however, are not limited to causative relationships. Finally, in stage III, GEMOT refines the tripartite modules by investigating the causal relationships within them and eliminating non-causative transcripts. The output GEMOT module, therefore, finally consists of the validated driver transcripts, the trait(s) they affect, and a single causal genomic interval (e.g., driver transcripts g1, g3; traits p1, p4; variant v1 in Figure 1A, right). Overall, each of the first two stages is aimed at filtering relevant objects for the next stage: candidate modules are selected on the basis of their potential to include candidate transcripts (stage I), and candidate transcripts are selected on the basis of an efficient score that is expected to be high in true driver transcripts (stage II). The final stage (stage III) is aimed at validating the causative relationships in the candidate modules from the previous stage. In the following we provide additional details about the three GEMOT stages.

  • I. We start by calculating associations between each transcript and each variant (a ‘variant–transcript association’) and correlations between each transcript and each trait (a ‘transcript–trait correlation’). We combine these two measures in a statistical test to identify variant–trait pairs that have high potential to be linked through many transcripts. We call this scoring scheme a ‘link potential’ (Figure 1B). From these link-potential scores we construct a bipartite graph whose two parts are variants and traits, where edge weights are the link-potential scores (Figure 1C). We use a biclustering approach (the REL algorithm; [Gat-Viks et al., 2010]) to identify within this graph heavy ‘bipartite modules’, each consisting of a single genomic interval (harbouring a consecutive list of variants) and a collection of traits (Figure 1D). Importantly, such bipartite modules do not represent pleiotropy in general, but only pleiotropy that is likely mediated through transcripts.

  • II. We next apply a statistical test to identify transcripts whose linked relationships in the module are higher than expected by chance. This is done by evaluating the correlations and associations of transcripts with the module's traits and genomic interval respectively, computing ‘transcript link scores’, and using it to filter promising ‘candidate transcripts’ (Figure 1E). We then add the candidate transcripts to the bipartite modules, thus forming ‘tripartite modules’ (Figure 1F).

  • III. In the third stage the aim is to investigate the internal relationships within a module, thus allowing identification of driver transcripts and their affected traits. Recent methods allow structural reconstruction of a causality network (e.g., Neto et al., 2010; Hageman et al., 2011; Wang and van Eeuwijk, 2014), and can therefore be applied on each module to reveal its internal structure. However, since such methods may fail owing to a lack of scalability to large networks, we use an alternative approach that aims only to identify the relationships among layers, without the need to infer the causative relationships within each of the layers. To that end, we devise the following 2-step procedure: We first assess the causality among all candidate transcripts and traits in a module, assuming a single representative variant for the module's genomic interval (Figure 1G). We use a ‘causality p value’ score to assess the quality of such causative relationships. Next, the modules are refined by the iterative removal of transcripts and traits whose causality p values are non-significant (Figure 1H). The output is a list of ‘GEMOT modules’, each consisting of a single genomic interval, a group of validated ‘driver transcripts’, and their affected traits (Figure 1I).

Notably, the GEMOT algorithm is a unified pipeline that integrates several independent procedures, including biclustering, causality testing and network reconstruction. In this study we use the ReL biclustering algorithm (Gat-Viks et al., 2010); the causality testing proposed by Neto et al. (2013); and a tailored network reconstruction scheme. However, the GEMOT pipeline is general and can be applied using alternative procedures (e.g., biclustering [Tanay et al., 2002]; network reconstruction [Neto et al., 2010; Hageman et al., 2011; Wang and van Eeuwijk, 2014]; causality testing [Lee et al., 2009; Neto et al., 2013]).

Application of GEMOT in phenotypically diverse recombinant inbred mouse strains

We applied GEMOT to study phenome connections using gene expression in myeloid Gr-1+ cells (Gerrits et al., 2009) and 1738 traits across recombinant inbred BXD mice (see ‘Materials and methods’). Using GEMOT, we found 40 bipartite modules, 11 tripartite modules, and 8 mature GEMOT modules with non-overlapping sets of drivers (permutation-based FDR < 10−3, ‘Materials and methods’; see Figure 2A, Figure 2—figure supplements 1, 2 and Supplementary file 1A,B). For comparison, no GEMOT modules were found in 100 permutation tests, and the average number of bipartite and tripartite modules in 100 permutation tests was 32 and 0.06, respectively (Figure 2—figure supplement 2). As expected, the observed link potential scores in mice were much higher than would be expected by chance (p < 10−10, permutation test; Figure 2—figure supplement 3 and ‘Materials and methods’). It is highly unlikely, therefore, that our GEMOT modules were generated at random (p < 0.01). Notably modules nos. 1–3, 6–8 show strong correlations between traits, whereas the remaining modules show moderate or weak trait–trait correlations (Figure 2—figure supplement 4).

Figure 2 with 5 supplements see all
GEMOT modules in BXD mouse strains.

(A) Shown is a module identifier (column 1), its genomic interval (column 2), the numbers of driver transcripts and traits in the module (columns 3 and 4, respectively), and the main characteristic of its traits (column 5, see description in Supplementary file 1A). (B) Global visualization of the GEMOT modules. The graph presents the genomic intervals (squares, bottom), the transcripts (circles, middle) and traits (diamonds, top) of all eight resulting GEMOT modules. The transcripts, which are unique to each module, are connected to their module's variants and traits. Sets of traits known to be related to the same process are enclosed in a box and labeled (top). The module's genetic and transcripts layers are color coded as in A; the traits are color coded based on their gender: female (white), male (gray) or both (black). Notably, some modules have overlapping collections of traits, or their traits relate to the same process.

https://doi.org/10.7554/eLife.04346.006

Given that our findings were significant, we next aimed to obtain a global perspective on the resulting modules. To that end we generated a graph of the modules, where the transcripts layer of each module is connected to the module's traits and genomic interval (Figure 2B). On this graph we marked groups of similar traits, such as different behavioral and physiological responses after ethanol stimulation, or the results of different thermal nociception tests. Notably, the trait collections of the different modules are partly overlapping. For example, five of the modules relate to morphine responses (nos. 2, 5–8) and two modules relate to thermal nociception (nos. 3 and 5). Nevertheless, although some of the modules have overlapping traits, there is no overlap between the drivers and variants of the different modules (Figure 2A and Supplementary file 1B), demonstrating GEMOT's ability to predict several distinct underlying mechanisms for the same collection of traits. Furthermore, a focused examination revealed that the traits of each of the modules shared a unique characteristic. For example, module nos. 6 and 2 relate to place preference following morphine injection, but at distinct time intervals (Figure 2); similarly, module nos. 1 and 3 relate to an anxiety assay, but with and without ethanol stimulation, respectively. Taken together, our results indicated high-level organization of overlapping collections of traits, while each module reflects a unique molecular and genetic signature that underlies a different trait characteristic.

Notably, some of the resulting modules consisted of multiple traits that are related to the same process, whereas others consisted of a collection of distinct traits (Figure 2 and Supplementary file 1B). For example, a module related to morphine response (module no. 2) consists of 17 different traits that were measured following treatment of mice with morphine at different time points and in various behavioral assays. Similarly, a module related to an anxiety assay (module no. 1) consists of two different traits that were measured following treatment with ethanol at different time points. In contrast, and consistently with our goal of identifying novel relationships among traits, module nos. 3, 4 and 5 suggest previously unknown connections between traits.

We next characterized pairs of traits within each group of traits (‘trait pairs’) to show that the quality of these pairs is not lower than in existing methods. We focused on three main properties of trait pairs: the correlation among traits in a pair; the correlation between a trait pair and the transcripts; and the knowledge-based relationships among traits. As a reference we demonstrate these properties for modules that were generated using three alternative methods: (i) the trait-based hierarchical clustering approach (denoted ‘HC’; [Hastie et al., 2009], as in Figure 1—figure supplement 1B); (ii) the gene-based INVAMOD algorithm, which identify pleiotropic genetic variants and their associated groups of traits in an agglomerative manner (Gat-Viks et al., 2013; as illustrated in Figure 1—figure supplement 1C); and (iii) a set of randomly sampled trait pairs (5% of all possible trait pairs). In particular, the bipartite modules were compared to the top 40 groups from each method (Figure 2—figure supplement 5A,B). Similar results were obtained when the eight GEMOT modules were compared to the top eight groups from each of the alternative methods (Figure 2—figure supplement 5C,D). For each property, we first explain the metric of evaluation and then present the results with GEMOT and with the alternative methods.

Correlation among traits

GEMOT's correlations among trait pairs were much stronger than expected by chance (p < 10−200), and were comparable to those obtained by the gene-based INVAMOD approach but weaker than those from the trait-based HC approach (p < 10−160; Figure 2—figure supplement 5E). This result was consistent with the fact that HC, but not GEMOT or INVAMOD, directly optimizes for such correlations.

Shared transcripts

For each trait pair we searched for a potential shared transcript that showed the best correlation (on average) with both traits. We found that GEMOT's predicted trait pairs were supported by the best-correlated transcripts across a wide range of trait–trait correlations (Figure 2—figure supplement 5A,C). As expected, in all methods the higher the correlation between traits, the stronger the correlations with the shared transcripts. This analysis indicated that GEMOT outperforms the gene-based and random methods (p < 10−30 and p < 10−50, respectively) and is comparable to the trait-based method.

Knowledge-based relations among traits

We next aimed at determining whether trait pairs predicted by GEMOT were supported by previously known connections between traits. Evaluation of such distinct trait pairs in light of previously known trait connections is a general problem, and no suitable systematic annotation is currently available. To tackle this problem we constructed an unbiased matched-annotation set of connections, and used it in our analysis as the gold standard. To systematically cover, independently of the correlation and co-association measures, the qualitative knowledge about connections among traits, we adopted the descriptive title of each trait in the GeneNetwork Phenotype Database (Wang et al., 2003). If the descriptions of two traits included at least one shared biological term, the traits were considered as ‘matched-term traits’ and were included in our gold standard set. Using this matched-annotation set, we found that all methods attained a similar proportion of matched-term trait; the higher the correlation between traits—the higher the proportion of matched terms, as expected (Figure 2—figure supplement 5B,D).

In the following we demonstrate three reconstructed modules, demonstrating GEMOT's ability to identify a model for a collection of tightly related traits, and for a previously uncharacterized combination of traits.

Discovery of a model for a collection of tightly related traits

The morphine module (module no. 2, see Figure 3) exemplifies the ability of GEMOT to suggest an underlying mechanism for an entire group of traits known to be tightly related. The module consisted of a collection of 17 behavioral assays in the recombinant mouse strains, all carried out to measure their responses to injection of morphine (50 mg/kg, over different periods of time). Measured parameters included distance traveled, place preference, and vertical activity ([Philip et al., 2010]; Supplementary file 1A). All module traits showed a strong positive correlation with each other (|r| values ranged from 0.6 to 0.99, Figure 2—figure supplement 4) and shared similar peaks within the genomic interval of module no. 2 (Figure 3A, top).

Figure 3 with 5 supplements see all
Characteristics of module no. 2.

(A) Genetic associations. Shown are the association scores (y-axis) across the genomic positions of chromosome 1 (x-axis) for four module traits (top) and for seven selected drivers (bottom) in myeloid cells. The position of the Klf7 gene is marked below the x-axis. (BE) Characterization of driver groups I and II. (B) A matrix of traits (rows) vs drivers (columns), where the blue/red scale indicates negative/positive Pearson correlation coefficients among them. (C) The transcript causality p value scores (y-axis) are shown for each module transcript (log scale), assuming a representative variant in the module's genomic interval (rs13475891 in chr1:62 Mbp). (D) The histogram represents the Pearson correlation coefficient (y-axis) of Klf7 with the remaining drivers (x-axis). (E) Genetic effect size of variant rs13475891 on the driver transcripts. The histogram represents the average expression levels of DBA2-carrying individuals minus the B6-carrying individuals (y-axis) for each module driver (x-axis). (F) Distribution of the causality −log p value scores of Idh1 (blue) and of Klf7 (black) on each of the remaining drivers in the module. Causality p values were calculated by positioning Idh1 or Klf7 between the module variant and a driver from this module. (G) Validation of the effect size of Klf7 perturbation of knockdown (left) or overexpression (right) on other drivers in bone-marrow hematopoietic stem cells (y-axis). The ‘effect size’ of perturbation (either knockdown or overexpression) on a certain transcript g is defined as the difference between the (log-scaled) expression of g in normal cells to that in the perturbed cells. In each panel, the first and second columns refer to groups I and II, respectively. **a significant t-test p value for determining whether the mean effect size is different from zero (FDR < 0.06). (H) Scatter plot, where for each driver in group I (a black square) the y-axis shows the transcript causality score (for morphine-response traits in this module; p values), and the x-axis shows the significance of Klf7 knockdown effect on its transcription level (t-test p value). The plot indicates that in group I, drivers with highly significant causality on behavioral responses to morphine were also significantly influenced by Klf7 knockdown. (I) Overall model illustration of module no. 2.

https://doi.org/10.7554/eLife.04346.012

Module no. 2 consists of a group of 32 driver transcripts (e.g., Klf7, p35, Lrrk2), all associated with the module's genotype and strongly correlated with each other and with the module traits (Figure 3A, bottom; Figure 3B; Figure 3—figure supplement 1A). For all driver transcripts, the causative relationships were much preferable to the alternative relationships (p value ≤ 0.005, permutation-based FDR < 6 × 10−5, ‘Materials and methods’; Figure 3C and Figure 3—figure supplement 1B). We found two main groups of drivers (Figure 3D,E, Figure 3—figure supplement 1A). The first (denoted ‘group I’) consists of 21 genes whose transcript levels are negatively correlated with the morphine traits. In this group, individuals carrying the DBA/2J allele in the module's variant have higher gene expression values than those of individuals carrying the C57BL/6J allele (e.g., Klf7 in Figure 3—figure supplement 2A). The second driver group (denoted ‘group II’) has the opposite correlation with the morphine traits and the opposite genetic effect (e.g., Idh1, in Figure 3—figure supplement 2B). These observations coincide with the fact that for all module traits, individuals carrying the C57BL/6J-allele have higher trait values (Figure 3—figure supplement 2C,D). Notably, causality p values in group I are more significant than in group II (p < 0.05, t-test; Figure 3C); one possible reason is that the two groups might relate to distinct mechanisms that differ in their causality strength.

The role of the Klf7 gene in morphine module no. 2 is particularly interesting. Klf7 and Idh1 are the only two cis-associated module drivers and their causative role in mediating morphine traits is highly significant. We hypothesized that the cis-associated variation in gene-expression levels may lead to variation in the trans-associated module drivers. To test this hypothesis we used the causality p value score (‘Materials and methods’), but utilized a cis-associated gene (Klf7 or Idh1) as the transcript positioned between the module's genomic interval and another module transcript. Using these scores we found that the causative p values of Klf7 on the remaining module transcripts were substantially more significant than the causative p values of Idh1 on those transcripts (Figure 3F, Figure 3—figure supplement 3A, p < 10−7, K-S test). This finding holds for each of the driver groups I and II (Figure 3—figure supplement 3B), suggesting that Klf7, but not Idh1, likely affects the other module drivers, which in turn affect behavioral activity in response to morphine.

The finding of positive and negative correlations of Klf7 with the drivers in groups I and II is particularly intriguing because it suggests that Klf7 is an activator of group I and a repressor of group II (Figure 3D). To validate the suggested central role of Klf7 in mediating variation in other drivers, we analyzed the influences of knockdown and overexpression of Klf7 on gene expression in bone-marrow hematopoietic stem cells from the C57BL/6J mouse strain (termed Klf7KD and Klf7OE, using three and four biological repeats, respectively; data were taken from [Schuettpelz et al., 2012]). Indeed, whereas knockdown of Klf7 led to a down-regulation of group I transcripts (p < 1.5 × 10−5, FDR < 5 × 10−5, t-test), it had no influence in group II (p > 0.32; Figure 3G, left), in agreement with the predicted role of Klf7 as an activator and repressor of groups I and II, respectively. Furthermore, we found that the more significant the influence of Klf7 knockdown on a driver in group I, the stronger the causative role of the driver on behavioral morphine traits (Figure 3H). Overexpression of Klf7 had the opposite effect, with significant down-regulation of transcripts in group II (p < 0.05, FDR < 0.06, t-test) and even a small increase of transcripts in group I (p < 0.002, FDR < 0.005; Figure 3G, right), consistently with our model. Taken together, the two lines of evidence—both natural and experimental perturbations—indicated that Klf7 is a key driver mediating the effects of additional drivers in groups I and II, which in turn affect morphine response diversity (Figure 3I).

Module no. 2 may affect morphine responses through a variety of mechanisms. For example, the p35/Cdk5 driver directly phosphorylates the opioid receptor (Xie et al., 2009; Pareek et al., 2012), and the morphine adduct MO-GSH is controlled by the Idh1 and Ggt1 drivers (Correia et al., 1984; Kumagai et al., 1990; Muller and Do, 2012). Furthermore, morphine treatment may exert its action through cell migration and cell invasion processes (Gach et al., 2011): the p35 driver, as part of the p35/Cdk5 complex, affects the Rac/Cdc42 complex through PAK inhibition, resulting in altered cell migration (Nikolic et al., 1998), while the Mmp13 driver alters cell migration in response to morphine treatment because of its ability to degrade collagen (Gach et al., 2011; Wang et al., 2013). Both the Klf7 and the Cdk5/p35 drivers activate p27 by expression or phosphorylation (Laub et al., 2001; Smaldone et al., 2004), and p27 in turn affects the Rho GTPases Rac and RhoA, which then alter cell invasiveness and infiltration (Kawauchi et al., 2006). In glioblastoma, for example, the p27/Rho pathway affects infiltration of tumor cells (Ruiz-Ontañon et al., 2013). In agreement with this prediction, using the Rembrandt database (Madhavan et al., 2009) we found that all four top-ranked Klf7-mediated drivers attain significant effects on survival of patients with glioblastoma (Kaplan–Meier p < 1 × 10−11, 1.4 × 10−4, 1 × 10−4 and 0.01 for the four drivers Ntng2, p35, Rexo2 and Lrrk2, respectively; in all cases, FDR < 0.05, Figure 3—figure supplement 1C), supporting the role of module no. 2 in cell invasiveness. Further experimental studies are required in order to test these suggested pathways and search for additional mechanisms.

Our findings in module no. 2 agree well with previous studies showing that several driver genes participate in the morphine response. For example, the Klf7 transcript is up-regulated in response to morphine (being one of the top ten up-regulated genes [Suzuki et al., 2003]). Both the p35 driver and its activated protein Cdk5 were up-regulated in response to acute morphine but down-regulated on exposure to chronic morphine (Ferrer-Alcón et al., 2003). The coregulated Idh1 and Ggt1 drivers (Muller and Do, 2012) are responsible for the synthesis and degradation, respectively, of the reduced form of glutathione (GSH). A conjugated form of morphine and GSH (MO-GSH) attains higher morphine reactivity (Correia et al., 1984; Kumagai et al., 1990) that may alter the morphine responses. However, whereas key roles of Klf7 have been reported primarily in neurons (Bieker, 2001; Laub et al., 2005), here we found a causative effect of Klf7 on behavioral responses to morphine that was specific to myeloid tissue (Figure 3—figure supplement 4). We cannot as yet explain this observation; however, p35/Cdk5-mediated neutrophil secretion (Rosales et al., 2004) and the cytokine-mediated regulation of Klf7 by morphine in lymphocytes (Suzuki et al., 2003) potentially provide an explanation for this tissue specificity. Furthermore, morphine injection leads to a reduction in neutrophil infiltration 30–120 min after treatment (Clark et al., 2007), in agreement with the timing of causative relationships between Klf7 and behavioral assays at 30–120 min after morphine injection (Figure 3—figure supplement 5). Taken together, our results suggest that in vivo behavioral responses to morphine are affected not only by neuronal activity, but also through certain components of the immune system.

Discovery of novel connections among traits

In the following we demonstrate GEMOT's ability to identify a model for previously uncharacterized connections, with either strong (module no. 3) or weak (module no. 4) correlations among traits.

Module no. 3 (Figure 4A) shows the ability of GEMOT to group a variety of distinct traits. This module consists of a genomic interval in chr4:133–142 Mbp and five correlated traits, namely pain response (thermal nociception [Philip et al., 2010]), lens weight (Zhou and Williams, 1999), eye weight (with or without correction for brain weight [Zhou and Williams, 1999]), and ethanol response (place preference [Cunningham, 1995]). All traits were found to be strongly intercorrelated, with |r| values ranging from 0.67 to 0.9 (Figure 2—figure supplement 4). Eya3 and Cd52 were proposed as cis-associated drivers, a suggestion further supported by the known role of Eya3 in eye development (Tadjuidje and Hegde, 2013) and the involvement of Cd52 in pain signaling (Poh et al., 2012). Figure 4—figure supplement 1 shows that the significant causative role of these drivers can be found along the entire myeloid pathway (including stem cells [LinSca-1+c-Kit+], common progenitors of the myeloid and erythroid lineages [LinSca-1c-Kit+], erythroid [TER-119+] and myeloid [Gr-1+] lineages), but not when using data from the lymphoid, eye, or brain tissues. This suggests that Eya3 and Cd52 play a role in pain processes and eye conditions mainly through their functionality in myeloid cells.

Figure 4 with 2 supplements see all
Characteristics of module nos. 3 (A) and 4 (B).

Left: Genetic associations. Association scores (y-axis) across the genomic positions (x-axis) for the module's driver transcripts, based on gene-expression data in myeloid cells. Middle: Matrix of correlations among traits (columns) vs driver transcripts (rows); the blue/red scale indicates negative/positive correlation coefficients. Right: Matrix of traits (columns) vs driver transcripts (rows), where the blue/white scale indicates the significance of their causative relationships based on gene-expression data from the myeloid tissue. A histogram depicting transcript causality p value scores is shown as in Figure 3C. A representative variant in the module's genomic interval is assumed (see Supplementary file 1C).

https://doi.org/10.7554/eLife.04346.018

Module no. 4 (Figure 4B) demonstrates GEMOT's ability to identify a group of traits that show weak correlations among themselves, but share the same driver transcripts. The module consists of a genomic interval in chr11:8–19 Mbp and five traits: the measures of two hippocampal structures (volume and age-related lesions) from two distinct publications (Jucker et al., 2000; Martin et al., 2006), locomotor response to cocaine (Philip et al., 2010), number of liver tumors (Lee et al., 1995), and number of haematopoietic stem cells (Liang et al., 2007). The module's traits show weak and moderate correlations, with |r| values ranging from 0.01 to 0.58 (Figure 2—figure supplement 4). Of six suggested drivers, Ythdf2 and Aldh6a1 were predicted as the top drivers (Figure 4B, right panel). The validity of this prediction is supported by the known involvement of Aldh6a1 in brain structure (Marcadier et al., 2013), and is further supported by a recent report (Meyer and Jaffrey, 2014) that N6-methyladenosine (m6A) modification of RNA, whose readers are Ythdf13, causes an altered locomotor response to cocaine. In this module, a significant causative role was found in myeloid cells, but not in brain, eye, liver, or lymphoid tissue (Figure 4—figure supplement 2), suggesting a novel function of myeloid cells in regulating neurobiological and behavioral traits—an intriguing possibility that warrants future investigation.

Systematic evaluation of the GEMOT algorithm using synthetic data analysis

We next investigated the ability of GEMOT to identify the subsets of traits that are caused by the same transcripts. We first focused on studying GEMOT's utility for small sub-networks of co-mapped components, where each sub-network consists of a subset of traits caused by the same transcripts, as well as additional transcripts and traits that are independently or reactively related (Figure 5A). Such sub-networks mimic the tripartite modules that serve as input at the third stage of the GEMOT algorithm. A single synthetic data collection consisted of genotyping, phenotyping, and gene expression for 100 such sub-networks with two characteristic parameters: number of traits and noise level; in all cases we used 100 individuals (‘Materials and methods’). Using these synthetic data, GEMOT performance was compared to that of three alternative network reconstruction methods: Hageman et al., 2011; Wang and van Eeuwijk, 2014 (‘QPSO’); and Neto et al., 2010 (‘QTLNet’) (see parameter selection in Figure 5—figure supplement 2).

Figure 5 with 5 supplements see all
Comparative performance analysis on simulated models.

(A) Illustration of a sub-network model, in which all components are mapped to the same genetic variant v1 but not necessarily through the same relationships. In particular, the model includes k + 2 traits ,with k traits p1,..,pk that share the same underlying transcripts g1, g2, g3, and two additional traits pk + 1 and pk + 2 that are affected through other mechanisms (see detailed in Figure 5—figure supplement 1A). (B and C) Accuracy assessments of the synthetic sub-network depicted in A. Accuracy (y-axes) is compared across methods and different data parameters. Results are shown in models of different noise levels (x-axis, log-scaled; B) with either 5 (left) or 7 (right) traits, or over different numbers of traits (x-axis; C) with either a low noise level = 0.5 (left) or a high noise level = 2 (right). The accuracy metric evaluates whether p1,..,pk, but not pk + 1,..,pk + 2, share the same mechanisms, as detailed in Figure 5—figure supplement 1B. Plots depict alternative network construction methods (color coded, see Figure 5—figure supplement 2), indicating that GEMOT has an advantage over existing methods with noise levels ranging between 0.25 and 1, which is the relevant range for the mouse data in this study (see Figure 5—figure supplement 3).

https://doi.org/10.7554/eLife.04346.021

Performance was evaluated using an accuracy metric, which reflects the ability of a method to discern the correct subset of traits sharing the same transcripts (e.g., traits p1,...,pk but not pk + 1 and pk + 2 in Figure 5A and Figure 5—figure supplement 1B for details). Figure 5B presents the accuracy for synthetic data collections of varying levels of noise (using the sub-networks from Figure 5A with 5 or 7 traits). GEMOT displayed the best accuracy for noise levels ranging between 0.25 and 1, with lower accuracy for higher noise levels. Analysis of various network properties in both mouse and synthetic data shows that sub-networks with a noise level that do not exceed 1 are more likely to represent real biological modules (Figure 5—figure supplement 3). Unlike the compared methods, GEMOT's accuracy remained high with an increasing number of traits (Figure 5C); similar results were obtained for alternative network structures (Figure 5—figure supplement 4). Overall, in our simulation, GEMOT outperformed the compared algorithms in handling a growing number of traits and in identifying the correct groups of traits when using biologically-relevant parameters. These results do not rule out the possibility that for other tissues, conditions or organisms, utilizing the alternative methods as part of the third stage of the GEMOT algorithm may enhance its performance.

We next aimed to characterize GEMOT's utility for a large biological network that included groups of traits that share the same causal transcripts. Accordingly, each synthetic network included 100 traits, 200 transcripts and 100 variants, featuring five co-mapped sub-networks. A singe data collection consists of 100 networks, each containing five co-mapped sub-networks that carry the same number of traits (‘Materials and methods’). We compared GEMOT to two alternative trait-grouping methods: the trait-based iterative clique enumeration (ICE) approach (Shi et al., 2010) and the gene-based INVAMOD approach (Gat-Viks et al., 2013) (‘Materials and methods’). Notably, network construction methods (e.g., Neto et al., 2010; Hageman et al., 2011; Wang and van Eeuwijk, 2014) could not be compared owing to an unrealistic running time in the case of large networks. The analysis suggested that although all compared methods successfully discern all traits in a sub-network (Figure 5—figure supplement 5A), GEMOT attains higher accuracy in discerning those traits that share the same driver transcripts (Figure 5—figure supplement 5B). Notably, the GEMOT algorithm is tailored for identification of causative relationships, unlike the compared methods, explaining why GEMOT succeeded in discriminating the correct subsets of co-regulated traits.

Discussion

We set out to identify the molecular and genetic mechanisms underlying connections among groups of traits. To that end, we combined module identification (Gat-Viks et al., 2010) with causality testing (Neto et al., 2013) in a unified pipeline that relies on the definition of linked relationships so that candidate modules can be filtered out prior to the validation stage. Our results in mice highlighted three types of high-order organization of traits. (i) Groups of tightly related traits that share the same transcripts mechanisms (modules 1, 2, 6, 7, 8, e.g., Figure 3). (ii) Groups of distinct traits that share the same transcripts mechanism, but not necessarily high correlations among them (modules 3, 4, 5, e.g., Figure 4). (iii) Different groups commonly have overlapping traits, but typically differ in their underlying mechanisms (Figure 2B).

Our study emphasizes the need for methodologies for constructing causative models that underlie connections among traits. Whereas previous trait-grouping methods have used genetic or molecular data separately, and thus did not validate causative transcripts (e.g., Shi et al., 2010; Gat-Viks et al., 2013; as illustrated in Figure 1—figure supplement 1B,C), the GEMOT method aims at directly filtering and validating such relationships. Our simulations showed that GEMOT is superior to these methods in identifying trait groups that share the same underlying transcripts (Figure 5—figure supplement 5B). Another strategy is to use network reconstruction methods to construct groups of related traits (e.g., Neto et al., 2010; Hageman et al., 2011; Wang and van Eeuwijk, 2014). These methods can be applied in the case of either the complete biological network or the sub-networks within the tripartite modules. Whereas these methods are limited in their scalability and may be particularly inefficient when applied on a large number of components, our approach can be scaled to larger networks, but can construct the network only partially. For example, in comparing both running time and accuracy under increased network sizes we found GEMOT to be more scalable than the alternative network construction methods (Figure 5C and Figure 5—figure supplements 2B, 4B).

Our methodology opens the door to a variety of future research directions. One possibility is that GEMOT will be applied on a compendium of molecular data from multiple tissues. In such cases, GEMOT predictions can be used to simultaneously identify both the biological mechanism and its relevant tissue. Second, GEMOT is applicable to a variety of molecular data types in addition to gene-expression data. For example, its application in blood cytokines or plasma lipids (Teslovich et al., 2010) is expected to make it possible to identify molecular factors acting at the cell–cell communication level. Similarly, future extensions of GEMOT may provide the means to include environmental factors as part of the module. Third, monitoring of additional strains would allow discriminating between several alternative genomic intervals for the same module, which may arise due to linkage disequilibrium between chromosomally distinct loci. Fourth, characterization of GEMOT modules that share a similar collection of traits but have different genomic intervals may reflect gene–gene interactions that lead to connections among traits. Finally, GEMOT can potentially be further improved by the construction of internal causative relationships within the transcripts and the traits layers. For example, some drivers may control other drivers, which in turn affect a collection of traits (as exemplified in the case of Klf7 in Figure 3). It should be noted, however, that GEMOT cannot distinguish cases of multiple drivers that are part of the same regulatory circuit from cases of multiple drivers that act through several distinct circuits. Rather, its predictions provide biological or clinical hypotheses for additional experimental investigations.

Overall, our approach paves the way to the simultaneous study of several mechanistic layers underlying connections among traits, providing a multilayered view of phenotypic connections. Because the GEMOT methodology is a general one and can be applied to the study of other taxa, this approach may facilitate our understanding of the molecular mechanisms underlying human disease.

Materials and methods

The GEMOT algorithm

Request a detailed protocol

GEMOT is designed to identify three-layer modules in which driver components translate between a single genomic interval and a collection of traits. As input, GEMOT takes three types of objects: (1) a collection of traits across a population of individuals; (2) genotyped genetic variants for these individuals; and (3) high-throughput gene expression data across the same population.

Our algorithm incorporates three stages (Figure 1A). The first stage constructs candidate bipartite modules consisting of a group of traits and a genomic interval. In the second stage, candidate transcripts are added to each module from the previous stage, thus forming tripartite modules. The final stage validates the actual drivers and refines the modules accordingly. The GEMOT code is available at http://csgi.tau.ac.il/gemot/.

GEMOT stage I: construct bipartite modules

Request a detailed protocol

In the following we first define the construction of a bipartite graph and then explain the identification of bipartite clusters within this graph as previously described (Gat-Viks et al., 2010). We define a bipartite graph whose two parts correspond to genetic variants and traits, and in which the edges reflect the potential of a variant and a trait to have significant linked relationships (Figure 1C). Edge weights are calculated as follows (Figure 1B): First, for each pair of a genetic variant and a transcript, we evaluate the genetic association between the expression of the transcript and the candidate genetic variant. This yields a ‘variant–transcript association score’. In this study, for the case of homozygous recombinant inbred strains the association score is a −log t-test p value for the different gene-expression values between the strains carrying the two possible variant alleles. For other cases, such as an outbred population, other standard association scores can be applied (Falconer and Mackay, 1996). Secondly, for each pair of a transcript and a trait, we calculate the absolute Pearson correlation coefficient across genetic backgrounds. We term this score the ‘transcript–trait correlation score’. Finally, for each genetic variant and each trait we compare the distribution of transcript–trait correlations in high and low transcript–variant association scores (a statistical t-test). We assign such a t-test p value to five different transcript–variant association cutoffs (the five cutoffs partition the association range into 6 equally sized groups) and record the top −log t-test p value across these five cutoffs. We refer to the recorded −log p values as ‘link potentials’ and use them as edge weights in the bipartite graph.

Within this graph we use the ReL software package (Gat-Viks et al., 2010) to identify significant biclusters (Figure 1D). Briefly, the ReL algorithm starts with a set of seed clusters consisting of one trait and one variant whose link potential exceeds a certain initialization cutoff, cs. A trait or a variant can be included in a cluster if and only if its average link potential exceeds an improvement cutoff ci (here, cs = 180, ci = 90). Each bipartite cluster is subject to iterative improvements by addition or removal of traits and variants based on this cutoff. We refer to the bipartite clusters as ‘bipartite modules’ and further improve them in the following stages. Notably, ReL provides the same results when applied with or without Boneferroni correction for the gene–variant association −log p values and link potentials scores, since the construction of the biclusters is robust to an additive rescaling of these scores.

GEMOT stage II: construct tripartite modules

Request a detailed protocol

In this stage a rough list of candidate transcripts is constructed for each module (Figure 1E). To this end, for each transcript in a given module we rank its correlations and associations with all traits and variants in the input dataset and record its ranks of associations and correlations within the module. We next compare the distribution of recorded ranks in this module with the distribution of all ranks. The two distributions are compared using a Kolmogorov–Smirnov (K-S) test, a nonparametric test that may be used to compared two samples. We refer to the K-S p value for a transcript in a module as a ‘transcript link score’. Only transcripts with significant link scores are added to their module. Such transcripts are called ‘candidate transcripts’ and the resulting extended modules are referred to as ‘tripartite modules’ (Figure 1F).

GEMOT stage III: refine module (validate drivers)

In the following we first define the causality test and then describe the procedures for identifying driver transcripts and for module refinement.

Causality testing

Request a detailed protocol

The input for a causality test is a triplet of objects—a variant, a transcript, and a trait—each measured across the same population of individuals. Several types of relationships exist, including causative, reactive, and independent, and the goal of causality testing is to identify causative relationships among a given triplet of objects. Our causality analysis here follows the formulation of Neto et al. (2013). In this formulation, a likelihood score is first calculated for the causality model (denoted ‘M1’) as well as for three alternative models (denoted M2–M5, where M4 and M5 are modeled together, Figure 1—figure supplement 2A). A likelihood ratio score is then calculated between each of the three alternative models (as null hypotheses) and the causative model (as an alternative hypothesis), whereas the three respective p values are calculated on the basis of a theoretically derived distribution of the likelihood ratios. The final ‘causality p value’ is the maximal obtained for the three calculated p values.

Importantly, in the present study we introduce two important modifications in the scheme suggested by Neto et al. (2013). First, the distribution of likelihood ratios was evaluated empirically by randomly reshuffling the transcript levels of the candidate transcript and recomputing the likelihood ratio (repeating the procedure 100 times). The causality p value of a given triplet of objects was calculated according to the distribution of these reshuffling-based likelihood ratio scores (rather than using a theoretically derived distribution).

Second, we observed that in a GEMOT module a transcript may have a causal effect on a trait even in the presence of additional transcripts through which the variant influences the trait. Accordingly, in the GEMOT framework, both models M1 and M4 should be considered as causative relationships (a ‘broad’ view of causality), as opposed to the classical definition of causative relationships according to model M1 only (Figure 1—figure supplement 2A). In the present study we therefore utilized the broad-sense causative relationships. We observed that samples of model M4 attained significant likelihood ratios of M1 against models M2 and M3, and we therefore defined the (broad sense) ‘causality p value score’ as the maximal across the p values attained for M1 against M2 or M3. Our simulation study (see ‘Materials and methods’) indicated that GEMOT attains similar performance to that of the compared methods for models M1–M3, but outperforms the existing methods when adding the remaining models (Figure 1—figure supplement 2B–D).

Driver validation and module refinement

Request a detailed protocol

Transcripts that are causally related to traits can be identified by constructing a detailed causal network of relationships among all the components in the tripartite module. This, however, is a difficult task in the case of large tripartite modules harboring numerous components. We therefore devised an algorithm to identify the driver transcripts and their affected traits without needing to construct the module's entire network. The algorithm proceeds as follows: For each module, stage III starts with calculation of the causality scores for each pair of candidate transcripts and traits in the module, assuming a fixed single ‘representative variant’ from the module's genomic interval (Figure 1G). In this study, the representative variant was the one that attained the best average association scores across the module's candidate driver transcripts. Next, an iterative module refinement is applied in two steps. In step (i) we reveal the driver transcripts that are causally related to many of the module traits (Figure 1H, middle). In particular, we used Fisher's method (Sokal and Rohlf, 1990) to calculate the overall causality p value of a transcript (denoted ‘transcript causality p value’); only those transcripts with significant transcript causality p values were retained. The transcript causality p value is determined based on chi-squared distribution with 2k degrees of freedom: χ2k2(gi)=2j=1,..,kln(pi,j), where gi is the transcript; j = 1,...,k are the module traits; and pij is the causality p value for the transcript gi and a trait j assuming the module's representative variant. Next, in step (ii) those traits that do not have at least one transcript with a significant causality score are filtered out from among the transcripts that were selected in step (i) (Figure 1H, right). This process is repeated iteratively until convergence. Thereafter, the retained transcripts are called ‘driver transcripts’ and modules that contain at least one driver are termed ‘GEMOT modules’ (Figure 1I).

Mouse data

Request a detailed protocol

All mice data was taken from a previously produced body of work. We applied our analysis to data obtained from homozygous BXD recombinant inbred mouse strains (Peirce et al., 2004) generated by crossing C57BL/6J and DBA/2J inbred strains for many generations. Microarray expression data in myeloid cells across 24 BXD strains have been measured (Gerrits et al., 2009). To identify high-quality candidates we selected 5786 genes whose variation in expression across BXD strains, based on average intensities of the genes, was higher than expected. Expected variance values were calculated using a sliding window along the genes' average intensities. Gene-expression values were log10-transformed and normalized by Z-score normalization. All 2885 traits and 3796 genetic variants across BXD strains were downloaded from the WebQTL dataset (Wang et al., 2003). Trait values were normalized by Z-score normalization. Given the strains in the gene expression and trait datasets, we restricted our analysis to 1738 traits that had at least 15 strains in common. Other compared cell types or tissues (Gatti et al., 2007; Geisert et al., 2009; Gerrits et al., 2009; Alberts et al., 2011; Mozhui et al., 2012) were similarly preprocessed. Supplementary file 1C records the particular representative variants that were used for causality testing in each predicted GEMOT module.

To assess the corresponding false discovery rates, we generated negative controls based on a permutation test in which the transcript levels of each transcript were randomly shuffled and the GEMOT modules were recomputed (a process that was repeated 100 times). In each repeat, a variety of statistics (such as the number of identified modules) were recorded. The permutation-based ‘false discovery rate’ (FDR) is the ratio of the averaged number of statistics that were declared significant using the permuted data to the number of statistics that were declared significant using the original (non-permuted) data. In this study, GEMOT was applied using transcript link score cutoff = 10−95 for identifying candidate transcripts (stage II, permutation-based FDR < 0.09) and transcript causality p value cutoff = 0.005 (stage III, permutation-based FDR < 6 × 10−5).

Causality testing–performance evaluation

Request a detailed protocol

To investigate the performance of the causality score we simulated triplets of objects, each consisting of a variant v, a transcript g and a trait p. In all such triplets we assume 100 homozygous individuals. The genotyping of each variant was generated by sampling a vector of values 0 and 1 from a binomial distribution (with p = 0.5). Based on these genotyping values, the values of g and p were generated according to the following five different models (denoted M1–M5), as depicted in Figure 1—figure supplement 2A: M1, a causative model v → g → p; M2, an independent model v → g, v → p; M3, a reactive model v → p → g; M4, v → g → p and v → p; and M5, v → p → g and v → g. Each arrow in these five models was simulated as a linear expression with a normally distributed error term. For example, based on model M1, data were generated as g = αv + ε, p = λg + ε, εN(0, σ2); similarly, model M4 was generated as g = αv + ε, p = αv + λg + ε, εN(0, σ2). A single ‘synthetic collection’ consisted of 250 relationships from each of the five models, a total of 1250 samples. Results are displayed for many different collections, each generated using different combination of λ and σ values with α = 0.5. GEMOT's causality p values were compared to those obtained by two alternative methods: QTLHot (Neto et al., 2013) and an AIC-based method (Lee et al., 2009).

For a given significance threshold we evaluated the ability to identify causal relationships using true positive (TP), true negative (TN), false positive (FP) and false negative (FN) counts, which were defined according to the broad-sense definition of causality (Figure 1—figure supplement 2B). The area under the receiver operating characteristic (ROC) curve (the AUC) was calculated accordingly, where the higher the AUC the better the method. In addition, a balanced false discovery rate (FDR) can be used as a criterion for comparisons of methods, computed as FDR = FPa/(FPa + TP) where FPa accounts for imbalanced data by dividing FP by the ratio between the negative and positive synthetic datasets, calculated as FPa = FP/π0, π0 = (FP + TN)/(TP + FN). The method with lowest FDR is regarded as the best method (when all methods use the same p value cutoff). Notably, GEMOT attains similar performance to that of the compared methods for models M1–M3 (Figure 1—figure supplement 2C), but outperforms the existing methods when adding synthetic M4 and M5 samples (Figure 1—figure supplement 2D).

GEMOT performance analysis

The synthetic data analysis is focused on two simulations with increasing complexity of the input network: (i) sub-network analysis, in which the input is a sub-network where all components are co-mapped to the same variant; and (ii) network analysis, in which the input is a large network comprising several co-mapped sub-networks.

Sub-network analysis

Request a detailed protocol

A synthetic co-mapped sub-network consists of one variant v and m transcripts GC={g1C,...,gmC} that are associated with the variant v and are causally related to k traits PC={p1C,...,pkC}, that is, v{g1C,...,gmC}{p1C,...,pkC}. In addition, the sub-network includes a single transcript g1R that is reactive to phenotype p1R (thus, vp1Rg1R) and a pair of a transcript g1I and a trait p1I that are independently affected by v such that vp1I and vg1I. Altogether, a sub-network consists of the triplet (v, P, G), where P={PC,p1R,p1I} and G={GC,g1R,g1I}. In this study we analyzed three sub-network structures, referred to as Net-1 (Figure 5 and Figure 5—figure supplement 1A), Net-2 (Figure 5—figure supplement 4A, left), and Net-3 (Figure 5—figure supplement 4A, right); unless stated otherwise, the results refer to Net-1. In all cases, the total ‘number of traits’ in a sub-network is k + 2. The simulation data was generated by using for each edge aiaj in these sub-networks a linear coefficient β such that aj = βai + ε(β), ε(β) ∼N(0, σ2), σ = qεβ, where qε represents the relative proportion of noise referred to as the ‘noise level’. Observe that ε(β) depends on both the linear coefficient of the relationship β and on the noise level qε. We generated ‘synthetic collections’ of 100 sub-networks (100 individuals in each case); each collection is constructed for a given network structure (sub-networks Net-1, Net-2, Net-3), a certain total number of traits (k + 2 = 5, 7, 12, or 17 traits) and a certain noise level (qε = 0.25, 0.5, 1, 2, 4). In all cases we used β = 1.5 for relationships vp and vg and β = 0.6 for relationships gp and pg.

We compared GEMOT's performance on co-mapped sub-networks to that of alternative network construction algorithms, including the QTLNet method (Neto et al., 2010), the QPSO methodology (Wang and van Eeuwijk, 2014), and a Bayesian construction proposed by Hageman et al. (2011); for all methods, we used the parameters that gave the best results for the tested networks (Figure 5—figure supplement 2A). We evaluated the performance of each method using the true positive (TP), true negative (TN), false positive (FP) and false negative (FN) counts as defined in Figure 5—figure supplement 1B, which depicts whether the network construction correctly identified shared transcripts for the traits in PC. Relying on these counts, the balanced accuracy score was calculated as (TPR + TNR)/2 (here, TNR = TN/(TN + FP) and TPR = TP/(TP + FN)). AUC could not be computed because the three compared approaches do not provide a measure of statistical significance.

Network analysis

Request a detailed protocol

The synthetic network consisted of two parts: first, five co-mapped sub-networks s1,...,s5 of the form (v(sL), P(sL), G(sL)), and second, the remaining network comprising additional transcripts, variants, and traits. In total, the network consisted of 100 variants, 100 traits, and 200 transcripts that were used to generate synthetic data across 100 individuals. The data for the five sub-networks were generated as in the sub-network analysis (‘Materials and methods’), using varying numbers of traits. The remaining data were generated as follows: (i) genotyping of each variant was generated independently, as described in the causality-testing simulation (‘Materials and methods’); (ii) for each trait and individual, data values were sampled from a standard normal distribution; and (iii) gene-expression data were generated while maintaining the correlation between transcripts of the non-module components and between them and the module component. More specifically, for each individual i and transcript j that are not in a sub-network, the gene-expression data zij were generated in two steps: first, to maintain correlation among all non-sub-network transcripts we sampled the real data in murine myeloid cells (data taken from Gerrits et al., 2009) to generate a 200 × 200 covariance matrix, and then used this matrix to generate synthetic data values xij which approximately have the same covariance matrix. Next, to improve the correlation of the xij values with at least one co-mapped sub-network, we calculated the gene-expression data as zij = xij + cyik, where yik is the synthetic gene expression for an individual i and the (arbitrarily selected) kth transcript in the sub-network (in all cases, c = 1).

The synthetic network was used to test the ability of GEMOT to identify the groups PC(su) (here, u = 1,..,5), which are the groups of traits with the same causative mechanism behind them. Furthermore, we tested an existing trait-based approach (Iterative Clique Enumeration (ICE); Shi et al., 2010) and a gene-based approach (INVAMOD, Gat-Viks et al., 2013). Two different measures were used for performance evaluation: first we evaluated grouping of traits based on their co-mapping to the same variant (Figure 5—figure supplement 5A). Secondly, we evaluated the ability of each method to identify groups of traits sharing the same causal transcripts (Figure 5—figure supplement 5B).

Data availability

The following previously published data sets were used

References

  1. Book
    1. Falconer DS
    2. Mackay TFC
    (1996)
    Introduction to quantitative genetics (4th edition)
    Pearson Prentice Hall.
    1. Ferrer-Alcón M
    2. La Harpe R
    3. Guimón J
    4. García-Sevilla JA
    (2003)
    Downregulation of neuronal cdk5/p35 in opioid addicts and opiate-treated rats: relation to neurofilament phosphorylation
    Neuropsychopharmacology 28:947–955.
    1. Geisert EE
    2. Lu L
    3. Freeman-Anderson NE
    4. Templeton JP
    5. Nassr M
    6. Wang X
    7. Gu W
    8. Jiao Y
    9. Williams RW
    (2009)
    Gene expression in the mouse eye: an online resource for genetics using 103 strains of mice
    Molecular Vision 15:1730–1763.
  2. Book
    1. Hastie T
    2. Tibshirani R
    3. Friedman J
    (2009)
    The elements of statistical learning
    New York: Springer.
    1. Kumagai Y
    2. Todaka T
    3. Toki S
    (1990)
    A new metabolic pathway of morphine: in vivo and in vitro formation of morphinone and morphine-glutathione adduct in guinea pig
    The Journal of Pharmacology and Experimental Therapeutics 255:504–510.
    1. Lee GH
    2. Bennett LM
    3. Carabeo RA
    4. Drinkwater NR
    (1995)
    Identification of hepatocarcinogen-resistance genes in DBA/2 mice
    Genetics 139:387–395.
    1. Muller P
    2. Do KA
    (2012)
    Bayesian inference for gene expression and proteomics
    Bayesian inference for gene expression and proteomics.
    1. Pareek T
    2. Zipp L
    3. Letterio JJ
    (2012)
    Cdk5: an emerging kinase in pain signaling
    Brain Disorders and Therapy, S:1.
    1. Sokal RR
    2. Rohlf FJ
    (1990)
    Biometry
    Biometry.
    1. Teslovich TM
    2. Musunuru K
    3. Smith AV
    4. Edmondson AC
    5. Stylianou IM
    6. Koseki M
    7. Pirruccello JP
    8. Ripatti S
    9. Chasman DI
    10. Willer CJ
    11. Johansen CT
    12. Fouchier SW
    13. Isaacs A
    14. Peloso GM
    15. Barbalic M
    16. Ricketts SL
    17. Bis JC
    18. Aulchenko YS
    19. Thorleifsson G
    20. Feitosa MF
    21. Chambers J
    22. Orho-Melander M
    23. Melander O
    24. Johnson T
    25. Li X
    26. Guo X
    27. Li M
    28. Shin Cho Y
    29. Jin Go M
    30. Jin Kim Y
    31. Lee JY
    32. Park T
    33. Kim K
    34. Sim X
    35. Twee-Hee Ong R
    36. Croteau-Chonka DC
    37. Lange LA
    38. Smith JD
    39. Song K
    40. Hua Zhao J
    41. Yuan X
    42. Luan J
    43. Lamina C
    44. Ziegler A
    45. Zhang W
    46. Zee RY
    47. Wright AF
    48. Witteman JC
    49. Wilson JF
    50. Willemsen G
    51. Wichmann HE
    52. Whitfield JB
    53. Waterworth DM
    54. Wareham NJ
    55. Waeber G
    56. Vollenweider P
    57. Voight BF
    58. Vitart V
    59. Uitterlinden AG
    60. Uda M
    61. Tuomilehto J
    62. Thompson JR
    63. Tanaka T
    64. Surakka I
    65. Stringham HM
    66. Spector TD
    67. Soranzo N
    68. Smit JH
    69. Sinisalo J
    70. Silander K
    71. Sijbrands EJ
    72. Scuteri A
    73. Scott J
    74. Schlessinger D
    75. Sanna S
    76. Salomaa V
    77. Saharinen J
    78. Sabatti C
    79. Ruokonen A
    80. Rudan I
    81. Rose LM
    82. Roberts R
    83. Rieder M
    84. Psaty BM
    85. Pramstaller PP
    86. Pichler I
    87. Perola M
    88. Penninx BW
    89. Pedersen NL
    90. Pattaro C
    91. Parker AN
    92. Pare G
    93. Oostra BA
    94. O'Donnell CJ
    95. Nieminen MS
    96. Nickerson DA
    97. Montgomery GW
    98. Meitinger T
    99. McPherson R
    100. McCarthy MI
    101. McArdle W
    102. Masson D
    103. Martin NG
    104. Marroni F
    105. Mangino M
    106. Magnusson PK
    107. Lucas G
    108. Luben R
    109. Loos RJ
    110. Lokki ML
    111. Lettre G
    112. Langenberg C
    113. Launer LJ
    114. Lakatta EG
    115. Laaksonen R
    116. Kyvik KO
    117. Kronenberg F
    118. König IR
    119. Khaw KT
    120. Kaprio J
    121. Kaplan LM
    122. Johansson A
    123. Jarvelin MR
    124. Janssens AC
    125. Ingelsson E
    126. Igl W
    127. Kees Hovingh G
    128. Hottenga JJ
    129. Hofman A
    130. Hicks AA
    131. Hengstenberg C
    132. Heid IM
    133. Hayward C
    134. Havulinna AS
    135. Hastie ND
    136. Harris TB
    137. Haritunians T
    138. Hall AS
    139. Gyllensten U
    140. Guiducci C
    141. Groop LC
    142. Gonzalez E
    143. Gieger C
    144. Freimer NB
    145. Ferrucci L
    146. Erdmann J
    147. Elliott P
    148. Ejebe KG
    149. Döring A
    150. Dominiczak AF
    151. Demissie S
    152. Deloukas P
    153. de Geus EJ
    154. de Faire U
    155. Crawford G
    156. Collins FS
    157. Chen YD
    158. Caulfield MJ
    159. Campbell H
    160. Burtt NP
    161. Bonnycastle LL
    162. Boomsma DI
    163. Boekholdt SM
    164. Bergman RN
    165. Barroso I
    166. Bandinelli S
    167. Ballantyne CM
    168. Assimes TL
    169. Quertermous T
    170. Altshuler D
    171. Seielstad M
    172. Wong TY
    173. Tai ES
    174. Feranil AB
    175. Kuzawa CW
    176. Adair LS
    177. Taylor HA Jr
    178. Borecki IB
    179. Gabriel SB
    180. Wilson JG
    181. Holm H
    182. Thorsteinsdottir U
    183. Gudnason V
    184. Krauss RM
    185. Mohlke KL
    186. Ordovas JM
    187. Munroe PB
    188. Kooner JS
    189. Tall AR
    190. Hegele RA
    191. Kastelein JJ
    192. Schadt EE
    193. Rotter JI
    194. Boerwinkle E
    195. Strachan DP
    196. Mooser V
    197. Stefansson K
    198. Reilly MP
    199. Samani NJ
    200. Schunkert H
    201. Cupples LA
    202. Sandhu MS
    203. Ridker PM
    204. Rader DJ
    205. van Duijn CM
    206. Peltonen L
    207. Abecasis GR
    208. Boehnke M
    209. Kathiresan S
    (2010) Biological, clinical and population relevance of 95 loci for blood lipids
    Nature 466:707–713.
    https://doi.org/10.1038/nature09270
    1. Wang JR
    2. Li XH
    3. Gao XJ
    4. An SC
    5. Liu H
    6. Liang J
    7. Zhang K
    8. Liu Z
    9. Wang J
    10. Chen Z
    11. Sun W
    (2013)
    Expression of MMP-13 is associated with invasion and metastasis of papillary thyroid carcinoma
    European Review for Medical and Pharmacological Sciences 17:427–435.
    1. Zhou G
    2. Williams RW
    (1999)
    Eye1 and Eye2: gene loci that modulate eye size, lens weight, and retinal area in the mouse
    Investigative Ophthalmology & Visual Science 40:817–825.

Decision letter

  1. Naama Barkai
    Reviewing Editor; Weizmann Institute of Science, Israel

eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see review process). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.

Thank you for sending your work entitled “Linking traits based on their shared molecular mechanisms: A systems phenomics approach” for consideration at eLife. Your article has been favorably evaluated by Aviv Regev (Senior editor), a Reviewing editor, and 2 reviewers.

The Reviewing editor and the reviewers discussed their comments before we reached this decision, and the Reviewing editor has assembled the following comments to help you prepare a revised submission.

There is an overall appreciation for methods, such as the proposed GEMOT, that predict causative mutations leading to transcript variations that in turn drive phenotypes. However, to convince that the present method provides an advantage over existing approaches, three additions are required:

1) GEMOT performance should be compared to existing methods. This includes the qtlnet software (PMID 23288936, 21218138, and others) as well as similar methods (PMID 19540336, 21310061, 21242536, 25144184, 15711545, 25114278, 24013639, etc.).

2) Simulations should be used to evaluate the method against a 'truth' standard.

3) With regards to the Klf7 results: a consistent model describing the effect of each allele (B6, DBA, overexpression, knockout) should be presented to provide a clear hypothesis of how Klf7 effect is generated. (The interpretation of the Klf7 effects is unclear in the manuscript. Firstly, what is the nature of variation between the two Klf7 alleles? It has a cis-associated eQTL so, for example, which strain shows higher expression and can be linked to behavioral response to morphine? In the perturbation experiments, what is the strain background? Are the Klf7 knockout and overexpression effects on “driver genes” (up or down-regulation) consistent with corresponding Klf7 expression differences in the BxD population? That is, do Klf7 knockout effects look like a low-expressing Klf7 strain, and do Klf7 overexpression effects look like a high-expressing Klf7 strain? Figure 4 is also confusing as it appears to combine the knockout and overexpression experiments into summary scores. It would be useful to clearly see how these data support the inferred BxD model.)

4) The manuscript would be more convincing if the authors identify a group of traits that show low Pearson correlation among themselves, but share the same driver genes.

5) Considering the many methods already available (see above), the motivation for the present study should be better clarified.

6) The manuscript can be difficult to read at times, primarily due to a reliance on its own jargon. In some cases new nomenclature might be necessary, but many of the new terms seem to be very similar (if not identical) to widely-used, existing terms. In such cases, the paper should conform to standard nomenclature or define why such standards will not suffice. Notable examples are: “variant-gene associations” instead of eQTL; “linked relationships” instead of Pearson correlation and “link potential” instead of averaged correlation; “bipartite module” instead of pleiotropy. Furthermore, “gene drivers” and the “drivers layer” can probably be more simply referred to as “transcripts” and a “transcript layer” with appropriate context. “Causality score” appears to be a P value; why not refer to it as such? In some cases the new nomenclature can be potentially misleading, such as in the Klf7 discussion where a P value is referred to as a “perturbation effect”. Effect size and significance are different concepts with different biological interpretations.

7) High-dimensional approaches like GEMOT are prone to generating false positive associations, especially in such small experimental populations. Are P values systematically corrected for multiple tests? It is also important to estimate false discovery rates for a given significance threshold, which are not addressed in the manuscript. Additionally, the sample size in this study is very small, and BxD lines often exhibit non-local linkage disequilibrium due to limited recombination across few individuals. This can lead to many false-positive associations. How has this potential issue been addressed in the current study?

https://doi.org/10.7554/eLife.04346.028

Author response

There is an overall appreciation for methods, such as the proposed GEMOT, that predict causative mutations leading to transcript variations that in turn drive phenotypes. However, to convince that the present method provides an advantage over existing approaches, three additions are required:

1) GEMOT performance should be compared to existing methods. This includes the qtlnet software (PMID 23288936, 21218138, and others) as well as similar methods (PMID 19540336, 21310061, 21242536, 25144184, 15711545, 25114278, 24013639, etc.).

We thank the reviewers for their suggestion, which we believe improves our conclusions. The specified publications are focused on three different aspects: (1) Testing causative relationships (Neto 2013, Lee 2009); (2) Network reconstruction (Neto 2010, Hageman 2011, Wang 2014); and (3) grouping traits (Chesler 2005). As suggested by the reviewers, in the revised manuscript we added simulations that relate to each of these three aspects, with a focus on comparison to the suggested methods.

(1) Testing causative relationships. We agree with the reviewers that our original manuscript did not provide assessment of our causality testing methodology. Notably, our goal was not to develop a new method for a causality P-value score. Rather, we aimed to use an existing methodology with minor modifications that were required in our particular case. In accordance, we used a methodology that is very similar to Neto et al. 2013, with two small modifications: first, we use empirical rather than theoretical evaluation of the likelihood ratio distribution; second, Neto et al. was interested in a 'classical' definition of causality, where a transcript translates between a variant and a trait. Here, we were interested in an additional objective function: identifying causal transcripts in the presence of other transcripts. This situation is common in our case of GEMOT modules harboring multiple causal transcripts translating between the same variant and a group of traits. We have therefore slightly modified the method suggested by Neto et al. to capture this 'broad' definition of causality. We apologize for omitting this important information from the original submission. We now added a detailed description of the methodology and alterations from Neto et al. to the subsection headed “GEMOT stage III: Refine module (validate drivers). We also added a new simulation study (new Figure 1–figure supplement 2, subsection “Causality testing—performance evaluation”, legend for Figure 1–figure supplement 2), which shows that in comparison to the compared methods (Neto et al. and Lee et al.), GEMOT attained improved accuracy in our relevant case of a broad-sense definition of causality. We now clarify this in the Methods section.

(2) Network reconstruction. In the third stage of the GEMOT algorithm, GEMOT aims to identify the groups of traits that share causal drivers within the given tripartite modules. Such identification can also be performed by constructing a detailed network within a module (using existing methods) and using this solution to identify the desired group of traits. Notably, our goal was not to better construct any given network. Rather, we aimed to develop scalable methodology for the identification of the shared drivers of the module's traits, which can scale to the large tripartite modules we found in real mouse data (with a typical size of over 100 candidate transcripts and 2-40 traits in the identified tripartite modules). According to our observations, the accuracy of the existing methods dropped drastically with an increased number of components, and had an unrealistic running time for our BXD modules. This led us to develop an alternative methodology in GEMOT phase III for identifying the players behind the group of traits without a full reconstruction of the network. However, the GEMOT pipeline is general and existing network reconstruction methods may be used in phase III of the pipeline, alongside or instead of our suggested phase-III approach. We now clarify these points in the Introduction, Results and Methods sections.

In the revised manuscript, we added a new simulation study that takes as input co-mapped subnetworks that mimic tripartite modules so as to identify the desired group of traits that share the same transcript players behind them. Notably, GEMOT does not construct a causal network but only predicts a subset of traits sharing the same mechanism. To avoid biases, we therefore based our systematic evaluation on the ability to identify the correct group of traits sharing a mechanism, thus ignoring irrelevant additional information produced by the compared network reconstruction methods. We now provide a detailed description of the simulated sub-networks and comparisons in the Results and Methods (subsections “Systematic evaluation of the GEMOT algorithm using synthetic data analysis” and “GEMOT Performance analysis”) and new five figures (new Figure 5 and new Figure 5–figure supplements 1, 2, 3 and 4). These simulation results support our usage of GEMOT in the range of biologically-relevant network parameters.

(3) Grouping traits. We have added a relevant simulation of a large biological network that harbors small groups of traits that share the same causal transcripts within it. We now discuss the simulation of this large network in the Methods (“GEMOT Performance analysis”) and Results (“Systematic evaluation of the GEMOT algorithm using synthetic data analysis”) sections. GEMOT's performance is compared to two alternative methods: (I) the ICE algorithm (Shi et al. 2010), which was used by Chesler et al. 2005 (as recommended by the reviewers); and (II) the INVAMOD algorithm. We find that GEMOT outperforms the alternative methods in identifying groups by causality across varying numbers of traits in a group (new Figure 5–supplementary figure 5B).

Importantly, an equivalent systematic comparison between GEMOT and the alternative network reconstruction algorithms on the entire network is not feasible due to the long running time of the network reconstruction algorithm. Therefore, the entire 3-phase GEMOT pipeline, which can be applied on thousands of components, cannot be directly compared to the network reconstruction methods. We now highlight this point and exemplify the running time of the compared methods for the case of small sub-networks (new Figure 5–figure supplement 2B).

Notably, three additional suggested publications (Westra 2013, Rutledge 2014 and Rosa 2009) have not been directly compared since they do not provide or develop any code that we could systematically apply to the synthetic data; these publications, however, discuss methodologies that are similar to the methods by Chesler 2005, Neto 2013 and Lee 2009, which were compared in our revised manuscript, as detailed above.

2) Simulations should be used to evaluate the method against a 'truth' standard.

We addressed this issue above (in our answer to comment 1). As noted, we now added simulations to evaluate GEMOT's accuracy. In particular, using a gold standard, we present the superiority of GEMOT for the goal of identifying groups of traits with causative players behind them. This is shown for synthetic networks of increasing complexity:

(I) Small network fragments including triplets of components (new Figure 1–figure supplement 2, subsection “Causality testing—performance evaluation” and Figure 1–figure supplement 2, legend).

(II) Co-mapped sub-networks (new figure 5 and Figures 5–figure supplement 1-4, and subsections “GEMOT Performance analysis” and “Systematic evaluation of the GEMOT algorithm using synthetic data analysis”).

(III) Full large networks (in the same subsections).

3) With regards to the Klf7 results: a consistent model describing the effect of each allele (B6, DBA, overexpression, knockout) should be presented to provide a clear hypothesis of how Klf7 effect is generated. (The interpretation of the Klf7 effects is unclear in the manuscript. Firstly, what is the nature of variation between the two Klf7 alleles? It has a cis-associated eQTL so, for example, which strain shows higher expression and can be linked to behavioral response to morphine? In the perturbation experiments, what is the strain background? Are the Klf7 knockout and overexpression effects on “driver genes” (up or down-regulation) consistent with corresponding Klf7 expression differences in the BxD population? That is, do Klf7 knockout effects look like a low-expressing Klf7 strain, and do Klf7 overexpression effects look like a high-expressing Klf7 strain? Figure 4 is also confusing as it appears to combine the knockout and overexpression experiments into summary scores. It would be useful to clearly see how these data support the inferred BxD model.)

Following the reviewers' suggestion, we first extended the analysis of module number 2 using the BxD data, and then demonstrated the agreement between this data and the perturbation experiment. In the revised manuscript we present a full consistency between the experimental perturbation results and the BxD-based model. Our extended analysis shows that the drivers in module number 2 can be partitioned into two types: group I is positively correlated with the Klf7 transcript and negatively correlated with the morphine-related traits; group II is negatively correlated with Klf7 and positively correlated with these traits. Consistently, strain DBA/2J shows higher levels of group I and Klf7 transcripts, whereas strain C57BL/6J shows higher levels of group II transcripts and of the morphine-related trait measurements. These results suggest that Klf7 acts as an activator of group I drivers and as a repressor of group II drivers. In accordance with this prediction, Klf7 overexpression in the C57BL/6J strain background up-regulated group I and down-regulated group II. Furthermore, Klf7 knockdown in the C57BL/6J strain background down-regulated group I, as expected. (Klf7 knockdown have no influence on group II, likely due to the low Klf7 level in the wild-type C57BL/6J, which already leads to an ineffective repression). We thank the reviewers for this suggestion, which we believe strengthened our results.

We have substantially revised the manuscript to include these new results that now support a causal role for Klf7 as an activator of drivers in group I and repressor of drivers in group II (in “Discovery of a model for a collection of tightly related traits”, in Results), and present new Figures 3B–group I and new Figure 3–figure supplements 1A, 2A-D and 3B (which replaces former Figure 4B-E and Figure 4–figure supplement 1) by providing all the relevant data about the partition of drivers into two groups, the correlation between objects and the difference in trait and expression levels between alleles, along with the corrected final model. In particular, we do not use a summary score for overexpression and knockdown, as suggested by the reviewers. Instead, Figures 3G, H now present the results of over-expression and knockdown in separate plots (replacing the unified presentation in former Figure 4D).

4) The manuscript would be more convincing if the authors identify a group of traits that show low Pearson correlation among themselves, but share the same driver genes.

We agree with the reviewers, and apologize for the missing details in our description of the modules in the original submission. The original submission already exemplified two modules with high trait-trait correlations (modules nos. 2 and 3) and one module with poor trait-trait correlations (module number 4, in formerly Figure 5B, currently Figure 4B). The revised manuscript clarifies this point in the Results and Discussion sections. We also added a new Figure 2–figure supplement 4 presenting the trait-trait correlations in each of the modules.

5) Considering the many methods already available (see above), the motivation for the present study should be better clarified.

The comment pointed out to a lack of clarity in comparison to existing methods, which we now address in full. First, we clarify the difference from existing methods. Whereas many trait-clustering methods are available, these do not rely on causative molecular players behind the grouped traits. Furthermore, whereas many causal network construction methods exist, such methods construct the entire networks and therefore cannot scale to large networks that include thousands of genes and traits (due to a lack of power and a non-realistic running time). We have revised the text in the Introduction, Results , and Discussion sections to clarify these points. Second, in the revised manuscript, we use simulations to demonstrate the advantages of GEMOT over various existing methods (network construction: Neto 2010, Hageman 2011, Wang 2014; grouping methods: Shi et al. 2010, Gat-Viks et al. 2013) for the goal of identifying groups of traits that share the same causal transcripts, comparing both running time and accuracy. In addition, we clarify the differences from existing causality testing approaches, including a comparative simulation study. We address the simulation studies in detail in response to comment 1, above. Third, we emphasize that the GEMOT pipeline is general and different existing methods may be used as subroutines within the GEMOT pipeline. For example, network construction methods may be applied to construct the sub-networks within the tripartite modules; existing causality-testing methods may be used to identify the driver genes; existing biclustering approaches may be utilized to identify the bipartite modules. We now clarify this point in the revised manuscript.

6) The manuscript can be difficult to read at times, primarily due to a reliance on its own jargon. In some cases new nomenclature might be necessary, but many of the new terms seem to be very similar (if not identical) to widely-used, existing terms. In such cases, the paper should conform to standard nomenclature or define why such standards will not suffice. Notable examples are: “variant-gene associations” instead of eQTL; “linked relationships” instead of Pearson correlation and “link potential” instead of averaged correlation; “bipartite module” instead of pleiotropy. Furthermore, “gene drivers” and the “drivers layer” can probably be more simply referred to as “transcripts” and a “transcript layer” with appropriate context. “Causality score” appears to be a P value; why not refer to it as such? In some cases the new nomenclature can be potentially misleading, such as in the Klf7 discussion where a P value is referred to as a “perturbation effect”. Effect size and significance are different concepts with different biological interpretations.

We thank the reviewers for this suggestion, and made the following changes accordingly:

1) Transcripts: The term 'gene' was replaced with ‘transcript’, and accordingly, the terms ‘driver layer’ and 'molecular layer' were replaced with ‘transcripts layer’. We modified the text throughout the manuscript accordingly.

2) Driver-related terms: We simplified the terminology by replacing the term 'candidate driver' with 'candidate transcript', and 'driver' with 'driver transcript'. Notably, we could not completely replace the term 'driver' and use the term 'transcript' instead, since we have two kinds of transcripts: the candidate transcripts in stage number II and the transcripts that were identified as drivers in stage number III.

3) Causative scores: The term 'causative score' was replaced with 'causality P value', and the term 'gene causality score' was replaced by 'transcript causality P value', as suggested.

4) The Klf7 discussion: In the results referring to module number 2 we now conformed to the standard terminology, where 'effect size' refers to difference between strains, and 'significance of effect' refers to the P value for these differences.

5) Linked relationships and link potential: In both cases, the term could not be replaced with an existing term. In the case of 'Linked relationships', we believe that replacing it with the term 'correlation' is misleading since 'correlation' refers to relations between two objects, whereas 'linked relationships' refers to the relations between three objects. In particular, 'linked relationships' refers to the combination of two existing terms: one is a simple 'correlation' between a transcript and a trait, and the second is an 'association' of a variant and a transcript. Therefore, there is no existing term referring to this combination. We now have the text slightly refined accordingly. Notably, since the term 'link potential' refers to an aggregation of 'linked relationships', we therefore could not use an existing term in this case.

6). Bipartite modules: In the GEMOT pipeline, bipartite modules do not represent any kind of pleiotropy, but only pleiotropy that is likely mediated through transcripts. The term 'Bipartite module' is therefore not equivalent to 'pleiotropy' and cannot be replaced. We now clarify this in the revised manuscript. Notably, bipartite graphs and sub-graphs conform to the standard terminology in the biclustering literature (e.g., Tanay et al., Bioinformatics 2002).

7) Variants: In some cases along the GEMOT pipeline, using the term 'eQTL' instead of 'variant' may be misleading. In particular, for the output GEMOT modules and association scores, the variant can be also terms an eQTL. However, in the remaining manuscript, the variant is not an eQTL. For example, during phase 1 of GEMOT, the variant only has linked relationships with a trait, therefore it cannot be declared as an eQTL; in bipartite and tripartite modules and also throughout the simulation studies (Figures 1A and 5A), reactive relationships may appear, and therefore the variant is not an eQTL. We believe that a consistent usage of the term 'variant' in all cases is more clear than using two different terms for the same object in different contexts.

Overall, in the revised manuscript, we changed 8 terms so as to conform to the standard terminology according to the reviewers’ suggestion.

7) High-dimensional approaches like GEMOT are prone to generating false positive associations, especially in such small experimental populations. Are P value systematically corrected for multiple tests? It is also important to estimate false discovery rates for a given significance threshold, which are not addressed in the manuscript. Additionally, the sample size in this study is very small, and BxD lines often exhibit non-local linkage disequilibrium due to limited recombination across few individuals. This can lead to many false-positive associations. How has this potential issue been addressed in the current study?

The reviewer raises two concerns regarding false positives. The first refers to the evaluation of FDR and the multiple testing problem, and the second relates to the linkage disequilibrium issue.

1) Multiple testing correction and FDR: In the revised manuscript, we now systematically correct for multiple testing and estimate FDR. First, in the case of real data, FDR is estimated based on a permutation test. We assessed the FDR for various predictions along the GEMOT pipeline, including the thresholds for identifying candidate transcripts and driver transcripts. We have substantially revised the Methods section accordingly. Second, we provide direct evaluation of FDR for the grouping of traits in synthetic networks, when the actual solution is known (new Figure 5–figure Supplement 5B and Figure 1–figure supplement 2C,D). Finally, We now systematically added multiple testing corrections for all reported P values throughout the manuscript. Notably, GEMOT provides the same results when applied with or without Boneferroni correction for the gene-variant association and link potential scores, since the construction of the bipartite modules is not sensitive to scaling of these values. We clarify this in the revised text.

2) Non-local linkage disequilibrium: We certainly agree that in some cases it would not be possible to discern between several potential genomic intervals due to non-local linkage disequilibrium. To address this issue, we tested the linkage disequilibrium (LD) between the genomic interval of each of our identified modules and the entire genome. We added a new Figure 2–figure supplement 1 presenting these results, allowing the observation of additional potential intervals for a module. We briefly discuss it in the manuscript (please see the Discussion section and the legend for Figure 2–figure supplement 1). For the particular modules presented in this study, all non-local LDs did not exceeded |r|=0.75.

https://doi.org/10.7554/eLife.04346.029

Article and author information

Author details

  1. Yael Oren

    Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
    Contribution
    YO, Conception and design, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  2. Aharon Nachshon

    Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
    Contribution
    AN, Conception and design, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  3. Amit Frishberg

    Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
    Contribution
    AF, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  4. Roni Wilentzik

    Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
    Contribution
    RW, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  5. Irit Gat-Viks

    Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
    Contribution
    IG-V, Conception and design, Analysis and interpretation of data, Drafting or revising the article
    For correspondence
    iritgv@post.tau.ac.il
    Competing interests
    The authors declare that no competing interests exist.

Funding

Israel Science Foundation (ISF) (1643/13)

  • Yael Oren
  • Roni Wilentzik

Israeli Centers for Research Excellence (I-CORE)

  • Yael Oren
  • Roni Wilentzik

Tel Aviv University (Edmond J. Safra Center for Bioinformatics)

  • Roni Wilentzik
  • Irit Gat-Viks

Broad Foundation (1168/14)

  • Aharon Nachshon
  • Amit Frishberg

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Prof. Marcelo Ehrlich for valuable discussions and comments, and Avital Brodt for artwork. This research is supported by the Israeli Centers of Research Excellence (I-CORE): Gene Regulation in Complex Human Disease, Center No. 41/11 (YO, RW); Israel Science Foundation (ISF) fund no. 1643/13 (YO, RW), ISF-Broad fund no. 1168/14 (AN, AF), and the Edmond J Safra Center for Bioinformatics at Tel-Aviv University (IGV, RW). IGV is a Faculty Fellow of the Edmond J Safra Center for Bioinformatics at Tel Aviv University and an Alon Fellow.

Reviewing Editor

  1. Naama Barkai, Weizmann Institute of Science, Israel

Version history

  1. Received: September 3, 2014
  2. Accepted: February 20, 2015
  3. Version of Record published: March 17, 2015 (version 1)

Copyright

© 2015, Oren et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,467
    Page views
  • 202
    Downloads
  • 5
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Yael Oren
  2. Aharon Nachshon
  3. Amit Frishberg
  4. Roni Wilentzik
  5. Irit Gat-Viks
(2015)
Linking traits based on their shared molecular mechanisms
eLife 4:e04346.
https://doi.org/10.7554/eLife.04346

Further reading

    1. Computational and Systems Biology
    Barbara Bravi, Andrea Di Gioacchino ... Rémi Monasson
    Research Article Updated

    Antigen immunogenicity and the specificity of binding of T-cell receptors to antigens are key properties underlying effective immune responses. Here we propose diffRBM, an approach based on transfer learning and Restricted Boltzmann Machines, to build sequence-based predictive models of these properties. DiffRBM is designed to learn the distinctive patterns in amino-acid composition that, on the one hand, underlie the antigen’s probability of triggering a response, and on the other hand the T-cell receptor’s ability to bind to a given antigen. We show that the patterns learnt by diffRBM allow us to predict putative contact sites of the antigen-receptor complex. We also discriminate immunogenic and non-immunogenic antigens, antigen-specific and generic receptors, reaching performances that compare favorably to existing sequence-based predictors of antigen immunogenicity and T-cell receptor specificity.

    1. Computational and Systems Biology
    2. Ecology
    Vanessa Rossetto Marcelino
    Insight

    High proportions of gut bacteria that produce their own food can be an indicator for poor gut health.