Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.
Read more about eLife’s peer review process.Editors
- Reviewing EditorMurim ChoiSeoul National University, Seoul, Republic of Korea
- Senior EditorMurim ChoiSeoul National University, Seoul, Republic of Korea
Reviewer #1 (Public review):
The authors tried to quantify the difference between human complex traits by calculating genetic overlap scores between a pair of traits. Sherlock-II was devised to integrate GWAS with eQTL signals. The authors claim that Sherlock-II is superior to the previous version (robustness, accuracy, etc). It appears that their framework provides a reasonable solution to this important question, although the study needs further clarification and improvements.
(1) Sherlock-II incorporates GWAS and eQTL signals to better quantify genetic signals for a given complex trait. However, this approach is based on the hypothesis that "all GWAS signals confer association to complex trait via eQTL", which is not true (PMID: 37857933). This should be acknowledged (through mentioning in the text) and incorporated into the current setup (through differential analysis - for example, with or without eQTL signals, or with strong colocalization only).
(2) When incorporating eQTL, why did the authors use the top p-value tissues for eQTL? This approach seems simpler and probably more robust. But many eQTLs are tissue-specific. Therefore, it would also be important to know if eQTLS from appropriate tissues were incorporated instead.
(3) One of the main examples is the novel association between Alzheimer's disease and breast cancer. Although the authors provided a molecular clue underlying the association, it is still hard to comprehend the association easily, as the two diseases are generally known to be exclusive to each other. This is probably because breast cancer GWAS is performed for germline variants and does not consider the contribution of somatic variants.
(4) It would help readers understand the story better if a summary figure of the entire process were provided. The current Figure 1 does not fulfil that role.
(5) Figure 2 is not very informative. The readers would want to know more quantitative information rather than a heatmap-style display. Is there directionality to the relationship, or is it always unidirectional?
(6) In Figure 3, readers may want to know more specific information. For example, what gene signals are really driving the hypoxia signal in Alzheimer's disease vs breast cancer? And what SNP signals are driving these gene-level signals?
Reviewer #2 (Public review):
Summary:
The authors introduce a gene-level framework to detect shared genetic architecture between complex traits by integrating GWAS summary statistics with eQTL data via a new algorithm, Sherlock-II, which aggregates signals from multiple (cis/trans) eSNPs to produce gene-phenotype p-values. Shared pathways are identified with Partial-Pearson-Correlation Analysis (PPCA).
Strengths:
The authors show the gene-based approach is complementary and often more sensitive than SNP-level methods, and discuss limitations (in terms of no directionality, dependence on eQTL coverage).
Weaknesses:
(1) How do the authors explain data where missing tissues or sparse eQTL mapping are available? Would that bias as to which genes/traits can be linked and may produce false negatives or tissue-specific false positives?
(2) Aggregating SNP-level signals into gene scores can be confounded by LD; for example, a nearby causal variant for a different gene or non-expression mechanism may drive a gene's score, producing spurious gene-trait links. How do the authors prevent this?
(3) How the SNPs are assigned to genes would affect results, this is because different choices can change which genes appear shared between traits. The authors can expand on these.
(4) Many reported novel trait links remain speculative without functional or orthogonal validation (e.g., colocalization, perturbation data). Thus, the manuscript's claims are inconclusive and speculative.
(5) It would be best to run LD-aware colocalization and power-matched simulations to check for robustness.