A mathematical model clarifies the ABC Score formula used in enhancer-gene prediction

  1. Department of Systems Biology, Harvard Medical School, Boston, United States
  2. Department of Physics, Brandeis University, Waltham, United States
  3. Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, United States

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.

Editors

  • Reviewing Editor
    Ariel Amir
    Weizmann Institute of Science, Rehovot, Israel
  • Senior Editor
    Alan Moses
    University of Toronto, Toronto, Canada

Reviewer #1 (Public review):

Summary:

The authors aim to formalize the mathematical underpinnings of a proposed general model and discuss the relationship of this model to the ABC Score, a widely adopted heuristic for enhancer-gene predictions. While the ABC model serves as a useful binary classifier, it struggles to predict quantitative enhancer effects on gene expression. Using a graph-theoretic linear framework, the authors derive a mathematical model (the "default model") that explains how the algebraic form of the ABC Score arises under specific assumptions. They further demonstrate that the default model's predictions of enhancer additivity are inconsistent with observed non-additive enhancer effects and propose alternative assumptions to account for these discrepancies.

Strengths:

The graph-theoretic approach enables systematic exploration of enhancer interactions beyond simple additivity and enables hypothesis generation when such expectations fail. This work makes clear where assumptions are made and the consequences of those assumptions.

Weaknesses:

While the theoretical framework is elegant, there is room to demonstrate the practicality of the approach more fully. Further guidance on how to connect this framework experimentally with typical measurements would help make its immediate benefits clear. To be clear, I do not think this is something the authors "must" do, but rather something that might help drive home the usefulness in a more accessible way.

Reviewer #2 (Public review):

Summary:

The Activity-by-Contact (ABC) model is a relatively widespread model of enhancer-gene regulation. This model leverages CRISPRi data to predict whether a gene is regulated by a given enhancer. To make this possible, this model accounts for the activity of an enhancer and its contact frequency with a target promoter in order to produce an "ABC score". However, while quantitative in its ability to predict enhancer-promoter regulation, this model is mostly phenomenological and does not commit to specific molecular mechanisms.

In this manuscript, the authors formalize the molecular and mathematical assumptions made by the ABC model. Specifically, they demonstrate a basic set of assumptions that can be made to arrive at the ABC model's mathematical structure. The resulting default model (basically, a null model) places particular emphasis on the requirement that gene activation and enhancer-gene communication must be independent and at a steady state. The authors leverage and extend a graph-based formalism they have previously spearheaded to show the generality of their conclusions with respect to different molecular realizations of the process by which enhancers interact with their promoters.

Previously published works have found that specific models of how multiple enhancers communicate with the same gene can result in additive mRNA production rates. Here, the authors demonstrate that steady-state mRNA levels are additive regardless of the specific Markovian model for how any individual enhancer communicates with the gene, as long as the model follows the basic assumptions of their default model.

By coarse-graining both gene activation and enhancer-gene communication to simple two-state models, the authors then clearly demonstrate that the mathematical structure of the ABC model emerges. This mathematical structure implies that the ABC score summed over all the enhancers regulating a given gene must equal 1. However, experimental measurements show values ranging from 0 to 3. The authors show that, in order to explain these experimental deviations with respect to the theory, at least one of the assumptions of the default model must be broken. They demonstrate that either invoking enhancer cooperativity in mRNA production rates or breaking the assumption that individual enhancers communicate with the gene independently can explain existing experimental data.
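The sum-to-1 property can be illustrated with a minimal sketch. It assumes the commonly stated form of the ABC Score, in which each enhancer's activity multiplied by its contact frequency with the promoter is normalised by the sum of the same product over all candidate enhancers of that gene. The function name `abc_scores` and the numerical values are hypothetical and are not taken from the manuscript or from any dataset.

```python
def abc_scores(activity, contact):
    """Return one ABC score per enhancer for a single gene.

    Assumed form: score_e = (A_e * C_e) / sum over all enhancers e' of (A_e' * C_e').
    `activity` and `contact` are equal-length sequences of non-negative numbers.
    """
    if len(activity) != len(contact):
        raise ValueError("activity and contact must have the same length")
    # Activity-by-contact product for each enhancer.
    products = [a * c for a, c in zip(activity, contact)]
    total = sum(products)
    if total == 0:
        raise ValueError("at least one enhancer needs non-zero activity and contact")
    # Normalising by the total is what forces the scores to sum to 1.
    return [p / total for p in products]


# Hypothetical values for three enhancers of one gene (illustration only).
activity = [4.0, 1.0, 2.5]
contact = [0.10, 0.50, 0.20]
scores = abc_scores(activity, contact)
score_sum = sum(scores)
```

Because of the normalisation, `score_sum` equals 1 up to floating-point rounding whatever inputs are chosen. Measured effects whose sum departs from 1 therefore cannot be reproduced by this algebraic form alone, which is why at least one assumption of the default model has to be relaxed.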

Strengths:

By demonstrating that the mathematical structure of the ABC model emerges from a set of basic assumptions including the independence of gene activation and enhancer-gene communication, the authors succeeded in their aim to put the ABC model on a formal and molecular footing. Since some experimental results do not agree with the ABC model, the authors importantly demonstrated which assumptions of the model can be broken to explain such data. The theoretical work in this manuscript is written in a reasonably accessible manner that features how a graph theory-based approach to modeling biochemical networks can result in general statements about biological phenomena.

Weaknesses:

While the authors discuss a number of experimental techniques that can be used to test the validity of their model, a more specific discussion of proposed experiments could have strengthened the impact of the paper by providing explicit opportunities for dialogue with experimentalists.

Author response:

We thank both reviewers for their time and effort in considering our manuscript. We are pleased that the reviewers recognised the strength of our theoretical analysis and found it "elegant" and "reasonably accessible". We also acknowledge the suggestions made by both reviewers that the manuscript could be improved by more discussion of potential experiments. We were concerned not to make the original manuscript too long but, in the light of the reviewers' comments, we will submit a revised version with more details of the kinds of experiments that would build on the results that we have presented.
