A differentiable model for optimizing the genetic drivers of synaptogenesis

  1. Department of Biomedicine and Prevention, University of Rome Tor Vergata, Rome, Italy
  2. A.A. Martinos Center for Biomedical Imaging and Harvard Medical School, Boston, United States

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.


Editors

  • Reviewing Editor
    Dániel Barabási
    Harvard University, Cambridge, United States of America
  • Senior Editor
    Panayiota Poirazi
    FORTH Institute of Molecular Biology and Biotechnology, Heraklion, Greece

Reviewer #1 (Public review):

The authors address a set of important and challenging questions at the interface of (developmental) neuroscience, genetics, and computation. They ask how complex neural circuits could emerge from compact genomic information, and they outline a bold vision in which this process might eventually be harnessed to design synthetic biological intelligence through genetic control of synaptogenesis. These are significant and stimulating ideas that merit rigorous theoretical and experimental exploration.

However, the present work does not convincingly engage with these questions at a mechanistic level. Most of the circuit formation aspects appear to be adopted from prior models, and it is not clear how the main methodological modifications (introducing synaptic conductance and stochastic formalisms) provide new conceptual insight into genomic specification of neural circuitry. The manuscript does not include significant biological data or validation to support the proposed framework, and the results provided instead use artificial reinforcement learning benchmarks, which do not appear informative with respect to the biological claims.

Overall, while the manuscript raises intriguing themes and ambitions, the proposed model is conceptually disconnected from the biological problem it purports to address. The strength of evidence does not support the strong interpretative or translational claims, and substantial rethinking of the modeling framework, in particular its validation strategy, would be required for the work to match the claims of our improved understanding of the genomic basis of neural circuit formation and our ability to engineer it.

Reviewer #2 (Public review):

In this manuscript, the authors built upon the Connectome Model literature and proposed SynaptoGen, a differentiable model that explicitly takes into account multiplicity and conductance in neural connectivity. The authors evaluated SynaptoGen through simulated reinforcement learning tasks and established its performance as often superior to two considered baselines. This work is a valuable addition to the field, supported by a solid methodology with some details and limitations missing.

Major points:

(1) The genetic features in the X and Y matrices in the CM were originally introduced as combinatorial gene expression patterns that correspond to the presence and even absence of a subset of genes. The authors oversimplify this original scope by only considering single-gene expression features. While this was arguably a reasonable first approximation for a case study of gap junctions in C. elegans, it is by no means a plausible assumption for chemical synapses. As the authors appear to motivate their model by chemical synapses that have polarities, they should either consider combinatorial rules in the model or at least present this explicitly as a key limitation of the model. Omitting combinatorial effects also renders the presented "bioplausible" baseline much less bioplausible, likely calling for a different name.

(2) It is not fully explained how Equation (11) is obtained, even conceptually. It is unclear why \bar{B} and \bar{G} should be element-wise multiplied together, both already being expected values. Moreover, the authors acknowledged in lines 147-149 that the components of \bar{G} actually depend on gene expression X, which is a component in \bar{B}, so the logic here seems circular.

(3) The authors considered two baselines, namely SNES and a bioplausible control. However, it would be of interest to also investigate: a) Vanilla DQN with the same size trained on the same MLP, to judge whether the biological insights behind SynaptoGen parameterization add value to performance. b) Using Equation (7) instead of Equation (11) to construct the weight matrices, to judge whether incorporating the conductance adds value to performance.

Reviewer #3 (Public review):

Summary

Boccato et al. present an ambitious and thoughtfully developed framework, SynaptoGen, which proposes a differentiable model of synaptogenesis grounded in gene-expression vectors, protein interaction probabilities, and conductance rules. The authors aim to bridge the gap between computational connectomics and synthetic biological intelligence by enabling gradient-based optimization of genetically encoded circuit architectures. They support this goal with mathematical derivations, simulation experiments across several RL benchmarks, and a biologically grounded validation using C. elegans adhesion-molecule co-expression data. The paper is timely and conceptually compelling, offering a unified formulation of synaptic multiplicity and synaptic weight formation that can be integrated directly into learning systems.

Strengths

(1) Well-motivated framework with clear conceptual contributions.

(2) Rigorous mathematical development.

(3) Compelling empirical validation.

(4) Excellent framing and discussion of future impact.

Weaknesses

(1) Overstated claims in the abstract and discussion.

(2) Ambiguity in "first of its kind" assertions.

Author response:

Response to Reviewer #1

Our work builds upon the foundations of what we term the “CM family”, specifically the Connectome Model (CM) introduced by Kovács et al. This was a deliberate choice, as our objectives substantially overlap with those of works in this family. Moreover, we wished to avoid reinventing the wheel, starting instead from a solid body of work with validations we found convincing (thereby inheriting this solidity) and, importantly, addressing the same research community using a “familiar” conceptual language. We therefore wish to clarify how our contributions indeed constitute new conceptual insights into the genomic specification of neural circuitry.

The function implemented by a neural circuit clearly depends on how information propagates between its nodes and connections; the contribution of synapses—their number and properties—cannot be neglected when understanding, manipulating, or designing such function. To the best of our understanding, in Kovács et al., the primary objects of interest are binary connectomes (presence or absence of synapses) or weighted connectomes where “in the occasion of multiple [genetic] rules contributing to the same link”, “the weight of each link correspond[s] to the number of rules involved”. In Barabási et al., a “relaxed” version of the CM directly provides weights for an artificial neural network without explicitly specifying how each weight might result from the combination of a specific number of synapses and their respective properties. The random variable formalism and the introduction of conductances that we propose precisely add this further—yet important—element of complexity and representational detail: synaptic multiplicity. This extends existing models with the hope of laying the groundwork for what could, in the distant future, become a technology capable of producing neural circuits genetically programmed to implement a defined function.

Regarding the proposed validation, we acknowledge its limitations, but we clarify that at the time this work was conducted, to the best of our knowledge, no public datasets existed to perform validation as the reviewer envisions. We therefore did the best that was materially feasible: we assumed the biological correctness of the model (also based on the validations accompanying the models upon which ours was built) and verified, through simulation, that it could be used to obtain genetic variables of interest capable of producing neural agents able to solve a pre-specified task—even with the additional constraint of genetic rules derived from experimental data.

Response to Reviewer #2

We address the points raised by Reviewer #2 in the following paragraphs.

Regarding point (1), we agree with the reviewer that considering single-gene expression features is a simplification, especially in the case of chemical synapses. However, as with the CM, our model can also be extended to account for combinatorial rules. One possibility is to add columns to the X matrix, one for each gene expression pattern of interest. For each new column, a function would be defined to compute the expression feature from the expression features of the genes involved in the pattern, and this function would be used to populate the values of the new column. The O matrix would likewise be updated with the corresponding new probabilities. While such an extension is possible, it is important to note that it gives rise to a combinatorial explosion of genetic rules, and hence to matrices whose dimensionality becomes difficult to handle. Moreover, the biological plausibility of the model would then shift toward how these functions are defined, along with the interpretation of the values contained in the X matrix. Depending on the use case of our model, one possible solution to the combinatorial explosion problem could be to consider only expression patterns valid for synapse formation by extracting this information from available experimental data, thereby restricting the number of rules. We acknowledge that this problem remains open and will require more precise formulations and future work.
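The extension described above could be sketched as follows. This is only an illustrative sketch under assumptions that are not taken from the paper: expression features are assumed to lie in [0, 1], the O matrix is assumed to be square with one row and column per expression feature, and the combining function (a product over genes required to be present, times one minus the expression of genes required to be absent) is just one possible choice. The name `extend_with_patterns` and its parameters are hypothetical.

```python
import numpy as np


def extend_with_patterns(X, O, patterns, fill_prob=0.0):
    """Append one column to X per combinatorial pattern and pad O to match.

    Assumptions (not from the paper):
      X: (n_neurons, n_features) expression features in [0, 1].
      O: (n_features, n_features) rule probabilities.
      patterns: list of (present, absent) tuples of column indices of X.
      fill_prob: placeholder probability for the new entries of O, which
                 would have to be supplied from data or optimized.
    """
    X = np.asarray(X, dtype=float)
    O = np.asarray(O, dtype=float)

    new_cols = []
    for present, absent in patterns:
        # Soft AND: high only if all "present" genes are expressed
        # and all "absent" genes are not.
        col = np.ones(X.shape[0])
        for g in present:
            col *= X[:, g]
        for g in absent:
            col *= 1.0 - X[:, g]
        new_cols.append(col)

    if not new_cols:
        return X.copy(), O.copy()

    X_ext = np.column_stack([X] + new_cols)

    # Pad O with placeholder probabilities for every rule that
    # involves at least one of the new pattern features.
    n_old, n_new = O.shape[0], X_ext.shape[1]
    O_ext = np.full((n_new, n_new), fill_prob)
    O_ext[:n_old, :n_old] = O
    return X_ext, O_ext


# Two neurons, two genes, one pattern: gene 0 present AND gene 1 absent.
X = np.array([[1.0, 0.0], [0.5, 0.5]])
O = np.eye(2)
X_ext, O_ext = extend_with_patterns(X, O, [((0,), (1,))])
print(X_ext.shape, O_ext.shape)
```

If every gene may be required present, required absent, or ignored, the number of non-empty patterns over n genes is 3^n - 1, which illustrates why the list of patterns would in practice have to be restricted, for example to patterns supported by experimental data.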

Regarding point (2), Equation (11) can be derived from the assumption that the various synapses between two neurons behave as resistors in parallel. Accepting this, the equivalent conductance G_{uv}, as denoted in the paper, can be expressed as the sum of all conductances between neurons u and v. Moving to the random variable formalism and having defined 𝒢 as the random variable representing the “signed conductance of a synapse randomly selected from the ones that connect neurons u and v”, the equivalent conductance (as a random variable) becomes ℬ·𝒢. Recall that ℬ is the random variable representing the number of synaptic connections between two neurons of interest. At this point, under the further assumption that the random variables ℬ and 𝒢 are independent, the expectation of the equivalent conductance can be calculated as the product of the expected values of ℬ and 𝒢. Equation (11) follows immediately from this. We acknowledge that these assumptions may not correspond to biological reality, but we consider them a reasonable starting point for addressing the problem.
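The chain of steps in this argument can be written out compactly. The following is a sketch under the two stated assumptions (parallel conductances and independence of ℬ and 𝒢); the per-synapse symbol g_k and the name \bar{W} for the resulting matrix are introduced here for illustration only and may differ from the notation of the paper.

```latex
% Assumption 1: the synapses between u and v act as resistors in parallel,
% so their conductances add (g_k is a hypothetical per-synapse symbol).
G_{uv} = \sum_{k=1}^{B_{uv}} g_k

% Random variable formalism: \mathcal{B} counts the synapses between u and v,
% \mathcal{G} is the signed conductance of a randomly selected one of them,
% and the equivalent conductance is modeled as their product.
G_{uv} \;\longrightarrow\; \mathcal{B}\,\mathcal{G}

% Assumption 2: \mathcal{B} and \mathcal{G} are independent, so the
% expectation of the product factorizes.
\mathbb{E}[\mathcal{B}\,\mathcal{G}]
  = \mathbb{E}[\mathcal{B}]\,\mathbb{E}[\mathcal{G}]

% Applying this to every neuron pair (u, v) gives an element-wise product
% of the two matrices of expected values (\bar{W} is a hypothetical name).
\bar{W} = \bar{B} \odot \bar{G}
```

Written this way, the factorization in the third step is exactly where the independence assumption enters: if 𝒢 depends on ℬ through shared gene expression, a covariance term would have to be added to the product of the expectations.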

Finally, we explain why the baselines suggested by the reviewer are not included in the work. We did not train classical MLPs because the main objective of the work was not to develop new bio-inspired architectures aimed at generically improving the performance of neural networks in RL, and we felt that a comparison suggesting this direction would be an additional source of confusion. The main objective of the work is instead to contribute to the modeling of synaptogenesis and to lay the groundwork for, or advance the state of knowledge of, a future technology that would allow us to manipulate synaptogenesis. Similar reasoning applies to a potential baseline in which the weight matrix is constructed from Equation (7). Again, the interest is not in verifying that conductances provide a performance advantage, but rather in the fact that they are a necessary element for a sufficient level of biological plausibility. Beyond this, the exclusive and direct use of matrix B in the simulation of synaptogenesis introduces a quantization problem, as described in the Appendix.

Response to Reviewer #3

We believe the concerns raised by the reviewer regarding the weaknesses of the work are legitimate. We wish to emphasize that all claims made in the paper were made in good faith, with the intent to generate enthusiasm for the discipline while avoiding excess or the assertion of anything incorrect or untruthful. Given that the work is inherently interdisciplinary, we recognize that reader expectations depend on their reference community, and we clarify that our primary area of expertise is AI, and that the biological claims were therefore made from this perspective.
