Multi-scale overview of the SynaptoGen model.

A. In neural networks, synaptogenesis (the formation of synapses) can be approximated as the outcome of interactions (e.g., molecular binding) between proteins translated from gene pairs. SynaptoGen models this process as the realization of a random variable defined as the sum of multiple binomial random variables (e.g., B_AB + B_BC), one for each potentially interacting gene pair. In the notation used, the subscript uv indicates the neuronal pair under consideration. B. The use of binomial random variables stems from the idea of modeling synapse formation for each gene pair as a process akin to flipping n_ij biased coins, each representing a Bernoulli(p_ij) random variable. C. When working with matrix representations of the genetic factors in the panel, it can be proven mathematically (see Section Methods) that a series of matrix multiplications and point-wise operations yields a matrix whose entry in the u-th row and v-th column contains the expected number of synapses formed between neurons u and v, multiplied by their average synaptic conductance. D. This resulting matrix, W (see Section Methods), can be interpreted as a weight matrix and integrated into architectures such as multilayer perceptrons (MLPs). Here, x denotes an MLP layer's input, while a denotes the resulting activations.
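The sum-of-binomials idea in panels A and B can be illustrated with a minimal sketch. It is not the authors' implementation: the gene-pair labels, the trial counts n_ij, and the probabilities p_ij below are hypothetical values chosen only for illustration, and the function names are assumptions. The sketch draws the synapse count for one neuronal pair as a sum of binomial variables and compares the empirical mean with the closed-form expectation, the sum of n_ij * p_ij over gene pairs.

```python
import random


def sample_synapse_count(trials, probs, rng):
    """Draw one realization of the synapse count for a neuronal pair.

    Each gene pair (i, j) contributes a Binomial(n_ij, p_ij) variable,
    simulated here as n_ij independent biased coin flips.
    """
    total = 0
    for pair, n in trials.items():
        p = probs[pair]
        total += sum(1 for _ in range(n) if rng.random() < p)
    return total


def expected_synapse_count(trials, probs):
    """Closed-form expectation: sum of n_ij * p_ij over gene pairs."""
    return sum(n * probs[pair] for pair, n in trials.items())


# Hypothetical gene pairs and parameters (not taken from the paper).
trials = {("A", "B"): 4, ("B", "C"): 6}
probs = {("A", "B"): 0.5, ("B", "C"): 0.25}

rng = random.Random(0)
samples = [sample_synapse_count(trials, probs, rng) for _ in range(20000)]
empirical_mean = sum(samples) / len(samples)
expected = expected_synapse_count(trials, probs)  # 4 * 0.5 + 6 * 0.25

print(f"expected = {expected}, empirical mean = {empirical_mean:.3f}")
```

Under these assumptions, multiplying the expected count by an average synaptic conductance would give a single entry of a weight matrix in the sense of panels C and D.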

Mean reward distributions from the model families, characterized by a 16-gene profile, tested on the four selected RL environments.

The green crosses represent the mean rewards, averaged over 10 episodes, achieved by the trained SynaptoGen models. Each scatterplot point represents the mean reward obtained by a specific agent. The black crosses instead denote the distribution means. Model families are color-coded. The dashed horizontal lines indicate the reward threshold beyond which the task associated with an environment is considered solved.

Summary metrics computed over the reward distributions obtained from the model families characterized by a 16-gene profile.

Three metrics are reported: the distribution mean, a mean computed over the top-10 agents, and the percentage of simulated agents that solved the task. Each group of rows refers to one of the selected RL environments and the best scores are highlighted in bold.
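As a rough illustration of how the three summary metrics could be computed from a list of per-agent mean rewards, consider the sketch below. The function name `summarize_rewards`, the example rewards, and the threshold value are hypothetical. Treating a reward equal to the threshold as "solved" is also an assumption, since the captions only state that the task is solved beyond the threshold.

```python
def summarize_rewards(mean_rewards, solved_threshold, top_k=10):
    """Compute the distribution mean, the top-k mean, and the percentage solved.

    mean_rewards: one mean reward per simulated agent.
    solved_threshold: reward at or above which the task counts as solved
    (the inclusive comparison is an assumption).
    """
    ranked = sorted(mean_rewards, reverse=True)
    top = ranked[:top_k]
    solved = sum(1 for r in ranked if r >= solved_threshold)
    return {
        "mean": sum(ranked) / len(ranked),
        "top_k_mean": sum(top) / len(top),
        "percent_solved": 100.0 * solved / len(ranked),
    }


# Hypothetical rewards for 20 agents: 0, 10, ..., 190.
rewards = [float(r) for r in range(0, 200, 10)]
summary = summarize_rewards(rewards, solved_threshold=150.0)
print(summary)
```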

A. Co-expression data computed following our extended nDGE variant from the expression patterns released with Taylor et al. (2021). In the matrix, rows correspond to genes expressed in pre-synaptic neurons, while columns represent genes utilized in post-synaptic neurons. The green circles indicate pairs of genes that are co-expressed in neurons that are connected but not co-expressed in neurons that could be connected but lack synapses. Genes involved in co-expressed pairs are highlighted in bold. B. Visualization of the two sigmoids used to map the learnable parameters associated with the genetic rules into the probabilities in O. The green sigmoid is applied to the parameters corresponding to co-expressed pairs, while the pink sigmoid is applied to the remaining ones. We also show in blue the parameter distribution after initialization, and in green (co-expressed pairs) and pink (non-co-expressed pairs) the distributions of the probabilities obtainable in an example scenario where 50% of the gene pairs are co-expressed. C. Genetic rules learned by SynaptoGen in our bio-plausible validations. The emerging patterns are fully consistent with the co-expression data in A (correlation coefficients: 0.92, 0.74, 0.44, 0.84; same order as in C) and demonstrate how the model assigned high probabilities (~1) specifically to the genetic rules corresponding to the pairs of co-expressed genes in the C. elegans nerve ring.
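The two-sigmoid mapping in panel B can be sketched as follows. The caption does not give the exact form of the sigmoids, so the symmetric horizontal shift used here (`shift=4.0`) and the function name `rule_probability` are purely illustrative assumptions. The sketch only shows the general idea: the same learnable parameter maps to a high probability for a co-expressed pair and to a low one otherwise.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def rule_probability(theta, co_expressed, shift=4.0):
    """Map a learnable parameter to a genetic-rule probability.

    Assumption: co-expressed pairs use a sigmoid shifted toward high
    probabilities, the remaining pairs one shifted toward low probabilities.
    The shift value is hypothetical, not taken from the paper.
    """
    offset = shift if co_expressed else -shift
    return sigmoid(theta + offset)


# Hypothetical parameter values, e.g. shortly after initialization.
thetas = [-1.0, 0.0, 1.0]
p_co = [rule_probability(t, co_expressed=True) for t in thetas]
p_non = [rule_probability(t, co_expressed=False) for t in thetas]
print("co-expressed:    ", [round(p, 3) for p in p_co])
print("non-co-expressed:", [round(p, 3) for p in p_non])
```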

Mean reward distributions from the model families, characterized by a 39-gene profile which includes the genetic rules derived from C. elegans, tested on the four selected RL environments.

The green crosses represent the mean rewards, averaged over 10 episodes, achieved by the trained SynaptoGen models. Each scatterplot point represents the mean reward obtained by a specific agent. The black crosses instead denote the distribution means. Model families are color-coded. The dashed horizontal lines indicate the reward threshold beyond which the task associated with an environment is considered solved.

Summary metrics computed over the reward distributions obtained from the model families characterized by a 39-gene profile which includes the genetic rules derived from C. elegans.

Three metrics are reported: the distribution mean, a mean computed over the top-10 agents, and the percentage of simulated agents that solved the task. Each group of rows refers to one of the selected RL environments and the best scores are highlighted in bold.

Our synaptogenesis simulation procedure.

The SNES optimization technique.

Our lineal model-based initialization procedure for gene expression.

Mean reward distributions from the model families, characterized by a 32-gene profile, tested on the four selected RL environments.

The green crosses represent the mean rewards, averaged over 10 episodes, achieved by the trained SynaptoGen models. Each scatterplot point represents the mean reward obtained by a specific agent. The black crosses instead denote the distribution means. Model families are color-coded. The dashed horizontal lines indicate the reward threshold beyond which the task associated with an environment is considered solved.

Summary metrics computed over the reward distributions obtained from the model families characterized by a 32-gene profile.

Three metrics are reported: the distribution mean, a mean computed over the top-10 agents, and the percentage of simulated agents that solved the task. Each group of rows refers to one of the selected RL environments and the best scores are highlighted in bold.

Mean reward distributions from the model families, characterized by a 64-gene profile, tested on the four selected RL environments.

The green crosses represent the mean rewards, averaged over 10 episodes, achieved by the trained SynaptoGen models. Each scatterplot point represents the mean reward obtained by a specific agent. The black crosses instead denote the distribution means. Model families are color-coded. The dashed horizontal lines indicate the reward threshold beyond which the task associated with an environment is considered solved.

Summary metrics computed over the reward distributions obtained from the model families characterized by a 64-gene profile.

Three metrics are reported: the distribution mean, a mean computed over the top-10 agents, and the percentage of simulated agents that solved the task. Each group of rows refers to one of the selected RL environments and the best scores are highlighted in bold.

Our global nDGE variant.