<?xml version="1.0" ?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.3 20210610//EN"  "JATS-archivearticle1-mathml3.dtd"><article xmlns:ali="http://www.niso.org/schemas/ali/1.0/" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="1.3" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">elife</journal-id>
<journal-id journal-id-type="publisher-id">eLife</journal-id>
<journal-title-group>
<journal-title>eLife</journal-title>
</journal-title-group>
<issn publication-format="electronic" pub-type="epub">2050-084X</issn>
<publisher>
<publisher-name>eLife Sciences Publications, Ltd</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">91532</article-id>
<article-id pub-id-type="doi">10.7554/eLife.91532</article-id>
<article-id pub-id-type="doi" specific-use="version">10.7554/eLife.91532.1</article-id>
<article-version-alternatives>
<article-version article-version-type="publication-state">reviewed preprint</article-version>
<article-version article-version-type="preprint-version">1.1</article-version>
</article-version-alternatives>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
</subj-group>
<subj-group subj-group-type="heading">
<subject>Computational and Systems Biology</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Deciphering the Genetic Code of Neuronal Type Connectivity: A Bilinear Modeling Approach</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Qiao</surname>
<given-names>Mu</given-names>
</name>
<xref ref-type="aff" rid="a1">1</xref>
<xref ref-type="corresp" rid="cor1">*</xref>
</contrib>
<aff id="a1"><label>1</label><institution>LinkedIn</institution>, Mountain View, CA, 94043</aff>
</contrib-group>
<contrib-group content-type="section">
<contrib contrib-type="editor">
<name>
<surname>Nelson</surname>
<given-names>Sacha B</given-names>
</name>
<role>Reviewing Editor</role>
<aff>
<institution-wrap>
<institution>Brandeis University</institution>
</institution-wrap>
<city>Waltham</city>
<country>United States of America</country>
</aff>
</contrib>
<contrib contrib-type="senior_editor">
<name>
<surname>Nelson</surname>
<given-names>Sacha B</given-names>
</name>
<role>Senior Editor</role>
<aff>
<institution-wrap>
<institution>Brandeis University</institution>
</institution-wrap>
<city>Waltham</city>
<country>United States of America</country>
</aff>
</contrib>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>*</label> Corresponding author; email: <email>muqiao0626@gmail.com</email></corresp>
</author-notes>
<pub-date date-type="original-publication" iso-8601-date="2023-11-28">
<day>28</day>
<month>11</month>
<year>2023</year>
</pub-date>
<volume>12</volume>
<elocation-id>RP91532</elocation-id>
<history>
<date date-type="sent-for-review" iso-8601-date="2023-08-25">
<day>25</day>
<month>08</month>
<year>2023</year>
</date>
</history>
<pub-history>
<event>
<event-desc>Preprint posted</event-desc>
<date date-type="preprint" iso-8601-date="2023-08-04">
<day>04</day>
<month>08</month>
<year>2023</year>
</date>
<self-uri content-type="preprint" xlink:href="https://doi.org/10.1101/2023.08.01.551532"/>
</event>
</pub-history>
<permissions>
<copyright-statement>© 2023, Qiao</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Qiao</copyright-holder>
<ali:free_to_read/>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<ali:license_ref>https://creativecommons.org/licenses/by/4.0/</ali:license_ref>
<license-p>This article is distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution License</ext-link>, which permits unrestricted use and redistribution provided that the original author and source are credited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="elife-preprint-91532-v1.pdf"/>
<abstract>
<title>Abstract</title>
<p>Understanding how different neuronal types connect and communicate is critical to interpreting brain function and behavior. However, it has remained a formidable challenge to decipher the genetic underpinnings that dictate the specific connections formed between pre- and post-synaptic neuronal types. To address this, we propose a novel bilinear modeling approach that leverages an architecture similar to that of recommendation systems. Our model transforms the gene expressions of mouse bipolar cells (presynaptic) and retinal ganglion cells (postsynaptic), obtained from single-cell transcriptomics, into a covariance matrix. The objective is to construct a covariance matrix that closely mirrors a connectivity matrix, derived from connectomic data, reflecting the known anatomical connections between these neuronal types. Our model successfully recapitulates recognized connectivity motifs and provides interpretable insights into genetic interactions that shape the connectivity. Specifically, it identifies unique genetic signatures associated with different connectivity motifs, including genes important to cell-cell adhesion and synapse formation, highlighting their role in orchestrating specific synaptic connections between these neurons. Our work establishes an innovative computational strategy for decoding the genetic programming of neuronal type connectivity. It not only sets a new benchmark for single-cell transcriptomic analysis of synaptic connections but also paves the way for mechanistic studies of neural circuit assembly and genetic manipulation of circuit wiring.</p>
</abstract>

</article-meta>
<notes>
<notes notes-type="competing-interest-statement">
<title>Competing Interest Statement</title><p>The authors have declared no competing interest.</p></notes>
</notes>
</front>
<body>
<sec id="s1">
<label>1</label>
<title>Introduction</title>
<p>One of the fundamental objectives in neuroscience is understanding how diverse neuronal cell types establish connections to form functional circuits. This understanding serves as a cornerstone for decoding how the nervous system processes information and coordinates responses to stimuli [<xref ref-type="bibr" rid="c1">1</xref>]. Despite this, the genetic mechanisms determining the specific connections between distinct neuronal types, especially within complex brain structures, remain elusive [<xref ref-type="bibr" rid="c2">2</xref>, <xref ref-type="bibr" rid="c3">3</xref>].</p>
<p>Recent advances in transcriptomics and connectomics provide opportunities to probe these mechanisms. Single-cell transcriptomics enables high-resolution profiling of gene expressions across neuronal types [<xref ref-type="bibr" rid="c4">4</xref>, <xref ref-type="bibr" rid="c5">5</xref>], while connectomic data offers detailed maps quantifying connections between neuronal cell types [<xref ref-type="bibr" rid="c6">6</xref>, <xref ref-type="bibr" rid="c7">7</xref>, <xref ref-type="bibr" rid="c8">8</xref>]. However, the challenge of linking gene expressions derived from single-cell transcriptomics to precise neuronal connectivity patterns evident from connectomic data has yet to be fully addressed.</p>
<p>Drawing inspiration from the field of machine learning, particularly recommendation systems, we introduce a bilinear model to bridge this gap. This model, in the context of recommendation systems, has been successful in capturing intricate user-item interactions [<xref ref-type="bibr" rid="c9">9</xref>]. By treating the gene expressions of pre- and post-synaptic neurons and their connectivity akin to users, items, and their ratings, we adapt the architecture of recommendation systems to the neurobiological domain. We hypothesize that a similar model could capture the complex relationships between genetic patterns of presynaptic and postsynaptic neurons and their connectivity.</p>
<p>Applying this model to single-cell transcriptomic and connectomic data from mouse retinal neurons, we demonstrate that it can effectively learn connectivity patterns between bipolar cells (BCs, presynaptic) and retinal ganglion cells (RGCs, postsynaptic). The model not only unveils connectivity motifs between BCs and RGCs but also provides biologically meaningful insights into candidate genes and the genetic interactions that orchestrate this connectivity. Furthermore, our model predicts potential BC partners for RGC transcriptomic types, and these predictions align substantially with functional descriptions of these cell types from previous studies. Collectively, this work significantly contributes to the ongoing exploration of the genetic code underlying neuronal connectivity and suggests a potential paradigm shift in the analysis of single-cell transcriptomic data in neuroscience.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Background: Synaptic Specificity between Neuronal Types</title>
<p>The intricate neural networks that form the basis of our nervous system are a product of specific synaptic connections between different types of neurons. This specificity is not a mere coincidence but a meticulously orchestrated process that underpins the functionality of the entire network [<xref ref-type="bibr" rid="c3">3</xref>]. Each neuron can form thousands of connections, or synapses, with other neurons, and the specificity of these connections determines the neuron’s function and, by extension, the network’s function as a whole. A classic example of this is seen in the retina, where different types of BCs form specific synaptic connections with various types of RGCs [<xref ref-type="bibr" rid="c7">7</xref>, <xref ref-type="bibr" rid="c10">10</xref>, <xref ref-type="bibr" rid="c11">11</xref>]. These connections create parallel pathways that transform visual signals from photoreceptors to RGCs, which subsequently transmit the information to the brain [<xref ref-type="bibr" rid="c12">12</xref>, <xref ref-type="bibr" rid="c13">13</xref>].</p>
<p>The genetic principles guiding the formation of these specific connections, particularly in complex brain structures, remain elusive. The brain’s complexity, with its billions of neurons and trillions of synapses, poses significant challenges in identifying the specific genes and genetic mechanisms that guide the formation of these connections. Despite advances in genetic and neurobiological research, such as understanding the roles of certain recognition molecules and adhesion molecules in synaptic specificity, the genetic foundation of connectivity between neuronal types is still largely unknown [<xref ref-type="bibr" rid="c14">14</xref>, <xref ref-type="bibr" rid="c3">3</xref>, <xref ref-type="bibr" rid="c15">15</xref>].</p>
<p>Emerging tools and technologies offer unprecedented opportunities to unravel these mysteries. Among these, the transcriptome and connectome are particularly promising [<xref ref-type="bibr" rid="c3">3</xref>, <xref ref-type="bibr" rid="c16">16</xref>]. The transcriptome, the complete set of RNA transcripts produced by the genome, can provide valuable insights into the genes that are active in different types of neurons and at different stages of neuronal development. This can help identify candidate genes that may play a role in guiding neuronal connectivity. The connectome, on the other hand, provides a detailed map of the connections between neurons. By combining information from the transcriptome and connectome, it is possible to link specific genes to specific connections, thereby shedding light on the genetic basis of neuronal connectivity.</p>
</sec>
<sec id="s3">
<label>3</label>
<title>Related Work: Collaborative Filtering</title>
<p>Our strategy draws inspiration from the concept of collaborative filtering using bilinear models, a technique fundamental to recommendation systems [<xref ref-type="bibr" rid="c17">17</xref>, <xref ref-type="bibr" rid="c18">18</xref>]. These systems predict a user’s preference for an item (e.g., a movie or product) based on user-item interaction data.</p>
<p>Bilinear models capture the interaction between users and items via low-dimensional latent features [<xref ref-type="bibr" rid="c9">9</xref>, <xref ref-type="bibr" rid="c19">19</xref>]. Mathematically, for user <italic>i</italic> and item <italic>j</italic>, we denote their original features as <bold><italic>x</italic></bold><sub><italic>i</italic></sub> ∈ <bold>R</bold><sup>1<italic>×p</italic></sup> and <bold><italic>y</italic></bold><sub><italic>j</italic></sub> ∈ <bold>R</bold><sup>1<italic>×q</italic></sup>, respectively. These features are then projected into a shared latent space with dimension <italic>d</italic> via transformations <bold><italic>x</italic></bold><sub><italic>i</italic></sub><bold><italic>A</italic></bold> (where <bold><italic>A</italic></bold> ∈ <bold>R</bold><sup><italic>p×d</italic></sup>) and <bold><italic>y</italic></bold><sub><italic>j</italic></sub><bold><italic>B</italic></bold> (where <bold><italic>B</italic></bold> ∈ <bold>R</bold><sup><italic>q×d</italic></sup>). The predicted rating of the user for the item is then formulated as:
<disp-formula id="eqn1">
<alternatives><graphic xlink:href="551532v1_eqn1.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
In the context of collaborative filtering, the goal is to optimize the transformation matrices <bold><italic>A</italic></bold> and <bold><italic>B</italic></bold> to align the predicted rating <italic>r</italic><sub><italic>ij</italic></sub> with the ground-truth <italic>z</italic><sub><italic>ij</italic></sub>. This is expressed as the following optimization problem:
<disp-formula id="eqn2">
<alternatives><graphic xlink:href="551532v1_eqn2.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
Or in the matrix form:
<disp-formula id="eqn3">
<alternatives><graphic xlink:href="551532v1_eqn3.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
Here, the objective is to minimize the Frobenius norm of the residual matrix <bold><italic>Z</italic></bold> − (<bold><italic>XA</italic></bold>)(<bold><italic>Y B</italic></bold>)<sup><italic>T</italic></sup>.</p>
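<p>As a minimal illustration of the bilinear rating model described above, the following Python sketch computes the predicted rating matrix (<bold><italic>XA</italic></bold>)(<bold><italic>YB</italic></bold>)<sup><italic>T</italic></sup> and the residual objective. The function names, the toy dimensions, and the use of the squared Frobenius norm are assumptions made for illustration, not part of the original method description.</p>
<code language="python" xml:space="preserve">
```python
import numpy as np

def predict_ratings(X, Y, A, B):
    """Predicted ratings R = (X A)(Y B)^T for every user-item pair."""
    return (X @ A) @ (Y @ B).T

def frobenius_loss(Z, X, Y, A, B):
    """Squared Frobenius norm of the residual Z - (X A)(Y B)^T (assumed squared form)."""
    residual = Z - predict_ratings(X, Y, A, B)
    return float(np.sum(residual ** 2))

# Toy problem with arbitrary, hypothetical sizes.
rng = np.random.default_rng(0)
n_users, n_items, p, q, d = 5, 4, 6, 7, 2
X = rng.normal(size=(n_users, p))   # user features, one row per user
Y = rng.normal(size=(n_items, q))   # item features, one row per item
A = rng.normal(size=(p, d))         # projects user features into the latent space
B = rng.normal(size=(q, d))         # projects item features into the latent space

# Ground truth generated by the model itself, so the residual vanishes.
Z = predict_ratings(X, Y, A, B)
loss_at_truth = frobenius_loss(Z, X, Y, A, B)
```
</code>
<p>Each entry of the predicted matrix is the inner product of one projected user vector and one projected item vector, which is what allows different weights to be learned for the two sides of the interaction.</p>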
<p>In our study, we interpret neuronal connectivity through the lens of recommendation systems, viewing presynaptic neurons as “users”, postsynaptic neurons as “items”, and the synapses formed between them as “ratings”. Our chosen bilinear model extracts latent features of pre- and post-synaptic neurons from their respective gene expressions. One key advantage of the bilinear model is its capacity to assign different weights to the gene expressions of pre- and post-synaptic neurons, enabling the model to capture not just homogeneous but also complex, heterogeneous interactions fundamental to understanding neuronal connectivity. Prior studies have highlighted such heterogeneous interactions, noting the formation of connections between pre- and post-synaptic neurons expressing different cadherins, indicative of a heterogeneous adhesion process [<xref ref-type="bibr" rid="c20">20</xref>, <xref ref-type="bibr" rid="c21">21</xref>].</p>
</sec>
<sec id="s4">
<label>4</label>
<title>Bilinear Model for Neuronal Type Connectivity</title>
<sec id="s4a">
<label>4.1</label>
<title>Objective Functions</title>
<p>We discuss the bilinear model for neuronal type connectivity in the following two scenarios: the first in which gene expression and connectivity of each cell are known simultaneously and the second where connectivity and gene expressions of neuronal types are from different sources. The bilinear models for these two situations are illustrated in <xref rid="fig1" ref-type="fig">Figure 1</xref>.</p>
<fig id="fig1" position="float" fig-type="figure">
<label>Figure 1:</label>
<caption><p>Illustration of our approach. (a) In an ideal scenario where gene expression profiles and connectivity data of individual cells are available simultaneously, we establish the relationship between connectivity and gene expression profiles via two transformation matrices <bold><italic>A</italic></bold> and <bold><italic>B</italic></bold>. (b) In practical situations where the gene expression profiles and connectivity data are derived from distinct sources, such as single-cell transcriptomic and connectomic data, we propose that the connectivity of individual cells and their latent gene expression features can be approximated by the averages of their corresponding cell types, and establish their relationship through transformation matrices <bold><italic>Â</italic></bold> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline2.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula>.</p></caption>
<graphic xlink:href="551532v1_fig1.tif" mimetype="image" mime-subtype="tiff"/>
</fig>
<sec id="s4b1">
<label>4.1.1</label>
<title>Gene Expression and Connectivity of Each Cell are Known Simultaneously</title>
<p>We begin with an ideal scenario where both the gene expression profiles and connectivity of individual cells are known concurrently. In this setting, we have <italic>a</italic> presynaptic neuronal types and <italic>b</italic> postsynaptic neuronal types, indexed by <italic>i</italic> and <italic>j</italic>, respectively. Each type contains a number of neurons, denoted <italic>n</italic><sub><italic>i</italic></sub> for presynaptic and <italic>n</italic><sub><italic>j</italic></sub> for postsynaptic types. The gene expression vector for the <italic>k</italic><sup><italic>th</italic></sup> cell in the presynaptic type <italic>i</italic> is designated as <bold><italic>x</italic></bold><sub>(<italic>ik</italic>)</sub>, where <italic>k</italic> ∈ {1, 2, …, <italic>n</italic><sub><italic>i</italic></sub>}, while for the <italic>l</italic><sup><italic>th</italic></sup> cell in postsynaptic type <italic>j</italic>, it is <bold><italic>y</italic></bold><sub>(<italic>jl</italic>)</sub> with <italic>l</italic> ∈ {1, 2, …, <italic>n</italic><sub><italic>j</italic></sub>}. We denote the connectivity metric between a presynaptic neuron and a postsynaptic neuron as <italic>z</italic><sub>(<italic>ik</italic>)(<italic>jl</italic>)</sub>.</p>
<p>Drawing from the principles of collaborative filtering, we develop the following optimization objective:
<disp-formula id="eqn4">
<alternatives><graphic xlink:href="551532v1_eqn4.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
Here, <bold><italic>A</italic></bold> and <bold><italic>B</italic></bold> denote the transformation matrices we aim to learn. This formula can also be expressed in its matrix form as:
<disp-formula id="eqn5">
<alternatives><graphic xlink:href="551532v1_eqn5.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
In this equation, <bold><italic>W</italic></bold> symbolizes a weight matrix where each element <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline1.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula>. As our study focuses on the genetic code of pre- and post-synaptic neuronal types rather than individual neurons, this weight matrix ensures that the model does not disproportionately favor neuronal types with a greater number of neurons over rarer types.</p>
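<p>To make the role of the weight matrix concrete, the sketch below builds one plausible choice of <bold><italic>W</italic></bold> and applies it to the residual. It assumes that each cell pair is weighted by 1/(<italic>n</italic><sub><italic>i</italic></sub><italic>n</italic><sub><italic>j</italic></sub>), so that every pair of cell types contributes equally regardless of how many cells it contains; this form, the function names, and the type labels are illustrative assumptions, since the exact definition is given only in the equation above.</p>
<code language="python" xml:space="preserve">
```python
import numpy as np

def type_weights(pre_types, post_types):
    """Weight 1 / (n_i * n_j) for each (presynaptic, postsynaptic) cell pair.

    n_i and n_j are the numbers of cells in the types of the two cells.
    This is an assumed form of W, chosen so that rare types are not outweighed.
    """
    _, pre_inv, pre_counts = np.unique(
        np.asarray(pre_types), return_inverse=True, return_counts=True)
    _, post_inv, post_counts = np.unique(
        np.asarray(post_types), return_inverse=True, return_counts=True)
    n_pre = pre_counts[pre_inv]      # type size for every presynaptic cell
    n_post = post_counts[post_inv]   # type size for every postsynaptic cell
    return 1.0 / np.outer(n_pre, n_post)

def weighted_loss(Z, X, Y, A, B, W):
    """Sum of squared residuals, each scaled by the matching entry of W."""
    residual = Z - (X @ A) @ (Y @ B).T
    return float(np.sum(W * residual ** 2))

# Hypothetical labels: three cells of one presynaptic type, one of another.
pre_types = ["BC1", "BC1", "BC1", "BC2"]
post_types = ["RGC1", "RGC2", "RGC2"]
W = type_weights(pre_types, post_types)
```
</code>
<p>Under this assumption the weights within each block of cell pairs belonging to one type pair sum to one, so the total weight equals the number of type pairs rather than the number of cell pairs.</p>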
<p>Because gene expression data are high-dimensional, the bilinear model may suffer from multicollinearity, a common issue in machine learning in which two or more predictor variables are highly correlated. To overcome this, we can perform principal component analysis (PCA) on the gene expression vectors, transforming them into a new coordinate system. By excluding components with negligible eigenvalues, we remove redundant information and thus mitigate the effects of multicollinearity. The transformed gene expression data can then be used in our bilinear model, leading to more stable and reliable estimates of connectivity. In mathematical terms, applying PCA to the zero-centered and unit-variance adjusted matrices <bold><italic>X</italic></bold> and <bold><italic>Y</italic></bold> results in the approximations:
<disp-formula id="eqn6">
<alternatives><graphic xlink:href="551532v1_eqn6.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
and
<disp-formula id="eqn7">
<alternatives><graphic xlink:href="551532v1_eqn7.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
where <bold><italic>U</italic></bold> and <bold><italic>V</italic></bold> indicate the PCA transformation of <bold><italic>X</italic></bold> and <bold><italic>Y</italic></bold> respectively. The original optimization problem now becomes:
<disp-formula id="eqn8">
<alternatives><graphic xlink:href="551532v1_eqn8.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
where <bold><italic>X</italic></bold><italic>′</italic> = <bold><italic>XU</italic></bold>, <bold><italic>Y</italic></bold><italic>′</italic> = <bold><italic>YV</italic></bold>, <bold><italic>A</italic></bold><italic>′</italic> = <bold><italic>U</italic></bold><sup><italic>T</italic></sup><bold><italic>A</italic></bold>, and <bold><italic>B</italic></bold><italic>′</italic> = <bold><italic>V</italic></bold><sup><italic>T</italic></sup><bold><italic>B</italic></bold>. It is worth noting that if the optimization problem is solved in the PCA-transformed space, we need to transform <bold><italic>A</italic></bold><italic>′</italic> and <bold><italic>B</italic></bold><italic>′</italic> back to <bold><italic>A</italic></bold> and <bold><italic>B</italic></bold> in order to retrieve the corresponding weight of each gene.</p></sec>
<sec id="s4b2">
<label>4.1.2</label>
<title>Connectivity and Gene Expressions of Neuronal Types are from Different Sources</title>
<p>In real scenarios, gene expression profiles and connectivity information are often derived from separate sources, such as single-cell sequencing [<xref ref-type="bibr" rid="c22">22</xref>, <xref ref-type="bibr" rid="c23">23</xref>] and connectome data [<xref ref-type="bibr" rid="c7">7</xref>, <xref ref-type="bibr" rid="c24">24</xref>, <xref ref-type="bibr" rid="c25">25</xref>]. Bridging these datasets requires classifying neurons into cell types based on their gene expression profiles and morphological characteristics. These cell types from different sources are subsequently aligned according to established biological knowledge (e.g., specific gene markers are known to be expressed in certain morphologically-defined cell types [<xref ref-type="bibr" rid="c26">26</xref>]).</p>
<p>The primary challenge in this scenario is that, while we can align cell types (denoted by indices <italic>i</italic> and <italic>j</italic> in <xref ref-type="disp-formula" rid="eqn4">equation 4</xref>), we are unable to associate individual cells (represented by indices <italic>k</italic> and <italic>l</italic> in <xref ref-type="disp-formula" rid="eqn4">equation 4</xref>). To tackle this issue, we adopt a simplifying assumption that the connectivity and latent gene expression features of individual cells can be approximated by the averages of their corresponding cell types. This premise hinges on the notion that the connectivity metrics and latent gene expression features of individual cells are close enough to the mean value of their corresponding cell types.</p>
<p>As a result, our optimization objective in <xref ref-type="disp-formula" rid="eqn4">equation 4</xref> becomes:
<disp-formula id="eqn9">
<alternatives><graphic xlink:href="551532v1_eqn9.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
In this equation, <italic>z</italic><sub>(<italic>i</italic>.)(<italic>j</italic>.)</sub> denotes the mean connectivity metric between presynaptic cell type <italic>i</italic> and postsynaptic cell type <italic>j</italic>. Meanwhile, <bold><italic>x</italic></bold><sub>(<italic>i</italic>.)</sub> and <bold><italic>y</italic></bold><sub>(<italic>j</italic>.)</sub> represent the average gene expressions of cell types <italic>i</italic> and <italic>j</italic> respectively.</p>
<p>While optimizing the transformation matrices <bold><italic>A</italic></bold> and <bold><italic>B</italic></bold>, we impose constraints on these matrices to keep the variance of latent gene expression features within each neuronal type small. Specifically, we define <italic>ϵ</italic> as a sufficiently small value and impose the following constraints on <bold><italic>A</italic></bold>:
<disp-formula id="eqn10">
<alternatives><graphic xlink:href="551532v1_eqn10.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
where
<disp-formula id="eqn11">
<alternatives><graphic xlink:href="551532v1_eqn11.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
and <bold><italic>B</italic></bold>:
<disp-formula id="eqn12">
<alternatives><graphic xlink:href="551532v1_eqn12.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
where
<disp-formula id="eqn13">
<alternatives><graphic xlink:href="551532v1_eqn13.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
These conditions ensure that the latent gene expression features of individual cells are sufficiently close to the average value within their respective cell types. With these constraints in mind, we formulate the optimization problem as follows:
<disp-formula id="eqn14">
<alternatives><graphic xlink:href="551532v1_eqn14.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
In this equation, <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline3.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> denotes the average gene expressions of the <italic>a</italic> presynaptic cell types, wherein each element <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline4.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> is indicative of the average gene expression feature <italic>m</italic> within cell type <italic>i</italic>. Likewise, <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline5.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> represents the average gene expressions of the <italic>b</italic> postsynaptic cell types, with each element <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline6.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> signifying the average gene expression feature <italic>m</italic> in cell type <italic>j</italic>.</p>
<p>In practical application, we approximate <bold>Σ</bold><sub><italic>x</italic></sub> and <bold>Σ</bold><sub><italic>y</italic></sub> with their diagonal estimates <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline7.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline7a.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> [<xref ref-type="bibr" rid="c27">27</xref>, <xref ref-type="bibr" rid="c28">28</xref>]. We then transform the initial optimization problem into the following:
<disp-formula id="eqn15">
<alternatives><graphic xlink:href="551532v1_eqn15.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
Here, elements in <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline8.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> are defined as <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline9.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> and elements in <bold><italic>Ŷ</italic></bold> ∈ <bold>R</bold><sup><italic>b × q</italic></sup> are given by <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline10.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula>. The optimization of this formulation tends to be computationally more tractable.</p>
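<p>The type-level quantities used in this formulation can be sketched as follows. The code computes the average expression of each cell type and a variance-scaled version of those averages. It assumes that the diagonal variance estimate is the within-type variance of each gene averaged over types, and that the scaled features are obtained by dividing each type average by the corresponding standard deviation; both choices, as well as the function name and the small constant <italic>eps</italic>, are assumptions made for illustration and may differ from the definitions in the equations above.</p>
<code language="python" xml:space="preserve">
```python
import numpy as np

def type_averages_and_scaled(X, types, eps=1e-8):
    """Return type labels, type averages, assumed diagonal variances, scaled averages.

    X has one row per cell and one column per gene; `types` gives each cell's type.
    """
    X = np.asarray(X, dtype=float)
    labels, inv = np.unique(np.asarray(types), return_inverse=True)
    n_cells = X.shape[0]
    # Indicator matrix (cells x types) used to average the cells of each type.
    M = np.zeros((n_cells, labels.size))
    M[np.arange(n_cells), inv] = 1.0
    counts = M.sum(axis=0)
    X_bar = (M.T @ X) / counts[:, None]        # type averages, shape (types, genes)
    centered = X - X_bar[inv]                  # deviation of each cell from its type mean
    per_type_var = (M.T @ centered ** 2) / counts[:, None]
    sigma2 = per_type_var.mean(axis=0)         # assumed diagonal variance estimate
    X_hat = X_bar / np.sqrt(sigma2 + eps)      # assumed variance-scaled type averages
    return labels, X_bar, sigma2, X_hat

# Toy data: two hypothetical types with two cells each, two genes.
X_toy = np.array([[1.0, 0.0], [3.0, 4.0], [10.0, 2.0], [14.0, 6.0]])
labels, X_bar, sigma2, X_hat = type_averages_and_scaled(X_toy, ["a", "a", "b", "b"])
```
</code>
<p>Scaling by the within-type standard deviation means that genes whose expression is noisy within a type contribute less to the latent features, which is the intent of the variance constraints.</p>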
<p>In summary, our methodology adapts when gene expression profiles and the connectivity matrix originate from distinct sources. Instead of aligning at the level of individual cells, we focus on the alignment of neuronal types. We achieve this by mapping gene expressions into a latent space via transformation matrices <bold><italic>Â</italic></bold> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline11.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula>, with the optimization process aiming to minimize the discrepancies between these two sources of information while maintaining consistency of the gene expression features within individual neuronal types.</p>
</sec>
</sec>
<sec id="s4c">
<label>4.2</label>
<title>Optimization Algorithm</title>
<p>To solve the optimization problem as outlined in <xref ref-type="disp-formula" rid="eqn15">equation 15</xref>, we construct the following loss function:
<disp-formula id="eqn16">
<alternatives><graphic xlink:href="551532v1_eqn16.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
This function can be regarded as the Lagrangian of the initial problem. Here, <italic>λ</italic><sub><italic>A</italic></sub> and <italic>λ</italic><sub><italic>B</italic></sub> are treated as hyperparameters, and their optimal values are determined through a grid search.</p>
<p>Given this loss function, we propose an alternative gradient descent algorithm to find the solutions. This algorithm alternates between updating the transformation matrices <bold><italic>Â</italic></bold> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline12.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula>, using the gradient descent optimization method.</p>
<p>The algorithm begins by initializing transformation matrices <bold><italic>Â</italic></bold> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline13.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> using random values drawn from a standard normal distribution. The central aspect of the algorithm is an iterative loop that alternates the updates of transformation matrices <bold><italic>Â</italic></bold> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline14.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula>.</p>
<statement id="alg1">
<label>Algorithm 1</label>
<p>Alternative Gradient Descent (AGD) for Bilinear Mapping</p>
<p><fig id="alg1a" position="float" fig-type="figure">
<graphic xlink:href="551532v1_alg1.tif" mimetype="image" mime-subtype="tiff"/>
</fig></p>
</statement>
<p>During each iteration, the algorithm computes the predicted connectivity metric <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline15.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> using the current estimates of <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline16.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula>, and <bold><italic>Ŷ</italic></bold>. Subsequently, the gradient of the loss function with respect to the transformation matrices is calculated, and the matrices are updated by moving in the negative gradient’s direction.</p>
<p>This iterative process is repeated until the transformation matrices <bold><italic>Â</italic></bold> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline18.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> converge to a steady solution. Upon completion, the algorithm yields the optimized transformation matrices.</p>
<p>This gradient descent-based algorithm provides a computationally efficient solution to the bilinear mapping problem between gene expression profiles and connectivity metrics while adhering to the constraints unique to our problem design. As a result, it produces associations between gene expression profiles of cell types and their connectivity.</p>
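<p>The alternating updates described above can be sketched in Python as follows. The sketch assumes that the loss is the squared Frobenius norm of the residual between the connectivity matrix and the bilinear prediction, plus Frobenius-norm penalties on the two transformation matrices weighted by <italic>λ</italic><sub><italic>A</italic></sub> and <italic>λ</italic><sub><italic>B</italic></sub>. This form of the loss, the fixed learning rate, the fixed iteration count, and all function names are illustrative assumptions rather than a reproduction of Algorithm 1.</p>
<code language="python" xml:space="preserve">
```python
import numpy as np

def loss(Z, X_hat, Y_hat, A, B, lam_a, lam_b):
    """Assumed loss: squared reconstruction error plus Frobenius penalties."""
    R = Z - (X_hat @ A) @ (Y_hat @ B).T
    return float(np.sum(R ** 2) + lam_a * np.sum(A ** 2) + lam_b * np.sum(B ** 2))

def grad_A(Z, X_hat, Y_hat, A, B, lam_a):
    """Gradient of the assumed loss with respect to A, with B held fixed."""
    Q = Y_hat @ B
    R = Z - (X_hat @ A) @ Q.T
    return -2.0 * X_hat.T @ R @ Q + 2.0 * lam_a * A

def grad_B(Z, X_hat, Y_hat, A, B, lam_b):
    """Gradient of the assumed loss with respect to B, with A held fixed."""
    P = X_hat @ A
    R = Z - P @ (Y_hat @ B).T
    return -2.0 * Y_hat.T @ R.T @ P + 2.0 * lam_b * B

def alternating_gd(Z, X_hat, Y_hat, d, lam_a=0.1, lam_b=0.1,
                   lr=1e-4, n_iter=1000, seed=0):
    """Alternate gradient steps on A and B, starting from standard normal values."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X_hat.shape[1], d))
    B = rng.standard_normal((Y_hat.shape[1], d))
    history = [loss(Z, X_hat, Y_hat, A, B, lam_a, lam_b)]
    for _ in range(n_iter):
        A = A - lr * grad_A(Z, X_hat, Y_hat, A, B, lam_a)   # update A, B fixed
        B = B - lr * grad_B(Z, X_hat, Y_hat, A, B, lam_b)   # update B, new A fixed
        history.append(loss(Z, X_hat, Y_hat, A, B, lam_a, lam_b))
    return A, B, history

# Toy problem: 4 presynaptic types, 5 postsynaptic types, 3 features each.
rng = np.random.default_rng(1)
X_hat = rng.standard_normal((4, 3))
Y_hat = rng.standard_normal((5, 3))
Z = rng.random((4, 5))
A_fit, B_fit, history = alternating_gd(Z, X_hat, Y_hat, d=2)
```
</code>
<p>In practice the loop would stop once the change in the loss or in the matrices falls below a tolerance, and the two penalty weights would be chosen by grid search as described above; the fixed iteration count here only keeps the sketch short.</p>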
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Datasets and Pre-processing</title>
<p>Our study hinges on two distinct data collections of mouse retina neurons: single-cell transcriptomic data from previous studies and connectomic data from the EyeWire project. Together, these datasets provide us with connectivity information and gene expression profiles, which constitute the key ingredients for our proposed bilinear model.</p>
<sec id="s5a">
<label>5.1</label>
<title>Single-cell Transcriptomic Data</title>
<p>The single-cell transcriptomic data used in our study include the gene expression profiles for two classes of mouse retina neurons: presynaptic BCs as reported by Shekhar et al. [<xref ref-type="bibr" rid="c22">22</xref>], and postsynaptic RGCs as reported by Tran et al. [<xref ref-type="bibr" rid="c23">23</xref>].</p>
<p>Preprocessing of these data adhered to previously documented procedures [<xref ref-type="bibr" rid="c22">22</xref>, <xref ref-type="bibr" rid="c23">23</xref>, <xref ref-type="bibr" rid="c29">29</xref>]. The transcript counts within each cell were first normalized to the median number of transcripts per cell, followed by a log-transformation of the normalized counts. Highly variable genes (HVGs) were then selected using an approach based on the relationship between mean expression level and the coefficient of variation [<xref ref-type="bibr" rid="c30">30</xref>, <xref ref-type="bibr" rid="c31">31</xref>, <xref ref-type="bibr" rid="c32">32</xref>]. We focused on those cells whose types correspond to the neuronal types outlined in the connectomic data, as delineated later in <xref rid="tbl1" ref-type="table">Table 1</xref>, <xref rid="tbl2" ref-type="table">Table 2</xref>, and <xref rid="tbl3" ref-type="table">Table 3</xref>. This yielded two matrices, <bold><italic>X</italic></bold> and <bold><italic>Y</italic></bold>, representing presynaptic BCs and postsynaptic RGCs, respectively, where each row pertains to a cell and each column represents an HVG. The dimensions of <bold><italic>X</italic></bold> and <bold><italic>Y</italic></bold> are 22453 × 17144 and 3779 × 12926, respectively.</p>
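<p>The count normalization step can be sketched as follows. This is an illustration only: the use of log(1 + x) as the log-transformation and the function name <monospace>median_normalize_log</monospace> are assumptions, and HVG selection is not shown.</p>
<code language="python"><![CDATA[
```python
import numpy as np

def median_normalize_log(counts):
    """Scale each cell (row) to the median transcript count per cell, then log-transform.

    counts: (n_cells, n_genes) array of raw transcript counts.
    Assumes every cell has at least one transcript and that log(1 + x) is used.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)   # transcripts per cell
    median_total = np.median(totals)             # median across cells
    scaled = counts / totals * median_total      # every cell now sums to the median
    return np.log1p(scaled)
```
]]></code>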
<table-wrap id="tbl1" orientation="portrait" position="float">
<label>Table 1:</label>
<caption><title>Correspondence of Mouse BC types [<xref ref-type="bibr" rid="c25">25</xref>, <xref ref-type="bibr" rid="c22">22</xref>]</title></caption>
<graphic xlink:href="551532v1_tbl1.tif" mimetype="image" mime-subtype="tiff"/>
</table-wrap>
<table-wrap id="tbl2" orientation="portrait" position="float">
<label>Table 2:</label>
<caption><title>Correspondence of Mouse RGC types [<xref ref-type="bibr" rid="c24">24</xref>, <xref ref-type="bibr" rid="c23">23</xref>, <xref ref-type="bibr" rid="c26">26</xref>]</title></caption>
<graphic xlink:href="551532v1_tbl2.tif" mimetype="image" mime-subtype="tiff"/>
</table-wrap>
<table-wrap id="tbl3" orientation="portrait" position="float">
<label>Table 3:</label>
<caption><title>Correspondence of Mouse RGC types [<xref ref-type="bibr" rid="c24">24</xref>, <xref ref-type="bibr" rid="c23">23</xref>, <xref ref-type="bibr" rid="c26">26</xref>]</title></caption>
<graphic xlink:href="551532v1_tbl3.tif" mimetype="image" mime-subtype="tiff"/>
</table-wrap>
<p>Next, we performed a principal component analysis (PCA) on these matrices to transform the gene expression data into the principal component (PC) space. We retained only the PCs that account for a cumulative 95% of explained variance. Consequently, the gene expression of the BCs in <bold><italic>X</italic></bold> and the RGCs in <bold><italic>Y</italic></bold> were featurized by their respective PCs, resulting in matrices of dimensions 22453 × 11323 and 3779 × 3142, respectively.</p>
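<p>One way to retain the PCs that account for a cumulative 95% of explained variance is sketched below using a singular value decomposition. The function name <monospace>pca_cumulative</monospace> and the SVD-based implementation are assumptions made for illustration; the original analysis may have used a different PCA routine.</p>
<code language="python"><![CDATA[
```python
import numpy as np

def pca_cumulative(X, threshold=0.95):
    """Project X onto the leading PCs whose cumulative explained variance reaches threshold.

    X: (n_cells, n_genes). Returns (scores, loadings, explained_ratio) for the kept PCs.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                           # centre each gene
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)               # variance ratio per PC
    cumulative = np.cumsum(explained)
    # Smallest k such that the first k PCs reach the threshold.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    k = min(k, len(s))
    return Xc @ Vt[:k].T, Vt[:k], explained[:k]
```
]]></code>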
<p>Based on each cell’s neuronal type, we computed the variance of gene expression features within these types. Mathematically, the variances of gene expression feature <italic>m</italic> within the BC types and the RGC types are expressed as:
<disp-formula id="eqn17">
<alternatives><graphic xlink:href="551532v1_eqn17.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
<disp-formula id="eqn18">
<alternatives><graphic xlink:href="551532v1_eqn18.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
Taking <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline19.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline20.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> to represent the average gene expression feature <italic>m</italic> of the BC type <italic>i</italic> and the RGC type <italic>j</italic>, we were able to construct matrices, <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline21.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> and <bold><italic>Ŷ</italic></bold>, in which <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline22.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline23.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula>. In these matrices, each row represents a cell type, with the dimensions of <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline24.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> being 25 × 11323 and those of <bold><italic>Ŷ</italic></bold> being 12 × 3142. These matrices serve to bridge the gene expression of BC types and RGC types with the connectivity matrix of these neuronal types derived from the connectomic data.</p>
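<p>The construction of the type-level matrices from cell-level features can be sketched as a per-type average. The function name <monospace>type_means</monospace> and the alphabetical ordering of types returned by <monospace>np.unique</monospace> are assumptions for illustration; the within-type variance terms defined above are not computed here.</p>
<code language="python"><![CDATA[
```python
import numpy as np

def type_means(features, labels):
    """Average cell-level features within each cell type.

    features: (n_cells, n_features); labels: length n_cells, one type label per cell.
    Returns (types, means), where row r of means is the mean feature vector of types[r].
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    types = np.unique(labels)  # sorted unique type labels
    means = np.vstack([features[labels == t].mean(axis=0) for t in types])
    return types, means
```
]]></code>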
</sec>
<sec id="s5b">
<label>5.2</label>
<title>Connectivity Data</title>
<p>The connectivity matrix of neuronal types is derived from connectomic data acquired through the process of serial electron microscopy (EM)-based reconstruction of brain tissues [<xref ref-type="bibr" rid="c6">6</xref>, <xref ref-type="bibr" rid="c7">7</xref>, <xref ref-type="bibr" rid="c8">8</xref>]. From these reconstructed tissues, connectivity measurements are usually expressed as either the contact area or the number of synapses between neurons [<xref ref-type="bibr" rid="c7">7</xref>, <xref ref-type="bibr" rid="c33">33</xref>]. When normalized to the total contact area or total number of synapses of each neuron, the resulting metric, ranging from 0 to 1, signifies the fraction of contact area or synapses formed between neurons. This normalized metric provides a quantitative connectivity measure, where 0 indicates no connectivity and 1 implies complete connectivity between two neurons.</p>
<p>Our analysis utilized the neural reconstruction data of mouse retinal neurons, courtesy of the EyeWire project, a crowd-sourced initiative that generates 3D reconstructions of neurons from serial section EM images [<xref ref-type="bibr" rid="c34">34</xref>]. This extensive dataset facilitated the derivation of a comprehensive connectivity matrix between two classes of mouse retina neurons: BCs [<xref ref-type="bibr" rid="c25">25</xref>] and RGCs [<xref ref-type="bibr" rid="c24">24</xref>]. The data were sourced from the EyeWire Museum (<ext-link ext-link-type="uri" xlink:href="https://museum.eyewire.org/">https://museum.eyewire.org/</ext-link>), which offers detailed information for each cell in a JSON file, including attributes like “cell id”, “cell type”, “cell class”, and “stratification”. The stratification profile describes the linear density of voxel volume as a function of the inner plexiform layer (IPL) depth [<xref ref-type="bibr" rid="c34">34</xref>, <xref ref-type="bibr" rid="c25">25</xref>, <xref ref-type="bibr" rid="c24">24</xref>].</p>
<p>We approximated the connectivity metric between a BC and an RGC using the cosine similarity of their stratification profiles. Let <bold><italic>v</italic></bold><sub><italic>ik</italic></sub> and <bold><italic>v</italic></bold><sub><italic>jl</italic></sub> denote the stratification profiles of the <italic>k</italic><sup><italic>th</italic></sup> cell in BC type <italic>i</italic> and the <italic>l</italic><sup><italic>th</italic></sup> cell in RGC type <italic>j</italic>, respectively. The connectivity metric <italic>z</italic><sub>(<italic>ik</italic>)(<italic>jl</italic>)</sub> between these two neurons can be expressed as:
<disp-formula id="eqn19">
<alternatives><graphic xlink:href="551532v1_eqn19.gif" mimetype="image" mime-subtype="gif"/></alternatives>
</disp-formula>
This equation represents the degree of overlap in their voxel volume profile within the IPL, resulting in the connectivity matrix <bold><italic>Z</italic></bold> between mouse BCs and RGCs. To allow for both positive and negative values within the matrix, we standardized <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline25.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> by subtracting the mean of <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline26.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> and then dividing by its standard deviation. Subsequently, the connectivity matrix <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline27.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> between mouse BC and RGC neuronal types was calculated, with each element <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline28.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> representing the average of the connectivity metrics between cells of BC type <italic>i</italic> and cells of RGC type <italic>j</italic>.</p>
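<p>The three steps described above (cell-level cosine similarity, standardization, and averaging by type) can be sketched as follows. The function names are hypothetical, and standardizing over all entries of the cell-level matrix at once is an assumption about how the mean and standard deviation were taken.</p>
<code language="python"><![CDATA[
```python
import numpy as np

def cosine_matrix(V, W):
    """Cosine similarity between every row of V and every row of W (rows must be non-zero)."""
    V = np.asarray(V, dtype=float)
    W = np.asarray(W, dtype=float)
    Vn = V / np.linalg.norm(V, axis=1, keepdims=True)
    Wn = W / np.linalg.norm(W, axis=1, keepdims=True)
    return Vn @ Wn.T

def type_connectivity(bc_profiles, bc_types, rgc_profiles, rgc_types):
    """Standardised cell-level cosine similarity, averaged within each (BC type, RGC type) pair."""
    Z = cosine_matrix(bc_profiles, rgc_profiles)
    Z = (Z - Z.mean()) / Z.std()  # assumed: z-score over all entries
    bc_types = np.asarray(bc_types)
    rgc_types = np.asarray(rgc_types)
    bc_u, rgc_u = np.unique(bc_types), np.unique(rgc_types)
    Z_type = np.array([[Z[np.ix_(bc_types == i, rgc_types == j)].mean()
                        for j in rgc_u] for i in bc_u])
    return bc_u, rgc_u, Z_type
```
]]></code>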
</sec>
<sec id="s5c">
<label>5.3</label>
<title>Correspondence of Cell Types between Datasets</title>
<p>Aligning neuronal types as annotated in the single-cell transcriptomic data and those identified in the connectomic data was informed by findings from previous studies. Notably, a one-to-one correspondence exists between BC cell types classified by Shekhar et al. [<xref ref-type="bibr" rid="c22">22</xref>] and Greene et al. [<xref ref-type="bibr" rid="c25">25</xref>]. This correspondence is presented in <xref rid="tbl1" ref-type="table">Table 1</xref>.</p>
<p>Regarding RGC types, alignment between cell types annotated in Tran et al. [<xref ref-type="bibr" rid="c23">23</xref>] and Bae et al. [<xref ref-type="bibr" rid="c24">24</xref>] was established primarily based on the findings from Goetz et al. [<xref ref-type="bibr" rid="c26">26</xref>]. This study presents a unified classification of mouse RGC types, based on their functional, morphological, and gene expression features. The corresponding RGC types were mainly obtained from Supplementary Table S3 of Goetz et al. (<xref rid="tbl2" ref-type="table">Table 2</xref>), with additions derived from Supplementary Table S1 of Tran et al. based on the expression of genetic markers of the respective RGC types (<xref rid="tbl3" ref-type="table">Table 3</xref>).</p>
<p>We carried out subsequent analyses based on the alignment of these neuronal types.</p>
</sec>
</sec>
<sec id="s6">
<label>6</label>
<title>Model Training and Validation</title>
<p>With the bilinear model outlined in <xref ref-type="sec" rid="s4">Section 4</xref>, we iteratively optimized the transformation matrices, denoted as <bold><italic>Â</italic></bold> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline29.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula>, using the AGD optimization algorithm. The goal was to minimize the prescribed loss function. The initial states of these transformation matrices were randomly generated from a standard normal distribution, and updates were continued until either the change in loss fell below a predetermined threshold of 10<sup><italic>−</italic>6</sup> or a maximum iteration count of 10<sup>6</sup> was reached.</p>
<p>We examined two sets of hyperparameters during the optimization process: the regularization parameters, <italic>λ</italic><sub><italic>A</italic></sub> and <italic>λ</italic><sub><italic>B</italic></sub>, and the dimensionality of the latent feature space. Preliminary tests suggested that a lower loss was achieved when <italic>λ</italic><sub><italic>A</italic></sub> and <italic>λ</italic><sub><italic>B</italic></sub> were set to equivalent values. Consequently, both were unified under a single parameter, <italic>λ</italic>.</p>
<p>We utilized 5-fold cross-validation to identify the optimal hyperparameters. This procedure partitioned the connectivity matrix entries into five unique subsets or “folds”. The model was then trained using a combination of four folds, with the remaining one serving as a validation set. This procedure was repeated five times, with each iteration reserving a different fold for validation. Throughout this cross-validation process, we varied <italic>λ</italic> across the range [0.1, 1, 10, 100] and the latent feature space’s dimensionality across the range [1, 2, 3, 4, 8].</p>
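<p>The cross-validation over matrix entries and the hyperparameter grid can be sketched as below. The helper names <monospace>entry_folds</monospace> and <monospace>grid_search</monospace>, and the signature of the user-supplied <monospace>fit_and_score</monospace> callable (which is expected to train on the masked-in entries and return a validation loss), are hypothetical and chosen only for illustration.</p>
<code language="python"><![CDATA[
```python
import itertools
import numpy as np

def entry_folds(shape, n_folds=5, seed=0):
    """Randomly assign every matrix entry to one of n_folds folds, as evenly as possible."""
    rng = np.random.default_rng(seed)
    folds = np.arange(shape[0] * shape[1]) % n_folds
    rng.shuffle(folds)
    return folds.reshape(shape)

def grid_search(Z, fit_and_score, lambdas=(0.1, 1, 10, 100), dims=(1, 2, 3, 4, 8),
                n_folds=5, seed=0):
    """Return the (lambda, dimensionality) pair with the lowest mean validation loss.

    fit_and_score(Z, train_mask, val_mask, lam, dim) -> validation loss (hypothetical callable).
    """
    folds = entry_folds(Z.shape, n_folds, seed)
    results = {}
    for lam, dim in itertools.product(lambdas, dims):
        scores = [fit_and_score(Z, folds != k, folds == k, lam, dim)
                  for k in range(n_folds)]
        results[(lam, dim)] = float(np.mean(scores))
    best = min(results, key=results.get)
    return best, results
```
]]></code>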
<p><xref rid="fig2" ref-type="fig">Figure 2a</xref> presents a heatmap of the logarithm (base 10) of the validation loss, highlighting variations of <italic>λ</italic> and dimensionality. It can be observed that the lowest validation loss is achieved when <italic>λ</italic> equals 1 and the dimensionality equals 2, as shown in <xref rid="fig2" ref-type="fig">Figure 2b,c</xref>. These specific values were thus chosen as the optimal hyperparameters. With these optimal hyperparameters, we performed the final round of training on the entire dataset to yield the definitive transformation matrices, <bold><italic>Â</italic></bold> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline30.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula>, using a learning rate of 5 × 10<sup><italic>−</italic>7</sup>.</p>
<fig id="fig2" position="float" fig-type="figure">
<label>Figure 2:</label>
<caption><p>Hyperparameter selection through cross-validation. (a) Heatmap plot of the logarithm (base 10) of the validation loss, showing variations with respect to <italic>λ</italic> across [0.1, 1, 10, 100] and dimensionality across [1, 2, 3, 4, 8]. (b) Plot showing the logarithm (base 10) of the validation loss against <italic>λ</italic> over the range [0.1, 1, 10, 100]. (c) Plot displaying the logarithm (base 10) of the validation loss against dimensionality over the range [1, 2, 3, 4, 8].</p></caption>
<graphic xlink:href="551532v1_fig2.tif" mimetype="image" mime-subtype="tiff"/>
</fig>
</sec>
<sec id="s7">
<label>7</label>
<title>Results</title>
<sec id="s7a">
<label>7.1</label>
<title>Bilinear Model Reconstructs Neuronal Type-Specific Connectivity Map from Gene Expression Profiles</title>
<p>Upon completion of the final training process, our optimized bilinear model produced transformation matrices, <bold><italic>Â</italic></bold> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline30a.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula>. We used these matrices to project the normalized single-cell transcriptomic data, <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline31.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> and <bold><italic>Ŷ</italic></bold>, into a shared latent feature space. Consequently, we obtained projected representations for BC and RGC types, <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline32.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> and <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline33.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula>, respectively. With these latent representations, we were able to reconstruct the cell-type-specific connectivity matrix: <inline-formula><alternatives><inline-graphic xlink:href="551532v1_inline34.gif" mimetype="image" mime-subtype="gif"/></alternatives></inline-formula> (<xref rid="fig3" ref-type="fig">Figure 3a</xref>).</p>
<fig id="fig3" position="float" fig-type="figure">
<label>Figure 3:</label>
<caption><p>Reconstruction of connectivity map from gene expression profiles. (a) The reconstructed connectivity matrix, derived from the shared latent feature space projections. (b) The connectivity matrix obtained from connectomic data. Differences in color intensity represent the strength of connections, with dark red indicating strong connections and dark blue indicating weak or no connections.</p></caption>
<graphic xlink:href="551532v1_fig3.tif" mimetype="image" mime-subtype="tiff"/>
</fig>
<p>To evaluate our model, we compared the reconstructed connectivity matrix with the one derived from connectomic data (<xref rid="fig3" ref-type="fig">Figure 3b</xref>). We calculated the Pearson correlation coefficient between entries of the two matrices to assess their agreement. The resulting correlation of 0.83 (<italic>p &lt;</italic> 0.001) demonstrated a robust association between the transformed gene expression features and the connectomic data. This result attests to our model’s ability to capture the relationship between these two distinct types of biological information.</p>
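<p>The reconstruction and its comparison with the connectomic matrix can be sketched as follows, assuming the reconstructed matrix is the product of the two latent projections. The function names and the optional <monospace>dims</monospace> argument, which restricts the reconstruction to selected latent dimensions, are illustrative assumptions.</p>
<code language="python"><![CDATA[
```python
import numpy as np

def reconstruct(X_type, Y_type, A, B, dims=None):
    """Reconstruct type-level connectivity as (X_type A)(Y_type B)^T.

    dims: optional list of latent dimensions to keep (for example [0] for the first only).
    """
    P = np.asarray(X_type) @ A   # latent representation of presynaptic types
    Q = np.asarray(Y_type) @ B   # latent representation of postsynaptic types
    if dims is not None:
        P, Q = P[:, dims], Q[:, dims]
    return P @ Q.T

def entrywise_pearson(Z_hat, Z):
    """Pearson correlation between corresponding entries of two matrices."""
    return float(np.corrcoef(np.ravel(Z_hat), np.ravel(Z))[0, 1])
```
]]></code>
<p>Because the product is a sum over latent dimensions, the full reconstruction in this sketch equals the sum of the single-dimension reconstructions.</p>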
</sec>
<sec id="s7b">
<label>7.2</label>
<title>Bilinear Model Recapitulates Recognized Connectivity Motifs</title>
<p>Our cross-validation procedure indicated that the optimal number of latent dimensions was two. This finding suggested that these two dimensions capture the essential connectivity motifs between BC and RGC types. This led us to further investigate what these motifs are and how they differ from each other.</p>
<p>We first reconstructed connectivity using only the first latent dimension. The first dimension appeared to emphasize connectivity patterns between BCs and RGCs that laminate within the IPL’s central region, as well as those that laminate within the marginal region (<xref rid="fig4" ref-type="fig">Figure 4a,d,g</xref>). We then reconstructed connectivity using only the second latent dimension. Notably, the spotlight shifted to connections between BCs and RGCs that laminate within the outer and inner regions of the IPL, respectively (<xref rid="fig4" ref-type="fig">Figure 4b,e,h</xref>).</p>
<fig id="fig4" position="float" fig-type="figure">
<label>Figure 4:</label>
<caption><p>Distinct connectivity motifs revealed by the two latent dimensions. (a, b) The reconstructed connectivity using only latent dimension 1 or 2, respectively. Differences in color intensity represent the strength of connections. (c) BC types plotted in the latent feature space, with each point representing a specific BC type. Dashed lines indicate zero values for latent dimensions 1 and 2. (d, e) Stratification profiles of BC types in IPL, color-coded based on their positions along the first (d) or second (e) latent dimension. Red indicates BC types on the positive half, while blue indicates BC types on the negative half. (f) RGC types plotted in the latent feature space, with each point representing a specific RGC type. (g, h) Stratification profiles of RGC types in IPL, color-coded based on their positions along the first (g) or second (h) latent dimension. Dashed lines in (d) and (g) mark the positions of ON and OFF SACs [<xref ref-type="bibr" rid="c24">24</xref>]. BCs and RGCs stratifying between them tend to exhibit more transient responses, and those stratifying outside them exhibit more sustained responses. Dashed lines in (e) and (h) denote the boundary of the outer and inner IPL [<xref ref-type="bibr" rid="c24">24</xref>]. Synapses between BCs and RGCs in the outer IPL mediate OFF responses, while those in the inner IPL mediate ON responses.</p></caption>
<graphic xlink:href="551532v1_fig4.tif" mimetype="image" mime-subtype="tiff"/>
</fig>
<p>To confirm these observations, we further visualized BC and RGC types within the two-dimensional latent feature space (<xref rid="fig4" ref-type="fig">Figure 4c,f</xref>). Grouping BC and RGC types based on whether they fell within the positive or negative halves of the latent dimensions, we color-coded their stratification profiles within the IPL by group. BCs and RGCs that fall within the positive half of latent dimension 1 tend to stratify within the IPL’s central region, delineated by the boundaries formed by the ON and OFF starburst amacrine cells (SACs) (<xref rid="fig4" ref-type="fig">Figure 4d,g</xref>). Conversely, those falling within the negative half of this dimension tend to stratify in the marginal region of the IPL. As for the second latent dimension, BCs and RGCs that fall within the positive half predominantly stratify in the inner region of the IPL (<xref rid="fig4" ref-type="fig">Figure 4e,h</xref>), while those within the negative half primarily stratify in the IPL’s outer region.</p>
<p>Interestingly, these distinct connectivity motifs align with two widely recognized properties of retinal neurons: kinetic attributes that reflect the temporal dynamics (transient versus sustained responses) of a neuron responding to visual stimuli, and polarity (ON versus OFF responses) reflecting whether a neuron responds to the initiation or cessation of a stimulus [<xref ref-type="bibr" rid="c35">35</xref>, <xref ref-type="bibr" rid="c10">10</xref>, <xref ref-type="bibr" rid="c11">11</xref>, <xref ref-type="bibr" rid="c36">36</xref>]. This correlation implies that our bilinear model has successfully captured key aspects of retinal circuitry from gene expression data.</p>
</sec>
<sec id="s7c">
<label>7.3</label>
<title>Bilinear Model Reveals Interpretable Insights into Gene Signatures Associated with Different Connectivity Motifs</title>
<p>The inherent linearity of our bilinear model affords a significant advantage: it enables the direct interpretation of gene expressions by examining their associated weights in the model. These weights signify the importance of each gene in determining the connectivity motifs between the BC and RGC types. We identified the top 50 genes with the largest positive or negative weights for BCs and RGCs across both latent dimensions. We plotted their weights alongside their expression profiles in the respective cell types (<xref rid="fig5" ref-type="fig">Figure 5a-d</xref>).</p>
<fig id="fig5" position="float" fig-type="figure">
<label>Figure 5:</label>
<caption><p>Gene signatures associated with the two latent dimensions. (a, b) Weight vectors of the top 50 genes for latent dimension 1, along with their expression patterns in BC types (a) and RGC types (b). The weight value is indicated in the color bar, with the sign represented by color (red: positive and blue: negative), and the magnitude by saturation. The expression pattern is represented by the size of each dot (indicating the percentage of cells expressing the gene) and the color saturation (representing the gene expression level). BC and RGC types are sorted by their positions along latent dimension 1, as shown in Figure 4c,f, with the dashed line separating the positive category from the negative category. (c, d) Weight vectors of the top 50 genes for latent dimension 2, and their expression patterns in BC types (c) and RGC types (d), depicted in the same manner as in (a) and (b). BC and RGC types are sorted by their positions along latent dimension 2. (e, f) The top 10 significant Gene Ontology (GO) terms extracted from the top 50 genes for BC types (e) and RGC types (f) in latent dimension 1. (g, h) The top 10 significant Gene Ontology (GO) terms extracted from the top 50 genes for BC types (g) and RGC types (h) in latent dimension 2. The significance of each GO term is indicated by its adjusted p-value.</p></caption>
<graphic xlink:href="551532v1_fig5.tif" mimetype="image" mime-subtype="tiff"/>
</fig>
<p>Our analysis unveiled distinct gene signatures associated with the connectivity motifs revealed by the two latent dimensions. In the first latent dimension, genes like CDH11 and EPHA3, involved in cell adhesion and axon guidance, carried high weights for BCs forming synapses in the IPL’s central region. In contrast, for BCs synapsing in the marginal region, we observed high weights in the cell adhesion molecule PCDH9 and the axon guidance cue UNC5D (<xref rid="fig5" ref-type="fig">Figure 5a</xref>). This pattern was echoed in RGCs but involved a slightly different set of molecules. For example, in RGCs forming synapses in the IPL’s central region, the cell adhesion molecule PCDH7 carried high weights, whereas for RGCs synapsing in the marginal region, cell adhesion molecules PCDH11X and CDH12 were associated with high weights (<xref rid="fig5" ref-type="fig">Figure 5b</xref>).</p>
<p>The second latent dimension revealed a comparable pattern, albeit with different gene signatures. For BCs laminating in the IPL’s outer region, high weights were assigned to guidance cues such as SLIT2, NLGN1, EPHA3 and PLXNA4, as well as the adhesion molecule DSCAM. For BCs in the inner region, the adhesion molecule CNTN5 was associated with a high weight (<xref rid="fig5" ref-type="fig">Figure 5c</xref>). In RGCs, we noticed that guidance molecules such as PLXNA2, SLITRK6 and PLXNA4 along with adhesion molecules CDH8 and LRRC4C were associated with high weights for cells forming synapses in the IPL’s outer region. In contrast, the adhesion molecule SDK2 was among the top genes for RGCs laminating and forming synapses in the IPL’s inner region (<xref rid="fig5" ref-type="fig">Figure 5d</xref>). Some of these genes or gene families, such as Plexins (PLXNA2, PLXNA4), Contactin5 (CNTN5), Sidekick2 (SDK2), and Cadherins (CDH8,11,12), are known to play crucial roles in establishing specific synaptic connections [<xref ref-type="bibr" rid="c37">37</xref>, <xref ref-type="bibr" rid="c38">38</xref>, <xref ref-type="bibr" rid="c39">39</xref>, <xref ref-type="bibr" rid="c40">40</xref>, <xref ref-type="bibr" rid="c20">20</xref>, <xref ref-type="bibr" rid="c21">21</xref>, <xref ref-type="bibr" rid="c41">41</xref>]. Others, particularly delta-protocadherins (PCDH7,9,11X), emerged as new candidates potentially mediating specific synaptic connections [<xref ref-type="bibr" rid="c3">3</xref>].</p>
<p>To elucidate the biological implications of these identified gene sets, we further conducted Gene Ontology (GO) enrichment analysis on the top genes through g:Profiler, a public web server for GO enrichment analysis [<xref ref-type="bibr" rid="c42">42</xref>, <xref ref-type="bibr" rid="c43">43</xref>]. This tool allowed us to delve into the molecular functions, cellular pathways, and biological processes associated with these genes. Intriguingly, when we plotted the top 10 significant GO terms for BCs and RGCs in latent dimension 1 or 2 (<xref rid="fig5" ref-type="fig">Figure 5e-h</xref>), we found two common themes associated with the top genes: neuronal development and synaptic organization. This observation underlines the potential role of the top genes in forming and shaping the specific connections between BC and RGC types.</p>
</sec>
<sec id="s7d">
<label>7.4</label>
<title>Bilinear Model Predicts Connectivity Partners of Transcriptomically-Defined RGC Types</title>
<p>The success of recommendation systems in accurately predicting the preferences of new users inspired us to leverage the bilinear model for predicting the connectivity partners of RGC types whose interconnections with BC types remain uncharted. There are some RGC types defined from single-cell transcriptomic data [<xref ref-type="bibr" rid="c23">23</xref>], which lack clear correspondence with those identified through connectomics studies [<xref ref-type="bibr" rid="c24">24</xref>]. This discrepancy leaves the connectivity patterns of these transcriptionally-defined RGC types unknown, providing an opportunity for our model to predict their BC partners.</p>
<p>To accomplish this, we first projected these RGC types into the same latent space as those used to train the model (<xref rid="fig6" ref-type="fig">Figure 6a</xref>). We then employed this projection to construct a connectivity matrix between these RGC types and BC types (<xref rid="fig6" ref-type="fig">Figure 6b</xref>), facilitating educated estimates about their connectivity partners. For each transcriptionally-defined RGC type, we identified the top three BC types as potential partners, determined by the highest values present in the connectivity matrix. These three BC types could provide insight into the potential synaptic input to each RGC type. Detailed predictions are presented in <xref rid="tbl4" ref-type="table">Table 4</xref>.</p>
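<p>The partner prediction step can be sketched as below, assuming the new RGC types are represented in the same feature space as the RGC types used for training, so that the learned transformation can be applied to them directly. The function name <monospace>predict_partners</monospace> and its arguments are hypothetical.</p>
<code language="python"><![CDATA[
```python
import numpy as np

def predict_partners(X_type, A, Y_new, B, bc_names, top_k=3):
    """Rank BC types for each new RGC type by predicted connectivity.

    X_type: (n_bc_types, p) BC type features; Y_new: (n_new_rgc_types, q) features of
    RGC types without known connectivity, in the same feature space as the training RGCs.
    Returns one list of top_k BC names per new RGC type, strongest first.
    """
    scores = (np.asarray(X_type) @ A) @ (np.asarray(Y_new) @ B).T  # BC types x new RGC types
    order = np.argsort(-scores, axis=0)[:top_k]                    # highest scores first
    return [[bc_names[i] for i in order[:, j]] for j in range(scores.shape[1])]
```
]]></code>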
<table-wrap id="tbl4" orientation="portrait" position="float">
<label>Table 4:</label>
<caption><title>Predicted BC Partners of Transcriptionally-defined RGC Types</title></caption>
<graphic xlink:href="551532v1_tbl4.tif" mimetype="image" mime-subtype="tiff"/>
</table-wrap>
<fig id="fig6" position="float" fig-type="figure">
<label>Figure 6:</label>
<caption><p>BC partner prediction of transcriptionally-defined RGC types. (a) Projection of transcriptionally-defined RGC types with unknown connectivity into the same latent space as those with known connectivity. (b) The resulting predicted connectivity matrix between these RGC types and BC types. Transcriptionally-defined RGC types are named according to Tran et al. [<xref ref-type="bibr" rid="c23">23</xref>].</p></caption>
<graphic xlink:href="551532v1_fig6.tif" mimetype="image" mime-subtype="tiff"/>
</fig>
<p>Although the ground truth connectivity of these RGC types remains unknown due to the absence of matching types in connectomic data, Goetz et al. [<xref ref-type="bibr" rid="c26">26</xref>], via Patch-seq, attempted to match some transcriptomic types with functionally defined RGC types. These functional descriptions may hint at the BC partners of these RGC types. For instance, an RGC exhibiting OFF sustained responses is likely to be synaptically linked with BC types bc1-2, known to mediate OFF sustained pathways. Conversely, an RGC that displays ON sustained responses likely receives synaptic inputs from BC types bc6-9, which oversee ON sustained pathways. We summarized these functional descriptions in <xref rid="tbl4" ref-type="table">Table 4</xref>, referencing Figure 5A of Goetz et al. [<xref ref-type="bibr" rid="c26">26</xref>], and highlighted whether our predictions were consistent with these functional annotations. Among the ten predictions made, eight aligned with these functional descriptions, lending support to the predictive power of our model.</p>
</sec>
</sec>
<sec id="s8">
<label>8</label>
<title>Discussion</title>
<p>This study showcased an innovative computational strategy that integrates transcriptomic and connectomic data to elucidate genetic principles dictating neural circuit wiring. Leveraging a bilinear model, we reconstructed a neuronal type-specific connectivity map from gene expression profiles. We have demonstrated the model’s capability to recapitulate recognized connectivity motifs of the retinal circuit and provide interpretable insights into the gene signatures associated with these motifs. Moreover, our model can predict potential connectivity partners for transcriptomically-defined neuronal types whose connectomes are yet to be fully characterized. These capabilities make our bilinear model a valuable tool for investigating gene-regulatory mechanisms involved in neural circuit wiring.</p>
<p>The inspiration for our bilinear model came from recommendation systems, a machine learning domain focused on capturing interactions between users and items and on predicting user preferences in commercial settings. In this analogy, presynaptic and postsynaptic neurons play the roles of users and items, respectively, and the synaptic connection matrix corresponds to the user-item preference matrix. Recommendation systems assume that user preferences and item attributes can be represented by latent factors; similarly, our model assumes that synaptic connectivity between neuron types is determined by a shared latent feature space derived from gene expression profiles.</p>
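<p>The analogy can be made concrete with a minimal sketch of a bilinear score. The notation here is an assumption made for illustration and is not taken from the fitted model: <monospace>x_pre</monospace> and <monospace>y_post</monospace> hold gene expression per neuronal type, <monospace>a</monospace> and <monospace>b</monospace> are transformation matrices into a shared latent space, and the predicted connectivity is the inner product of the two latent representations. All numbers are toy values.</p>

```python
def matmul(left, right):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*right)]
            for row in left]


def transpose(matrix):
    """Swap rows and columns of a list-of-rows matrix."""
    return [list(col) for col in zip(*matrix)]


def bilinear_scores(x_pre, y_post, a, b):
    """Predicted connectivity (X A)(Y B)^T between pre- and postsynaptic types."""
    pre_latent = matmul(x_pre, a)    # presynaptic types x latent dimensions
    post_latent = matmul(y_post, b)  # postsynaptic types x latent dimensions
    return matmul(pre_latent, transpose(post_latent))


# Toy inputs (hypothetical): 2 presynaptic types x 3 genes,
# 2 postsynaptic types x 2 genes, and a single latent dimension.
x_pre = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
y_post = [[1.0, 1.0], [0.0, 2.0]]
a = [[1.0], [-1.0], [0.5]]
b = [[2.0], [1.0]]

scores = bilinear_scores(x_pre, y_post, a, b)
print(scores)  # rows: presynaptic types, columns: postsynaptic types
```

<p>In a recommendation setting the same computation would score user-item pairs; here each entry scores one pair of pre- and postsynaptic types. Learning <monospace>a</monospace> and <monospace>b</monospace> from data is outside the scope of this sketch.</p>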
<p>Our bilinear model successfully recapitulated two core connectivity motifs of the retinal circuit: synapses formed in central versus marginal parts of the IPL, and synapses formed in outer versus inner regions. These motifs align well with recognized properties of retinal neurons: kinetics (transient versus sustained responses) and polarity (ON versus OFF responses). Importantly, these motifs are not predefined or explicitly encoded in the model; they emerge from the fitted model, further attesting to its ability to capture key aspects of retinal circuitry.</p>
<p>The bilinear model also yielded insights into the gene signatures associated with the connectivity motifs. The weight vectors in the transformation matrices provide a means of assessing the relative importance of individual genes. This direct interpretability is a significant advantage of the linear model, allowing a more intuitive understanding of the gene-to-connectivity transformation. Our analysis identified distinct gene signatures associated with different connectivity motifs. Some of these genes have previously been implicated in mediating specific synaptic connections, thereby validating our approach. For instance, Plexins A4 and A2 (PLXNA4, PLXNA2), predicted to be crucial for RGCs’ synapsing in the outer IPL, have been shown to be necessary for forming specific laminae of the IPL in the mouse retina, interacting with the guidance molecule Semaphorin 6A (Sema6A) [<xref ref-type="bibr" rid="c37">37</xref>, <xref ref-type="bibr" rid="c38">38</xref>]. Contactin5 (CNTN5), which our model predicted to be vital for BCs forming synapses in the inner IPL, has been shown to be essential for synapses between ON BCs and the ON lamina of ON-OFF direction-selective ganglion cells (ooDSGCs) [<xref ref-type="bibr" rid="c39">39</xref>]. Sidekick2 (SDK2), predicted to be critical for RGCs’ synapses in the inner IPL, has been shown to guide the formation of a retinal circuit that detects differential motion [<xref ref-type="bibr" rid="c40">40</xref>]. Similarly, Cadherins (CDH8,11,12), whose combinations have been implicated in synaptic specificity within retinal circuits [<xref ref-type="bibr" rid="c20">20</xref>, <xref ref-type="bibr" rid="c21">21</xref>], were highlighted for multiple connectivity motifs.
In particular, Cadherin8 (CDH8), which our model predicted to be crucial for RGCs’ synaptic connections in the outer IPL, has been shown to be guided by the transcription factor Tbr1 for laminar patterning of J-RGCs, a type of OFF direction-selective RGC [<xref ref-type="bibr" rid="c41">41</xref>].</p>
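<p>Reading gene importance from the weight vectors can be sketched as follows. This is an illustrative assumption about how such a ranking might be computed, not the exact procedure used in the analysis: genes are ordered by the absolute value of their weight along one latent dimension. The gene names and weights are placeholders, not results.</p>

```python
def top_genes(weights, gene_names, latent_dim, k=3):
    """Return the k genes with the largest absolute weight on one latent dimension.

    weights: list of rows, one row per gene, one column per latent dimension.
    """
    column = [row[latent_dim] for row in weights]
    ranked = sorted(zip(gene_names, column),
                    key=lambda pair: abs(pair[1]),
                    reverse=True)
    return ranked[:k]


# Placeholder names and made-up weights, for illustration only.
gene_names = ["gene_a", "gene_b", "gene_c", "gene_d"]
weights = [
    [0.10, -0.70],
    [-0.90, 0.05],
    [0.40, 0.30],
    [0.02, 0.60],
]

top = top_genes(weights, gene_names, latent_dim=0, k=2)
for name, weight in top:
    print(name, weight)
```

<p>The sign of each weight is kept in the output because it indicates the direction in which a gene pushes the latent feature; only the ordering uses the magnitude.</p>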
<p>In addition to these validated gene signatures, our analysis identified promising candidate genes that may mediate specific synaptic connections. In particular, delta-protocadherins (PCDH7,9,11x) emerged as potential new candidates. While their roles in synaptic connectivity are not fully understood [<xref ref-type="bibr" rid="c3">3</xref>], mutations in delta-protocadherins in mice and humans have been linked to various neurological phenotypes, including impaired axon growth and guidance and changes in synaptic plasticity and stability [<xref ref-type="bibr" rid="c44">44</xref>, <xref ref-type="bibr" rid="c45">45</xref>, <xref ref-type="bibr" rid="c46">46</xref>]. Future experimental studies are needed to validate these findings and to clarify the roles of these genes in neural circuit formation and function.</p>
<p>While our approach was illustrated using the mouse retina, it could be applied to decipher the genetic programming of neuronal connectivity in other areas of the nervous system, given sufficient data on both gene expression profiles and synaptic connections. For instance, single-cell sequencing across multiple cortical regions and the hippocampus in mice has defined transcriptomic cell types [<xref ref-type="bibr" rid="c47">47</xref>, <xref ref-type="bibr" rid="c48">48</xref>, <xref ref-type="bibr" rid="c49">49</xref>]. Concurrently, considerable progress has been made in connectomic studies of the mouse cortex, particularly the visual cortex [<xref ref-type="bibr" rid="c50">50</xref>, <xref ref-type="bibr" rid="c51">51</xref>, <xref ref-type="bibr" rid="c33">33</xref>, <xref ref-type="bibr" rid="c52">52</xref>]. Integrating these data sources would enable a comprehensive examination of the genetic underpinnings of connectivity in these brain regions. Our approach is a first step toward investigating how gene expression patterns contribute to the diversity of neuronal circuits across brain regions, setting the stage for a broader understanding of the genetic blueprint governing neuronal circuit wiring throughout the brain.</p>
</sec>
<sec id="s9">
<label>9</label>
<title>Future Directions</title>
<p>This study introduces a novel and powerful methodology for inferring connectivity maps from gene expression data, bridging the gap between gene expression and synaptic connectivity in neural circuits. It offers a data-driven, systematic approach to decipher the genetic regulation of neural circuit formation. Nevertheless, the study is not without its limitations, which, in turn, highlight potential avenues for future research.</p>
<p>Firstly, the inherent linearity of the model facilitates the identification of key genes influencing neuronal connectivity but reduces the intricate, multi-stage process of synapse formation to a linear transformation. This simplification, while useful for gaining insights, does not fully capture the complexity of synapse formation. However, the bilinear model provides a solid theoretical foundation for developing more sophisticated, non-linear models. For instance, we could extend the bilinear model by incorporating kernels, possibly in the form of neural networks, to capture non-linear interactions between gene expression and connectivity patterns (<xref rid="fig7" ref-type="fig">Figure 7</xref>). One instance of this non-linear approach is the “two-tower model”, in which each “tower” is a deep neural network that learns a non-linear transformation of the input features [<xref ref-type="bibr" rid="c53">53</xref>, <xref ref-type="bibr" rid="c54">54</xref>]. This model, widely used in contemporary recommendation systems, has proven effective at capturing complicated user-item interactions.</p>
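<p>A forward pass of such a two-tower model might look like the sketch below. It is a hypothetical illustration only: the layer sizes, the ReLU non-linearity, and all weights are assumptions chosen to keep the example small, and no training procedure is shown. Each tower maps a gene expression vector to a latent embedding, and the connectivity score is the inner product of the two embeddings.</p>

```python
def dense(vec, weights, bias):
    """Affine layer; weights holds one row of input weights per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def relu(vec):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in vec]


def tower(vec, layers):
    """Apply dense layers with a ReLU after every layer except the last."""
    for index, (weights, bias) in enumerate(layers):
        vec = dense(vec, weights, bias)
        if index != len(layers) - 1:
            vec = relu(vec)
    return vec


def two_tower_score(pre_genes, post_genes, pre_layers, post_layers):
    """Inner product of the presynaptic and postsynaptic embeddings."""
    pre_embedding = tower(pre_genes, pre_layers)
    post_embedding = tower(post_genes, post_layers)
    return sum(p * q for p, q in zip(pre_embedding, post_embedding))


# Hypothetical weights: two genes in, two hidden units, one embedding dimension.
pre_layers = [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
              ([[1.0, 1.0]], [0.0])]
post_layers = [([[1.0, 0.0], [0.0, 1.0]], [0.0, -3.0]),
               ([[2.0, 1.0]], [0.5])]

score = two_tower_score([2.0, 1.0], [1.0, 2.0], pre_layers, post_layers)
print(score)
```

<p>If each tower is reduced to a single dense layer with zero bias, the score reduces to the bilinear form, which is why the bilinear model can be viewed as a special case of this architecture.</p>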
<fig id="fig7" position="float" fig-type="figure">
<label>Figure 7:</label>
<caption><p>Future direction: A two-tower model. (a) Gene expression profiles of pre- and postsynaptic neurons are transformed into latent embedding representations via deep neural networks. The connectivity metric between the pre- and postsynaptic neurons is predicted by taking the inner product of their respective latent embeddings.</p></caption>
<graphic xlink:href="551532v1_fig7.tif" mimetype="image" mime-subtype="tiff"/>
</fig>
<p>Secondly, our model assumes that the transcriptomic profile of a neuron type largely determines its connectivity profile. However, other factors, such as neuronal activity, also influence synaptic connectivity, and our model does not account for them. Future models could incorporate activity-dependent variables and additional omics data to capture a broader range of factors shaping neural circuits.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="c1"><label>[1]</label><mixed-citation publication-type="other"><string-name><given-names>Sebastian</given-names> <surname>Seung</surname></string-name>. <source>Connectome: How the Brain’s Wiring Makes Us Who We Are</source>. <publisher-name>Houghton Mifflin Harcourt</publisher-name>, <year>2011</year>.</mixed-citation></ref>
<ref id="c2"><label>[2]</label><mixed-citation publication-type="journal"><string-name><given-names>Franck</given-names> <surname>Polleux</surname></string-name> and <string-name><given-names>William</given-names> <surname>Snider</surname></string-name>. <article-title>Establishment of axon-dendrite polarity in developing neurons</article-title>. <source>Annual review of neuroscience</source>, <volume>36</volume>:<fpage>467</fpage>–<lpage>488</lpage>, <year>2013</year>.</mixed-citation></ref>
<ref id="c3"><label>[3]</label><mixed-citation publication-type="journal"><string-name><given-names>Joshua R.</given-names> <surname>Sanes</surname></string-name> and <string-name><given-names>S. Lawrence</given-names> <surname>Zipursky</surname></string-name>. <article-title>Synaptic Specificity, Recognition Molecules, and Assembly of Neural Circuits</article-title>. <source>Cell</source>, <volume>181</volume>(<issue>3</issue>):<fpage>536</fpage>–<lpage>556</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="c4"><label>[4]</label><mixed-citation publication-type="journal"><string-name><given-names>Hongkui</given-names> <surname>Zeng</surname></string-name> and <string-name><given-names>Joshua R</given-names> <surname>Sanes</surname></string-name>. <article-title>Neuronal cell-type classification: challenges, opportunities and the path forward</article-title>. <source>Nature Reviews Neuroscience</source>, <volume>18</volume>(<issue>9</issue>):<fpage>530</fpage>–<lpage>546</lpage>, <year>2017</year>.</mixed-citation></ref>
<ref id="c5"><label>[5]</label><mixed-citation publication-type="journal"><string-name><given-names>Antonio</given-names> <surname>Scialdone</surname></string-name>, <string-name><given-names>Valentine</given-names> <surname>Svensson</surname></string-name>, <string-name><given-names>Anja</given-names> <surname>Wilbrey-Clark</surname></string-name>, <string-name><given-names>Valentina</given-names> <surname>Proserpio</surname></string-name>, and <string-name><given-names>Sarah A</given-names> <surname>Teichmann</surname></string-name>. <article-title>Computational and analytical challenges in single-cell transcriptomics</article-title>. <source>Nature Reviews Genetics</source>, <volume>19</volume>(<issue>3</issue>):<fpage>133</fpage>–<lpage>145</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="c6"><label>[6]</label><mixed-citation publication-type="journal"><string-name><given-names>Winfried</given-names> <surname>Denk</surname></string-name> and <string-name><given-names>Heinz</given-names> <surname>Horstmann</surname></string-name>. <article-title>Serial block-face scanning electron microscopy to reconstruct three-dimensional tissue nanostructure</article-title>. <source>PLOS biology</source>, <volume>2</volume>(<issue>11</issue>), <year>2004</year>.</mixed-citation></ref>
<ref id="c7"><label>[7]</label><mixed-citation publication-type="journal"><string-name><given-names>Moritz</given-names> <surname>Helmstaedter</surname></string-name>, <string-name><given-names>Kevin L.</given-names> <surname>Briggman</surname></string-name>, <string-name><given-names>Srinivas C.</given-names> <surname>Turaga</surname></string-name>, <string-name><given-names>Viren</given-names> <surname>Jain</surname></string-name>, <string-name><given-names>H. Sebastian</given-names> <surname>Seung</surname></string-name>, and <string-name><given-names>Winfried</given-names> <surname>Denk</surname></string-name>. <article-title>Connectomic reconstruction of the inner plexiform layer in the mouse retina</article-title>. <source>Nature</source>, <volume>500</volume>(<issue>7461</issue>):<fpage>168</fpage>–<lpage>174</lpage>, <year>2013</year>.</mixed-citation></ref>
<ref id="c8"><label>[8]</label><mixed-citation publication-type="journal"><string-name><given-names>Juan Carlos</given-names> <surname>Tapia</surname></string-name>, <string-name><given-names>Narayanan</given-names> <surname>Kasthuri</surname></string-name>, <string-name><given-names>Kenneth J</given-names> <surname>Hayworth</surname></string-name>, <string-name><given-names>Richard</given-names> <surname>Schalek</surname></string-name>, <string-name><given-names>Jeff W</given-names> <surname>Lichtman</surname></string-name>, <string-name><given-names>Stephen J</given-names> <surname>Smith</surname></string-name>, and <string-name><given-names>JoAnn</given-names> <surname>Buchanan</surname></string-name>. <article-title>High-contrast en bloc staining of neuronal tissue for field emission scanning electron microscopy</article-title>. <source>Nature Protocols</source>, <volume>7</volume>(<issue>2</issue>):<fpage>193</fpage>–<lpage>206</lpage>, <year>2012</year>.</mixed-citation></ref>
<ref id="c9"><label>[9]</label><mixed-citation publication-type="journal"><string-name><given-names>Yehuda</given-names> <surname>Koren</surname></string-name>, <string-name><given-names>Robert</given-names> <surname>Bell</surname></string-name>, and <string-name><given-names>Chris</given-names> <surname>Volinsky</surname></string-name>. <article-title>Matrix factorization techniques for recommender systems</article-title>. <source>Computer</source>, <volume>42</volume>:<fpage>30</fpage>–<lpage>37</lpage>, <year>2009</year>.</mixed-citation></ref>
<ref id="c10"><label>[10]</label><mixed-citation publication-type="journal"><string-name><given-names>Thomas</given-names> <surname>Euler</surname></string-name>, <string-name><given-names>Silke</given-names> <surname>Haverkamp</surname></string-name>, <string-name><given-names>Timm</given-names> <surname>Schubert</surname></string-name>, and <string-name><given-names>Tom</given-names> <surname>Baden</surname></string-name>. <article-title>Retinal bipolar cells: elementary building blocks of vision</article-title>. <source>Nat Rev Neurosci</source>, <volume>15</volume>(<issue>8</issue>):<fpage>507</fpage>–<lpage>519</lpage>, <year>2014</year>.</mixed-citation></ref>
<ref id="c11"><label>[11]</label><mixed-citation publication-type="journal"><string-name><given-names>Joshua R</given-names> <surname>Sanes</surname></string-name> and <string-name><given-names>Richard H</given-names> <surname>Masland</surname></string-name>. <article-title>The types of retinal ganglion cells: Current status and implications for neuronal classification</article-title>. <source>Annual Review of Neuroscience</source>, <volume>38</volume>:<fpage>221</fpage>–<lpage>246</lpage>, <year>2015</year>.</mixed-citation></ref>
<ref id="c12"><label>[12]</label><mixed-citation publication-type="journal"><string-name><given-names>Tim</given-names> <surname>Gollisch</surname></string-name> and <string-name><given-names>Markus</given-names> <surname>Meister</surname></string-name>. <article-title>Eye smarter than scientists believed: Neural computations in circuits of the retina</article-title>. <source>Neuron</source>, <volume>65</volume>(<issue>2</issue>):<fpage>150</fpage>–<lpage>164</lpage>, <year>2010</year>.</mixed-citation></ref>
<ref id="c13"><label>[13]</label><mixed-citation publication-type="journal"><string-name><given-names>Rava Azeredo</given-names> <surname>da Silveira</surname></string-name> and <string-name><given-names>Botond</given-names> <surname>Roska</surname></string-name>. <article-title>Cell types, circuits, computation</article-title>. <source>Current Opinion in Neurobiology</source>, <volume>21</volume>(<issue>5</issue>):<fpage>664</fpage>–<lpage>671</lpage>, <year>2011</year>.</mixed-citation></ref>
<ref id="c14"><label>[14]</label><mixed-citation publication-type="journal"><string-name><given-names>Joris</given-names> <surname>de Wit</surname></string-name> and <string-name><given-names>Anirvan</given-names> <surname>Ghosh</surname></string-name>. <article-title>Specification of synaptic connectivity by cell surface interactions</article-title>. <source>Nature Reviews Neuroscience</source>, <volume>17</volume>:<fpage>4</fpage>, <year>2016</year>.</mixed-citation></ref>
<ref id="c15"><label>[15]</label><mixed-citation publication-type="journal"><string-name><given-names>Adam J</given-names> <surname>Isabella</surname></string-name>, <string-name><given-names>Eduardo</given-names> <surname>Leyva-Díaz</surname></string-name>, <string-name><given-names>Takuya</given-names> <surname>Kaneko</surname></string-name>, <string-name><given-names>Scott J</given-names> <surname>Gratz</surname></string-name>, <string-name><given-names>Cecilia B</given-names> <surname>Moens</surname></string-name>, <string-name><given-names>Oliver</given-names> <surname>Hobert</surname></string-name>, <string-name><given-names>Kate</given-names> <surname>O’Connor-Giles</surname></string-name>, <string-name><given-names>Rajan</given-names> <surname>Thakur</surname></string-name>, and <string-name><given-names>HaoSheng</given-names> <surname>Sun</surname></string-name>. <article-title>The field of neurogenetics: where it stands and where it is going</article-title>. <source>Genetics</source>, <volume>218</volume>(<issue>4</issue>):<fpage>iyab085</fpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="c16"><label>[16]</label><mixed-citation publication-type="journal"><string-name><given-names>Alex</given-names> <surname>Fornito</surname></string-name>, <string-name><given-names>Aurina</given-names> <surname>Arnatkevičiūtė</surname></string-name>, and <string-name><given-names>Ben D.</given-names> <surname>Fulcher</surname></string-name>. <article-title>Bridging the Gap between Connectome and Transcriptome</article-title>. <source>Trends in Cognitive Sciences</source>, <volume>23</volume>(<issue>1</issue>):<fpage>34</fpage>–<lpage>50</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="c17"><label>[17]</label><mixed-citation publication-type="journal"><string-name><given-names>Francesco</given-names> <surname>Ricci</surname></string-name>, <string-name><given-names>Lior</given-names> <surname>Rokach</surname></string-name>, <string-name><given-names>Bracha</given-names> <surname>Shapira</surname></string-name>, and <string-name><given-names>Paul B</given-names> <surname>Kantor</surname></string-name>. <article-title>Introduction to recommender systems handbook</article-title>. <source>Recommender systems handbook</source>, <volume>1</volume>:<fpage>1</fpage>–<lpage>35</lpage>, <year>2011</year>.</mixed-citation></ref>
<ref id="c18"><label>[18]</label><mixed-citation publication-type="other"><string-name><given-names>Xiaoyuan</given-names> <surname>Su</surname></string-name> and <string-name><given-names>Taghi M</given-names> <surname>Khoshgoftaar</surname></string-name>. <article-title>A survey of collaborative filtering techniques</article-title>. <source>Advances in Artificial Intelligence</source>, <year>2009</year>.</mixed-citation></ref>
<ref id="c19"><label>[19]</label><mixed-citation publication-type="other"><string-name><given-names>Steffen</given-names> <surname>Rendle</surname></string-name>, <string-name><given-names>Christoph</given-names> <surname>Freudenthaler</surname></string-name>, <string-name><given-names>Zeno</given-names> <surname>Gantner</surname></string-name>, and <string-name><given-names>Lars</given-names> <surname>Schmidt-Thieme</surname></string-name>. <article-title>BPR: Bayesian personalized ranking from implicit feedback</article-title>. In <source>Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence</source>, pages <fpage>452</fpage>–<lpage>461</lpage>, <year>2009</year>.</mixed-citation></ref>
<ref id="c20"><label>[20]</label><mixed-citation publication-type="journal"><string-name><given-names>Xin</given-names> <surname>Duan</surname></string-name>, <string-name><given-names>Arjun</given-names> <surname>Krishnaswamy</surname></string-name>, <string-name><given-names>Irina De la</given-names> <surname>Huerta</surname></string-name>, and <string-name><given-names>Joshua R.</given-names> <surname>Sanes</surname></string-name>. <article-title>Type II cadherins guide assembly of a direction-selective retinal circuit</article-title>. <source>Cell</source>, <volume>158</volume>(<issue>4</issue>):<fpage>793</fpage>–<lpage>807</lpage>, <year>2014</year>.</mixed-citation></ref>
<ref id="c21"><label>[21]</label><mixed-citation publication-type="journal"><string-name><given-names>Xin</given-names> <surname>Duan</surname></string-name>, <string-name><given-names>Arjun</given-names> <surname>Krishnaswamy</surname></string-name>, <string-name><given-names>Irina De la</given-names> <surname>Huerta</surname></string-name>, and <string-name><given-names>Joshua R</given-names> <surname>Sanes</surname></string-name>. <article-title>Cadherin combinations recruit dendrites of distinct retinal neurons to a shared interneuronal scaffold</article-title>. <source>Neuron</source>, <volume>99</volume>(<issue>6</issue>):<fpage>1145</fpage>–<lpage>1154</lpage>.e6, <year>2018</year>.</mixed-citation></ref>
<ref id="c22"><label>[22]</label><mixed-citation publication-type="journal"><string-name><given-names>Karthik</given-names> <surname>Shekhar</surname></string-name>, <string-name><given-names>Sylvain W.</given-names> <surname>Lapan</surname></string-name>, <string-name><given-names>Irene E.</given-names> <surname>Whitney</surname></string-name>, <string-name><given-names>Nicholas M.</given-names> <surname>Tran</surname></string-name>, <string-name><given-names>Evan Z.</given-names> <surname>Macosko</surname></string-name>, <string-name><given-names>Monika</given-names> <surname>Kowalczyk</surname></string-name>, <string-name><given-names>Xian</given-names> <surname>Adiconis</surname></string-name>, <string-name><given-names>Joshua Z.</given-names> <surname>Levin</surname></string-name>, <string-name><given-names>James</given-names> <surname>Nemesh</surname></string-name>, <string-name><given-names>Melissa</given-names> <surname>Goldman</surname></string-name>, <string-name><given-names>Steven A.</given-names> <surname>McCarroll</surname></string-name>, <string-name><given-names>Constance L.</given-names> <surname>Cepko</surname></string-name>, <string-name><given-names>Aviv</given-names> <surname>Regev</surname></string-name>, and <string-name><given-names>Joshua R.</given-names> <surname>Sanes</surname></string-name>. <article-title>Comprehensive classification of retinal bipolar neurons by single-cell transcriptomics</article-title>. <source>Cell</source>, <volume>166</volume>(<issue>5</issue>):<fpage>1308</fpage>–<lpage>1323</lpage>.e30, <month>August</month> <year>2016</year>.</mixed-citation></ref>
<ref id="c23"><label>[23]</label><mixed-citation publication-type="other"><string-name><given-names>Nicholas M.</given-names> <surname>Tran</surname></string-name>, <string-name><given-names>Karthik</given-names> <surname>Shekhar</surname></string-name>, <string-name><given-names>Irene E.</given-names> <surname>Whitney</surname></string-name>, <string-name><given-names>Anne</given-names> <surname>Jacobi</surname></string-name>, <string-name><given-names>Inbal</given-names> <surname>Benhar</surname></string-name>, <string-name><given-names>Guosong</given-names> <surname>Hong</surname></string-name>, <string-name><given-names>Wenjun</given-names> <surname>Yan</surname></string-name>, <string-name><given-names>Xian</given-names> <surname>Adiconis</surname></string-name>, <string-name><given-names>McKinzie E.</given-names> <surname>Arnold</surname></string-name>, <string-name><given-names>Jung Min</given-names> <surname>Lee</surname></string-name>, <string-name><given-names>Joshua Z.</given-names> <surname>Levin</surname></string-name>, <string-name><given-names>Dingchang</given-names> <surname>Lin</surname></string-name>, <string-name><given-names>Chen</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>Charles M.</given-names> <surname>Lieber</surname></string-name>, <string-name><given-names>Aviv</given-names> <surname>Regev</surname></string-name>, <string-name><given-names>Zhigang</given-names> <surname>He</surname></string-name>, and <string-name><given-names>Joshua R.</given-names> <surname>Sanes</surname></string-name>. <article-title>Single-cell profiles of retinal neurons differing in resilience to injury reveal neuroprotective genes</article-title>. <source>bioRxiv</source>, page <fpage>711762</fpage>, <month>July</month> <year>2019</year>.</mixed-citation></ref>
<ref id="c24"><label>[24]</label><mixed-citation publication-type="journal"><string-name><given-names>J. Alexander</given-names> <surname>Bae</surname></string-name>, <string-name><given-names>Shang</given-names> <surname>Mu</surname></string-name>, <string-name><given-names>Jinseop S.</given-names> <surname>Kim</surname></string-name>, <string-name><given-names>Nicholas L.</given-names> <surname>Turner</surname></string-name>, <string-name><given-names>Ignacio</given-names> <surname>Tartavull</surname></string-name>, <string-name><given-names>Nico</given-names> <surname>Kemnitz</surname></string-name>, <string-name><given-names>Chris S.</given-names> <surname>Jordan</surname></string-name>, <string-name><given-names>Alex D.</given-names> <surname>Norton</surname></string-name>, <string-name><given-names>William M.</given-names> <surname>Silversmith</surname></string-name>, <string-name><given-names>Rachel</given-names> <surname>Prentki</surname></string-name>, <string-name><given-names>Marissa</given-names> <surname>Sorek</surname></string-name>, <string-name><given-names>Celia</given-names> <surname>David</surname></string-name>, <string-name><given-names>Devon L.</given-names> <surname>Jones</surname></string-name>, <string-name><given-names>Doug</given-names> <surname>Bland</surname></string-name>, <string-name><given-names>Amy L. R.</given-names> <surname>Sterling</surname></string-name>, <string-name><given-names>Jungman</given-names> <surname>Park</surname></string-name>, <string-name><given-names>Kevin L.</given-names> <surname>Briggman</surname></string-name>, <string-name><given-names>H. Sebastian</given-names> <surname>Seung</surname></string-name>, and <collab>the EyeWirers</collab>. <article-title>Digital museum of retinal ganglion cells with dense anatomy and physiology</article-title>. <source>Cell</source>, <volume>173</volume>(<issue>5</issue>):<fpage>1293</fpage>–<lpage>1306</lpage>.e19, <year>2018</year>.</mixed-citation></ref>
<ref id="c25"><label>[25]</label><mixed-citation publication-type="journal"><string-name><given-names>M. J.</given-names> <surname>Greene</surname></string-name>, <string-name><given-names>J. S.</given-names> <surname>Kim</surname></string-name>, <string-name><given-names>H. S.</given-names> <surname>Seung</surname></string-name>, and <collab>the EyeWirers</collab>. <article-title>Analogous convergence of sustained and transient inputs in parallel on and off pathways for retinal motion computation</article-title>. <source>Cell Reports</source>, <volume>14</volume>:<fpage>1</fpage>–<lpage>9</lpage>, <year>2016</year>.</mixed-citation></ref>
<ref id="c26"><label>[26]</label><mixed-citation publication-type="journal"><string-name><given-names>Jillian</given-names> <surname>Goetz</surname></string-name>, <string-name><given-names>Zachary F</given-names> <surname>Jessen</surname></string-name>, <string-name><given-names>Anne</given-names> <surname>Jacobi</surname></string-name>, <string-name><given-names>Adam</given-names> <surname>Mani</surname></string-name>, <string-name><given-names>Sam</given-names> <surname>Cooler</surname></string-name>, <string-name><given-names>Devon</given-names> <surname>Greer</surname></string-name>, <string-name><given-names>Sabah</given-names> <surname>Kadri</surname></string-name>, <string-name><given-names>Jeremy</given-names> <surname>Segal</surname></string-name>, <string-name><given-names>Karthik</given-names> <surname>Shekhar</surname></string-name>, <string-name><given-names>Joshua R</given-names> <surname>Sanes</surname></string-name>, and <string-name><given-names>Gregory W</given-names> <surname>Schwartz</surname></string-name>. <article-title>Unified classification of mouse retinal ganglion cells using function, morphology, and gene expression</article-title>. <source>Cell Reports</source>, <volume>40</volume>(<issue>2</issue>):<fpage>111040</fpage>, <year>2022</year>.</mixed-citation></ref>
<ref id="c27"><label>[27]</label><mixed-citation publication-type="journal"><string-name><given-names>Andrew</given-names> <surname>Butler</surname></string-name>, <string-name><given-names>Paul</given-names> <surname>Hoffman</surname></string-name>, <string-name><given-names>Peter</given-names> <surname>Smibert</surname></string-name>, <string-name><given-names>Efthymia</given-names> <surname>Papalexi</surname></string-name>, and <string-name><given-names>Rahul</given-names> <surname>Satija</surname></string-name>. <article-title>Integrating single-cell transcriptomic data across different conditions, technologies, and species</article-title>. <source>Nature Biotechnology</source>, <volume>36</volume>(<issue>5</issue>):<fpage>411</fpage>–<lpage>420</lpage>, <month>May</month> <year>2018</year>.</mixed-citation></ref>
<ref id="c28"><label>[28]</label><mixed-citation publication-type="journal"><string-name><given-names>Tim</given-names> <surname>Stuart</surname></string-name>, <string-name><given-names>Andrew</given-names> <surname>Butler</surname></string-name>, <string-name><given-names>Paul</given-names> <surname>Hoffman</surname></string-name>, <string-name><given-names>Christoph</given-names> <surname>Hafemeister</surname></string-name>, <string-name><given-names>Efthymia</given-names> <surname>Papalexi</surname></string-name>, <string-name><given-names>William M.</given-names> <surname>Mauck</surname></string-name>, <string-name><given-names>Yuhan</given-names> <surname>Hao</surname></string-name>, <string-name><given-names>Marlon</given-names> <surname>Stoeckius</surname></string-name>, <string-name><given-names>Peter</given-names> <surname>Smibert</surname></string-name>, and <string-name><given-names>Rahul</given-names> <surname>Satija</surname></string-name>. <article-title>Comprehensive Integration of Single-Cell Data</article-title>. <source>Cell</source>, <volume>177</volume>(<issue>7</issue>):<fpage>1888</fpage>–<lpage>1902</lpage>.e21, <month>June</month> <year>2019</year>.</mixed-citation></ref>
<ref id="c29"><label>[29]</label><mixed-citation publication-type="other"><string-name><given-names>Mu</given-names> <surname>Qiao</surname></string-name>. <article-title>Factorized linear discriminant analysis and its application in computational biology</article-title>. <source>arXiv preprint</source> <pub-id pub-id-type="arxiv">2010.02171</pub-id>, <year>2020</year>.</mixed-citation></ref>
<ref id="c30"><label>[30]</label><mixed-citation publication-type="journal"><string-name><given-names>Hung-I. Harry</given-names> <surname>Chen</surname></string-name>, <string-name><given-names>Yufang</given-names> <surname>Jin</surname></string-name>, <string-name><given-names>Yufei</given-names> <surname>Huang</surname></string-name>, and <string-name><given-names>Yidong</given-names> <surname>Chen</surname></string-name>. <article-title>Detection of high variability in gene expression from single-cell RNA-seq profiling</article-title>. <source>BMC Genomics</source>, <volume>17</volume> <issue>Suppl 7</issue>:<fpage>508</fpage>, <month>August</month> <year>2016</year>.</mixed-citation></ref>
<ref id="c31"><label>[31]</label><mixed-citation publication-type="journal"><string-name><given-names>Shristi</given-names> <surname>Pandey</surname></string-name>, <string-name><given-names>Karthik</given-names> <surname>Shekhar</surname></string-name>, <string-name><given-names>Aviv</given-names> <surname>Regev</surname></string-name>, and <string-name><given-names>Alexander F.</given-names> <surname>Schier</surname></string-name>. <article-title>Comprehensive Identification and Spatial Mapping of Habenular Neuronal Types Using Single-Cell RNA-Seq</article-title>. <source>Curr. Biol</source>., <volume>28</volume>(<issue>7</issue>):<fpage>1052</fpage>–<lpage>1065</lpage>.e7, <month>April</month> <year>2018</year>.</mixed-citation></ref>
<ref id="c32"><label>[32]</label><mixed-citation publication-type="journal"><string-name><given-names>Yerbol Z</given-names> <surname>Kurmangaliyev</surname></string-name>, <string-name><given-names>Juyoun</given-names> <surname>Yoo</surname></string-name>, <string-name><given-names>Samuel A</given-names> <surname>LoCascio</surname></string-name>, and <string-name><given-names>S Lawrence</given-names> <surname>Zipursky</surname></string-name>. <article-title>Modular transcriptional programs separately define axon and dendrite connectivity</article-title>. <source>eLife</source>, <volume>8</volume>:<fpage>e50822</fpage>, <month>November</month> <year>2019</year>.</mixed-citation></ref>
<ref id="c33"><label>[33]</label><mixed-citation publication-type="journal"><string-name><given-names>Nicholas L</given-names> <surname>Turner</surname></string-name>, <string-name><given-names>Thomas</given-names> <surname>Macrina</surname></string-name>, <string-name><given-names>J Alexander</given-names> <surname>Bae</surname></string-name>, <string-name><given-names>Runzhe</given-names> <surname>Yang</surname></string-name>, <string-name><given-names>Alyssa M</given-names> <surname>Wilson</surname></string-name>, <string-name><given-names>Casey</given-names> <surname>Schneider-Mizell</surname></string-name>, <string-name><given-names>Kisuk</given-names> <surname>Lee</surname></string-name>, <string-name><given-names>Ran</given-names> <surname>Lu</surname></string-name>, <string-name><given-names>Jingpeng</given-names> <surname>Wu</surname></string-name>, <string-name><given-names>Agnes L</given-names> <surname>Bodor</surname></string-name>, <string-name><given-names>Adam A</given-names> <surname>Bleckert</surname></string-name>, <string-name><given-names>Derrick</given-names> <surname>Brittain</surname></string-name>, <string-name><given-names>Emmanouil</given-names> <surname>Froudarakis</surname></string-name>, <string-name><given-names>Sven</given-names> <surname>Dorkenwald</surname></string-name>, <string-name><given-names>Forrest</given-names> <surname>Collman</surname></string-name>, <string-name><given-names>Nico</given-names> <surname>Kemnitz</surname></string-name>, <string-name><given-names>Dodam</given-names> <surname>Ih</surname></string-name>, <string-name><given-names>William M</given-names> <surname>Silversmith</surname></string-name>, <string-name><given-names>Jonathan</given-names> <surname>Zung</surname></string-name>, <string-name><given-names>Aleksandar</given-names> <surname>Zlateski</surname></string-name>, <string-name><given-names>Ignacio</given-names> <surname>Tartavull</surname></string-name>, 
<string-name><given-names>Szi-Chieh</given-names> <surname>Yu</surname></string-name>, <string-name><given-names>Sergiy</given-names> <surname>Popovych</surname></string-name>, <string-name><given-names>Shang</given-names> <surname>Mu</surname></string-name>, <string-name><given-names>William</given-names> <surname>Wong</surname></string-name>, <string-name><given-names>Chris S</given-names> <surname>Jordan</surname></string-name>, <string-name><given-names>Manuel</given-names> <surname>Castro</surname></string-name>, <string-name><given-names>JoAnn</given-names> <surname>Buchanan</surname></string-name>, <string-name><given-names>Daniel J</given-names> <surname>Bumbarger</surname></string-name>, <string-name><given-names>Marc</given-names> <surname>Takeno</surname></string-name>, <string-name><given-names>Russel</given-names> <surname>Torres</surname></string-name>, <string-name><given-names>Gayathri</given-names> <surname>Mahalingam</surname></string-name>, <string-name><given-names>Leila</given-names> <surname>Elabbady</surname></string-name>, <string-name><given-names>Yang</given-names> <surname>Li</surname></string-name>, <string-name><given-names>Erick</given-names> <surname>Cobos</surname></string-name>, <string-name><given-names>Pengcheng</given-names> <surname>Zhou</surname></string-name>, <string-name><given-names>Shelby</given-names> <surname>Suckow</surname></string-name>, <string-name><given-names>Lynne</given-names> <surname>Becker</surname></string-name>, <string-name><given-names>Liam</given-names> <surname>Paninski</surname></string-name>, <string-name><given-names>Franck</given-names> <surname>Polleux</surname></string-name>, <string-name><given-names>Jacob</given-names> <surname>Reimer</surname></string-name>, <string-name><given-names>Andreas S</given-names> <surname>Tolias</surname></string-name>, <string-name><given-names>R Clay</given-names> <surname>Reid</surname></string-name>, <string-name><given-names>Nuno Maçarico</given-names> 
<surname>da Costa</surname></string-name>, and <string-name><given-names>H Sebastian</given-names> <surname>Seung</surname></string-name>. <article-title>Reconstruction of neocortex: Organelles, compartments, cells, circuits, and activity</article-title>. <source>Cell</source>, <volume>185</volume>(<issue>6</issue>):<fpage>1082</fpage>–<lpage>1100</lpage>.e24, <year>2022</year>.</mixed-citation></ref>
<ref id="c34"><label>[34]</label><mixed-citation publication-type="journal"><string-name><given-names>Jinseop S</given-names> <surname>Kim</surname></string-name>, <string-name><given-names>Matthew J</given-names> <surname>Greene</surname></string-name>, <string-name><given-names>Aleksandar</given-names> <surname>Zlateski</surname></string-name>, <string-name><given-names>Kisuk</given-names> <surname>Lee</surname></string-name>, <string-name><given-names>Mark</given-names> <surname>Richardson</surname></string-name>, <string-name><given-names>Srinivas C</given-names> <surname>Turaga</surname></string-name>, <string-name><given-names>Michael</given-names> <surname>Purcaro</surname></string-name>, <string-name><given-names>Matthew</given-names> <surname>Balkam</surname></string-name>, <string-name><given-names>Amy</given-names> <surname>Robinson</surname></string-name>, <string-name><given-names>Bardia F</given-names> <surname>Behabadi</surname></string-name>, <etal>et al.</etal> <article-title>Space–time wiring specificity supports direction selectivity in the retina</article-title>. <source>Nature</source>, <volume>509</volume>(<issue>7500</issue>):<fpage>331</fpage>–<lpage>336</lpage>, <year>2014</year>.</mixed-citation></ref>
<ref id="c35"><label>[35]</label><mixed-citation publication-type="journal"><string-name><given-names>Richard H.</given-names> <surname>Masland</surname></string-name>. <article-title>The neuronal organization of the retina</article-title>. <source>Neuron</source>, <volume>76</volume>(<issue>2</issue>):<fpage>266</fpage>–<lpage>280</lpage>, <year>2012</year>.</mixed-citation></ref>
<ref id="c36"><label>[36]</label><mixed-citation publication-type="journal"><string-name><given-names>Tom</given-names> <surname>Baden</surname></string-name>, <string-name><given-names>Philipp</given-names> <surname>Berens</surname></string-name>, <string-name><given-names>Katrin</given-names> <surname>Franke</surname></string-name>, <string-name><given-names>Miroslav</given-names> <surname>Román Rosón</surname></string-name>, <string-name><given-names>Matthias</given-names> <surname>Bethge</surname></string-name>, and <string-name><given-names>Thomas</given-names> <surname>Euler</surname></string-name>. <article-title>The functional diversity of retinal ganglion cells in the mouse</article-title>. <source>Nature</source>, <volume>529</volume>(<issue>7586</issue>):<fpage>345</fpage>–<lpage>350</lpage>, <year>2016</year>.</mixed-citation></ref>
<ref id="c37"><label>[37]</label><mixed-citation publication-type="journal"><string-name><given-names>Ryota L.</given-names> <surname>Matsuoka</surname></string-name>, <string-name><given-names>Kim T.</given-names> <surname>Nguyen-Ba-Charvet</surname></string-name>, <string-name><given-names>Aijaz</given-names> <surname>Parray</surname></string-name>, <string-name><given-names>Tudor C.</given-names> <surname>Badea</surname></string-name>, <string-name><given-names>Alain</given-names> <surname>Chédotal</surname></string-name>, and <string-name><given-names>Alex L.</given-names> <surname>Kolodkin</surname></string-name>. <article-title>Transmembrane semaphorin signaling controls laminar stratification in the mammalian retina</article-title>. <source>Nature</source>, <volume>470</volume>(<issue>7333</issue>):<fpage>259</fpage>–<lpage>263</lpage>, <year>2011</year>.</mixed-citation></ref>
<ref id="c38"><label>[38]</label><mixed-citation publication-type="journal"><string-name><given-names>Liqun O.</given-names> <surname>Sun</surname></string-name>, <string-name><given-names>Zhuoyi</given-names> <surname>Jiang</surname></string-name>, <string-name><given-names>Merav</given-names> <surname>Rivlin-Etzion</surname></string-name>, <string-name><given-names>Rachel</given-names> <surname>Hand</surname></string-name>, <string-name><given-names>Claire M.</given-names> <surname>Brady</surname></string-name>, <string-name><given-names>Reina L.</given-names> <surname>Matsuoka</surname></string-name>, <string-name><given-names>King-Wai</given-names> <surname>Yau</surname></string-name>, <string-name><given-names>Marla B.</given-names> <surname>Feller</surname></string-name>, and <string-name><given-names>Alex L.</given-names> <surname>Kolodkin</surname></string-name>. <article-title>On and off retinal circuit assembly by divergent molecular mechanisms</article-title>. <source>Science</source>, <volume>342</volume>(<issue>6158</issue>):<fpage>1241974</fpage>, <year>2013</year>.</mixed-citation></ref>
<ref id="c39"><label>[39]</label><mixed-citation publication-type="journal"><string-name><given-names>Yi-Rong</given-names> <surname>Peng</surname></string-name>, <string-name><given-names>Nicholas M</given-names> <surname>Tran</surname></string-name>, <string-name><given-names>Arjun</given-names> <surname>Krishnaswamy</surname></string-name>, <string-name><given-names>Dimitar</given-names> <surname>Kostadinov</surname></string-name>, <string-name><given-names>Emily M</given-names> <surname>Martersteck</surname></string-name>, and <string-name><given-names>Joshua R</given-names> <surname>Sanes</surname></string-name>. <article-title>Satb1 regulates contactin 5 to pattern dendrites of a mammalian retinal ganglion cell</article-title>. <source>Neuron</source>, <volume>95</volume>(<issue>4</issue>):<fpage>869</fpage>–<lpage>883</lpage>.e6, <year>2017</year>.</mixed-citation></ref>
<ref id="c40"><label>[40]</label><mixed-citation publication-type="journal"><string-name><given-names>Arjun</given-names> <surname>Krishnaswamy</surname></string-name>, <string-name><given-names>Masahito</given-names> <surname>Yamagata</surname></string-name>, <string-name><given-names>Xin</given-names> <surname>Duan</surname></string-name>, <string-name><given-names>Y. Kate</given-names> <surname>Hong</surname></string-name>, and <string-name><given-names>Joshua R.</given-names> <surname>Sanes</surname></string-name>. <article-title>Sidekick 2 directs formation of a retinal circuit that detects differential motion</article-title>. <source>Nature</source>, <volume>524</volume>(<issue>7566</issue>):<fpage>466</fpage>–<lpage>470</lpage>, <year>2015</year>.</mixed-citation></ref>
<ref id="c41"><label>[41]</label><mixed-citation publication-type="journal"><string-name><given-names>Jinyue</given-names> <surname>Liu</surname></string-name>, <string-name><given-names>Jasmine D. S.</given-names> <surname>Reggiani</surname></string-name>, <string-name><given-names>Mallory A.</given-names> <surname>Laboulaye</surname></string-name>, <string-name><given-names>Shristi</given-names> <surname>Pandey</surname></string-name>, <string-name><given-names>Bin</given-names> <surname>Chen</surname></string-name>, <string-name><given-names>John L. R.</given-names> <surname>Rubenstein</surname></string-name>, <string-name><given-names>Arjun</given-names> <surname>Krishnaswamy</surname></string-name>, and <string-name><given-names>Joshua R.</given-names> <surname>Sanes</surname></string-name>. <article-title>Tbr1 instructs laminar patterning of retinal ganglion cell dendrites</article-title>. <source>Nature Neuroscience</source>, <volume>21</volume>(<issue>5</issue>):<fpage>659</fpage>–<lpage>670</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="c42"><label>[42]</label><mixed-citation publication-type="other"><string-name><given-names>Jüri</given-names> <surname>Reimand</surname></string-name>, <string-name><given-names>Meelis</given-names> <surname>Kull</surname></string-name>, <string-name><given-names>Hedi</given-names> <surname>Peterson</surname></string-name>, <string-name><given-names>Jaanus</given-names> <surname>Hansen</surname></string-name>, and <string-name><given-names>Jaak</given-names> <surname>Vilo</surname></string-name>. <article-title>g:Profiler — a web-based toolset for functional profiling of gene lists from large-scale experiments</article-title>. <source>Nucleic Acids Research</source>, <year>2007</year>.</mixed-citation></ref>
<ref id="c43"><label>[43]</label><mixed-citation publication-type="other"><string-name><given-names>Uku</given-names> <surname>Raudvere</surname></string-name>, <string-name><given-names>Liis</given-names> <surname>Kolberg</surname></string-name>, <string-name><given-names>Ivan</given-names> <surname>Kuzmin</surname></string-name>, <string-name><given-names>Tambet</given-names> <surname>Arak</surname></string-name>, <string-name><given-names>Priit</given-names> <surname>Adler</surname></string-name>, <string-name><given-names>Hedi</given-names> <surname>Peterson</surname></string-name>, and <string-name><given-names>Jaak</given-names> <surname>Vilo</surname></string-name>. <article-title>g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update)</article-title>. <source>Nucleic Acids Research</source>, <year>2019</year>.</mixed-citation></ref>
<ref id="c44"><label>[44]</label><mixed-citation publication-type="journal"><string-name><given-names>Irene</given-names> <surname>Kahr</surname></string-name>, <string-name><given-names>Karl</given-names> <surname>Vandepoele</surname></string-name>, and <string-name><given-names>Frans</given-names> <surname>van Roy</surname></string-name>. <article-title>Delta-protocadherins in health and disease</article-title>. <source>Progress in Molecular Biology and Translational Science</source>, <volume>116</volume>:<fpage>169</fpage>–<lpage>192</lpage>, <year>2013</year>.</mixed-citation></ref>
<ref id="c45"><label>[45]</label><mixed-citation publication-type="journal"><string-name><given-names>Sarah E.W.</given-names> <surname>Light</surname></string-name> and <string-name><given-names>James D.</given-names> <surname>Jontes</surname></string-name>. <article-title>Delta-protocadherins: organizers of neural circuit assembly</article-title>. <source>Seminars in Cell &amp; Developmental Biology</source>, <volume>69</volume>:<fpage>83</fpage>–<lpage>90</lpage>, <year>2017</year>. PMCID: <pub-id pub-id-type="pmcid">PMC5582989</pub-id>.</mixed-citation></ref>
<ref id="c46"><label>[46]</label><mixed-citation publication-type="journal"><string-name><given-names>Stacey</given-names> <surname>Peek</surname></string-name>, <string-name><given-names>Kar Men</given-names> <surname>Mah</surname></string-name>, and <string-name><given-names>Joshua A.</given-names> <surname>Weiner</surname></string-name>. <article-title>Regulation of neural circuit formation by protocadherins</article-title>. <source>Cellular and Molecular Life Sciences</source>, <volume>74</volume>(<issue>22</issue>):<fpage>4133</fpage>–<lpage>4157</lpage>, <year>2017</year>. PMCID: <pub-id pub-id-type="pmcid">PMC5643215</pub-id>.</mixed-citation></ref>
<ref id="c47"><label>[47]</label><mixed-citation publication-type="journal"><string-name><given-names>Bosiljka</given-names> <surname>Tasic</surname></string-name>, <string-name><given-names>Vilas</given-names> <surname>Menon</surname></string-name>, <string-name><given-names>Thuc Nghi</given-names> <surname>Nguyen</surname></string-name>, <string-name><given-names>Tae Kyung</given-names> <surname>Kim</surname></string-name>, <string-name><given-names>Tomasz</given-names> <surname>Jarsky</surname></string-name>, <string-name><given-names>Zizhen</given-names> <surname>Yao</surname></string-name>, <string-name><given-names>Boaz</given-names> <surname>Levi</surname></string-name>, <string-name><given-names>Louie T</given-names> <surname>Gray</surname></string-name>, <string-name><given-names>Stacey A</given-names> <surname>Sorensen</surname></string-name>, <string-name><given-names>Tim</given-names> <surname>Dolbeare</surname></string-name>, <etal>et al.</etal> <article-title>Adult mouse cortical cell taxonomy revealed by single cell transcriptomics</article-title>. <source>Nature Neuroscience</source>, <volume>19</volume>(<issue>2</issue>):<fpage>335</fpage>–<lpage>346</lpage>, <year>2016</year>.</mixed-citation></ref>
<ref id="c48"><label>[48]</label><mixed-citation publication-type="journal"><string-name><given-names>Bosiljka</given-names> <surname>Tasic</surname></string-name>, <string-name><given-names>Zizhen</given-names> <surname>Yao</surname></string-name>, <string-name><given-names>Lucas T</given-names> <surname>Graybuck</surname></string-name>, <string-name><given-names>Kimberly A</given-names> <surname>Smith</surname></string-name>, <string-name><given-names>Thuc Nghi</given-names> <surname>Nguyen</surname></string-name>, <string-name><given-names>Darren</given-names> <surname>Bertagnolli</surname></string-name>, <string-name><given-names>Jeff</given-names> <surname>Goldy</surname></string-name>, <string-name><given-names>Ellen</given-names> <surname>Garren</surname></string-name>, <string-name><given-names>Michael N</given-names> <surname>Economo</surname></string-name>, <string-name><given-names>Sarada</given-names> <surname>Viswanathan</surname></string-name>, <etal>et al.</etal> <article-title>Shared and distinct transcriptomic cell types across neocortical areas</article-title>. <source>Nature</source>, <volume>563</volume>(<issue>7729</issue>):<fpage>72</fpage>–<lpage>78</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="c49"><label>[49]</label><mixed-citation publication-type="journal"><string-name><given-names>Zizhen</given-names> <surname>Yao</surname></string-name>, <string-name><given-names>Cindy T.J.</given-names> <surname>van Velthoven</surname></string-name>, <string-name><given-names>Thuc Nghi</given-names> <surname>Nguyen</surname></string-name>, <string-name><given-names>Jeff</given-names> <surname>Goldy</surname></string-name>, <string-name><given-names>Adriana E.</given-names> <surname>Sedeno-Cortes</surname></string-name>, <string-name><given-names>Fahimeh</given-names> <surname>Baftizadeh</surname></string-name>, <string-name><given-names>Darren</given-names> <surname>Bertagnolli</surname></string-name>, <string-name><given-names>Tamara</given-names> <surname>Casper</surname></string-name>, <string-name><given-names>Megan</given-names> <surname>Chiang</surname></string-name>, <string-name><given-names>Kirsten</given-names> <surname>Crichton</surname></string-name>, <etal>et al.</etal> <article-title>A taxonomy of transcriptomic cell types across the isocortex and hippocampal formation</article-title>. <source>Cell</source>, <volume>184</volume>(<issue>12</issue>):<fpage>3222</fpage>–<lpage>3241</lpage>.e26, <year>2021</year>.</mixed-citation></ref>
<ref id="c50"><label>[50]</label><mixed-citation publication-type="journal"><string-name><given-names>David D.</given-names> <surname>Bock</surname></string-name>, <string-name><given-names>Wei-Chung Allen</given-names> <surname>Lee</surname></string-name>, <string-name><given-names>Alexander M.</given-names> <surname>Kerlin</surname></string-name>, <string-name><given-names>Marga L.</given-names> <surname>Andermann</surname></string-name>, <string-name><given-names>Gregory</given-names> <surname>Hood</surname></string-name>, <string-name><given-names>Ann W.</given-names> <surname>Wetzel</surname></string-name>, <string-name><given-names>Stanislav</given-names> <surname>Yurgenson</surname></string-name>, <string-name><given-names>Elisha R.</given-names> <surname>Soucy</surname></string-name>, <string-name><given-names>Hyun Seok</given-names> <surname>Kim</surname></string-name>, and <string-name><given-names>R. Clay</given-names> <surname>Reid</surname></string-name>. <article-title>Network anatomy and in vivo physiology of visual cortical neurons</article-title>. <source>Nature</source>, <volume>471</volume>:<fpage>177</fpage>–<lpage>182</lpage>, <year>2011</year>.</mixed-citation></ref>
<ref id="c51"><label>[51]</label><mixed-citation publication-type="journal"><string-name><given-names>Wei-Chung Allen</given-names> <surname>Lee</surname></string-name>, <string-name><given-names>Vincent</given-names> <surname>Bonin</surname></string-name>, <string-name><given-names>Michael</given-names> <surname>Reed</surname></string-name>, <string-name><given-names>Brett J.</given-names> <surname>Graham</surname></string-name>, <string-name><given-names>Gregory</given-names> <surname>Hood</surname></string-name>, <string-name><given-names>Kenneth</given-names> <surname>Glattfelder</surname></string-name>, and <string-name><given-names>R. Clay</given-names> <surname>Reid</surname></string-name>. <article-title>Anatomy and function of an excitatory network in the visual cortex</article-title>. <source>Nature</source>, <volume>532</volume>:<fpage>370</fpage>–<lpage>374</lpage>, <year>2016</year>.</mixed-citation></ref>
<ref id="c52"><label>[52]</label><mixed-citation publication-type="journal"><string-name><given-names>Shenqin</given-names> <surname>Yao</surname></string-name>, <string-name><given-names>Quanxin</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>Karla E.</given-names> <surname>Hirokawa</surname></string-name>, <string-name><given-names>Benjamin</given-names> <surname>Ouellette</surname></string-name>, <string-name><given-names>Ruweida</given-names> <surname>Ahmed</surname></string-name>, <string-name><given-names>Jasmin</given-names> <surname>Bomben</surname></string-name>, <string-name><given-names>Krissy</given-names> <surname>Brouner</surname></string-name>, <string-name><given-names>Linzy</given-names> <surname>Casal</surname></string-name>, <string-name><given-names>Shiella</given-names> <surname>Caldejon</surname></string-name>, <string-name><given-names>Andy</given-names> <surname>Cho</surname></string-name>, <string-name><given-names>Nadezhda I.</given-names> <surname>Dotson</surname></string-name>, <string-name><given-names>Tanya L.</given-names> <surname>Daigle</surname></string-name>, <string-name><given-names>Tom</given-names> <surname>Egdorf</surname></string-name>, <string-name><given-names>Rachel</given-names> <surname>Enstrom</surname></string-name>, <string-name><given-names>Amanda</given-names> <surname>Gary</surname></string-name>, <string-name><given-names>Emily</given-names> <surname>Gelfand</surname></string-name>, <string-name><given-names>Melissa</given-names> <surname>Gorham</surname></string-name>, <string-name><given-names>Fiona</given-names> <surname>Griffin</surname></string-name>, <string-name><given-names>Hong</given-names> <surname>Gu</surname></string-name>, <string-name><given-names>Nicole</given-names> <surname>Hancock</surname></string-name>, <string-name><given-names>Robert</given-names> <surname>Howard</surname></string-name>, <string-name><given-names>Leonard</given-names> 
<surname>Kuan</surname></string-name>, <string-name><given-names>Sophie</given-names> <surname>Lambert</surname></string-name>, <string-name><given-names>Eric Kenji</given-names> <surname>Lee</surname></string-name>, and <string-name><given-names>Hongkui</given-names> <surname>Zeng</surname></string-name>. <article-title>A whole-brain monosynaptic input connectome to neuron classes in mouse visual cortex</article-title>. <source>Nature Neuroscience</source>, <volume>26</volume>:<fpage>350</fpage>–<lpage>364</lpage>, <year>2023</year>.</mixed-citation></ref>
<ref id="c53"><label>[53]</label><mixed-citation publication-type="other"><string-name><given-names>Tian</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>Yuri M.</given-names> <surname>Brovman</surname></string-name>, and <string-name><given-names>Sriganesh</given-names> <surname>Madhvanath</surname></string-name>. <article-title>Personalized embedding-based e-commerce recommendations at ebay</article-title>. <source>arXiv preprint</source> <pub-id pub-id-type="arxiv">2102.06156</pub-id>, <year>2021</year>.</mixed-citation></ref>
<ref id="c54"><label>[54]</label><mixed-citation publication-type="other"><string-name><given-names>Yantao</given-names> <surname>Yu</surname></string-name>, <string-name><given-names>Weipeng</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>Zhoutian</given-names> <surname>Feng</surname></string-name>, and <string-name><given-names>Daiyue</given-names> <surname>Xue</surname></string-name>. <article-title>A dual augmented two-tower model for online large-scale recommendation</article-title>. <source>KDD</source>, <year>2021</year>.</mixed-citation></ref>
</ref-list>
</back>
<sub-article id="sa0" article-type="editor-report">
<front-stub>
<article-id pub-id-type="doi">10.7554/eLife.91532.1.sa2</article-id>
<title-group>
<article-title>eLife Assessment</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Nelson</surname>
<given-names>Sacha B</given-names>
</name>
<role specific-use="editor">Reviewing Editor</role>
<aff>
<institution-wrap>
<institution>Brandeis University</institution>
</institution-wrap>
<city>Waltham</city>
<country>United States of America</country>
</aff>
</contrib>
</contrib-group>
<kwd-group kwd-group-type="evidence-strength">
<kwd>Incomplete</kwd>
</kwd-group>
<kwd-group kwd-group-type="claim-importance">
<kwd>Valuable</kwd>
</kwd-group>
</front-stub>
<body>
<p>This is a <bold>valuable</bold> computational study that applies the machine learning method of bilinear modeling to the problem of relating gene expression to connectivity. Specifically, the author attempts to use transcriptomic data from mouse retinal neurons to predict their known connectivity. The results are promising, although the reviewers felt that demonstration of the general applicability of the approach required testing it against a second data set. Hence the present results were felt to provide borderline <bold>incomplete</bold> support for a key premise of the paper.</p>
</body>
</sub-article>
<sub-article id="sa1" article-type="referee-report">
<front-stub>
<article-id pub-id-type="doi">10.7554/eLife.91532.1.sa1</article-id>
<title-group>
<article-title>Reviewer #1 (Public Review):</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<anonymous/>
<role specific-use="referee">Reviewer</role>
</contrib>
</contrib-group>
</front-stub>
<body>
<p>Summary of what the author was trying to achieve:</p>
<p>
In this study, the author aimed to develop a method for estimating neuronal-type connectivity from transcriptomic gene expression data, specifically from mouse retinal neurons. They sought to develop an interpretable model that could be used to characterize the underlying genetic mechanisms of circuit assembly and connectivity.</p>
<p>Strengths:</p>
<p>
The proposed bilinear model draws inspiration from recommendation systems commonly used in machine learning. The author presents the model clearly and addresses critical statistical limitations that may weaken its validity, such as multicollinearity and outliers. The author presents two formulations of the model for separate scenarios in which different levels of data resolution are available. The author effectively references key work in the field when establishing assumptions that affect the underlying model and subsequent results. For example, the correspondence between gene expression cell types and connectivity cell types from different references is clearly outlined in Tables 1-3. The model training and validation are sufficient and yield a relatively high correlation with the ground truth connectivity matrix. Seemingly valid biological assumptions are made throughout; however, some assumptions (such as averaging over cell types) may reduce resolution and thereby miss potentially important single-cell gene expression interactions.</p>
<p>Weaknesses:</p>
<p>
The main results of the study could benefit from replication in another dataset beyond mouse retinal neurons, to validate the proposed method. Dimensionality reduction significantly reduces the resolution of the model, and the PCA methodology employed is largely non-deterministic. This may reduce the resolution and reproducibility of the model. It may be worth exploring how the PCA methodology affects results when replicating. Figure 5, 'Gene signatures associated with the two latent dimensions', is hard to read, and the related results could be outlined more clearly in the results section. There should be more discussion of the weaknesses of the results, e.g., quantification of which connectivity motifs were not captured and which gene signatures might have been missed.</p>
<p>The main weakness is the lack of comparison against other similar methods, e.g. methods presented in</p>
<p>
Barabási, Dániel L., and Albert-László Barabási. &quot;A genetic model of the connectome.&quot; Neuron 105.3 (2020): 435-445.</p>
<p>
Kovács, István A., Dániel L. Barabási, and Albert-László Barabási. &quot;Uncovering the genetic blueprint of the C. elegans nervous system.&quot; Proceedings of the National Academy of Sciences 117.52 (2020): 33570-33577.</p>
<p>
Taylor, Seth R., et al. &quot;Molecular topography of an entire nervous system.&quot; Cell 184.16 (2021): 4329-4347.</p>
<p>Appraisal of whether the author achieved their aims, and whether results support their conclusions:</p>
<p>
The author achieved their aims by recapitulating key connectivity motifs from single-cell gene expression data in the mouse retina. Furthermore, the model setup allowed for insight into gene signatures and interactions; however, the study could have benefited from a deeper evaluation of the accuracy of these signatures. The author claims the method sets a new benchmark for single-cell transcriptomic analysis of synaptic connections. This should be more rigorously proven. (I'm not sure I can speak on the novelty of the method.)</p>
<p>Discussion of the likely impact of the work on the field, and the utility of methods and data to the community:</p>
<p>
This study provides an understandable bilinear model for decoding the genetic programming of neuronal type connectivity. The proposed model leaves the door open for further testing and comparison with alternative linear and/or non-linear models, such as neural network-based models. Beyond more complex models, this model can be extended to include higher-resolution data such as more gene expression dimensions, different types of connectivity measures, and additional omics data.</p>
</body>
</sub-article>
<sub-article id="sa2" article-type="referee-report">
<front-stub>
<article-id pub-id-type="doi">10.7554/eLife.91532.1.sa0</article-id>
<title-group>
<article-title>Reviewer #2 (Public Review):</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<anonymous/>
<role specific-use="referee">Reviewer</role>
</contrib>
</contrib-group>
</front-stub>
<body>
<p>Summary:</p>
<p>
In this study, Mu Qiao employs a bilinear modeling approach, commonly utilized in recommendation systems, to explore the intricate neural connections between different pre- and post-synaptic neuronal types. This approach involves projecting single-cell transcriptomic datasets of pre- and post-synaptic neuronal types into a latent space through transformation matrices. Subsequently, the cross-correlation between these projected latent spaces is employed to estimate neuronal connectivity. To facilitate the model training, connectomic data is used to estimate the ground-truth connectivity map. This work introduces a promising model for the exploration of neuronal connectivity and its associated molecular determinants. However, it is important to note that the current model has only been tested with Bipolar Cell and Retinal Ganglion Cell data, and its applicability in more general neuronal connectivity scenarios remains to be demonstrated.</p>
<p>Strengths:</p>
<p>
This study introduces a succinct yet promising computational model for investigating connections between neuronal types. The model, while straightforward, effectively integrates single-cell transcriptomic and connectomic data to produce a reasonably accurate connectivity map, particularly within the context of retinal connectivity. Furthermore, it successfully recapitulates connectivity patterns and helps uncover the genetic factors that underlie these connections.</p>
<p>Weaknesses:</p>
<p>
1. The study lacks experimental validation of the model's prediction results.</p>
<p>
2. The model's applicability in other neuronal connectivity settings has not been thoroughly explored.</p>
<p>
3. The proposed method relies on the availability of neuronal connectomic data for model training, which may be limited or absent in certain brain connectivity settings.</p>
</body>
</sub-article>
<sub-article id="sa3" article-type="author-comment">
<front-stub>
<article-id pub-id-type="doi">10.7554/eLife.91532.1.sa3</article-id>
<title-group>
<article-title>Author Response</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Qiao</surname>
<given-names>Mu</given-names>
</name>
<role specific-use="author">Author</role>
</contrib>
</contrib-group>
</front-stub>
<body>
<disp-quote content-type="editor-comment">
<p><bold>Response to Reviewer 1:</bold></p>
<p>Summary of what the author was trying to achieve: In this study, the author aimed to develop a method for estimating neuronal-type connectivity from transcriptomic gene expression data, specifically from mouse retinal neurons. They sought to develop an interpretable model that could be used to characterize the underlying genetic mechanisms of circuit assembly and connectivity.</p>
<p>Strengths: The proposed bilinear model draws inspiration from commonly implemented recommendation systems in the field of machine learning. The author presents the model clearly and addresses critical statistical limitations that may weaken the validity of the model such as multicollinearity and outliers. The author presents two formulations of the model for separate scenarios in which varying levels of data resolution are available. The author effectively references key work in the field when establishing assumptions that affect the underlying model and subsequent results. For example, correspondence between gene expression cell types and connectivity cell types from different references are clearly outlined in Tables 1-3. The model training and validation are sufficient and yield a relatively high correlation with the ground truth connectivity matrix. Seemingly valid biological assumptions are made throughout, however, some assumptions may reduce resolution (such as averaging over cell types), thus missing potentially important single-cell gene expression interactions.</p>
</disp-quote>
<p>Thank you for acknowledging the strengths of this work. The decision to average gene expression across individual cells within a given cell type was made in response to the inherent limitations of datasets such as the mouse retina dataset, where connectivity and gene expression are not profiled jointly at the level of individual cells (the second scenario in our paper). This approach was a necessary compromise to enable analysis at the cell type level. However, in datasets where individual cell-level connectivity and gene expression data are matched, such as the C. elegans dataset referenced below, our model can be applied at single-cell resolution (the first scenario in our paper), offering a more detailed understanding of the genetic underpinnings of neuronal connectivity.</p>
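<p>As a minimal sketch of the type-level averaging described above, the following computes one mean expression profile per cell type. The cell-type labels, expression values, and the helper name <monospace>mean_expression_by_type</monospace> are made up for illustration and are not taken from the paper's pipeline.</p>
<code language="python">
```python
from collections import defaultdict


def mean_expression_by_type(cells):
    """Average per-gene expression over all cells sharing a type label.

    `cells` is a list of (type_label, expression_vector) pairs.
    """
    sums = {}
    counts = defaultdict(int)
    for cell_type, expr in cells:
        if cell_type not in sums:
            sums[cell_type] = [0.0] * len(expr)
        # Accumulate the per-gene totals for this type.
        sums[cell_type] = [s + e for s, e in zip(sums[cell_type], expr)]
        counts[cell_type] += 1
    return {t: [s / counts[t] for s in total] for t, total in sums.items()}


# Hypothetical toy data: three cells, two genes, two type labels.
cells = [("BC1", [2.0, 0.0]), ("BC1", [4.0, 2.0]), ("BC2", [1.0, 3.0])]
profiles = mean_expression_by_type(cells)
print(profiles["BC1"])  # [3.0, 1.0]
```
</code>
<p>Averaging in this way discards within-type variability, which is the loss of resolution the reviewer points to.</p>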
<disp-quote content-type="editor-comment">
<p>Weaknesses: The main results of the study could benefit from replication in another dataset beyond mouse retinal neurons, to validate the proposed method. Dimensionality reduction significantly reduces the resolution of the model and the PCA methodology employed is largely non-deterministic. This may reduce the resolution and reproducibility of the model. It may be worth exploring how the PCA methodology of the model may affect results when replicating. Figure 5, ’Gene signatures associated with the two latent dimensions’, lacks some readability and related results could be outlined more clearly in the results section. There should be more discussion on weaknesses of the results e.g. quantification of what connectivity motifs were not captured and what gene signatures might have been missed.</p>
</disp-quote>
<p>I value the suggestion of validating the proposed method in another dataset. The C. elegans dataset in the references the reviewer suggested below is a good candidate for this purpose, and I plan to explore this dataset and incorporate the findings in the revised manuscript. I understand the concerns regarding the PCA methodology and its potential impact on the model’s resolution and reproducibility. In response, alternative methods, such as regularization techniques, will be explored to address these issues. Additionally, I agree that enhancing the clarity and readability of Figure 5, as well as including a more comprehensive discussion of the model’s limitations, would significantly strengthen the manuscript.</p>
<disp-quote content-type="editor-comment">
<p>The main weakness is the lack of comparison against other similar methods, e.g. methods presented in Barabási, Dániel L., and Albert-László Barabási. &quot;A genetic model of the connectome.&quot; Neuron 105.3 (2020): 435-445. Kovács, István A., Dániel L. Barabási, and Albert-László Barabási. &quot;Uncovering the genetic blueprint of the C. elegans nervous system.&quot; Proceedings of the National Academy of Sciences 117.52 (2020): 33570-33577. Taylor, Seth R., et al. &quot;Molecular topography of an entire nervous system.&quot; Cell 184.16 (2021): 4329-4347.</p>
</disp-quote>
<p>Thank you for highlighting the importance of comparing our model with others, particularly those mentioned in your comments. After reviewing these papers, I find that our bilinear model aligns closely with the methods described, especially in [1, 2]. To see this, let’s start with Equation 1 in Kovács et al. [2]:</p>
<disp-formula id="sa3equ1">
<graphic mime-subtype="jpg" xlink:href="elife-91532-sa3-equ1.jpg" mimetype="image"/>
</disp-formula>
<p>In this equation, B represents the connectivity matrix, while X denotes the gene expression patterns of individual neurons in C. elegans. The operator O is the genetic rule operator governing synapse formation, linking connectivity with individual neuronal expression patterns. It is noteworthy that the work of Barabási and Barabási [1] explores a specific application of this framework, focusing on O for B that represents biclique motifs in the C. elegans neural network.</p>
<p>To identify the operator O, the authors sought to minimize the squared residual error:</p>
<disp-formula id="sa3equ2">
<graphic mime-subtype="jpg" xlink:href="elife-91532-sa3-equ2.jpg" mimetype="image"/>
</disp-formula>
<p>with regularization on O.</p>
<p>Adopting the notation from our bilinear model paper and using Z to represent the connectivity matrix, the above becomes</p>
<disp-formula id="sa3equ3">
<graphic mime-subtype="jpg" xlink:href="elife-91532-sa3-equ3.jpg" mimetype="image"/>
</disp-formula>
<p>Coming back to the bilinear model formulation, the optimization problem, as formulated for the C. elegans dataset where individual neuron connectivity and gene expression are accessible, takes the form:</p>
<disp-formula id="sa3equ4">
<graphic mime-subtype="jpg" xlink:href="elife-91532-sa3-equ4.jpg" mimetype="image"/>
</disp-formula>
<p>where we consider each neuron as a distinct neuronal type. In addition, we extend the dimensions of X and Y to encompass the entire set of neurons in C. elegans, with X = Y ∈ R<sup>n×p</sup>, where n is the total number of neurons and p the number of genes.
Accordingly, our optimization problem becomes:</p>
<disp-formula id="sa3equ5">
<graphic mime-subtype="jpg" xlink:href="elife-91532-sa3-equ5.jpg" mimetype="image"/>
</disp-formula>
<p>Upon comparison with the earlier stated equation, it becomes clear that our approach corresponds to setting O = AB<sup>T</sup>, which amounts to a decomposition of the genetic rule operator O. This decomposition extends beyond mere mathematical convenience, offering several substantial benefits reminiscent of those seen in the collaborative filtering of recommendation systems:</p>
<p>•   Computational Efficiency: The primary advantage of this approach is its improved computational efficiency. Solving for O ∈ R<sup>p×p</sup> requires determining p<sup>2</sup> entries. In contrast, solving for A ∈ R<sup>p×d</sup> and B ∈ R<sup>p×d</sup> involves determining only 2pd entries, where p is the number of genes and d is the number of latent dimensions. Assuming the existence of a lower-dimensional latent space (d ≪ p) that captures the essential variability in connectivity, resolving A and B becomes markedly more efficient than resolving O. Additionally, from a computational system design perspective, inferring the connectivity of a neuron allows for caching the latent embeddings of presynaptic neurons XA or postsynaptic neurons XB, with a space complexity on the order of nd. This is significantly more space-efficient than caching XO or OX<sup>T</sup>, which has a space complexity on the order of np. This difference is particularly notable when dealing with large numbers of neurons, such as those in the entire mouse brain. The bilinear modeling approach thus enables effective handling of large datasets, simplifying the optimization problem and reducing computational load, thereby making the model more scalable and faster to execute.</p>
<p>•   Interpretability: The separation into A for presynaptic features and B for postsynaptic features provides a clearer understanding of the distinct roles of pre- and post-synaptic neurons in forming the connection. By projecting the pre- and post-synaptic neurons into a shared latent space through XA and YB, one can identify meaningful representations within each axis, as exemplified by the different motifs in the mouse retina dataset. The linearity of A and B allows direct evaluation of each gene’s contribution to a latent dimension. This interpretability, offering insights into the genetic factors influencing synaptic connections, goes beyond what O alone could provide.</p>
<p>•   Flexibility and Adaptability: The bilinear model’s adaptability is another strength. Much like collaborative filtering, which can manage very different user and item features, our bilinear model can be tailored to synaptic partners with genetic data from varied sources. A potential application of this model is in deciphering the genetic correlates of long-range projectomic rules, where pre- and post-synaptic neurons are processed and sequenced separately, or even involving post-synaptic targets being brain regions with genetic information acquired through bulk sequencing. This level of flexibility also allows for model adjustments or extensions to incorporate other biological factors, such as proteomics, thereby broadening its utility across various research inquiries into the determinants of neuronal connectivity.</p>
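<p>The efficiency argument above can be illustrated with a small sketch. It assumes the predicted connectivity takes the form XOX<sup>T</sup> with O = AB<sup>T</sup>, so that the same scores can be obtained from the cached embeddings as (XA)(XB)<sup>T</sup>. The matrix sizes, random values, and helper names are illustrative assumptions, and no fitting is performed here.</p>
<code language="python">
```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def random_matrix(rows, cols, rng):
    """Fill a rows x cols matrix with standard normal draws."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def parameter_counts(p, d):
    """Entries to estimate: full operator O (p x p) vs. factors A and B (p x d each)."""
    return p * p, 2 * p * d


rng = random.Random(0)
n, p, d = 6, 5, 2             # toy sizes (assumed): neurons, genes, latent dimensions
X = random_matrix(n, p, rng)  # expression matrix, one row per neuron
A = random_matrix(p, d, rng)  # presynaptic projection
B = random_matrix(p, d, rng)  # postsynaptic projection

# Route 1: build the full operator O = A B^T, then score with X O X^T.
O = matmul(A, transpose(B))
Z_full = matmul(matmul(X, O), transpose(X))

# Route 2: cache the n x d embeddings XA and XB, then take their inner products.
pre_embed = matmul(X, A)
post_embed = matmul(X, B)
Z_lowrank = matmul(pre_embed, transpose(post_embed))

# Largest entrywise difference between the two routes (floating-point noise only).
max_gap = max(abs(u - v) for ru, rv in zip(Z_full, Z_lowrank) for u, v in zip(ru, rv))

full_params, factored_params = parameter_counts(p=1000, d=2)
print(full_params, factored_params)  # 1000000 4000
```
</code>
<p>In this toy setting the two routes agree up to floating-point error, while the factored form stores only the n×d embeddings and estimates 2pd rather than p<sup>2</sup> entries.</p>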
<p>In the study by Taylor et al. [3], the authors introduced a generalization of differential gene expression (DGE) analysis called network DGE (nDGE) to identify genetic determinants of synaptic connections. It focuses on genes co-expressed across connected pairs of neurons, compared with pairs without a connection.</p>
<p>As the authors acknowledged in the Methods section of the paper, nDGE can only examine single genes co-expressed at synaptic terminals: &quot;While the nDGE technique introduced here is a generalization of standard DGE, interrogating the contribution of pairs of genes in the formation and maintenance of synapses between pairs of neurons, nDGE can only account for a single co-expressed gene in either of the two synaptic terminals (pre/post).&quot;</p>
<p>In contrast, the bilinear model offers a more comprehensive analysis by seeking a linear combination of gene expressions in both pre- and post-synaptic neurons. This model goes beyond the scope of examining individual co-expressed genes, as it incorporates different weights for the gene expressions of pre- and post-synaptic neurons. This feature of the bilinear model enables it to capture not only homogeneous but also complex and heterogeneous genetic interactions that are pivotal in synaptic connectivity. This highlights the bilinear model’s capability to delve into the intricate interactions of synaptic gene expression.</p>
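<p>To make the contrast concrete, the sketch below expands a bilinear score into a weighted sum over pairs of genes, one in the presynaptic neuron and one in the postsynaptic neuron. The weights in A and B are hand-picked toy values, not fitted parameters, and the function names are hypothetical. This is not an implementation of nDGE; it only shows how a single bilinear score can combine a same-gene term and a different-gene term.</p>
<code language="python">
```python
def pair_weights(A, B):
    """W[i][j] = sum_k A[i][k] * B[j][k]: weight on (pre gene i, post gene j)."""
    return [[sum(a * b for a, b in zip(row_a, row_b)) for row_b in B] for row_a in A]


def bilinear_score(x, y, A, B):
    """Score (x A) . (y B) for one presynaptic profile x and one postsynaptic profile y."""
    d = len(A[0])
    pre = [sum(x[i] * A[i][k] for i in range(len(x))) for k in range(d)]
    post = [sum(y[j] * B[j][k] for j in range(len(y))) for k in range(d)]
    return sum(u * v for u, v in zip(pre, post))


# Toy, made-up weights for 3 genes and 2 latent dimensions.
A = [[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]]  # presynaptic gene weights
B = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]  # postsynaptic gene weights

W = pair_weights(A, B)
# W[0][0] = 1.0 weights gene 0 expressed on both sides (same-gene pairing).
# W[1][2] = 2.0 weights pre gene 1 with post gene 2 (different-gene pairing).

x = [1.0, 1.0, 0.0]  # presynaptic expression (hypothetical)
y = [1.0, 0.0, 1.0]  # postsynaptic expression (hypothetical)

score = bilinear_score(x, y, A, B)
pairwise = sum(x[i] * W[i][j] * y[j] for i in range(3) for j in range(3))
print(score, pairwise)  # 3.0 3.0
```
</code>
<p>Here the diagonal entry of W plays the role of a same-gene interaction and the off-diagonal entry a different-gene interaction, and both contribute to the same score.</p>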
<disp-quote content-type="editor-comment">
<p>Appraisal of whether the author achieved their aims, and whether results support their conclusions: The author achieved their aims by recapitulating key connectivity motifs from single-cell gene expression data in the mouse retina. Furthermore, the model setup allowed for insight into gene signatures and interactions, however could have benefited from a deeper evaluation of the accuracy of these signatures. The author claims the method sets a new benchmark for single-cell transcriptomic analysis of synaptic connections. This should be more rigorously proven. (I’m not sure I can speak on the novelty of the method)</p>
</disp-quote>
<p>I value your appraisal. In response, additional validation of the bilinear model on a second dataset will be undertaken.</p>
<disp-quote content-type="editor-comment">
<p>Discussion of the likely impact of the work on the field, and the utility of methods and data to the community: This study provides an understandable bilinear model for decoding the genetic programming of neuronal type connectivity. The proposed model leaves the door open for further testing and comparison with alternative linear and/or non-linear models, such as neural network-based models. In addition to more complex models, this model can be built on to include higher resolution data such as more gene expression dimensions, different types of connectivity measures, and additional omics data.</p>
</disp-quote>
<p>Thank you for your positive assessment of the potential impact of the study.</p>
<disp-quote content-type="editor-comment">
<p><bold>Response to Reviewer 2:</bold></p>
<p>Summary: In this study, Mu Qiao employs a bilinear modeling approach, commonly utilized in recommendation systems, to explore the intricate neural connections between different pre- and post-synaptic neuronal types. This approach involves projecting single-cell transcriptomic datasets of pre- and post-synaptic neuronal types into a latent space through transformation matrices. Subsequently, the cross-correlation between these projected latent spaces is employed to estimate neuronal connectivity. To facilitate the model training, connectomic data is used to estimate the ground-truth connectivity map. This work introduces a promising model for the exploration of neuronal connectivity and its associated molecular determinants. However, it is important to note that the current model has only been tested with Bipolar Cell and Retinal Ganglion Cell data, and its applicability in more general neuronal connectivity scenarios remains to be demonstrated.</p>
<p>Strengths: This study introduces a succinct yet promising computational model for investigating connections between neuronal types. The model, while straightforward, effectively integrates single-cell transcriptomic and connectomic data to produce a reasonably accurate connectivity map, particularly within the context of retinal connectivity. Furthermore, it successfully recapitulates connectivity patterns and helps uncover the genetic factors that underlie these connections.</p>
</disp-quote>
<p>Thank you for your positive assessment of the paper.</p>
<disp-quote content-type="editor-comment">
<p>Weaknesses:</p>
<p>1. The study lacks experimental validation of the model’s prediction results.</p>
</disp-quote>
<p>Thank you for pointing out the importance of experimental validation. I acknowledge that the current version of the study is focused on the development and validation of the computational model, using the datasets presently available to us. Moving forward, I plan to collaborate with experimental neurobiologists. These collaborations are aimed at validating our model’s predictions, including the delta-protocadherins mentioned in the paper. However, considering the extensive time and resources required for conducting and interpreting experimental results, I believe it is more pragmatic to present a comprehensive experimental study, including the design and execution of experiments informed by the model’s predictions, in a separate follow-up paper. I intend to include a paragraph in the discussion of this paper outlining the future direction for experimental validation.</p>
<disp-quote content-type="editor-comment">
<p>2. The model’s applicability in other neuronal connectivity settings has not been thoroughly explored.</p>
</disp-quote>
<p>I recognize the importance of assessing the model across different neuronal systems. In response to similar feedback from Reviewer 1, I plan to extend the study to include the C. elegans dataset mentioned earlier. The results from applying our bilinear model to this second dataset will be incorporated into the revised manuscript.</p>
<disp-quote content-type="editor-comment">
<p>3. The proposed method relies on the availability of neuronal connectomic data for model training, which may be limited or absent in certain brain connectivity settings.</p>
</disp-quote>
<p>The concern regarding the dependency of our model on the availability of connectomic data is valid. While complete connectomes are available for organisms like C. elegans and Drosophila, and efforts are underway to map the connectome of the entire mouse brain, such data may not be accessible in all research contexts. Recognizing this limitation, part of the ongoing research is to explore ways to adapt our model to other available data, such as projectomic data. Furthermore, our bilinear model is compatible with trans-synaptic virus-based sequencing techniques [4, 5], allowing us to leverage data from these experimental approaches to uncover the genetic underpinnings of neuronal connectivity. These initiatives are crucial steps towards broadening the applicability of our model to brain connectivity studies where detailed connectomic data may not be readily available.</p>
<p>References</p>
<p>[1] Dániel L. Barabási and Albert-László Barabási. A genetic model of the connectome. Neuron, 105(3):435–445, 2020.</p>
<p>[2] István A. Kovács, Dániel L. Barabási, and Albert-László Barabási. Uncovering the genetic blueprint of the C. elegans nervous system. Proceedings of the National Academy of Sciences, 117(52):33570–33577, 2020.</p>
<p>[3] Seth R. Taylor, Gabriel Santpere, Alexis Weinreb, Alec Barrett, Molly B. Reilly, Chuan Xu, Erdem Varol, Panos Oikonomou, Lori Glenwinkel, Rebecca McWhirter, Abigail Poff, Manasa Basavaraju, Ibnul Rafi, Eviatar Yemini, Steven J. Cook, Alexander Abrams, Berta Vidal, Cyril Cros, Saeed Tavazoie, Nenad Sestan, Marc Hammarlund, Oliver Hobert, and David M. 3rd Miller. Molecular topography of an entire nervous system. Cell, 184(16):4329–4347, 2021.</p>
<p>[4] Nicole Y. Tsai, Fei Wang, Kenichi Toma, Chen Yin, Jun Takatoh, Emily L. Pai, Kongyan Wu, Angela C. Matcham, Luping Yin, Eric J. Dang, Denise K. Marciano, John L. Rubenstein, Fan Wang, Erik M. Ullian, and Xin Duan. Trans-seq maps a selective mammalian retinotectal synapse instructed by nephronectin. Nat Neurosci, 25(5):659–674, May 2022.</p>
<p>[5] Aixin Zhang, Lei Jin, Shenqin Yao, Makoto Matsuyama, Cindy van Velthoven, Heather Sullivan, Na Sun, Manolis Kellis, Bosiljka Tasic, Ian R. Wickersham, and Xiaoyin Chen. Rabies virus-based barcoded neuroanatomy resolved by single-cell RNA and in situ sequencing. bioRxiv, 2023.</p>
</body>
</sub-article>
</article>