The RTRBM extends the RBM by additionally accounting for temporal interactions between neural assemblies. (A) Schematic depiction of an RBM with visible units (neurons) on the left and hidden units (neural assemblies) on the right. The visible and hidden units are connected through a set of weights W. (B) An example W matrix in which a subset of visible units is connected to one hidden unit. Details of the equations in panels B and E are given in Methods and Materials. (C) Hidden and visible activity traces generated by sampling from the RBM. Because the RBM is static, its samples do not exhibit any sequential activation pattern and merely show a stochastic exploration of the population activity patterns. (D) Schematic depiction of an RTRBM. The RTRBM formulation matches the static connectivity of the RBM, but extends it with the weight matrix U to model temporal dependencies between the hidden units. (E) In the present example, assembly 1 excites assembly 2, assembly 2 excites assembly 3, and assembly 3 excites assembly 1, while the remaining connections are set to 0. (F) Hidden and visible activity traces generated by sampling from the RTRBM. In contrast to the RBM samples, the RTRBM samples feature a sequential firing pattern, which the temporal weight matrix U makes possible by modeling dependencies between hidden units across time steps.
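
The cyclic structure described for panel E can be illustrated with a small sketch. This is a deliberately simplified toy, not the RTRBM sampling procedure itself: it propagates only binary hidden units through a sigmoid and omits the visible units and the expected hidden states used by the full model. The weight magnitude (4), the bias (-2), the indexing convention for U, and the function names are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Assumed convention: U[i][j] is the weight from hidden unit j at time t-1
# to hidden unit i at time t. Assembly 1 excites 2, 2 excites 3, 3 excites 1;
# all remaining connections are 0. The magnitude 4.0 is an arbitrary choice.
U = [
    [0.0, 0.0, 4.0],
    [4.0, 0.0, 0.0],
    [0.0, 4.0, 0.0],
]
BIAS = -2.0  # assumed negative bias so that units are mostly silent without input


def hidden_probs(h_prev):
    """Activation probability of each hidden unit given the previous hidden state."""
    n = len(U)
    return [sigmoid(BIAS + sum(U[i][j] * h_prev[j] for j in range(n))) for i in range(n)]


def sample_hidden(n_steps, seed=0):
    """Sample a binary hidden trace, starting with only assembly 1 active."""
    rng = random.Random(seed)
    h = [1, 0, 0]
    trace = [h]
    for _ in range(n_steps - 1):
        h = [1 if rng.random() < p else 0 for p in hidden_probs(h)]
        trace.append(h)
    return trace


if __name__ == "__main__":
    for t, h in enumerate(sample_hidden(12)):
        print(t, h)
```

Under these assumptions, an active assembly 1 makes assembly 2 the most likely unit to fire at the next step, which is the kind of sequential pattern that a model without U cannot express.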

The RTRBM outperforms the RBM on sequential statistics on simulated data. (A) Simulated data generation: hidden units interact over time to generate firing rate traces, which are used to sample a Poisson train. For example, assembly 1 drives assembly 2 and inhibits assembly 10, both at a delay of a single time step. (B) Schematic depiction of the RBM and RTRBM trained on the simulated data. (C) For the RBM, the aligned weight matrix Ŵ contains spurious off-diagonal weights, while the RTRBM identifies the correct diagonal structure (top). For the assembly weights U (left), the RTRBM also converges to similar aligned temporal weights Û (right). (D) The RTRBM attributes only a single strong weight to each visible unit (wi,j > 0.5σ, where σ is the standard deviation of W), consistent with the specification in W, while the RBM assigns multiple significant weights per visible unit. (E) The RBM and RTRBM perform similarly for concurrent (⟨vi⟩, ⟨vi vj⟩) statistics, but the RTRBM provides more accurate estimates for sequential statistics. In all panels, the abscissa refers to the data statistics in the test set, while the ordinate shows data sampled from the two models respectively. (F) The trained RTRBM and the RBM yield similar concurrent moments, but the RTRBM significantly outperforms the RBM on time-shifted moments (see text for details on statistics). (G) The RTRBM achieves significantly higher accuracy than the RBM when predicting ahead in time from the current state, for up to 4 time steps.
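
The difference between concurrent and sequential statistics in panels E and F can be made concrete with a small sketch. It assumes binary activity stored as a list of time steps, and it assumes that a time-shifted moment is the average of vi(t)·vj(t+shift) over the available time steps; the function names are hypothetical and the exact definitions used for the figure are given in the text.

```python
def mean_activation(v):
    """<v_i>: mean of each unit over time. v is a list of T time steps,
    each a list of N binary values."""
    n_steps, n_units = len(v), len(v[0])
    return [sum(v[t][i] for t in range(n_steps)) / n_steps for i in range(n_units)]


def pairwise_moment(v, shift=0):
    """<v_i(t) v_j(t + shift)>: shift=0 gives the concurrent moment,
    shift=1 the one-step time-shifted moment (assumed definition)."""
    n_units = len(v[0])
    n_pairs = len(v) - shift  # number of (t, t + shift) pairs available
    return [
        [sum(v[t][i] * v[t + shift][j] for t in range(n_pairs)) / n_pairs
         for j in range(n_units)]
        for i in range(n_units)
    ]


if __name__ == "__main__":
    # Toy trace: unit 0 and unit 1 strictly alternate, unit 0 firing first.
    v = [[1, 0], [0, 1], [1, 0], [0, 1]]
    print(mean_activation(v))
    print(pairwise_moment(v, shift=0))
    print(pairwise_moment(v, shift=1))
```

In this toy trace the two units never fire together, so the concurrent cross-moment is zero, while the time-shifted moment from unit 0 to unit 1 is large. A model can therefore match the concurrent statistics and still miss the ordering that the time-shifted statistics capture.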

Figure 2—figure supplement 1. Alignment of weight matrices after learning.

The RTRBM often outperforms the cRBM on zebrafish data. (A) Whole-brain neural activity of larval zebrafish was imaged via calcium indicators using light-sheet microscopy at single-neuron resolution (left). Calcium activity (middle, blue) is deconvolved by blind, sparse deconvolution to obtain a binarized spike train (middle, black). The binarized neural activity of 1000 randomly chosen neurons (right). (B) Left: Distribution of all visible-to-hidden weights. A strong weight is determined by proportional thresholding, wi,j > wthr, where wthr is set such that 5000 neurons have a strong connection towards the hidden layer. Right: log-weight distribution of the visible-to-hidden connectivity. (C) The RTRBM extracts sample assemblies (color indicates assembly) by selecting, for a given hidden unit, the visible units whose connections exceed the threshold defined in (B). Temporal connections (inhibitory: blue, excitatory: red) between assemblies are depicted across time steps. (D) Temporal connections between the assemblies are sorted by agglomerative clustering (dashed lines separate clusters, colormap is clamped to the range [−1, 1] for visibility). Details on the clustering method can be found in Methods and Materials. (E) Corresponding receptive fields of the clusters identified in (D), where the visible units with strong weights are selected as in (B). (F) Comparative analysis between the cRBM (bottom row) and RTRBM (top row) on inferred model statistics and data statistics (test dataset), compared in terms of Spearman correlations and sum of squared differences. From left to right: the RTRBM significantly outperformed the cRBM on the mean activations ⟨vi⟩ (p < ϵ), pairwise neuron-neuron interactions ⟨vi vj⟩ (p < ϵ), time-shifted pairwise neuron-neuron interactions, and time-shifted pairwise hidden-hidden interactions for example fish 4. (G) The methodology in panel F is extended to analyze datasets from eight individual fish. Spearman correlation and the assessment of significant differences between both models are determined using a bootstrap method (see Methods and Materials for details).
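
The proportional thresholding in panel B and the assembly extraction in panel C can be sketched as follows. The sketch assumes that W is stored as hidden units by visible units, and that a neuron counts as strongly connected when at least one of its weights exceeds wthr; both are assumptions for illustration, as are the function names, and ties between weights are ignored.

```python
def proportional_threshold(W, n_strong):
    """Return w_thr such that n_strong visible units have at least one
    weight strictly above it. W[j][i] is the weight between hidden unit j
    and visible unit i (assumed layout)."""
    n_visible = len(W[0])
    if not 0 < n_strong < n_visible:
        raise ValueError("n_strong must lie strictly between 0 and the number of visible units")
    # Strongest weight of every visible unit, largest first.
    strongest = sorted((max(row[i] for row in W) for i in range(n_visible)), reverse=True)
    # Using the (n_strong + 1)-th largest value as threshold leaves exactly
    # the top n_strong units strictly above it (assuming no ties).
    return strongest[n_strong]


def extract_assemblies(W, w_thr):
    """For each hidden unit, list the visible units with weight above w_thr."""
    return [[i for i, w in enumerate(row) if w > w_thr] for row in W]


if __name__ == "__main__":
    W = [
        [0.9, 0.1, 0.05, 0.7],
        [0.0, 0.8, 0.2, 0.1],
    ]
    w_thr = proportional_threshold(W, n_strong=3)
    print(w_thr, extract_assemblies(W, w_thr))
```

Fixing the number of strongly connected neurons rather than the threshold value keeps the selection comparable across models whose weights are on different scales.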

Neural interaction timescale can be identified via RTRBM estimates over multiple timescales. (A) Training paradigm. Simulated data is generated as in Figure 2, but with temporal interactions between populations at a delay of ΔtA = 4 time steps. This data is downsampled according to a downsampling rate ΔtD by taking every ΔtD-th time step (shown here is ΔtD = 4), and used for training different RTRBMs. (B) Performance of the RTRBM for various downsampling rates, measured as the accuracy in predicting the visible units one time step ahead (N = 10 models per ΔtD). Dotted line shows the mean estimate of the upper bound on the accuracy ± SEM (N = 10,000) due to inherent variance in the way the data is generated (see Methods and Materials). Dashed gray line indicates the theoretical performance of an uninformed, unbiased estimator. (C) Cosine similarity between the interaction matrix U and the aligned learned matrices Û, both z-scored. Bars and error bars show mean and standard deviation respectively across the N = 10 models per ΔtD. Dark lines show absolute values of the mean cosine similarity. Shown above are the Û matrices with the largest absolute cosine similarity per downsampling rate.
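
The downsampling in panel A and the similarity measure in panel C can be sketched in a few lines. The sketch assumes that data is a list of time steps and that each matrix is z-scored over all of its entries before the cosine similarity is taken; the function names are hypothetical.

```python
import math
import statistics


def downsample(data, rate):
    """Keep every rate-th time step of a list of time steps."""
    return data[::rate]


def zscore(values):
    """Z-score a flat list using the population standard deviation."""
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot z-score a constant matrix")
    return [(x - mu) / sd for x in values]


def cosine_similarity(A, B):
    """Cosine similarity between two equally shaped matrices, each
    z-scored over all entries first (assumed normalization)."""
    a = zscore([x for row in A for x in row])
    b = zscore([x for row in B for x in row])
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(downsample(list(range(12)), 4))
    U = [[0.0, 1.0], [2.0, 0.0]]
    print(cosine_similarity(U, U))
```

Because z-scoring removes the mean of each matrix, this quantity equals the Pearson correlation between the entries of the two matrices, so a value near −1 indicates a learned matrix with the right structure but inverted sign.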

An alignment procedure enables comparison between the learned temporal weight matrix Û and the original temporal matrix used to generate the data. (A) The link between hidden units and assemblies of visible units can often be clearly identified within the weight matrix, making it possible to order the hidden units so that their ordering matches that of the assemblies. This leads to a shuffling of the rows in the Ŵ matrix. The sign of the mean weight between an assembly and its matched hidden unit is used to identify inverse relations. (B) The reordering of the hidden units leads to a reshuffling of both the rows and the columns of the Û matrix. The sign of an entry is switched when the two hidden units have opposing signs with respect to their matched assemblies of visible units.
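
The alignment described in panels A and B can be sketched as below. The matching rule used here (greedily assigning to each assembly the unmatched hidden unit with the largest absolute mean weight) is an assumption chosen for simplicity and may differ from the procedure actually used; the matrix layouts and the function name are likewise hypothetical.

```python
def align_weights(W, U, assemblies):
    """Reorder hidden units to follow the assembly ordering.

    W[h][i]: weight between hidden unit h and visible unit i (assumed layout).
    U[h][k]: temporal weight between hidden units h and k.
    assemblies: one list of visible-unit indices per assembly; the number of
    assemblies must equal the number of hidden units.
    """
    n = len(W)
    if len(assemblies) != n:
        raise ValueError("need exactly one assembly per hidden unit")
    # Mean weight between every hidden unit and every assembly.
    mean_w = [[sum(W[h][i] for i in members) / len(members) for members in assemblies]
              for h in range(n)]
    order, signs, taken = [], [], set()
    for a in range(n):
        # Greedy match (assumption): strongest remaining hidden unit wins.
        h = max((k for k in range(n) if k not in taken), key=lambda k: abs(mean_w[k][a]))
        taken.add(h)
        order.append(h)
        signs.append(1 if mean_w[h][a] >= 0 else -1)  # -1 marks an inverse relation
    # Rows of W are shuffled into assembly order.
    W_hat = [list(W[h]) for h in order]
    # Rows and columns of U are shuffled; an entry changes sign when the two
    # hidden units have opposing signs to their matched assemblies.
    U_hat = [[signs[a] * signs[b] * U[order[a]][order[b]] for b in range(n)]
             for a in range(n)]
    return W_hat, U_hat, order, signs


if __name__ == "__main__":
    W = [[-0.1, 0.0, -0.9, -0.8], [0.9, 0.8, 0.0, 0.1]]
    U = [[0.0, 2.0], [3.0, 0.0]]
    print(align_weights(W, U, assemblies=[[0, 1], [2, 3]]))
```

In the toy example, hidden unit 1 matches the first assembly with a positive sign and hidden unit 0 matches the second with a negative sign, so both off-diagonal entries of U swap places and change sign.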