Encoding sensory and motor patterns as time-invariant trajectories in recurrent neural networks

  1. Vishwa Goudar (corresponding author)
  2. Dean V Buonomano (corresponding author)

  1. University of California, Los Angeles, United States
8 figures, 1 video and 1 additional file

Figures

Figure 1 with 1 supplement
Trained RNNs perform a sensorimotor spoken-to-handwritten digit transcription task

(A) Transcription task. The spectrogram of a spoken digit, e.g. ‘two’, is transformed to a 12-channel cochleogram that serves as the continuous-time input to an RNN during the sensory epoch of each …

https://doi.org/10.7554/eLife.31134.002
Figure 1—figure supplement 1
Transcription performance of an RNN trained only during the motor epoch.

Overlaid outputs of a motor-trained RNN (N = 4000) for five sample utterances of each of the 10 digits used in the spoken-to-handwritten digit transcription task. Transcription performance of the …

https://doi.org/10.7554/eLife.31134.003
Figure 2
Digit transcription is robust to perturbations during the sensory and motor epochs.

(A) Schematic of a perturbation experiment. The motor trajectory of a trained RNN (N = 2100; 100 sample units shown) for the spoken digit ‘three’ is perturbed with a 25 ms pulse (amplitude = 2). …

https://doi.org/10.7554/eLife.31134.005
Figure 3
Trained RNNs generate convergent continuous neural trajectories in response to different instances of the same spatiotemporal object.

(A–B) Neural activity patterns of three sample units of a reservoir (A) and trained (B) network, in response to a trained and a novel utterance each of the digits ‘six’ (red traces) and ‘eight’ …

https://doi.org/10.7554/eLife.31134.006
Figure 4
Trained RNNs encode both sensory and motor objects as well-separated neural trajectories.

(A) Euclidean distance between trajectories of the same digit (within-digit) versus those of different digits (between-digit). At each time step, the trajectory distances represent the mean and SD …

https://doi.org/10.7554/eLife.31134.007
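As a rough illustration of the within-digit versus between-digit comparison described in panel A, the sketch below computes the time-averaged Euclidean distance between pairs of trajectories and groups the pairs by digit label. The function names, the toy two-unit trajectories, and the decision to average over time steps are assumptions made for illustration; this is not the authors' analysis code.

```python
import math
from itertools import combinations


def mean_trajectory_distance(traj_a, traj_b):
    """Time-averaged Euclidean distance between two equal-length trajectories.

    Each trajectory is a list of state vectors, one per time step.
    """
    if len(traj_a) != len(traj_b) or not traj_a:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, q) for p, q in zip(traj_a, traj_b)) / len(traj_a)


def within_between(trajectories, labels):
    """Return (mean within-label distance, mean between-label distance)."""
    within, between = [], []
    for i, j in combinations(range(len(trajectories)), 2):
        d = mean_trajectory_distance(trajectories[i], trajectories[j])
        (within if labels[i] == labels[j] else between).append(d)
    if not within or not between:
        raise ValueError("need at least one same-label and one different-label pair")
    return sum(within) / len(within), sum(between) / len(between)


# Hypothetical toy data: two time steps, two units per state vector.
trajectories = [
    [[0.0, 0.0], [1.0, 0.0]],  # first utterance of 'six'
    [[0.0, 1.0], [1.0, 1.0]],  # second utterance of 'six'
    [[3.0, 4.0], [4.0, 4.0]],  # one utterance of 'eight'
]
labels = ["six", "six", "eight"]

within_mean, between_mean = within_between(trajectories, labels)
print(within_mean, between_mean)
```

In this toy case the two 'six' trajectories stay closer to each other than either does to the 'eight' trajectory, which is the qualitative pattern the figure title describes for trained networks.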
Figure 5
Trajectory separation in reservoir and trained RNNs as a function of input amplitude.

(A) Comparison of mean within- and between-digit distances of the sensory epoch trajectories in reservoir and trained networks (N = 2100) at different input amplitudes. Bars represent mean of the …

https://doi.org/10.7554/eLife.31134.008
Figure 6 with 1 supplement
Robustness to spectral noise depends on the spatiotemporal structure of the inputs that the network is exposed to during training.

(A) Spectral noise in the inputs to an RNN (N = 2100) during presentations of digit zero. Sample noise is the difference between the external input (each row reflects net external input to a unit in …

https://doi.org/10.7554/eLife.31134.009
Figure 6—figure supplement 1
Schematic description of the decomposition of trajectories and trajectory separation into recurrent and input components.

(A) Left panel. Schematic of the evolution of a trajectory x(t). Right panel. The evolution of the trajectory in phase space between time steps t and t + 1 (dx(t)/dt) can be decomposed into three component …

https://doi.org/10.7554/eLife.31134.010
Figure 7
Invariance of encoding trajectories to temporally warped spoken digits.

(A) Temporally warped input cochleograms for an utterance of the digit ‘nine’ (left panel), warped by a factor of 2x (upper row) and 0.5x (lower row). Distance matrices between the trajectories …

https://doi.org/10.7554/eLife.31134.011
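To make the idea of a temporally warped input concrete, the sketch below resamples a single input channel by linear interpolation so that its duration is scaled by a warp factor. The caption does not say how the warping was implemented, so the interpolation scheme, the function name `warp_in_time`, and the convention that a factor above 1 lengthens the signal are all assumptions.

```python
def warp_in_time(signal, factor):
    """Resample a 1-D signal so that its duration is scaled by `factor`.

    Assumed convention: factor > 1 stretches the signal (more samples),
    factor < 1 compresses it. Values are linearly interpolated.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_in = len(signal)
    if n_in < 2:
        raise ValueError("signal needs at least two samples")
    n_out = max(2, round((n_in - 1) * factor) + 1)
    warped = []
    for k in range(n_out):
        # Position of output sample k on the original time axis.
        pos = k * (n_in - 1) / (n_out - 1)
        lo = min(int(pos), n_in - 2)
        frac = pos - lo
        warped.append((1.0 - frac) * signal[lo] + frac * signal[lo + 1])
    return warped


# Hypothetical single channel; a 12-channel input would be warped per channel.
channel = [0.0, 1.0, 2.0, 3.0, 4.0]
stretched = warp_in_time(channel, 2.0)
compressed = warp_in_time(channel, 0.5)
print(len(stretched), len(compressed))
```

Under these assumptions, a multi-channel cochleogram would be warped by applying the same function to each channel with the same factor, so that the spectral content is preserved while the timing changes.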
Figure 8 with 2 supplements
Mechanism of temporal scaling invariance.

(A) Time-averaged linear speed (v) during the sensory epoch trajectories in the reservoir and trained networks (N = 2100) compared to the ideal linear speed, over a range of warp factors. The speeds …

https://doi.org/10.7554/eLife.31134.012
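The time-averaged linear speed mentioned in panel A can be sketched as the mean distance travelled per unit time along a trajectory. The code below is a minimal version under stated assumptions: the function name, the toy trajectories, and the step size `dt` are hypothetical, and treating the "ideal" speed as the reference speed divided by the warp factor is one reading of the caption, not something the truncated text confirms.

```python
import math


def time_averaged_speed(trajectory, dt):
    """Mean distance travelled per unit time along a sampled trajectory."""
    if len(trajectory) < 2 or dt <= 0:
        raise ValueError("need at least two points and a positive dt")
    steps = [math.dist(p, q) for p, q in zip(trajectory, trajectory[1:])]
    return sum(steps) / (len(steps) * dt)


# Hypothetical reference trajectory and a version that follows the same
# path in state space but takes twice as long (warp factor 2).
reference = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
warped_2x = [[0.0, 0.0], [1.5, 2.0], [3.0, 4.0], [4.5, 6.0], [6.0, 8.0]]

v_reference = time_averaged_speed(reference, dt=1.0)
v_warped = time_averaged_speed(warped_2x, dt=1.0)
print(v_reference, v_warped)  # 5.0 2.5
```

In this toy case the warped trajectory covers the same path at half the speed, which is what a perfectly time-invariant encoding of a 2x-stretched input would require under the assumption above.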
Figure 8—figure supplement 1
Phase space relationships in the input and recurrent subspaces as a function of the warp factor.

(A) Projections in PCA space of the input (left) and recurrent (right) subspace trajectories that compose the sensory epoch trajectories of a trained network (N = 2100), in response to warped …

https://doi.org/10.7554/eLife.31134.013
Figure 8—figure supplement 2
Temporal invariance performance and mechanism in a network trained with backpropagation through time (BPTT).

(A) Mean time-averaged Euclidean distance between sensory trajectories encoding warped and reference utterances of the 10 digits, in a network trained with BPTT, over a range of warp factors. Dashed …

https://doi.org/10.7554/eLife.31134.014

Videos

Video 1
A trained RNN performs the sensorimotor spoken-to-handwritten digit transcription task on novel utterances.

A trained RNN (N = 4000) and its output units perform the transcription task on five novel utterances (five different speakers). The last two utterances illustrate RNN performance on same-digit …

https://doi.org/10.7554/eLife.31134.004
