(A) Transcription task. The spectrogram of a spoken digit, e.g. ‘two’, is transformed to a 12-channel cochleogram that serves as the continuous-time input to an RNN during the sensory epoch of each …
Overlaid outputs of a motor-trained RNN (N = 4000) for five sample utterances of each of the 10 digits used in the spoken-to-handwritten digit transcription task. Transcription performance of the …
(A) Schematic of a perturbation experiment. The motor trajectory of a trained RNN (N = 2100; 100 sample units shown) for the spoken digit ‘three’ is perturbed with a 25 ms pulse (amplitude = 2). …
(A–B) Neural activity patterns of three sample units of a reservoir (A) and trained (B) network, in response to a trained and a novel utterance each of the digits ‘six’ (red traces) and ‘eight’ …
(A) Euclidean distance between trajectories of the same digit (within-digit) versus those of different digits (between-digit). At each time step, the trajectory distances represent the mean and SD …
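The within-digit versus between-digit comparison can be illustrated with a minimal sketch. It assumes each trajectory is a list of equal-length state vectors sampled at the same time steps; the names `distance_over_time`, `within_between`, and the toy data are hypothetical and not taken from the paper, and the population SD is an arbitrary choice here.

```python
import math
from itertools import combinations, product
from statistics import mean, pstdev


def distance_over_time(traj_a, traj_b):
    """Euclidean distance between two trajectories at each shared time step."""
    return [math.dist(u, v) for u, v in zip(traj_a, traj_b)]


def within_between(trajs_by_digit):
    """Return per-time-step (mean, SD) of within- and between-digit distances.

    trajs_by_digit maps a digit label to a list of trajectories
    (assumed layout: trajectory[time_step][unit]).
    """
    within, between = [], []
    # Within-digit: every pair of trajectories sharing the same label.
    for trajs in trajs_by_digit.values():
        for a, b in combinations(trajs, 2):
            within.append(distance_over_time(a, b))
    # Between-digit: every pair of trajectories with different labels.
    for d1, d2 in combinations(trajs_by_digit, 2):
        for a, b in product(trajs_by_digit[d1], trajs_by_digit[d2]):
            between.append(distance_over_time(a, b))

    def summarize(curves):
        # Collapse the list of distance curves into (mean, SD) per time step.
        return [(mean(col), pstdev(col)) for col in zip(*curves)]

    return summarize(within), summarize(between)


# Toy data (hypothetical): two time steps, two units.
toy = {
    "six": [[[0.0, 0.0], [1.0, 0.0]], [[0.0, 0.0], [1.0, 0.0]]],
    "eight": [[[3.0, 4.0], [4.0, 4.0]]],
}
within_stats, between_stats = within_between(toy)
```

In this toy case the two ‘six’ trajectories coincide, so the within-digit distance is zero at every step, while both are a constant distance from the ‘eight’ trajectory.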
(A) Comparison of mean within- and between-digit distances of the sensory epoch trajectories in reservoir and trained networks (N = 2100) at different input amplitudes. Bars represent mean of the …
(A) Spectral noise in the inputs to an RNN (N = 2100) during presentations of digit zero. Sample noise is the difference between the external input (each row reflects net external input to a unit in …
(A) Left panel. Schematic of the evolution of a trajectory x(t). Right panel. The evolution of the trajectory in phase space between time steps t and t + 1 can be decomposed into three component …
(A) Temporally warped input cochleograms for an utterance of the digit ‘nine’ (left panel), warped by a factor of 2x (upper row) and 0.5x (lower row). Distance matrices between the trajectories …
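As a rough illustration of temporal warping, the sketch below resamples a sequence of input frames by linear interpolation. This is only one plausible way to stretch or compress an input in time; the caption does not state how the cochleograms were warped. The function name `warp_frames` is hypothetical, and the convention that a factor of 2 doubles the duration is an assumption.

```python
def warp_frames(frames, factor):
    """Resample frames (list of equal-length channel vectors) in time.

    Assumption: factor > 1 lengthens the input and factor < 1 shortens it.
    Needs at least two input frames.
    """
    n_in = len(frames)
    # Number of output frames, keeping the first and last frame fixed.
    n_out = max(2, round((n_in - 1) * factor) + 1)
    warped = []
    for k in range(n_out):
        # Fractional position of output frame k on the original time axis.
        pos = k * (n_in - 1) / (n_out - 1)
        i = min(int(pos), n_in - 2)
        frac = pos - i
        # Linear interpolation between neighbouring frames, per channel.
        warped.append(
            [(1 - frac) * a + frac * b for a, b in zip(frames[i], frames[i + 1])]
        )
    return warped


# Toy single-channel input (hypothetical): a linear ramp over three frames.
ramp = [[0.0], [2.0], [4.0]]
slow = warp_frames(ramp, 2.0)   # stretched to five frames
fast = warp_frames(ramp, 0.5)   # compressed to two frames
```

A real 12-channel cochleogram would simply use 12-element frames in place of the single-channel ramp.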
(A) Time-averaged linear speed (v) during the sensory epoch trajectories in the reservoir and trained networks (N = 2100) compared to the ideal linear speed, over a range of warp factors. The speeds …
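The time-averaged linear speed can be sketched as the mean step length along the trajectory divided by the time step. The definition of the ideal speed is cut off in the caption, so `ideal_speed` below rests on a labeled assumption: a warped trajectory follows the same path as the reference and a warp factor w scales its duration by w. All names are hypothetical.

```python
import math


def time_averaged_speed(traj, dt=1.0):
    """Mean distance travelled per unit time along a trajectory.

    traj is a list of state vectors at consecutive time steps spaced dt apart.
    """
    steps = [math.dist(a, b) for a, b in zip(traj, traj[1:])]
    return sum(steps) / (len(steps) * dt)


def ideal_speed(reference_speed, warp_factor):
    """Assumption: same path covered in warp_factor times the reference duration."""
    return reference_speed / warp_factor


# Toy trajectory (hypothetical): two steps of length 5 in a 2-unit state space.
toy_traj = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
v_ref = time_averaged_speed(toy_traj)            # dt = 1
v_half_dt = time_averaged_speed(toy_traj, 0.5)   # same path, finer time step
v_ideal_2x = ideal_speed(v_ref, 2.0)
```

Under this assumption, a network that compensates perfectly for a 2x warp would traverse its reference trajectory at half the reference speed.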
(A) Projections in PCA space of the input (left) and recurrent (right) subspace trajectories that compose the sensory epoch trajectories of a trained network (N = 2100), in response to warped …
(A) Mean time-averaged Euclidean distance between sensory trajectories encoding warped and reference utterances of the 10 digits, in a network trained with BPTT, over a range of warp factors. Dashed …
A trained RNN (N = 4000) and its output units perform the transcription task on five novel utterances (five different speakers). The last two utterances illustrate RNN performance on same-digit …