Error patterns of orientation stimuli in delayed-estimation tasks and low-dimensional attractor models. (A-C) Characteristic patterns of the natural statistics of orientation stimuli θ (A) and of the bias (B) and standard deviation (SD; C) observed experimentally during the delay period. Cardinal orientations predominate in natural images (A). Bias and SD increase during the delay period while preserving the patterns of repulsive bias (B) and minimum variance (C) around cardinal orientations. These characteristic patterns are illustrated using trigonometric functions, with the range normalized by their maximum values. Red vertical lines mark representative cardinal and oblique orientations; because the error patterns are periodic, only the grey-shaded range is shown in the remaining panels. (D-L) Comparison of different attractor models. (D-F) Continuous attractors with constant noise. The energy potential is flat (D), resulting in no bias (E) and, with uniform noise, a uniform SD (F). (G-L) Discrete attractors with constant (G-I) and nonuniform (J-L) noise. The discrete attractor models have potential hills at cardinal orientations and potential wells at oblique orientations (G,J). While the bias patterns depend only on the energy landscape (H,K), the SD, which represents variability, also depends on the noise (I,L). The correct SD pattern (L) requires nonuniform noise with maxima at the obliques (J). Bias and SD patterns in the attractor models were obtained by running one-dimensional drift-diffusion models (see Methods).
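As an illustration of the one-dimensional drift-diffusion picture in panels D-L, the following is a minimal sketch and not the simulation described in Methods. The sinusoidal drift and noise profiles, their amplitudes, the time step, and all function names are assumptions chosen only so that potential hills sit at the cardinals and noise peaks at the obliques.

```python
import numpy as np

def simulate_drift_diffusion(theta0, drift, noise, t_end=4.0, dt=0.01,
                             n_trials=1000, seed=0):
    """Euler-Maruyama integration of d(theta) = drift dt + noise dW.

    Angles are kept unwrapped (in degrees) so that bias and SD can be
    computed with ordinary statistics; drift and noise are evaluated on
    the wrapped angle, which has a period of 180 degrees.
    """
    rng = np.random.default_rng(seed)
    theta = np.full(n_trials, float(theta0))
    for _ in range(int(round(t_end / dt))):
        wrapped = np.mod(theta, 180.0)
        dw = np.sqrt(dt) * rng.standard_normal(n_trials)
        theta = theta + drift(wrapped) * dt + noise(wrapped) * dw
    return theta

def bias_and_sd(theta0, drift, noise, **kwargs):
    """Bias (mean error) and SD of the remembered orientation at t_end."""
    final = simulate_drift_diffusion(theta0, drift, noise, **kwargs)
    return final.mean() - theta0, final.std()

# Hypothetical profiles (assumed forms, not taken from the paper).
def flat_drift(th):
    # Flat potential: continuous attractor, no drift anywhere.
    return np.zeros_like(th)

def well_drift(th, k=2.0):
    # Minus the gradient of a potential proportional to cos(4*theta):
    # hills at 0 and 90 degrees, wells at 45 and 135 degrees.
    return k * np.sin(4.0 * np.deg2rad(th))

def uniform_noise(th, s=2.0):
    return np.full_like(th, s)

def oblique_noise(th, s=2.0, m=0.5):
    # Noise coefficient that is largest at the obliques.
    return s * (1.0 - m * np.cos(4.0 * np.deg2rad(th)))

if __name__ == "__main__":
    for name, d, n in [("continuous, constant noise", flat_drift, uniform_noise),
                       ("discrete, constant noise", well_drift, uniform_noise),
                       ("discrete, nonuniform noise", well_drift, oblique_noise)]:
        b, s = bias_and_sd(22.5, d, n)
        print(f"{name}: bias = {b:.2f} deg, SD = {s:.2f} deg")
```

Sweeping `theta0` over the grey-shaded range and plotting the returned bias and SD gives curves that can be compared qualitatively with panels E-F, H-I, and K-L.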

Extension of Bayesian sensory models. (A) Schematic of the extension to memory processing. We adapted previous Bayesian models of sensory encoding (Wei and Stocker 2015), where θ and are the input and output of the sensory module. We added a memory module that maintains with the addition of memory noise ξ. The output of the memory module, , is fed back to the sensory module as the input for the next iteration. (B) Illustration of the first iteration of the sensory-memory interaction. The prior distribution follows the natural statistics (top), resulting in a sharper likelihood function near cardinal orientations (middle). Combining the prior and likelihood functions yields the posterior distribution of decoded (light colors at the bottom), which is broadened by the addition of memory noise (dark colors at the bottom). Different curves correspond to different initial θ. (C) Bias (top) and SD (bottom) patterns obtained from decoded for the 1st, 2nd, and 3rd iterations.

Network models of sensory and memory circuits in isolation. (A) Schematic of the columnar architecture for orientation selectivity. Neurons in the same column have similar preferred orientations, and recurrent connections combine local excitation and global inhibition, represented as triangles and circles, respectively. (B-F) Connectivity and tuning properties of the sensory network (left column) and memory network (right column). (B) Example connectivity strengths. We indexed neurons by ψ ranging uniformly from 0° to 180°. The connectivity strengths depend only on the ψ values of the presynaptic and postsynaptic neurons. Each curve shows the connectivity strengths from presynaptic neurons ψ to an example postsynaptic neuron. Unlike the homogeneous connectivity in the memory network (right), the sensory connectivity is heterogeneous, and its degree of heterogeneity is denoted by α. (C) Heterogeneous tuning curves for different stimuli θ in the sensory network during the stimulus period (left) and homogeneous tuning curves in the memory network during the delay period (right). The memory network can sustain persistent activity in isolation, while the sensory network cannot. (D) Histograms of the preferred orientations. We measured the location of the maximum of the tuning curve of each neuron, denoted as (Methods). The heterogeneous sensory network has more cardinally tuned neurons. (E) Tuning-curve widths measured at half maximum (Methods). The sensory tuning curves sharpen around cardinal orientations. Each neuron is labeled with its index ψ as in (B). (F) Neural manifolds projected onto the first two principal components of activity during the stimulus period (left) and during the delay period (right). The neural manifold of the sensory network resembles a curved ellipsoid, while the manifold of the homogeneous memory network is a perfect ring.
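The width measure in panel E can be illustrated with a short sketch. The exact procedure is given in Methods; here it is assumed that the tuning curve is sampled on a uniform orientation grid, is unimodal, and that "half maximum" means half of the peak response without baseline subtraction. The Gaussian tuning curves and the function names are hypothetical.

```python
import numpy as np

def half_max_width(theta_deg, rate):
    """Width (in degrees) of the region where rate >= half of its peak.

    Assumes a uniform grid and a single-peaked tuning curve, so counting
    the samples above threshold approximates the full width at half maximum.
    """
    step = theta_deg[1] - theta_deg[0]
    return np.count_nonzero(rate >= 0.5 * rate.max()) * step

def gaussian_tuning(theta_deg, preferred_deg, sigma_deg):
    """Hypothetical tuning curve on a ring with a period of 180 degrees."""
    d = (theta_deg - preferred_deg + 90.0) % 180.0 - 90.0  # circular distance
    return np.exp(-d ** 2 / (2.0 * sigma_deg ** 2))

theta = np.arange(0.0, 180.0, 0.1)
broad = half_max_width(theta, gaussian_tuning(theta, 45.0, 10.0))   # oblique-like
sharp = half_max_width(theta, gaussian_tuning(theta, 90.0, 5.0))    # cardinal-like
print(f"broad: {broad:.1f} deg, sharp: {sharp:.1f} deg")
```

For a Gaussian profile the result should approach 2.355 times the standard deviation, which gives a quick sanity check on the grid resolution.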

Network model with interacting sensory and memory modules. (A) Schematic of two-module architecture. The sensory and memory modules are connected via feedforward and feedback connectivity to form a closed loop. The sensory module receives external input with orientation θ while internal representation is decoded from the memory module, denoted as . (B) Tuning curves of sensory (upper panels) and memory (lower panels) modules at the end of the stimulus epoch (i.e., the beginning of the delay epoch; left panels) and during the delay period (right panels). Note that while both modules can sustain persistent activity in the delay period, the firing rates of the sensory module are significantly lower than those in the stimulus period (upper right). (C-E) Bias (C), standard deviation (SD; D), and Fisher information (FI; E) patterns evaluated at 1, 2.5, and 4 seconds into the delay, consistent with the characteristic patterns observed experimentally (Figure 1A-C). While FI decays due to noise accumulation, it is largest around cardinal orientations, corresponding to a smaller discrimination threshold (E). In (C) and (D), shaded areas mark the ±s.e.m. of 1000 realizations.
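The link between bias, SD, Fisher information, and discrimination threshold in panels C-E can be sketched with the Cramér-Rao bound for a biased estimator, Var ≥ (1 + b′(θ))² / FI(θ). The snippet below uses the right-hand quantity (1 + b′)² / SD² as a proxy for FI; whether this matches the estimator used in Methods is an assumption, and the sinusoidal bias and SD curves are hypothetical stand-ins for simulated data.

```python
import numpy as np

def fisher_information_proxy(bias, sd, step_deg):
    """Cramer-Rao-based proxy: (1 + d(bias)/d(theta))^2 / SD^2.

    bias and sd are sampled on a uniform grid covering one full period,
    so the derivative is taken with periodic central differences.
    """
    dbias = (np.roll(bias, -1) - np.roll(bias, 1)) / (2.0 * step_deg)
    return (1.0 + dbias) ** 2 / sd ** 2

theta = np.arange(0.0, 180.0, 1.0)                 # one period, 1-degree steps
bias = 3.0 * np.sin(4.0 * np.deg2rad(theta))       # repulsion from cardinals
sd = 4.0 - 1.0 * np.cos(4.0 * np.deg2rad(theta))   # smallest at cardinals

fi = fisher_information_proxy(bias, sd, step_deg=1.0)
threshold = 1.0 / np.sqrt(fi)                      # discrimination threshold proxy
print(f"FI at 0 deg: {fi[0]:.3f}, FI at 45 deg: {fi[45]:.3f}")
```

With a repulsive bias and a smaller SD at the cardinals, both factors push the proxy up there, which is the qualitative pattern described for panel E.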

Low-dimensional dynamics along the memory manifold and their parameter dependence. (A) Low-dimensional projection along the memory states. Left panel: the memory manifold projected onto the first two PCs, shown with the associated vector fields. Right panel: example drift-diffusion trajectories along the memory manifold starting at θ = 112.5°. (B,C) Velocity (B) and noise coefficients (C) corresponding to the drift and diffusion processes. Different shades of grey represent different degrees of heterogeneity in the sensory module (α in Figure 3B). The velocity in (B) is the speed at which the remembered orientation drifts toward the obliques in a noise-free network. A larger noise coefficient around the obliques overcomes the underlying drift dynamics and causes the standard deviation pattern to reach its maxima at the obliques (C). (D) Equivalent one-dimensional energy potential derived from the velocity in (B). (E,F) Example bias (E) and standard deviation (F) patterns at 4s into the delay. The shaded areas mark the ±s.e.m. of 1000 realizations.
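The step from the velocity in (B) to the energy potential in (D) amounts to integrating the drift, U(θ) = −∫ v(θ′) dθ′. A minimal numerical sketch follows, assuming a velocity sampled on a grid; the sinusoidal velocity profile and its amplitude are hypothetical and stand in for the velocity extracted from the network.

```python
import numpy as np

def potential_from_velocity(theta_deg, velocity):
    """U(theta) = -integral of velocity, by the cumulative trapezoid rule.

    The potential is defined up to an additive constant; here U = 0 at
    the first grid point.
    """
    increments = 0.5 * (velocity[1:] + velocity[:-1]) * np.diff(theta_deg)
    return -np.concatenate(([0.0], np.cumsum(increments)))

theta = np.linspace(0.0, 90.0, 181)            # one period of the error pattern
velocity = np.sin(4.0 * np.deg2rad(theta))     # drift away from 0 and 90 degrees
potential = potential_from_velocity(theta, velocity)

print("potential minimum at", theta[np.argmin(potential)], "deg")
```

A velocity that is positive below 45° and negative above it yields a well at the oblique and hills at the cardinals, matching the landscape described for the discrete attractor models.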

Stronger inhibitory synaptic modulation is required for correct error patterns. (A) Segregation of excitatory (blue) and inhibitory (red) synaptic pathways. (B) Example excitatory (left) and inhibitory (right) connectivity strengths of the sensory module. The degrees of heterogeneity of the excitatory and inhibitory connections are denoted by α and β, respectively. Unlike the combined excitation and inhibition in Figure 3B, the connectivity strengths are maximal around cardinal orientations. (C,D) Bias for a stimulus at 22.5° (C) and standard deviation (SD) index (D) estimated at 1s into the delay for different values of α and β. The SD index compares the SD at the cardinal and oblique orientations (Methods). (E,F) Example bias (left) and SD (right) patterns when excitatory modulation overwhelms inhibitory modulation (α = 0.068, β = 0.04; E) and when inhibitory modulation is stronger (α = 0.03, β = 0.08; F). In (C) and (D), green (yellow) pentagrams mark the parameters used in (E) and (F). Stronger inhibitory modulation is required for correct bias and variance patterns (F and green regions in C and D). In (E) and (F), shaded areas mark the ±s.e.m. of 1000 realizations.

Network model with memory module only. (A) Schematic of a one-module network with strong, heterogeneous recurrent connections that enable both efficient coding and memory maintenance. (B) Example tuning curves at the end of the stimulus epoch (left) and at 4s into the delay epoch (right). (C,D) Bias for a stimulus at 22.5° (C) and standard deviation (SD) index (D) estimated at 1s into the delay for different degrees of heterogeneity of the excitatory and inhibitory connections, denoted by α and β. For the parameters that generate reasonable bias patterns, the SD index is always negative, which indicates that the SD pattern is inconsistent with experimental findings. (E) Bias (left) and SD (right) patterns in the delay. While the bias pattern is correct, the SD reaches its maxima around cardinal orientations, unlike in the experiments. In (C) and (D), the yellow pentagram marks the parameters used in (E).

Effect of perturbations in sensory-memory interaction on error patterns. (A,B) Example bias (A) and standard deviation (B) patterns when TMS is assumed to interrupt the feedforward signal from 2.5s into the delay. Shaded areas mark the ±s.e.m. of 1000 realizations. (C,D) Evolution of the bias for an example cue orientation θ = 18° (C) and of the tuning width index in the memory network (WI; D), which represents the asymmetry of tuning widths at cardinal and oblique orientations (Methods). The two vertical dashed lines mark the end of the stimulus epoch and the onset of TMS disruption, respectively. Solid and dashed curves show the results with and without perturbation, respectively. Both bias (C) and WI (D) stop increasing when TMS is on.

Dynamics of bias and tuning properties of sensory-memory interacting network models. (A) Maximum firing rate of the neuron with ψ = 22.2° across all stimulus orientations. The vertical grey line represents the end of the stimulus presentation. Both sensory and memory modules show lower but sustained activity during the delay period. (B) Evolution of the bias for input orientation θ = 18°. The bias increases in both the stimulus and delay periods, but its rate of increase is lower during the delay period. (C) Tuning width index (WI) measuring the asymmetry of tuning widths at cardinal and oblique orientations (Methods). WI also increases throughout, indicating that the tuning curves of the neural population become more heterogeneous. All parameters are the same as in Figure 4.

Comparison between bias and standard deviation (SD) patterns of the full network model (orange curves) and the low-dimensional projection (blue curves). From top to bottom, the rows correspond to the sensory-memory interacting networks in Figure 5 with α = 0.03 (A,D), 0.04 (B,E), and 0.05 (C,F), respectively. We projected the dynamics onto the left (A-C) and right (D-F) eigenvectors of the Jacobian matrix obtained from the local dynamics along the memory states (Methods). The manifold was parameterized at 1s into the delay after a 0.5s-long stimulus to determine the drift speed and diffusivity of the low-dimensional model. The initial orientations of the low-dimensional model were set to the orientations decoded from the full model at 1s into the delay. We compared the increase of bias from that point on, i.e., at 1.2s, 1.5s, and 2s into the delay for the full model, corresponding to 0.2s, 0.5s, and 1s for the low-dimensional model. The low-dimensional projection captures the characteristic patterns well, although the deviation is larger for the SD than for the bias, and projecting onto the right eigenvector (D-F) generally yields better predictions than projecting onto the left eigenvector (A-C). All parameters are the same as in Figure 5 except for α. Shaded areas (too narrow to be seen) mark the ±s.e.m. of 3000 realizations.

Error patterns and low-dimensional dynamics for different feedforward (A-E) and feedback (F-J) connection strengths at 4s into the delay epoch. Increasing either the feedforward or the feedback connection strength enlarges the bias (A,F) and flattens the SD pattern (B,G). This can be understood through the low-dimensional projection: the drift velocity increases (D,I), whereas the noise coefficient corresponding to diffusion is less affected (E,J). Shaded areas mark the ±s.e.m. of 1000 realizations.

Relationship between drift speed and memory loss in two-module (A-C) and one-module (D-F) networks. (A,D) Drift speed for different heterogeneity degrees, α and β. (B,E) Minimum Fisher information (FI). The upper panels were obtained with a coarse parameter grid, and lower panels were obtained with a fine grid but only along parameters for the smallest bias increase (red circle) and the orthogonal direction (red triangle). When bias speed is large with unbalanced excitation and inhibition strengths (triangle), the minimum FI decreases quickly in both two-module and one-module networks, suggesting memory loss. On the other hand, along the direction with the smallest bias increase (circle), the minimum FI is relatively high. The FI was estimated at 4s into the delay epoch using 1000 realizations. (C,F) Negative correlation between minimum FI and drift speed.

Comparison of low-dimensional dynamics between two-module and one-module network models. (A,B) Bias and standard deviation (SD) patterns of two-module (A) and one-module (B) networks adapted from Figure 6F and Figure 7E, respectively. The averages of bias and SD over different θ at 4s into the delay are similar in the two networks. (C,D) Energy potential and noise coefficients in two-module (black) and one-module (red) networks. Despite the similar bias levels at 4s, the two-module network has a shallower potential (C) but larger heterogeneity in the noise coefficient profile (D). Such differences make it possible for the SD to become smaller around cardinal orientations in the two-module network (right in A), while drift dynamics overwhelm and the SD pattern is opposite to that of the noise coefficient in the one-module network (right in B).