Figures and data

Illustration of the hierarchical working memory model.
(a) Network architecture. Stimulus clusters and chunking clusters both have recurrent self-excitation (thick sharp arrows) and reciprocal connections to the global inhibitory pool (not shown). Chunking clusters have dense but weak connections to the stimulus clusters (thin blunt arrows in the background). (b) Effective network architecture after presentation. Network activity selectively augments the connections between the stimuli within a chunk and the corresponding chunking cluster, effectively forming a hierarchical structure. (c) Dynamics of the recurrent self-connections. Upon the arrival of presynaptic inputs (top panel), the release probability u increases and the fraction of available neurotransmitter x decreases (left axis of the middle panel). The amplitude of the recurrent strength A gradually increases with each reactivation of the cluster (right axis of the middle panel). As a result, the total synaptic efficacy of the recurrent self-connection, J_Self = uxA, oscillates (bottom panel). Activity traces are taken from the first stimulus cluster in the top panel of (d). (d) Network simulation. The first three memories are colored in blue and the other three in green. Shaded regions indicate external input to the cluster. Top: Memories are loaded at a uniform speed, and chunking clusters are not activated. Only four of the six memories remain active in working memory (WM). Bottom: Slight pauses after chunks activate the chunking clusters, which inhibit the stimulus clusters presented before the pause. All memories are retrieved chunk by chunk in the retrieval stage. The full activity trace of the synaptic variables is presented in Fig. S1.
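The qualitative behavior described in panel (c) can be illustrated with a minimal sketch. It assumes standard Tsodyks-Markram-style facilitation and depression equations for u and x, plus a slow, saturating update for A. The functional forms, the parameter values, and the name `simulate_self_connection` are illustrative assumptions, not the model equations used for the figure.

```python
import numpy as np

def simulate_self_connection(rate, dt=1e-3, U=0.3, tau_f=1.5, tau_d=0.3,
                             A0=1.0, A_max=3.0, tau_A=20.0, kappa=0.02):
    """Euler integration of u, x, A and the efficacy u*x*A.

    All dynamics and parameters here are assumed for illustration:
    u facilitates and x depresses with presynaptic rate (Hz), while
    A grows slowly toward A_max during activity and decays toward A0.
    """
    rate = np.asarray(rate, dtype=float)
    n = rate.size
    u = np.empty(n)
    x = np.empty(n)
    A = np.empty(n)
    u[0], x[0], A[0] = U, 1.0, A0
    for i in range(n - 1):
        r = rate[i]
        du = (U - u[i]) / tau_f + U * (1.0 - u[i]) * r       # facilitation
        dx = (1.0 - x[i]) / tau_d - u[i] * x[i] * r           # depression
        dA = (A0 - A[i]) / tau_A + kappa * (A_max - A[i]) * r  # slow growth
        u[i + 1] = u[i] + dt * du
        x[i + 1] = x[i] + dt * dx
        A[i + 1] = A[i] + dt * dA
    return u, x, A, u * x * A

# Hypothetical input: a 100 ms burst at 50 Hz once per second for 5 s,
# standing in for periodic reactivations of one cluster.
dt = 1e-3
t = np.arange(0.0, 5.0, dt)
rate = np.where((t % 1.0) < 0.1, 50.0, 0.0)
u, x, A, J_self = simulate_self_connection(rate, dt=dt)
```

With these assumed parameters, each burst raises u and lowers x, A steps up from one burst to the next, and the product uxA rises and falls with every reactivation.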

Memory retrieval from a hierarchical structure.
(a) Top: Schematic of an emergent three-level hierarchy. The top node (black) denotes the global inhibitory neural pool. The first two levels represent chunking clusters, and the lowest level represents stimulus clusters. Grey stripes denote the clusters that need to be suppressed to retrieve the first chunk. Blue dashed circles mark the clusters that are active while the first chunk is retrieved. Bottom: Architecture of the underlying recurrent neural network. (b) Simulation of the network in (a). R^{(k)}: activity traces of firing rates, color-coded to match the corresponding clusters in (a). The time course of the traces is labeled as chunks (stimulus clusters), pauses (chunking clusters), and long pauses (meta-chunking clusters). I_b^{(k)}: activity traces of background input currents. Decreasing the background input to a cluster at level k suppresses its reactivation and removes the inhibition on its child clusters at level k − 1.

Cognitive boundary neurons in the medial temporal lobe.
(a) Average firing rates from the single-neuron recording data in [41]. The mean z-scored firing rates are plotted as solid lines, with one standard deviation shown as shading. Firing rates are averaged over all subjects and trials, and relative time zero is the time of the movie cut. Two qualitative features of the non-boundary neurons' firing rates, a dip followed by a ramp, are predicted by the hierarchical working memory model. Top: Boundary neurons. Bottom: Non-boundary neurons. (b) Average firing rates of non-boundary neurons over all trials for individual subjects. Subjects are sorted by the location of the dip. A trend similar to that in panel (a) is observed for each subject. For individual 2D plots, see Fig. S3. (c) Average firing rates of neurons aligned to the onset of the movie (relative time zero). After the peak in the onset-specific neurons, the non-onset-specific neurons do not exhibit the dip-then-ramp pattern seen in panel (a). Top: Onset-specific neurons. Bottom: Non-onset-specific neurons.
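The mean trace and the one-standard-deviation band in panel (a) can be computed along the following lines. This sketch assumes that each row of the input already holds one trial's firing rate, binned and aligned so that a fixed column corresponds to the movie cut, and that z-scoring is done per trial across time bins. The normalization actually used for the data in [41] is not stated here, and the function names are hypothetical.

```python
import numpy as np

def zscore_rows(rates):
    """Z-score each row (trial) across its time bins (assumed convention)."""
    rates = np.asarray(rates, dtype=float)
    mu = rates.mean(axis=1, keepdims=True)
    sd = rates.std(axis=1, keepdims=True)
    sd = np.where(sd == 0.0, 1.0, sd)  # leave constant rows at zero
    return (rates - mu) / sd

def event_aligned_summary(rates):
    """Return the mean z-scored trace and its standard deviation over trials."""
    z = zscore_rows(rates)
    return z.mean(axis=0), z.std(axis=0)

# Toy input (not recorded data): two trials, three time bins.
mean_trace, band = event_aligned_summary([[1.0, 2.0, 3.0],
                                          [2.0, 4.0, 6.0]])
```

The two toy trials differ only in scale, so their z-scored traces coincide and the band collapses to zero.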

The new magic number bounds perfect-recall performance on verbal memory.
(a) Fraction of recalled words as a function of the length of the presented text. Different shades of blue correspond to different n-gram approximations; black represents natural text. Inset: Original data as presented in [42]. Main: The curves for different n-gram approximations become straight lines in a semi-log plot and can be collapsed onto a single universal curve (red dashed line) by adjusting the offsets of the individual intercepts. (b) Critical length of perfect recall as a function of the order of the n-gram approximation. The critical length L_c is determined by extrapolating each n-gram approximation curve, using the universal slope, to the point where f(L_c) = 1. Different colored lines represent experiments in different languages. The grey dashed line corresponds to M* = 2^{C−1} for C = 4.
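The extrapolation in panel (b) can be sketched as follows. The sketch assumes that each curve has the form f(L) = a + m log10(L) with a shared slope m, and it reads the bound in the caption as M* = 2^(C−1). The choice of log base, the function names, and the numbers in the example are illustrative assumptions, not values from [42].

```python
import math

def fit_intercept(lengths, fractions, slope):
    """Least-squares intercept of f = a + slope * log10(L) with the slope fixed."""
    residuals = [f - slope * math.log10(L) for L, f in zip(lengths, fractions)]
    return sum(residuals) / len(residuals)

def critical_length(intercept, slope):
    """Solve intercept + slope * log10(L_c) = 1 for L_c."""
    if slope == 0:
        raise ValueError("slope must be nonzero to extrapolate")
    return 10 ** ((1.0 - intercept) / slope)

def magic_number(C):
    """Bound M* = 2^(C - 1), as read from the caption."""
    return 2 ** (C - 1)

# Made-up points lying on f = 1.5 - 0.5 * log10(L).
a = fit_intercept([10, 100], [1.0, 0.5], slope=-0.5)
L_c = critical_length(a, slope=-0.5)
```

For the made-up points, the fitted line crosses f = 1 at L = 10, and `magic_number(4)` evaluates to 8 under the stated reading of the bound.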

Full activity trace of the bottom panel in Fig. 1(d).
(a) Activity traces of all variables. From top to bottom: firing rates Rµ, background input currents


Additional RNN simulations with delayed Hebbian plasticity.
(a) Approximating the chunking dynamics in Fig. 1(d) using Eq. (15) instead of Eq. (8). Top: activity traces of the firing rates. Bottom: activity traces of the inhibitory connections from chunking clusters to stimulus clusters, J_SC. (b) Snapshot of the synaptic matrix after chunking, resulting from the dynamics described in Eq. (15). (c) Approximating the chunking dynamics in Fig. 2(b) using Eq. (15) instead of Eq. (8). Synaptic matrix components that correspond to the inhibition from level k to level l are collectively denoted as J^{(k)⊣(l)}. First three panels: firing rate activity traces of the clusters in Fig. 2(a). Fourth and fifth panels: inhibitory connections between adjacent levels, J^{(1)⊣(2)} and J^{(2)⊣(3)}, and between non-adjacent (skip) levels, J^{(1)⊣(3)}, resulting from the dynamics described in Eq. (15).

Individual 2D plots of Fig. 3(b).
Individual subjects’ z-scored firing rates of the non-boundary neurons are shown in blue, with one standard deviation shown as shading. Black dashed lines denote t = 0 s, where the movie cut occurs. Red dashed lines denote the location of the maximum firing rate of the boundary neurons. Results are pooled from the raw firing rates of all non-boundary neurons from that subject. Subject IDs follow the data in [41]. While some subjects do not exhibit the predicted qualitative trend (e.g., the firing rate of subject P64CS does not have a ramp, and that of TWH120 does not have a dip), most subjects’ firing rates follow the same qualitative trend as the average plot in Fig. 3(a).
