Schematic representation of our model.

A, An example of context-dependent cognition. Humans can understand the meaning of “mouse” (an animal or a computer input device) depending on the context. B, Our model comprises two modules: the Context selector (X) and the Sequence composer (H). X chooses a context depending on the external stimuli and the input from H, and activates a sequence in H. This sequence is used for reward prediction. In addition, H sends predictive feedback about external stimuli to X. C, Schematic of the two kinds of remapping. Grey boxes indicate external stimuli, orange boxes indicate hippocampal segments (parts of hippocampal sequences), blue circles indicate contextual states, and green cross marks indicate the prediction error about external stimuli (left) and about reward (right). Solid lines indicate actual state transitions, and dotted lines indicate virtual state transitions constructed from past transitions. Green arrows indicate the synaptic potentiation underlying remapping. D, E, Attractor dynamics of the Amari-Hopfield network underlying SPE-driven remapping (D) and RPE-facilitated remapping (E). Blue dotted lines indicate the energy landscape, and green solid lines indicate the attractor chosen as a result of remapping. F, Hippocampal segments in H are combined depending on rewards (purple arrows) and formed into task-dependent sequences. Each sequence supports action planning and enables predictions of future external stimuli and rewards. G, An example state transition related to hippocampal sequence formation. In the early phase, hippocampal neurons are activated through the input from X, whereas in the late phase, they are activated through the recurrent input within H.
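The attractor retrieval illustrated in panels D and E can be sketched with a minimal Amari-Hopfield network. This is a generic illustration of energy-descent dynamics only, not the paper's implementation; the network size, stored patterns, noise level, and update rule are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two stored context patterns, standing in for two candidate attractors.
N = 64
patterns = np.sign(rng.standard_normal((2, N)))

# Hebbian weight matrix with zero diagonal (standard Amari-Hopfield setup).
W = (patterns.T @ patterns) / N
np.fill_diagonal(W, 0.0)

def energy(s):
    """Energy landscape (blue dotted lines): lower values are deeper basins."""
    return -0.5 * s @ W @ s

def recall(cue, steps=5):
    """Iterative updates descend the landscape toward a stored attractor."""
    s = cue.copy()
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1.0
    return s

# A noisy cue (10% of units flipped) is pulled back to the stored pattern,
# mimicking the selection of one attractor (green solid lines) by remapping.
cue = patterns[0].copy()
flip = rng.choice(N, size=N // 10, replace=False)
cue[flip] *= -1.0
out = recall(cue)
```

Remapping in the model can then be pictured as reshaping this landscape so that a different attractor captures the same cue.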

Our model replicates the emergence of splitter cells.

A, Simplified alternation task diagram. B, A successful contextual state transition of our model. Preparing two different contextual states, X2α and X2β, at S2 is necessary to solve this task. C, An example environmental state transition (left) and contextual state transition (right). Check marks indicate rewarded states, and cross marks indicate non-rewarded states. Red shading indicates right-turn trials and blue shading indicates left-turn trials. (Right) The intensity of blue indicates the order in which contextual states were created through history-driven remapping (green triangles). Red outlines indicate X2α and blue outlines indicate X2β. D, The neural activity of X corresponding to each contextual state. The neurons in the stim. domain are sorted according to external stimuli. E, The hippocampal activity corresponding to each contextual state. The red square indicates the neuron coding the S2→S4 transition, and the blue square indicates the neuron coding the S2→S5 transition. The purple line indicates the hippocampal sequence, which is gradually lengthened in a reward-dependent manner. F, The correct rate of our model. Error bars indicate the standard error of the mean (N = 40). G, The maximum number of environmental states ahead that the agents planned (planning length) gradually increases over learning. Black lines indicate the planning length of each agent, and the red line is their average. H, Emergence of splitter cells in the hippocampus in the modified T-maze task (Wood et al., 2000). I, The transition-coding neurons in our model replicate the emergence of splitter cells at S2.

Our model replicates the emergence of lap cells.

A, Simplified 2-laps task diagram. Agents are rewarded for the shortest path (S1→S2→S4) for the initial 20 trials, for the 1-lap path (S1→S2→S3→S2→S4) for the next 20 trials, and for paths of two or more laps (S1→S2→S3→S2→S3→S2→S4, etc.) for the next 40 trials. B, A successful contextual state transition map of our model. The environmental state S2 is split into three contextual states (X2α, X2β, X2γ), S3 into two (X3α, X3β), and S4 into three (X4α, X4β, X4γ). C, The correct rate of our model. Error bars indicate the standard error of the mean (N = 40). D, The planning length gradually increases during learning, depending on the task demand. Black lines indicate the planning length of each agent, and the red line is their average. E, Comparison of (Left) lap cells in the hippocampus in the 4-laps task (Sun et al., 2020) and (Right) the active neurons in the H module of our model. The transition-coding neurons at S2 in the 2-laps task are indicated by the orange, green, and purple squares, corresponding to B. F, The inhibition experiment on medial entorhinal cortex axons at CA1. ESR cells show a weak lap-specific correlation (ESR correlation) between light-on and light-off trials, but a strong spatial correlation between light-on and light-off trials (Left). Our model qualitatively replicates this result with the inhibition on and off (Right). G, The correct rate of the 1-lap and 2-or-more-laps alternation task. Error bars indicate the standard error of the mean (N = 40). H, The planning length adapts flexibly to the task demand.

Our model replicates key features of human neural activity in dynamic environments.

A, Simplified probabilistic cueing task diagram. In environment I, agents start at S0 and move randomly to S2 or S3 (S2 with p = 0.8 and S3 with p = 0.2), and receive a reward at S4 when they come from S2 and at S5 otherwise. In environment II, agents start at S1 and move randomly to S2 or S3 (S2 with p = 0.2 and S3 with p = 0.8), and receive a reward at S5 when they come from S2 and at S4 otherwise. The environment switches between the two every 30 trials. B, A successful context map for this task. S2 and S3 are split into two contextual states, and S4 and S5 into four. Hippocampal connections are built for rewarded conditions only. C, The probability of choosing S4. The red/blue line shows its mean when S2/S3 is presented. Error bars indicate the standard error of the mean (N = 40). D, The planning length gradually increases over learning and converges to 3. Black lines indicate each agent’s planning length, and the red line is their average. E, The probability of generating a specific planning sequence at S0 or S1. The expected states (S2 or S3) are modulated according to the environment. F, Our model’s behavior is similar to the human fMRI result of cue-probability-dependent hippocampal replay (Ekman et al., 2022). Paired-sample t-test. **P<0.01. G, Simplified task diagram (Julian and Doeller, 2021). The training phase is the same as in A, but a contextual stimulus, Square (Sq) or Circle (Ci), is initially presented and the probabilities of S2 and S3 are equal. In the test phase, Sq, Ci, or a mixture of Sq and Ci (Squircle: SC) is presented, and agents move according to their belief. Reward feedback is not given in the test phase. H, The transition probability under the Sq context (Left) and the Ci context (Right). I, The transition probability under the SC context for the human participants in Julian and Doeller, 2021 (Left) and our model (Right).
J, Comparison of behavioral decoding accuracy from the hippocampal fMRI activity of Julian and Doeller, 2021 (Left) and the hippocampal neural activity of our model (Right). Our model replicates the lower decoding accuracy in the SC context (Bottom) than in the Sq or Ci contexts (Top).
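The task structure in panel A can be written down as a small simulator. The state names, cue probabilities, reward contingencies, and the 30-trial switching schedule follow the legend; the function names and code organization are illustrative assumptions.

```python
import random

def trial(env, rng):
    """One trial of the probabilistic cueing task (panel A)."""
    if env == "I":
        start, p_s2 = "S0", 0.8
    else:  # environment II
        start, p_s2 = "S1", 0.2
    cue = "S2" if rng.random() < p_s2 else "S3"
    # Reward contingency: env I rewards S4 after S2 (S5 after S3);
    # env II reverses the mapping.
    rewarded_goal = {"I": {"S2": "S4", "S3": "S5"},
                     "II": {"S2": "S5", "S3": "S4"}}[env][cue]
    return start, cue, rewarded_goal

def run(n_trials=120, seed=0):
    """Run the task, switching environments every 30 trials."""
    rng = random.Random(seed)
    history = []
    for t in range(n_trials):
        env = "I" if (t // 30) % 2 == 0 else "II"
        history.append((env, *trial(env, rng)))
    return history
```

Such a simulator makes explicit that the rewarded goal is fully determined by the environment and the cue, which is what the split contextual states in panel B track.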

Model prediction about the relationship between sensory processing and flexible behavior.

A, Task diagram. The structure is the same as in Figure 4, but the probabilities of S2 and S3 are equal. B, (Top) We tested three stimulus neuron ratios: 2.5% for SZ, 16.7% for control, and 50% for ASD. (Bottom) Schematics of how the Context selector changes under these neuron-ratio manipulations in this task. Blue dotted lines indicate the energy landscape and blue circles indicate the attractors. Red arrows indicate wrong stimulus predictions (hallucination-like effects) that trigger SPE-driven remapping (green cross marks and arrows), and orange lines indicate the input from the hippocampus to X (H0 and H1 indicate hippocampal segments at S0 and S1, respectively). C, (Left) The probability of choosing S4 at S2 and S3 is plotted in red and blue, respectively. The SZ model fails to show a one-shot switch on the second experience of environments I and II, while the ASD model shows impaired task performance mainly in environment II. (Right) The result of context selection (see Figure S1). The probability of wrong stimulus reconstruction (hallucination-like) is plotted in red, and the probability of default context usage due to failures in context reconstruction (see Materials and methods) is plotted in blue.

The algorithmic flow chart of the model.

Square boxes show the operations explained in Materials and methods, while gray circles show conditional branches, with “yes” indicated by ochre arrows and “no” by blue arrows. Synaptic weight updates are indicated in the pink boxes. Context selection in X is indicated by the blue dotted box, and sequence composition in H by the orange dotted box. The black dotted box indicates sequence selection through the interaction between X and H, and the yellow dotted box indicates the action loop after sequence selection.
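The control flow in the chart can be caricatured as a runnable sketch. All function names, data structures, and the fallback-to-default behavior here are illustrative assumptions, not the paper's procedure (which is given in Materials and methods).

```python
def select_context(stimulus, h_feedback, contexts):
    """Context selection in X (blue dotted box): pick a context matching the
    external stimulus and hippocampal feedback, or fall back to a default."""
    return contexts.get((stimulus, h_feedback), "default")

def compose_sequence(context, segments):
    """Sequence composition in H (orange dotted box): chain stored segments
    starting from the chosen context."""
    seq = [context]
    while seq[-1] in segments:
        seq.append(segments[seq[-1]])
    return seq

def step(stimulus, h_feedback, contexts, segments):
    """Sequence selection through the X-H interaction (black dotted box),
    followed by the action loop (yellow dotted box, abstracted to one move)."""
    ctx = select_context(stimulus, h_feedback, contexts)
    plan = compose_sequence(ctx, segments)
    action = plan[1] if len(plan) > 1 else None  # move toward the next planned state
    return ctx, plan, action
```

The synaptic weight updates (pink boxes) would sit on the "yes" branches of the prediction-error checks, which this sketch omits.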

The 2-laps task with model-free learning using temporal contextual states.

The contextual states are defined by the combination of the current state and the n-back sensory history. At least a 3-back history is required to complete this task, but the correct rate with a 3-back history is worse than that of our model.
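The model-free baseline can be sketched as a tabular agent whose state is the current position composed with the last n visited states. The class name, learning rate, and update rule are generic assumptions; only the n-back state composition comes from the description above.

```python
from collections import defaultdict, deque

class NBackAgent:
    """Tabular model-free agent over (current state, n-back history) pairs."""

    def __init__(self, n_back, alpha=0.1):
        self.n_back = n_back
        self.alpha = alpha                 # learning rate (generic assumption)
        self.history = deque(maxlen=n_back)
        self.q = defaultdict(float)        # value table, default 0

    def contextual_state(self, state):
        # Composition of the current state and the n-back sensory history.
        return (state, tuple(self.history))

    def value(self, state, action):
        return self.q[(self.contextual_state(state), action)]

    def update(self, state, action, reward):
        # Simple delta rule toward the observed reward.
        key = (self.contextual_state(state), action)
        self.q[key] += self.alpha * (reward - self.q[key])

    def observe(self, state):
        self.history.append(state)
```

With n = 3 this agent can in principle disambiguate the visits to S2 required by the 2-laps task, but the table grows with every distinct history, which illustrates why its correct rate lags the model's.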

Reward-dependent plasticity when sensory- and contextual-encoding neurons coexist in the hippocampus.

A, Schematic figure of how sensory- and contextual-encoding neurons can coexist in the hippocampus. Hippocampal neurons that receive synaptic input mainly from the stimulus-encoding region show sensory encoding, while those that receive input mainly from the context-encoding region show contextual encoding. B, How the hippocampal network evolves when sensory- and contextual-encoding neurons coexist in the 1-lap task. This task requires contextual encoding; otherwise, agents cannot distinguish between the first and second visits to S2. After 100 trials of random exploration in this environment, the connections between sensory-encoding hippocampal neurons (orange square) do not increase their synaptic weights, while those between the relevant context-encoding hippocampal neurons do. C, How the hippocampal network evolves when sensory- and contextual-encoding neurons coexist in the ignore task. In this task, contextual encoding is not necessary because agents receive a reward at S4 independent of past states or latent variables. In contrast to the 1-lap task, the connections between sensory-encoding hippocampal neurons (orange square) increase their synaptic weights, as do those between context-encoding hippocampal neurons.
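The reward dependence of the plasticity can be sketched with a generic reward-gated Hebbian rule: co-active pairs are potentiated only on rewarded trials. The learning rate, activity vectors, and rule itself are illustrative assumptions; the paper's exact rule is in Materials and methods.

```python
import numpy as np

def reward_gated_update(W, pre, post, reward, lr=0.05):
    """Potentiate co-active pre/post pairs only when a reward is delivered."""
    if reward > 0:
        W = W + lr * np.outer(post, pre)
    return W

# Hypothetical activity vectors for two hippocampal segments (e.g. neurons
# active on the first and second visits to S2 in the 1-lap task).
n = 4
W = np.zeros((n, n))
pre = np.array([1.0, 0.0, 0.0, 0.0])
post = np.array([0.0, 1.0, 0.0, 0.0])

W = reward_gated_update(W, pre, post, reward=1.0)  # rewarded: weight grows
W = reward_gated_update(W, pre, post, reward=0.0)  # unrewarded: unchanged
```

Under such a rule, whichever subpopulation is reliably co-active on rewarded trajectories (contextual in the 1-lap task, both in the ignore task) is the one whose connections strengthen, matching panels B and C qualitatively.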