Schematic representation of our model.

A, An example of context-dependent cognition. Humans can understand the meaning of “mouse” (an animal or a computer input device) depending on the context. B, Our model involves two modules: the context selection module (X) and the hippocampal sequential module (H). X chooses a context depending on the external stimuli and the input from H, and activates a sequence in H. This sequence is used for reward prediction. In addition, H sends predictive feedback about external stimuli to X. C, X compares the predictive input from H with the external stimuli. In case of a prediction error (green cross mark), remapping occurs: the context representation in X is either switched or newly created, and a different sequence in H is activated. D, Episodic segments represented in H are combined depending on rewards (purple arrows) and concatenated into task-dependent sequences. These sequences support action planning and enable predictions of future external stimuli and rewards. E, Mechanism of remapping in our model. Blue squares represent contextual states in the context selection module, orange squares represent those in the sequential module, and gray circles represent visible environmental states. Reward-independent synaptic connections are indicated by black arrows, and reward-dependent synaptic connections by purple arrows. Here we consider a situation where an agent has experienced the environmental states S1 and S2, and the corresponding contextual states are already established in X and H. If the agent assumes it is in a contextual state associated with S1 that predicts external stimuli other than S2 but then experiences S2, a prediction error arises (green cross mark) and triggers remapping: a new contextual state associated with S2, indexed by β (green squares), is created, and the synaptic connections between X and H are potentiated (green arrows).
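
For illustration, the remapping rule in E can be sketched in a few lines of Python. This is a minimal toy, assuming contexts are (stimulus, index) tuples, scalar X–H weights, and one-step stimulus predictions; all names are placeholders rather than the actual implementation.

```python
class ContextSelectionModule:
    """Toy context selection with prediction-error-driven remapping (panel E)."""

    def __init__(self):
        self.predicts = {}    # contextual state -> predicted next stimulus
        self.xh_weight = {}   # (state in X, paired state in H) -> weight

    def select(self, current, observed):
        """Return the next contextual state after observing `observed`."""
        if self.predicts.get(current) == observed:
            return current                # prediction confirmed, no remapping
        # Prediction error (green cross mark): remapping.
        for ctx in self.predicts:         # switch to an existing context
            if ctx[0] == observed:        # already associated with `observed`
                return ctx
        new_ctx = (observed, "beta")      # or create a new one (green squares)
        self.predicts[new_ctx] = None     # its prediction is learned later
        self.xh_weight[(new_ctx, new_ctx)] = 1.0  # potentiate X-H (green arrows)
        return new_ctx
```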

Our model replicates the emergence of splitter cells.

A, Simplified alternation task diagram. B, A successful context map of our model. The state S2 is split into two different contextual states, C2α and C2β. C, The correct rate of our model. The error bar indicates the standard error of the mean (N = 40). D, The maximum number of states ahead that the agents planned (planning length) gradually increases over learning. Black lines indicate the planning length of each agent, and the red line is their average. E, Emergence of splitter cells in the hippocampus in the modified T-maze task (Wood et al., 2000). F, Our model replicates the emergence of splitter cells in S2.

Our model replicates the emergence of lap cells.

A, Simplified 2-lap task diagram. Agents are rewarded for the shortest path (S1→S2→S4) for the initial 15 trials, for the 1-lap path (S1→S2→S3→S2→S4) for the next 15 trials, and for paths with two or more laps (S1→S2→S3→S2→S3→S2→S4, etc.) for the next 30 trials; a runnable reading of this schedule is sketched below. B, A successful context map of our model. The states S2 and S4 are split into three contextual states, while S3 is split into two contextual states. C, The correct rate of our model. The error bar indicates the standard error of the mean (N = 40). D, The planning length gradually increases during learning, depending on the task demand. The black lines indicate the planning length of each agent, and the red line is their average. E, Comparison of lap cells in the hippocampus in the 4-lap task (Sun et al., 2020) with our replicated results. F, The inhibition experiment of medial entorhinal cortex axons at CA1. ESR cells show a weak lap-specific correlation (ESR correlation) between light-on and light-off trials, while they show a strong spatial correlation between light-on and light-off trials (left). Our model qualitatively replicates this result with the inhibition on and off (right).
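
The following Python sketch scores a path by its number of laps; the trial boundaries come from the legend, while the function names and the lap-counting convention (an S2→S3 excursion counts as one lap) are assumptions.

```python
# Reward schedule of the simplified 2-lap task (panel A). Trial
# boundaries follow the legend; names and the lap-counting convention
# are assumptions.

def lap_count(path):
    """Number of S2 -> S3 excursions taken before reaching S4."""
    return sum(1 for a, b in zip(path, path[1:]) if (a, b) == ("S2", "S3"))

def is_rewarded(trial, path):
    """Whether a completed path (ending at S4) is rewarded on this trial."""
    if path[-1] != "S4":
        return False
    if trial < 15:                  # trials 1-15: shortest path
        return lap_count(path) == 0
    if trial < 30:                  # trials 16-30: exactly one lap
        return lap_count(path) == 1
    return lap_count(path) >= 2     # trials 31-60: two or more laps

# Example: the 1-lap path is rewarded only in the middle block.
assert is_rewarded(20, ("S1", "S2", "S3", "S2", "S4"))
assert not is_rewarded(40, ("S1", "S2", "S3", "S2", "S4"))
```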

Our model replicates key features of human neural activity in dynamic environments.

A, Simplified task diagram of Ekman et al. (2022). In environment I, agents start at S0 and move to S2 or S3 randomly (S2 with p = 0.8 and S3 with p = 0.2) and receive a reward at S4 when they come from S2 and at S5 otherwise. In environment II, agents start at S1 and move to S2 or S3 randomly (S2 with p = 0.2 and S3 with p = 0.8) and receive a reward at S5 when they come from S2 and at S4 otherwise. The environment switches between the two every 30 trials (see the sketch below). B, A successful context map of this task. S2 and S3 are split into two contextual states, and S4 and S5 are split into four contextual states. The hippocampal connections are built for rewarded conditions only. C, The probability of choosing S4. The red/blue line shows its mean when S2/S3 is presented. The error bar indicates the standard error of the mean (N = 40). D, The planning length gradually increases over learning and converges to 3. The black lines indicate each agent's planning length, and the red line is their average. E, The probability of generating a specific planning sequence at S0 or S1. The expected states (S2 or S3) are modulated according to the environment. F, Our model behavior is similar to the human fMRI result of Ekman et al. (2022). G, Simplified task diagram of Julian and Doeller (2021). The training phase is the same as in A, but the contextual stimuli of Square (Sq) or Circle (Ci) are initially presented and the probabilities of S2 and S3 are equal. In the test phase, either Sq, Ci, or a mixture of the Sq and Ci stimuli (Squircle: SC) is presented, and the agents transition according to their belief. Reward feedback is not given in the test phase. H, The transition probability under the Sq context (left) and the Ci context (right). I, The transition probability under the SC context for the human participants in Julian and Doeller (2021) (left) and our model (right). J, Comparison of behavioral decoding accuracy from hippocampal fMRI activity in Julian and Doeller (2021) (left) and from hippocampal neural activity of our model (right). Our model replicates the worse decoding accuracy in the SC context (bottom) compared with the Sq or Ci context (top).
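
The two-environment structure in A can be summarized in a few lines of Python. The probabilities, reward contingencies, and switching period follow the legend; the data layout and function names are illustrative assumptions.

```python
import random

# Structure of the two environments in panel A. Probabilities, reward
# contingencies, and the 30-trial switching period follow the legend.

ENVIRONMENTS = {
    "I":  {"start": "S0", "p_S2": 0.8, "rewarded_goal": {"S2": "S4", "S3": "S5"}},
    "II": {"start": "S1", "p_S2": 0.2, "rewarded_goal": {"S2": "S5", "S3": "S4"}},
}

def current_environment(trial):
    """The environment switches between I and II every 30 trials."""
    return "I" if (trial // 30) % 2 == 0 else "II"

def sample_trial(trial, rng=random):
    """Sample one trial: start state, intermediate state, rewarded goal."""
    env = ENVIRONMENTS[current_environment(trial)]
    mid = "S2" if rng.random() < env["p_S2"] else "S3"
    return env["start"], mid, env["rewarded_goal"][mid]
```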

Model prediction about the relationship between sensory processing and flexible behavior.

A, Task diagram. The structure is the same as in Figure 4, but the probabilities of S2 and S3 are equal. B, The result of perturbing the ratio of stimulus-domain neurons in the cortex. (Left) We tested three stimulus neuron ratios: 2.5% for SZ, 16.7% for control, and 50% for ASD. (Middle) The probability of choosing S4 is plotted as the task performance. The SZ model fails to show a one-shot switch upon the second experience of environments I and II, while the ASD model shows impaired task performance mainly in environment II. (Right) The result of context calculation is plotted. The total number of context calculations is plotted in black, the number of wrong stimulus-context reconstructions (hallucination-like) in green, the number of reconstruction failures (default network usage) in red, and the number of new context preparations in yellow.

The algorithmic flow chart of the model.

Square boxes show the operations explained in Methods, while the gray circles show yes/no branch points, with yes indicated by ochre arrows and no by blue arrows. Synaptic weight updates are indicated by the pink boxes.
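
To make the branch structure concrete, the sketch below paraphrases one pass through the gray circles in Python. The function and variable names are placeholders; the actual operations are those described in Methods.

```python
# One pass through the flow chart's branches. The `True` branches
# correspond to ochre arrows and the `False` branches to blue arrows;
# all names are placeholders for the operations described in Methods.

def flow_chart_step(predicted, observed, known_contexts, xh_weights):
    """Dispatch on the two gray-circle branch points."""
    if predicted == observed:                     # branch 1: prediction correct?
        return "follow current sequence", known_contexts, xh_weights
    if observed in known_contexts:                # branch 2: matching context exists?
        return "switch context", known_contexts, xh_weights
    known_contexts = known_contexts | {observed}  # create a new contextual state
    xh_weights = {**xh_weights, observed: 1.0}    # pink box: update X-H weights
    return "remap to new context", known_contexts, xh_weights

# Example: predicting S2 but observing S3, with no context for S3 yet.
action, ctxs, w = flow_chart_step("S2", "S3", {"S1", "S2"}, {})
```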

Simplified 2-lap task with model-free learning using temporal contextual states.

The contextual states are defined by the composition of the current state and the n most recent past states (n back states). At least three back states are required to complete this task, but the correct rate with three back states is worse than that of our model.
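
A minimal sketch of this baseline, assuming a tabular temporal-difference learner over the n-back contextual states; only the contextual-state definition comes from the legend, everything else is an assumption.

```python
from collections import defaultdict

# Model-free baseline with temporal contextual states. The contextual
# state is the tuple of the current state and the n most recent past
# states (legend); the tabular TD update is an illustrative assumption.

def make_context(visited, n):
    """Contextual state = current state plus n back states."""
    return tuple(visited[-(n + 1):])

def td_update(Q, context, action, reward, next_context, actions,
              alpha=0.1, gamma=0.9):
    best_next = max(Q[(next_context, a)] for a in actions)
    Q[(context, action)] += alpha * (reward + gamma * best_next
                                     - Q[(context, action)])

Q = defaultdict(float)
# Why at least 3 back states: in the 2-lap path, the second and third
# visits to S2 only become distinguishable at n = 3.
second = make_context(["S1", "S2", "S3", "S2"], 3)             # ('S1','S2','S3','S2')
third = make_context(["S1", "S2", "S3", "S2", "S3", "S2"], 3)  # ('S3','S2','S3','S2')
assert second != third
assert make_context(["S1", "S2", "S3", "S2"], 2) == \
       make_context(["S1", "S2", "S3", "S2", "S3", "S2"], 2)   # identical at n = 2
```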

Reward-dependent plasticity when sensory and contextual encoding neurons coexist in hippocampus.

A, Schematic figure of how sensory and contextual encoding neurons can coexist in the hippocampus. Hippocampal neurons that receive synaptic input mainly from the stimulus-encoding region show sensory encoding, while those that receive input mainly from the context-encoding region show contextual encoding. B, How the hippocampal network evolves when sensory and contextual encoding neurons coexist in the 1-round task. This task requires contextual encoding; otherwise, agents cannot distinguish between the first and second visits to S2. After 100 trials of random exploration in this environment, the network between sensory-encoding hippocampal neurons (indicated by the orange square) does not increase its synaptic weights, while that between the relevant context-encoding hippocampal neurons does. C, How the hippocampal network evolves when sensory and contextual encoding neurons coexist in the ignore task. In this task, contextual encoding is not necessary because agents receive a reward at S4 independent of past states or latent variables. In contrast to the 1-round task, the network between sensory-encoding hippocampal neurons (indicated by the orange square) increases its synaptic weights as well as that between context-encoding hippocampal neurons.
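
A minimal sketch of the reward-dependent plasticity described here, assuming a reward-gated Hebbian outer-product update; the functional form and parameter values are illustrative, not the exact rule in Methods.

```python
import numpy as np

# Sketch of reward-dependent plasticity between hippocampal neurons
# (purple arrows in earlier figures). The reward-gated Hebbian form and
# the parameters are illustrative assumptions.

def reward_gated_update(W, pre, post, reward, lr=0.05):
    """Potentiate co-active pre/post pairs only when reward arrives.

    W    : (n_post, n_pre) weight matrix
    pre  : (n_pre,) activity of presynaptic neurons
    post : (n_post,) activity of postsynaptic neurons
    """
    if reward > 0:
        W += lr * reward * np.outer(post, pre)  # Hebbian term gated by reward
    return W

# Example: only the pair that was co-active at reward delivery is
# strengthened, as in the context-encoding subnetwork of panel B.
W = np.zeros((3, 3))
pre = np.array([1.0, 0.0, 0.0])
post = np.array([0.0, 1.0, 0.0])
W = reward_gated_update(W, pre, post, reward=1.0)
```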