Abstract
Predicting how our actions will affect future events is essential for effective behavior. However, learning predictive relationships is not trivial in a multidimensional world where numerous causes bring any one event about. Here we examine (1) how these multidimensional dynamics may distort predictive learning, and (2) how inductive biases may mitigate these harmful effects. We developed a theoretical framework for studying this problem using a computational successor features model. Model simulations demonstrate how spurious observations arise in such contexts to compound noise in memory and limit the generalizability of learning. We then provide behavioral evidence in human participants for a semantic inductive bias that constrains these predictive learning dynamics based on the semantic relatedness of causes and outcomes. Together, these results show that prior knowledge can shape multidimensional predictive learning, potentially minimizing severe memory distortions that may arise from complex everyday observations.
Introduction
To make good decisions, humans must predict how actions will affect future experiences. However, making such predictions is not straightforward in a multidimensional world where numerous causal processes drive how one event leads to the next. The co-occurrence of these causes creates ambiguity about which causes produce which outcomes. Understanding how these causal mappings are disentangled in memory is essential to understanding how humans effectively predict complex futures during decision making.
The ability to make predictive inferences has long been thought to rely on memory representations that encode a structured model of the environment1–4. An early formulation of this idea was Edward Tolman’s concept of cognitive maps, which mentally represent the spatial layouts of environments to aid flexible navigation5. The neural basis for these representations was later demonstrated through the discovery of place and grid cells in the hippocampal-entorhinal system, which are thought to encode, respectively, specific locations in space and a coordinate system for space6. More recent work has extended the apparent role of these representations beyond physical space, showing that similar neural codes organize abstract conceptual information, such as social dimensions7, conceptually relevant feature dimensions8,9, and a “bird” space that reflects differences in neck and leg length10. These codes have also been theorized to implement not just Euclidean maps, but also graph-like representations that capture the relational structure between discrete concepts, states, or events4,11,12. Importantly, these relations can be statistical regularities, such as the frequency with which events transition between one another13,14. These representations can therefore be used to predict how events unfold into the future.
The reinforcement learning framework formalizes how predictive representations are used in service of goal-directed behavior15. In this framework, an agent makes actions to transition between states in an environment, with the aim of occupying rewarding states with desirable features. Predictive representations enable the agent to make inferences about distal states that may be several steps away, facilitating behaviors that will be the most rewarding in the long run. One influential model, successor features16–19 (SF; a generalization of the successor representation), represents each state in terms of the features expected to be encountered over time after entering that state. Thus, even if a state has not itself directly comprised desirable features but proximal states have, the agent would still choose to approach it. This model has been found to capture key patterns of predictive learning in both behavior20,21 and the brain13,18,22–26. However, work to date has, to our knowledge, only studied how SF representations are learned for each state rather than for each of the multiple independent features that compose states. Consequently, it has not been characterized how feature representations may be distorted when causal processes defined over features co-occur.
To illustrate how feature-level causal dynamics may distort representations, consider the following example (Fig. 1A). Suppose two causal processes are defined by feature A1 producing A2 (A1→A2), and feature B1 producing B2 (B1→B2). In a one-dimensional environment, an event might comprise {A1}, which reliably produces event {A2}, allowing A1→A2 to be learned without interference. However, in a multidimensional environment, A1 and B1 may co-occur within a single event {A1, B1}, which reliably produces event {A2, B2}. Since A1→A2 and B1→B2 unfold together, spurious transitions A1⇢B2 and B1⇢A2 will also be observed. These spurious and causal observations cannot be disambiguated, leading to faulty or uncertain inference27,28. The first aim of the present research is to simulate a fully feature-based SF model to demonstrate how these spurious observations may distort predictive learning. We argue that because such spurious observations are inherent to our multidimensional world, predictive representations of real-world experience are inherently prone to distortion.
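To make the combinatorics concrete, a minimal Python sketch (illustrative only; the variable names and causal_map are hypothetical, not part of any task code) enumerates the feature-level transitions implied by a single observation of {A1, B1} → {A2, B2}:

# Hypothetical sketch: transitions implied by observing {A1, B1} -> {A2, B2}
start_features = ["A1", "B1"]
terminal_features = ["A2", "B2"]
causal_map = {"A1": "A2", "B1": "B2"}  # the true generative processes

observed = [(f0, f1) for f0 in start_features for f1 in terminal_features]
causal = [pair for pair in observed if causal_map[pair[0]] == pair[1]]
spurious = [pair for pair in observed if causal_map[pair[0]] != pair[1]]
print(causal)    # [('A1', 'A2'), ('B1', 'B2')]
print(spurious)  # [('A1', 'B2'), ('B1', 'A2')]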

Feature-based model simulations.
Simulation environments varied in (A) dimensionality and (B) causal depth. (A) Environments had either one or two features per state. Since A1→A2 and B1→B2 unfold together in the two-dimensional environment, spurious transitions A1⇢B2 and B1⇢A2 are also observed. We additionally consider a causal learner model that fully suppresses spurious information during updating. (B) Environments consisted of 1- (light), 2-, 3-, or 4-step (dark) causal chains, beginning with fSTART and ending with fTERMINAL. (C) Mean learning curves in the one-dimensional environments by causal depth. (D) Mean learning curves in the two-dimensional environments by causal depth for causal learners (grey lines) and “regular” learners (red lines). For C and D, P(Reward) is the proportion of the maximum reward (C = 1, D = 2) agents could earn on each trial. (E) Proportion of spurious information in successor matrix M by environment depth and the number of steps a feature is from fTERMINAL. (F) Schematic demonstrating how spurious information may compound in deeper feature representations (as shown in E). Causal observations (solid lines) are constrained by the deterministic causal processes that generate them, producing simple chains of observations over multiple steps. Conversely, since spurious observations (dashed lines) are not constrained by a mechanism in the environment, they can occur between a more diffuse set of features. This diffusivity expands over multiple steps, producing increasingly less constrained chains of spurious observations. Due to this, features that are deeper in causal chains (more steps away from fTERMINAL) will compound much more spurious information through these diffuse webs (see E). In turn, predictive inferences (red arrows) about deeper features will be noisier and more diffuse, leading to poorer behavioral performance (see D). (G) The representational similarity of start feature pairs (Pearson’s correlation of their corresponding SF vectors) by feature co-occurrence frequency. (H) Test reward earnings on old trials that occurred frequently versus infrequently during training. (I) Test reward earnings on trials with old versus novel options.
Given this susceptibility to distortion, how do humans learn accurate predictive representations? We suggest that a class of inductive bias may attenuate spurious observations during learning. Inductive biases shape learning towards certain interpretations of experience over others29–35. Critically, inductive biases tend to orient learning towards stable properties of the environment29,35–37. For example, beliefs formed over the lifespan are encoded in semantic memory, which extracts the statistically regular features of our experiences38,39. Semantic inductive biases that shape learning based on semantic knowledge thus direct learning towards stable properties of past experience29,32,33,37. Causal processes are one such stable property. As the generative components of experience, causal processes are stable across contexts40. Consequently, inductive biases may incidentally shape representations around the predictions of stable causal processes while suppressing unstable spurious observations. The second aim of the present research is to test whether human participants express such a semantic inductive bias on multidimensional predictive learning.
To summarize, the aims of the present research are two-fold. Firstly, we simulate feature-based predictive learning models to demonstrate how spurious observations in multidimensional environments make memory inherently prone to distortion. Secondly, we investigate whether human participants express a semantic inductive bias on predictive learning that theoretically attenuates the distortive effects of spurious observations. Results illustrate the crucial role inductive biases play in shaping memory for effective behavior in our complex world.
Results
Simulations
To characterize how predictive learning may be inherently prone to distortion in multidimensional environments, we implemented a fully feature-based SF learner model (Methods – Computational Models, Feature-Based Model). This model learned a predictive representation for each feature rather than each state in the environment, and could update representations in parallel as features co-occurred.
The feature-based model was simulated on a task in which agents observed simple causal chains defined over features (Fig. 1A, 1B & 2B) and made compositional predictive inferences to earn reward (Fig. 2). The causal chains could span a single or multiple steps (Fig. 1B), but were always observed to start with feature fSTART and end with feature fTERMINAL. Crucially, these causal chains could co-occur in multidimensional environments, generating spurious observations that may interfere with learning (Fig. 1A & 2B). On each trial (Fig. 2A), the agent was tasked with producing a target terminal state sTARGET (or w). The agent was then presented with a set of start feature options, grouped into a number of categories corresponding to the dimensionality of the environment. The agent selected one feature from each category to compose a start state sSTART that may lead to sTARGET. Then the causal process initiated by sSTART and ending with sTERMINAL was observed. Finally, a reward was paid corresponding to the number of features shared between sTERMINAL and sTARGET. The more accurately the agent learned a representation (M) that reflected the outcomes of these causal processes, the more reward it earned.
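For concreteness, the reward rule can be sketched as follows (a minimal Python illustration under our reading of the task; the function name is ours):

def trial_reward(terminal_features, target_features):
    # Reward = number of features shared between the produced terminal state and the target
    return len(set(terminal_features) & set(target_features))

# e.g., producing terminal {A2, B2} for target {A2, D2} earns 1 of a possible 2
trial_reward({"A2", "B2"}, {"A2", "D2"})  # -> 1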

Robot task design.
(A) Trial procedure (1-step, 2-dimensional environment). Participants are presented with a blueprint for the robot they are to ultimately build. They then compose a builder robot they think will produce the target robot. During training, they first see the builder (start) robot they composed, followed by the final built (terminal) robot. Finally, they receive reward commensurate with the number of features that overlap between the terminal and target. The example here is from the semantic congruent condition, where start and terminal features are drawn from the same semantic categories. The top-bottom placement of a robot’s features was randomized within and across trials. For example, here, the start robot was displayed with the antenna on top, while the terminal robot was displayed with the antenna on the bottom. (B) Example transition structures in the semantic incongruent and congruent conditions. Causal transitions (solid arrows) are defined between start and terminal features, and the co-occurrence of start features generates spurious observations (dashed arrows). However, whereas all transitions are defined out-of-category in the semantic incongruent condition, the causal transitions are defined within category in the semantic congruent condition (e.g., head→head). Thus, a semantic bias would be expressed as upweighted learning of the causal transitions in the congruent relative to incongruent condition. (C) Feature co-occurrences at training and test, by target and options definition and semantic congruency condition. During training, a subset of feature combinations (AB or CD – old) were presented in each training block. However, at test, both these old combinations and new feature combinations comprising features from across the two training blocks were observed. Moreover, at test, all possible combinations of old and novel targets and options sets were presented (bidirectional arrows), producing 144 unique trials.
Agents completed the task in both one- and two-dimensional environments, wherein states respectively comprised one or two features (Fig. 1A). Since causal processes could not co-occur in the one-dimensional environments but did co-occur in the two-dimensional environments, we could compare learning in the absence versus presence of spurious transitions between these environments. To probe how spurious transitions affected the formation of compressed multistep representations, we additionally varied the depth of the causal chains. That is, each agent was simulated in an environment with either one-, two-, three-, or four-step causal chains (Fig. 1B).
We first compare agents’ behavioral performance. A strength of the SF learner is that, irrespective of an environment’s depth, the model should converge on an optimal policy if the environment is deterministic. The feature-based model demonstrated this principle in the one-dimensional environments (Fig. 1C). Whereas early in training there were some differences across depths in reward earnings, these converged at the same level by the end of training. However, in the two-dimensional environments, performance converged at lower levels in deeper environments (Fig. 1D). To verify that this decrement in performance was driven by learning about spurious transitions, we simulated a version of the model that was biased towards learning only the causal transitions. Specifically, when incorporating information from the successor state during updating, this causal learner model fully discounted information that would be indicative of a spurious transition (that is, b=1; see Methods – Computational Models: Feature-Based Model). Reinforcing that spurious learning drove the performance deficit in two-dimensional environments, the causal-only agents exhibited comparable performance to that of agents in the one-dimensional environments (Fig. 1D – grey lines).
Two causal and two spurious transitions were observed on each step in all two-dimensional environments – irrespective of causal depth. Given that this rate of spurious to causal transitions did not vary with causal depth, why did the spurious transitions have a more deleterious impact on performance in the deeper environments? To answer this, we inspected how spurious information accrued in the model’s learned representations. For each feature representation an agent learned, we computed the proportion of spurious information as the sum of the values corresponding to spurious associations divided by the sum of the values corresponding to spurious or causal associations. This showed that spurious information had an outsized representation in deeper features, those more steps away from fTERMINAL (Fig. 1E). Thus, even if spurious and causal transitions are observed at the same rate, spurious information can compound in deep feature representations, harming inference (red lines in Fig. 1D).
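This proportion can be computed directly from a row of the learned successor matrix, given masks marking which entries correspond to causal versus spurious associations (a sketch assuming NumPy; the masks and matrix layout are illustrative):

import numpy as np

def spurious_proportion(m_row, causal_mask, spurious_mask):
    # Proportion of a feature's successor weights that fall on spurious rather than causal associations
    spurious_sum = m_row[spurious_mask].sum()
    causal_sum = m_row[causal_mask].sum()
    return spurious_sum / (spurious_sum + causal_sum)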
This compounding occurs because spurious associations are diffuse, which allows spurious noise to propagate widely through memory via SF updating (Fig. 1F). Spurious associations are more diffuse than causal associations because they are not constrained by any specific causal mechanism in the environment. That is, an environment with deterministic causal processes will always produce the same causal transition from fSTART, whereas spurious observations vary simply with whichever other features fSTART co-occurs with across states. Any given feature’s diffuse spurious associations will in turn likely have their own downstream diffuse spurious associations. This broad and deep diffusivity enables spurious information to propagate widely through memory via SF updating (Equation 15), compounding particularly in deeper feature representations with more long-run associations. Thus, the very process that produces efficient multistep representations also enables spurious noise to compound through memory. Given that real-world experience is driven by deep causal processes, this effect has important implications for how predictive representations may be formed in naturalistic multidimensional environments.
The distortive effect of spurious transitions on learning reflects not just an injection of noise, but a tuning of representations to specific learning contexts. Spurious transitions in effect act as links between disparate causal processes. If a set of causal processes frequently co-occur, these links are reinforced. Representations of co-occurring features thus converge on common tunings that reflect conjunctive predictions of future features. In short, instead of tuning to the underlying causal processes (e.g., A1→A2, B1→B2), feature representations tune to the learning context (e.g., {A1, B1} → {A2, B2}). To demonstrate this effect, we analyzed how representations in the two-dimensional environments tuned based on the frequency with which features co-occurred during learning. To induce variation in feature co-occurrence frequency, we manipulated the rates at which different target states appeared. Specifically, half the targets occurred twice as frequently as the others, while terminal features occurred at the same rate across these targets. This encouraged regular inferences on frequent trials, such that certain start features co-occurred more frequently. Then, for each agent, we characterized how similarly each pair of features was represented by computing Pearson’s correlation of their corresponding SF vectors. Reflecting a tuning to specific feature conjunctions, representational similarity increased as features co-occurred more frequently (red lines in Fig. 1G). The causal learners’ representations did not mirror this effect (grey lines in Fig. 1G), demonstrating that this specificity tuning was driven by the spurious observations.
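In code, this representational-similarity measure is simply the Pearson correlation between the corresponding rows of M (a sketch assuming NumPy and a square feature-by-feature successor matrix):

import numpy as np

def representational_similarity(M, i, j):
    # Pearson correlation between the successor-feature vectors (rows of M) of features i and j
    return np.corrcoef(M[i], M[j])[0, 1]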
High specificity tuning had ensuing effects on agents’ inferences. To characterize these inferences after training, agents completed a test phase in which outcomes were hidden to prevent continued learning. Importantly, an extensive set of targets and options were displayed to probe the generalizability of learning (Fig. 2C – akin to semantic congruent). Whereas each training half included only “old” targets/options (AB, CD), test also included “novel” targets/options comprising unseen combinations of features (AC, AD, BC, BD). Additionally, all combinations of target and options were shown (e.g., AB-AB, AB-AC, AB-AD, AB-BC, etc.), producing 144 unique trials. Reflecting specificity tuning, more reward was earned on: (1) old trials that occurred frequently versus infrequently during training (Fig. 1H); and (2) trials with old versus novel options (Fig. 1I; specifically, this analysis was constrained to trials where only one causal and one spurious inference was possible – see Methods, Preregistration and Deviations). The causal learner performed equivalently within both sets of comparisons (grey lines in Fig. 1H&I), again demonstrating that the spurious observations drove both effects.
In all, these simulations demonstrate that spurious observations make predictive learning inherently prone to distortion in multidimensional environments. Specifically, this spurious information makes representations both noisier and more biased towards idiosyncratic contexts rather than underlying causal processes. Mechanisms that limit the deleterious effects of spurious transitions may thus be essential for effective predictive learning in the real world.
Human Experiment
What mechanism may limit the impact of spurious observations on predictive learning? We posit that since semantic memory extracts stable properties across experiences38,39, it may incidentally reflect stable causal associations. Therefore, inductive biases derived from semantic memory may help direct learning towards causal observations.
To test whether a semantic bias shapes human predictive learning, we ran a preregistered experiment in which 100 participants completed the one-step, two-dimensional task used in the simulations (Fig. 2; 72 training trials, 144 test trials). Crucially, we used visual stimuli (robots) where features had an implied semantic structure (robot parts were drawn from four categories: heads, antennas, arms, bodies). We then varied across two between-subjects conditions whether this semantic structure was congruent or incongruent with the causal task structure (Fig. 2B). If participants used a semantic bias to direct predictive learning, we hypothesized that this would be expressed as bolstered learning of the causal structure in the semantic congruent condition.
Semantic bias directed predictive learning
The experiment’s key marker of a semantic bias would be upweighted learning of causal relative to spurious transitions in the semantic congruent condition. To assess this, we analyzed the extent to which observations of causal versus spurious transitions influenced choice. This involved fitting a Bayesian multinomial logistic regression to each participant’s training data that predicted the composition made on each trial based on recent causal and spurious transitions. On a trial, a participant could have produced one of four possible compositions, which we arbitrarily label 0, 1, 2, 3. Treating composition 0 as the reference class, we modelled the choice of each of the other three compositions s relative to composition 0 with a distinct logit model:
log[P(composition = s) / P(composition = 0)] = βs,0 + βs,causal Δcausals,0 + βs,spurious Δspuriouss,0
where: (1) Δcausals,0 is the relative probability that composition s versus 0 will produce the target based on recently observed causal transitions; and (2) Δspuriouss,0 is the relative probability that composition s versus 0 will produce the target based on recently observed spurious transitions (see Methods – Causal and Spurious Transition Evidence for more information). Greater βs,causal and βs,spurious fits thus respectively reflect greater use of causal and spurious learning during inference. Since six coefficients were fit across the three logit models, the three βs,causal and three βs,spurious are metrics that together reflect how much causal versus spurious information drove participants’ behavior in the task.
In line with a semantic bias, we found that causal relative to spurious transitions had a stronger influence on choice in the congruent condition (M = 0.875, SD = 0.352, 95% HDI = [0.173, 1.563]; Fig. 3A; Table S1; Fig. S11). This suggests a semantic bias was leveraged in the congruent condition to better learn the causal predictions.

Training Performance.
(A) Spurious and causal transition influence coefficients by semantic congruency condition. (B). Training reward earnings by training target frequency and semantic congruency condition. (C) Training reward earnings by spurious transition influence coefficient fit. All error bars are 95% HDIs. Density plots show posteriors from the Bayesian regression models, with red lines indicating 0.
Semantic bias reduced learning specificity
The simulations demonstrated that spurious learning binds features into context-specific predictive representations (Fig. 1G&H). Since we expected that the lack of a semantic bias in the semantic incongruent condition would produce more spurious learning, learning should also be more specific in the incongruent condition. In line with this, we found that reward earnings in the congruent compared to incongruent condition differed less by trial frequency (M = −0.139, SD = 0.034, 95% HDI = [−0.204, −0.072]; Fig. 3B; Table S2). To validate these results, we repeated the analysis on a subset of “old” test trials, which had the same targets and feature options as training, except that choice outcomes were not displayed (Fig. S3). Reinforcing the evidence for greater learning specificity in the semantic incongruent condition, there was a negative interaction between condition and frequency (M = −0.191, SD = 0.096, 95% HDI = [−0.366, −0.001], Table S3).
Central to the specificity hypothesis is the rationale that spurious learning tunes representations towards frequent robot transitions. It should therefore be expected that – irrespective of condition – participants whose choices were more strongly influenced by the spurious transitions should also display more sensitivity to the frequency manipulation. An exploratory analysis supported this prediction, showing that participants earned more reward on frequent than infrequent trials when their training choices were overall more strongly influenced by spurious transitions (M = 0.068, SD = 0.009, 95% HDI = [0.050, 0.087]; Fig. 3C, Table S4). Conversely, there was no relationship between causal transition influence and reward earnings on frequent versus infrequent trials (M = −0.001, SD = 0.004, 95% HDI = [−0.009, 0.007]; Fig. S4). Together, these results demonstrate that higher learning specificity was driven by greater spurious learning.
We can again validate these results by repeating the analysis on old test trials. Reinforcing the evidence for greater learning specificity with more spurious learning, test reward earnings differed more by training trial frequency with greater training spurious transition influence (M = 0.093, SD = 0.024, 95% HDI = [0.043, 0.139]; Fig. S5, Table S5) and did not differ by training trial frequency with greater training causal transition influence (M = −0.006, SD = 0.011, 95% HDI = [−0.029, 0.015]; Fig. S6).
No evidence the semantic bias improved generalization
The simulations demonstrated that by tuning learning to frequent state transitions, spurious learning should impair inference not only about infrequent states, but also about entirely novel states (Fig. 1I). A semantic bias that reduces spurious learning should thus improve generalization to novel states during the test phase.
We evaluated participants’ abilities to make inferences about novel states on trials with novel options, and to make inferences for novel tasks or goals on trials with novel targets. We predicted generalization about both would be impaired in the incongruent condition, where spurious learning was unattenuated by the semantic bias. However, we found no differences in reward earnings by condition and options or target novelty (options: M = 0.025, SD = 0.022, 95% HDI = [−0.019, 0.069], Fig. 4A; target: M = 0.032, SD = 0.023, 95% HDI = [−0.011, 0.078], Fig. 4B; Table S6). This may in part be because the environment was not sufficiently complex to make differences in old versus novel performance identifiable. We explore this possibility below.

Test Performance.
(A) Test reward earnings by semantic congruency condition and options novelty. (B). Test reward earnings by semantic congruency condition and target novelty. For A and B, analyses were restricted to test trials where only one possible causal and spurious inference was possible based on the options and target. (C) Composition spurious predictiveness by semantic congruency condition on test trials where only spurious inferences were possible based on the options and target. All error bars are 95% HDIs. Density plots show posteriors from the Bayesian regression models, with red lines indicating 0.
We expected generalization to be less accurate in the incongruent condition because greater spurious learning would interfere with causal transition inferences. However, we also predicted that spurious learning could be associated with qualitative differences in the use of spurious information during inference on certain novel trials. Namely, there was a subset of test trials on which only spurious inferences could be made. For example, on a trial in the congruent condition with options AC and target BD, no causal transition and two spurious transitions (A⇢B, C⇢D) would have been observed during training. We hypothesized that greater spurious learning in the semantic incongruent condition would encourage participants to actively use spurious information during inference on these trials.
To test this, for each composition, we computed composition spurious predictiveness defined as the number of transitions between the composition and target features that would have been spuriously observed during training. This measure can also be thought of as the reward that would have been received if the causal task structure was defined in terms of these spurious transitions. However, while there was a trend towards less spurious inference in the semantic congruent condition, we did not find statistical evidence for this effect (M = −0.079, SD = 0.061, 95% HDI = [−0.201, 0.33]; Fig. 4C; Table S7).
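This measure can be sketched as a simple count (illustrative Python; spurious_pairs is a hypothetical set holding the start→terminal feature pairs that were spuriously observed during training):

def spurious_predictiveness(composition_features, target_features, spurious_pairs):
    # Number of composition-to-target feature pairs that were spuriously observed during training
    return sum((c, t) in spurious_pairs
               for c in composition_features
               for t in target_features)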
Idiosyncratic biases captured by the successor features model
Each person’s semantic knowledge is idiosyncratic. We should therefore expect idiosyncratic differences in semantic bias between participants. However, our analyses thus far have assumed that a semantic bias in the robot task arises solely based on discrete differences between contrived categories (i.e., heads, arms, bodies, antennas) and should be common across participants. We therefore sought to capture individual differences in bias using our computational modelling framework.
The feature-based model was fit separately to each participant’s data. Crucially, the model included a causal-bias parameter (b ∈ (0, 1)) that governed the extent to which information indicative of a spurious transition was downweighted during updating (Equation 15). Since b is agnostic to the underlying memory structure that helps the learner attenuate spurious observations, it can capture individual differences in bias that go beyond the task’s categorical semantic structure.
While we have assumed so far that participants learned at the level of features (feature transitions), it was entirely possible for them to learn at the level of feature conjunctions (robot transitions). Since the inductive bias acts on feature-based transition dynamics, it would be invalid to interpret differences in b for conjunctive learners. We therefore sought to identify which participants performed feature-based versus conjunctive learning, and to verify that the former generally captured behavior better in the task. To do this, we compared the fit of the feature-based model to two conjunctive learner models. The basic conjunctive model represented each robot as a unique state, acquiring distinct predictive representations for different robots that may share a common feature. The conjunctive sampler model learned about robots in an identical way, except that it also possessed an inference mechanism that enabled it to retroactively integrate prior learning at choice. The advantage of the latter model is that it can generalize learning to novel states, better reflecting participants’ abilities to make inferences about novel options and targets during test.
We characterized whether participants learned at the level of features or feature conjunctions by comparing the fit of the three models (Table S13). For each participant, the best fitting model was identified as that which minimized the Akaike information criterion (AIC). A null model was also considered by computing an AIC based on the random choice probability for each training and test trial (p = 0.25). Statistically verifying that feature-based learning best explained participants’ choices, a multinomial logistic regression found that a larger proportion of participants (p = 0.46) were better fit by the feature-based model than each of the alternative models (null: p = 0.24, M = −0.640, SD = 0.254, 95% HDI = [−1.174, −0.176]; conjunctive: p = 0.09, M = −1.629, SD = 0.362, 95% HDI = [−2.314, −0.916]; conjunctive sampler: p = 0.21, M = −0.778, SD = 0.262, 95% HDI = [−1.130, −0.268]). To assess the relative strength of the four model fits within each participant, we converted AIC values to relative likelihoods, which reflect the plausibility of a given model relative to the alternatives. This analysis showed that participants best fit by the feature-based model were strongly best fit by it, as revealed by the low relative likelihoods of the alternative models among these participants (Fig. 5A, Table S13).

Model-based analyses.
(A) Relative likelihoods of computational model fits by participant. (B). Spurious and causal transition influence coefficient fit by b fit. (C) Training reward earnings on trials with frequent versus infrequent targets by b fit. (D) Test reward earnings on trials with old versus novel options by b fit. (E) Composition spurious predictiveness by b fit. All error bars are 95% HDIs. Density plots show posteriors from the Bayesian regression models, with red lines indicating 0.
Having identified the participants best fit by the feature-based model, we could investigate the fit of the causal-bias parameter b amongst these participants. The transition influence analysis demonstrated that participants were biased towards causal information in the semantic congruent condition (Fig. 3A). We predicted that this differential bias would be captured in the fit of b. However, we found no difference in b by condition (M = −0.106, SD = 0.402, 95% HDI = [−0.880, 0.676]; Fig. S7).
The lack of difference in b fit by condition was surprising given that we observed differences in behavior by condition (Fig. 3A) and the computational models and b were sufficiently recoverable (Fig. S1, S2). Despite this, b may still have been sensitive to idiosyncratic biases that emerge irrespective of the between-subjects conditions. We therefore ran a set of exploratory analyses to test whether the fit of b captured patterns of behavior in line with a causal bias.
Firstly, we analyzed whether b captured real differences in the attenuation of spurious information. Indeed, a higher b fit was associated with greater causal relative to spurious transition influence (M = 2.638, SD = 0.752, 95% HDI = [1.190, 4.125]; Fig. 5B; Table S8).
Next, we analyzed whether a higher b fit was associated with lower learning specificity. In line with this, training reward earnings differed less by trial frequency with a greater b fit (M = −0.265, SD = 0.058, 95% HDI = [−0.379, −0.149]; Fig. 5C; Table S9). Somewhat mirroring this result, when repeating the analysis on old test trials, there was a trend towards test reward earnings differing less by training trial frequency with a greater b fit, though there was no statistical evidence for this effect (M = −0.254, SD = 0.160, 95% HDI = [−0.564, 0.058]; Fig. S8; Table S10).
Our analyses of generalization performance by semantic congruency condition did not demonstrate differences in choice accuracy on test trials with old versus novel targets or options (Fig. 4A&B). Given that b may reflect idiosyncratic biases that differ from the semantic congruency manipulation, the fit of b may be able to capture differences in generalization at test. However, we did not observe differences in test reward earnings by b fit and options or target novelty (options: M = −0.012, SD = 0.037, 95% HDI = [−0.082, 0.062], Fig. 5D; targets: M = −0.036, SD = 0.037, 95% HDI = [−0.107, 0.039], Fig. S9).
Finally, we found numerical differences in the spurious predictiveness of test compositions by semantic congruency condition (Fig. 4C). We therefore explored whether the fit of b captured individual differences in composition spurious predictiveness. Supporting this, a higher b fit was associated with lower composition spurious predictiveness at test (M = −0.427, SD = 0.126, 95% HDI = [−0.672, −0.181]; Fig. 5E; Table S12).
In all, these model fits capture individual differences in causal bias that may better reflect the idiosyncrasy of semantic memory that shapes learning.
Causal bias has an inflated benefit in more complex environments
The robot task captured components of environment complexity often neglected in predictive learning tasks. Namely, it allowed multiple causal processes to unfold in parallel, and imbued states with a semantic structure that could be used to direct learning. Nevertheless, the task environment was still simple relative to everyday human experiences, which are higher dimensional and involve deeper causal processes that unfold over more extended periods of time. We therefore wanted to characterize how such a bias might affect behavioral outcomes in more naturalistic settings. To do this, we simulated the feature-based models that provided best fits to participants’ data (Fig. 5A, n = 46) across environments varying in dimensionality and depth.
The environments had two levels of dimensionality (2- and 4-dimensional) and four levels of depth (1-, 2-, 3-, or 4-step). There were eight targets in each environment, with half of the targets occurring at twice the rate as the other half of targets. Agents were tested in each environment twice: once after 72 training trials; once after 1080 training trials. The 72 trial simulations enabled us to characterize test performance given the extent of training participants received in the robot task. The 1080 trial simulations enabled us to characterize test performance after agents’ learning converged. The 1080 trial simulations could therefore quantify differences in test performance that were driven not by differences in the stage of learning, but by qualitative differences in representation.
To characterize how a causal bias affected behavioral outcomes, we compared test performance between agents with a low causal bias (b < 0.5) versus a high causal bias (b > 0.5). With 72 trials of training (Fig. 6A), the difference in participants’ performance by b fit was recreated in the one-step, two-dimensional robot task environment. That is, high-bias agents earned more reward than low-bias agents, but neither group exhibited clear differences in reward earnings on trials with old versus novel options. This pattern persisted across environments, even as performance in both groups generally declined with increasing dimensionality and causal depth. However, with 1080 trials of training (Fig. 6B), additional differences in performance by b became apparent. While reward earnings on old versus novel trials were indistinguishable amongst high-bias agents, low-bias agents exhibited clearly poorer generalization. Namely, in the two-dimensional environments, low-bias agents earned as much, if not more, reward than high-bias agents on old trials, while earning notably less reward on novel trials. Moreover, in the four-dimensional environments, low-bias agents’ learning was so severely impaired that their performance was poor even on old trials. Therefore, while low-bias agents’ performance was equivalent on old versus novel trials, this is not because they learned fundamentally generalizable representations, but because learning was noisy overall.

Simulated test performance of best-fitting feature-based models.
Feature-based models that provided a best fit to participants’ data were simulated in environments varying in dimensionality and causal depth for either (A) 72 training trials or (B) 1080 training trials. P(Reward) is the proportion of the maximum reward (1) agents could earn on each trial. Here we compare P(Reward) on trials with old versus novel options for agents with a low bias (b < 0.5) versus high bias (b > 0.5). The cell outlined in black is the robot task environment.
These results may partially explain why we failed to observe differences in participants’ generalization performance by b fit and semantic congruency condition. Firstly, the robot task may not have been complex enough to reveal differences in generalization performance given the sample size and number of training trials. Secondly, spurious information can so severely distort representations that performance even on old trials is impaired, making it more difficult to distinguish from performance on novel trials. More generally, these simulations imply that the bias we captured in the robot task should have more dramatic behavioral benefits in more naturalistic environments. This reinforces that such inductive biases on multidimensional predictive learning could be crucial to limit spurious distortions in the real world.
Discussion
The present work investigated how predictive memory may be distorted in multidimensional environments, and how inductive biases may mitigate these effects. Our simulations illustrate that the spurious observations that arise from multidimensional transition dynamics may both overly tune representations to specific contexts and generate noise that compounds throughout memory. This motivates characterizing inductive biases the brain may leverage to suppress such spurious information. In line with this idea, we found that participants relied on a semantic bias that shaped learning based on the semantic relatedness of causes and outcomes. We propose that, since semantic memory reflects the stable properties of our experiences38,39, this bias tunes learning towards stable causal information in our everyday lives, overcoming harmful distortions in memory.
Some may debate whether semantic biases do indeed extract causal information outside of our contrived task setting. This raises the question: can the bias still be beneficial even if it does not direct learning towards causal information? We speculate that it can. Our simulations demonstrated that the high diffusivity of spurious associations enables spurious noise to propagate widely through memory, particularly compounding in deep feature representations (Fig. 1E&F). An inaccurate inductive bias may still help mitigate these effects. By constraining learning to a subset of transitions, the feature associations become less diffusive, and the widespread propagation of spurious noise through memory may be curtailed. Therefore, even if learning is not tuned towards causal information, the memory system as a whole may benefit via the reduced compounding of spurious noise.
Successor features (SF; or the successor representation, originally) was introduced to enable efficient predictive inference in large, multi-step environments without relying on slower model-based planning15,17. Since the computational costs of model-based planning scale steeply with the size and depth of the environment, SF has been viewed as especially well-suited for modelling naturalistic inference in real-world environments3,18. However, our simulations illustrate that spurious associations may more severely distort SF in these complex environments where its computational advantages are considered the greatest (Fig. 1D-F & 6B). This firstly reinforces the importance of building additional inductive biases into SF algorithms to mitigate these effects. Moreover, these results generate new behavioral predictions to evaluate the validity of SF as a model of human predictive learning. Since the robot task used one-step terminating transitions, we were unable to test these predictions here. Future work should therefore investigate multidimensional predictive learning and the behavioral outcomes of such inductive biases across environments that span levels of dimensionality and causal depth.
Prior work has shown that SF captures key characteristics of putative predictive maps in the hippocampus23,26. This suggests that the hippocampus may also encode the feature-based predictive representations modelled with SF in the present work. On the one hand, this proposal could be at odds with influential theories about the division of labor between the cortex and hippocampus, whereby the neocortex slowly learns the general features that recur across events, while the hippocampus rapidly encodes conjunctive representations of specific events39,41,42. Relatedly, while the medial temporal lobe may support compositional generalization, the entorhinal cortex, not hippocampus, has been suggested to learn the compositional representations that underlie such inferences1,11,43. On the other hand, the hippocampus itself may have complementary learning systems, with the trisynaptic pathway rapidly encoding conjunctive representations, while the monosynaptic pathway learns the regularities that recur across events44,45. Moreover, it has been shown that a biologically plausible neural network that relies on inhibitory competition (thought to support pattern separation in the hippocampus44,46) and a bias acquired through Hebbian learning can form generalizable representations that preserve feature information31. To resolve this tension in the literature, continuing work is investigating whether the hippocampus encodes the biased feature-based representations characterized by our findings.
The current work has important implications for research on artificial intelligence, which has assumed an ever more central role in society with recent developments in large language models47–50. The reliance of these large models on slow associative learning from vast quantities of data have raised concerns about their immense energy usage, and thus severe environmental costs51–54. These concerns are particularly pressing given recent proposals that performance can be improved by simply further scaling existing model architectures55,56. The present work supports an alternate approach. The human brain relies on a slew of inductive biases that have evolved to produce accurate learning with limited energy expenditure29,35. By building such biases into new model architectures, the environmental costs of AI can be curtailed.
To summarize, the present work introduced a framework for studying the dynamics of predictive learning in multidimensional environments that better reflect the complexity of real-world human experience. We reinforce broad perspectives that effective behavior cannot arise through slow, inefficient associative learning alone, and therefore requires the influence of inductive biases that constrain and accelerate learning29,32,33,35,37,57. Particularly, we demonstrated the role existing semantic knowledge may play in disentangling webs of causal information into accurate feature-based predictive representations.
Methods
Computational Models
We developed our formal theoretical framework using computational modelling.
An agent’s interaction with its environment was modelled as a Markov Decision Process (MDP), whereby the agent makes actions (a ∈ A) to step between environment states (s ∈ S) based on transition function T(s′|s, a); where the agent transitions from state s to successor state s′ following action a. To render this a multidimensional framework, each state comprises d features Φ(s), and T(s′|s, a) emerges from a transition function defined at the feature level:
T(s′|s, a) = ∏i Tf(φi(s′) | φi(s), a), where Tf is the feature-level transition function and φi(s) denotes the ith feature of state s.
This definition of T assumes that the relationship between observed events is generative, arising from a fundamental set of causal processes that may unfold in parallel. The agent’s goal is instantiated in task vector w, which represents preferences over Φ(s). The inner product of w and Φ(s) defines a reward function R(w, s), dictating the immediate reward received for entering s:
R(w, s) = w⊤Φ(s)
The agent should make actions to navigate towards states with features that will be the most rewarding given the current task w. To infer which actions will give cause to rewarding states, the agent must therefore learn a predictive representation that approximates the transition dynamics defined by T. We model such predictive learning as a successor features (SF) learner16,17,19.
The SF model learns a representation of each state s in the environment that reflects the quantity of features likely to be encountered upon entering s and in subsequent states following s. These expectations are represented in successor matrix M(s). We implemented an incremental SF learner that uses a temporal difference rule to learn M(s). After state transition s → s′ is observed, the feature vector for the current state Φ(s) is added to vector M(s). Since this update incorporates the discounted feature predictions of the successor state s′, the agent gradually learns to predict distant feature visitations. Formally:
M(s) ← M(s) + α[Φ(s) + γM(s′) − M(s)]
where free parameter α ∈ (0, 1) controls the learning rate and free parameter γ ∈ [0, 1] controls the discount rate. Since the present task uses one-step terminating transitions, we set γ = 1.
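A minimal sketch of this incremental update (assuming NumPy, with M stored as an array of per-state SF vectors and phi[s] giving the binary feature vector of state s; parameter names are illustrative):

import numpy as np

def sf_update(M, s, s_next, phi, alpha, gamma=1.0):
    # Temporal-difference update of the successor-feature vector M[s] after observing s -> s_next
    td_error = phi[s] + gamma * M[s_next] - M[s]
    M[s] = M[s] + alpha * td_error
    return M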
To estimate the value of entering s given some task w, M(s) is used to define a value function V(w, s) akin to R(w, s):
V(w, s) = w⊤M(s)
On each trial of the robot task, an agent should compose the robot s which has a representation M(s) that maximizes V(w, s) for the trial’s target w. Lastly, a probabilistic choice policy is computed for composing state s* from the set of candidate compositions Ct on trial t using a softmax with inverse temperature β ∈ (0, ∞):
P(s* | Ct) = exp(βV(w, s*)) / Σs∈Ct exp(βV(w, s))
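Value estimation and the choice rule can then be sketched together (assuming NumPy; candidates indexes the trial’s candidate compositions into M):

import numpy as np

def choice_probabilities(M, w, candidates, beta):
    # Softmax policy over candidate compositions, using V(w, s) = w . M[s]
    values = np.array([w @ M[s] for s in candidates])
    exp_v = np.exp(beta * (values - values.max()))  # subtract max for numerical stability
    return exp_v / exp_v.sum()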
We implemented three forms of this basic SF learner: (1) conjunctive (representing environment states as discrete conjunctions of features); (2) conjunctive sampler (conjunctive representation with a retroactive integration mechanism to generalize to novel states); and (3) feature-based (representing environment states as independent features).
Conjunctive Model
In contrast to the basic SF learner model described thus far, the conjunctive model represents environment features Φ(s) in terms of the conjunctive successor state itself. That is, Φ(s) is a one-hot vector with a length equal to the number of task states (unique robots), and a 1 in the position corresponding to state s. Successor matrix M is therefore square, with the number of rows and columns corresponding to the number of unique states. This implementation of SF is akin to the traditional successor representation model3,17,18.
On each trial, the agent constructs an action set that comprises all possible candidate compositions Ct that could be built from the trial’s start feature options. The agent computes V(w, s*) based on each candidate composition s* in Ct.
Conjunctive Sampler Model
The conjunctive model suffers from two limitations as a model of human behavior. Firstly, it must compute all possible combinations of features to construct a candidate composition set during inference. This approach may be computationally infeasible in high-dimensional real-world environments where the number of possible feature combinations would likely be intractably large to compute. Secondly, the conjunctive model is unable to generalize learning to make inferences about novel compositions at test. The test phase of the robot task included trials with novel options sets, comprising combinations of feature categories that were not seen during training (AC, AD, BC, BD). Since the conjunctive model would not have learned about the candidate compositions that could be constructed from these options, it could not make inferences about them, and would thus make random choices on novel trials. This does not reflect the behavior of participants, who performed above chance on test trials with novel options (Fig. 4). Both limitations can be addressed by considering an alternative form of conjunctive model, which represents and learns about states as described, but uses a more flexible inference mechanism to estimate state values at choice.
Previous work has suggested that humans may use a retroactive integration mechanism to sample and combine memories during choice inference58–66. This is thought to allow humans to adaptively re-combine prior knowledge into novel configurations to generalize learning to novel problems. We thus implemented a form of retroactive integration that computes weights g(s*) over M based on each represented state’s recency, frequency, and similarity to s* — properties that all drive memory retrieval67,68. The estimated value V*(w, s*) is then computed as a weighted sum over V:
V*(w, s*) = Σs∈S g(s*)s V(w, s)
To compute g(s*), the agent represents the recency and frequency of states in memory as integer vectors (gREC, gFREQ) with lengths corresponding to the size of S. After composition s is made on a training trial, a count is added to gREC in all positions except that corresponding to s, which is set to zero. A count is also added to the position of s in gFREQ. To allow the similarity computation, the agent stores each observed Φ(s) in memory. When evaluating s*, the similarity gSIM(s*, s) of s* to each s in memory is calculated as the number of shared features between Φ(s*) and Φ(s):
gSIM(s*, s) = Φ(s*)⊤Φ(s)
At inference, gREC, gFREQ, and gSIM (s*) are normalized by dividing by their respective maximum values. We then compute g(s*) as a weighted sum over these:
g(s*) = ωREC gREC + ωFREQ gFREQ + ωSIM gSIM(s*)
where ωREC, ωFREQ, and ωSIM control the degree to which each vector directs memory sampling. To limit the number of free parameters, we treated ωSIM (ωSIM ∈ [0, 1]) as the sole free parameter that in turn determined the values of ωREC and ωFREQ:
ωREC = ωFREQ = (1 − ωSIM) / 2
Finally, to capture the extent to which few versus many memories are sampled and integrated over, sampling specificity parameter Ψ (Ψ ∈ (1, ∞)) was applied to a normalized g(s*):
g(s*) ← g(s*)^Ψ / Σs g(s*)s^Ψ
A larger value of Ψ produced more sampling specificity, directing sampling to the most probable states in memory.
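A sketch of how these weights could be computed (assuming NumPy; phi_memory stacks the stored feature vectors, and the equal split of the remaining weight between recency and frequency, as well as the exact form of the specificity transform, are our assumptions rather than details confirmed by the text):

import numpy as np

def sampling_weights(g_rec, g_freq, phi_memory, phi_star, w_sim, psi):
    # Combine recency, frequency, and feature-overlap similarity into sampling weights g(s*)
    g_sim = phi_memory @ phi_star                      # shared-feature counts with each stored state
    g_rec, g_freq, g_sim = (v / v.max() for v in (g_rec, g_freq, g_sim))
    w_rec = w_freq = (1.0 - w_sim) / 2.0               # assumed split of the remaining weight
    g = w_rec * g_rec + w_freq * g_freq + w_sim * g_sim
    g = g / g.sum()
    return g ** psi / (g ** psi).sum()                 # sampling specificity (assumed form)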
The described retroactive integration mechanism addresses the conjunctive model’s limited ability to generalize. However, it does not address the model’s computationally expensive action set generation mechanism. To account for this, we simply have the model treat the feature options themselves as the action set on each trial, requiring no additional computation. The basic conjunctive model cannot make inferences about these features, since it represents them distinctly from the feature conjunctions it learns about, and does not possess a generalization mechanism to make inferences about them. Since the conjunctive sampler model possesses such a generalization mechanism, it is not limited in this regard. Thus, the conjunctive sampler model can utilize a simpler feature-based action set.
Feature-Based Model
The final model reflects our primary hypothesis for this study, which is that participants will represent the environment in terms of the independent features that define states. That is, M is again square, but the rows and columns correspond to independent feature instances rather than conjunctive states. Since a given state transition comprises multiple feature transitions, multiple rows of M are updated on each trial. This parallelization is implemented by updating the full matrix M, weighted by Φ(s):
Mi ← Mi + α Φ(s)i [Φ(s) + γΦ(s′)⊤M − Mi], for each feature row i of M
where Φ(s) is defined as a binarized vector representation of the state’s features, as in the basic SF learner.
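One way such a parallel update could be implemented is sketched below (assuming NumPy; pooling the successor’s feature predictions with Φ(s′)⊤M is our reading of the update, not verified against the authors’ code):

import numpy as np

def feature_sf_update(M, phi_s, phi_next, alpha, gamma=1.0):
    # Parallel update of all rows of M corresponding to the active features in phi_s.
    # phi_s, phi_next: binary feature vectors of the current and successor states.
    successor_pred = phi_next @ M                # pooled predictions of the successor's features (assumption)
    target = phi_s + gamma * successor_pred      # current features plus discounted successor predictions
    td_error = target[None, :] - M               # row-wise prediction errors
    M += alpha * phi_s[:, None] * td_error       # only rows of active features receive a nonzero update
    return M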
The present study investigates an inductive bias towards causal transitions during predictive learning. We implemented this bias via matrix B, where each element Bij specifies the weight given to information from feature j when updating the representation of feature i: elements corresponding to causal transitions retain a weight of 1, while elements indicative of spurious transitions are downweighted to 1 − b, yielding the biased matrix B′.
We normalize B′ so that rows sum to 1, and then apply the bias to the temporal discounting term:

Mi ← Mi + α Φ(s)i [Φ(s) + γ(B′i ∘ Φ(s′)⊤M) − Mi], for each feature row i, where B′i is row i of the normalized bias matrix and ∘ denotes element-wise multiplication.
Since true transitions were defined within semantic categories in the congruent condition, we hypothesized a semantic bias would direct participants’ learning towards the true transitions. However, since true transitions were defined between semantic categories in the incongruent condition, we hypothesized a semantic bias could not direct learning towards the true transitions. This pattern of results should be expressed as a higher fit b in the congruent condition compared to the incongruent condition.
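A heavily hedged sketch of how this bias could enter the update (our assumption of the mechanism: entries of B for transitions consistent with the bias are 1, entries indicative of spurious transitions are 1 − b, rows are renormalized, and the bias scales the discounted successor term):

import numpy as np

def biased_feature_sf_update(M, phi_s, phi_next, B, alpha, gamma=1.0):
    # Feature-based SF update in which bias matrix B downweights putatively spurious transitions
    B_norm = B / B.sum(axis=1, keepdims=True)           # rows of the biased matrix sum to 1
    successor_pred = phi_next @ M                        # pooled successor predictions (assumption)
    target = phi_s[None, :] + gamma * B_norm * successor_pred[None, :]
    td_error = target - M
    M += alpha * phi_s[:, None] * td_error               # only rows of active features are updated
    return M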
Simulations
In the first set of simulations (Fig. 1), the feature-based model was simulated in each task environment 250 times, with random α and β values. Learning rate α was sampled from a uniform distribution on (0, 1). Inverse temperature β was sampled as
For the second set of simulations (Fig. 6), we identified the feature-based models that provided best fits to participants’ data (Fig. 5A; n = 46; α: M = 0.7631, SD = 0.3486; β (sigmoid transformed): M = 0.6153, SD = 0.1520; b: M = 0.6652, SD = 0.3927). These models were simulated on the task in environments varying in dimensionality (d = 2, d = 4), causal depth (1-, 2-, 3-, 4-step), and number of training trials (72, 1080).
Model Fitting
The three computational models were separately fit to each participant’s choice data using maximum likelihood estimation. Given a set of parameter values, the softmax function estimates the probabilities of a participant’s sequence of choices. To fit each model, we computed the negative log likelihood L from these probabilities.
To account for differences in model complexity during model comparison, we converted L to the Akaike information criterion (AIC = 2(k + L), where k is the number of free parameters), penalizing models with more free parameters. For each participant, the best fitting model was identified as the one that minimized the AIC.
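In code (a sketch; nll is the fitted negative log likelihood L and k the number of free parameters; the relative-likelihood conversion uses the standard exp((AICmin − AICi)/2) form):

import numpy as np

def aic(nll, k):
    # Akaike information criterion: AIC = 2(k + L), where L is the negative log likelihood
    return 2 * (k + nll)

def relative_likelihoods(aics):
    # Plausibility of each model relative to the best (minimum-AIC) model
    aics = np.asarray(aics, dtype=float)
    return np.exp((aics.min() - aics) / 2.0)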
Human Experiment
Participants
One-hundred and ten Amazon Mechanical Turk participants were recruited for the learning task via CloudResearch’s toolkit. Ten participants were excluded due to the preregistered criterion of composing a robot on less than 85% of trials. This left a final sample of 100 participants (mean age = 38.92; standard deviation age = 9.95; 31 female; 69 male). Participants received $13 plus a fixed bonus of $3 for completing the 60-minute task. The value of the bonus was not revealed until the end of the study. Consent was obtained in accordance with a protocol approved by the University of Chicago’s Institutional Review Board (protocol #: 20-1324).
Stimuli
The task stimuli were line drawings of robots. Each consisted of two features drawn from four categories (antenna, head, body, arms – randomly mapped to task-relevant labels ABCD). During training, only AB targets appeared in the first block, and only CD targets appeared in the second block. Likewise, only AB/CD options appeared in the first block of the congruent/incongruent conditions, and only CD/AB options appeared in the second block of the congruent/incongruent conditions. During test, participants in both conditions were shown robots with all combinations of feature categories (AB, AC, AD, BC, BD, CD). We label robots seen during training as “old” (AB, CD) and robots only seen at test as “novel” (AC, AD, BC, BD; Fig. 2C). There were four feature instances per category (e.g., four head variations, four antenna variations, etc.). Half of these instances were “start” instances composing start items (builder robots); the other half were “terminal” instances composing terminal items (output robots). Predictive associations were deterministic mappings from start to terminal instances/items (start→terminal; Fig. 2B). Each start item/instance was associated with a single terminal item/instance. In all, there were 16 feature instances (8 start, 8 terminal) composing 48 items (24 start, 24 terminal; 16 old; 32 novel).
Every robot had a “flipped” and “upright” version, which differed only in which feature instance occupied the top versus bottom position within the stimulus. We randomized which of these two versions was presented for each robot display within and across trials to control for learning and inference effects based on features’ spatial locations rather than their identities.
Training
Participants were instructed that they would play a game in which they would act as the owner of a robot factory. Their task was to build robots according to specifications. There were two types of robots: “output” robots (terminal items) – the end products the participant was trying to produce; and “builder” robots (start items) – robots that participants themselves composed from sets of builder parts, which in turn produced the output robots. On each trial, a “target” output robot was shown. Participants’ goal was to compose the builder that would produce this target. Thus, performance in this task centrally depended on accurate start→terminal predictive inferences.
Trials unfolded as follows (Fig. 2A). At the start of each trial, a fixation cross was shown before a target terminal item was presented in the center of the screen. Next, an options set of four builder parts (start instances) from two feature categories was shown along with the target. The participant used the cursor to select a feature from each category to compose a start item. As features were selected, the composed item was shown at the bottom of the screen. The two categories were presented on either side of the screen; however, which category appeared on which side varied randomly from trial to trial. Additionally, the position of each instance within a category varied randomly from trial to trial. This spatial randomization ensured that participants’ choices reflected inferences based on features’ identities rather than their spatial locations. After composition, a start→terminal sequence was shown, starting with the composed start item and terminating with the associated terminal item being produced. Again, we randomized whether an “upright” or “flipped” version of each stimulus was shown to ensure learning was based on the features’ identities rather than their spatial locations. Finally, a point reward was presented. Reward was computed as the number of feature instances that matched between the produced and target terminal items. For these two-feature items, reward values could thus be 0, 1, or 2. To earn a mean reward greater than chance (chance reward = 1), participants needed to learn the start→terminal sequences to infer which start robot they should compose to produce the target. To encourage a response on each trial, a −5 reward was given if the participant did not select two features. There were 36 trials per block, and blocks were separated by a 60 second break, producing 72 training trials in total.
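The reward rule can be summarized in a few lines; the feature labels in the usage example are illustrative, not the actual stimulus identities:

```python
def trial_reward(produced_features, target_features, responded=True):
    """Points for one trial: number of matching feature instances (0, 1, or 2),
    or -5 if the participant failed to select two features."""
    if not responded:
        return -5
    return sum(1 for f in produced_features if f in target_features)

# Example: the produced terminal robot shares one of two features with the target
print(trial_reward({"A2", "B3"}, {"A2", "B4"}))  # -> 1
```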
We probed the specificity of inference by manipulating the rates at which specific target items were presented. Half the targets were presented at twice the rate of the other half of targets. However, features were presented at a uniform rate across these targets. We hypothesized that if participants made consistent inferences about each target, the rates at which they composed start items and observed specific start→terminal transitions would converge on the target rates. Our simulations demonstrate that this manipulation should manifest in differential performance for frequent versus infrequent targets only if the spurious transitions are learned (Fig. 1GH). We thus use differential performance by target frequency as a metric of learning specificity, which we expected to vary with the degree of causal bias.
Test
Test was identical to training, except that both old and novel targets and option sets were presented, and participants did not receive feedback on their responses. Specifically, every possible combination of old and novel targets and options was shown, producing 144 unique trials. By comparing choice performance on trials with old versus novel options or targets, we could probe the generalizability of participants’ learning. Target frequency was not manipulated at test, and the trial order was randomized. Test was split into two blocks of 72 trials, separated by a 60 second break.
Semantic Congruency Condition
A semantic bias on predictive learning would be expressed as upweighted learning of transitions between semantically-related features, and downweighted learning of transitions between semantically-unrelated features. To test whether a semantic bias shapes multidimensional predictive learning in humans, we implemented two between-subjects conditions in the robot task (Fig. 2B, n per condition = 50). In the semantic-congruent condition, start→terminal transitions were defined between feature instances from the same semantic category (e.g., start antennas always lead to terminal antennas). In the semantic-incongruent condition, start→terminal transitions were defined between feature instances from different semantic categories (e.g., start antennas may always lead to terminal arms). We hypothesized that a semantic bias would upweight learning of the semantically-congruent transitions. Thus, there should be better learning of the causal relative to spurious transitions in the congruent compared to the incongruent condition.
Causal and Spurious Transition Evidence
For the transition influence analysis, causal and spurious transition evidence variables were computed on each trial for each candidate composition as follows. A candidate composition s has two features. For each of these features, we found the most recent trial on which it was observed. We then recorded whether, on that past trial, the feature transitioned to either (or both) of the target features of the present trial. If a causal transition was observed, a count of 1 was added to the causal transition evidence for s. If a spurious transition was observed, a count of 1 was added to the spurious transition evidence for s.
For example, s may have features A1 and B1, and the target may have features A2 and B4. On the most recent trial A1 was seen, it may have transitioned to a terminal item with features A2 and B4. Thus, the transitions including A1 are: (causal) A1→A2; (spurious) A1⇢B4. Both transitions would also be observed for s to the present target, and so a count of 1 is added to both the causal and spurious evidence for s. On the most recent trial B1 was seen, it may have transitioned to a terminal item with features A2 and B2. Thus, the transitions including B1 are: (causal) B1→B2; (spurious) B1⇢A2. Only the spurious transition would also be observed for s to the present target, and so a count of 1 is added only to the spurious evidence for s. In all, the counts for s would be: causal = 1; spurious = 2. We finally divide these counts by the number of features in the composition (2) to create normalized evidence values: causal_s = 0.5; spurious_s = 1.
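A sketch of this computation is shown below. The dictionary-based bookkeeping and the feature labels are illustrative, and causal_map stands in for the true start→terminal feature mapping described above; the printed result reproduces the worked example.

```python
def transition_evidence(composition, target, last_outcome, causal_map):
    """Causal and spurious transition evidence for one candidate composition s (sketch).

    composition : start features of the candidate, e.g. {"A1", "B1"}
    target      : terminal features of the current target, e.g. {"A2", "B4"}
    last_outcome: terminal features observed on the most recent trial each
                  composition feature was seen,
                  e.g. {"A1": {"A2", "B4"}, "B1": {"A2", "B2"}}
    causal_map  : true start -> terminal feature mapping, e.g. {"A1": "A2", "B1": "B2"}
    """
    causal = spurious = 0
    for feature in composition:
        for past_terminal in last_outcome[feature]:
            # Only count transitions that would also be observed for s -> target now
            if past_terminal not in target:
                continue
            if causal_map.get(feature) == past_terminal:
                causal += 1
            else:
                spurious += 1
    n = len(composition)  # normalize by the number of features in the composition
    return causal / n, spurious / n

# Worked example from the text: causal = 0.5, spurious = 1.0
print(transition_evidence({"A1", "B1"}, {"A2", "B4"},
                          {"A1": {"A2", "B4"}, "B1": {"A2", "B2"}},
                          {"A1": "A2", "B1": "B2"}))
```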
Lastly, we convert the transition evidence variables so that they are all relative to reference composition 0:

Bayesian Regression Models
The Bayesian regression models were fit using Markov chain Monte Carlo sampling via the Bambi python package (version 0.14.0)69. Four MCMC chains were each run for 2000 to 8000 iterations, with the first 1000 to 4000 iterations of each chain discarded as warm-up. This yielded a total of 4000 to 16000 posterior samples across the four chains. For all models, we assumed Bambi’s default weakly informative priors70. All model formulas, model fits, and MCMC sampling settings are reported in the supplementary materials.
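As an illustration of this sampling setup, a hypothetical Bambi model might be fit as follows; the formula and data frame are placeholders, and the actual model formulas are reported in the supplementary materials.

```python
import bambi as bmb
import pandas as pd

# Illustrative data frame; the real analyses use participant-level variables.
df = pd.DataFrame({
    "performance": [0.7, 0.9, 0.5, 0.8],
    "condition": ["congruent", "incongruent", "congruent", "incongruent"],
})

# Bambi applies its weakly informative default priors when none are specified.
model = bmb.Model("performance ~ condition", df)

# Four chains, with the first `tune` draws of each chain discarded as warm-up.
idata = model.fit(draws=2000, tune=1000, chains=4)
```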
Preregistration and Deviations
All experimental procedures and analyses were preregistered on July 7, 2024, aside from those explicitly stated to be exploratory (link to preregistration: https://doi.org/10.17605/OSF.IO/2CY6R). However, the following deviations were made from that preregistration.
We preregistered a sample aged 18 to 35 years old. This age range was initially specified by the IRB at the University of Chicago, but was subsequently removed. To ease recruitment and attain a more diverse and representative sample, we therefore loosened the inclusion criterion to participants aged 18 years old and above.
We preregistered creating a single spurious and single causal transition influence measure per participant by averaging the three spurious and three causal transition influence coefficients fit in the multinomial logistic regression (Table S15). However, averaging these coefficients unnecessarily reduced the statistical power of subsequent regression models that included transition influence as a variable. To preserve this information for the subsequent regression models, we did not average over the coefficients. Instead, we included all coefficients as individual observations, with random effects terms specifying the logit comparison they came from (i.e., whether the coefficient was from the logit comparing 

We preregistered comparing generalization performance at test on all trials with old targets/options versus all trials with novel targets/options (Table S16). However, computational model simulations subsequently demonstrated that old and novel trials generally varied in the number of causal and spurious inferences they afforded, leading to differences in (1) possible reward earnings and (2) the degree to which alternative inferences can compete at choice. For example, on a trial in the congruent condition with options BC and target AB, only one causal transition (B→B) and one spurious (B→A) transition would have been seen during training. This means that, on this trial, a maximum of one reward could be received for inferring the single causal transition. Moreover, if a participant has learned the spurious transitions, an inference about B→A would interfere with the (correct) causal inference about B→B. On other trials, the ratio of spurious to causal transitions might be lower. For example, on a trial in the congruent condition with options AC and target AC, two causal transitions (A→A, C→C) can be inferred, but no spurious transitions can be inferred since A and C never co-occurred during training. This means the participant can get a maximum reward of two, and no spurious inference can interfere with the causal inferences. Thus, choice accuracy should be higher on this trial. Since novel options/targets have features from across the two training blocks, they generally limit the ability to make within-block spurious inferences. This may make performance appear superior on novel compared to old trials. To control for this, we therefore focus our generalization analysis on trials where only one causal and one spurious transition was afforded based on training observations.
We preregistered evaluating whether the feature-based model provided a better fit than the alternative models (conjunctive, conjunctive sampler, null) by fitting a logistic regression model that included as a dependent variable whether each participant was best fit by the feature-based model (1) over any of the alternatives (0), and an intercept as the independent variable (Table S17). However, this analysis could only tell us whether most participants were best fit by the feature-based model; it could not tell us whether the feature-based model fit best relative to each alternative when it did not account for a majority of best fits. This is problematic given the high rate of non-learners – the second largest proportion of participants were best fit by the null model. We therefore instead ran a multinomial logistic regression, including whether each participant was best fit by each of the alternative models (1) over the feature-based model (0) as the dependent variable, and an intercept as the independent variable.
The preregistered feature-based model differed from the model reported above. In the original model, bias B was also applied as a matrix of learning weights on the error term δ. That is:

This meant that the bias affected both the discounting of successor features and the overall learning rate, reducing the interpretability of the causal-bias parameter b. As a result, b no longer cleanly reflected the intended role of the inductive bias in shaping how, specifically, predictive associations are learned. To address this fundamental flaw in the interpretability of the model, we removed B as a learning weight matrix.
Stimulus and Task Availability
Task code and stimuli can be accessed at https://github.com/BakkourLabRepo/feat-predict_robot-builder-task
Computational Model Code Availability
Computational modelling code can be accessed at https://github.com/BakkourLabRepo/feat-predict_models
Analysis Code Availability
Simulation and human experiment analysis code can be accessed at https://github.com/BakkourLabRepo/feat-predict_analysis
Data availability
Simulation and human participants’ data can be accessed at https://osf.io/nqjy2/?view_only=72ad1546d2df460f9d8b574416c10644
Acknowledgements
We kindly thank Rain Liu for assistance with preliminary analyses during the conceptualization of this work. This work was funded by award #2342775 from the U.S. National Science Foundation.
Additional information
Contributions
Conceptualization, E.P. and A.B.; Experiment Design, E.P. and A.B.; Computational Modeling, E.P.; Data collection, E.P.; Data Analysis, E.P.; Writing, E.P. and A.B.; Funding, A.B.
Funding
National Science Foundation (2342775)
Additional files
References
- 1. What Is a Cognitive Map? Organizing Knowledge for Flexible Behavior. Neuron 100:490–509.
- 2. Goals and Habits in the Brain. Neuron 80:312–325.
- 3. Learning Structures: Predictive Representations, Replay, and Generalization. Curr. Opin. Behav. Sci. 32:155–166.
- 4. Structuring Knowledge with Cognitive Maps and Cognitive Graphs. Trends Cogn. Sci. 25:37–54.
- 5. Cognitive maps in rats and men. Psychol. Rev. 55:189–208.
- 6. Place Cells, Grid Cells, and Memory. Cold Spring Harb. Perspect. Biol. 7:a021808.
- 7. Inferences on a multidimensional social hierarchy use a grid-like code. Nat. Neurosci. 24:1292–1301.
- 8. The Hippocampus Encodes Distances in Multidimensional Feature Space. Curr. Biol. 29:1226–1231.
- 9. The Hippocampus Maps Concept Space, Not Feature Space. J. Neurosci. 40:7318–7325.
- 10. Organizing conceptual knowledge in humans with a gridlike code. Science 352:1464–1468.
- 11. The Tolman-Eichenbaum Machine: Unifying Space and Relational Memory through Generalization in the Hippocampal Formation. Cell 183:1249–1263.
- 12. How to build a cognitive map: insights from models of the hippocampal formation. arXiv.
- 13. Representations of Temporal Community Structure in Hippocampus and Precuneus Predict Inductive Reasoning Decisions. bioRxiv. https://doi.org/10.1101/2021.10.12.462707
- 14. Neural representations of events arise from temporal community structure. Nat. Neurosci. 16:486–492.
- 15. Reinforcement Learning: An Introduction. Cambridge, Massachusetts: The MIT Press.
- 16. Predictive representations: building blocks of intelligence. arXiv. http://arxiv.org/abs/2402.06590
- 17. Improving Generalisation for Temporal Difference Learning: The Successor Representation. Neural Comput. 5:613–624.
- 18. The Successor Representation: Its Computational Logic and Neural Substrates. J. Neurosci. 38:7193–7200.
- 19. Successor Features for Transfer in Reinforcement Learning. In: 31st Conference on Neural Information Processing Systems.
- 20. Predictive maps in rats and humans for spatial navigation. Curr. Biol. 32:3676–3689.
- 21. The successor representation in human reinforcement learning. Nat. Hum. Behav. 1:680–692.
- 22. Predictive Representations in Hippocampal and Prefrontal Hierarchies. J. Neurosci. 42:299–312.
- 23. Neurobiological successor features for spatial navigation. Hippocampus 30:1347–1355.
- 24. Predictive representations can link model-based reinforcement learning to model-free mechanisms. PLOS Comput. Biol. 13:e1005768.
- 25. Neural Evidence for the Successor Representation in Choice Evaluation. bioRxiv. https://doi.org/10.1101/2021.08.29.458114
- 26. The hippocampus as a predictive map. Nat. Neurosci. 20:1643–1653.
- 27. How multiple causes combine: independence constraints on causal inference. Front. Psychol. 6.
- 28. Neural Correlates of Causal Confounding. J. Cogn. Neurosci. 32:301–314.
- 29. Inductive biases for deep learning of higher-level cognition. Proc. R. Soc. Math. Phys. Eng. Sci. 478:20210068.
- 30. Structured statistical models of inductive reasoning. Psychol. Rev. 116:20–58.
- 31. Generalization in Interactive Networks: The Benefits of Inhibitory Competition and Hebbian Learning. Neural Comput. 13:1199–1241.
- 32. Inductive biases in theory-based reinforcement learning. Cognit. Psychol. 138:101509.
- 33. How to Grow a Mind: Statistics, Structure, and Abstraction. Science 331:1279–1285.
- 34. The relational bottleneck as an inductive bias for efficient abstraction. Trends Cogn. Sci. 28:829–843.
- 35. A critique of pure learning and what artificial neural networks can learn from animal brains. Nat. Commun. 10:3770.
- 36. An inductive bias for slowly changing features in human reinforcement learning. PLOS Comput. Biol. 20:e1012568.
- 37. The Need for Biases in Learning Generalizations. Rutgers University.
- 38. The neurobiology of semantic memory. Trends Cogn. Sci. 15:527–536.
- 39. Why There Are Complementary Learning Systems in the Hippocampus and Neocortex: Insights From the Successes and Failures of Connectionist Models of Learning and Memory. Psychol. Rev. 102:419–457.
- 40. Causality: Models, Reasoning, and Inference. Cambridge University Press. https://doi.org/10.1017/CBO9780511803161
- 41. What Learning Systems do Intelligent Agents Need? Complementary Learning Systems Theory Updated. Trends Cogn. Sci. 20:512–534.
- 42. Complementary Learning Systems. Cogn. Sci. 38:1229–1248.
- 43. How to build a cognitive map. Nat. Neurosci. 25:1257–1272.
- 44. Complementary learning systems within the hippocampus: a neural network modelling approach to reconciling episodic memory with statistical learning. Philos. Trans. R. Soc. B Biol. Sci. 372:20160049.
- 45. A neural network model of hippocampal contributions to category learning. eLife 12:e77185. https://doi.org/10.7554/eLife.77185
- 46. A role for hilar cells in pattern separation in the dentate gyrus: A computational approach. Hippocampus 19:321–337.
- 47. GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models. arXiv. https://doi.org/10.48550/arXiv.2303.10130
- 48. Executive Order No. 88,210: Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. Executive Office of the President:75191.
- 49. Artificial Intelligence Index Report 2025. Artif. Intell.
- 50. Regulation 2024/1689: Artificial Intelligence Act. Eur. Parliam. Counc. Eur. Union.
- 51. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. ACM, Virtual Event, Canada, pp. 610–623. https://doi.org/10.1145/3442188.3445922
- 52. Quantifying the Carbon Emissions of Machine Learning. arXiv. https://doi.org/10.48550/arXiv.1910.09700
- 53. Green AI. Commun. ACM 63:54–63.
- 54. Energy and Policy Considerations for Deep Learning in NLP. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 3645–3650. https://doi.org/10.18653/v1/P19-1355
- 55. Scaling Laws for Autoregressive Generative Modeling. arXiv. https://doi.org/10.48550/arXiv.2010.14701
- 56. Scaling Laws for Neural Language Models. arXiv. https://doi.org/10.48550/arXiv.2001.08361
- 57. Reinforcement Learning, Fast and Slow. Trends Cogn. Sci. 23:408–422.
- 58. The hippocampus supports deliberation during value-based decisions. eLife 8:e46080. https://doi.org/10.7554/eLife.46080
- 59. What Are Memories For? The Hippocampus Bridges Past Experience with Future Decisions. Trends Cogn. Sci. 24:542–556.
- 60. Reminders of past choices bias decisions for reward in humans. Nat. Commun. 8:15958.
- 61. Reinstated episodic context guides sampling-based decisions for reward. Nat. Neurosci. 20:997–1003.
- 62. Reinforcement Learning and Episodic Memory in Humans and Animals: An Integrative Framework. Annu. Rev. Psychol. 68:101–128.
- 63. Hippocampal Contributions to Control: The Third Way. In: Advances in Neural Information Processing Systems.
- 64. Decision Making and Sequential Sampling from Memory. Neuron 90:927–939.
- 65. Integrating memories to guide decisions. Curr. Opin. Behav. Sci. 5:85–90.
- 66. Mixing memory and desire: How memory reactivation supports deliberative decision-making. WIREs Cogn. Sci. 13.
- 67. Associative Retrieval Processes in Episodic Memory. In: Learning and Memory: A Comprehensive Reference. Elsevier, pp. 1–24. https://doi.org/10.1016/B978-012370509-9.00185-6
- 68. A theory of memory retrieval. Psychol. Rev. 85:59–108.
- 69. Bambi: A Simple Interface for Fitting Bayesian Linear Models in Python. J. Stat. Softw. 103.
- 70. Statistical details of the default priors in the Bambi library. arXiv. https://doi.org/10.48550/ARXIV.1702.01201
Cite all versions
You can cite all versions using the DOI https://doi.org/10.7554/eLife.108664. This DOI represents all versions, and will always resolve to the latest one.
Copyright
© 2025, Euan Prentis & Akram Bakkour
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.