Example and summary 3-state MoA-HMMs fit to rats in the two-step task using a population prior (see methods). The states are manually ordered based on three properties: the first state is chosen as the one with the highest initial state probability (blue diamond), and the remaining two states are then ordered second and third (orange and green diamonds, respectively) by the weight they give to MBr (orange higher). (A) Example 3-state MoA-HMM fit to a single rat. (i) Agent weights split by hidden state. (ii) Initial state probability. (iii) State transition probability matrix. (iv) The expected hidden state probability calculated from the forward-backward algorithm, averaged (mean) across sessions, with error bars showing 95% confidence intervals around the mean. (B) Summary of all 3-state MoA-HMM fits across the population of 20 rats. States were sorted in the same way as in (A). (i) Agent weights split by hidden state, where each dot represents a single rat, light grey lines connect weights within a rat, bars represent the median weight over rats, and error bars are bootstrapped 95% confidence intervals around the median. (ii) Distribution of initial state probabilities, with each dot as an individual rat. (iii) Distribution of state transition probabilities, with each panel representing a single (t − 1) → (t) transition. (iv) Session-averaged expected state computed as in (Aiv), where light lines are average probabilities for individual rats and the dark solid lines are the population mean with 95% confidence intervals around the mean. *: p < 0.05, ***: p < 1E-4.
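For readers unfamiliar with how the expected hidden state in panels (Aiv) and (Biv) is obtained, the following is a minimal sketch of a normalized forward-backward pass. It is an illustration under stated assumptions, not the authors' implementation: the function name `forward_backward`, the argument names, and the demo values are hypothetical, and the per-trial likelihoods `lik` are assumed to have already been computed from each state's agent weights.

```python
import numpy as np

def forward_backward(pi, A, lik):
    """Posterior state probabilities gamma[t, k] = P(z_t = k | all trials).

    pi  : (K,)   initial state probabilities
    A   : (K, K) transitions, A[i, j] = P(z_t = j | z_{t-1} = i)
    lik : (T, K) likelihood of the observed choice on trial t under state k
    (All names are hypothetical; this is a generic HMM smoother.)
    """
    T, K = lik.shape
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))

    # Forward pass, normalized at every trial to avoid underflow.
    a = pi * lik[0]
    alpha[0] = a / a.sum()
    for t in range(1, T):
        a = (alpha[t - 1] @ A) * lik[t]
        alpha[t] = a / a.sum()

    # Backward pass; the arbitrary scaling cancels in the final normalization.
    for t in range(T - 2, -1, -1):
        b = A @ (lik[t + 1] * beta[t + 1])
        beta[t] = b / b.sum()

    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

# Hypothetical demo values, chosen only to exercise the function.
pi_demo = np.array([0.8, 0.1, 0.1])
A_demo = np.array([[0.90, 0.05, 0.05],
                   [0.02, 0.95, 0.03],
                   [0.02, 0.03, 0.95]])
lik_demo = np.random.default_rng(0).uniform(0.2, 0.8, size=(50, 3))
gamma_demo = forward_backward(pi_demo, A_demo, lik_demo)  # shape (50, 3)
```

Averaging such per-trial posteriors across sessions, trial by trial, would give a curve of the kind plotted in (Aiv); how sessions of unequal length are aligned is not specified here and is left open.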
Figure 5—figure supplement 1. (A) Learning rates fit for each agent (i) corresponding to the example rat shown in Figure 5A and (ii) summarizing each learning rate over the population of rats. Each dot is an individual rat, bars represent the median, and error bars are bootstrapped 95% confidence intervals around the median. (B) Three example sessions showing the inferred state likelihood on each trial from the example rat shown in Figure 5A. (C) Cross-correlation between left choices and reward probabilities for the common outcome port given that choice (gray). Left choices are highly correlated with left-outcome reward blocks, with the peak correlation at a slight lag (vertical dashed line) indicating the trial at which the rat detects the reward probability flip. To test whether the latent states track reward flips, the cross-correlation is also shown between left-outcome reward probability and the likelihood of each state: initial state (blue), the remaining state with a more rightward choice bias (orange), and the remaining state with a more leftward bias (green). These correspond directly to states 1-3 in the example rat (i), whose model is shown in Figure 5A, while other rats had states 2 and 3 assigned according to their individual choice biases.
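The lagged cross-correlation in panel (C) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `cross_correlation` and the sign convention (a positive lag means the second series follows the first) are assumptions made here for concreteness.

```python
import numpy as np

def cross_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for each lag.

    Assumed convention: a positive lag means y follows x, e.g. x is the
    left-outcome reward probability and y is the left-choice indicator
    (or a state likelihood) on each trial.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lags = np.arange(-max_lag, max_lag + 1)
    cc = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        # Correlate only the overlapping portion of the two series.
        cc[i] = np.corrcoef(a, b)[0, 1]
    return lags, cc
```

Under this convention, a peak at a small positive lag would correspond to choices (or state likelihoods) changing a few trials after the reward probabilities flip.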
Figure 5—figure supplement 2. A second 3-state MoA-HMM example and comparison to a GLM-HMM. States identified by a MoA-HMM and a GLM-HMM are highly similar. (A) Example 3-state MoA-HMM model parameters with (i) agent weights split by state, (ii) each agent’s learning rate, (iii) the initial state probability, and (iv) the state transition matrix. (B) 3-state GLM-HMM fit to the same rat as (A). (i) GLM-HMM regression weights for each state. Each state is described by four types of regressors indicating four possible trial types (common-reward, common-omission, rare-reward and rare-omission) and choice direction for up to 5 previous trials, giving 20 parameters per state. Each state additionally had a bias term (ii), giving 21 weights per state and a total of 63 model weights for a 3-state model. (iii) GLM-HMM initial state probabilities also identify a prominent initial state. (iv) The GLM-HMM transition matrix closely matches that of the MoA-HMM. (C) Expected state probabilities for the MoA-HMM (i) averaged across all sessions and (ii) in highlighted example sessions. (D) Expected state probabilities for the GLM-HMM (i) averaged across all sessions and (ii) in highlighted example sessions. The temporal structure is highly similar to that of the MoA-HMM. (E) Cross-correlation between the expected state probabilities inferred from the MoA-HMM and GLM-HMM (i.e. panels Cii and Dii) across all sessions. Each dot is an individual rat, black circles are medians and error bars are 95% confidence intervals around the median.
Figure 5—figure supplement 3. Two example 4-state MoA-HMM fits corresponding to the 3-state fits from (A) Figure 5A and (B) Figure 5—figure supplement 2. States are ordered according to the initial state probability (Aii and Bii) and the transition probabilities to the most likely states that follow (Aiii and Biii). Initial states are generally consistent with the 3-state fits, whereas the way the remaining two states split into three states is more idiosyncratic. For example, (A) suggests that state 3 from the smaller model is split into two states (iv) that differ by bias (i), while (B) suggests that the additional state 4 draws from both of the smaller model’s states 2 and 3 (iv), and the state with the largest MBr weight no longer directly follows the initial state (i).