All components that are used to model choice at trial 21 are marked in orange. The sequence of choices for this participant was [1 1 1 2 1 1 1 2 2 1 2 2 1 1 1 2 2 2 2 1], and the payouts for these choices were [1 1 0 0 1 1 0 1 0 0 0 0 1 1 0 0 1 1 0 1]. Given the participant's individually fitted model parameters (ω = 0.72; λ = 0.28), this sequence of choices and outcomes determined the beta distributions defining the subjective value of each bandit at the time of choice on trial 21 (see Equations 9–11, Materials and methods). The expected value of each bandit was defined as the mean of its beta distribution (Q1 = 0.65, Q2 = 0.42; see Equation 7, Materials and methods). The variance of the unchosen option was equal to the variance of bandit 2, which was not chosen on trial 20 (Vuc = 0.05; see Equation 8, Materials and methods). Variance is schematically represented as a dotted line (note that this is an approximation because the beta distributions are not symmetrical). The 2-D plot shows the joint distribution of the two bandits' subjective values, with values for bandit 1 along the x-axis and values for bandit 2 along the y-axis. Confidence was calculated from the distributions as they stood at the time of choice on the previous trial. C1 was defined as the probability that a random sample drawn from the distribution of bandit 1 at the time of choice on trial 20 was greater than a sample drawn from the distribution of bandit 2 (shaded area below the diagonal, where the value for bandit 1 exceeds the value for bandit 2; C1 = 0.56, Equation 15, Materials and methods). C2 was accordingly 1 - C1 (shaded area above the diagonal; C2 = 0.44, Equation 16, Materials and methods). Crel was equivalent to Cchosen - Cunchosen, in this case C1 - C2 (Crel = 0.12, Equation 17, Materials and methods). This relative confidence was scaled by a model parameter and then added to the value of the action that was not chosen on the previous trial (in this case bandit 2).
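The quantities named in this caption (the mean and variance of a beta distribution, and the probability that a sample from one bandit exceeds a sample from the other) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The shape parameters below are hypothetical placeholders chosen for illustration, not the participant's fitted distributions, and the function names are likewise assumptions. The probability is approximated by a midpoint rule on a grid, assuming the two distributions are independent and both shape parameters are at least 1.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at a point x strictly inside (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def beta_mean(a, b):
    """Mean of Beta(a, b), used here as the expected value Q."""
    return a / (a + b)


def beta_variance(a, b):
    """Variance of Beta(a, b)."""
    return a * b / ((a + b) ** 2 * (a + b + 1))


def prob_first_greater(a1, b1, a2, b2, n=2000):
    """Approximate P(X1 > X2) for independent X1 ~ Beta(a1, b1), X2 ~ Beta(a2, b2).

    Integrates f1(x) * F2(x) over (0, 1) with a midpoint rule on n cells,
    which corresponds to the probability mass below the diagonal of the
    joint distribution when X1 is on the x-axis.
    """
    h = 1.0 / n
    total = 0.0
    cdf2 = 0.0  # running approximation of F2 at the left edge of the cell
    for i in range(n):
        x = (i + 0.5) * h
        p1 = beta_pdf(x, a1, b1)
        p2 = beta_pdf(x, a2, b2)
        # F2 at the cell midpoint: all earlier cells plus half of this one
        total += p1 * (cdf2 + 0.5 * p2 * h) * h
        cdf2 += p2 * h
    return total


# Hypothetical shape parameters (NOT taken from the paper).
a1, b1 = 4.0, 2.0  # bandit 1
a2, b2 = 2.0, 3.0  # bandit 2

q1, q2 = beta_mean(a1, b1), beta_mean(a2, b2)
v_unchosen = beta_variance(a2, b2)  # bandit 2 treated as the unchosen option

c1 = prob_first_greater(a1, b1, a2, b2)
c2 = 1.0 - c1
c_rel = c1 - c2  # C_chosen - C_unchosen when bandit 1 was chosen

print(f"Q1={q1:.2f} Q2={q2:.2f} V_uc={v_unchosen:.3f}")
print(f"C1={c1:.2f} C2={c2:.2f} C_rel={c_rel:.2f}")
```

With the values reported in the caption, the last step reduces to Crel = 0.56 - 0.44 = 0.12. The scaling of Crel and its addition to the unchosen action depend on the model equations in Materials and methods and are not reproduced here.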