Comparison of the behavior of trained RNNs and monkeys. (A) Schematic of the RNN training setup. In each trial, the network makes a choice in response to a cue; a feedback input, determined by the choice and the reward outcome, is then injected into the network. This procedure repeats across trials. (B) Example choice outcomes of a trained RNN. Vertical bars show the RNN's choice and reward outcome in each trial (magenta: choice A, blue: choice B, light: rewarded, dark: not rewarded). Horizontal bars at the top show the reward schedule (magenta: choice A is rewarded with probability 0.7 and choice B with probability 0.3; blue: the reward probabilities are reversed). The black curve shows the RNN output. Green horizontal bars show the posterior probability of reversal at each trial, inferred with the Bayesian model. (C) Probability of choosing the initially better option. Relative trial denotes the trial number relative to the behavioral reversal trial inferred from the Bayesian model; relative trial 0 is the trial at which the choice reversed. (D) Fraction of no-reward blocks as a function of relative trial. Dotted lines mark 0.3 and 0.7. (E) Distributions of the RNN's and the monkey's reversal trials, relative to the experimentally scheduled reversal trial.
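The Bayesian inference of the reversal trial referenced in panels (B) and (C) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a flat prior over the reversal trial, the 0.7/0.3 reward probabilities stated in the caption, and an observer that scores each candidate reversal trial by the likelihood of the observed choice-reward sequence; the function name and encoding of choices are hypothetical.

```python
import numpy as np

def reversal_posterior(choices, rewards, p_high=0.7, p_low=0.3):
    """Posterior over the reversal trial r for one block.

    choices: sequence of 0 (choice A) or 1 (choice B).
    rewards: sequence of 1 (rewarded) or 0 (not rewarded).
    Hypothesis r: before trial r, A pays with p_high and B with p_low;
    from trial r onward the schedule is reversed. r = T means the
    schedule has not reversed within the block. Flat prior over r.
    """
    T = len(choices)
    log_post = np.zeros(T + 1)  # candidate reversal trials r = 0 .. T
    for r in range(T + 1):
        ll = 0.0
        for t in range(T):
            # Reward probability of the chosen option under hypothesis r.
            if t < r:   # pre-reversal schedule
                p = p_high if choices[t] == 0 else p_low
            else:       # post-reversal schedule
                p = p_low if choices[t] == 0 else p_high
            ll += np.log(p if rewards[t] == 1 else 1.0 - p)
        log_post[r] = ll
    log_post -= log_post.max()      # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()        # normalized posterior over r
```

For example, if the agent keeps choosing A while rewards switch from mostly delivered to mostly withheld midway through the block, the posterior peaks at the trial where the rewards stopped, which is how a behavioral reversal trial can be read off per block.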