Human Exploration Strategically Balances Approaching and Avoiding Uncertainty

Yaniv Abir; Michael N. Shadlen; Daphna Shohamy

doi:10.7554/eLife.94231.1

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.

Reviewing Editor
Claire Gillan
Trinity College Dublin, Dublin, Ireland
Senior Editor
Christian Büchel
University Medical Center Hamburg-Eppendorf, Hamburg, Germany

Reviewer #1 (Public Review):

This manuscript reports on the behavior of participants playing a game to measure exploration. Specifically, participants completed a task with blocks of exploratory choices (choosing between two 'tables', and within each table, two 'card decks', each of which had a specific probability of showing cards with one color versus another) and test choices, where participants were asked to choose which of the two decks per table had a higher likelihood of one color. Blocks differed on how long (how many trials) the exploration phase lasted. Participants' choices were fit to increasingly complex models of next-trial exploration. Participants' choices were best fit by an intermediate model where the difference in uncertainty between tables influenced the choice. Next, the authors investigated factors affecting whether participants sought out or avoided uncertainty, their choice reaction times, and the relationship of these measures with performance during the test phase of each block. Participants were uncertainty-seeking (exploratory) under most levels of overall uncertainty but became less uncertainty-seeking at high levels of total uncertainty. Participants with a stronger tendency to approach uncertainty at lower levels of total uncertainty were more accurate in the test phase, while the tendency to avoid uncertainty when total uncertainty was high was also weakly positively related to test accuracy. In terms of reaction times, participants whose reaction times were more related to the level of uncertainty, and who deliberated longer, performed better. The individual tendency to repeat choices was related to avoidance of uncertainty under high total uncertainty and better test performance. Lastly, choices made after a longer lag were less affected by these measures.

The authors note that their paradigm, which does not provide immediate rewarding feedback, is novel. However, the resulting behavior appears similar to other exploratory learning tasks, so it's unclear what this task design adds - besides perhaps showing that exploratory behavior is similar across types of reward environments. Several papers have shown that cognitive constraints modulate exploration (PMIDs: 30667262, 24664860, 35917612, 35260717); although this paper provides novel insights, it does not situate its findings in the context of this prior literature. As a result, what it adds to the literature is difficult to discern.

Other methodological questions include: whether the same model provides the best fit for all participants, and whether any individual differences in the best-fitting model relate to individual differences in exploration and performance; how certain analyses, which currently lack sufficient detail in the manuscript, were carried out; and how the two stages of choice behavior (tables versus card decks) were accounted for in the analyses.

https://doi.org/10.7554/eLife.94231.1.sa1

Reviewer #2 (Public Review):

Summary:
This paper focuses on an interesting question that has puzzled psychologists for decades: why do people show a mix of uncertainty-approaching and uncertainty-avoiding behavior, given that reducing uncertainty can always yield information and therefore seems beneficial? The authors designed a novel task to demonstrate behavioral signatures of uncertainty approach and avoidance during the exploration phase within the same task, at both the within-subject and between-subject levels. On the algorithmic level, the paper compares four different implementations of uncertainty-guided exploration and finds that the model sensitive to relative uncertainty provides a better fit to human behavior than its counterparts using expected information gain or past exposure. The paper then links people's uncertainty attitude with accuracy and finds that uncertainty avoidance during exploration does not impair task performance, implying that uncertainty avoidance may be the output of a resource-rational decision-making process. To examine this account, the paper uses reaction time as an independent proxy for costly deliberation and shows that people deliberate for less time when engaging in repetitive choice, which presumably saves cognitive resources. Finally, the paper shows that people's tendency to engage in repetitive choice correlates with their tendency to avoid uncertainty, which supports the argument that avoiding uncertainty could be a strategy developed under the constraint of limited cognitive resources.

Strengths:
One of the highlights of this paper, as mentioned in the previous paragraph, is that the authors establish the existence of both uncertainty approach and uncertainty avoidance behavior within the same task, whereas previous work has usually focused on one of them. This dissociation allows the authors to examine which situational factor is related to the emergence of uncertainty avoidance, and to extract parameters describing participants' attitude towards uncertainty at baseline as well as in situations where uncertainty avoidance is more common. Besides documenting the existence of uncertainty avoidance behavior, this paper also tries to explain this behavior within the resource-rational framework, and it carefully quantifies different aspects of participants' behavior (e.g., accuracy, choice speed) and examines their relationships. Though more experiments are needed to fully understand human uncertainty avoidance behavior, this paper provides both empirical and theoretical contributions toward a mechanistic understanding of how people balance approaching and avoiding uncertainty.

Weaknesses:
I have a couple of concerns related to this paper. First, there seems to be an anti-correlation between total uncertainty and absolute relative uncertainty (Figure 5, panel C: Δ uncertainty is restricted to a small range when total uncertainty is high). This seems to be a natural product of the exploration process, since the high total uncertainty phase is usually the period in which the participant knows little about either option, leading to a less distinguishable relative uncertainty. However, it remains unknown whether the documented uncertainty avoidance still applies when extrapolating to larger absolute relative uncertainty. It would be great if the experiment allowed for a manipulation of uncertainty in the middle of the experiment (e.g., introducing a new deck, or informing participants that one deck has been updated). Relatedly, the current 'threshold' of uncertainty avoidance behavior, if I understand correctly, is found by empirically fitting participants' data. This raises the question: can we predict when people will demonstrate uncertainty avoidance behavior before collecting any data? Or is it possible that, by measuring some metrics related to cognitive cost sensitivity, we could predict the proportion of choices on which participants will show uncertainty-avoidant behavior? Finally, regarding the analysis of different behavior patterns in the game, it seems that the authors try to link repetitive behavior, uncertainty attitude, and accuracy by testing pairwise correlations among them. I wonder whether multivariate statistical methods, e.g., mediation analysis, would be better suited for this purpose.

https://doi.org/10.7554/eLife.94231.1.sa0
