The value of initiating a pursuit in temporal decision-making

  1. Elissa Sutlief
  2. Charlie Walters
  3. Tanya Marton (corresponding author)
  4. Marshall G Hussain Shuler (corresponding author)
  1. Department of Neuroscience, Johns Hopkins University School of Medicine, United States
  2. Kavli Neuroscience Discovery Institute, Johns Hopkins University, United States
  3. Microsoft, United States
  4. The Department of Neuroscience, Johns Hopkins University School of Medicine, United States
21 figures, 2 tables and 1 additional file

Figures

Fundamental classes of temporal decision-making regarding initiating a pursuit: ‘Forgo’ and ‘Choice’.

(A, B) (Top row) Topologies. The temporal structure of worlds exemplifying Forgo (A) and Choice (B) decisions mapped as their topologies. Forgo: A Forgo decision to accept or reject the purple …

Global reward rate with respect to parceling the world into ‘in’ and ‘outside’ the considered pursuit.

(A-C) as in Figure 1 ‘Forgo’. Conventions as in Figure 1. (D) The world divided into ‘Inside’ and ‘Outside’ the purple pursuit type, as the agent decides whether to forgo or accept. The axes are …

Figure 3 with 2 supplements
Forgo decision-making.

(A) When the reward rate of the considered pursuit (slope of the purple line) exceeds that of its outside rate (slope of gold), the global reward rate (slope of magenta) will be greater than the …
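
The comparison in panel (A) can be sketched numerically. This is a minimal illustration, not the authors' code: it assumes the global reward rate defined in Table 1 (with outside reward r_out = ρ_out · t_out), assumes that forgoing leaves the agent at the outside rate, and uses hypothetical reward and time values.

```python
def reward_rate(reward, time):
    """Reward per unit time."""
    return reward / time


def global_reward_rate(r_in, t_in, r_out, t_out):
    """Global rate under a policy of accepting the pursuit (Table 1, true case)."""
    return (r_in + r_out) / (t_in + t_out)


def should_accept(r_in, t_in, r_out, t_out):
    """Accept when the pursuit's own rate exceeds the outside rate."""
    return reward_rate(r_in, t_in) > reward_rate(r_out, t_out)


# Hypothetical world: the outside yields 1 unit of reward per 4 units of time.
r_out, t_out = 1.0, 4.0
pursuits = {"rich": (3.0, 2.0), "poor": (0.2, 2.0)}  # (r_in, t_in), made up

for name, (r_in, t_in) in pursuits.items():
    rate_if_accept = global_reward_rate(r_in, t_in, r_out, t_out)
    rate_if_forgo = reward_rate(r_out, t_out)  # assumption: forgoing leaves only the outside
    print(name, should_accept(r_in, t_in, r_out, t_out), rate_if_accept > rate_if_forgo)
```

Because the global rate is a time-weighted average of the inside and outside rates, the two comparisons printed for each pursuit always agree.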

Figure 3—figure supplement 1
Forgo world with n-pursuits occurring at the same frequency.

An agent experiencing a Forgo decision-making world (topology as in (A), left) where n-pursuits of varying times and rewards (purple, aqua, orange) occurring at the same frequency from the default …

Figure 3—figure supplement 2
Forgo world with n-pursuits occurring at different frequencies.

An agent experiencing a Forgo decision-making world (topology as in (A), left) where n-pursuits of varying times and rewards (purple, aqua, orange) occurring at different frequencies from the default …

The subjective value (sv) of a pursuit is the global reward-rate-equivalent immediate reward magnitude.

The subjective value (green bar) of a pursuit is that amount of reward requiring no investment of time that the agent would take as equivalent to a policy of accepting and acquiring the considered …

Equivalent expressions for subjective value reveal that time’s cost comprises an apportionment cost as well as an opportunity cost.

(A) The subjective value of a pursuit can be expressed in terms of the global reward rate obtained under a policy of accepting the pursuit. rin = 4, tin = 4, rout = 0.7, tout = 3. (B) The cost of …
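
A short numerical check of the values quoted in panel (A) may help. The sketch below assumes that subjective value equals the pursuit's reward minus its time's cost (global reward rate times tin), and that an equivalent form in terms of the outside rate follows algebraically from the Table 1 definition. The function names are illustrative, not taken from the paper.

```python
def sv_from_global_rate(r_in, t_in, r_out, t_out):
    """Subjective value as reward minus time's cost (rho_g * t_in)."""
    rho_g = (r_in + r_out) / (t_in + t_out)
    return r_in - rho_g * t_in


def sv_from_outside_rate(r_in, t_in, r_out, t_out):
    """Algebraically equivalent form written with the outside reward rate."""
    rho_out = r_out / t_out
    return (r_in - rho_out * t_in) / (1 + t_in / t_out)


# Values quoted in the caption: rin = 4, tin = 4, rout = 0.7, tout = 3.
a = sv_from_global_rate(4, 4, 0.7, 3)
b = sv_from_outside_rate(4, 4, 0.7, 3)
print(round(a, 4), round(b, 4))  # both forms give about 1.3143
```

Under these assumptions the pursuit's reward of 4 is worth roughly 1.31 units of immediate reward, whichever expression is used.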

The impact of outside reward on the subjective value of a pursuit.

(A) The subjective value (green dot) of a considered pursuit type (purple) in the context of its ‘outside’ (gold) is the resulting global reward rate vector’s (magenta) intersection with the y-axis in …

The impact of the apportionment cost of time on the subjective value of a pursuit.

(A) The apportionment cost of time can best be illustrated dissociated from the contribution of the opportunity cost of time by considering the special instance in which the outside has no net …

The effect of changing the outside time, and thus the outside reward rate, on the subjective value of a pursuit.

(A) The subjective value (green dot) of the considered pursuit, when (B) changing the outside time, and thus, outside reward rate (green dots). (C) As outside time increases under these conditions …

Policy options considered during the initiation of pursuits in worlds with a ‘Choice’ topology.

(A-C) Choice topology, and policies of choosing the smaller-sooner or larger-later pursuit, as in Figure 1 ‘Choice’. (D) The world divided into ‘Inside’ and ‘Outside’ the selected pursuit type, as …

The effect of increasing outside reward on subjective value in choice decision-making.

The effect of increasing the outside reward while holding the outside time constant is to linearly increase the cost of time, thus decreasing the subjective value of pursuits considered in choice …

Effect of apportionment cost on subjective value in Choice decision-making.

The effect of apportionment cost can be isolated from the effect of opportunity cost by increasing the outside time while holding the outside reward rate constant. Doing so results in decreasing the …

Effect of varying outside time and outside reward rate.

The effect of increasing the outside time while maintaining outside reward is to decrease the apportionment as well as the opportunity cost of time, thus increasing a pursuit’s subjective value. …

The temporal discounting function of a global reward-rate-optimal agent is a hyperbolic function relating the apportionment and opportunity cost of time.

(A-C) The effect, as exemplified in three different worlds, of varying the outside time and reward on the subjective value of a pursuit as its reward is displaced into the future. The subjective …
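
The shape of such a discounting curve can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: subjective value is taken as reward minus (global reward rate × delay), the global rate follows Table 1, and the world parameters are hypothetical.

```python
def apparent_discounted_value(reward, delay, rho_out, t_out):
    """Subjective value of `reward` obtained after `delay`.

    Assumes sv = reward - rho_g * delay with
    rho_g = (reward + rho_out * t_out) / (delay + t_out),
    which simplifies to the hyperbolic form returned here.
    """
    return (reward - rho_out * delay) / (1 + delay / t_out)


# Hypothetical worlds: (outside reward rate, outside time).
worlds = {"no outside reward": (0.0, 5.0), "rewarding outside": (0.5, 5.0)}

for label, (rho_out, t_out) in worlds.items():
    curve = [apparent_discounted_value(10.0, d, rho_out, t_out) for d in (0, 5, 10, 20)]
    print(label, [round(v, 3) for v in curve])
```

With no outside reward the expression reduces to reward / (1 + delay / t_out), the familiar hyperbola with 1 / t_out in the position usually occupied by a fitted constant. A positive outside rate additionally subtracts rho_out × delay in the numerator, so the value reaches zero at delay = reward / rho_out.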

Reward-rate-maximizing agents would exhibit the ‘Delay Effect’.

A ‘switch’ in preference from an SS pursuit, when the delay to the pursuits is relatively short (upper left), to an LL pursuit, when the delay to the pursuits is relatively long (upper right), would occur as a …

Reward-rate-maximizing agents would exhibit the ‘Magnitude effect’.

(A, B) The global reward rate (the slope of magenta vectors) that would be obtained when acquiring a considered pursuit’s reward of a given size (either relatively large as in A or small as in B) …

Reward-rate-maximizing agents would exhibit the ‘Sign effect’.

(A, B) The global reward rate (the slope of magenta lines) that would be obtained when acquiring a considered pursuit’s outcome of a given magnitude but differing in sign (either rewarding as in A, …

Relationship between outside time and reward with optimal temporal decision-making behavioral transitions.

An agent may be presented with three decisions: the decision to take or forgo a smaller, sooner reward of 2.5 units after 2.5 s (SS pursuit), the decision to take or forgo a larger, later reward of …

Patterns of suboptimal temporal decision-making behavior resulting from time and/or reward misestimation.

Patterns of temporal decision-making in Choice and Forgo situations deviate from optimal (top row) under various parameter misestimations (subsequent rows). Characterization of the nature of …

The cost of time of a pursuit comprises both an opportunity cost and an apportionment cost.

The global reward rate under a policy of accepting the considered pursuit type (slope of magenta line), times the time that pursuit takes (tin), is the pursuit’s time’s cost (height of maroon bar). …
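
The split of time's cost into two parts can be written out explicitly. The derivation below assumes the global reward rate of Table 1, takes opportunity cost to be the outside reward rate times tin, and treats apportionment cost as the remainder. These labels are an interpretation that is consistent with the qualitative entries of Table 2, not a quotation of the paper's equations.

```latex
\begin{align}
\rho_g &= \frac{r_{in} + \rho_{out}\, t_{out}}{t_{in} + t_{out}} \\[4pt]
\text{time's cost} &= \rho_g\, t_{in}
  = \frac{t_{in}\,\bigl(r_{in} + \rho_{out}\, t_{out}\bigr)}{t_{in} + t_{out}} \\[4pt]
 &= \underbrace{\rho_{out}\, t_{in}}_{\text{opportunity cost}}
  \;+\; \underbrace{\frac{t_{in}}{t_{in} + t_{out}}\,\bigl(r_{in} - \rho_{out}\, t_{in}\bigr)}_{\text{apportionment cost}}
\end{align}
```

Placing the two terms over the common denominator t_in + t_out recovers the second line. When the outside reward rate is zero, the opportunity cost vanishes and only the term r_in · t_in / (t_in + t_out) remains.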

Comparison of typical hyperbolic discounting versus apparent discounting of a reward-rate-optimal agent.

Whereas (A) the curvature of hyperbolic discounting models is typically controlled by the free fit parameter k, (B) the curvature and steepness of the apparent discounting function of a …

The Malapportionment Hypothesis.

(A-E) Solid lines indicate true reward-rate maximizing values. Dashed lines indicate those of an agent described by the Malapportionment Hypothesis that underweights the apportionment of time …

Tables

Table 1
Definitions for misestimating global reward rate-enabling parameters.

Each misestimated variable (column 1) is multiplied by an error term, $\omega$, to give $\hat{\rho}_g$, the misestimated global reward rate (column 2). When $\omega \in (0,1)$ the variable is underestimated, when $\omega \in (1,2)$ the variable is …

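| Misestimated Variable | Misestimated Global Reward Rate |
| --- | --- |
| True (No Misestimation) | $\rho_g = \frac{r_{in} + \rho_{out} t_{out}}{t_{in} + t_{out}}$ |
| Outside Time | $\hat{\rho}_g = \frac{r_{in} + \rho_{out} t_{out}}{t_{in} + \omega t_{out}}$ |
| Outside Reward | $\hat{\rho}_g = \frac{r_{in} + \omega \rho_{out} t_{out}}{t_{in} + t_{out}}$ |
| Outside Time and Reward (maintaining $\rho_{out}$) | $\hat{\rho}_g = \frac{r_{in} + \omega \rho_{out} t_{out}}{t_{in} + \omega t_{out}}$ |
| Inside Time | $\hat{\rho}_g = \frac{r_{in} + \rho_{out} t_{out}}{\omega t_{in} + t_{out}}$ |
| Inside Reward | $\hat{\rho}_g = \frac{\omega r_{in} + \rho_{out} t_{out}}{t_{in} + t_{out}}$ |
| Inside Reward and Time (maintaining $\rho_{in}$) | $\hat{\rho}_g = \frac{\omega r_{in} + \rho_{out} t_{out}}{\omega t_{in} + t_{out}}$ |
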
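
The rows of Table 1 can be expressed as a single function in which the error term ω scales whichever quantities are misestimated. The sketch below is illustrative only: the row keys, the function name, and the example parameter values are assumptions introduced here.

```python
# Which quantities the error term omega multiplies, per row of Table 1.
SCALED_TERMS = {
    "none": (),
    "outside_time": ("t_out",),
    "outside_reward": ("r_out",),
    "outside_time_and_reward": ("t_out", "r_out"),
    "inside_time": ("t_in",),
    "inside_reward": ("r_in",),
    "inside_reward_and_time": ("r_in", "t_in"),
}


def misestimated_global_rate(row, omega, r_in, t_in, rho_out, t_out):
    """Global reward rate when the quantities named by `row` are scaled by omega."""
    weight = {"r_in": 1.0, "t_in": 1.0, "r_out": 1.0, "t_out": 1.0}
    for term in SCALED_TERMS[row]:
        weight[term] = omega
    r_out = rho_out * t_out  # outside reward expressed as rate * time
    numerator = weight["r_in"] * r_in + weight["r_out"] * r_out
    denominator = weight["t_in"] * t_in + weight["t_out"] * t_out
    return numerator / denominator


# Hypothetical example: underestimating outside time (omega = 0.5).
true_rate = misestimated_global_rate("none", 1.0, 4.0, 4.0, 0.7 / 3, 3.0)
biased_rate = misestimated_global_rate("outside_time", 0.5, 4.0, 4.0, 0.7 / 3, 3.0)
print(true_rate < biased_rate)  # a shorter perceived outside time inflates the rate
```

Setting ω = 1 in any row returns the true global reward rate, which gives a quick sanity check on the mapping.
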
Table 2
Opportunity cost, apportionment cost, time cost, and subjective value functions by change in outside and inside reward and time.

Functions assume positive inside and outside rewards and times.

| | Outside Reward | Inside Reward | Outside Time | Inside Time |
| --- | --- | --- | --- | --- |
| Opportunity Cost* | Linear, positive slope | No effect | Hyperbolic, negative slope | Linear, positive slope |
| Apportionment Cost | Linear, negative slope | Linear, positive slope | Hyperbolic - Hyperbolic, negative slope | Hyperbolic - Linear, negative slope |
| Time’s Cost | Linear, positive slope | Linear, positive slope | Hyperbolic, negative slope | Hyperbolic, positive slope |
| Subjective Value | Linear, negative slope | Linear, positive slope | Hyperbolic, positive slope | Hyperbolic, negative slope |

  1. * If outside reward rate is zero, opportunity cost becomes a constant at zero.

  2. If outside reward rate is zero, as outside or inside time is varied, apportionment cost becomes purely hyperbolic.
