1. Neuroscience
Download icon

A flexible framework for simulating and fitting generalized drift-diffusion models

  1. Maxwell Shinn
  2. Norman H Lam
  3. John D Murray  Is a corresponding author
  1. Department of Psychiatry, Yale University, United States
  2. Interdepartmental Neuroscience Program, Yale University, United States
  3. Department of Physics, Yale University, United States
Tools and Resources
  • Cited 0
  • Views 1,419
  • Annotations
Cite this article as: eLife 2020;9:e56938 doi: 10.7554/eLife.56938

Abstract

The drift-diffusion model (DDM) is an important decision-making model in cognitive neuroscience. However, innovations in model form have been limited by methodological challenges. Here, we introduce the generalized drift-diffusion model (GDDM) framework for building and fitting DDM extensions, and provide a software package which implements the framework. The GDDM framework augments traditional DDM parameters through arbitrary user-defined functions. Models are solved numerically by directly solving the Fokker-Planck equation using efficient numerical methods, yielding a 100-fold or greater speedup over standard methodology. This speed allows GDDMs to be fit to data using maximum likelihood on the full response time (RT) distribution. We demonstrate fitting of GDDMs within our framework to both animal and human datasets from perceptual decision-making tasks, with better accuracy and fewer parameters than several DDMs implemented using the latest methodology, to test hypothesized decision-making mechanisms. Overall, our framework will allow for decision-making model innovation and novel experimental designs.

Introduction

The drift-diffusion model (DDM) is an important model in cognitive psychology and cognitive neuroscience, and is fundamental to our understanding of decision-making (Ratcliff, 1978; Bogacz et al., 2006). The DDM explains both choice and response time (RT) behavior across a wide range of tasks and species, and has proven to be an essential tool for studying the neural mechanisms of decision-making (Gold and Shadlen, 2007; Forstmann et al., 2016; Ratcliff et al., 2016). As a sequential sampling model for two-alternative forced-choice tasks, it assumes the subject obtains a continuous stream of evidence for each alternative throughout the course of an experimental trial. Evidence consists of some true underlying signal (drift) in addition to neural and environmental noise (diffusion). According to the DDM, once a decision variable tracking the difference in evidence between the two alternatives is sufficiently large, i.e. when the decision variable reaches some fixed upper or lower bound, a choice is made at the moment of bound crossing, thereby modeling for the decision both choice and RT. In its simplest form, the DDM includes four parameters: a drift rate, representing evidence; a diffusion constant or bound height representing noise or response caution; a starting position, representing side bias and often fixed at zero; and a non-decision time, representing afferent and efferent delays but external to the DDM process (Ratcliff, 1978). These models can be fit to choice and response time data.

In order to test new hypotheses about decision-making processes, there is great interest in developing extensions to the DDM. For example, the three-parameter DDM is known to be sub-optimal with respect to maximizing reward rate in standard block-design tasks when evidence strength varies unpredictably (Malhotra et al., 2018; Evans and Hawkins, 2019). Therefore, it has been hypothesized that decision-making is governed by an urgency signal which increases the probability of response over the course of a trial, implemented as either time-varying evidence gain (Cisek et al., 2009; Thura et al., 2012; Murphy et al., 2016; Ditterich, 2006b; Drugowitsch et al., 2014a) or collapsing integration bounds (Churchland et al., 2008; Hawkins et al., 2015a; Ditterich, 2006b; Drugowitsch et al., 2012). Additionally, the DDM may be extended in order to model more diverse experimental paradigms. For example, evidence which changes in magnitude throughout a single trial violates the DDM’s assumption that drift rate is fixed for a single trial. Modeling such tasks requires DDM extensions to support time-varying drift rate. We will show that these and most other extensions can be encapsulated within a single generalized DDM (GDDM). The GDDM provides a common mathematical language for discussing DDM extensions, and as we will show, supports an efficient framework for simulating models and fitting them to data.

There are three main methods for obtaining response time distributions from the GDDM. First, analytical solutions have a fast execution time, but only exist for special cases of the GDDM. While efficient, most GDDMs do not permit analytical solutions, and thus analytical solutions are only common for the DDM. Several packages exist for analytically solving DDMs and a limited number of extensions (Wiecki et al., 2013; Wagenmakers et al., 2007; Grasman et al., 2009; Vandekerckhove and Tuerlinckx, 2008; Voss and Voss, 2007; Drugowitsch, 2016; Tavares et al., 2017; Millner et al., 2018; Srivastava et al., 2017). A second method, trial-wise trajectory simulation, fully supports GDDMs for flexibility in model design (Chandrasekaran and Hawkins, 2019). However, this method is inefficient, and shows limited precision and computational efficiency. We will show that this in practice impacts the ability to fit data to RT distributions. To our knowledge, the only software package which currently exists for simulating GDDMs uses trial-wise trajectory simulation (Chandrasekaran and Hawkins, 2019).

A third and better method is to iteratively propagate the distribution forward in time, which is achieved by solving the Fokker-Planck equation (Voss and Voss, 2008) or using the method of Smith, 2000. These methods allow the RT distribution of GDDMs to be estimated with better performance and lower error than trial-wise trajectory simulation. However, they are difficult to implement, which has thus far impeded their widespread use in cognitive psychology and neuroscience. Currently, GDDMs without known analytical solutions are best implemented using home-grown code (e.g. Ditterich, 2006a; Zylberberg et al., 2016; Hawkins et al., 2015b; Brunton et al., 2013), which may not utilize the latest methodology or, as emphasized by Ratcliff and Tuerlinckx, 2002, could yield misleading results. Progress is limited by the ability to simulate and fit extensions to the DDM. A common open-source software package for efficiently solving GDDMs would help promote code sharing and reuse, and reduce the amount of time which must be spent implementing diffusion models. Researchers need a general, highly-efficient toolbox for building, solving, and fitting extended DDMs.

Here, we formally define and evaluate the GDDM framework for solving and fitting extensions to the DDM, and introduce the PyDDM (https://github.com/murraylab/PyDDM) software package as an implementation of this framework. We show that a GDDM with scientifically-interesting cognitive mechanisms can capture experimental data from a perceptual decision-making task in an animal dataset better than alternative models, while still providing an excellent fit to human data. Then, we show the execution time vs error tradeoff induced by different numerical methods, and demonstrate the GDDM RT distributions may be estimated with short runtime and low error. In particular, GDDMs may utilize the Crank-Nicolson and backward Euler methods to numerically solve the Fokker-Planck equation, which we will show to be highly efficient compared to alternative methods, and which permits fitting with full-distribution maximum likelihood. Solving the Fokker-Planck equation is analogous to simulating an infinite number of individual trials. The GDDM together with its associated fitting methodology comprise the GDDM framework. Finally, we compare our implementation of the GDDM framework to other software packages for diffusion modeling. In addition to being the only package to support the GDDM framework, we show PyDDM has a number of technical advantages, such as near-perfect parallel efficiency, a graphical user interface for exploring model forms, and a software verification system for ensuring reliable results. We hope that the GDDM framework will encourage innovative experimental designs and lower the barrier for experimenting with new model mechanisms.

Results

GDDM as a generalization of the DDM

The classic DDM is a four-parameter model which specifies fixed values for drift, bound height or noise level, starting position bias, and non-decision time (Figure 1). It assumes that evidence is constant throughout the trial, and that the integration process does not directly depend on time. This form is mathematically accessible, and has been used to model choice and RT data (Ratcliff, 1978; Ratcliff et al., 2016).

Generalized DDM.

The DDM has fixed, limited parameters, and requires a stream of constant evidence. The ‘full DDM’ expands on the DDM by allowing uniformly-distributed starting position or non-decision time and Gaussian-distributed drift rate. The GDDM is a generalization of the DDM framework which allows arbitrary distributions for starting position and non-decision time, as well as arbitrary functions (instead of distributions) for drift rate and collapsing bounds. It also permits experimental paradigms with dynamic evidence. Several examples of potential GDDM mechanisms are shown. In each case, total response time is the time to reach the diffusion bound plus the non-decision time.

In order to expand our knowledge of decision-making mechanisms, there is interest in using the DDM to model a variety of experimental paradigms. To accommodate different paradigms, the DDM itself must be extended. One key example is the case where a choice bias is induced experimentally via a change in prior probability or reward magnitude, and it is hypothesized that an offset starting position alone is insufficient to capture the behavioral effects. The DDM has been extended to accommodate an experimentally-induced bias by placing a constant offset on the drift rate (Mulder et al., 2012; Ratcliff, 1985; Ashby, 1983).

There is a set of influential yet insufficient extensions to the DDM called the ‘full DDM’. This set of extensions was developed to explain two specific discrepancies between the DDM and data (Anderson, 1960; Laming, 1968; Blurton et al., 2017). First, experimental data exhibited a difference in mean RT between correct and error trials which could not be captured by the classic DDM, so two parameters for across-trial variability were introduced to explain this difference: uniformly-distributed initial conditions to explain fast errors (Laming, 1968) and Gaussian-distributed drift rate to explain slow errors (Ratcliff, 1978; Ratcliff and Rouder, 1998). Second, the classic DDM also had a sharper rise time than experimental data, so uniformly-distributed across-trial variability in non-decision time was introduced to capture this effect. A previous study validated predictions of these across-trial variability parameters (Wagenmakers et al., 2008a). Compared to the classic DDM, the ‘full DDM’ improves the fit to data, but does not expand the range of experiments which may be performed. Likewise, it does not facilitate extenstions to test alternative cognitive mechansims.. A more versatile framework is therefore needed.

The GDDM framework provides a flexible means for developing extensions to the DDM (Figure 1) which can accommodate a wide range of experimental designs and hypotheses about decision-making. The GDDM allows for extensions in five key aspects. First, the drift rate and noise level may depend on time. This allows a range of new experimental paradigms to be modeled using the DDM, such as doubly-stochastic evidence (Zylberberg et al., 2016), pulsed evidence (Huk and Shadlen, 2005), discrete evidence (Cisek et al., 2009; Brunton et al., 2013; Pinto et al., 2018), or changing evidence (Shinn et al., 2020). Additionally, the time dependence of drift rate and noise allows a time-variant urgency signal which modulates the gain of evidence (Cisek et al., 2009; Thura et al., 2012; Murphy et al., 2016; Drugowitsch et al., 2014a). Second, drift rate and noise level may depend on position. This allows for model features such as unstable (Wong and Wang, 2006; Roxin and Ledberg, 2008; Erlich et al., 2015) and leaky (Wong and Wang, 2006; Ossmy et al., 2013; Farashahi et al., 2018) integration. Leaky integration with a short time constant is functionally similar to the urgency-gated model with a low pass filter (Hawkins et al., 2015a; Thura and Cisek, 2014). Likewise, position-dependent variance may allow for multiplicative noise (Freyer et al., 2012). Third, the bound may change over time, which allows collapsing bounds according to an arbitrary function (Hawkins et al., 2015a; Evans and Hawkins, 2019; Drugowitsch et al., 2012), such as for an urgency signal. Fourth, models are parameterizable by an arbitrary dependence on fittable parameters or task conditions. These parameters may be shared between two related experimental tasks, such as an experimental and control task. It also allows alternative implementations of side bias (Hanks et al., 2011) and condition-dependent nonlinearities (Shinn et al., 2020). Fifth, across-trial parameter variability is supported within the GDDM framework for both starting point and non-decision time, but not drift rate.

Mathematically, these concepts are expressed using five functions, corresponding to the drift rate, noise, bound height, starting position, and post-factum modifications to the distribution (Equation 1). Collectively, the GDDM framework allows complicated models to be expressed in a consistent manner, simplifying the way which we may think about complicated DDM extensions.

GDDM mechanisms characterize empirical data

The cognitive mechanisms by which evidence is used to reach a decision is still an open question. One way to distinguish these mechanisms from one another is by fitting models of these cognitive mechanisms to empirical RT distributions. To demonstrate the flexibility of GDDMs, we constructed a GDDM including three plausible mechanisms which are difficult or impossible to implement in a standard DDM: collapsing bounds, leaky integration, and a parameterized nonlinear dependence of drift rate on stimulus coherence. As described below in more detail, we then tested the fit of a GDDM which includes these three mechanisms, compared to the DDM and the ‘full DDM’, in two psychophysical datasets from perceptual decision-making tasks: one from monkeys (Roitman and Shadlen, 2002), and one from human subjects (Evans and Hawkins, 2019). We found that these mechanisms allow the GDDM to provide a better fit to RTs with a more parsimonious model than the DDM or the ‘full DDM’ in the monkey dataset, suggesting that these mechanisms may be important to explaining the monkeys’ behavior. However, consistent with previous results (Hawkins et al., 2015a), the GDDM fit no better than the DDM or ‘full DDM’ in the human dataset, suggesting that different cognitive strategies may be used in each case.

Roitman and Shadlen, 2002 trained two rhesus monkeys on a random dot motion discrimination task in which dots moved randomly but with varying strengths of coherent motion to the left or right (Figure 2—figure supplement 1). Motion coherence was randomly selected to be one of six conditions ranging from 0% to 51.2%. The monkeys were required to determine the direction of coherent motion and respond via saccade to flanking targets. Response time was recorded, which we fit with the standard DDM, ‘full DDM’, and GDDM.

We compared the GDDM to the ‘full DDM’ and to three different implementations of the DDM which differ in the methodology used to estimate the RT distribution. A detailed comparison of these three packages—PyDDM, HDDM, and EZ-Diffusion—as well other common packages is provided in the section ‘Software package for GDDM fitting’. The DDMs included terms for drift, non-decision time, and either bound height or noise magnitude. The DDM fit by PyDDM assumed drift rate was a linear combination of some baseline drift rate and coherence (Equation 14), whereas the DDM fit by HDDM used separate drift rates for each coherence condition (Equation 16), and EZ-Diffusion fit all parameters fully independently for each coherence condition (Equation 17). We also fit a ‘full DDM’ model, which included across-trial variability in the drift rate, non-decision time, and starting position (Equation 15). Finally, we fit a GDDM, which consisted of a typical DDM with no across-trial variability in the parameters, but which demonstrates three mechanisms that are difficult or impossible to model in a standard DDM framework: leaky integration, an exponentially collapsing bound, and a drift rate which is determined from coherence using a fittable nonlinearity (Equation 13). Each of the two monkeys was fit using a separate set of parameters for all models.

Each model was fit to the data, using full-distribution maximum likelihood for PyDDM and HDDM, and an analytic expression for EZ-Diffusion (Figure 2, Figure 2—figure supplement 2). First we examine the fit of the model quantitatively. Bayesian information criterion (BIC) is a common way of quantifying model fit while penalizing for free parameters. According to BIC, the 6-parameter GDDM fit better than all other models (Figure 2a). To determine whether this improved fit was due to the penalty BIC places on the number of parameters, we compared the models directly with likelihood. Again, we found that the GDDM fit this dataset better than the other models which have more parameters (Figure 2b). Finally, we tested whether this improved fit is due to the fact that PyDDM and HDDM use full-distribution maximum likelihood as an objective function, whereas EZ-Diffusion does not. We found that the mean squared error (MSE) of the GDDM was also lower than all other models (Figure 2c). These findings of a good fit by this GDDM, compared to the DDM and the ‘full DDM’, can be interpreted as evidence supporting the cognitive mechanisms included in the GDDM.

Figure 2 with 3 supplements see all
Fit of DDM, ‘full DDM’, and GDDM to Roitman and Shadlen, 2002.

Five models (see Materials and methods) were fit with three different software packages to data from monkey N. Model fits are compared according to BIC (a), negative log likelihood (b), and mean squared error (MSE) (c). The psychometric (d) and chronometric (e) functions represent data as dots and models as lines. The probability density function (PDF) (f) of each model is shown for 12.8% coherence, 6.4% coherence, 3.2% coherence, and 0% coherence. The data RT histogram is shown in gray, and black dots show the smoothed kernel density estimate probability distribution derived from the data (Scott, 2015). Columns of the figure and the color of the line denote different models.

Additionally, we assessed the qualitative fit of the models to the data to gain insight into what aspects of the RT distributions the GDDM was better able to capture. All five models provided a good fit to empirical psychometric (Figure 2d and Figure 2—figure supplement 2d) and chronometric (Figure 2e and Figure 2—figure supplement 2e) functions. However, models varied substantially in their ability to fit the qualitative properties of the RT distribution. The standard DDMs fit with HDDM and PyDDM showed a faster increase in response rate for the RT distribution at all coherence levels, and hence underestimated the mode of the distribution (Figure 2f and Figure 2—figure supplement 2f). The DDM fit with EZ-Diffusion overestimated the rate of increase in responses but did not begin to increase until the monkeys had made many responses, better matching the mode but creating an incorrect shape (Figure 2f and Figure 2—figure supplement 2f). The ‘full DDM’ fit better than the DDMs, and managed to capture the general shape of the RT distribution at high coherences (Figure 2f and Figure 2—figure supplement 2f), in addition to low coherences for one monkey (Figure 2—figure supplement 2). However, the GDDM fit better and with fewer parameters than the ‘full DDM’, matching the rate of increased responses and the overall distribution shape for all models under all conditions (Figure 2f, Figure 2—figure supplement 2f).

The RT distributions produced by monkeys may not be representative of those produced by humans (e.g., Hawkins et al., 2015a), which show a characteristic skewed distribution. This skew is characteristic of the DDM without collapsing bounds, and highlights that a different strategy may be used, potentially driven by overtraining or task structure (Hawkins et al., 2015a). We therefore fit the same GDDM to a more representative dataset from humans performing a similar perceptual decision-making task. Specifically, from the publicly available dataset of Evans and Hawkins, 2019, which used a random dot motion discrimination task (similar to Roitman and Shadlen, 2002), we fit the same set of models to the trials with no feedback delay, which the authors had found were better fit with no or little collapse of decision bounds. Consistent with the findings of Hawkins et al., 2015a and Evans and Hawkins, 2019, here we found that the 6-parameter GDDM provides no better fit than the 11-parameter ‘full DDM’ and 8-parameter DDM fit using HDDM (Figure 2—figure supplement 3a–c). The GDDM, ‘full DDM’, and DDMs fit with PyDDM and HDDM are all able to provide good fits to the psychometric function, chronometric function, and RT distributions for human subjects (Figure 2—figure supplement 3d–f). This lack of improved fit by the GDDM suggests that the human subjects employed a cognitive strategy closer to the standard DDM, and that the specific mechanisms of this GDDM (collapsing bounds, leaky integration, and input nonlinearity) are not important for explaining psychophysical behavior in this dataset.

For the example datasets considered above, the GDDM presented here improved the fit to the monkey data and provided a good fit to the human data. We caution that this result does not necessarily mean that this particular GDDM is appropriate for any particular future study. Models should not only provide a good fit to data, but also be parsimonious in form and represent a clear mechanistic interpretation (Vandekerckhove et al., 2015). The examples here demonstrate the utility of GDDMs for quantitatively evaluating cognitive mechanisms of interest through fitting models to empirical data.

Methods for estimating RT distributions in the DDM and GDDM

There are three broad classes of methods for estimating a GDDM’s RT distribution. First, analytical expressions can be derived using stochastic calculus and sequential analysis (Wald, 1945; Anderson, 1960). Analytical expressions were favored historically because they can be computed quickly by humans or computers. However, apart from the simplest form of the DDM and a few particular classes of extensions, analytical solutions often do not exist. While analytical expressions offer maximum performance and accuracy, deriving new analytical expressions is not a practical way to rapidly implement and test new GDDMs.

The second method, trial-wise trajectory simulation, fully supports GDDM simulation. This method simulates the decision variable trajectories of many individual trials to sample RTs without producing a probability distribution (Figure 3a). Previous work has used trial-wise trajectories to simulate DDMs with no known analytical solution (e.g. Hawkins et al., 2015a). However, this flexibility comes at the cost of performance. As we show in section ‘Execution time vs error tradeoff’, trial-wise trajectory simulation is very slow and provides a coarse-grained representation of the model. Since this method does not produce a probability distribution, it is difficult to use efficient and robust methods such as full-distribution maximum likelihood for fitting data, though progress is ongoing in this area using likelihood-free methods such as through approximate Bayesian computation (Holmes and Trueblood, 2018).

Methods for estimating GDDM RT distributions.

(a) A schematic of trial-wise trajectory simulation methods. In a single trial, a subject’s decision process can be summarized by a stochastic decision variable (DV); simulating this single-trial decision variable many times results in a histogram approximation to the RT distribution. (b) A schematic of solving the Fokker-Planck equation. The full probability distribution of decision variables is solved at each time point, providing a numerical solutions with higher accuracy and reduced execution time.

By contrast, the third method may use the Fokker-Planck equation to numerically compute the first passage time distribution. The Fokker-Planck equation is a mathematical expression which reformulates the trial-by-trial diffusion process: rather than simulating single-trial decision variable trajectories, this method ‘simulates’ all possible paths at once by propagating the distribution of single-trial decision variable trajectories over time (Figure 3b). Unlike analytical solutions, numerical solutions from the Fokker-Planck equation can be computed for all GDDMs. They also generate a probability density function for the RT distribution instead of a sample, allowing fitting with full-distribution likelihood methods. Several algorithms may be used to solve the DDM or GDDM using the Fokker-Planck equation, each with different advantages and disadvantages.

We consider three such algorithms: forward Euler, backward Euler, and Crank-Nicolson (see Appendix). These algorithms differ in how they propagate the probability distribution over time: forward Euler iteratively approximates the probability distribution of trajectory position at each timestep using the distribution at the previous timestep, while backward Euler and Crank-Nicolson iteratively solve systems of linear equations to approximate the probability distribution (Voss and Voss, 2008). The three algorithms also differ in their conceptual and technical difficulty to implement, with forward Euler being easier than backward Euler or Crank-Nicolson. In section ‘Execution time vs error tradeoff’, we will show that the backward Euler and Crank-Nicolson algorithms provide higher accuracy for a given execution time—and a faster execution time for a given accuracy level—than the forward Euler algorithm and trial-wise trajectory simulation.

Execution time vs error tradeoff

Next, we evaluated the effectiveness of different methodologies for estimating the RT distribution. Two general methodological considerations for all of these models are the precision with which the simulations are performed (accuracy), and the amount of time it takes the simulation to run on a computer (execution time). Accurate RT distributions are important to ensure the result faithfully represents the model, and fast execution times are important to increase the feasibility of fitting models to data. In general, for a given algorithm, there is a tradeoff between execution time and the accuracy of the solution obtained. One algorithm may be more efficient than another algorithm if it produces similar accuracy in a shorter amount of time, or better accuracy in the same amount of time.

We directly evaluate the execution time vs error tradeoff in trial-wise trajectory simulation and in solving the Fokker-Planck equation. To do so, we consider a standard DDM with constant drift, diffusion, and bounds (Equation 18). While this is not itself a GDDM, such a model has an analytical solution to the reaction-time distribution (as a closed-form infinite sum) (Anderson, 1960; Ratcliff, 1978) which can be used as ground truth for computing the error in numerical estimates (see Materials and methods).

First, we test trial-wise trajectory simulation. We perform a parameter sweep across timesteps Δt and sample sizes N. We simulate using the Euler-Maruyama method, which is standard in the DDM literature (e.g. Hawkins et al., 2015aHawkins et al., 2015bEvans et al., 2017; Harty et al., 2017; Zhou et al., 2009; Atiya et al., 2020; Simen et al., 2011; Nguyen et al., 2019; Lewis et al., 2014; Verdonck et al., 2016) and for simple models provides identical truncation error but better execution time than more sophisticated methods such as stochastic fourth-order Runge-Kutta (Gard, 1988; Brown et al., 2006). Because decreasing Δt for a fixed N increases the number of timebins and thus increases the variability across each timebin, we employed a smoothing procedure on the resulting simulated histogram to account for this. For each simulation, we smoothed the simulated RT histogram with all mean filters of size 1 to 500 points and chose the filter with the lowest error; since a filter of size one is equivalent to applying no smoothing, this procedure could only improve accuracy. In practice, selecting a filter this way would not be possible due to the lack of analytical ground truth for most GDDMs, and so this smoothing procedure artificially inflates the method’s performance compared to other methods, and effectively yields an upper bound on trial-wise trajectory performance. Execution time and error are shown separately for a range of numerical parameters (Figure 4a,b). As a proxy for the execution time vs error tradeoff, we also show the product of execution time and mean squared error for the parameter sweep (Figure 4—figure supplement 1).

Figure 4 with 2 supplements see all
Execution time vs error tradeoff.

A DDM with constant drift, noise, and bounds was estimated for different spatial (Δx) and temporal (Δt) resolutions. Execution time (a) and MSE (b) are shown across values of Δx and Δt. MSE was computed against the analytical DDM solution for the full distribution. Darker colors indicate a more favorable execution time vs error tradeoff. Gray color indicates numerical instability. (c) Relationships between execution time and error which approximately maximize this tradeoff are shown. Parametric expressions were empirically determined to approximately maximize accuracy vs error tradeoff (see Figure 4—figure supplement 1), and are given by Δt=0.43Δx2 for forward Euler and Δt=Δx for trial-wise trajectory simulation, backward Euler, and Crank-Nicolson. The lower left indicates low error and fast execution time whereas the upper right indicates slow execution time and high error.

In general, trial-wise trajectory simulation provides a suboptimal execution time vs error tradeoff. Using the product of MSE and execution time as a proxy measure of overall performance to illustrate this tradeoff, trial-wise trajectories perform most efficiently for highly imprecise models, i.e. those with a large timestep and a small number of trajectories simulated (Figure 4—figure supplement 1). To examine the execution times and accuracies which lead to this tradeoff, we fix Δt=0.01 and plot the execution time and accuracy across many different values of sample size N (Figure 4c). We see that as execution time increases, accuracy does not increase by more than one order of magnitude. Examples of these traces are provided in Figure 4—figure supplement 2.

Next, we examine methods for solving the Fokker-Planck equation. We consider three different algorithms: forward Euler, backward Euler, and Crank-Nicolson (Table 1). The execution time vs accuracy tradeoff of these three algorithms are governed by two parameters: the timestep Δt, and the granularity of decision variable discretization Δx. Decreasing either of these two parameters slows down the execution time but increases accuracy. These three algorithms operate in a distinct but related manner. Forward Euler is the easiest algorithm to implement, however it is only numerically stable for values of Δt which are very small relative to Δx. Forward Euler is an approximation in the limit as Δt0, and thus cannot be run for large Δt step sizes relative to Δx (see Materials and methods). Backward Euler solves this problem by providing a numerically stable solution across all values of Δt and Δx. While it provides a similar execution time and accuracy as forward Euler given Δt and Δx, the fact that it is numerically stable across all values of these parameters allows more efficient values of Δt and Δx to be chosen. Crank-Nicolson further improves the accuracy and execution time over backward Euler (Table 1). Like backward Euler, Crank-Nicolson does not have theoretical constraints on Δx or Δt, but unlike backward Euler, it may experience numerical instability for models with time-variant integration bounds. In practice, for each of these methods, the required maximum step size is constrained by how strongly varying the drift and noise are in space and time, where more varying drift would require a smaller step size. Convergence of the results can be evaluated by checking results with smaller step sizes.

Table 1
Methods for estimating RT distributions, in order of priority within PyDDM.
Time accuracyGrid accuracyRequirements
AnalyticalExactExactTime- and position-independent drift rate, diffusion constant, and bounds (or linearly collapsing bounds); centered point source initial condition
Crank-NicolsonO(Δt2)O(Δx2)Time independent bounds
Backward EulerO(Δt)O(Δx2)None (fallback for all models)
Forward EulerO(Δt)O(Δx2)Numerics Δt/Δx2<1/σ2 (not used by PyDDM)
Trial-wise trajectoriesN/AN/A(not used by PyDDM)

To evaluate the execution time vs error tradeoff, we perform a parameter sweep across values of Δt and Δx for these three algorithms and again compute the product of execution time and mean squared error as a proxy for execution time vs error tradeoff (Figure 4a,b, Figure 4—figure supplement 1). Overall, the execution time vs error tradeoff is one to six orders of magnitude better for these algorithms than for trial-wise trajectory simulation. We find that, for backward Euler and Crank-Nicolson, the execution time vs error tradeoff in the model shown in Figure 4—figure supplement 1 is maximized approximately when Δt=Δx. Forward Euler is unstable in this parameterization, with the best execution time vs error tradeoff when it is closest to its stability condition. We examine the execution time and error for parameterizations which satisfy these constraints, and compare them to trial-wise trajectory simulation (Figure 4c). We see that Crank-Nicolson is strictly better than other methods: for a given error, Crank-Nicolson offers a faster execution time, and for a given execution time, Crank-Nicolson offers lower error. However, since Crank-Nicolson may not be appropriate for models with time-variant boundaries, the second best method is also of interest. We see that, for a given execution time, backward Euler is better than forward Euler by two orders of magnitude, and better than trial-wise trajectory simulations by up to five orders of magnitude. Therefore, solving the Fokker-Planck equation using Crank-Nicolson and backward Euler represents a key advantage over the forward Euler and trial-wise trajectory simulation methods.

Software package for GDDM fitting

Overview of our software package

We developed a software package to encapsulate the methodological framework we have presented. Our package, PyDDM, allows the user to build models, simulate models, and fit them to data using the efficient GDDM framework methodology described previously. The structure and workflow of the package were designed to mirror the GDDM framework (Figure 5—figure supplement 1). PyDDM selects the most efficient approach to solve a GDDM by automatically determining whether the model can be solved analytically and whether it uses time-varying bounds (Table 1). PyDDM is also built to emphasize the modularity of the GDDM framework. Components of models may be readily reused in other models, and to promote this, we additionally provide an online database of models (the ‘PyDDM Cookbook’ https://pyddm.readthedocs.io/en/latest/cookbook/index.html) to promote code sharing and replication. Users are encouraged to submit their models to this database.

Furthermore, PyDDM is designed to accommodate the arbitrary flexibility of GDDMs. For reliable parameters estimation, it supports differential evolution (Storn and Price, 1997) as a global optimization method in addition to local search methods such as the Nelder-Mead simplex algorithm (Nelder and Mead, 1965). Fits are performed using full-distribution maximum likelihood on the full probability distribution in order to use all available data to estimate parameters. Likelihood can be overly sensitive to outliers or ‘contaminants’ (Ratcliff and Tuerlinckx, 2002), and this problem has been solved elsewhere by using a mixture model of the DDM process with a uniform distribution in a fixed ratio (Wiecki et al., 2013; Wagenmakers et al., 2008b; Voss and Voss, 2007). PyDDM builds on this methodology by allowing a parameterizable mixture model of any distribution and a fittable ratio. Because some models can be difficult to understand conceptually, PyDDM also includes a GUI for visualizing the impact of different model parameters (Figure 5—figure supplement 2), analogous to a GDDM-compatible version of the tool by Alexandrowicz, 2020.

Comparison to other software packages

Besides PyDDM, several software packages are available to simulate DDMs, and one package to simulate GDDMs. We compare these in detail in Table 2, to emphasize a few key points of comparison.

Table 2
Comparison of existing DDM packages.

PyDDM is compared to HDDM (Wiecki et al., 2013), EZ-Diffusion (Wagenmakers et al., 2007), CHaRTr (Chandrasekaran and Hawkins, 2019), Diffusion Model Analysis Toolbox (DMAT) (Grasman et al., 2009; Vandekerckhove and Tuerlinckx, 2008), and fast-dm (Voss and Voss, 2007; Voss and Voss, 2008; Voss et al., 2015). Red indicates limited flexibility, yellow indicates moderate flexibility, and green indicates maximal flexibility. For the solver, these colors indicate minimal, moderate, and maximal efficiency.

PyDDMHDDMEZ-DiffusionCHaRTrDMATfast-dm
LanguagePython3Python2/3Matlab, R, Javascript, or ExcelRequires both R and CMatlabCommand line
SolverFokker-Planck, analyticalAnalytical numerical hybridNoneNone (Monte Carlo)Analytical numerical hybridFokker-Planck
Task parameters
Time dependence of drift/noiseAny functionConstantConstantAny functionConstantConstant
Position dependence of drift/noiseAny functionConstantConstantAny functionConstantConstant
BoundsAny functionConstantConstantAny functionConstantConstant
Parameter dependence on task conditionsAny relationship for any parameterRegression modelCategoricalCategoricalLinearCategorical
Across-trial variability
Across-trial drift variabilitySlow discretization (via extension)Normal distributionNoneAny distributionNormal distributionNormal distribution
Across-trial starting point variabilityAny distributionUniform distributionNoneAny distributionUniform distributionUniform distribution
Across-trial non-decision variabilityAny distributionUniform distributionNoneAny distributionUniform distributionUniform distribution
Model simulation and fitting
Hierarchical fittingNoYesNoNoNoNo
Fitting methodsAny numerical (default: differential evolution)MCMCAnalyticalAny numericalNelder-MeadNelder-Mead
Objective functionAny function (default: likelihood)LikelihoodMean/stdev RT and P(correct)Any sampled (e.g. quantile maximum likelihood)Quantile maximum likelihood or chi-squaredLikelihood, chi-squared, Kolmogorov-Smirnov
Mixture modelAny distribution(s)UniformNone (extendable)NoneUniform and undecided guessesUniform

The choice of software package depends on the model to be fit. As demonstrated in Figure 2 and noted previously (Ratcliff, 2008), fitting a diffusion model to a small number of summary statistics, as in EZ-Diffusion, can lead to poor qualitative and quantitative model fit to the RT distribution. Better options for fitting the DDM and ‘full DDM’ are available, such as fast-dm and HDDM. However, these packages are limited in model form, and make it difficult to extend beyond the ‘full DDM’ to test additional cognitive mechanisms or harness time-varying stimulus paradigms. As we have shown previously (Figure 2), the GDDM is desirable for obtaining good model fits with few parameters.

The major highlight of PyDDM compared to other packages is that it allows GDDMs to be simulated, and is thus compatible with time- and position-dependent drift rate and noise, time-varying bounds, starting point variability, and non-decision time variability. Besides PyDDM, the only other package to our knowledge capable of simulating GDDMs is the new R package CHaRTr (Chandrasekaran and Hawkins, 2019). This package is a major innovation compared to previous approaches, as it allows flexible model forms to be tested. However, CHaRTr is based on trial-wise trajectory simulations. As shown previously (Figure 4), trial-wise trajectory simulation is especially inefficient, reducing the accuracy of the model simulations and increasing execution time by several orders of magnitude. In practice, this makes it infeasible to fit data to more flexible models. For the user to define new models, CHaRTr requires defining models in C and modifying the CHaRTr source code. By contrast, PyDDM facilitates the specification and implementation of models by defining them in user scripts written in pure Python.

Parallel performance

In addition to PyDDM’s efficient algorithms for solving GDDM models, PyDDM also includes the ability to further speed up simulations by solving models in parallel. Any model can be converted from single-threaded to multi-threaded with a single function call. PyDDM solves models for different conditions on multiple CPUs via the ‘pathos’ library (McKerns et al., 2012). It uses a so-called ‘embarrassingly parallel’ algorithm, which turns the most essential parts of the simulation into independent tasks and distributes them equally across CPUs. Such parallelization can be enabled in any PyDDM script with one line of code.

In a perfectly efficient parallelized environment, a simulation utilizing N CPUs should be N times faster than a simulation utilizing one CPU. In practice, this is rare due to communication latency and other overhead. We evaluate the speedup gained by solving an example DDM (Equation 19)) on more than one CPU (Figure 5). As expected, the execution time decreases when more CPUs are used to perform the simulations. However, this speedup is strikingly linear (Figure 5a); for up to 20 CPUs, PyDDM achieves close to 100% parallel efficiency (Figure 5b). This means that PyDDM is able to effectively utilize multiple processors to decrease execution time. In practice, the 6-parameter GDDM shown above fits on one CPU in under 15 minutes, and the authors routinely fit 15-parameter GDDMs on 8 CPUs with differential evolution for global parameter optimization in under 5 hours.

Figure 5 with 2 supplements see all
Parallel performance.

Simulations of a standard DDM were performed, varying numbers of CPUs each with ten replicates. (a) The speedup, defined as (execution time on 1 CPU)/(execution time on N CPUs), and (b) parallel efficiency, defined as speedup/(N CPUs), are shown for different numbers of CPUs. Because measured execution time varied run to run, the mean parallel efficiency could sometimes exceed 1. Error bars are bootstrapped 95% confidence intervals. Solid black lines indicate data, and dashed green lines indicate the theoretical maximum under noiseless conditions. Confidence intervals in (a) are hidden beneath the markers.

Discussion

The GDDM framework provides a consistent description of model extensions, and allows researchers to efficiently simulate and fit a wide variety of extensions to the DDM. Using two open datasets, we demonstrated that GDDMs are able to fit experimental data with high accuracy and few parameters. Model RT distributions are solved and fit numerically using multiple fast and accurate methods for solving the Fokker-Planck equation. We built the PyDDM software package to implement the GDDM framework. PyDDM makes it easy to utilize the efficiency of this framework, while providing additional technical advantages over other packages.

The major benefit of our GDDM framework is the flexibility and ease with which the model may be extended. The GDDM generalizes the DDM to support a wide variety of task paradigms and cognitive mechanisms. Any parameter in the standard DDM is allowed to be an arbitrary function of time and/or position when applicable. Mechanisms such as leaky or unstable integration, collapsing bounds, time-dependent drift rates, and complex non-decision time and starting point distributions all fall within the GDDM framework and can be readily simulated in PyDDM. GDDMs may be built modularly, eliminating the need to reimplement a separate model for each new mechanism. This modularity also helps promote code reuse through our online database of PyDDM model components.

GDDMs may be simulated by efficiently solving the Fokker-Planck equation. In our tests, trial-wise trajectory simulation performed poorly because large increases in the number of trials did not result in as large of increases in accuracy. While the smoothing procedure we applied improved accuracy, trial-wise trajectory simulation was unable to obtain low error for reasonable execution times. The Crank-Nicolson and backward Euler algorithms performed up to five orders of magnitude better than trial-wise trajectory simulation. This increase in performance is consistent with previous work which used Crank-Nicolson to solve the Fokker-Planck equation for the ‘full DDM’ (Voss and Voss, 2008). However, increasing performance is critical for the GDDM not only because it reduces simulation time, but because it leads to a qualitative difference in the types of models which may be fit to data. While inefficient methods are sufficient for simulating a single instance of the model, fitting models with many parameters to data requires thousands of simulations or more. By providing efficient algorithms for solving this broad class of models, our framework makes it possible to fit parameters to data for models which were previously prohibitively time consuming. In this way, it expands the range of potential cognitive mechanisms which can be tested and experimental paradigms which can be modeled.

PyDDM enables researchers to build very complex models of decision-making processes, but complex models are not always desirable. Highly complex models may be able to provide an excellent fit to the data, but unless they represent specific, plausible cognitive mechanisms, they may not be appropriate. PyDDM’s complexity can be used to examine a range of potential cognitive mechanisms which were previously difficult to test, and to probe a range of task paradigms. Additionally, sometimes complex models which do represent appropriate cognitive mechanisms may exhibit inferior performance due to limitations of the available data. For instance, simulation studies suggest that when data are sparse, Bayesian fitting or EZ-Diffusion may be better able to recover parameters (Lerche et al., 2017; Wiecki et al., 2013; van Ravenzwaaij et al., 2017).

When the hypothesis demands additional model complexity, though, one critical check is to make sure the parameters of the models are recoverable. That is, given a set of models, simulated data from each model should be best fit by the model and parameters that simulated it (Wiecki et al., 2013). This explicitly tests against model and parameter degeneracy, a known weakness of the ‘full DDM’ (van Ravenzwaaij and Oberauer, 2009; Boehm et al., 2018). Individual GDDMs must be tested for model and parameter recovery before claims can be made about model mechanisms. PyDDM makes it straightforward to test whether a given GDDM has parameters which are recoverable, and whether any two given models can be distinguished from one another. This is facilitated because synthetic RT data can be sampled from a the calculated probability distributions of the GDDM without the need to run trial-wise trajectory simulations.

The GDDM framework supports across-trial variability in non-decision time and starting position according to any arbitrary distribution, but it does not support across-trial drift rate variability. Across-trial drift rate variability is an inherent weakness of the Fokker-Planck approach. It is technically possible to model across-trial drift rate variability using our framework, by discretizing the drift rate distribution and solving the Fokker-Planck equation for each case separately (Voss and Voss, 2008). PyDDM is not set up to ergonomically reconcile across-trial drift rate variability with time-varying evidence, which is a core feature of PyDDM. As a result, the GDDM framework does not fully support the across-trial drift rate variability aspect of the ‘full DDM’. Across-trial drift rate variability is a potentially plausible cognitive mechanism used in prior literature to account for slow errors (Ratcliff and Rouder, 1998). Of note, studies suggest this mechanism may not be necessary to include to fit data (Lerche and Voss, 2016; Lerche et al., 2017). GDDMs offer researchers the capability to explore alternative cognitive mechanisms which might also account for slow errors.

PyDDM uses numerical solutions of the Fokker-Planck equation to estimate the RT distribution of the model. Other methods may be used for this estimation, such as solving the Volterra integral equation (Smith, 2000). Some prior work has used these alternative methods for solving GDDMs (Drugowitsch et al., 2012; Zhang et al., 2014; Drugowitsch et al., 2014c). Such methods already have fast solvers (Drugowitsch, 2016), which could be implemented as an alternative method within PyDDM in the future.

PyDDM also has the potential to support other applications of interest in the field. First, it could be used as a simulation engine for fitting hierarchical models of GDDM parameters, in which each subject’s individual parameters are drawn from a group distribution. The most critical component to hierarchical model fitting is an efficient method for simulating model likelihood. Since PyDDM provides such a method, hierarchical fitting of GDDMs could be readily implemented as an extension to PyDDM. Second, while PyDDM currently only implements bounds which are symmetric around zero, support for asymmetric bounds is a feasible future addition. Third, PyDDM may be extended to support multi-dimensional diffusion, enabling it to represent interacting race models (Purcell et al., 2010; Usher and McClelland, 2001; Bogacz et al., 2006; Tsetsos et al., 2012; Drugowitsch et al., 2014b). The Fokker-Planck equation can be solved in higher dimensions, in principle enabling similar highly efficient methods to be used for these models as well. The GDDM framework would require only minor adaptations to be compatible with multi-dimensional diffusion. Fourth, it may be extended to allow fitting models to neural recordings or other data which represents the value of the decision variable, as in Zoltowski et al., 2020. Currently, PyDDM only supports fitting to the RT distribution.

In summary, we show that the GDDM framework is a flexible and efficient way to explore new extensions on the DDM, allowing new possibilities in task design. Our package, PyDDM, offers a convenient and modular method for implementing this framework.

Materials and methods

GDDM description

Request a detailed protocol

The GDDM is described as the following differential equation for decision variable x:

(1) dx=μ(x,t,)dt+σ(x,t,)dW

where ‘’ represents task conditions and fittable parameters. Initial conditions are drawn from the distribution xX0(). The process terminates when |x(t,)|B(t,). This can be parameterized by five functions:

  • μ(x,t,) – The instantaneous drift rate, as a function of the decision variable position, time, and task parameters. In PyDDM, it is represented by the ‘Drift’ class and defaults to μ(x,t,)=0.

  • σ(x,t,) – The instantaneous noise, as a function of the decision variable position, time, and task parameters. In PyDDM, it is represented by the ‘Noise’ class and defaults to σ(x,t,)=1.

  • X0() – The probability density function describing the initial position of the decision variable, as a function of task parameters. It must hold that xX0(x,)=1. In PyDDM, it is represented by the ‘IC’ class and defaults to a Kronecker delta function centered at x=0.

  • B(t,) – The boundaries of integration, as a function of time and task parameters. In PyDDM, it is represented by the ‘Bound’ class and defaults to B(t,)=1.

  • pC*(t,) and pE*(t,) – Post-processing performed on the simulated first passage time distributions of correct and error RT distributions pC(t) and pE(t), respectively. In PyDDM, these functions are represented by the ‘Overlay’ class and default to pC*(t,)=pC(t) and pE*(t,)=pE(t). In practice, ‘Overlay’ is used to implement non-decision times and mixture models for fitting contaminant responses.

The organization of the code in PyDDM mirrors this structure (Figure 5—figure supplement 1).

The GDDM framework supports Markovian processes, in that the instantaneous drift rate and diffusion constant of a particle is determined by its position xt and the current time t, without dependence on the prior trajectory leading to that state. Drift rate is discretized, such that the timestep of the discretization is equal to the timestep of the Fokker-Planck solution. In practice, for typical task paradigms and dataset properties, the impact of this discretization is expected to be minimal for a reasonable timestep size.

Fokker-Planck formalism of the GDDM

Request a detailed protocol

PyDDM solves the GDDM using the Fokker-Planck equation, a partial differential equation describing the probability density ρ(x,t,) of the decision variable to be at position x at time t, with initial condition X0():

(2) tρ(x,t,)=-x(μ(x,t,)ρ(x,t,))+2x2(σ2(x,t,)2ρ(x,t,)).

By definition, the GDDM has absorbing boundary conditions (ρ(x,t,)=0 for all x-B(t,) and xB(t,)), and any mass of the probability density crossing the boundaries are treated as committed, irreversible decisions.

The Fokker-Planck equation permits numerical solutions of the GDDM. The numerical algorithm is described for the GDDM with fixed bounds in this section, and the algorithm for the GDDM with time-varying bounds is described in the next section. The system is first discretized with space and time steps Δx and Δt. Without loss of generality, we assume Δx divides 2B(t,) and Δt divides the simulated duration. The probability density solution of Equation 2, ρ(x,t,), can then be approximated as the probability distribution function at spatial grids {-B(t,)+Δx,-B(t,)+2Δx,,0,,B(t,)-2Δx,B(t,)-Δx} and temporal grids {nΔt} for any non-negative integer n. Due to absorbing bounds, the two spatial grids one step outward at ±B(t,) have 0 density, and are not considered explicitly. The choice of Δt is constrained by the stability criteria and accuracy vs execution time tradeoff (see Figure 4).

Define Pim as the probability distribution at the ith space-grid and the mth time-grid. For clarity, here and below Pim denotes the total probability inside a grid of lengths Δx and Δt, and its center represented by (i,m). As such, Pim is unitless, in contrast to ρ(x,t) which has a unit of the inverse of the spatial dimension. Probability distribution functions can also be viewed as a discrete proxy of the continuous probability density (i.e. approximating ρ(x,t) at the corresponding space and time grid), which will then have a unit of the inverse of the spatial dimension. Both views are mathematically equivalent (and just off by a Δx factor to the whole equation), but the former definition was used for simplicity and interpretability.

The Fokker-Planck equations can be discretized in several distinct ways, including the forward Euler method, the backward Euler method, and the Crank-Nicolson method. In the forward Euler method, Equation 2 is approximated as:

(3) PjnPjn1Δt=(μP)j+1n1(μP)j1n12Δx+(DP)j+1n1+(DP)j1n12(DP)jn1Δx2

where D=σ2/2 is the diffusion coefficient. Here and below, dependencies of space, time, and other variables are omitted for simplicity. Terms in parenthesis with subscripts and superscripts are evaluated at the subscripted space grid and superscripted time grid, in the same manner as Pjn. The spatial derivatives at the right side of Equation 2 are evaluated at the previous time step n-1, such that only one term of the equation is at the current time step n, generated from the temporal derivative at the left side of Equation 2. As such, Equation 3 for each j fully determines Pjn from the probability distributions at the previous time-step:

(4) Pjn=Pjn1Δt2Δx((μP)j+1n1(μP)j1n1)+ΔtΔx2((DP)j+1n1+(DP)j1n12(DP)jn1)

It is important to note that the forward Euler method, Equation 4, is an approximation method. Since the probability distribution is iteratively propagated from early to later times, any errors of the solution accumulate over the iterative process. Moreover, the forward Euler method is prone to instability. Assuming small Δx and Δt (i.e. much less than 1), ΔtΔx2 is much larger than Δt2Δx and thus the third term on the right side of Equation 4 should in general be much larger than the second term. If Δx and Δt are such that the third term is large for any n and j, then Pjn will not be properly computed in Equation 4, and such errors will propagate to the whole space over time. The forward Euler method thus has a stability criteria:

(5) DΔtΔx2<12

Equation 5 poses a strict constraint on the step-sizes: while Δx and Δt should both be small to minimize the error due to discretization, decreasing Δx demands a much stronger decrease in Δt, resulting in a much longer execution time of the method. This can be seen in our simulations shown in Figure 4.

The backward Euler and Crank-Nicolson methods largely circumvent the aforementioned problems. In the backward Euler method, the right side of Equation 2 is expressed at the current time step n:

(6) Pjn-Pjn-1Δt=-(μP)j+1n-(μP)j-1n2Δx+(DP)j+1n+(DP)j-1n-2(DP)jnΔx2,

In the Crank-Nicolson method, half of the right side of Equation 2) is expressed at the current time step n, and half at the previous time step n-1:

(7) PjnPjn1Δt=0.5((μP)j+1n1(μP)j1n12Δx+(DP)j+1n1+(DP)j1n12(DP)jn1Δx2)+ 0.5((μP)j+1n(μP)j1n2Δx+(DP)j+1n+(DP)j1n2(DP)jnΔx2),

With multiple terms at the current time-step, Equation 6 and Equation 7 cannot be directly solved. Instead, the equations across all spatial grids can be summarized in matrix form:

(8) (𝟙+(1α)M)Pn=(𝟙αM)Pn1for Pn=(Px=B+ΔxnPx=BΔxn),M=(2νν+χ/2000νχ/22νν+χ/2000νχ/2000ν+χ/2000νχ/22ν),

Here, ν=DΔt/Δx2 and χ=μΔt/Δx. In this formulation, α=1 for the forward Euler method, α=0 for the backward Euler method, and α=0.5 for the Crank-Nicolson method. Since 𝕄 is a tridiagonal matrix, it can be inverted from the left side to the right side of the equation in an efficient manner. Practically, this allows the backward Euler and Crank-Nicolson method to rapidly generate the discretized probability distribution function inside the boundaries at each time-step.

Finally, the probability of decision formation at each time step is computed as the outward flux from the outermost grids ±BΔx (to the hypothetical grids at ±B):

(9) Δpcorrectn=(νχ/2)|x=BΔxn px=BΔxnΔpincorrectn=(ν+χ/2)|x=B+Δxn px=B+Δxn

This provides the correct and incorrect reaction time distributions, as well as the total correct (nΔpcorrectn) and incorrect (nΔpincorrectn) probability.

Unlike the forward-Euler method which iteratively propagates the solution and the approximation error in a feedforward manner, the backward Euler and Crank-Nicolson methods solve for the matrix equation of the probability distribution evolution, and are less susceptible to error being propagated. In addition, the two methods do not have any stability criteria which constrain the step size. Contrasting the two methods, the backward Euler method has larger truncation error (O(Δx,Δt2)) than the Crank-Nicolson method (O(Δx2,Δt2)) (Table 1), while the Crank-Nicolson method generally has twice the execution time of the backward Euler method, assuming the matrix construction is the slowest step of the algorithms. Crucially, the solution from the Crank-Nicolson method tends to be susceptible to oscillations over iterations, especially with varying bounds (see below). PyDDM automatically chooses the best solver for any given model (Table 1).

Fokker-Planck formalism with time-varying bounds

Request a detailed protocol

The previous section considered the Fokker-Planck equation of the GDDM assuming fixed bounds. The formalism can be adapted to accommodate time-varying bounds. This is more generally studied as a free boundary problem, with a typical example including the Stefan problem (Meyer, 1978; Li, 1997).

In particular, consider upper and lower bounds Bn=B(nΔt,) and -Bn=-B(nΔt,) at an arbitrary time t=nΔt. If Bn is exactly on a grid point, the previous results apply. Alternatively, if Bn is in between two grid points (xj,xj+1), the solution is approximated by a linearly weighted summation of the solutions in two grid systems: an inner grid system {xk|k=-j,,j} and an outer grid system {xk|k=-j-1,,j+1}.

Define Pin/outn to be the probability distribution function at time t=nΔt over the inner and outer grid systems, respectively. In the backward Euler method (α=0 in Equation 8), Pin/outn can be computed by propagating from the actual probability distribution function at the previous step Pn-1:

(10) (𝟙+𝕄in/out)Pin/outn=Pn-1,

where 𝕄in/out are as in Equation 8, but defined over the domain of inner/outer grids respectively.

Furthermore, define

(11) wout=(Bnxj)/Δxwin=(xj+1Bn)/Δx

as the weights of the linear approximation of the solution to that of the outer and inner grid systems. The resulting probability distribution function and the probabilities of decision formation at step n are:

(12) Pn=woutPoutn+winPinnΔpcorrectn=wout(ν+χ/2)|x=xj+1n Pj+1,outn+win(ν+χ/2)|x=xjn Pj,innΔpincorrectn=wout(νχ/2)|x=xj1n Pj1,outn+win(νχ/2)|x=xjn Pj,inn

In the literature of the DDM in cognitive psychology and neuroscience, collapsing bounds are the most common form of time-varying bounds considered. As bounds collapse, it is possible that the outermost grids at the previous step (with non-zero probability densities) become out-of-bound at the current step. Based on the side they become out of bound, such probabilities are added to the correct and incorrect probabilities respectively (and multiplied by wout/in for each of the outer/inner grid system). Nevertheless, increasing bounds are also supported. The algorithm for the Crank-Nicolson scheme of time-varying bounds often generates oscillatory solutions, and is thus not described here or implemented in PyDDM.

Empirical datasets

Request a detailed protocol

As testbeds for fitting different models, we used two empirical datasets of choices and RTs during perceptual decision-making tasks: one from Roitman and Shadlen, 2002, one from Evans and Hawkins, 2019. These references contain full details of the tasks and datasets. For the dataset of Roitman and Shadlen, 2002, we used trials from the ‘reaction time’ task variant, for both monkeys. For the dataset of Evans and Hawkins, 2019, we used trials from the task variant with no feedback delay.

In brief, these two publicly available datasets were selected for the following reasons. Both use random dot motion discrimination tasks, which facilitate comparison of behavior and fitting with the same GDDM. These two datasets span species, with behavior from monkeys (Roitman and Shadlen, 2002) and human subjects (Evans and Hawkins, 2019). Furthermore, both datasets have been analyzed in prior work fitting DDMs with collapsing bounds. Specifically, Hawkins et al., 2015a generally found that the dataset of Roitman and Shadlen, 2002 was better fit with collapsing bounds than with constant bounds, in contrast with typical human datasets which are well fit with constant bounds. In line with those findings, Evans and Hawkins, 2019 showed that behavior in a task variant with no feedback delay yielded behavior which was more representative of human datasets and was well fit by a DDM with decision bounds with no or little collapse. These two datasets therefore provided useful testbeds for GDDM fitting with distinct a priori predictions about the utility of collapsing bounds in a GDDM.

Compared models

Request a detailed protocol

The 6-parameter GDDM fit by PyDDM is parameterized by

(13) μ(x,t,)=μ0(CCmax)αxσ(x,t,)=1X0(x,)={1,x=00,otherwiseB(t,)=B0et/τpi(t)=0.95pi(ttnd)+0.025

where fittable parameters are drift rate μ0, coherence nonlinearity α, leak constant ℓ, initial bound height B0, bound collapse rate τ, and non-decision time tnd. Additionally, Cmax is the maximum coherence in the experiment, which is fixed at 0.512 for the Roitman and Shadlen, 2002 dataset and 0.4 for the Evans and Hawkins, 2019 dataset.

The three-parameter DDM fit by PyDDM is a subset of the GDDM framework parameterized by

(14) μ(x,t,)=μ0Cσ(x,t,)=1X0(x,)={1,x=00,otherwiseB(t,)=B0pi(t)=0.95pi(ttnd)+0.025

where C is coherence and fittable parameters are μ0, B0, and tnd.

The 11-parameter ‘full DDM’ fit by HDDM does not fall into the GDDM framework because the across-trial variability in drift rate μ(x,t,) is a random variable. Thus, the ‘full DDM’ is parameterized by

(15) μ(x,t,)𝒩(μj,sv2)σ(x,t,)=1X0(x,)={12sz+1,szxsz0,otherwiseB(t,)=B0pi(t)=12sT+1k[sT,sT]pi(ttndk)

where 𝒩 is a normal distribution, and fittable parameters are six independent drift rates μj for each coherence level j, non-decision time tnd, and bounds B0, as well as variability in drift rate sv, starting position sz, and non-decision time sT. It was fit an outlier probability of p_outlier=0.05.

The 8-parameter DDM fit by HDDM is a subset of the GDDM framework parameterized by

(16) μ(x,t,)=μjσ(x,t,)=1X0(x,)={1,x=00,otherwiseB(t,)=B0pi(t)=pi(ttnd)

where fittable parameters are B0, tnd, and six independent drift rates μj, one for each coherence level j. It was fit an outlier probability of p_outlier=0.05.

The 18-parameter DDM fit by EZ-Diffusion is a subset of the GDDM framework parameterized by

(17) μ(x,t,)=μjσ(x,t,)=1X0(x,)={1,x=00,otherwiseB(t,)=Bjpi(t)=pi(ttndj)

where fittable parameters are independent drift rates μj, bounds Bj, and non-decision times tndj, for each of the six coherence level j.

All simulations were fit separately for each monkey. In the human fits, data were pooled across all subjects for the 0 s delay condition only.

Execution time and accuracy analysis

Request a detailed protocol

To analyze the execution time vs error tradeoff, we benchmarked a DDM which fits into the GDDM framework, defined by:

(18) μ(x,t,)=2σ(x,t,)=1.5X0(x,)={1,x=00,otherwiseB(t,)=1pi(t)=pi(t)

The DDM was chosen because it permits closed-form solutions, providing a comparison for benchmarking simulation accuracy. Performance was measured on a Lenovo T480 computer with an Intel M540 i7-8650U 4.2 GHz CPU with hyperthreading disabled. Verification and parallelization were disabled for these evaluations. The simulated time was 2 s.

In order to examine combinations of Δt and either Δx or N which simultaneously minimize both execution time and error, we plot the product of the MSE and execution time in seconds. This product should be interpreted as a heuristic which encompasses both execution time and accuracy rather than a fundamentally important quantity in and of itself.

Model for parallel performance

Request a detailed protocol

To analyze PyDDM’s parallel performance, we benchmarked a DDM which fits into the GDDM framework, defined by:

(19) μ(x,t,)=2Cσ(x,t,)=1X0(x,)={1,x=00,otherwiseB(t,)=1pi(t)=pi(t)

where the model was simulated using backward Euler for 64 coherence values C uniformly spaced from 0 to 0.63. Because PyDDM parallelizes functions by utilizing an embarrassingly parallel approach over experimental conditions, we included many such conditions to enable benchmarks which reflect the simulations instead of the specifics of the model. Simulations were performed on a high performance computing cluster using a single node with 28 Xeon E5-2660v4 cores. Verification was disabled for these simulations.

Validity of results

Request a detailed protocol

It is important that simulation results are valid and reliable. Results could be inaccurate either due to bugs in user-specified models or in PyDDM itself. Due to the ability to easily construct user-defined models, PyDDM takes this issue very seriously. In addition to utilizing standard software engineering practices such as unit and integration testing with continuous integration (Beck Andres, 2004), PyDDM is the first neuroscience software to our knowledge which incorporates software verification methods for ensuring program correctness (Ghezzi et al., 2002). It does so using a library which checks for invalid inputs and outputs within the code, and also checks for invalid inputs and outputs in user-defined models which do not explicitly incorporate the library (Shinn, 2020). Invalid inputs and outputs include anything which invalidates the assumptions of the GDDM, such as initial conditions outside of the bounds, noise levels less than zero, or probability distributions which do not sum to 1. Such checking helps ensure that results generated by PyDDM are not due to a configuration or programming mistake, which is especially important because user-specified models logic can quickly become complicated for intricate experimental designs. This, combined with the ease of reproducibility and code reuse afforded by open source software in general, means that high validity and reliability are key advantages of PyDDM.

Appendix 1

Implementing and fitting models in PyDDM

Briefly, we describe the process of building and fitting a model in PyDDM. For more detailed information, see the quickstart guide in the documentation at https://pyddm.readthedocs.io/en/latest/quickstart.html. A familiarity with Python and with Python classes is assumed.

PyDDM is based on a modular architecture, organized around the principles of the GDDM. There are five components, specified using Python classes, which correspond to the five functions listed in section ‘GDDM description’:

  • Drift: the drift rate. May depend on time, decision variable position, task conditions, and fittable parameters. Defaults to 0.

  • Noise: the standard deviation of diffusion process. May depend on time, decision variable position, fittable parameters, and task conditions. Defaults to 1.

  • Bound: The integration bounds (+B and -B). May depend on time, fittable parameters, and task conditions. Defaults to 1.

  • IC: The initial conditions at time t = 0, in terms of a probability density function over decision variable position. May depend on fittable parameters or task conditions. Defaults to a probability density function of a Kronecker delta function centered at x=0.

  • Overlay: Post-factum modifications to the histogram, for instance to create a mixture model or add a non-decision time. May depend on fittable parameters or task conditions. Defaults to an identity function.

These model components do not specify a single value for these parameters, but rather specify a function through which these values can be computed. Each may take any number of parameters and utilize any number of task variables. Most models will not implement all of these components from scratch, but will use the default component or a built-in component for most of them. For example, the GDDM in Figure 2 (Equation 13) used the default ‘Noise’ and ‘IC’ components, built-in Bounds and Overlay components, and a customized ‘Drift’ component. A full list of built-in components, as well as a library of example components, can be found in the online documentation.

Each of these model components may depend on parameters; for example, the ‘Bound’ component requires the parameters B0 and τ. These parameters must be specified when the class for the model component is instantiated, for example, ‘BoundCollapsingExponential’ is the class for exponential collapsing bounds used in the GDDM in Figure 2 (Equation 13). It can be instantiated (for B0=2 and τ=3) as

BoundCollapsingExponential(B0 = 2, tau = 3)

Of course, we would often like to fit these parameters to data, and thus, we cannot specify their exact values when instantiating a component. For these situations, we use an instance of a ‘Fittable’ class as a placeholder. Minimum and maximum values must be specified for bounded fitting algorithms, such as differential evolution. For example, we could fit both of these parameters using

BoundCollapsingExponential(B0 = Fittable(minval = 0.5, maxval = 3), tau = Fittable(minval = 0.1, maxval = 10))

Components which are not built-in to PyDDM may be specified by defining a class which gives four pieces of information: the component’s name, the parameters it depends on, the task conditions it depends on, and a function to compute its value. For example, here is ‘DriftCoherenceLeak’, the ‘Drift’ class used in our GDDM in Figure 2 (Equation 13):

class DriftCoherenceLeak (ddm.models.Drift): 
    name = "Leaky drift depends nonlinearly on coherence" 
    # Parameters we want to include in the model 
    required_parameters = ["driftcoh", "leak", "power", "maxcoh"] 
    # Task parameters/conditions. Should be the same name as in the sample. 
    required_conditions = ["coh"] 
    # The get_drift function is used to compute the instantaneous value of drift. 
    def get_drift(self, x, conditions, ∗∗kwargs): 
        return self.driftcoh ∗ (conditions["coh"]/self.maxcoh)∗∗self.power + self.leak ∗ x

Once we have determined all of the components of our model, we may tie them together using an instance of the ‘Model’ class. When creating an instance of the ‘Model’ class, we must also define our timestep ‘dt’, grid spacing ‘dx’, and simulation duration ‘T_dur’. Shown below is the GDDM in Figure 2 (Equation 13):

m=Model(name="Roitman-Shadlen GDDM", 
         drift = DriftCoherenceLeak(driftcoh = Fittable(minval = 0, maxval = 20),
                                                    leak = Fittable(minval=-10, maxval = 10)), 
         # Do not specify noise=… since we want the default 
         # Do not specify IC=… since we want the default
          bound = BoundCollapsingExponential(B = Fittable(minval = 0.5, maxval = 3),
                                                                                        tau = Fittable(minval = 0.0001, maxval = 5)), 
         # OverlayChain strings together multiple overlays. 
         Overlay = OverlayChain(overlays=[
                                 OverlayNonDecision(nondectime = Fittable(minval = 0, maxval = 0.4)),
                                  OverlayUniformMixture(umixturecoef = 0.05)]), 
          dx = 0.001, dt = 0.001, T_dur = 2)

If the model ‘m’ includes at least one component with a ‘Fittable’ instance, we must first fit the model to data by calling the ‘fit_adjust_model’ function, passing our model and dataset as arguments. By default, it fits using differential evolution (Storn and Price, 1997), however Nelder-Mead (Nelder and Mead, 1965), BFGS (Nocedal and Wright, 2006), basin hopping (Wales and Doye, 1997), and hill climbing (Luke, 2013) are also available. Fitting a model in parallel using n cores of a CPU is as simple as calling the function ‘set_N_cpus(n)’ before fitting the model. Once we load our sample ‘samp’ using one of a number of convenience functions for loading data, we may fit the model:

set_N_cpus(4) 
fit_adjust_model(m, sample = samp)

After fitting the model to data, the ‘Fittable’ instances are automatically replaced by their fitted values, so the model can be directly simulated using ‘m.solve()’. If any of the model components depend on task conditions, these must be passed as a dictionary to the solver; for example, since the ‘DriftCoherenceLeak’ component depends on ‘coh’ (the trial’s coherence), we must instead call

m.solve(conditions={‘coh’: 0.064}) to simulate with a coherence of 0.064. Alternatively, the function ‘solve_partial_conditions’ can simulate across many task conditions simultaneously. Both the functions ‘m.solve’ and ‘solve_partial_conditions’ will return an instance of the ‘Solution’ class, which has methods for obtaining the correct and error RT distributions as well as useful summary statistics.

References

  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
    Extreme Programming Explained
    1. Beck Andres
    (2004)
    Addison Wesley.
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10
  11. 11
  12. 12
  13. 13
  14. 14
  15. 15
  16. 16
  17. 17
  18. 18
    Optimal decision-making with time-varying evidence reliability
    1. J Drugowitsch
    2. R Moreno-Bote
    3. A Pouget
    (2014b)
    In: Z Ghahramani, M Welling, C Cortes, K. Q Weinberger, N. D Lawrence, editors. Advances in Neural Information Processing Systems, 27. Curran Associates, Inc. pp. 748–756.
  19. 19
  20. 20
  21. 21
  22. 22
  23. 23
  24. 24
  25. 25
  26. 26
  27. 27
    Introduction to Stochastic Differential Equations. Monographs and Text-Books in Pure and Applied Mathematics
    1. TC Gard
    (1988)
    Dekker, Inc.
  28. 28
    Fundamentals of Software Engineering
    1. C Ghezzi
    2. M Jazayeri
    3. D Mandrioli
    (2002)
    Pearson.
  29. 29
  30. 30
  31. 31
  32. 32
  33. 33
  34. 34
  35. 35
  36. 36
  37. 37
  38. 38
  39. 39
  40. 40
    Modelling reaction times in non-linear classification tasks
    1. M Lewis
    2. A Fedor
    3. M Öllinger
    4. E Szathmáry
    5. C Fernando
    (2014)
    In: A. P del Pobil, E Chinellato, E Martinez-Martin, J Hallam, E Cervera, A Morales, editors. From Animals to Animats 13. Springer International Publishing. pp. 53–64.
    https://doi.org/10.1007/978-3-319-08864-8_6
  41. 41
  42. 42
    Essentials of Metaheuristics
    1. S Luke
    (2013)
    Lulu.
  43. 43
  44. 44
    Building a framework for predictive science
    1. MM McKerns
    2. L Strand
    3. T Sullivan
    4. A Fang
    5. MAG Aivazis
    (2012)
    Proceedings of the 10th Python in Science Conference.
  45. 45
  46. 46
  47. 47
  48. 48
  49. 49
  50. 50
  51. 51
  52. 52
  53. 53
  54. 54
  55. 55
  56. 56
  57. 57
  58. 58
  59. 59
  60. 60
  61. 61
  62. 62
  63. 63
  64. 64
  65. 65
    Refinement type contracts for verification of scientific investigative software
    1. M Shinn
    (2020)
    In: S Chakraborty, J. A Navas, editors. Verified Software. Theories, Tools, and Experiments. Cham: Springer International Publishing. pp. 143–160.
    https://doi.org/10.1007/978-3-030-03592-1
  66. 66
  67. 67
  68. 68
  69. 69
  70. 70
  71. 71
  72. 72
  73. 73
  74. 74
  75. 75
  76. 76
  77. 77
  78. 78
  79. 79
  80. 80
  81. 81
  82. 82
  83. 83
  84. 84
  85. 85
  86. 86
  87. 87
  88. 88
  89. 89
  90. 90
  91. 91
  92. 92
  93. 93

Decision letter

  1. Thorsten Kahnt
    Reviewing Editor; Northwestern University, United States
  2. Joshua I Gold
    Senior Editor; University of Pennsylvania, United States
  3. Long Ding
    Reviewer; University of Pennsylvania, United States
  4. Eric-Jan Wagenmakers
    Reviewer; University of Amsterdam, Netherlands

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

Drift diffusion models are widely used in psychology and neuroscience research. However, diffusion model applications are often constraint by the available software rather than the scientific question that ought to be asked. Thus, having a model and associated software package that provides more flexible model fits will be of great value for the scientific community.

Decision letter after peer review:

Thank you for submitting your article "A flexible framework for simulating and fitting generalized drift-diffusion models" for consideration by eLife. Your article has been reviewed by Joshua Gold as the Senior Editor, a Reviewing Editor, and three reviewers. The following individuals involved in review of your submission have agreed to reveal their identity: Long Ding (Reviewer #1); Eric-Jan Wagenmakers (Reviewer #2).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

We would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). Specifically, we are asking editors to accept without delay manuscripts, like yours, that they judge can stand as eLife papers without additional data, even if they feel that they would make the manuscript stronger. Thus, the revisions requested below only address clarity and presentation.

Summary:

Reviewers were generally impressed with this work as it presents a substantial step forward, providing a much-needed, flexible, and efficient way to fit choice and reaction time data common to many decision behaviors. Reviewers mostly made a number of suggestions for how to improve the manuscript. However, they also identified two essential issues that would need to be adequately addressed in a revised version. Specifically, it would be important to apply the models to a different data set and to consider additional criteria for model evaluation.

Essential revisions:

1) As explained in reviewer #2's major comment 2, the RT distribution in the Roitman and Shadlen data set is not very representative. Reviewers agree that the authors should include an addition, more representative data set for the current paper. In particular, reviewers suggested the following data set: Wagenmakers et al., (2008). These data are available at http://www.ejwagenmakers.com/Code/2008/LexDecData.zip

2) As detailed in reviewer #2's major comment 3, and reviewer #3 major comment 3, given the flexibility of the model, it is unclear whether goodness-of-fit is the only or even the most appropriate criteria to evaluate different models.

Reviewer #1:

This work provides a much-needed, flexible, and efficient way to fit choice and reaction time data common to many decision behaviors.

The detailed methods in the section "Fokker-Plank formalism of the GDDM" are outside my expertise. But the superior performance of the GDDM compared to other packages, both in accuracy and execution time, is very convincing and rigorously demonstrated. The writing is also very clear. I expect that the software package will be widely used in the field. This is a rare instance when I support publication as is.

Reviewer #2:

In general I am truly impressed with this work. It present a substantial step forward, and I strongly believe a revision should be published. At the same time, I also believe the manuscript can be greatly strengthened along the lines provided below.

Essential revisions:

1) "Fifth, single trial parameter variability is supported within the GDDM framework for both starting point and non-decision time."

Initially, I was not sure what the authors mean with "single trial parameter variability". Do they mean "across-trial variability"? And if so, does their methodology not allow for across-trial variability in drift rate, as the full DDM does? If the GDDM indeed does not allow such variability, this warrants explicit mention and an extensive discussion.

Then when I read the Discussion section the cat comes out of the bag: "While the GDDM framework supports individual trial variability in non-decision time and starting position according to any arbitrary distribution, it does not support drift rate variability. As a result, the GDDM framework does not support the entire "full DDM"."

But this is a severe limitation, one that needs to be acknowledged and emphasized from the start. Variability in drift rate is a highly plausible mechanism that has been an integral part of the DDM since 1978. Without this mechanism to account for (fairly ubiquitous) slow errors, the model will have to turn to other mechanism to accomplish this, such as moving-in bounds. The authors mention "However, the GDDM framework is broad enough to support an infinite number of extensions which create the same RT features which originally motivated inclusion of drift rate variability." But the goal of the modeling is not to fit the data well; the actual goal is to decompose the observed data into meaningful psychological processes, and a process of moving-in bounds is fundamentally different from drift-rate variability. I assume that the drift rate variability frustrates the mathematical work? The reader needs to know the reason why the model cannot handle drift rate variability.

Staying on the same point, the authors then point out that "Indeed, in Figure 2, our 5-parameter GDDM provided a better fit to the asymmetry between correct and error RT distributions than the 11-parameter "full DDM"." True, but since these data are highly unrepresentative (see the next point below), this can just as well be used as an argument *against* the GDDM: apparently it is able to fit just about any pattern of data, even patterns that are highly unrepresentative.

2) The demonstration where the models are fit to the data is (deeply) problematic, because the monkey data are simply not representative. As can be seen from the grey histograms in Figure 2, the "regular" DDMs predict a long-tailed RT distribution, which is however not apparent from the data. This then creates ambiguity: yes, one can show that the existing models do not fit well, but they are fit to unrepresentative data. If these data patterns were regularly observed, the DDM would never have been proposed in the first place. It would be much more compelling to fit the models to any of the large benchmark data sets that are publicly available – these data sets all show a pronounced right skew.

3) As the progenitor of the much-maligned "EZ" model, I guess I am obliged to protest a little. Yes, of course EZ is not flexible, and does not have all the bells and whistles. But that is kind of the point of a simple model. There are some demonstrations that (contrary to my own expectations even) EZ actually does better than the full diffusion model when applied to sparse data. I am not talking about goodness-of-fit, but about recovering the ground truth in a simulation study (e.g., figuring out which parameters are the same and which differ). For sparse data, a simple model may work better. One reference:

van Ravenzwaaij, Donkin and Vandekerckhove, (2017).

Reviewer #3:

The manuscript introduces a generalized drift-diffusion model, together with the modular software package PyDDM to fit it. The generalized drift-diffusion model is a set of accumulation-to-bound models that feature highly flexible components, such as time- and location-varying drifts, variable starting points, time-varying decision boundaries, etc. The manuscript describes how to compute first-passage time distributions for this set of models by numerically and efficiently solving the Fokker-Plank Equation (FPE), of which PyDDM provides an implementation, together with a set of optimizers that can be used to adjust the model's parameters to behavioral data.

Having a model and associated software package that provides more flexible model fits is very welcome and timely, as diffusion model applications are frequently constraint by the available software rather than the scientific question that ought to be asked. PyDDM promises to provide such a package, and appears to be sufficiently well document to ease uptake by the community. Thus, it would be a welcome addition to the currently available diffusion model fitting packages.

Some concerns should be addressed before the manuscript can be published:

1) The manuscript frequently describes the numerical solution of the FPE a "simulation", starting with the title, the introduction, the main text, etc. Commonly, solving the FPE is not considered a simulation, such that describing it as such is potentially confusing misnomer – in particular as there exist simulation-based "likelihood-free" methods to fit models solely based on simulating them. First, I urge the authors to not call solving FPEs a "simulation", but instead describe their approach as numerically propagating the probability mass, which can intuitively be thought of as performing an infinite number of simulations. Second, the authors should acknowledge likelihood-free fitting methods, like ABC (see, for example, Holmes and Trueblood, 2018 and related) to fit diffusion models – subsection “Methods for simulating the DDM and GDDM” would be a good starting point.

2) The precision and the clarity of the manuscript can be improved in many instances:

- It should be made clear that GDDMs only support Markovian processes + some additional post-processing of the computed first-passage time distributions. This is, for example, the reason why they don't support continuously varying drift rates, but this isn't at all clear from the manuscript. An additional (current) limitation is that the implementation currently only supports boundaries that vary symmetrically around zero, even though this appears to be a limitation of the provided implementation rather than a fundamental limit to the chosen approach.

- The manuscript frequently claims that the solution is "continuous" (e.g., subsection “Methods for simulating the DDM and GDDM”, subsection “Software package for GDDM fitting”) in time, even though they only provide time-discretized numerical results.

- It isn't sufficiently clear from the manuscript that all provided FPE solutions are approximate, as they discretize time and space. Currently, the manuscript only highlights the approximate nature of the forward Euler scheme (subsection “Execution time vs error tradeoff”). Indeed, it has worse numerical properties, but that doesn't make the implicit schemes non-approximate.

3) The manuscript highlights a better fit (i.e., higher likelihood, better BIC, etc.) of a DDM with time-varying boundaries and a leak, and uses this fit (of a single dataset) to re-iterate throughout the manuscript that a better model fit with fewer parameters is always more desirable (e.g., subsection “Software package for GDDM fitting”). This mis-characterizes the modeling exercise, in which the structure and presence of different components of DDMs might be as important as the quality of the fit.

4) The authors highlight three approaches to "simulate" DDMs and GDDMs, but seem to be missing a fourth based on solving Volterra integral equation, as described by Smith, (2000) and implemented by the 'dm' package by Drugowitsch et al., – which is fairly flexible, but unfortunately only provides computing the first-passage times, but no model fitting code.

https://doi.org/10.7554/eLife.56938.sa1

Author response

Essential revisions:

1) As explained in reviewer #2's major comment 2, the RT distribution in the Roitman and Shadlen data set is not very representative. Reviewers agree that the authors should include an addition, more representative data set for the current paper. In particular, reviewers suggested the following data set: Wagenmakers et al., (2008). These data are available at http://www.ejwagenmakers.com/Code/2008/LexDecData.zip

We thank the reviewers for this helpful suggestion, which we think will better contextualize the role of GDDMs in working with empirical datasets. We had utilized the Roitman-Shadlen dataset as an example testbed to demonstrate how GDDMs could be fit to empirical dataset to test hypothesized mechanisms (there, collapsing bounds and leaky integration, as motivated by prior studies). We agree that RT distributions in this dataset are not representative of many experiments from human subjects, and therefore that also fitting models to a representative human dataset is an important addition to the paper.

We thank reviewer 2 for the suggestion of the publicly available dataset from Wagenmakers et al., (2008a). However, after surveying other publicly available datasets, we decided that it is not most appropriate for this analysis, especially to serve as a comparison to Roitman and Shadlen, (2002). The mechanisms we focused on in this manuscript are those related to integration of evidence extended over time. In the dataset of Wagenmakers et al., (2008a), the task was to determine whether a string of characters formed a word or not, and collapsing bounds and leaky integration do not have clear interpretations as cognitive strategies for this task paradigm (see our response to Essential revision major point 2).

To address the reviewer’s concern, we sought to include a more representative dataset from human subjects which would also be more comparable in task design to the monkey experiment of Roitman and Shadlen, (2002). Here we were primarily inspired by the recent publication of Hawkins et al., (2015a) (“Revisiting the evidence for collapsing boundaries and urgency signals in perceptual decision-making”), which analyzed multiple studies from monkeys and human subjects performing perceptual decision-making tasks and specifically investigated evidence for collapsing bounds. Overall, they found that most monkey datasets better fit by collapsing bounds than fixed bounds, but that most human datasets were better fit by fixed bounds. Hawkins et al., (2015a) suggested these differences may arise from task design and training differences, as well as species differences.

In our revised manuscript, we now use publicly available data from the study of Evans and Hawkins, (2019) (“When humans behave like monkeys: Feedback delays and extensive practice increase the efficiency of speeded decisions”), which tested the Roitman-Shadlen task paradigm in human subjects, and manipulated feedback timing and amount of practice. The dominant effect they found was in feedback timing: having zero delay between response and feedback yielded RTs that were best fit by no or little collapse of bounds (similar to the human studies analyzed by Hawkins et al., (2015a)); whereas a 1-s delay in feedback (as in Roitman and Shadlen, (2002)) yielded RTs that were best fit by collapsing bounds (similar to the monkey studies analyzed by Hawkins et al. (2015a)). We therefore selected for our GDDM analysis the zero-delay trials from Evans and Hawkins, (2019), which the authors found are more representative of human datasets.

Using the two publicly available datasets of (i) Roitman and Shadlen, (2002) and (ii) Evans and Hawkins, (2019) as testbeds for our GDDM framework has several advantages. They span monkey and human studies. They use very similar random dot motion perceptual decision-making task paradigms. They allow us to test the same GDDM (i.e., with collapsing bounds, leaky integration, and input nonlinearity) across datasets. Finally, they provide benchmarks for comparison to PyDDM through different model-fitting methodologies: Hawkins et al., (2015a) found that the Roitman and Shadlen, (2002) dataset was better fit with collapsing bounds, whereas Evans and Hawkins, (2019) found that their dataset was best fit with no or little collapse.

We found that in the Evans-Hawkins dataset, the pre-specified GDDM did not perform better than the other models, but was more or less comparable. This contrast between the Roitman-Shadlen and EvansHawkins datasets nicely illustrates how our GDDM framework can be applied to test hypothesized mechanisms with empirical datasets, with example results that are consistent with prior literature. We now include a new Figure 2—figure supplement 1 on the Evans-Hawkins dataset, in parallel format to Figure 2.

We note that EZ-Diffusion performed especially poorly on the Evans-Hawkins dataset; in particular, the non-decision time was fit to be less than zero. We confirmed this was the case using two separate implementations of EZ-Diffusion: the javascript implementation, and the implementation built in to the HDDM Python package. The reason for this poor performance appears to be largely due to the inclusion of long RTs. We mirrored Evans and Hawkins, (2019) in using a cutoff of 7 seconds for the RT distribution. The high cutoff leads to a more skewed distribution with larger variance and a relatively smaller mean.

We introduced Figure 2—figure supplement 3 which describes the Evans-Hawkins dataset in a similar form as the Roitman-Shadlen dataset:

We added the following to describe this figure in the Results section:

"The RT distributions produced by monkeys may not be representative of those produced by humans (e.g., Hawkins et al., 2015a), which show a characteristic skewed distribution. […] This lack of improved fit by the GDDM suggests that the human subjects employed a cognitive strategy closer to the standard DDM, and that the specific mechanisms of this GDDM (collapsing bounds, leaky integration, and input nonlinearity) are not import for explaining psychophysical behavior in this dataset."

Additionally, we updated the Materials and methods section to include information about how we fit these datasets, as well as a new subsection briefly describing the two datasets and our rationales for their selection:

"As testbeds for fitting different models, we used two empirical datasets of choices and RTs during perceptual decision-making tasks: one from Roitman and Shadlen, (2002), one from Evans and Hawkins, (2019). These references contain full details of the tasks and datasets. For the dataset of Roitman and Shadlen, (2002), we used trials from the ‘‘reaction time’’ task variant, for both monkeys. For the dataset of Evans and Hawkins, (2019), we used trials from the task variant with no feedback delay."

"In brief, these two publicly available datasets were selected for the following reasons. […] These two datasets therefore provided useful testbeds for GDDM fitting with distinct a priori predictions about the utility of collapsing bounds in a GDDM."

2) As detailed in reviewer #2's Essential revision 3, and reviewer #3 Essential revision 3, given the flexibility of the model, it is unclear whether goodness-of-fit is the only or even the most appropriate criteria to evaluate different models.

We thank the reviewers for pointing out to us that our intention of the manuscript and the role of the GDDM were not sufficiently clear. Our intention appears to be in line with that of the reviewers. Our manuscript previously gave the impression that the best feature of the GDDM was its ability to fit datasets more accurately and with fewer parameters than other models. However, we stress that our goal is not to suggest that a GDDM (let alone the specific one demonstrated here) is generically the most appropriate model. Furthermore, we agree that the goal of modeling is not simply to fit data with the least error and/or fewest parameters. Our goal with this study is to provide a flexible, powerful, and accessible framework which (i) makes it easy to test new hypothesized cognitive mechanisms by fitting them to data, and (ii) to model task paradigms with time-varying evidence, which are difficult or impossible within the classic DDM framework. We used the Roitman-Shadlen dataset as an example testbed to demonstrate how hypothesized mechanisms can be fit to empirical data and how evidence for a mechanism can be informed by improvement in model fit. This is now further complemented by inclusion of the Evans-Hawkins dataset.

We have done the following to highlight these changes and clarify how GDDMs can be used in the scientific process. First, we have slightly changed the specific GDDM which is used in the context of the empirical datasets in section “GDDM mechanisms characterize empirical data”. Since our goal is to highlight the possibilities of using GDDMs in general, rather than one specific GDDM with collapsing bounds and leaky integration, we added another feature/parameter to the model which does not substantially improve fit but highlights another attractive feature of the GDDM: a fittable coherence nonlinearity. While this adds an extra parameter (the exponent of the nonlinearity), it allows us to demonstrate another practical and interesting feature of GDDMs.

We focus on this new description of the model within the Results section. The first paragraph of the subsection “GDDM mechanisms characterize empirical data” was changed to the following:

"The cognitive mechanisms by which evidence is used to reach a decision is still an open question. […] However, consistent with previous results (Hawkins et al., 2015a), the GDDM fit no better than the DDM or ‘‘full DDM’’ in the human dataset, suggesting that different cognitive strategies may be used in each case."

As a result of these changes, we change the subsection title to the following: "GDDM mechanisms characterize empirical data"

We have added the following paragraph to the Results section:

"For the example datasets considered above, the GDDM presented here improved the fit to the monkey data and provided a good fit to the human data. We caution that this result does not necessarily mean that this particular GDDM is appropriate for any particular future study. Models should not only provide a good fit to data, but also be parsimonious in form and represent a clear mechanistic interpretation (Vandekerckhove et al., 2015). The examples here demonstrate the utility of GDDMs for quantitatively evaluating cognitive mechanisms of interest through fitting models to empirical data."

We added the following paragraph to the Discussion section:

"PyDDM enables researchers to build very complex models of decision-making processes, but complex models are not always desirable. Highly complex models may be able to provide an excellent fit to the data, but unless they represent specific, plausible cognitive mechanisms, they may not be appropriate. PyDDM’s complexity can be used to examine a range of potential cognitive mechanisms which were previously difficult to test, and to probe a range of task paradigms."

Furthermore, we have removed the passages of the Roitman-Shadlen part of the Results section which imply that goodness of fit is more critical than the interpretability of the model over the goodness of fit.

Reviewer #2:

Essential revisions:

1) "Fifth, single trial parameter variability is supported within the GDDM framework for both starting point and non-decision time."

Initially, I was not sure what the authors mean with "single trial parameter variability". Do they mean "across-trial variability"? And if so, does their methodology not allow for across-trial variability in drift rate, as the full DDM does? If the GDDM indeed does not allow such variability, this warrants explicit mention and an extensive discussion.

Then when I read the Discussion section the cat comes out of the bag: "While the GDDM framework supports individual trial variability in non-decision time and starting position according to any arbitrary distribution, it does not support drift rate variability. As a result, the GDDM framework does not support the entire "full DDM"."

But this is a severe limitation, one that needs to be acknowledged and emphasized from the start. Variability in drift rate is a highly plausible mechanism that has been an integral part of the DDM since 1978. Without this mechanism to account for (fairly ubiquitous) slow errors, the model will have to turn to other mechanism to accomplish this, such as moving-in bounds. The authors mention "However, the GDDM framework is broad enough to support an infinite number of extensions which create the same RT features which originally motivated inclusion of drift rate variability." But the goal of the modeling is not to fit the data well; the actual goal is to decompose the observed data into meaningful psychological processes, and a process of moving-in bounds is fundamentally different from drift-rate variability. I assume that the drift rate variability frustrates the mathematical work? The reader needs to know the reason why the model cannot handle drift rate variability.

Staying on the same point, the authors then point out that "Indeed, in Figure 2, our 5-parameter GDDM provided a better fit to the asymmetry between correct and error RT distributions than the 11-parameter "full DDM"." True, but since these data are highly unrepresentative (see the next point below), this can just as well be used as an argument *against* the GDDM: apparently it is able to fit just about any pattern of data, even patterns that are highly unrepresentative.

We thank the reviewer for drawing our attention to the importance of supporting the entire “full DDM”. Fokker-Planck methods, in general, do not readily support true across-trial variability in drift rate, and therefore this constitutes a limitation of the Fokker-Planck approach. Each numerical solution of the Fokker-Planck equation must be performed for a specified drift rate which can be a function of x and t but cannot account for across-trial variability of drift rate. As a result, PyDDM does not offer native support for across-trial drift rate variability.

However, it is possible to approximate across-trial drift rate variability within the Fokker-Planck approach by discretizing the drift rate’s probability distribution, running a Fokker-Planck numerical solution for each drift rate separately, and then combining the results to yield an estimation of RT distributions. As the reviewer pointed out, this feature would be of interest to users.

In response to the reviewer’s concern, and to address this use case, we have now added an example to the PyDDM online documentation demonstrating how to implement across-trial drift rate variability:

https://pyddm.readthedocs.io/en/latest/cookbook/driftnoise.html#across-trial-variability-indrift-rate

We note that this support for across-trial drift rate variability is not without cost, due to limitations of the Fokker-Plank approach. In particular, this will cause simulations to be slow depending on the discretization of the drift rate distribution (due to repeated numerical solves of Fokker-Planck), and will interfere with the ability to automatically compute model likelihood. Thus, as the reviewer points out, it must be made clear to the reader that PyDDM does not have “first class” support for this feature, although it is implementable.

In summary, the original manuscript contained the following elements to highlight the fact that PyDDM does not support drift rate variability:

Mentioned it explicitly in the subsection “GDDM as a generalization of the DDM”.

– Highlighted this as a major negative for PyDDM (a “red cell”) in the table comparing software packages.

– Showed this in the diagram in Figure 1 by not drawing a distribution around the arrow (as we did for the “full DDM”).

– Included a paragraph in the Discussion section about it.

– Discussed technical aspects of not fitting the “full DDM” in the Materials and methods section.

Following the reviewer’s suggestion, we have changed our phrasing of “single-trial variability” to “across-trial variability” throughout the manuscript.

In addition to the elements listed above which were already in the manuscript, we made the following changes to further emphasize the fact that PyDDM does not natively support across-trial drift rate variability:

- Within the figure caption of Figure 1, we explicitly pointed out the meaning of the lack of a distribution around the arrow representing drift. The relevant portion of the caption now reads:

"The GDDM is a generalization of the DDM framework which allows arbitrary distributions for starting position and non-decision time, as well as arbitrary functions (instead of distributions) for drift rate and collapsing bounds."

- We change the text within the table to read "slow discretization (via extension)" under the “Across-trial drift variability” section

- We rewrote the paragraph in the Discussion section to be the following, to better explain our GDDM framework vis-`a-vis across-trial drift rate variability:"

The GDDM is a generalization of the DDM framework which allows arbitrary distributions for starting position and non-decision time, as well as arbitrary functions (instead of distributions) for drift rate and collapsing bounds."

- We change the text within the table to read "slow discretization (via extension)" under the “Across-trial drift variability” section

- We rewrote the paragraph in the Discussion section to be the following, to better explain our GDDM framework vis-`a-vis across-trial drift rate variability:

"The GDDM framework supports across-trial variability in non-decision time and starting position according to any arbitrary distribution, but it does not support across-trial drift rate variability. […] GDDMs offer researchers the capability to explore alternative cognitive mechanisms which might also account for slow errors."

2) The demonstration where the models are fit to the data is (deeply) problematic, because the monkey data are simply not representative. As can be seen from the grey histograms in Figure 2, the "regular" DDMs predict a long-tailed RT distribution, which is however not apparent from the data. This then creates ambiguity: yes, one can show that the existing models do not fit well, but they are fit to unrepresentative data. If these data patterns were regularly observed, the DDM would never have been proposed in the first place. It would be much more compelling to fit the models to any of the large benchmark data sets that are publicly available – these data sets all show a pronounced right skew.

We thank the reviewer for this important comment. Please see our response above to Essential revision point 2, which covers this point in depth.

3) As the progenitor of the much-maligned "EZ" model, I guess I am obliged to protest a little. Yes, of course EZ is not flexible, and does not have all the bells and whistles. But that is kind of the point of a simple model. There are some demonstrations that (contrary to my own expectations even) EZ actually does better than the full diffusion model when applied to sparse data. I am not talking about goodness-of-fit, but about recovering the ground truth in a simulation study (e.g., figuring out which parameters are the same and which differ). For sparse data, a simple model may work better. One reference:

van Ravenzwaaij, Donkinand Vandekerckhove, (2017).

We thank the reviewer for pointing out the inadequacy of our discussion of the merits of EZ-Diffusion, especially its utility in the case of sparse data.

We have added the following passage to the Discussion section:

"Additionally, sometimes complex models which do represent appropriate cognitive mechanisms may exhibit inferior performance due to limitations of the available data. For instance, simulation studies suggest that when data are sparse, Bayesian fitting or EZ-Diffusion may be better able to recover parameters (Lerche et al., 2017; Wiecki et al., 2013; van Ravenzwaaij et al., 2017)."

Reviewer #3:

Some concerns should be addressed before the manuscript can be published:

1) The manuscript frequently describes the numerical solution of the FPE a "simulation", starting with the title, the introduction, the main text, etc. Commonly, solving the FPE is not considered a simulation, such that describing it as such is potentially confusing misnomer – in particular as there exist simulation-based "likelihood-free" methods to fit models solely based on simulating them. First, I urge the authors to not call solving FPEs a "simulation", but instead describe their approach as numerically propagating the probability mass, which can intuitively be thought of as performing an infinite number of simulations.

We agree with the reviewer that the term “simulation”, which usually refers to trial-wise trajectory simulations or sampling, is not the ideal terminology to use for Fokker-Planck. We agree that it is better to be technically precise with our language, to be clearer that we are not performing Monte Carlo simulations. As a result, throughout the main text we now use more precise language to describe the numerical Fokker-Planck solutions, describing propagation of the probability, rather than “simulation” which we use for Monte Carlo trajectories.

However, after careful consideration we have opted to keep the word “simulating” in the title, to accommodate the brevity and clarity of common usage required there. After considering different options, we believe using “simulating” in the title best communicates to a typical reader, in succinct and accessible language appropriate for a title, the generation of RT distributions for a specified GDDM. To the typical reader who would like to fit parameters of a GDDM to empirical data or estimate an RT distribution, it is not important whether it is through Monte Carlo trajectories or another method. We also note that usage of “simulation” of an RT distribution is arguably not technically incorrect, because the result of Fokker-Planck is the distribution of simulations as N → ∞. Likewise, by discretely sampling from the distribution produced by Fokker-Planck, it is indeed possible to use our framework for fast and efficient simulation of sets of RTs (albeit without originating trajectories of the decision variable).

Within the body text, we have replaced every instance of “simulate” with more descriptive and/or precise terminology when it refers to either Fokker-Planck, or collectively to both Fokker-Planck and trial-wise trajectory simulations. In most cases, this involved replacing “simulate” with “solve”. Overall, depending on the context, we made one of three substitutions: “solve”, when referring to Fokker-Planck or analytical solutions; “estimate the RT distribution”, when referring to the more general case which encompasses both Fokker-Planck and trial-wise trajectory simulation; and “post-factum modifications”, to replace “post-simulation modifications” (in reference to the “Overlay” object). As a result of these changes, the term “simulate” refers exclusively to trial-wise trajectory simulations in all but three places: the title, its first occurrence in the Abstract, and in one instance in the Introduction section where we specify the usage.

In order to ensure this does not cause confusion, we furthermore added the following to the Introduction section:

"Solving the Fokker-Planck equation is analogous to simulating an infinite number of individual trials."

Second, the authors should acknowledge likelihood-free fitting methods, like ABC (see, for example, Holmes and Trueblood, 2018 and related) to fit diffusion models – subsection “Methods for simulating the DDM and GDDM” would be a good starting point.

We have changed the suggested line as follows:

"Since this method does not produce a probability distribution, it is difficult to use efficient and robust methods such as full-distribution maximum likelihood for fitting data, though progress is ongoing in this area using likelihood-free methods such as through approximate Bayesian computation (Holmes and Trueblood, 2018)."

2) The precision and the clarity of the manuscript can be improved in many:

We thank the reviewer for these comments. We agree that because language in the manuscript has heavily emphasized readability for typical target users, mathematical precision may be under-emphasized in the main text at times.

- It should be made clear that GDDMs only support Markovian processes + some additional post-processing of the computed first-passage time distributions. This is, for example, the reason why they don't support continuously varying drift rates, but this isn't at all clear from the manuscript.

The reviewer is correct in that we do not support non-Markovian processes, for instance mechanisms related to momentum and inertia.

We have added the following text to clarify this point:

"The GDDM framework supports Markovian processes, in that the instantaneous drift rate and diffusion constant of a particle is determined by its position xt and the current time t, without dependence on the prior trajectory leading to that state. Drift rate is discretized, such that the timestep of the discretization is equal to the timestep of the Fokker-Planck solution. In practice, for typical task paradigms and dataset properties, the impact of this discretization is expected to be minimal for a reasonable timestep size."

An additional (current) limitation is that the implementation currently only supports boundaries that vary symmetrically around zero, even though this appears to be a limitation of the provided implementation rather than a fundamental limit to the chosen approach.

The reviewer is correct in that PyDDM currently only supports bounds which are symmetric around zero, that this is due to the implementation rather than the methodology, and that adding asymmetric bounds is a feasible future addition.

We have added the following text to clarify this point:

"Second, while PyDDM currently only implements bounds which are symmetric around zero, support for asymmetric bounds is a feasible future addition."

- The manuscript frequently claims that the solution is "continuous" (e.g., subsection “Methods for simulating the DDM and GDDM”, subsection “Software package for GDDM fitting”) in time, even though they only provide time-discretized numerical results.

Our use of the word “continuous” in the manuscript overwhelmingly refers to “continuous maximum likelihood”, a term borrowed from Heathcote, Brown and Mewhort, (2002). It is not sufficient to describe our fitting procedure as “maximum likelihood”, because this can refer to a number of fitting procedures, such as quantile maximum likelihood (an approximation to likelihood based on data quantiles; Heathcote, Brown and Mewhort, 2002), or to the maximum likelihood derivations which utilize only summary statistics of the RT distribution rather than all of the data points (e.g. Palmer et al., 2005).

However, we also see the reviewer’s point that our use of the term “continuous” could be confusing in this context.

We have replaced all references to “continuous maximum likelihood” with “full-distribution maximum likelihood”. We have also confirmed that no other uses of the word “continuous” refer to a discrete approximation.

- It isn't sufficiently clear from the manuscript that all provided FPE solutions are approximate, as they discretize time and space. Currently, the manuscript only highlights the approximate nature of the forward Euler scheme (subsection “Execution time vs error tradeoff”). Indeed, it has worse numerical properties, but that doesn't make the implicit schemes non-approximate.

We thank the reviewer for pointing out our asymmetric use of the term “approximation”.

We updated the referenced lines to say the following:

"…forward Euler iteratively approximates the probability distribution of trajectory position at each timestep using the distribution at the previous timestep, while backward Euler and Crank-Nicolson iteratively solve systems of linear equations to approximate the probability distribution (Voss and Voss, 2008)."

3) The manuscript highlights a better fit (i.e., higher likelihood, better BIC, etc.) of a DDM with time-varying boundaries and a leak, and uses this fit (of a single dataset) to re-iterate throughout the manuscript that a better model fit with fewer parameters is always more desirable (e.g., subsection “Software package for GDDM fitting”). This mis-characterizes the modeling exercise, in which the structure and presence of different components of DDMs might be as important as the quality of the fit.

We thank the reviewer for pointing out that our description of model fitting could be misleading. For a full description of how this point was addressed, see our response above to Essential revisions point 2.

4) The authors highlight three approaches to "simulate" DDMs and GDDMs, but seem to be missing a fourth based on solving Volterra integral equation, as described by Smith, (2000) and implemented by the 'dm' package by Drugowitsch et al., – which is fairly flexible, but unfortunately only provides computing the first-passage times, but no model fitting code.

We thank the reviewer for bringing to our attention that we have not properly discussed the method of Smith, (2000).

We have modified the passage describing the “three methods” for solving the GDDM such that the method of Smith, (2000) is incorporated into the third method. The text is now as follows:

"A third and better method is to iteratively propagate the distribution forward in time, which is achieved by solving the Fokker-Planck equation (Voss and Voss, 2008) or using the method of Smith, (2000). These methods allow the RT distribution of GDDMs to be estimated with better performance and lower error than trial-wise trajectory simulation. However, they are difficult to implement, which has thus far impeded their widespread use in cognitive psychology and neuroscience."

We also added the following passage to the Discussion section to bring attention to this fourth method and its relationship to PyDDM:

"PyDDM uses numerical solutions of the Fokker-Planck equation to estimate the RT distribution of the model. Other methods may be used for this estimation, such as solving the Volterra integral equation (Smith, 2000). Some prior work has used these alternative methods for solving GDDMs (Drugowitsch et al., 2012; Zhang et al., 2014; Drugowitsch et al., 2014). Such methods already have fast solvers (Drugowitsch, 2016), which could be implemented as an alternative method within PyDDM in the future."

https://doi.org/10.7554/eLife.56938.sa2

Article and author information

Author details

  1. Maxwell Shinn

    1. Department of Psychiatry, Yale University, New Haven, United States
    2. Interdepartmental Neuroscience Program, Yale University, New Haven, United States
    Contribution
    Conceptualization, Software, Formal analysis, Visualization, Methodology, Writing - original draft, Writing - review and editing
    Contributed equally with
    Norman H Lam
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-7424-4230
  2. Norman H Lam

    Department of Physics, Yale University, New Haven, United States
    Present address
    Department of Brain and Cognitive Sciences, Massaschusetts Institute of Technology, Cambridge, United States
    Contribution
    Conceptualization, Software, Formal analysis, Visualization, Methodology, Writing - original draft, Writing - review and editing
    Contributed equally with
    Maxwell Shinn
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-5817-6680
  3. John D Murray

    1. Department of Psychiatry, Yale University, New Haven, United States
    2. Interdepartmental Neuroscience Program, Yale University, New Haven, United States
    3. Department of Physics, Yale University, New Haven, United States
    Contribution
    Conceptualization, Supervision, Visualization, Methodology, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    john.murray@yale.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-4115-8181

Funding

National Institute of Mental Health (R01MH112746)

  • John D Murray

Gruber Foundation

  • Maxwell Shinn

NSERC (PGSD2-502866-2017)

  • Norman H Lam

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This research was supported by NIH grant R01MH112746 (JDM), the Gruber Foundation (MS), and NSERC (NHL). We thank Robert Yang for his contribution to the analytic solver, and Daeyeol Lee and Hyojung Seo for helpful discussions. We also acknowledge the authors of Roitman and Shadlen, 2002 and Evans and Hawkins, 2019 for making their datasets publicly available.

Senior Editor

  1. Joshua I Gold, University of Pennsylvania, United States

Reviewing Editor

  1. Thorsten Kahnt, Northwestern University, United States

Reviewers

  1. Long Ding, University of Pennsylvania, United States
  2. Eric-Jan Wagenmakers, University of Amsterdam, Netherlands

Publication history

  1. Received: March 16, 2020
  2. Accepted: August 3, 2020
  3. Accepted Manuscript published: August 4, 2020 (version 1)
  4. Version of Record published: September 1, 2020 (version 2)

Copyright

© 2020, Shinn et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,419
    Page views
  • 218
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Neuroscience
    Amanda M Pocratsky et al.
    Research Article Updated

    Within the cervical and lumbar spinal enlargements, central pattern generator (CPG) circuitry produces the rhythmic output necessary for limb coordination during locomotion. Long propriospinal neurons that inter-connect these CPGs are thought to secure hindlimb-forelimb coordination, ensuring that diagonal limb pairs move synchronously while the ipsilateral limb pairs move out-of-phase during stepping. Here, we show that silencing long ascending propriospinal neurons (LAPNs) that inter-connect the lumbar and cervical CPGs disrupts left-right limb coupling of each limb pair in the adult rat during overground locomotion on a high-friction surface. These perturbations occurred independent of the locomotor rhythm, intralimb coordination, and speed-dependent (or any other) principal features of locomotion. Strikingly, the functional consequences of silencing LAPNs are highly context-dependent; the phenotype was not expressed during swimming, treadmill stepping, exploratory locomotion, or walking on an uncoated, slick surface. These data reveal surprising flexibility and context-dependence in the control of interlimb coordination during locomotion.

    1. Neuroscience
    Sven Dannhäuser et al.
    Research Advance

    Adhesion-type GPCRs (aGPCRs) participate in a vast range of physiological processes. Their frequent association with mechanosensitive functions suggests that processing of mechanical stimuli may be a common feature of this receptor family. Previously, we reported that the Drosophila aGPCR CIRL sensitizes sensory responses to gentle touch and sound by amplifying signal transduction in low-threshold mechanoreceptors (Scholz et al., 2017). Here, we show that Cirl is also expressed in high-threshold mechanical nociceptors where it adjusts nocifensive behaviour under physiological and pathological conditions. Optogenetic in vivo experiments indicate that CIRL lowers cAMP levels in both mechanosensory submodalities. However, contrasting its role in touch-sensitive neurons, CIRL dampens the response of nociceptors to mechanical stimulation. Consistent with this finding, rat nociceptors display decreased Cirl1 expression during allodynia. Thus, cAMP-downregulation by CIRL exerts opposing effects on low-threshold mechanosensors and high-threshold nociceptors. This intriguing bipolar action facilitates the separation of mechanosensory signals carrying different physiological information.