Comparison between the exact Gillespie algorithm and the DGA for simulating chemical kinetics. (a) Example of kinetics with N = 3 reactions with rates ri(i = 1, 2, 3). (b) Illustration of the DGA’s approximations: replacing the non-differentiable Heaviside and Kronecker delta functions with smooth sigmoid and Gaussian functions, respectively. (c) Flow chart comparing exact and differentiable Gillespie simulations.

Flowchart of the parameter optimization process using the DGA. The process begins by initializing the parameters . Simulations are then run using the DGA to obtain statistics like moments. These statistics are used to compute the loss L({Si}), and the gradient of the loss ∇L is obtained. Finally, parameters are updated using the ADAM optimizer, and the process iterates to minimize the loss.

Two-state gene regulation architecture. (a) Schematic of gene regulatory circuit for transcriptional repression. RNA polymerase (RNAP) binds to the promoter region to initiate transcription at a rate r, leading to the synthesis of mRNA molecules (red curvy lines). mRNA is degraded at a rate γ. A repressor protein can bind to the operator site, with association and dissociation rates and , respectively. (b) Experimental data from Ref. [37], showing the relationship between the mean mRNA level and the Fano factor for two different promoter constructs: lacUD5 (green squares) and 5DL1 (red squares).

Accuracy of the DGA in simulating the two-state promoter architecture in Fig. 3(a). Comparison between the DGA and exact simulations for (a) steady-state mRNA distribution, (b) moments of the steady-state mRNA distribution, and (d) the probability for the promoter to be in the “ON” or “OFF” state. (c) Ratio of the Jensen-Shannon divergence JSD between the differentiable Gillespie PDF pDGA and the exact steady-state PDF , and the Shannon entropy H of the exact steady-state PDF. In all of the plots, 2000 trajectories are used. The simulation time used in panels (a),(b), and (d) is marked by blue ‘x’. Parameter values: .

Gradient-based learning via DGA is applied to the synthetic data for the gene expression model in Fig. 3(a). Parameters are fixed at 1, with 1/a = 200 and 1/b = 20 for a simulation time of 10. (a) Scatter plot of true vs. inferred parameters ( and) with γ constant. Error bars are 95% CIs. Panel (b) plots the logarithm of the loss function near a learned parameter set (shown in red circles in (a)), showing insensitivity regions. Panel (c) compares true and predicted mRNA mean and standard deviation with 95% CIs.

Gradient-based learning via DGA is applied to the synthetic data for the gene expression model in Fig. 3(a). Parameters are fixed at 1, with 1/a = 200 and 1/b = 20 for a simulation time of 10. (a) Scatter plot of true vs. inferred parameters (, and γ). Error bars are 95% CIs. Panel (b) plots the logarithm of the loss function near a learned parameter set (shown in red circles in (a)), showing insensitivity regions. Panel (c) compares true and predicted mRNA mean and standard deviation with 95% CIs.

Fitting of experimental data from Ref. [37] using the DGA. (a) Comparison between theoretical predictions from the DGA (solid curves) and experimental values of mean and the Fano factor for the steady-state mRNA levels are represented by square markers, along with the error bars, for two different promoters, lacUD5 and 5DL1. Solid curves are generated by using DGA to estimate , and and using this as input to exact analytical formulas. (b) Comparison between the inferred values of using DGA with experimentally measured values of this parameter from Ref. [37]. (c) Inferred values as a function of the mean mRNA level.

Design of the four-state promoter architecture using the DGA. (a) Schematic of four-state promoter model. (b) Target input-output relationships (solid curves) and learned input-output relationships (blue dots) between activator concentration [c] and average mRNA production rate. (c) Parameters learned by DGA for the two responses in (b). (d) The sharpness of the response [c], and the energy dissipated per unit time for two responses in (b). (e) Logarithm of the loss function for the learned parameter set for Response-2, revealing directions (or curves) of insensitivity in the model’s parameter space. The red circles are the learned parameter values.

In panels (a) and (b), we plot the ratio of the Jensen-Shannon divergence JSD between the differentiable Gillespie PDF pDGA and the exact steady-state PDF , and the Shannon entropy of the exact steady-state PDF, as a function of the two sharpness parameters 1/a and 1/b. In panel (a), 1/b = 20; in panel (b), 1/a = 200. The simulation time is set to 10. In panels (c) and (d), for these same values, we show the gradient ∇Lr of the loss function L with respect to the parameter r near the true parameter values. In all the plots, the values of the rates are: , and γ = 1. 5000 trajectories are used to obtain the PDFs, while 200 trajectories are used to obtain the gradients.

Error bars estimation for asymmetric loss function.