Abstract
Discovering biochemical circuits that exhibit a desired behavior is an outstanding problem in biological engineering. The traditional approach of enumerating every possible circuit topology becomes intractable for circuits with more than four components due to combinatorial scaling of the search space. Here, we use Monte Carlo Tree Search (MCTS), a reinforcement learning (RL) algorithm, to optimize circuit topology for a target phenotype by approaching circuit design as a sequence of assembly decisions. Our RL-based design framework, which we call CircuiTree, efficiently and comprehensively finds robust designs for three-component oscillators by prioritizing sparsity. CircuiTree can also infer candidate network motifs from its search results, producing similar results to enumeration. Using parallel MCTS, we scale this workflow up to five components and find that highly fault-tolerant designs use a novel strategy, which we call motif multiplexing. Multiplexed circuits contain many overlapping network motifs that each activate in different mutational scenarios. The evolutionary robustness of multiplexing may explain the ubiquity of multiple sub-oscillators in circadian clock circuits. Overall, CircuiTree provides the first scalable computational platform for designing biochemical circuits.
Introduction
Understanding how biochemical networks produce biological behavior is one of the central goals of systems and synthetic biology. Studying the design principles of these networks, or circuits, provides fundamental insights into their computational capabilities and enables the application of synthetic biology to problems such as cell-based therapeutics (Lim et al. (2013); Williams et al. (2020)) and engineered microbial communities (Jones et al. (2024); Purnick and Weiss (2009)). The topology, or qualitative architecture, of a synthetic gene circuit is a critical determinant of its behavior, and choosing a topology is the first step in engineering circuits with novel functions such as switching in response to an input (Gardner et al. (2000)), spontaneous oscillations (Elowitz and Leibler (2000)), or multistability (Zhu et al. (2022)). Circuit topologies with a particular desired phenotype have traditionally been discovered de novo using a method we call computational enumeration. Enumeration involves listing every topology in a predefined space of possibilities and simulating its behavior with an exhaustive set of random perturbations of the modeling parameters (Ma et al. (2009); Cotterell and Sharpe (2010); Chau et al. (2012); Schaerli et al. (2014); Gerardin et al. (2019)). This method identifies the topologies that are the most robust to parameter choice, as well as the network motifs of the phenotype, which are topological patterns overrepresented among highly robust circuits (Alon (2007, 2019)).
Currently, computational complexity has limited the role of automation in circuit design. Assuming 3 possible types of interactions (activation, inhibition, or no interaction), the number of distinct topologies that can be made with N circuit components scales as
Our inspiration for a circuit design approach with these features comes from algorithms developed in modern artificial intelligence to solve decision problems, which share core computational similarities with genetic circuit design. In a decision problem, or game, one starts from an initial position (the root state) and makes a sequence of decisions ending in success or failure. Algorithms for optimal game playing sample the branching tree of possible decisions in search of the best decision sequences, or paths. Large game trees, characteristic of long, complex games like chess (∼ 1043 states) and Go (10170 states), can overwhelm classical exhaustive search methods (e.g., flat search, A∗, and minimax) and branch-and-bound methods (e.g., α-β pruning) (Shannon (1950); Tromp and Farnebäck (2007)). In contrast, most modern probabilistic search algorithms use reinforcement learning (RL) to sample decision paths; for example, Monte Carlo tree search (MCTS) uses an RL decision rule that optimistically estimates the success of each possible decision (Kocsis and Szepesvári (2006)). MCTS has proven to be instrumental in solving large games (Silver et al. (2016, 2018)) due to its low memory footprint, ability to be parallelized (Enzenberger and Müller (2010)), balance of exploitation and exploration (Auer et al. (2002)), and myriad variations (Świechowski. (2022); Browne et al. (2012)).
In this work, we present a circuit design platform based on tree search. This platform, which we call CircuiTree, approaches circuit design as a sequence of assembly decisions and optimizes the circuit topology for a given phenotype by using MCTS to search the tree of possible circuit assemblies. We first used CircuiTree to search a trial space – all 3,325 connected circuits with three transcription factors – for topologies that exhibit spontaneous, sustained oscillations. As it dynamically explores the search space, CircuiTree discovers sparse oscillators before complex ones and reliably identifies the most robust topologies. Next, we demonstrate that CircuiTree accurately identifies assembly motifs, which are overrepresented search states analogous to network motifs found by enumeration. The best assembly motifs for 3-component oscillators also contain the activator-inhibitor (AI) and/or repressor (Rep) motifs, a result consistent with previous studies of 2- and 3-component oscillators. Finally, we use a parallelized implementation of CircuiTree to study mutational tolerance in large circuits. We discover that highly fault-tolerant five-component oscillator circuits are “multiplexed” in that they contain multiple AI and Rep motifs that are syner-gistically interleaved such that the oscillations are robust to complete and partial knockdowns of single genes. Our results demonstrate that CircuiTree’s RL-based framework enables the efficient, generalizable, and scalable design of biochemical circuits, which could be used to engineer and understand complex intracellular and multicellular biological systems.
Results
Circuit design as an assembly game
We define a circuit as a system of k dynamically interacting biochemical species x = [x0, …, xk] characterized by its qualitative topology s, a member of the set of possible topologies 𝒮, and its parameters θ ∈ ℝn(s), which describe the rates of biochemical reactions. We also assume the existence of a phenotype function f (x(t)) = q ∈ {0, 1} that returns a yes-or-no verdict as to whether a given simulation x(t) exhibits the desired phenotype. This function, which we conceptualize quite generally, may represent the presence of a dynamical phenotype such as adaptation or oscillation or a more general property such as multistability. Like many prior computational studies of design principles (Ma et al. (2009); Chau et al. (2012); Cotterell and Sharpe (2010); Schaerli et al. (2014)), each simulation of the system is performed with a set of randomly chosen parameters, and we define robustness Q(s) as the mean phenotype score from these samples Q(s) = 𝔼θ[q(s)] = 𝔼θ [f (x(t; s, θ))]. The most robust topology s∗ is therefore the solution to a combinatorial optimization problem. (Note that Q in this context should not be confused with the value function used in RL.)
In this work, for convenience, we consider models where all topologies are governed by the same reaction parameters (i.e. n(si) = n(sj)∀si, sj ∈ 𝒮), but CircuiTree does not require this assumption.
To solve Equation 1, we approach circuit topology design as a multi-step decision problem, or game (Figure 1A). Beginning with an “empty” circuit s0 with a given set of components and without interactions, each step of the game can add an activating or inhibitory (auto-)regulatory interaction. The game ends when the builder chooses to “terminate” the assembly at some topology si, at which point the outcome is decided by a simulation with a win probability Q(si). In the language of Markov decision processes (MDPs), each topology s represents a state of the game, and the addition of an interaction or the termination of assembly represents an action a that converts a state si to a different state sj. Each playout of the game traces a path on the tree of possible decisions T rooted at s0. The nodes in the tree are circuit topologies 𝒮, and each directed edge (si, sj) ∈ ε represents the assembly of sj from si by an action. Note that because multiple decision paths can converge on the same sj, T is technically a directed acyclic graph (DAG), but we will call it a tree for simplicity.

A stepwise assembly framework enables circuit topology optimization with tree search.
(A) Circuit topologies are built step-by-step by adding interactions until the game is ended by taking the “terminate” action (the STOP sign). (B) Each MCTS iteration undergoes four phases: (1) Selection: The UCT criterion is used to recursively select the most promising action
For large design spaces, it is computationally infeasible to estimate Q(s) for every s ∈ T. Instead, we iteratively sample from T using Monte Carlo tree search (MCTS), an RL algorithm commonly used in MDPs and game-playing artificial intelligences (Silver et al. (2016, 2017, 2018)) to bound the complexity of searching large decision trees. In these applications, tree search is used during game play (and sometimes also during training) to find the best move by simulating many hypothetical scenarios forward in time.
Each MCTS iteration starts at the root state s0 and undergoes four phases, shown in Figure 1B and described in Algorithm 1. The key innovation of MCTS occurs during the first phase (selection), when it uses the UCT criterion from RL to choose moves while balancing (i) exploitation of moves that have yielded high rewards in the past and (ii) exploration of new moves that may yield higher rewards. At each branch (Figure 1C, left), from a state si, multiple actions aj are available that have yielded cumulative rewards rij over υij previous visits. MCTS decides the optimal action using the decision policy
The first term of the UCT criterion estimates the mean reward of the move si → sj, while the second term, derived from Hoeffding’s inequality (Kocsis and Szepesvári (2006)), is an estimated upper bound on the sampling error. The second term ensures that a move with a historically low reward but relatively few samples (υj << ∑j υij) will still be tried occasionally to account for insufficient sample size. Thus, UCT is an optimistic predictor of a move’s true mean reward. Note that the two terms encourage exploitation and exploration, respectively. (By definition, UCT(i, j) = +∞ if si → sj has not been visited before).
Using the UCT criterion, actions are selected recursively, breaking any ties randomly, until the new state sj is terminal (i.e. the assembly is finished) or has not been visited before during sampling. An unvisited sj is added to T. Then, during the simulation phase, the outcome q of the trial is determined by taking random actions until the “terminate” action is chosen and simulating the resulting topology with randomly chosen reaction parameters and initial conditions. Finally, the history of rewards and visits is updated for each selected edge in T (backpropagation).
In addition to finding individual circuit topologies, an important goal of circuit design is to identify specific structural features, or network motifs, that lead to successful designs (Milo et al. (2002); Alon (2007); Ma et al. (2009); Cotterell and Sharpe (2010); Shah and Sarkar (2011); Chau et al. (2012); Lim et al. (2013); Schaerli et al. (2014)). Similarly, in our game-playing paradigm, a motif of the assembly game is a series of moves (an assembly strategy) that leads to topologies with a high rate of phenotypic success, on average. Thus, when a motif is present in the space of possible designs, it creates an enriched region in T, which can be exploited during the tree search to rapidly discover successful designs (Figure 1D).
Establishing a ground-truth for three-node stochastic oscillators with enumeration
Oscillations appear in diverse cellular contexts such as metabolism (Chandra et al. (2011)), DNA damage (Geva-Zatorsky et al. (2006, 2010)), the cell cycle (Ferrell et al. (2011)), and circadian rhythms (Tyson et al. (1999)), and consequently, many of their design principles have been elucidated (Novák and Tyson (2008)). We first benchmark CircuiTree on the well-studied problem of designing an oscillator circuit with three nodes (Figure 2A, part i). Specifically, we consider a system of symmetric transcription factors (TFs) modeled as a stochastic birth-and-death process of individual mRNAs, TFs, and TF-response element (TF-RE) complexes, with elementary reactions of transcription, translation, degradation, binding, and unbinding (Figure S1A; reactions and their rate parameters are summarized in Table 1). TFs regulate transcription by binding to one of two REs in the regulatory region of each promoter, and cooperativity is introduced by assuming that TF-RE binding is stronger when both sites are occupied by the same TF. This assumption reduces the computational cost associated with modeling every homo- and heterodimer and their dimer-RE complexes. For each stochastic simulation, the system was randomly initialized and stochastically simulated using the Gillespie method for tmax = 4 × 104 sec = 11.1 hrs (see Methods for details of initialization). Oscillations were quantified by computing the normalized autocorrelation function (ACF) and identifying its lowest minimum value across all TFs ACFmin, excluding the bounds, and a Boolean reward value was assigned based on a cutoff value (Figure 2A, part ii).

Model rate parameters

CircuiTree efficiently identifies simple and robust 3-component oscillators.
(A) All three-component transcription factor (TF) circuits (3,325 up to symmetry) were enumerated with 104 random parameter sets (i) and evaluated for oscillations (ii) using an autocorrelation-based reward function. (B) A representative MCTS run. With more iterations (N), the search graph T (represented by a spanning tree for simplicity) expands to encounter more oscillators (orange circles) and improve its best predicted oscillator topology (shown in black). (C) A heatmap showing the average rate of discovery, or recall, for each oscillator (proportion of n = 50 replicates. Rows (oscillators) are sorted in order of complexity, or the number of interactions, and oscillators with the same complexity are sorted by descending robustness Q. Sparse oscillators are found before more complex ones, with a preference for the most robust candidates. (D) Precision (blue) and recall (orange) of oscillator classification (mean ± 95% CI, n = 50). CircuiTree’s recall is particularly high for the 10% most robust oscillators (red), reaching 94.7% after 105 iterations. See also Figures S1, S2, S3, and S4.
To compare with enumeration, all 3,325 unique, fully connected topologies were simulated with 104 randomly sampled parameter sets and initial conditions. The 10 rate parameters of the model were reduced to 8 dimensionless variables, which were sampled uniformly from a range of values determined based on experimentally measured rates (Milo et al. (2010)) and known requirements for oscillations (Elowitz and Leibler (2000); Novák and Tyson (2008)). (See Table 2 for variable definitions and sampling ranges and Methods for details of parameter sampling). The robustness to parameter perturbation Q was calculated as the fraction of parameter sets that produced oscillation, and topologies with Q > 0.01 were considered oscillators. This procedure uncovered 221 oscillator topologies (6.65% of the search space) (see Figures S1B and S1C for a summary of the best topologies). These results generally agree with prior studies of two- and three-node oscillators (Qiao et al. (2022); Elowitz and Leibler (2000); Novák and Tyson (2008); Woods et al. (2016); Stricker et al. (2008)). For instance, almost all oscillators contain a repressilator (Rep) loop, activator-inhibitor (A-I) loop, or a combination thereof. Positive autoregulation (PAR), which has been shown to buffer extrinsic and intrinsic noise (Qiao et al. (2022)), is also ubiquitous among these oscillators and is required for robust A-I loop oscillations (Novák and Tyson (2008); Stricker et al. (2008)). Further discussion of these motifs can be found in the section below. Interestingly, stochastic switching between stable states was occasionally mistaken for low-frequency oscillations (Figure S2), causing some toggle switch circuits to be classified as oscillators (highest Q = 0.024, ranked #139).

Dimensionless variables and limits imposed on random parameter sampling
Algorithm 1 Circuit topology search using MCTS Algorithm 1 Circuit topology search using MCTS

Across all oscillators, oscillation was favored by tight TF-RE binding (KD,1 < 103), low basal transcription (km,unbound < 0.3 sec−1), strong repression (km,rep/km,unbound < 10−1), and a high activated transcription rate (km,act, monotonic effect) (Figure S3A). Protein and mRNA degradation rates had a non-monotonic effect with a peak at γm ≈ γp ≈ 0.2 molec−1 sec−1 (Figure S3A), and oscillation period depended strongly on these rates and their ratio (Figures S3B and S3C). However, the best oscillators were exceptionally robust to the choice of parameters. For example, the most robust oscillator (the repressilator with PAR) has a Q of 0.767, meaning that for each of the 8 sampled variables, on average, 96.7% of the possible values would permit oscillations (0.9678 ≈ 0.767).
CircuiTree efficiently and systematically discovers three-node oscillators
MCTS masters a game with a limited number of trials by balancing deep sampling of promising regions with broad sampling of under-explored regions. To understand how this strategy performs for circuit topology design, we use CircuiTree to search for 3-node oscillators given N = 105 iterations of MCTS (n = 50 replicates; see Methods for MCTS implementation specifics). In the average run, the first putative result is discovered after just 0.69 samples/topology (2,280 iterations), and by 19.6 samples/topology (65,360 iterations), a top-5 oscillator has been sampled >100 times. A topology is discovered as a “successful” oscillator if its estimated robustness
CircuiTree balances efficiency and comprehensiveness by first finding sparse solutions before exploring deeper areas of the tree. In Figure 2C, each row of the heatmap is one of the 221 oscillators, and the rows are sorted first by the circuit’s complexity, or the number of interactions, then by decreasing robustness (Q). The color scale indicates the likelihood of discovery, or recall, of each oscillator measured as the proportion of replicates that discovered it. Because sparse topologies require fewer assembly steps, MCTS encounters the sparsest oscillators (such as the repressilator) first, and it discovers increasingly complex solutions over time until oscillators with 9 interactions (the maximum) are found at N ≈ 5 × 104. As shown by the line plot in Figure 2D, CircuiTree has a very high recall of 94.7% for the top 10% of oscillators (95% CI: 86.4% − 100.0%) and has a recall of 35.9% (95% CI: 30.8% − 41.5%) for oscillators in general. This is fairly high, considering that it gets an average of
The competing goals of breadth and depth manifest during tree search as distinct temporal phases in which MCTS first explores a broad set of topologies before focusing on a narrow, enriched subset. Among topologies containing a combination of AI and Rep feedback loops, 27.2% (47/173) are oscillators, a very high proportion compared to 6.7% (221/3,325) for the entire design space and 9.3% (146/1,575) and 8.0% (6/75) for the AI or Rep loops alone (Figure S4A). Figure S4B shows how MCTS allocates samples to each of these categories over a total sampling time of 105 and 106 iterations (mean of n = 50 and n = 12 replicates, respectively). During an initial exploratory phase, samples are taken broadly across categories; however, at N ≈ 6 × 104, the AI-Rep combination becomes heavily favored, followed by Rep alone, for the rest of sampling time. This transition can be observed directly from the search history by measuring regret. Defined as
regret is the difference between the expected reward from sampling the best topology (with robustness Q∗) and the actual reward over N iterations. As MCTS moves from exploration into exploitation, the reward rate becomes higher and regret accumulates more slowly (Figures S4C and S4D). Thus, the discovery of an enriched region during search triggers a shift to more focused sampling that can be identified in real-time without a priori knowledge about design principles or motifs.
CircuiTree infers oscillator motifs from search results
In addition to finding individual circuit topologies, an important goal of circuit design is to identify specific structural features, or network motifs, that lead to successful designs (Milo et al. (2002); Alon (2007); Ma et al. (2009); Cotterell and Sharpe (2010); Shah and Sarkar (2011); Chau et al. (2012); Lim et al. (2013); Schaerli et al. (2014)). Similarly, in our game-playing paradigm, a motif of the assembly game is a series of moves (an assembly strategy) that leads to a subset of T highly enriched with successful design results. Specifically, we define an assembly motif as any successful topology (Q > Qthresh) that is also overrepresented as a sub-pattern within other successful topologies, compared to a set of randomly assembled designs. Using a statistical test for overrepresentation (described in detail in the Methods section and shown schematically in Figure S5A), CircuiTree can mine the results of a tree search for putative assembly motifs.
Looking for 3-node oscillator motifs with ≤ 3 interactions, CircuiTree identifies the same four motifs as enumeration, shown at the top of Figure 3: the repressilator motif (Rep) and the activator-inhibitor (AI) loop with either PAR of the activator (AIPAR), constitutive inhibition of the inhibitor (AICI), or constitutive activation of the activator (AICA). CircuiTree finds these minimal motifs in 100% (Rep), 86% (AIPAR), 94% (AICI), and 62% (AICA) of replicates. To see how these motifs are situated in the overall design space, we plot all 221 oscillators in Figure 3 as a “complexity atlas” (Cotterell and Sharpe (2010); Schaerli et al. (2014)), a graph-of-circuits where every oscillator is a node and nodes are organized in layers according to their complexity. Edges connect nodes in adjacent layers if they differ by the addition of a single circuit interaction, analogous to a move in the assembly game. Notably, 216/221 oscillators constitute a large connected component of the graph originating from the four minimal motifs, indicating that almost all 3-node oscillators live in a subset of design space defined by Rep and/or AI motifs. The size of each node in Figure 3 denotes its motif robustness Qmotif, the overall oscillation probability for all circuits containing this motif (in other words, the mean reward once reaching this state in the game). The color of each node reflects its motif discovery rate, measured as the percentage of n = 50 replicates that labeled it as a motif.

Motifs identified from search results form a cluster of optimal 3-node oscillators.
A complexity atlas of oscillators with ≤3 components. Circles are oscillator topologies identified by enumeration, and edges link oscillators that differ by the addition/removal of one interaction. 97.7% of oscillators (216/221) are topologically related to one of the four motifs for 3-node oscillation, shown above the atlas in red boxes. Bold circle borders indicate oscillators found to be motifs based on enumeration. Circle color indicates the rate with which CircuiTree labels each oscillator as an assembly motif. Circle size indicates Qmotif, the average robustness for a circuit completed randomly starting from this state of the assembly game. The correlation between discovery rate and Qmotif (plotted in Figure S5B) suggests that motifs found by CircuiTree correspond to beneficial game states. The bolded edges, which connect oscillators with a discovery rate > 80%, form a contiguous cluster representing optimal assembly strategies. The most robust oscillator, the repressilator with PAR of all components, is shown on the bottom-left and indicated on the atlas by a green arrow. See also Figures S4 and S5.
Note that these features correlate visually (large circles tend to be red and vice versa) and quan-titatively, as shown by the scatterplot in Figure S5B, indicating that higher quality motifs are more likely to be found. Additionally, the motifs that are discovered most reliably (discovery rate <80%, indicated by bold edges) form an optimal subset corresponding to the best motifs (the largest nodes). Surprisingly, the most robust oscillator, the repressilator with PAR on all three TFs (Rep+3xPAR, Q = 0.767), is discovered at a rate of 98%. Because Rep+2xPAR is a poor oscillator (ranked #187) and Rep+1xPAR does not oscillate at all, Rep+3xPAR cannot be assembled from Rep (or any other intermediate) without breaking oscillations. Nonetheless, its high motif discovery rate suggests that CircuiTree is capable of identifying special design strategies that require a specific combination of moves. Overall, we find that CircuiTree reliably infers the most robust motifs for a given phenotype, even finding obscure but optimal solutions, and the resulting motifs for 3-node oscillation form a single cluster of optimal designs. Note that, in contrast to classic network motifs which compare the frequency of a pattern in successful topologies against the entire set of topologies (or a comparable null model), we compare frequencies between circuit assemblies that were successful during search and random circuit assemblies. Therefore our method returns motifs that are overrepresented among search results and thereby accounts for the bias inherent to sampling the space of topologies by assembly rather than by flat enumeration.
Motif multiplexing allows five-node oscillators to compensate for deletions and single-gene knockdowns
While synthetic circuits generally rely on a single, minimal circuit module, naturally occurring circuits often use many functional modules. For instance, circadian oscillators across divergent taxa contain two or more oscillatory feedback loops (Lee et al. (2000); Cheng et al. (2001); Bell-Pedersen et al. (2005); Pokhilko et al. (2012)). Could these additional modules make circadian oscillators more resistant to deletions during evolution (Wagner (2005))? To explore this possibility, we implemented a parallelized version of CircuiTree with pruning (described in details in the Methods and illustrated in Figure S6) and used it to design 5-node circuit topologies that oscillate despite a 50% chance of deletion of a random TF (Figure 4A). Please see Methods for a description of the parallel implementation and Figure S6 for a schematic of the pruning.

Parallelized CircuiTree scales to large design spaces.
(A) A parallelized version of CircuiTree was used to search for five-node oscillators with ≤ 15 interactions (left) that oscillate despite a 50% chance of a single random deletion (right). (B) Search results after 5 · 106 iterations. Circles are putative oscillators
After 5 million search iterations (less than four days of real time using 1,000 parallel search threads and 300 CPUs), CircuiTree finds 1,386 putative oscillators with
In contrast, fault-tolerant oscillators
During evolution, genomic mutation may lead to a partial reduction in transcription rate rather than a complete knockout. Do multiplexed oscillators maintain their mutational robustness in these conditions? To explore this question, we simulated the 3AI+3Rep circuit under conditions where a single gene is partially knocked down by multiplying its transcription rate by a factor (100 − KD)/100, KD being the percent of knockdown. In Figure 5A, we visualize a representative stochastic simulation of the wild-type (WT) 3AI+3Rep circuit by reducing the system of five TFs (a five-dimensional phase space) to two composite dimensions using principal components analysis and plotting a simulated trajectory on the first and second principal component axes. At each time point, the dominantly expressed TF is indicated by a transparent marker of the same color, and the inset diagram shows the order of TFs activated in the limit cycle (A-B-D-C). As gene A is knocked down from KD=0% to KD=100% (Figure 5B, upper middle panel), the limit cycle gradual drifts in phase space until the trajectory eventually flattens, consistent with the lack of oscillations after deletion of gene A (Figure 4E). During knockdown of gene B (Figure 5B, top right panel), in contrast, the limit cycle drifts before discontinuously jumping to a new limit cycle (the A-E-C repressilator motif) between 75% and 100% KD, as shown in the inset diagram. Similarly, knockdown of gene C induces a gradual drift followed by a jump between 75% and 100% KD to a limit cycle driven by the B-D activator-inhibitor motif. Unlike genes B and C, knockdown of gene D produces no obvious discontinuities as the limit cycle gradually adapts from A-B-D-C to the A-B-C repressilator. Knockdown of gene E produces no appreciable changes in the limit cycle. Overall, the multiplexed oscillator 3AI+3Rep maintains oscillatory behavior during partial knockdown of genes B, C, D, or E and transitions between disparate limit cycles in qualitatively unique ways.

Motif multiplexing makes oscillators resistant to the failure of components.
The 3AI+3Rep circuit (A, top) oscillates with different limit cycles after partial knockdowns of different genes. (A, bottom) An exemplary trajectory of 3AI+3Rep is shown on axes of the first two principal components (PCs) of phase space. Transparent circles indicate the dominant species at each time-point. (B) Trajectories under scenarios where transcription rate is reduced by a factor KD. The ordering of species in the limit cycle at KD = 100% is shown by the inset diagram. (C) Oscillation quality and frequency in single-gene knockdowns. Oscillations persist (ACFmin < ACFthresh) for most knockdowns of genes B, C, D, and E (middle). Oscillation frequency (bottom) is pulled from its WT value in knockdowns of A, B, C, and D. (E) Robustness to parameter variation between TFs. The power spectral density of the trajectory of TF A (bottom, mean) and the overall oscillation rate (top, mean ± SEM) are shown for simulations in which parameters were perturbed by a Gaussian kernel of width σparam (n = 50 replicates). A dissipation of fundamental and harmonic frequencies and corresponding loss of oscillations occurs for σparam > 5 · 10−2.
To quantitatively understand how the system responds to knockdowns, single-gene knock-downs were simulated with a range of values of KD with n = 50 replicates with different random seeds. The quality and frequency of oscillations were then assessed by computing the ACFmin and, if oscillations were detected, the frequency of oscillation. In Figure 5C (upper panel), the ACFmin (mean ± 95% confidence interval) is plotted as a function of KD, and the dashed line indicates the value ACFmin = −0.4 used as a threshold between oscillatory and non-oscillatory dynamics (below and above the line, respectively. Knockdown of gene A leads to a rise in ACFmin and bomes lethal for oscillations above KD = 60%. For gene B, ACFmin increases before suddenly crossing the threshold to indicate loss of oscillations between KD = 85% and 95%, and for gene C, oscillations disappear in the range KD = 65%to90%. For genes D and E, no detrimental effects are observed during knockdown. Thus, oscillations persist in almost all cases (91.9% of samples) when partially knocking down genes B, C, D, or E (Figure 5C, middle panel). Partial knockdowns of A, B, C, and D each have distinct effects on oscillation frequency, a phenomenon called frequency pulling (Heltberg et al. (2021)). The bottom panel of Figure 5C shows how, for KD < 60%, the WT resonant frequency of 6.0hr−1 is pulled higher when knocking down A or B and lower when knocking down C or D (no change for E). At higher values of KD, knockdowns of B and C discontinuously jump from the drifted frequency to a new resonant frequency (2.6hr−1 and 6.6hr−1, respectively), while the D knockdown transitions smoothly to its new frequency of 3.8hr−1. Thus, the 3AI+3Rep oscillator accommodates most partial knockdowns of genes B, C, D, or E by modulating, yet retaining, oscillatory dynamics.
Up to this point, our modeling has assumed that the rates of reactions such as binding, transcription, translation, and degradation are identical between circuit components. To investigate whether the oscillators we discover are fine-tuned for this symmetry, we perturbed the rate parameters for each TF individually. A Gaussian perturbation kernel with standard deviation σparam was applied to each parameter, scaled to the pre-defined ranges of those parameters (see Table 2 for parameter ranges and Methods for details of the perturbation kernel). Stochastic simulations were performed for a range of values of σparam between 1 · 10−3 and 3 · 10−2. To assess the effects of parameter perturbation on oscillation, the power spectral density (PSD) of the dynamics of TF A (mean of n = 50 replicates with different random seeds) was calculated for each value of σparam and plotted as a heatmap (Figure 5D). For σparam < 10−2, the PSD shows peaks at the oscillator’s fundamental frequency of 6hr−1 and first and second harmonics (12hr−1 and 18hr−1, respectively). As σparam increases further to 5 · 10−2, the fundamental frequency peak gradually diffuses, and the oscillation rate drops from 1.0 to ∼ 0.5, as shown in the line plot in Figure 5D (envelope indicates 95% confidence interval). above σparam = 5 · 10−2, the oscillation rate drops to near-zero, and the mean PSD shows no visible peaks. Therefore, this oscillator can tolerate a minimal amount of heterogeneity between TFs, above which it appears somewhat sensitive to asymmetry in at least one parameter of the model.
Discussion
Natural biological networks contain a number of biochemical components that exceeds our current capabilities of engineering by orders of magnitude, underscoring the importance of scalable computational methods for synthetic circuit design and analysis of biological design principles. Inspired by state-of-the-art artificial intelligence, we approach circuit design as a game of step-by-step topology assembly where success is determined by the achievement of a target phenotype in simulation. Similar to game-playing platforms that have achieved superhuman mastery of complex games (Silver et al. (2017)), CircuiTree searches the space of possible circuit topologies using Monte Carlo tree search (MCTS), which balances exploitation of promising circuit assembly moves with exploration of other possibilities (Figure 1). This search strategy is comprehensive enough to infer motifs for a given phenotype (Figure 3) and efficient enough to search large spaces of designs fruitfully with limited samples (Figures 4A and 4B). Finally, we demonstrate CircuiTree’s scalability by characterizing a novel class of five-gene oscillators that use a strategy we call motif multiplexing to resist deletion and single-gene knockdown mutations by densely interleaving multiple sub-oscillators (Figures 4C-4E and 5A-5C). This design principle may grant evolutionary robustness to eukaryotic circadian clocks, which have been observed across many phyla to contain multiple oscillatory loops (Cheng et al. (2001); Bell-Pedersen et al. (2005); Pokhilko et al. (2012)), and it leaves open the question of how multiple sub-oscillators can be coupled without triggering chaos (Heltberg et al. (2021)).
CircuiTree is distinguished by its general framework. Developed for general game-playing and planning problems, MCTS can query very large spaces in search of robust topologies and assembly motifs for any measurable phenotype, without restrictions on the modeling or simulation frame-work, and without needing to enumerate all possible topologies. This property bridges a gap in computational circuit design where generally, methods for finding single topologies (such as evolution (François and Hakim (2004); François and Siggia (2008)), mixed-integer optimization (Otero-Muras and Banga (2016)), and recurrent neural networks (Shen et al. (2021))) do not generalize easily to design principles and/or require a specific mathematical form, while more comprehensive methods (such as enumeration (Chau et al. (2012); Schaerli et al. (2014)) and Bayesian sampling (Woods et al. (2016))) require an explicit list of topologies. Notably, while CircuiTree currently implements a “vanilla” version of the algorithm, MCTS can be modified in many ways to suit different problems (Browne et al. (2012); (Świechowski. (2022).
This work also highlights the possibilities of using reinforcement learning to address difficult combinatorial design and inference problems in biology. In future work, CircuiTree could be applied to combinatorial design of therapeutics, both in the sense of optimizing chimeric receptor design (Daniels et al. (2022)) or combinatorial antigen recognition (Dannenfelser et al. (2020); Williams et al. (2020)), and in the sense of generating novel multicellular therapies consisting of multiple engineered cell types that collaborate in situ, analogous to a natural immune system. Our work could also be used to generate synthetic morphogenesis circuits, which are difficult to design due to the many possible combinations of chemical and physical components (Davies (2017); Toda et al. (2018); Fleischer and Barr (1997)) and a high computational cost per simulation. Given a reward function that measures goodness-of-fit, CircuiTree could also be extended as an inference tool — for instance, for inferring transcriptional regulatory networks from data. More generally, this computational platform extends the study of design principles (Lim et al. (2013)), which has been limited to small motifs of 2 or 3 components, to a larger space of biological networks, presenting an opportunity to dissect the algorithms and strategies used by biology to assemble complex networks at scale.
Methods
Modeling and simulation
Transcription factor (TF) cooperativity was modeled as a slower unbinding rate when both response elements (REs) for the a promoter are bound by the same TF. For K genes (each with mRNA and protein species) with cumulative A activation reactions and R repression reactions, the elimination of explicit dimerization reduces the number of species from 2K + K2 + K3 (mRNAs, TFs, TF-TF complexes, and TF-TF-RE complexes) to 2K +A+R (mRNAs, TFs, and TF-RE complexes) and the number of reactions accordingly from 4K + 2K2 + 3K3 to 4K + 3A + 3R.
Stochastic trajectories were simulated using Gillespie’s exact method (Gillespie (1977)) and saved at nt = 2000 time points at intervals of dt = 20 sec. The separation of time-scales between fast binding-unbinding kinetics and the other reactions creates a long simulation time (>2 mins) that cannot be alleviated with, for example, traditional τ-leaping (Cao et al. (2004)). Thus, instead of using realistic but impractical binding and unbinding rates, these rates were set to virtual values determined from the equilibrium constant for first-binding KD,1 by the solving the equations
where the value Z = 100 was chosen heuristically to be large enough to maintain a separation of time-scales at low quantities. For each protein species, high-frequency noise was filtered from the stochastic signal using a 9-point binomial filter (Aubury and Luk (1996)), and the quality of oscillations overall was determined by computing the normalized autocorrelation function for each TF and finding the lowest minimum among TFs (ACFmin), excluding the bounds. Oscillations are considered present if this quantity, related to the dissipation constant for oscillations (Otero-Muras and Banga (2016)), is below a cutoff ACFthresh = −0.4.
For simulations with partial knockdown of a gene, all transcriptional rate parameters for that gene were multiplied by a coefficient on [0, 1] (for example, KD=80% was achieved using a coefficient of 0.2.), and all species were initialized with zero quantity. For all other simulations, the initial quantity of each TF was selected from a Poisson distribution with a mean of 10 proteins, and all other species were initialized with zero quantity.
For perturbation studies, the eight sampled dimensionless variables were converted to values on [0, 1] by normalization to the upper and lower values in Table 2. These values were stored in a K × 8 matrix, where each row represented one of the K = 5 TFs, and each value was perturbed independently with a Gaussian kernel of standard deviation σparam, truncated to the range [0, 1] to prevent values outside the reasonable range. The perturbed values were then converted to rate parameters for each TF as outlined above.
Random sampling
Random sets of the 10 rate parameters listed in Table 1 were generated by drawing samples of the 8 dimensionless variables described in Table 2. Latin Hypercube sampling was used to draw 104 samples from a multivariate uniform distribution, with bounds shown in Table 2. For the 3-node and 5-node cases, every possible initialization (with a unique set of random generator seed, initial protein quantity, and parameters) was stored in a table of 104 rows, and each simulation was initialized with a random row of this table.
Monte Carlo tree search
For all MCTS runs, the hyperparameter in Equation 2 was set to c = 2.00 to encourage exploration (the default value is
MCTS was parallelized using the lock-free method Enzenberger and Müller (2010) and implemented with the Python utility ‘celery’ (https://docs.celeryq.dev/en/stable/index.html). Tree search was parallelized over 1,000 green threads which dispatched simulation jobs to run in parallel on multiple cloud computing instances totaling 300 CPUs. Reward values for each topology-parameter set pair were stored in an in-memory cache to speed up subsequent training runs (epochs). Back-propagation was executed asynchronously with virtual loss, assuming a reward of 0 until the actual reward is returned. To prevent excessive sampling of local optima during the 5-node search, we introduced a form of decision tree pruning we term “node exhaustion.” Once a terminal node (a completed topology) is visited > 104 times, it is considered exhaustively sampled and pruned from the search graph, and a non-terminal node is pruned once all its successors have been pruned. As illustrated in Figure S5, the visits and reward for each in-edge to an exhausted node are subtracted from the parent node. Thus, once a node is marked exhausted, its history is “forgotten” by its predecessors.
Please see the documentation at https://pranav-bhamidipati.github.io/circuitree/index.html for all additional details of implementation, as well as code tutorials and descriptions of the API.
Motif identification
Before testing for motifs based on a tree search, we first determine whether each terminated topology sj discovered during the search is a successful oscillator by comparing its empirical robustness
Code availability
CircuiTree, written in Python 3.10, is available on GitHub (https://doi.org/10.5281/zenodo.11285522). The code used to run computational experiments, perform analyses, and plot results is available separately at https://doi.org/10.5281/zenodo.11285550.
Acknowledgements
We thank M. Elowitz, J. Bois, A. Barr, L. Morsut, J. Courte, and J. Paulsson for scientific discussions. We thank J. Bois and G. Manela for advice and assistance on scientific software development. We thank J. Gornet for assistance with distributed computing. We are grateful to I.-M. Strazhnik for assistance with figures and illustrations.
Additional files
References
- Network motifs: theory and experimental approachesNature Reviews Genetics Nature Publishing Group 8:450–461https://doi.org/10.1038/nrg2102Google Scholar
- An Introduction to Systems Biology: Design Principles of Biological CircuitsBoca Raton: Chapman and Hall/CRC https://doi.org/10.1201/9780429283321Google Scholar
- Binomial filtersJournal of VLSI signal processing systems for signal, image and video technology 12:35–50https://doi.org/10.1007/BF00936945Google Scholar
- Finite-time Analysis of the Multiarmed Bandit ProblemMachine Learning 47:235–256https://doi.org/10.1023/A:1013689704352Google Scholar
- Circadian rhythms from multiple oscillators: lessons from diverse organismsNature Reviews Genetics Nature Publishing Group 6:544–556https://doi.org/10.1038/nrg1633Google Scholar
- A Survey of Monte Carlo Tree Search MethodsIn: IEEE Transactions on Computational Intelligence and AI in Games pp. 1–43https://doi.org/10.1109/TCIAIG.2012.2186810Google Scholar
- The slow-scale stochastic simulation algorithmThe Journal of Chemical Physics 122:014116https://doi.org/10.1063/1.1824902Google Scholar
- Glycolytic Oscillations and Limits on Robust EfficiencyScience 333:187–192https://doi.org/10.1126/science.1200705Google Scholar
- Designing Synthetic Regulatory Networks Capable of Self-Organizing Cell PolarizationCell 151:320–332https://doi.org/10.1016/j.cell.2012.08.040Google Scholar
- Interlocked feedback loops contribute to the robustness of the Neurospora circadian clockProceedings of the National Academy of Sciences Proceedings of the National Academy of Sciences 98:7408–7413https://doi.org/10.1073/pnas.121170298Google Scholar
- An atlas of gene regulatory networks reveals multiple three-gene mechanisms for interpreting morphogen gradientsMolecular Systems Biology 6:425https://doi.org/10.1038/msb.2010.74Google Scholar
- Decoding CAR T cell phenotype using combinatorial signaling motif libraries and machine learningScience 378:1194–1200https://doi.org/10.1126/science.abq0225Google Scholar
- Discriminatory Power of Combinatorial Antigen Recognition in Cancer T Cell TherapiesCell Systems 11:215–228https://doi.org/10.1016/j.cels.2020.08.002Google Scholar
- Using synthetic biology to explore principles of developmentDevelopment 144:1146–1158https://doi.org/10.1242/dev.144196Google Scholar
- A synthetic oscillatory network of transcriptional regulatorsNature 403:335–338https://doi.org/10.1038/35002125Google Scholar
- A Lock-Free Multithreaded Monte-Carlo Tree Search AlgorithmIn:
- van den Herik HJ
- Spronck P
- Modeling the Cell Cycle: Why Do Certain Circuits Oscillate?Cell 144:874–885https://doi.org/10.1016/j.cell.2011.03.006Google Scholar
- A Simulation Testbed for the Study of Multicellular Development: The Multiple Mechanisms of MorphogenesisArtificial Life III Google Scholar
- Design of genetic networks with specified functions by evolution in silicoProceedings of the National Academy of Sciences 101:580–585https://doi.org/10.1073/pnas.0304532101Google Scholar
- Deriving structure from evolution: metazoan segmentationMolecular Systems Biology 3:154https://doi.org/10.1038/msb4100192Google Scholar
- A case study of evolutionary computation of biochemical adaptationPhysical Biology 5:026009https://doi.org/10.1088/1478-3975/5/2/026009Google Scholar
- Predicting embryonic patterning using mutual entropy fitness and in silico evolutionDevelopment 137:2385–2395https://doi.org/10.1242/dev.048033Google Scholar
- Construction of a genetic toggle switch in Escherichia coliNature 403:339–342https://doi.org/10.1038/35002131Google Scholar
- The Design Principles of Biochemical Timers: Circuits that Discriminate between Transient and Sustained StimulationCell Systems 9:297–308https://doi.org/10.1016/j.cels.2019.07.008Google Scholar
- Fourier analysis and systems identification of the p53 feedback loopProceedings of the National Academy of Sciences 107:13550–13555https://doi.org/10.1073/pnas.1001107107Google Scholar
- Oscillations and variability in the p53 systemMolecular Systems Biology 2https://doi.org/10.1038/msb4100068Google Scholar
- Exact stochastic simulation of coupled chemical reactionsThe Journal of Physical Chemistry 81:2340–2361https://doi.org/10.1021/j100540a008Google Scholar
- A tale of two rhythms: Locked clocks and chaos in biologyCell Systems 12:291–303https://doi.org/10.1016/j.cels.2021.03.003Google Scholar
- A spectrum of modularity in multi-functional gene circuitsMolecular Systems Biology 13:925https://doi.org/10.15252/msb.20167347Google Scholar
- Synthetic microbiology in sustainability applicationsNature Reviews Microbiology :1–15https://doi.org/10.1038/s41579-023-01007-9Google Scholar
- Bandit Based Monte-Carlo PlanningIn:
- Fürnkranz J
- Scheffer T
- Spiliopoulou M
- Interconnected Feedback Loops in the Neurospora Circadian SystemScience 289:107–110https://doi.org/10.1126/science.289.5476.107Google Scholar
- Design Principles of Regulatory Networks: Searching for the Molecular Algorithms of the CellMolecular Cell 49:202–212https://doi.org/10.1016/j.molcel.2012.12.020Google Scholar
- Defining Network Topologies that Can Achieve Biochemical AdaptationCell 138:760–773https://doi.org/10.1016/j.cell.2009.06.013Google Scholar
- Network Motifs: Simple Building Blocks of Complex NetworksScience 298:824–827https://doi.org/10.1126/science.298.5594.824Google Scholar
- BioNumbers—the database of key numbers in molecular and cell biologyNucleic Acids Research 38:D750–D753https://doi.org/10.1093/nar/gkp889Google Scholar
- Design principles of biochemical oscillatorsNature Reviews Molecular Cell Biology 9:981–991https://doi.org/10.1038/nrm2530Google Scholar
- Design Principles of Biological Oscillators through Optimization: Forward and Reverse AnalysisPLOS ONE 11:e0166867https://doi.org/10.1371/journal.pone.0166867Google Scholar
- The clock gene circuit in Arabidopsis includes a repressilator with additional feedback loopsMolecular Systems Biology 8:574https://doi.org/10.1038/msb.2012.6Google Scholar
- The second wave of synthetic biology: from modules to systemsNature Reviews Molecular Cell Biology 10:410–422https://doi.org/10.1038/nrm2698Google Scholar
- Network design principle for robust oscillatory behaviors with respect to biological noiseeLife 11:e76188https://doi.org/10.7554/eLife.76188Google Scholar
- Network Topologies That Can Achieve Dual Function of Adaptation and Noise AttenuationCell Systems 9:271–285https://doi.org/10.1016/j.cels.2019.08.006Google Scholar
- A unified design space of synthetic stripe-forming networksNature Communications 5:4905https://doi.org/10.1038/ncomms5905Google Scholar
- Robust Network Topologies for Generating Switch-Like Cellular ResponsesPLOS Computational Biology 7:e1002085https://doi.org/10.1371/journal.pcbi.1002085Google Scholar
- Programming a Computer for Playing ChessPhilosophical Magazine 41https://www.computerhistory.org/chess/doc-431614f453dde/Google Scholar
- Finding gene network topologies for given biological function with recurrent neural networkNature Communications 12:3125https://doi.org/10.1038/s41467-021-23420-5Google Scholar
- Mastering the game of Go with deep neural networks and tree searchNature 529:484–489https://doi.org/10.1038/nature16961Google Scholar
- A general reinforcement learning algorithm that masters chess, shogi, and Go through self-playScience 362:1140–1144https://doi.org/10.1126/science.aar6404Google Scholar
- Mastering the game of Go without human knowledgeNature 550:354–359https://doi.org/10.1038/nature24270Google Scholar
- A fast, robust and tunable synthetic gene oscillatorNature 456:516–519https://doi.org/10.1038/nature07389Google Scholar
- Programming self-organizing multicellular structures with synthetic cell-cell signalingScience 361:156–162https://doi.org/10.1126/science.aat0271Google Scholar
- Combinatorics of GoIn: Computers and Games Lecture Notes in Computer Science pp. 84–99https://doi.org/10.1007/978-3-540-75538-8_8Google Scholar
- A Simple Model of Circadian Rhythms Based on Dimerization and Proteolysis of PER and TIMBiophysical Journal 77:2411–2417https://doi.org/10.1016/S0006-3495(99)77078-5Google Scholar
- Circuit topology and the evolution of robustness in two-gene circadian oscillatorsProceedings of the National Academy of Sciences 102:11775–11780https://doi.org/10.1073/pnas.0501094102Google Scholar
- Precise T cell recognition programs designed by transcriptionally linking multiple receptorsScience 370:1099–1104https://doi.org/10.1126/science.abc6270Google Scholar
- A Statistical Approach Reveals Designs for the Most Robust Stochastic Gene OscillatorsACS Synthetic Biology 5:459–470https://doi.org/10.1021/acssynbio.5b00179Google Scholar
- Synthetic multistability in mammalian cellsScience 375:eabg9765https://doi.org/10.1126/science.abg9765Google Scholar
- Monte Carlo Tree Search: a review of recent modifications and applicationsArtificial Intelligence Review https://doi.org/10.1007/s10462-022-10228-yGoogle Scholar
Article and author information
Author information
Version history
- Preprint posted:
- Sent for peer review:
- Reviewed Preprint version 1:
Cite all versions
You can cite all versions using the DOI https://doi.org/10.7554/eLife.106497. This DOI represents all versions, and will always resolve to the latest one.
Copyright
© 2025, Pranav S Bhamidipati & Matthew Thomson
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- views
- 33
- downloads
- 0
- citations
- 0
Views, downloads and citations are aggregated across all versions of this paper published by eLife.