
A convolutional neural network for the prediction and forward design of ribozyme-based gene-control elements

  1. Calvin M Schmidt
  2. Christina D Smolke (corresponding author)
  1. Stanford University, United States
Research Article
Cite this article as: eLife 2021;10:e59697 doi: 10.7554/eLife.59697


Abstract

Ribozyme switches are a class of RNA-encoded genetic switch that support conditional regulation of gene expression across diverse organisms. A better understanding of the relationships between sequence, structure, and activity can improve our capacity for de novo rational design of ribozyme switches. Here, we generated data on the activity of hundreds of thousands of ribozyme sequences. Using automated structural analysis and machine learning, we leveraged these large datasets to develop predictive models that estimate the in vivo gene-regulatory activity of a ribozyme sequence. These models supported the de novo design of ribozyme libraries with low mean basal gene-regulatory activities, as well as new ribozyme switches that change their gene-regulatory activity in the presence of a target ligand, producing functional switches for four out of five aptamers. Our work examines how biases in the model and the dataset can arise and affect prediction accuracy, and demonstrates that machine learning can be applied to RNA sequences to predict gene-regulatory activity, providing the basis for design tools for functional RNAs.

Data availability

All data generated or analyzed during this study are included in the manuscript and supporting files. Source data files are provided where appropriate.

The following previously published data sets were used

Article and author information

Author details

  1. Calvin M Schmidt

    Bioengineering, Stanford University, Stanford, United States
    Competing interests
    The authors declare that no competing interests exist.
  2. Christina D Smolke

    Bioengineering, Stanford University, Stanford, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID iD: 0000-0002-5449-8495


Funding

National Institutes of Health (R01 GM086663)

  • Christina D Smolke

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Detlef Weigel, Max Planck Institute for Developmental Biology, Germany

Publication history

  1. Received: June 5, 2020
  2. Accepted: April 15, 2021
  3. Accepted Manuscript published: April 16, 2021 (version 1)
  4. Version of Record published: May 17, 2021 (version 2)


© 2021, Schmidt & Smolke

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.



