Deciphering the regulatory genome of Escherichia coli, one hundred promoters at a time

  1. William T Ireland
  2. Suzannah M Beeler
  3. Emanuel Flores-Bautista
  4. Nicholas S McCarty
  5. Tom Röschinger
  6. Nathan M Belliveau
  7. Michael J Sweredoski
  8. Annie Moradian
  9. Justin B Kinney
  10. Rob Phillips (corresponding author)
  1. California Institute of Technology, United States
  2. Cold Spring Harbor Laboratory, United States

Abstract

Advances in DNA sequencing have revolutionized our ability to read genomes. However, even in the most well-studied of organisms, the bacterium Escherichia coli, we remain ignorant of the regulation of ≈ 65% of promoters. Until we crack this regulatory Rosetta Stone, efforts to read and write genomes will remain haphazard. We introduce a new method, Reg-Seq, that links massively parallel reporter assays with mass spectrometry to produce a base-pair-resolution dissection of more than 100 E. coli promoters in 12 growth conditions. We demonstrate that the method recapitulates known regulatory information. Then, we examine regulatory architectures for more than 80 promoters which previously had no known regulatory information. In many cases, we also identify which transcription factors mediate their regulation. This method clears a path for highly multiplexed investigations of the regulatory genome of model organisms, with the potential of moving to an array of microbes of ecological and medical relevance.

Data availability

Sequencing data has been deposited in the SRA under accession nos. PRJNA599253 and PRJNA603368. Mass spectrometry data is deposited in the CalTech data repository (doi:10.22002/d1.1336). Model files and inferred information footprints are deposited in the CalTech data repository (doi:10.22002/D1.1331). Processed sequencing data sets and analysis software are available in the GitHub repository (https://doi.org/10.5281/zenodo.3953312).

Article and author information

Author details

  1. William T Ireland

    Physics, California Institute of Technology, Pasadena, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0003-0971-2904
  2. Suzannah M Beeler

    Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-1930-4827
  3. Emanuel Flores-Bautista

    Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    Competing interests
    The authors declare that no competing interests exist.
  4. Nicholas S McCarty

    Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    Competing interests
    The authors declare that no competing interests exist.
  5. Tom Röschinger

    Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, United States
    Competing interests
    The authors declare that no competing interests exist.
  6. Nathan M Belliveau

    Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-1536-1963
  7. Michael J Sweredoski

    Proteome Exploration Laboratory, Division of Biology and Biological Engineering, Beckman Institute, California Institute of Technology, Pasadena, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0003-0878-3831
  8. Annie Moradian

    Proteome Exploration Laboratory, Division of Biology and Biological Engineering, Beckman Institute, California Institute of Technology, Pasadena, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-0407-2031
  9. Justin B Kinney

    Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0003-1897-3778
  10. Rob Phillips

    Department of Bioengineering, California Institute of Technology, Pasadena, United States
    For correspondence
    phillips@pboc.caltech.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0003-3082-2809

Funding

National Institutes of Health (Director's Pioneer Award)

  • Rob Phillips

National Institutes of Health (National Research Service Award, 5T32GM007616-38)

  • Suzannah M Beeler

National Institutes of Health (Maximizing Investigators Research Award)

  • Rob Phillips

Howard Hughes Medical Institute (International Student Research Fellowship)

  • Nathan M Belliveau

National Institutes of Health (1S10OD02001301)

  • Annie Moradian

National Institutes of Health (1S10OD02001301)

  • Michael J Sweredoski

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

© 2020, Ireland et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. William T Ireland
  2. Suzannah M Beeler
  3. Emanuel Flores-Bautista
  4. Nicholas S McCarty
  5. Tom Röschinger
  6. Nathan M Belliveau
  7. Michael J Sweredoski
  8. Annie Moradian
  9. Justin B Kinney
  10. Rob Phillips
(2020)
Deciphering the regulatory genome of Escherichia coli, one hundred promoters at a time
eLife 9:e55308.
https://doi.org/10.7554/eLife.55308
