Synthetic and genomic regulatory elements reveal aspects of cis-regulatory grammar in Mouse Embryonic Stem Cells

  1. Dana M King
  2. Clarice Kit Yee Hong
  3. James L Shepherdson
  4. David M Granas
  5. Brett B Maricque
  6. Barak Cohen (corresponding author)
  1. Washington University in St Louis School of Medicine, United States

Abstract

In embryonic stem cells (ESCs), a core transcription factor (TF) network establishes the gene expression program necessary for pluripotency. To address how interactions between four key TFs contribute to cis-regulation in mouse ESCs, we assayed two massively parallel reporter assay (MPRA) libraries composed of binding sites for SOX2, POU5F1 (OCT4), KLF4, and ESRRB. Comparisons between synthetic cis-regulatory elements and genomic sequences with comparable binding site configurations revealed some aspects of a regulatory grammar. The expression of synthetic elements is influenced by both the number and arrangement of binding sites. This grammar plays only a small role for genomic sequences, as the relative activities of genomic sequences are best explained by the predicted affinity of binding sites, regardless of binding site identity and positioning. Our results suggest that the effects of transcription factor binding sites (TFBS) are influenced by the order and orientation of sites, but that in the genome the overall occupancy of TFs is the primary determinant of activity.

Data availability

Sequencing data have been deposited in GEO under accession code GSE120240. Any additional data generated during this study are included in the manuscript and supporting files.

Article and author information

Author details

  1. Dana M King

    Edison Center for Genome Sciences and Systems Biology, Washington University in St Louis School of Medicine, St Louis, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0003-4635-5272
  2. Clarice Kit Yee Hong

    Edison Center for Genome Sciences and Systems Biology, Washington University in St Louis School of Medicine, St Louis, United States
    Competing interests
    The authors declare that no competing interests exist.
  3. James L Shepherdson

    Edison Center for Genome Sciences and Systems Biology, Washington University in St Louis School of Medicine, St Louis, United States
    Competing interests
    The authors declare that no competing interests exist.
  4. David M Granas

    Edison Center for Genome Sciences and Systems Biology, Washington University in St Louis School of Medicine, St Louis, United States
    Competing interests
    The authors declare that no competing interests exist.
  5. Brett B Maricque

    Edison Center for Genome Sciences and Systems Biology, Washington University in St Louis School of Medicine, St Louis, United States
    Competing interests
    The authors declare that no competing interests exist.
  6. Barak Cohen

    Edison Center for Genome Sciences and Systems Biology, Washington University in St Louis School of Medicine, St Louis, United States
    For correspondence
    cohen@wustl.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-3350-2715

Funding

National Institutes of Health (R01 GM092910)

  • Barak Cohen

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Patricia J Wittkopp, University of Michigan, United States

Publication history

  1. Received: August 20, 2018
  2. Accepted: February 7, 2020
  3. Accepted Manuscript published: February 11, 2020 (version 1)
  4. Version of Record published: March 17, 2020 (version 2)

Copyright

© 2020, King et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Dana M King, Clarice Kit Yee Hong, James L Shepherdson, David M Granas, Brett B Maricque, Barak Cohen (2020)
Synthetic and genomic regulatory elements reveal aspects of cis-regulatory grammar in Mouse Embryonic Stem Cells
eLife 9:e41279.
https://doi.org/10.7554/eLife.41279
