Building accurate sequence-to-affinity models from high-throughput in vitro protein-DNA binding data using FeatureREDUCE

  1. Todd R Riley
  2. Allan Lazarovici
  3. Richard S Mann
  4. Harmen J Bussemaker (corresponding author)
  1. Columbia University, United States

Abstract

Transcription factors are crucial regulators of gene expression. Accurate quantitative definition of their intrinsic DNA binding preferences is critical to understanding their biological function. High-throughput in vitro technology has recently been used to deeply probe the DNA binding specificity of hundreds of eukaryotic transcription factors, but algorithms for analyzing such data have not yet fully matured. Here we present a general framework (FeatureREDUCE) for building sequence-to-affinity models based on a biophysically interpretable and extensible model of protein-DNA interaction that can account for dependencies between nucleotides within the binding interface or for multiple modes of binding. When training on protein binding microarray (PBM) data, we use robust regression and modeling of technology-specific biases to infer specificity models of unprecedented accuracy and precision. We validate our results quantitatively by comparing them to gold-standard data when available.
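
As an illustration of what a sequence-to-affinity model with nucleotide dependencies can look like, the following Python sketch scores a short binding site under a generic additive free-energy model with one optional dinucleotide correction. It is not the FeatureREDUCE implementation: the function name, the 3-bp site length, and all energy values are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical per-position free-energy penalties (in units of RT) for a
# 3-bp site. The optimal base at each position has penalty 0.0.
# All numbers are made up for illustration.
MONO = [
    {"A": 0.0, "C": 1.2, "G": 2.0, "T": 1.5},
    {"A": 1.0, "C": 0.0, "G": 1.8, "T": 2.2},
    {"A": 2.5, "C": 1.1, "G": 0.0, "T": 0.7},
]

# Hypothetical dinucleotide correction, keyed by (start position, dinucleotide).
# It adds an extra penalty that the independent-position terms cannot capture.
DINUC = {(1, "CT"): 0.4}


def relative_affinity(site):
    """Return the affinity of `site` relative to the optimal site "ACG"."""
    if len(site) != len(MONO):
        raise ValueError("site must have length %d" % len(MONO))
    # Sum the independent (mononucleotide) contributions.
    ddg = sum(MONO[i][base] for i, base in enumerate(site))
    # Add corrections for any matching adjacent-nucleotide features.
    ddg += sum(v for (i, dn), v in DINUC.items() if site[i:i + 2] == dn)
    # Boltzmann factor converts the free-energy penalty to a relative affinity.
    return math.exp(-ddg)


if __name__ == "__main__":
    for s in ("ACG", "ACT", "CCG"):
        print(s, round(relative_affinity(s), 4))
```

In this sketch the optimal site has relative affinity 1, and every mismatch or penalized dinucleotide multiplies the affinity by a factor below 1. Fitting such parameters to PBM intensities, for example by robust regression as the abstract describes, is outside the scope of this example.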

Article and author information

Author details

  1. Todd R Riley

    Department of Biological Sciences, Columbia University, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
  2. Allan Lazarovici

    Department of Biological Sciences, Columbia University, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
  3. Richard S Mann

    Department of Biochemistry and Molecular Biophysics, Columbia University, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
  4. Harmen J Bussemaker

    Department of Biological Sciences, Columbia University, New York, United States
    For correspondence
    hjb2004@columbia.edu
    Competing interests
    The authors declare that no competing interests exist.

Reviewing Editor

  1. Nir Friedman, The Hebrew University of Jerusalem, Israel

Publication history

  1. Received: January 8, 2015
  2. Accepted: December 20, 2015
  3. Accepted Manuscript published: December 23, 2015 (version 1)
  4. Version of Record published: February 9, 2016 (version 2)

Copyright

© 2015, Riley et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Todd R Riley, Allan Lazarovici, Richard S Mann, Harmen J Bussemaker (2015)
Building accurate sequence-to-affinity models from high-throughput in vitro protein-DNA binding data using FeatureREDUCE
eLife 4:e06397.
https://doi.org/10.7554/eLife.06397
