1. Physics of Living Systems

Measuring cis-regulatory energetics in living cells using allelic manifolds

  1. Talitha L Forcier
  2. Andalus Ayaz
  3. Manraj S Gill
  4. Daniel Jones
  5. Rob Phillips
  6. Justin B Kinney (corresponding author)
  1. Cold Spring Harbor Laboratory, United States
  2. California Institute of Technology, United States
Research Communication
Cite this article as: eLife 2018;7:e40618 doi: 10.7554/eLife.40618
12 figures, 2 tables and 2 additional files


Figure 1
Strategy for measuring TF-DNA interactions.

(A) A thermodynamic model of simple repression. Here, promoter DNA can transition between three possible states: unbound, bound by a TF, or bound by RNAP. Each state has an associated Boltzmann weight and rate of transcript initiation. F is the TF binding factor and P is the RNAP binding factor; see text for a description of how these dimensionless binding factors relate to binding affinity and binding energy. tsat is the rate of specific transcript initiation from a promoter fully occupied by RNAP. (B) Transcription is measured in the presence (t+) and absence (t-) of the TF. Measurements are made for an allelic series of RNAP binding sites that differ in their binding strengths (blue-yellow gradient). (C) If the model in panel A is correct, plotting t+ vs. t- for the promoters in panel B (colored dots) will trace out a 1D allelic manifold. Mathematically, this manifold reflects Equation 1 and Equation 2 computed over all possible values of the RNAP binding factor P while the other parameters (F, tsat) are held fixed. Note that these equations include a background transcription term tbg; it is assumed throughout that tbg ≪ tsat and that tbg is independent of RNAP binding site sequence. The resulting manifold exhibits five distinct regimes (circled numbers), corresponding to different ranges for the value of P that allow the mathematical expressions in Equations 1 and 2 to be approximated by simplified expressions. In regime 3, for instance, t+ ≈ t-/(1+F), and thus the manifold approximately follows a line parallel (on a log-log plot) to the diagonal but offset below it by a factor of 1+F (dashed line). Data points in this regime can therefore be used to determine the value of F. (D) The five regimes of the allelic manifold, including approximate expressions for t+ and t- in each regime, as well as the range of validity for P.
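
The manifold described in panel C can be sketched numerically. The caption does not reproduce Equations 1 and 2, so the expressions below are an assumption: the standard occupancy formulas for a three-state model with Boltzmann weights 1, F, and P. The function name and parameter values are illustrative only.

```python
import numpy as np

def occlusion_manifold(P, F, t_sat=1.0, t_bg=1e-6):
    """Return (t_minus, t_plus) for RNAP binding factor(s) P.

    Assumed forms (three-state occlusion model; not copied from the article):
        t_plus  = t_sat * P / (1 + F + P) + t_bg   (TF present)
        t_minus = t_sat * P / (1 + P)     + t_bg   (TF absent)
    """
    P = np.asarray(P, dtype=float)
    t_plus = t_sat * P / (1.0 + F + P) + t_bg
    t_minus = t_sat * P / (1.0 + P) + t_bg
    return t_minus, t_plus

F = 10.0                           # illustrative TF binding factor
P_grid = np.logspace(-9, 4, 200)   # sweep P to trace the 1D manifold
t_minus, t_plus = occlusion_manifold(P_grid, F)

# Regime 3 check: P small, but signal well above background.
t_m3, t_p3 = occlusion_manifold(1e-3, F)
offset = float(t_m3 / t_p3)        # should be close to 1 + F
```

Plotting t_plus against t_minus on log-log axes gives the curve; at both ends of the sweep the two values converge (to tbg at small P, to roughly tsat at large P), which is why only the intermediate regime constrains F.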

Figure 2
Precision measurement of in vivo CRP-DNA binding.

(A) Expression measurements were performed on promoters for which CRP represses transcription by occluding RNAP. Each promoter assayed contained a near-consensus CRP binding site centered at either +0.5 bp or +4.5 bp, as well as an RNAP binding site with a partially mutagenized −35 region (gradient). t+ (or t-) denotes measurements made using E. coli strain JK10 grown in the presence (or absence) of the small molecule cAMP. (B) Dots indicate measurements for 41 such promoters. A best-fit allelic manifold (black) was inferred from n=39 of these data points after the exclusion of 2 outliers (gray ‘X’s). Gray lines indicate 100 plausible allelic manifolds fit to bootstrap-resampled data points. The parameters of these manifolds were used to determine the CRP-DNA binding factor F and thus the Gibbs free energy ΔGF=-kBTlogF. Error bars indicate 68% confidence intervals determined by bootstrap resampling. See Appendix 3 for more information about our manifold fitting procedure.
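
The conversion from a binding factor to a Gibbs free energy, ΔGF=-kBTlogF, can be written out as follows. This is a minimal sketch that assumes log is the natural logarithm and that the relevant temperature is 37 °C; neither is stated in this caption, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol K)

def binding_factor_to_dG(F, temp_celsius=37.0):
    """Gibbs free energy in kcal/mol from a dimensionless binding factor.

    Implements dG = -kB*T*ln(F) on a per-mole basis.
    The 37 C default is an assumption, not taken from the caption.
    """
    kBT = R_KCAL_PER_MOL_K * (temp_celsius + 273.15)
    return -kBT * math.log(F)

# Illustrative value only; not a measurement from the article.
example_dG = binding_factor_to_dG(20.0)
```

A binding factor above 1 maps to a negative ΔG, and each e-fold increase in F lowers ΔG by one kBT (about 0.62 kcal/mol at the assumed temperature).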

Figure 3
Measuring in vivo changes in TF concentration.

(A) Allelic manifolds were measured for the +0.5 bp occlusion promoter architecture using seven different concentrations of cAMP (ranging from 2.5 µM to 250 µM) when assaying t+. (B) As expected, these data follow allelic manifolds that have cAMP-dependent values for the CRP binding factor F. (C) Values for F inferred from the data in panel B exhibit a nontrivial power law dependence on [cAMP]. Error bars indicate 68% confidence intervals determined by bootstrap resampling.
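
A power law dependence such as the one in panel C is commonly quantified by a straight-line fit in log-log space. The sketch below uses synthetic values: the caption gives only the range (2.5 µM to 250 µM) and the number of concentrations (seven), so the intermediate concentrations, the prefactor, and the exponent are all hypothetical.

```python
import numpy as np

def fit_power_law(x, y):
    """Fit y = prefactor * x**exponent by least squares on log-log values."""
    exponent, log_prefactor = np.polyfit(np.log(x), np.log(y), 1)
    return exponent, np.exp(log_prefactor)

# Hypothetical cAMP concentrations in uM (only the endpoints are from the caption).
camp = np.array([2.5, 5.0, 10.0, 25.0, 50.0, 125.0, 250.0])
# Synthetic binding factors following an arbitrary power law.
F_synth = 0.3 * camp ** 1.4

exponent, prefactor = fit_power_law(camp, F_synth)
```

With real data, the bootstrap-derived uncertainties on F could be propagated by repeating the fit on each resampled set of F values.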

Figure 4
Strategy for measuring TF-RNAP interactions.

(A) A thermodynamic model of simple activation. Here, promoter DNA can transition between four different states: unbound, bound by the TF, bound by RNAP, or doubly bound. As in Figure 1, F is the TF binding factor, P is the RNAP binding factor, and tsat is the rate of transcript initiation from an RNAP-saturated promoter. The cooperativity factor α quantifies the strength of the interaction between DNA-bound TF and RNAP molecules; see text for more information on this quantity. (B) As in Figure 1, expression is measured in the presence (t+) and absence (t-) of the TF for promoters that have an allelic series of RNAP binding sites (blue-yellow gradient). (C) If the model in panel A is correct, plotting t+ vs. t- (colored dots) will reveal a 1D allelic manifold that corresponds to Equation 4 (for t+) and Equation 2 (for t-) evaluated over all possible values of P. Circled numbers indicate the five regimes of this manifold. In regime 3, t+ ≈ α′t-, where α′ is the renormalized cooperativity factor given in Equation 5; data in this regime can thus be used to measure α′. Separate measurements of F, using the strategy in Figure 1, then allow one to compute α from knowledge of α′. (D) The five regimes of the allelic manifold in panel C. Note that these regimes differ from those in Figure 1D.
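
The two-step inference in panel C (measure the renormalized cooperativity, then recover α using a separately measured F) can be sketched as below. Equations 4 and 5 are not reproduced in this caption, so the forms used here are assumptions derived from a four-state model with Boltzmann weights 1, F, P, and αFP; all function names and parameter values are illustrative.

```python
import math

def stabilization_t_plus(P, F, alpha, t_sat=1.0, t_bg=1e-8):
    """Assumed four-state expression for transcription with the TF present."""
    return t_sat * (P + alpha * F * P) / (1 + F + P + alpha * F * P) + t_bg

def unregulated_t_minus(P, t_sat=1.0, t_bg=1e-8):
    """Assumed expression for transcription with the TF absent."""
    return t_sat * P / (1 + P) + t_bg

def renormalized_alpha(alpha, F):
    """Small-P limit of t_plus / t_minus under the assumed model."""
    return (1 + alpha * F) / (1 + F)

def alpha_from_renormalized(alpha_r, F):
    """Invert renormalized_alpha to recover alpha once F is known."""
    return (alpha_r * (1 + F) - 1) / F

F, alpha = 100.0, 300.0   # illustrative values
P = 1e-5                  # weak promoter, signal still above background
ratio = stabilization_t_plus(P, F, alpha) / unregulated_t_minus(P)
alpha_r = renormalized_alpha(alpha, F)
recovered = alpha_from_renormalized(alpha_r, F)
```

When F is much larger than 1 the renormalized factor is close to α itself, so the correction matters mainly when the TF site is only partially occupied.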

Figure 5
Precision measurement of class I CRP-RNAP interactions.

(A) t+ and t- were measured for promoters containing a CRP binding site centered at −61.5 bp. The RNAP sites of these promoters were mutagenized in either their −10 or −35 regions (gradient), generating two allelic series. As in Figure 2, t+ and t- correspond to expression measurements respectively made in the presence and absence of cAMP. (B) Data obtained for 47 variant promoters having the architecture shown in panel A. Three data points designated as outliers are indicated by ‘X’s. The allelic manifold that best fits the n=44 non-outlier points is shown in black; 100 plausible manifolds, estimated from bootstrap-resampled data points, are shown in gray. The resulting values for α and ΔGα=-kBTlogα are also shown, with 68% confidence intervals indicated. (C) Allelic manifolds obtained for promoters with CRP binding sites centered at a variety of class I positions. (D) Inferred values for the cooperativity factor α and corresponding Gibbs free energy ΔGα for the 12 different promoter architectures assayed in panel C. Error bars indicate 68% confidence intervals. Numerical values for α and ΔGα at all of these class I positions are provided in Table 1.

Figure 6
RNAP-DNA binding energy cannot be accurately predicted from sequence.

(A) The PSAM for RNAP-DNA binding inferred by Kinney et al. (2010). This model assumes that the DNA base pair at each position in the RNAP binding site contributes independently to ΔGP. Shown are the ΔΔGP values assigned by this model to mutations away from the lac* RNAP site. The sequence of the lac* RNAP site is indicated by gray vertical bars; see also Appendix 1—figure 1. A sequence logo representation for this PSAM is provided for reference. (B) PSAM predictions plotted against the values ΔGP=-kBTlogP inferred by fitting the allelic manifolds in Figure 5C. Error bars on these measurements represent 68% confidence intervals. Note that measured ΔGP values are absolute, whereas the ΔΔGP predictions of the PSAM are relative to the lac* RNAP site, which thus corresponds to ΔΔGP=0 kcal/mol. The dashed line, provided for reference, has slope 1 and passes through this lac* data point.
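
The independence assumption in panel A means that a PSAM prediction is a sum of per-position terms. The sketch below illustrates this with a toy three-position matrix; the numbers and the reference sequence are invented for illustration and are unrelated to the published RNAP PSAM.

```python
import math

def psam_ddG(seq, psam):
    """Sum independent per-position contributions (kcal/mol) for a sequence.

    psam is a list of dicts, one per position, mapping base -> ddG
    relative to the reference base at that position.
    """
    if len(seq) != len(psam):
        raise ValueError("sequence length must match the number of PSAM positions")
    return sum(column[base] for column, base in zip(psam, seq))

# Toy matrix; the reference sequence "TAT" scores 0 at every position.
toy_psam = [
    {"A": 0.8, "C": 1.1, "G": 0.9, "T": 0.0},
    {"A": 0.0, "C": 0.6, "G": 0.4, "T": 0.7},
    {"A": 1.2, "C": 0.5, "G": 1.0, "T": 0.0},
]

reference = psam_ddG("TAT", toy_psam)
single_1 = psam_ddG("CAT", toy_psam)
single_2 = psam_ddG("TAG", toy_psam)
double = psam_ddG("CAG", toy_psam)   # additive: equals single_1 + single_2
```

Because predictions of this kind are relative to the reference site, comparing them with absolute measured values requires anchoring at the reference data point, as the dashed line in panel B does.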

Figure 7
A strategy for distinguishing two different mechanisms of transcriptional activation.

(A) A TF can activate transcription in two ways: by stabilizing the RNAP-DNA complex or by accelerating the rate at which this complex initiates transcripts. (B) A thermodynamic model for the dual mechanism of transcriptional activation illustrated in panel A. Note that α multiplies the Boltzmann weight of the doubly bound complex, whereas β multiplies the transcript initiation rate of this complex. (C) Data points measured as in Figure 4C will lie along a 1D allelic manifold having the form shown here. This manifold is computed using t+ values from Equation 7 and t- values from Equation 2. Note that regime 5 occurs at a point positioned β′-fold above the diagonal, where β′ is related to β through Equation 8. Measurements in or near the strong promoter regime (P ≫ 1) can thus be used to determine the value of β′ and, consequently, the value of β. (D) The five regimes of this allelic manifold are listed.
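
The relationship between the plateau height and the acceleration factor can be sketched as follows. Equations 7 and 8 are not reproduced in this caption; the expressions below are assumptions derived from the model in panel B, in which the doubly bound state has Boltzmann weight αFP and initiates transcripts at rate β·tsat. Function names are hypothetical, and the parameter values are illustrative.

```python
import math

def acceleration_t_plus(P, F, alpha, beta, t_sat=1.0, t_bg=0.0):
    """Assumed expression for t+ when the TF both stabilizes and accelerates."""
    return t_sat * (P + alpha * beta * F * P) / (1 + F + P + alpha * F * P) + t_bg

def renormalized_beta(alpha, beta, F):
    """Large-P limit of t_plus / t_sat under the assumed model."""
    return (1 + alpha * beta * F) / (1 + alpha * F)

def beta_from_renormalized(beta_r, alpha, F):
    """Invert renormalized_beta to recover beta once alpha and F are known."""
    return (beta_r * (1 + alpha * F) - 1) / (alpha * F)

F, alpha, beta = 100.0, 10.0, 30.0   # illustrative values
beta_r = renormalized_beta(alpha, beta, F)
plateau = acceleration_t_plus(1e6, F, alpha, beta)   # strong-promoter limit
recovered = beta_from_renormalized(beta_r, alpha, F)
```

In the TF-absent condition the same strong-promoter limit gives t- close to tsat, so a plateau above the diagonal signals acceleration, while a plateau on the diagonal is consistent with β = 1.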

Figure 8
Class I activation by CRP occurs exclusively through stabilization.

(A) t+ and t- were measured for promoters containing variants of the consensus RNAP binding site as well as a CRP binding site centered at −60.5 bp. Because the consensus RNAP site is 1 bp shorter than the RNAP site of the lac* promoter, CRP at −60.5 bp here corresponds to CRP at −61.5 bp in Figure 5. (B) n=18 data points obtained for the constructs in panel A, overlaid on the measurements from Figure 5B (gray). The value tsat=15.1 a.u., inferred for Figure 5C, is indicated by dashed lines. (C) Values for β inferred using the data in Figure 5 for the 10 CRP positions that exhibited greater than 2-fold inducibility; β values at the two other CRP positions (−66.5 bp and −76.5 bp) were highly uncertain and are not shown. Error bars indicate 68% confidence intervals.

Figure 9
Surprises in class II regulation by CRP.

(A) Regulation by CRP centered at −41.5 bp was assayed using an allelic series of RNAP binding sites that have variant −10 elements (gradient). (B) The observed allelic manifold plateaus at the value of tsat=15.1 a.u. (dashed lines) determined for Figure 5B, thus indicating no detectable acceleration by CRP. This lack of acceleration is at odds with prior in vitro studies (Niu et al., 1996; Rhodius et al., 1997). (C) Regulation by CRP centered at −40.5 bp was assayed in an analogous manner. (D) Unexpectedly, data from the promoters in panel C do not collapse to a 1D allelic manifold. This finding falsifies the biophysical models in Figures 4A and 7B and indicates that CRP can either activate or repress transcription from this position, depending on as-yet-unidentified features of the RNAP binding site. Error bars in panel D indicate 95% confidence intervals estimated from replicate experiments.

Appendix 1—figure 1
Promoter sequences used in this study.

In all panels, the −35 and −10 hexamers of the RNAP binding site are in bold. CRP binding site centers are indicated by small triangles. The palindromic pentamers of the core CRP binding site in each construct are underlined. The transcription start site (TSS) is bold and italicized. Lowercase bases (‘a’,‘c’,‘g’, and ‘t’) indicate positions synthesized with a 24% mutation rate. The lowercase character ‘n’ indicates completely randomized positions. (A) Occlusion promoters assayed for Figure 2. (B) Class I promoters assayed for Figure 5. In the main text we refer to the wild-type promoter with CRP at −61.5 bp as the lac* promoter. The lac* promoter served as the template for all of the promoters shown here. (C) Strong class I promoters assayed for Figure 8. (D) Class II promoters assayed for Figure 9.

Appendix 2—figure 1
Calibration of expression measurements with and without cAMP.

(A) Measurements of t+raw (in 250 µM cAMP) vs. t-raw (in 0 µM cAMP) for promoters in which the CRP binding site has been replaced by a non-functional ‘null’ site. As expected, these data lie close to the t+raw=t-raw diagonal (dotted line). (B) Upon closer inspection, however, we found that t+raw values consistently fell slightly below corresponding t-raw values. Using least-squares fitting we found that, on average, t+raw/t-raw = 0.852 (+0.056/−0.053), where uncertainties indicate a 95% confidence interval (reflecting 1.96 times the standard error of the mean in log space). To correct for this bias, we plot and fit models to t+=t+raw and t-=0.855×t-raw throughout this paper.
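
One way to obtain an average ratio with an interval of 1.96 standard errors in log space is sketched below. It treats the least-squares fit as a constant offset between log t+raw and log t-raw, which reduces to the mean log ratio; this reading is an assumption, and the data are synthetic rather than the measurements in panel A.

```python
import numpy as np

def log_space_ratio_ci(t_plus_raw, t_minus_raw, z=1.96):
    """Geometric-mean ratio and its confidence interval computed in log space."""
    log_ratio = np.log(np.asarray(t_plus_raw)) - np.log(np.asarray(t_minus_raw))
    mean = log_ratio.mean()
    sem = log_ratio.std(ddof=1) / np.sqrt(log_ratio.size)
    return np.exp(mean), np.exp(mean - z * sem), np.exp(mean + z * sem)

# Synthetic null-site measurements with a built-in 0.85 bias (hypothetical).
rng = np.random.default_rng(0)
t_minus_raw = np.exp(rng.uniform(np.log(1e-3), np.log(10.0), size=50))
t_plus_raw = 0.85 * t_minus_raw * np.exp(rng.normal(0.0, 0.1, size=50))

ratio, lower, upper = log_space_ratio_ci(t_plus_raw, t_minus_raw)
```

Because the interval is symmetric in log space, it is asymmetric after exponentiation, with the upper arm slightly longer than the lower arm.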

Appendix 4—figure 1
Derivation of the regimes of allelic manifolds.

Panels A-D show simulated induction curves for transcription t as a function of the RNAP binding factor P. Dashed lines indicate boundaries between the minimal and linear regimes of each curve, while dotted lines indicate boundaries between linear and maximal regimes. A formula for the value of P at each regime boundary is also shown. All simulations used tsat=1 a.u., tbg=10^−4 a.u., F=100, and P ranging from 10^−9 to 10^4. (A) Induction curve for unregulated transcription; see Equation 18. (B) Induction curve for transcription repressed by occlusion; see Equation 19. (C) Induction curve for transcription activated by stabilization (α=300); see Equation 20. (D) Induction curve for transcription activated by acceleration (α=10, β=30); see Equation 21. Panels E-G show how overlaps between the six regimes of two induction curves (three for t- and three for t+) result in five distinct regimes for the corresponding allelic manifold. (E) Regimes of the allelic manifold for occlusion, which is shown in Figure 1C. (F) Regimes of the allelic manifold for stabilization, which is shown in Figure 4C. (G) Regimes of the allelic manifold for acceleration, which is shown in Figure 7C.
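
The three regimes of an induction curve can be located with a short calculation. The boundary formulas shown in the figure are not reproduced in this caption, so the sketch below uses one natural choice as an assumption: for a curve of the form t = tsat·P/(K+P) + tbg, the minimal-to-linear boundary is where the linear term equals tbg, and the linear-to-maximal boundary is where P equals K. Under the same assumed models, K=1 for unregulated transcription, K=1+F for occlusion, and K=(1+F)/(1+αF) for stabilization.

```python
import math

def induction_curve(P, K, t_sat=1.0, t_bg=1e-4):
    """Assumed saturating form of an induction curve with half-saturation K."""
    return t_sat * P / (K + P) + t_bg

def regime_boundaries(K, t_sat=1.0, t_bg=1e-4):
    """Return (P_min_to_linear, P_linear_to_max) for the assumed curve."""
    return t_bg * K / t_sat, K

# Parameter values quoted in the caption.
F, alpha = 100.0, 300.0
unregulated = regime_boundaries(K=1.0)
occlusion = regime_boundaries(K=1.0 + F)
stabilization = regime_boundaries(K=(1.0 + F) / (1.0 + alpha * F))
```

Overlaying the boundaries of a t- curve and a t+ curve on the same P axis partitions it into five intervals, which is how the five regimes of each allelic manifold arise in panels E-G.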



Table 1
Summary of results for class I activation by CRP.

The α and ΔGα values listed here correspond to the values plotted in Figure 5D. The corresponding value inferred for the saturated transcription rate is tsat = 15.1 (+0.6/−0.5) a.u. Error bars indicate 68% confidence intervals; see Appendix 3 for details. n is the number of data points used to infer these values, while ‘outliers’ is the number of data points excluded in this analysis. For comparison we show the fold-activation measurements (i.e., t+/t-) reported in Gaston et al. (1990) and Ushida and Aiba (1990); ‘-’ indicates that no measurement was reported for that position.

Position (bp) | n | Outliers | ΔGα (kcal/mol) | α | t+/t- (Gaston) | t+/t- (Ushida)
Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Genetic reagent (E. coli) | JK10 | this paper | none | genotype: ∆cyaAcpdA
DNA reagent | | this paper | | cloning vector with BsmBI cut sites, ccdB cassette, lacZ reporter gene, kanamycin resistance, pSC101 origin
DNA reagent | and variants | this paper | none | reporter plasmids cloned from pJK47.419
 | cAMP | Sigma-Aldrich | A9501-1G | Adenosine 3’,5’-cyclic monophosphate, 1 gram
 | | | | β-D-1-thiogalactopyranoside, 1 gram
 | | | | 5 gram
assay or kit | PureLink Genomic DNA Mini Kit | | |
assay or kit | Nextera XT DNA Library Preparation Kit | | | 24 samples
Other | RDM | Teknova | M2105 | growth media: MOPS EZ Rich Defined Medium Kit, 5 liter
 | | MilliporeSigma | 71092–475 milliliters |
Other | Breathe-Easier film | USA Scientific | 9123–6100 | sterile, 100 per box
Other | Epoch 2 Microplate | | |
Software | analysis scripts | this paper | none | Available at https://github.com/jbkinney/17_inducibility (copy archived at https://github.com/elifesciences-publications/17_inducibility)

Data availability

All data used to make the figures are available in Supplementary file 1. The PSAM for RNAP, previously published by Kinney et al. (2010), is also provided in Supplementary file 1 (with permission). Raw data, processed data, and analysis scripts are also available at https://github.com/jbkinney/17_inducibility (copy archived at https://github.com/elifesciences-publications/17_inducibility). No datasets have been deposited in public databases as part of this work.

Additional files

Supplementary file 1

Numerical results plotted in the Figures and listed in Table 1.

Please refer to the ‘overview’ sheet within this workbook for a description of each data sheet therein.

Transparent reporting form
