(A) For the PR and PL libraries, the synthetic construct used to detect the effects of promoter mutations consisted of a yellow fluorescent marker (venus-yfp), preceded by a ribosomal binding site (RBS), under the control of either the PR or PL promoter (or a PR or PL promoter mutant). The system was isolated from the rest of the plasmid by a T1 terminator (hairpin). This construct was placed on a low-copy-number pZS* plasmid (SC101* origin) with kanamycin resistance, with Escherichia coli MG1655 as the host. (B) The expression of a green fluorescent protein (gfp) was under the control of a random 100-bp sequence consisting of two 32-bp-long random, non-expressing flanking sequences that were not mutated, and a 36-bp-long sequence that was mutated randomly, with each of the four nucleotides having a 25% chance of occurring at each position. This construct was placed on a pUA66 plasmid (SC101 origin), with E. coli NEB5α as the host. (C) Promoter mutants were cloned into the plasmid system using restriction/modification. The mutations were introduced at random, using pre-synthesized oligonucleotides with a fixed mutation rate (12% for PR, 9% for PL, and fully random for the 36N mutant library). The plasmids carrying mutant promoters were cloned into either MG1655 (PR and PL libraries) or NEB5α (36N library). (D) Each random mutant library was sorted by fluorescence-activated cell sorting (FACS) based on the fluorescence intensity detected at the single-cell level. Mutants in the PR and PL libraries were sorted into four bins, while the 36N library was sorted into 12 equidistant bins. 150-bp-long fragments containing the promoter region of each sorted sub-library were PCR-tagged, and each library was sequenced in bulk with 5 million total reads per library. (E) In each sequenced library, we retained only those mutants that had at least 30× coverage, and obtained the fluorescence distribution of each mutant across the bins.
(F) Flow cytometry measurements of 1 million mutants from each library, showing the distributions of fluorescence (as a proxy for gene expression levels). The vertical red dotted line separates the mutants with no measurable expression (corresponding to Figure 4A). The red dotted line and the solid lines separate the four bins used to sort the PR and PL libraries (no, low, intermediate, and high expression, from left to right). The dotted lines mark the boundaries of the additional bins used to sort the 36N library. (G) Mutation frequencies in the three experimental libraries, shown as the fraction of mutants with a given nucleotide at each position for the PR, PL, and 36N libraries. We did not observe any bias in the mutagenesis of the libraries. The consensus sequence for each library is provided underneath each plot. (H) Number of sequences in each library containing a spacer of a specific length. The reported counts are based only on the spacer length of the strongest binding site identified in each sequence. Note that the Extended model accounts for cumulative binding across all possible σ70-RNAP configurations binding to a given sequence, meaning that our libraries contained a much greater number of sequences with each spacer length than shown here. In fact, because we considered only the single strongest binding site here, the counts over-represent the optimal spacer length: it has the lowest energy, so binding configurations with that spacer length are more likely to be the most strongly bound.
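
The coverage filter in panel (E) and the per-position nucleotide fractions in panel (G) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the input format (one `(sequence, bin_index)` pair per read), and the toy data are hypothetical. Counting coverage as the total number of reads of a mutant summed over all bins is also an assumption, as is the omission of any normalization for unequal bin sizes or sequencing depths.

```python
from collections import Counter, defaultdict

MIN_COVERAGE = 30  # the 30x threshold stated in panel (E)


def bin_distributions(reads, min_coverage=MIN_COVERAGE):
    """Return {sequence: {bin_index: fraction_of_reads}}.

    `reads` is an iterable of (sequence, bin_index) pairs, one per read.
    ASSUMPTION: coverage is the total read count of a sequence over all bins.
    Sequences with fewer than `min_coverage` reads are dropped.
    """
    counts = defaultdict(Counter)
    for seq, bin_idx in reads:
        counts[seq][bin_idx] += 1

    result = {}
    for seq, per_bin in counts.items():
        total = sum(per_bin.values())
        if total >= min_coverage:
            result[seq] = {b: n / total for b, n in per_bin.items()}
    return result


def position_frequencies(sequences):
    """Fraction of sequences carrying A, C, G, or T at each position.

    ASSUMPTION: all sequences have the same length and use only ACGT.
    """
    sequences = list(sequences)
    if not sequences:
        return []
    n = len(sequences)
    freqs = []
    for i in range(len(sequences[0])):
        column = Counter(seq[i] for seq in sequences)
        freqs.append({nt: column.get(nt, 0) / n for nt in "ACGT"})
    return freqs


# Toy data (hypothetical): one mutant with 40 reads, one with only 5 reads.
toy_reads = [("ACGT", 0)] * 20 + [("ACGT", 1)] * 20 + [("AGGT", 3)] * 5
toy_dist = bin_distributions(toy_reads)
toy_freqs = position_frequencies(["ACGT", "AGGT"])
```

In this toy example, only the mutant with 40 reads passes the filter, and its reads are split evenly between bins 0 and 1. An unbiased, fully random library such as 36N would be expected to give fractions close to 0.25 for every nucleotide at every mutated position.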