Regulatory roles of Escherichia coli 5' UTR and ORF-internal RNAs detected by 3' end mapping

  1. Philip P Adams
  2. Gabriele Baniulyte
  3. Caroline Esnault
  4. Kavya Chegireddy
  5. Navjot Singh
  6. Molly Monge
  7. Ryan K Dale
  8. Gisela Storz (corresponding author)
  9. Joseph T Wade (corresponding author)
  1. Division of Molecular and Cellular Biology, Eunice Kennedy Shriver National Institute of Child Health and Human Development, United States
  2. Postdoctoral Research Associate Program, National Institute of General Medical Sciences, National Institutes of Health, United States
  3. Wadsworth Center, New York State Department of Health, United States
  4. Bioinformatics and Scientific Programming Core, Eunice Kennedy Shriver National Institute of Child Health and Human Development, United States
  5. Department of Biomedical Sciences, School of Public Health, University at Albany, United States
9 figures, 1 table and 5 additional files

Figures

Figure 1 with 2 supplements
Distribution of 3´ ends and putative sites of Rho termination.

(A) Schematic of classification of Term-seq 3´ ends and Rho termination sites relative to an annotated ORF. 3´ ends and termination sites were defined as: primary (purple colored, located on the …

Figure 1—figure supplement 1
RNA-seq approaches.

Schematic of total RNA-seq (modified from the RNAtag-seq methodology, Shishkin et al., 2015), Term-seq (modified from Dar et al., 2016) and DirectRNA-seq (Ozsolak and Milos, 2011). RNA 3´ end …

Figure 1—figure supplement 2
Analysis of Term-seq and DirectRNA-seq data.

(A) Principal component analysis plot to show correlation among total RNA-seq and Term-seq replicates. Read counts at annotated ORFs for total RNA-seq or the union of 3´ ends characterized from all …

Figure 2 with 1 supplement
Experimental validation of premature Rho termination.

(A) RNA-seq screenshot of the sugE (gdx) locus displaying sequencing reads from LB 0.4 total RNA-seq, LB 0.4 Term-seq and DirectRNA-seq ±BCM treatment. Total and Term-seq tracks represent an overlay …

Figure 2—figure supplement 1
Test of Rho-dependent termination in several genes.

(A) RNA-seq screenshot of the thiM locus displaying sequencing reads from LB 0.4 total RNA-seq, LB 0.4 Term-seq and DirectRNA-seq ±BCM treatment. Total and Term-seq tracks represent an overlay of …

Figure 3 with 1 supplement
Effect of spermidine on mdtUJI expression.

(A) Sequence of the mdtJI 5´ UTR. The transcription start site (green shaded nucleotide) determined by dRNA-seq (Thomason et al., 2015) and 3´ end (red shaded nucleotide) determined by Term-seq …

Figure 3—figure supplement 1
Amino acid conservation of mdtU uORF in selected gammaproteobacterial species.

Unannotated uORF (mdtU) sequences were selected by searching for short (20–40 amino acid) ORFs with <300 nt distance from the start of an mdtJ ortholog in 1742 sequenced gammaproteobacterial …

Figure 4 with 1 supplement
Effect of sRNA deletions on eptB, ompA, and chiP fragments.

(A) Sequence of documented region of sRNA-mRNA pairing. 3´ end determined by Term-seq is highlighted in red. Start codon of the corresponding ORF is indicated with green text. (B) Northern analysis …

Figure 4—figure supplement 1
Sequences of eptB, ompA and chiP 5´ UTRs.

Transcription start sites (nucleotide shaded green) determined by dRNA-seq (Thomason et al., 2015) and 3´ ends (nucleotide shaded red) determined by Term-seq (current study) are indicated. Start …

Figure 5 with 1 supplement
5´ UTR-derived sRNAs ChiZ and IspZ.

(A) RNA-seq screenshot of the ChiZ and IspZ loci displaying sequencing reads from the LB 0.4 growth condition from dRNA-seq (Thomason et al., 2015, HS2 samples), total RNA-seq and Term-seq. Total …

Figure 5—figure supplement 1
Predicted secondary structures and levels of 5´ derived ChiZ and IspZ sRNAs expressed from plasmids.

Secondary structures of ChiZ (A) and IspZ (B) predicted by sfold (http://sfold.wadsworth.org/cgi-bin/srna.pl; Ding et al., 2004). The base-pairing regions are highlighted in yellow, and predicted …

Figure 6
5´ UTR-derived sRNAs ChiZ and IspZ act as sRNA sponges.

(A) Northern analysis for ChiZ in WT (GSO989) and rhoR66S mutant (GSO990) cells. Cells were grown to OD600 ~0.4 or 2.0 after a dilution of the overnight culture. Total RNA was extracted, separated …

Figure 7
Detection of ORF-internal sRNAs.

(A) RNA-seq screenshots of the ftsI, aceK, rlmD, mglC, and ampG mRNAs containing putative internal (int) sRNAs. Sequencing reads from the LB 0.4 dRNA-seq (Thomason et al., 2015, HS2 samples), total …

Figure 8 with 2 supplements
ORF-internal sRNA FtsO acts as a sponge of the RybB sRNA.

(A) RIL-seq screenshot showing RybB chimeras at the ftsO locus. Data are from Hfq-FLAG LB RIL-seq performed 150 min after a dilution of the overnight culture, (Melamed et al., 2020, RIL-seq …

Figure 8—figure supplement 1
FtsO and RbsZ sponging induces ompC levels and RybB reciprocally affects FtsO.

(A) Northern analysis showing reciprocal effect of RybB on FtsO. RNA was extracted from WT (GSO982) cells harboring either the pBR or pBR-RybB at 150 and 360 min after dilution of the overnight …

Figure 8—figure supplement 2
Co-conservation of ftsO and RybB.

(A) Histogram of percent nucleotide sequence identity at wobble positions across the ftsI gene. Clustal (Madeira et al., 2019) was used to align the following species against the E. coli K12 MG1655 …

Author response image 1

Tables

Key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
| --- | --- | --- | --- | --- |
| Strain, strain background (Escherichia coli) | MG1655 (WT) | this study | N/A | see Supplementary file 4 for derivatives |
| Antibody | mouse monoclonal anti-FLAG-M2-HRP | Sigma-Aldrich | Cat#A8592 | WB (1:2000) |
| Antibody | rabbit polyclonal anti-OmpC | Biorbyt | Cat#orb6940 | WB (1:500) |
| Antibody | donkey polyclonal peroxidase labeled anti-rabbit | GE Healthcare | Cat#NIF824 | WB (1:1000) |
| Recombinant DNA reagent | pBRplac | Guillier and Gottesman, 2006 | N/A | see Supplementary file 4 for derivatives |
| Recombinant DNA reagent | pNM46 (pBRplac-lacI) | this study | N/A | see Supplementary file 4 for derivatives |
| Recombinant DNA reagent | pKD46 | Datsenko and Wanner, 2000 | N/A | see Supplementary file 4 for derivatives |
| Recombinant DNA reagent | pMM1 | Stringer et al., 2014 | N/A | see Supplementary file 4 for derivatives |
| Sequence-based reagent | northern probes and primers | this study | N/A | see Supplementary file 4 |
| Chemical compound, drug | spermidine | Sigma-Aldrich | Cat#S2626-1G | |
| Other | TRIzol reagent | Thermo Fisher Scientific | Cat#15596018 | RNA extractions |
| Other | RNA-sequencing reagents | Melamed et al., 2018 | N/A | |
| Other | ureagel-8 | National Diagnostics | Cat#EC-838 | acrylamide northern solution |
| Other | ureagel complete | National Diagnostics | Cat#EC-841 | acrylamide northern solution |
| Other | NuSieve 3:1 agarose | Lonza | Cat#50090 | agarose for northern blotting |
| Other | 37% formaldehyde | Fisher Scientific | Cat#BP531-500 | |
| Other | RiboRuler high range RNA ladder | Thermo Fisher Scientific | Cat#SM1821 | |
| Other | RiboRuler low range RNA ladder | Thermo Fisher Scientific | Cat#SM1831 | |
| Other | Zeta-Probe blotting membrane | Bio-Rad | Cat#1620159 | northern membrane |
| Other | ULTRAhyb-oligo hybridization buffer | Thermo Fisher Scientific | Cat#AM8663 | |
| Other | γ-32P ATP | PerkinElmer | Cat#NEG035C010MC | |
| Other | T4 polynucleotide kinase | New England Biolabs | Cat#M0201L | |
| Other | Illustra MicroSpin G-50 columns | GE Healthcare | Cat#27533001 | |
| Other | mini-PROTEAN TGX gels | Bio-Rad | Cat#456–1086 | |
| Other | EZ-RUN pre-stained Rec protein ladder | Fisher Scientific | Cat#BP3603-500 | |
| Other | nitrocellulose membrane | Thermo Fisher Scientific | Cat#LC2000 | western membrane |
| Other | SuperSignal West Pico PLUS chemiluminescent substrate | Thermo Fisher Scientific | Cat#34580 | |
| Software, algorithm | lcdb-wf | ‘lcdb-wf’. Dale et al., GitHub Repository | v1.5.3 | https://github.com/lcdb/lcdb-wf |
| Software, algorithm | sra-tools | SRA Toolkit Development Team | v2.9.1_1 | http://ncbi.github.io/sra-tools/ |
| Software, algorithm | cutadapt | Martin, 2011 | v2.3 | https://github.com/marcelm/cutadapt |
| Software, algorithm | fastqc | Wingett and Andrews, 2018 | v0.11.8 | https://qubeshub.org/resources/fastqc |
| Software, algorithm | bwa | Li and Durbin, 2010 | v0.7.17 | http://bio-bwa.sourceforge.net/ |
| Software, algorithm | bowtie2 | Langmead and Salzberg, 2012 | v2.3.5 | http://bowtie-bio.sourceforge.net/bowtie2 |
| Software, algorithm | samtools | Li et al., 2009 | v1.9 | http://www.htslib.org/ |
| Software, algorithm | subread | Liao et al., 2013 | v1.6.4 | http://subread.sourceforge.net/ |
| Software, algorithm | multiqc | Ewels et al., 2016 | v1.7 | https://github.com/ewels/MultiQC |
| Software, algorithm | picard | ‘Picard Toolkit.’ 2019. Broad Institute, GitHub Repository | v2.20.0 | https://github.com/broadinstitute/picard |
| Software, algorithm | deeptools | Ramírez et al., 2016 | v3.2.1 | https://deeptools.readthedocs.io/ |
| Software, algorithm | termseq-peaks | ‘termseq-peaks.’ 2020. NICHD-BSPC, GitHub Repository | N/A | https://github.com/NICHD-BSPC/termseq-peaks |
| Software, algorithm | bedtools | Quinlan and Hall, 2010 | v2.27.1 | https://bedtools.readthedocs.io/ |
| Software, algorithm | pybedtools | Dale et al., 2011 | v0.8.0 | https://github.com/daler/pybedtools |
| Software, algorithm | ucsc-toolkit | Kent et al., 2010 | v357 | https://doi.org/10.1093/bioinformatics/btq351 |
| Software, algorithm | biopython SeqIO | Cock et al., 2009 | v1.73 | biopython.org |
| Software, algorithm | rhoterm-peaks | ‘rhoterm-peaks’ 2020. GitHub repository | N/A | https://github.com/gbaniulyte/rhoterm-peaks |
| Software, algorithm | CLC Genomics Workbench | Qiagen | v8.5.1 | alignment of DirectRNA-seq reads |
| Software, algorithm | CLUSTAL | Madeira et al., 2019 | v2.1 | https://www.ebi.ac.uk/Tools/msa/clustalo/ |
| Software, algorithm | BOXSHADE | | v3.21 | https://embnet.vital-it.ch/software/BOX_form.html |
| Software, algorithm | Sfold | Ding et al., 2004 | v2.2 | http://sfold.wadsworth.org/cgi-bin/srna.pl |

Additional files

Supplementary file 1

3´ ends identified by Term-seq.

Curated 3´ ends for E. coli MG1655 (WT) grown to OD600 ~0.4 and 2.0 in LB, and to OD600 ~0.4 in minimal (M63) glucose medium. The genomic coordinate of the 3´ end (3´ end position), DNA strand, average RNA-seq read count of the 3´ end of both biological replicates, 3´ end classification (see Materials and methods), the gene annotation for the classification (details), and the sequence surrounding the 3´ end (40 bp upstream and 10 bp downstream, 3´ end nucleotide red and bolded) make up the columns of the table. The data for each growth condition are displayed on a separate tab.

https://cdn.elifesciences.org/articles/62438/elife-62438-supp1-v1.xlsx
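
The sequence column described for Supplementary file 1 (40 bp upstream and 10 bp downstream of the 3´ end) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 1-based coordinate convention, and the choice to raise an error near the sequence edges (rather than wrap around a circular chromosome) are assumptions made for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def three_prime_window(genome, pos, strand, upstream=40, downstream=10):
    """Extract the sequence around a 3' end, read 5'->3' on the RNA strand.

    genome : str, the forward-strand genome sequence
    pos    : int, 1-based coordinate of the 3' end nucleotide (assumed convention)
    strand : '+' or '-'
    The 3' end nucleotide sits at index `upstream` of the returned string.
    """
    i = pos - 1  # convert to a 0-based index
    if strand == "+":
        start, end = i - upstream, i + downstream + 1
    elif strand == "-":
        # On the minus strand, "upstream" lies at higher genomic coordinates.
        start, end = i - downstream, i + upstream + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    if start < 0 or end > len(genome):
        # Assumption: no wrap-around for a circular chromosome in this sketch.
        raise ValueError("window extends past the end of the sequence")
    window = genome[start:end]
    return window if strand == "+" else reverse_complement(window)
```

With the default arguments each window is 51 nt long, and `window[40]` is the 3´ end nucleotide that the table highlights.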
Supplementary file 2

Identification of Rho termination regions using DirectRNA-seq ±BCM.

All identified Rho termination regions are listed; a region is defined as one with at least one genomic coordinate with a significance score <1e−4. Rho scores were calculated for each genomic position by comparing DirectRNA-seq coverage in windows 800 nt upstream and downstream in the treated (+BCM) and untreated (-BCM) samples (see Materials and methods). The 3´ genomic coordinate with the highest Rho score, DNA strand, 3´ end classification (see Materials and methods), the gene annotation for the classification (details), the read coverage in the 800 nt windows upstream and downstream of the 3´ end position ±BCM, the Rho score, and the p-value from the Fisher’s exact test (significance score) make up the columns of the table. An ‘undefined’ Rho score indicates one that could not be calculated due to zero reads in the -BCM downstream region. A significance score of ‘N/A’ indicates that the significance score was too low to accurately report.

https://cdn.elifesciences.org/articles/62438/elife-62438-supp2-v1.xlsx
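
The Rho score and significance score described for Supplementary file 2 can be sketched as follows. The description above does not give the exact formula, so this sketch assumes the Rho score is the downstream/upstream coverage ratio with BCM divided by the same ratio without BCM, and that the significance score comes from a one-sided Fisher's exact test on the four window counts. All function names are hypothetical; the authors' implementation is the rhoterm-peaks repository listed in the key resources table.

```python
from math import comb

SIGNIFICANCE_CUTOFF = 1e-4  # threshold quoted in the file description


def rho_score(up_minus, down_minus, up_plus, down_plus):
    """Assumed form: (down/up with BCM) / (down/up without BCM).

    Returns None ('undefined') when the ratio cannot be computed, for
    example when there are zero reads in the -BCM downstream window.
    """
    denom = up_plus * down_minus
    if denom == 0:
        return None
    return (down_plus * up_minus) / denom


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null. Inputs must be
    non-negative integers (read counts).
    """
    n = a + b + c + d
    row1, col1 = a + b, a + c
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / comb(n, row1)


def assess_position(up_minus, down_minus, up_plus, down_plus):
    """Return (Rho score, p-value, passes the significance cutoff)."""
    score = rho_score(up_minus, down_minus, up_plus, down_plus)
    # Rows: +BCM, -BCM; columns: downstream, upstream coverage.
    p = fisher_exact_greater(down_plus, up_plus, down_minus, up_minus)
    return score, p, p < SIGNIFICANCE_CUTOFF
```

Under these assumptions, a position whose downstream coverage rises sharply with BCM gets a high Rho score and a small p-value. A region would then be reported when at least one of its positions falls below the cutoff.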
Supplementary file 3

Analysis of 3´ ends in 5´ UTRs and within coding sequences.

Term-seq identified 3´ ends for E. coli MG1655 (WT) that were between 200 nt upstream of an ORF and the corresponding stop codon. The genomic coordinate of the 3´ end (3´ end position), DNA strand, average RNA-seq read count of the 3´ end of both biological replicates, 3´ end classification (see Materials and methods), the gene annotation for the classification (details), the location of the 3´ end relative to an ORF, either upstream ORF or internal (ORF classification), and the gene annotation of the associated ORF (upstream ORF/internal details) make up the columns of the table. The data for each growth condition are displayed on a separate tab. For 3´ ends identified in WT E. coli grown to OD600 ~0.4 in LB, each 3´ end was also given a Rho score, an assessment of whether it is in a Rho termination region, and an intrinsic terminator score (see Materials and methods). The read coverage in 800 nt windows upstream and downstream of each 3´ end position ±BCM from the DirectRNA-seq data, used to calculate Rho scores, is presented. An ‘undefined’ Rho score indicates one that could not be calculated due to zero reads in the -BCM downstream region. Intrinsic terminator scores > 3.0 are suggestive of intrinsic termination (Chen et al., 2013). An ‘undefined’ intrinsic terminator score indicates one that could not be calculated because the sequence could not be folded into a recognizable secondary structure. Characterized regulatory elements in the LB 0.4 condition are also noted.

https://cdn.elifesciences.org/articles/62438/elife-62438-supp3-v1.xlsx
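
The ORF classification column described for Supplementary file 3 (3´ ends between 200 nt upstream of an ORF and its stop codon) can be sketched in Python. This is an illustration only: the function name, the 1-based inclusive coordinates, and the treatment of the boundary positions are assumptions, not details taken from the study.

```python
def classify_relative_to_orf(pos, strand, orf_start, orf_end, orf_strand,
                             max_upstream=200):
    """Classify a 3' end as 'upstream ORF', 'internal', or None.

    pos                : 1-based genomic coordinate of the 3' end
    orf_start, orf_end : 1-based inclusive genomic bounds of the ORF
                         (orf_start <= orf_end, stop codon included)
    Boundary handling (inclusive at -200 and at the last stop-codon
    nucleotide) is an assumption of this sketch.
    """
    if strand != orf_strand:
        return None  # only same-strand 3' ends are considered here
    if orf_strand == "+":
        offset = pos - orf_start      # negative means upstream of the start codon
        before_stop = pos <= orf_end
    else:
        offset = orf_end - pos        # the minus-strand ORF starts at orf_end
        before_stop = pos >= orf_start
    if -max_upstream <= offset < 0:
        return "upstream ORF"
    if offset >= 0 and before_stop:
        return "internal"
    return None
```

For a plus-strand ORF spanning 1000 to 2000, a 3´ end at 900 would be labeled 'upstream ORF', one at 1500 'internal', and one at 799 would fall outside the 200 nt window.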
Supplementary file 4

List of strains together with plasmids (tab 1) and oligonucleotides (tab 2) used in this study.

https://cdn.elifesciences.org/articles/62438/elife-62438-supp4-v1.xlsx
Transparent reporting form
https://cdn.elifesciences.org/articles/62438/elife-62438-transrepform-v1.docx
