1. Genetics and Genomics
  2. Microbiology and Infectious Disease
Download icon

An atlas of the binding specificities of transcription factors in Pseudomonas aeruginosa directs prediction of novel regulators in virulence

  1. Tingting Wang
  2. Wenju Sun
  3. Ligang Fan
  4. Canfeng Hua
  5. Nan Wu
  6. Shaorong Fan
  7. Jilin Zhang
  8. Xin Deng  Is a corresponding author
  9. Jian Yan  Is a corresponding author
  1. Department of Biomedical Sciences, City University of Hong Kong, Kowloon Tong, China
  2. School of Medicine, Northwest University, China
  3. Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Sweden
Tools and Resources
  • Cited 1
  • Views 2,457
  • Annotations
Cite this article as: eLife 2021;10:e61885 doi: 10.7554/eLife.61885

Abstract

A high-throughput systematic evolution of ligands by exponential enrichment assay was applied to 371 putative TFs in Pseudomonas aeruginosa, which resulted in the robust enrichment of 199 unique sequence motifs describing the binding specificities of 182 TFs. By scanning the genome, we predicted in total 33,709 significant interactions between TFs and their target loci, which were more than 11-fold enriched in the intergenic regions but depleted in the gene body regions. To further explore and delineate the physiological and pathogenic roles of TFs in P. aeruginosa, we constructed regulatory networks for nine major virulence-associated pathways and found that 51 TFs were potentially significantly associated with these virulence pathways, 32 of which had not been characterized before, and some were even involved in multiple pathways. These results will significantly facilitate future studies on transcriptional regulation in P. aeruginosa and other relevant pathogens, and accelerate to discover effective treatment and prevention strategies for the associated infectious diseases.

Introduction

Pseudomonas aeruginosa is an opportunistic human pathogen of considerable medical importance, as it causes pneumonia that is associated with high morbidity and mortality rates in immunocompromised patients, burn victims, and cystic fibrosis patients (Williams et al., 2010; Honda et al., 1977). Annually, more than 2 million patients are infected by the pathogen, and approximately 90,000 people die from it (Cross et al., 1983). The regulation of virulence-related pathways is mainly under the control of a large group of sequence-specific transcription factors (TFs) (Papavassiliou and Papavassiliou, 2016; Huang et al., 2019). TFs orchestrate the transcription of downstream genes and guide the expression of the genome by recognizing and occupying target promoter regions using their DNA-binding domains, thereby supporting or blocking the recruitment of RNA polymerase (Lambert et al., 2018; Wade, 2015). In eukaryotes, for example, the aberrant activity of human TFs or mutation in TF-binding sites (TFBSs) causes diseases such as cancers, cardiovascular disease, diabetes, obesity, and inflammation (Papavassiliou and Papavassiliou, 2016). The TFBSs on genomic DNA are predominantly determined by the sequence specificities of TFs, a fundamental property with which to unravel the functions of TFs and their roles in the mechanisms underlying disease etiology.

Although nearly 400 transcriptional regulators were previously predicted, only approximately 30 regulators have been characterized as virulence-associated TFs in P. aeruginosa over the past decades (Huang et al., 2019). To date, less than 5% of the P. aeruginosa TFs have been profiled for DNA-binding specificities, and thus, the downstream genes or upstream regulators of TFs remain largely unknown. To fill this gap, we applied a high-throughput systematic evolution of ligands by exponential enrichment (HT-SELEX) assay (Jolma et al., 2013) to all putative TFs (371 TFs) in the P. aeruginosa PAO1 genome, and successfully obtained 199 position weight matrix (PWM) models describing the DNA-binding specificities of 182 TFs. With the obtained PWM models, we scanned the PAO1 genome and reported 33,709 putative TFBSs that were highly enriched in the intergenic regions encompassing regulatory elements compared with gene body regions (odds ratio = 11.75). TF–target interactions were established for nine virulence-associated pathways, which enabled global decoding of the pathogenic regulatory networks of this microorganism. We validated some binding sites through a series of biochemical and genetic experiments and, interestingly, identified novel TFs potentially implicated in pathogenesis. This study provides an unprecedented scale of high-quality data on TF-binding specificity in a single bacterium and a global view of transcriptional regulatory relationships in P. aeruginosa, which constitute a major step toward deciphering the regulatory mechanisms of virulence. The findings are expected to substantially facilitate the development of effective therapies for the associated infectious disease.

Results

HT-SELEX reveals binding specificities of 182 TFs in P. aeruginosa

According to the existing annotations in ‘Pseudomonas Genome DB’ (https://www.pseudomonas.com/) (Winsor et al., 2016), the P. aeruginosa PAO1 genome contains 371 putative TFs, which can be classified into 29 function-associated families (El-Gebali et al., 2019; Pérez-Rueda and Collado-Vides, 2000). The vast majority of P. aeruginosa TFs (269/371) belong to six families: the LysR family, AraC family, LuxR family, OmpR family, TetR family, and GntR family (Supplementary file 1A–C). To determine the DNA-binding specificity, we performed a well-established HT-SELEX assay (Jolma et al., 2013) for all the 371 putative TFs, including 262 (71%) probable transcriptional regulators, 64 (17%) previously annotated TFs, and 45 (12%) cognate response regulators (RRs) in the two-component systems (TCS) (King et al., 1954) with predicted DNA-binding characteristics. Briefly, each full-length protein fused with a C-terminal 6×His-tag was expressed in and purified from Escherichia coli. The TFs were then subjected to four rounds of HT-SELEX enrichment, starting with a double-stranded DNA input library consisting of 40 bp randomized sequences adapted to the Illumina or BGI parallel sequencing system (see Figure 1A, Figure 1—figure supplement 1A–D, Supplementary file 1D and Materials and methods for details of the input design and experimental procedure). The sequencing reads from each consecutive HT-SELEX round were analyzed using a previously developed ‘multinomial’ algorithm (Nitta et al., 2015), which finally generated 199 binding profiles (PWM models) for 182 TFs. Some of these TFs were found to bind to DNA in homodimeric modes with different spacing and monomer orientations (Jolma et al., 2013), each represented by a PWM model (Stormo, 2013; Figure 1A, Supplementary file 2). Twelve independent replicates with separate protein purification and input DNA synthesis procedures were generated, and each pair of replicates showed virtually identical binding specificities, thus demonstrating the high reproducibility of HT-SELEX findings (highlighted in Supplementary file 2). In the subsequent analyses, we only included 199 unique PWM models by retaining one PWM model from each pair of technical replicates. Of the top six largest TF families, the OmpR, GntR, and LysR families demonstrated motif enrichment for more than 50% of their TF members (71%, 61%, and 52%, respectively; Supplementary file 1C). Taken together, HT-SELEX generated TF-binding motifs from 23 different families, with an overall success rate of 49% (182/371) for the entire TF repertoire of P. aeruginosa (Supplementary file 1B,C).

Figure 1 with 1 supplement see all
Summary of the HT-SELEX results in P. aeruginosa.

(A) Schematic description of protein expression and HT-SELEX procedure and output. (B) Network analysis of similarity of the obtained PWMs. Diamonds indicate TF genes, circles indicate individual PWMs. Edges are drawn between a TF and its PWM model, and between similar models if SSTAT similarity score >1.5e-05. To save space, ‘PA’ is omitted in all TFs names. The names of RRs of TCS are marked in pink font. (C) Comparison of the binding motifs of three TFs (LasR, RsaL, and PhoB) obtained from HT-SELEX (upper) and ChIP or cross-species sequence alignment methods (lower). Arrows indicate half-sites in dimeric sites. Also see Figure 1—figure supplement 1, Supplementary file 1.

TF classification by DNA-binding specificity

The DNA-binding specificity of a TF determines where it binds to the genome and which genes it regulates, and this property has notable effects on the physiological activity of the organism. We next generated TF regulatory networks to compare their sequence preferences and grouped the 182 TFs according to the similarity of their DNA-binding specificity. The similarity analysis using SSTAT (Jolma et al., 2013; Pape et al., 2008) (see Materials and methods) revealed that 133 PWM models from 116 TFs formed 38 interconnected subnetworks, each containing at least two TFs (Figure 1B). In addition, we observed 66 isolated subnetworks without significant connections to any other TF, which illustrated the generally diverse DNA-binding specificity of most TFs in P. aeruginosa (Figure 1B). In summary, the network analysis categorized 104 clusters of different DNA sequence preferences for 182 TFs in P. aeruginosa (Supplementary file 3).

Most TFs bind to DNA sequences in a homodimeric manner

A comparison between motifs previously established using different methods, e.g. chromatin immunoprecipitation (ChIP) (Gilbert et al., 2009; Shao et al., 2018) and cross-species consensus sequence alignment (Kang et al., 2017), and those identified in the present study revealed identical or similar motifs for LasR (Gilbert et al., 2009), RasL (Kang et al., 2017), and PhoB (Huang et al., 2019; Bielecki et al., 2015), demonstrating the high quality of PWM models obtained from our HT-SELEX analysis (Figure 1C). In general, HT-SELEX-generated motifs were also longer than the in vivo motifs generated from ChIP data, most likely because these in vivo motifs, compared with HT-SELEX-generated motifs, were derived from a much smaller number of binding sites owing to the small genome size. We could clearly identify two tandem homodimeric half sites (TAGCT) in the HT-SELEX-derived motif of LasR, whereas its ChIP-derived motif turned out sparse and contained only part of the monomeric site (CT), indicating the potentially poor quality of the previously known in vivo motifs.

Analysis of the PWM model length revealed that the motifs ranged from 9 to 28 bp in P. aeruginosa, with the most prevalent length being 16 bp (Figure 2A, Supplementary file 2). These long binding sites indicate that many TFs tend to bind to DNA sequences in a homodimeric manner. Indeed, the vast majority (179/199) of the PWM models displayed homodimeric-type binding (Figure 2B, Supplementary file 2), whereas only approx. 10% (20/199) of the PWM models exhibited monomeric binding specificity. Among the 20 monomeric PWM models, 16 TFs had only one PWM model while some TFs, such as PA1241, yielded two monomeric PWM models with different spacing between the two half sites (Figure 1B, Supplementary file 3). In addition, half of the TFs with monomeric motifs (nine TFs) belonged to the LysR and AraC families, the two largest families of TFs in P. aeruginosa PAO1. The monomeric sites were generally shorter than the dimeric sites and were therefore more prevalent in the genome, allowing higher flexibility in the transcriptional regulation of a broad range of genes. This suggests that these monomeric TFs play multiple regulatory roles. Most TFs exhibited two identical protein molecules bound in opposite orientations on different DNA strands forming a head-to-head homodimer, whereas only 16 TFs showed a consecutive binding orientation in the same direction (head-to-tail) (Figure 2B). Early studies have shown that some TF dimer formation events depend on the DNA molecules they interact with, and display strong enrichment for specific monomer orientation and spacing, unlike independent binding events by two monomers that may allow any spacing or orientation between them (Jolma et al., 2013; Jolma et al., 2015). For example, the TF PhoB binds to DNA as a homodimer in a head-to-tail consecutive orientation, with a ‘GTCA(C/T)’ monomer sequence preference spaced by a stretch of 6 bp AT-rich nucleotides (Figure 1C), supported by independent sets of ChIP-seq experiments (Huang et al., 2019; Bielecki et al., 2015). Consistently, many TFs have been confirmed to act as dimers when binding to DNA, such as LasR (Bottomley et al., 2007; Fan et al., 2013), RsaL (Kang et al., 2017), PsrA (Kang et al., 2009; Kojic et al., 2002), FleQ (Su et al., 2015), and QscR (Wysoczynski-Horita et al., 2018). Interestingly, most of the head-to-tail homodimeric TFs were found to belong to the OmpR family, which is also the main family for the RRs in the TCS (Figure 1B, Supplementary file 1B and Supplementary file 2). The head-to-head orientation is generally preferred by TFs across species, including humans (Jolma et al., 2015).

Figure 2 with 12 supplements see all
Comparison of different TF-binding modes.

(A) Histogram shows the distribution of the lengths of all PWM models, using red for odds numbers and blue for even numbers for better illustration. (B) Pie chart shows the category of different TF-binding modes. Classification of all binding models into non-repetitive sites (monomer) and sites with two similar subsequences (dimer). The dimeric types are further classified as head-to-head (two TF protein molecules bind to opposite orientation on DNA) and head-to-tail (two TF protein molecules bind consecutively on the same orientation on DNA). (C) Bar chart shows the number of the binding sites per TF. Note that most of TFs target fewer than 100 genes, while eight TFs exceptionally bind to more than 1000 sites in the genome. The inset histogram shows the prevalence of TFs with the corresponding number of predicted TFBSs. (D) The position annotation of binding sites of all 182 TFs in the P. aeruginosa genome using pie charts. The pie chart area is proportional to the percentage of predicted binding site location for all TFs, either inside (blue) or outside (orange) gene body regions. The inset shows the fraction of gene body (blue) and intergenic (orange) regions in the genome, reflected by the area of the two colors in the pie chart. (E) The number of TFs potentially significantly associated with nine virulence-associated pathways. The corresponding transcriptional regulatory network and validation details for each pathway are indicated in the parenthesis (display item). Note that newly associated TFs indicate that the TFs are uncharacterized genes. Also see Figure 2—figure supplement 12.

To predict direct interactions between TFs and DNA sequences, we first used the FIMO software (Grant et al., 2011) to scan genome-wide TFBSs for all of the 182 TFs with available PWM models in the P. aeruginosa PAO1 reference genome and identified 33,709 significant putative TFBSs (p<1.0e-5). For individual TFs, the number of TFBSs differed substantially, with more than 36% (65/182) of TFs predicted to bind to over 100 sites in the 6.3 Mb genome (Figure 2C, Supplementary file 4). The putative TFBSs were highly enriched in the intergenic regions than in the gene body regions (negative binomial test, p<2.2e-16), with 57% of them densely located within only a 10% fraction of the genome (Figure 2D). Some TFs were predicted to bind to only intergenic regions, such as PA4776 and PA5511 (Figure 1—figure supplement 1E). To verify the PWM-predicted binding sites, we performed electrophoretic mobility shift assays (EMSAs) using recombinant TF proteins and cloned double-stranded DNA fragments containing the predicted genomic binding sites for the tested TFs. This method was similar to HT-SELEX but avoided exponential competition between DNA sequences. EMSA could be more sensitive than HT-SELEX in detecting some weak binding sites but was limited in throughput in terms of the number of DNA sequences analyzed. In total, EMSA successfully validated 62 pairs of HT-SELEX motif-predicted TF–DNA interactions, including the binding modes (monomeric vs homodimeric binding) of the TFs in each interaction and various monomer spacings and orientations, thus demonstrating the high quality of the data set (Figure 2—figure supplements 111).

Previous studies have shown that the vast majority of TF binding within the gene body region has little or no effect on the transcriptional level in prokaryotes (Shimada et al., 2008); therefore, we primarily focused on TFBSs in the intergenic regions in the subsequent analyses. We were particularly interested in investigating the transcriptional regulatory program in relation to the virulence-associated growth and pathogenesis of P. aeruginosa. To colonize and overwhelm host tissues, nine pathways function in P. aeruginosa to exert its virulence, including biofilm production (Whiteley et al., 2001), quorum sensing (QS) (Lee and Zhang, 2015), Type VI (Ho et al., 2014) and Type III secretion systems (T6SS and T3SS, respectively) (Hauser, 2009; Hovey and Frank, 1995; Brutinel et al., 2008), motility (Coleman et al., 2020), antibiotic resistance (Chen et al., 2010), siderophores (Little et al., 2018), stringent response (SR) and persistence (Moradali et al., 2017; Goodman et al., 2004; Brauner et al., 2016), and oxidative stress resistance (Lan et al., 2010). From the literature, we comprehensively summarized genes implicated in these nine virulence-associated pathways (Supplementary file 5), and then applied bedtools (Wu and Jin, 2005) to annotate all TFBSs that occurred in the promoter regions of genes involved in these pathways. Our prediction of the binding sites of most TFs in the P. aeruginosa genome using HT-SELEX and EMSA enabled us to systematically depict the transcriptional regulatory networks and identified key players of these important pathways. We carried out enrichment analysis of TF-binding sites located in promoter of genes involved in various functional pathways by the hypergeometric test: a TF was annotated association with a virulence-associated pathway when its binding sites were significantly enriched within promoters of genes in that pathway (FDR < 0.05). Consequently, we managed to associate 51 unique TFs with the nine virulence-associated pathways, 32 TFs of which had not been functionally characterized before (Figure 2E; Figure 2—figure supplement 12).

PA0225 (ErfA) is a novel regulator of biofilm production

Biofilm is a complex community of diverse or single types of bacterial colonies embedded in an extracellular polymeric matrix that contains water, exopolysaccharides, extracellular DNA, proteins, and type IV pili (Tseng et al., 2018; Burrows, 2012; Laventie et al., 2019). P. aeruginosa produces biofilms to protect itself from host defense and antimicrobial agents, which helps enhance its pathogenicity (Mulcahy et al., 2014). We were interested in evaluating transcriptional regulation during biofilm production, which can be a critical step in development of strategies to control bacterial pathogenesis. To date, approximately 20 TFs have been reported to influence biofilm production (Kang et al., 2017; Badal et al., 2020; Baraquet and Harwood, 2013; Hickman and Harwood, 2008; Baraquet et al., 2012; Dieppois et al., 2012; Parkins et al., 2001; Petrova and Sauer, 2009; Vallet et al., 2004; Monds et al., 2001; Ernst et al., 1999; Macfarlane et al., 1999; Ramsey and Whiteley, 2004; McPhee et al., 2006; Liang et al., 2008; Nicastro et al., 2009; Mikkelsen et al., 2009; Mukherjee et al., 2017; Finelli et al., 2003; Damron et al., 2009; Kong et al., 2015; Deretic and Konyecsni, 1990; Balasubramanian et al., 2012; Hammond et al., 2015; Petrova and Sauer, 2010; Petrova et al., 2011; Guragain et al., 2016; Sarkisova et al., 2005; Yang et al., 2019; Yeung et al., 2011; Zhao et al., 2016). In our analysis, 57 putative TFs were predicted to regulate genes involved in biofilm production, including nine previously known regulators such as FleQ (Baraquet and Harwood, 2013; Hickman and Harwood, 2008), AlgB (Chand et al., 2012), and AmrZ (Jones et al., 2014; Figure 3A, Figure 2—figure supplement 4). Note that we were not able to recover any motif for the 11 other TFs that were known to be implicated in biofilm production. We reason that post-translational modification or protein–protein interaction may be required for their DNA binding.

Transcriptional regulatory network in biofilm production pathway.

(A) Network illustrates part of the regulatory relationship between TFs and their target genes in the biofilm pathway. Circles indicate TF proteins, and squares indicate target genes. Diamond highlights the gene with auto-regulation. Red arrows show that the binding sites are located in the putative promoters of the target genes, establishing the regulatory relationship. Black arrows highlight the binding with additional experimental validation. To save space, ‘PA’ is removed in the names of all TFs and their target genes. The dashed ovals and letters highlight regulatory relation validated in the corresponding figures and panels. (B) Comparison of binding motif of AmrZ derived by HT-SELEX (‘SELEX’) in the current study with a motif developed from a previous ChIP-seq study (upper). Arrows indicate half-sites in dimeric binding. Binding of AmrZ to a predicted site in the promoter of the psl operon is supported by a peak identified by a previous ChIP-seq study (lower) (Jones et al., 2014). (C) Electrophoretic mobility shift assay (EMSA) experiment validates the binding of PA0225, with its motif shown at the upper panel, to the promoter of pelABCDEFG. Arrows indicate half-sites in dimeric binding. Note that a binding-caused up-shift of the DNA band in gel is observed. By contrast, a negative control is used showing no binding to PA0225. (D) RT-qPCR shows that the transcription of pelA is significantly lower in PA0225 mutant cells compared with the wild-type cells (***p<0.001, Student’s t-test). ‘Endo. PA0225’ indicates endogenous expression of PA0225. ‘Ecto. PA0225’ indicates the ectopic expression of PA0225 delivered by the transformed plasmid pAK1900. Three technical replicates were performed. (E) The biofilm formation detection in the wild-type, PA0225 mutant, and PA0225 complemented strains using a crystal violate staining assay. ‘Endo. PA0225’ indicates endogenous expression of PA0225. ‘Ecto. PA0225’ indicates the ectopic expression of PA0225 by the transformed plasmid pAK1900. (F) The quantification of biofilm production in the wild-type, PA0225 mutant, and PA0225 complemented strains using a crystal violate staining assay. ‘Endo. PA0225’ indicates endogenous expression of PA0225. ‘Ecto. PA0225’ indicates the ectopically expression of PA0225 by the transformed plasmid pAK1900. Three technical replicates were conducted (*p<0.05; **p<0.01, Student’s t-test).

Biofilm encases bacteria in exopolysaccharide, which is composed of the key polysaccharides Pel and Psl, among others (Friedman and Kolter, 2004; Ryder et al., 2007). In mammals, TFs are known to collaboratively bind to the regulatory elements in dense clusters (Yan et al., 2013;Chronis et al., 2017), and thereby synergistically regulate the transcription of important genes (Hnisz et al., 2013). Here, we observed a similar phenomenon of co-binding among the P. aeruginosa TFs. For instance, multiple TFs, including AmrZ, FleQ, and two uncharacterized TFs, PA5293 and PA4547, were predicted to bind to the same psl operon (Figure 3A, Supplementary file 4). Consistently, AmrZ was previously reported to bind to the psl operon and repressed its transcription, consequently decreasing biofilm production (Jones et al., 2013). A very strong AmrZ ChIP-seq peak was identified for the psl promoter containing our PWM-predicted binding site (Coordinates: 2,453,511–2,453,531) (Jones et al., 2014Figure 3B), supporting the precise prediction of the binding site by our PWM motif. In addition, FleQ was previously observed to bind to the psl operon and consequently decrease biofilm production (Hickman and Harwood, 2008; Jones et al., 2013; Baraquet and Harwood, 2016). Our network confirmed the binding and suggested the precise binding site of FleQ in the promoter of the psl operon (Figure 3A).

In addition to the psl operon, we found that FleQ and three uncharacterized factors, PA0225 (ErfA) (Trouillon et al., 2020), PA0191, and PA1241, putatively bound the promoter of another biofilm-associated operon pelABCDEFG (Figure 3A). The binding of FleQ to the promoter of pel clusters had been verified by a footprint assay in a previous study (Baraquet and Harwood, 2016). We aimed to verify the regulatory roles of the uncharacterized TFs in biofilm production, given their predicted binding to the promoter of the pelABCDEFG operon. We first confirmed the direct binding between PA0225 and the pel operon using EMSA (Figure 3C). Next, we generated a PA0225-deficient clone of P. aeruginosa. The deletion of PA0225 significantly decreased the transcriptional activity of pelA down to only one-third of that in wild-type (WT) cells. The depleted pelA expression could be completely restored by ectopic overexpression of PA0225 in the mutant cells (Figure 3D). These findings support the specific regulatory role of PA0225 in pelA transcription. It is known that P. aeruginosa biofilm production is sensitive to the cellular abundance of PelA (Friedman and Kolter, 2004; Ryder et al., 2007), and therefore we attempted to investigate whether the deletion of PA0225 could consequently lead to deficiency in biofilm formation, given that PA0225 was a potential regulator of pelA expression. We examined and quantified biofilm production in PA0225-knockout cells using a crystal violet staining assay. We found that the intensity of the crystal violet staining ring formed by the mutant cells was significantly lighter than that formed by WT P. aeruginosa cells. Like the pelA transcription, the cell adherence phenotype in the knockout cells could also be effectively recovered by overexpressing PA0225 via transformed pAK1900 plasmid, supporting the role of PA0225 in regulating biofilm production (Figure 3E,F).

To comprehensively validate the list of TFs putatively associated with biofilm production, we obtained transposon-mediated mutant strains of 57 putative biofilm-associated TFs (Jacobs et al., 2003) except PA2121, PA2118, and PA2028, due to lack of availability. Using a crystal violet assay, we confirmed that the depletion of known regulators such as AmrZ, FleQ, or PA0191 significantly affected biofilm formation (Figure 2—figure supplement 4B). Similar to other prokaryotic TFs, both activating and repressing phenotypes could be observed upon manipulating TF binding in P. aeruginosa (Yang et al., 2019; Poole et al., 1996; Kawalek et al., 2019; Ha et al., 2004; Wu and Jin, 2005; Jin et al., 2011; Daddaoua et al., 2017; Chugani et al., 2001; Yan et al., 2019). However, we observed that most mutant strains did not show an overt biofilm-associated phenotype, even some of those previously known biofilm-regulatory TFs such as RsaL (Rampioni et al., 2009), CprR (Badal et al., 2020), AlgB (Chand et al., 2012; Mukherjee et al., 2019), CdpR (Zhao et al., 2016), BqsR (Synonym: CarR) (Guragain et al., 2016; Sarkisova et al., 2005), and PA3782 (Finelli et al., 2003). We reasoned that unlike the knockout deletion that fully removed the gene products, the transposon insertion mutants could influence the expression of more than one gene because of the polar mutation effects on the expression of downstream genes, leading to less predictable phenotypes.

TF-target network analysis reveals novel QS regulators

QS is a bacterial cell–cell communication system that fine-tunes the expression of hundreds of genes to produce, release, and recognize signaling molecules to monitor cell numbers and synchronize group behaviors (Lee and Zhang, 2015; Williams and Cámara, 2009; Høyland-Kroghsbo et al., 2017), while mediating cross-talks with a variety of virulence pathways involved in the aforementioned biofilm production and other important pathways such as T6SS, T3SS, motility, and antibiotic resistance (Lee and Zhang, 2015; Williams and Cámara, 2009). By using the PWM models of TFs to scan the genome, we predicted more than 100 TF-target gene relationships that occurred in the QS system, which involved a total of 49 TFs. In the putative QS network, four TFs had been previously annotated with regulatory roles in QS: Anr (Hammond et al., 2015), PsrA (Wells et al., 2017), PhoB (Blus-Kadosh et al., 2013), and RsaL (Kang et al., 2017; Figure 4A, Figure 2—figure supplement 5A). We predicted that an uncharacterized TF, PA1241, bound to the promoter of a QS-associated gene (Figure 4A). HT-SELEX generated a head-to-tail homodimeric motif for PhoB that was highly similar to a previously reported motif (Bielecki et al., 2015; Figure 1C). Our analysis showed that PhoB putatively conferred an auto-regulatory activity, with a predicted binding site on its own promoter (Figure 4A, Supplementary file 4). When we inspected the in vivo PhoB binding in the intergenic region upstream of its own transcription starting site, a very strong ChIP-seq peak was identified with its summit close to that of our PWM-predicted binding site (coordinate: 6,299,632–6,299,648) (Bielecki et al., 2015; Figure 4B).

TF-target networks in QS pathway.

(A) Network illustrates part of the regulatory relationship between TFs and their target genes in the QS pathway. The rest of the part is shown in Figure 2—figure supplement 5. Circles indicate TF proteins, and squares indicate target genes. Diamond highlights the gene with auto-regulation. Red arrows show that the binding sites are located in the putative promoters of the target genes, establishing the regulatory relationship. Black arrows highlight the binding with additional experimental validation. To save space, ‘PA’ is removed in the names of all TFs and their target genes. The dashed ovals and letters highlight regulatory relation validated in the corresponding figures and panels. (B) Arrows indicate half-sites in dimeric binding. Binding of PhoB to a predicted site in its own promoter, which was supported by a ChIP-seq peak. Input signal is shown for reference. The genomic coordinates are shown above the track and the genes (blue blocks) are indicated below the track. The exact binding site identified by our PhoB PWM is also highlighted. (C) Electrophoretic mobility shift assay (EMSA) and RT-qPCR validation of the predicted binding of PA2497 in the promoter of qslA. Left, EMSA experiment validates the binding of PA2497, with its motif shown at the upper panel, to the promoter of qslA. Arrows indicate half-sites in dimeric binding. Note that a binding-caused up-shift of the DNA band in gel was observed. By contrast, a negative control is used showing no binding to PA2497. Right, RT-qPCR shows that the qslA transcription is significantly lower in PA2497 mutant cells compared with the wild-type cells (*p<0.05, Student’s t-test). ‘Endo. PA2497’ indicates endogenous expression of PA2497. ‘Ecto. PA2497’ indicates the ectopic expression of PA2497 by the transformed plasmid. Three technical replicates were conducted. (D) EMSA and RT-qPCR validation of the predicted binding of PA1413 in the promoter of phzH. Left, EMSA experiment validates the binding of PA1413, with its motif shown at the upper panel, to the promoter of phzH. Arrows indicate half-sites in dimeric binding. Note that a binding-caused up-shift of the DNA band in gel was observed. By contrast, a negative control is used showing no binding to PA1413. Right, RT-qPCR shows that the phzH transcription is significantly lower in PA1413 mutant cells compared with the wild-type cells (*p<0.05, Student’s t-test). ‘Endo. PA1413’ indicates endogenous expression of PA1413. ‘Ecto. PA1413’ indicates the ectopic expression of PA1413 by the transformed plasmid. Three technical replicates were conducted.

Multi-TFs-mediated gene co-regulation was also commonly detected in the QS pathway. Co-binding sites of TFs were predicted in at least 10 promoters of QS-associated genes, including the well-studied genes, for example qlsA, phzH, lasR, and phoB, and some previously uncharacterized genes, such as PA0506 and PA4674. In the QS pathway, LasR is a key regulator of the expression of more than 300 genes, while QslA is an anti-activator that interacts with LasR and prevents it from regulating its downstream targets (Gilbert et al., 2009; Wade et al., 2005; Ueda and Wood, 2009). The promoter region of the qslA gene was putatively co-occupied by eight different TFs (Figure 4A). EMSA results confirmed the binding of at least six TFs to this putative regulatory element, whereas the binding-caused gel shift was not observed for negative controls randomly selected from genomic loci without predicted binding sites (Figure 4C, Figure 2—figure supplement 5B–F). To further explore the regulatory role of uncharacterized TF binding to this promoter, we generated a clean deletion mutant of TF PA2497 that exhibited a significant decrease in the transcription of qslA compared to the WT strain. The deficiency could be effectively restored by ectopic overexpression of PA2497 in the mutant strain (Figure 4C). Similarly, TF PA1413 was predicted to bind to the promoter region of phzH, which controls the synthesis of the well-known QS-mediated virulence factor phenazine (Mavrodi et al., 2001; Liang et al., 2011). Our EMSA result verified PA1413 binding to the promoter region of phzH in vitro (Figure 4D). PA1413 deletion significantly reduced the transcriptional level of phzH, which could also be reinstated by ectopically expressing PA1413 in the knockout cells (Figure 4D).

TFs involved in T6SS regulation

T6SS and T3SS are the two major types of effector protein-secreting apparatuses that contribute to strengthening the virulence of P. aeruginosa. T6SS is a bacteriophage-like membrane protein complex that injects T6SS effector proteins into target eukaryotic cells to cause host damage or delivers toxins into other prokaryotic cells to take control over inter-bacterial community competition, and is present in more than 200 Gram-negative bacterial species, including P. aeruginosa (Hood et al., 2010; Russell et al., 2011; Russell et al., 2013). We wired the interactions between TFs and T6SS-related genes and identified 37 TFs involved in the regulation of T6SS, 24 of which were previously uncharacterized (Figure 5A). Even for some of the previously known TFs, such as AmrZ, SphR, and SouR, we predicted new targets or new transcriptional regulatory relationships in the T6SS pathway. For example, TF SphR is responsive to host-derived sphingosine, which strengthens P. aeruginosa survival in mouse lungs (LaBauve and Wargo, 2014). Our data suggests that both SphR and the uncharacterized TF PA0048 bind to the promoter of the tagJ1-tssE1-tssF1-tssG1-clpV1-vgrG1 operon, implying a cooperative function between SphR and PA0048 (Figure 5A).

Transcriptional regulation in T6SS pathway.

(A) Network illustrates the regulatory relationship between TFs and their target genes in T6SS pathway. Circles indicate TF proteins, and squares indicate target genes. Diamond highlights the gene with auto-regulation. Red arrows show that the binding sites are located in the putative promoters of the target genes, establishing the regulatory relationship. Black arrows highlight the binding with additional experimental validation. To save space, ‘PA’ is removed in the names of all TFs and their target genes. The dashed ovals and letters highlight regulatory relation validated in the corresponding figures and panels. (B) Left panel shows an electrophoretic mobility shift assay (EMSA) experiment to validate the binding of PhoB, with its motif shown at the upper panel, to the promoter of amrZ (left). Arrows indicate half-sites in dimeric binding. Note that a binding-caused up-shift of the DNA band in gel was observed. By contrast, a negative control was used showing no binding to PhoB. Right panel shows the binding of PhoB to a predicted site in the promoter of amrZ, which was supported by a ChIP-seq peak. Input signal is shown for reference. The genomic coordinates are shown above the track and the genes are indicated below the track. The exact binding site identified by our PhoB PWM is also highlighted. (C) EMSA experiment validates the binding of PA1315, with its motif shown at the upper panel, to the promoter of amrZ. Arrows indicate half-sites in dimeric binding. Note that a binding-caused up-shift of the DNA band in gel was observed. By contrast, a negative control was used showing no binding to PA1315. (D) EMSA experiment validated the binding of PA3249, with its motif shown at the upper panel, to the promoter of amrZ. Arrows indicate half-sites in dimeric binding. Note that a binding-caused up-shift of the DNA band in gel was observed. By contrast, a negative control was used showing no binding to PA3249. (E) EMSA validation of the predicted binding of PA0535, with its motif shown at the upper panel, to the promoter of vgrG4. Arrows indicate half-sites in dimeric binding. Note that a binding-caused up-shift of the DNA band in gel was observed. By contrast, a negative control was used showing no binding to PA0535.

There are three major clusters, H1-T6SS, H2-T6SS, and H3-T6SS, in the T6SS system (Mougous et al., 2006). AmrZ may play a key role in biofilm production, as we discussed earlier, through binding to the psl operon. AmrZ is also implicated in the T6SS pathway by suppressing the expression of H2-T6SS (tssB2) but activating that of H1-T6SS (tssA1) and H3-T6SS (vgrG3) (Allsopp et al., 2017; Brencic and Lory, 2009). Here, we observed that AmrZ might also control the transcription of the vgrG1b cluster genes vgrG1b, tse6, and PA0099 (Figure 5A), which is genetically associated with the H1-T6SS major cluster (Pissaridou et al., 2018). In addition to identification of their downstream targets, TFBSs of up to 10 TFs (GacA, PhoB, PA5293, PA0032, PA2028, PA1437, PA0784, PA1315, PA3249, and PA3225) were putatively found in the amrZ promoter region, which suggests a transcriptional regulatory cascade and potential inter-connection between different virulence pathways (Figure 5A).

Next, we verified the binding sites of TFs using multiple sets of data: the binding of PhoB to the amrZ promoter was supported by ChIP-seq data and confirmed in vitro by EMSA (Figure 5B); the binding sites of the TFs PA1315 and PA3249 were confirmed by EMSA using the genomic fragments covering the amrZ promoter (coordinate: 3,790,981–3,791,197) (Figure 5C,D); the binding site of TF PA0535 to the promoter of the H1-T6SS apparatus component-encoding gene vgrG4 was also validated by EMSA (Figure 5E).

Motility is under the control of 37 TFs

P. aeruginosa converts its biofilm lifestyle to a planktonic lifestyle by regulating bacterial motility, which enhances its infectivity of the host (Drake and Montie, 1988). Bacterial motility shares a few common regulators with biofilm production, including FleQ (Arora et al., 1997; Jiménez-Fernández et al., 2016), AlgB (Chand et al., 2012), and AmrZ (Baynham et al., 2006), but it has many distinct TFs, such as GacA (Brencic et al., 2009) and PilR (Kilmury and Burrows, 2018). As expected, our network analysis reconfirmed the role of these TFs in regulation of motility-related genes (Figure 2—figure supplement 6A). For instance, FleQ was predicted to bind to the promoter of a flagellar gene flhA, and this prediction was supported by an independent study (Jyot et al., 2002). Similar to other pathways, we predicted many additional novel TFs involved in modulating bacterial motility. For example, PA3458 was predicted to bind to the promoter of the fimU-pilV-pilW-pilX-pilY1-pilY2-pilE operon, and PA1145, PA1399, PA3594, and PA4659 could all putatively bind to motility-related genes. Of these, PA3594 was predicted to bind the promoter of flgBCDE, which was confirmed by EMSA (Figure 2—figure supplement 6B). In sum, 37 TFs were predicted to regulate motility-related genes, in which 10 TFs were predicted to be highly involved regulators including RopN, PA0528, PA1145, PA1399, PA1490, EraR, PA3458, PA3594, PA4508, and PA4659 (Figure 2—figure supplement 12).

TFs potentially involved in other virulence-associated pathways

Antibiotic tolerance in P. aeruginosa causes antibiotic treatment failure or infection relapse (Brauner et al., 2016). TFs influence antibiotic resistance by regulating multiple genes, including mexF, mexE, and oprN. The TF PhoB was predicted to influence antibiotic resistance by binding to the promoters of pstB, oprD, PA3516, and czcABC, which was confirmed by both ChIP-seq (Bielecki et al., 2015) and EMSA (Figure 2—figure supplement 7B–E). Siderophores act as signaling molecules for the synthesis of two virulence proteins: exotoxin A and endo-proteinase PrpL (Lamont et al., 2002). The siderophore network revealed that 16 novel TFs could putatively bind to at least three siderophore-associated genes, with some loci co-bound by more than one TFs (Figure 2—figure supplement 9). Furthermore, 26, 17, and 11 TFs were predicted to be associated with the regulation of T3SS, ROS, and the SR and persistence pathways, respectively. Among them, 17, 11, and 8 factors, respectively, had not been functionally characterized before (Figure 2—figure supplement 10, Supplementary file 4).

Likewise, some TFs were putatively associated with multiple virulence-associated pathways. For example, PhoB and its target TF gene amrZ were likely implicated in virtually all virulence-associated pathways, including biofilm production, QS pathway, T6SS, T3SS, antibiotic resistance, and motility. Similar findings were observed for other factors, such as PA0048, whose putative target genes were implicated in biofilm production (mvfR), the QS pathway (rhlB), T6SS (tss), antibiotic resistance (pstB), and the persistence pathway (dnaJ). Our results provide a valuable resource on an unprecedented scale to dissect the intricate transcriptional regulatory networks in different important biological processes in the pathogenic bacterium P. aeruginosa and explicitly illustrate the inter-connectivity among these pathways.

Gene ontology analysis reveals potential functions for 69 putative TFs

TFs exert their functions by binding to DNA and driving transcription of the target genes. To potentially decode the transcriptional regulatory function of all putative TFs, we performed a gene ontology (GO) enrichment analysis for the targets of each TF (Figure 6). The target genes of 38% (69/182) of the TFs were enriched for at least one GO term (p<0.05), and 21 functional categories were associated with the regulons of these TFs. As expected, virtually all of them were annotated with ‘transcription regulator activity’. We then analyzed and explored the target genes for 20 TFs randomly selected from the 69 genes enriched for ‘transcription regulator activity’ and found that more than 80% of them coded for TFs (234/291), implying an inter-connective transcriptional regulatory network integrated with mutual regulation between TFs. In addition, the potential biological functions of 46 previously uncharacterized TFs were also suggested. For example, PA0048 could possibly be associated with ‘antibiotic transporter activity’, while the binding sites of PA3458 were enriched in promoters of genes with the ‘vitamin B6 binding’ and ‘amine transmembrane transporter activity’ characteristics. Interestingly, potentially additional functions were suggested for five previously annotated TFs, such as peptidase activity (AlgR and RtcR), FAD binding (NmoR), zinc ion binding (AguR), and phosphatase activity (RpoN).

TF clustering by function of the putative targets.

The heat map shows the functional annotation enrichment of TF targets in 29 gene ontology (GO) terms. The color key indicates the -log10 (p-value).

Discussion

Microbes appeared on earth nearly 2 billion years before the evolutionary arrival of human beings. Today, the population of microbes is estimated to be approximately 1 trillion, far exceeding the population of every other species in number and diversity (Krause, 2002; Locey and Lennon, 2016). High abundances of pathogenic microbes are found in high-density areas of human populations, especially in cosmopolitan urban centers (Wang et al., 2018). Due to the alarming emergence of drug-resistant pathogens across the planet, humans are being confronted with the unprecedented threat of untreatable infectious diseases (Sattar, 2007). However, very limited information and few useful treatment approaches are available to manage the infection and antibiotic resistance of such pathogens. P. aeruginosa is a model strain for studying pathogens as its virulence is controlled by many pathways. In this study, we successfully generated 199 unique PWM models to describe the DNA-binding specificities of 182 distinct TFs in P. aeruginosa, which covered nearly 50% of all TFs (including putative TFs) in its genome. We used all of the PWM models to scan the P. aeruginosa PAO1 genome and deduced 33,709 direct TF–DNA interactions, which enabled the construction of transcriptional regulatory networks for nine virulence-associated pathways, providing a valuable resource for deciphering the pathogenicity of P. aeruginosa.

Although many motifs obtained by HT-SELEX showed a high similarity to motifs derived from ChIP-seq peaks or other methods, such as sequence alignment, incomplete overlapping was observed for a few TFs (such as LasR) owing to method-specific differences, e.g. interference by protein–protein interactions, the number of TFBSs used to derived the motifs (Figure 2—figure supplement 11A). To further compare the PWM models between HT-SELEX and ChIP-seq, we used the PWM models of 10 TFs, namely AlgR, CdpR, ExsA, FleQ, GacA, MexT, PchR, PhoB, RsaL, and SphR, to predict the binding events for each TF in the P. aeruginosa genome (Huang et al., 2019) and evaluated the performance with a precision-recall curve analysis. The result demonstrated that most TFs showed a satisfactory area under the precision-recall curve value of more than 0.5, demonstrating that our PWM could moderately predict the in vivo binding events (Figure 2—figure supplement 11B). ChIP-seq data are highly influenced by antibody quality and specificity, and the fact that a ChIP assay cannot distinguish between direct binding and indirect binding occurring between the protein and DNA. Owing to the small genome size, the binding motifs of TFs by ChIP-seq were likely skewed by the very limited number of motif-generating sequences. Consistent with the findings in higher species that TF-binding specificity is conserved during evolution, we also found that orthologous TFs from different strains displayed a similar binding specificity (Fan et al., 2020). For example, the DNA sequence preference of P. aeruginosa PhoB generated by HT-SELEX in our study showed a high similarity to that of Caulobacter crescentus PhoB generated by ChIP-seq (Lubin et al., 2016). This suggests that our data are a potential reference resource for TF studies in other related organisms.

Prior to this work, we and our collaborators studied the regulatory mechanisms of a group of P. aeruginosa TFs, including AlgR, CdpR, RsaL, VqsR, AnvM, and VqsM (Kang et al., 2017; Kong et al., 2015; Zhao et al., 2016; Liang et al., 2012; Liang et al., 2014; Zhang et al., 2019). By integrating those and other published data sets, we have created a P. aeruginosa genome-wide regulatory network (PAGnet), which illustrates the regulatory relationships of 20 key virulence-related TFs with their target genes as profiled by ChIP-seq and RNA-seq (Huang et al., 2019). Given that most TFs have yet to be characterized, we presume that the comprehensive regulatory network of P. aeruginosa would be more complicated than is currently known. The present work significantly contributes to the PAGnet by systematically predicting direct interactions between several more TFs and their target genes. Our data enables to envision the underlying transcriptional regulatory relationships and the investigation of the potential function of many previously uncharacterized regulators. The building of virulence models, which is a major step toward decoding pathogenicity, may lead to the discovery of novel drug targets for combating the infection of P. aeruginosa and other pathogens in the future.

Materials and methods

Strains and growth condition

Request a detailed protocol

The bacterial strains used in this study are listed in Supplementary file 1A. P. aeruginosa PAO1 WT strain, their derivatives, and E. coli strains were grown at 37°C in Luria-Bertani (LB) agar plates statically or LB broth with shaking at 220 rpm.

Plasmids and primers

Request a detailed protocol

The plasmids and primers in this study are listed in Supplementary file 1A. Antibiotics for E. coli and its derivatives were used at the following concentrations: for E. coli with pET28a, 50 μg/ml kanamycin; for E. coli with pEX18AP, final concentration of 60 μg/ml ampicillin LB; for E. coli with pEX18AP-Gm plasmid, using final concentration of 15 μg/ml gentamycin in LB. For P. aeruginosa PAO1 with pEX18AP-Gm plasmid in LB media, antibiotic with final concentration 60 μg/ml gentamycin. For P. aeruginosa PAO1 with pAK1900 plasmid in LB media, antibiotic with final concentration of 100 μg/ml carbenicillin in LB. Antibiotics for P. aeruginosa PAO1 mutants, 60 μg/ml gentamycin.

Cloning and recombinant protein purification

Request a detailed protocol

Oligonucleotides and vectors used for cloning of His-tagged proteins in this study are listed in Supplementary file 2B. The cloning was carried out with a homologous recombination strategy, following the manufacturer’s instruction (Vazyme ClonExpress II One Step Cloning Kit, Vazyme Biotech). Briefly, the 371 TFs were identified in ‘Pseudomonas Genome Database’ (Winsor et al., 2016). DNA of 371 TFs were amplified by polymerase chain reaction (PCR) from P. aeruginosa PAO1 reference genome to obtain the coding regions of full-length proteins. Each forward PCR primer carried a 20 bp sequence identical to the linearized plasmid sequence at the 5’- and 3’-end of the cutting site followed by the gene-specific sequence. The homologous match between these two 20 bp recombination fragments determined the direction of the target gene in the expression vector. Then the BamHI-digested pET28a vector and individual TF PCR products (containing 20 bp overlapped sequences on 5’- and 3’-end, respectively) were mixed in the molar ratio of 1:2, and then incubated with recombinase for 30 min at 37°C. Each successfully constructed vector (371 reconstructed vectors in total) was then transformed into E. coli BL21 (DE3) strain and cultured in the LB agar plate, respectively. Then, a single colony of each strain was picked and cultured into 3 ml LB overnight, which was transferred into 300 ml LB containing 50 μg/ml kanamycin for protein extraction. After bacterial OD600 was near 0.6. 0.5 mM IPTG (isopropyl β-D-1-thiogalactopyranoside) was added into the cell culture with 16°C for 16 h. Then cell pellet was collected by centrifuging 7000 rpm for 5 min, at 4°C. The pellet was suspended in 15 ml buffer A (500 mM NaCl, 25 mM Tris-HCl, pH 7.4, 5% glycerol, 1 mM dithiothreitol, 1 mM PMSF (phenyl-methanesulfonyl fluoride)) and lysed by sonication for 30 min (20% power, 10 s on, 15 s off), and protein supernatant was obtained by centrifuging 12,000 rpm, 30 min, at 4°C. After filtering protein supernatant with a 0.45 μm filter, each filtrate was injected into a Ni-NTA column (Bio-Rad) to start a fast protein liquid chromatography (FPLC) system, respectively. The Ni-NTA column was washed with 30 ml gradient from 60 to 500 mM imidazole in buffer A gradually. Each eluted fraction was pooled, collected, and verified by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) based on molecular weight of each target TF plus 6× His-tag. HT-SELEX pipeline was exercised for all 371 TF proteins.

HT-SELEX

Request a detailed protocol

The HT-SELEX experimental pipeline was adapted from our previous study (Jolma et al., 2013). Briefly, DNA ligand libraries that contain 40 bp randomized sequences with illumina or BGI adaptor systems were synthesized and double stranded by PCR amplification using the primers in Supplementary file 1D, illustrated in Figure 1—figure supplement 1A–D. The uniformity of each base (A, T, C, G) was evaluated after the sequencing of libraries by using Illumina HiSeq Xten or BGI MGISEQ 2000 sequencer. Then, 100–200 ng TF proteins and 5 µl DNA ligands were combined with Promega binding buffer (10 mM Tris pH 7.5, 50 mM NaCl, 1 mM DTT, 1 mM MgCl2, 4% glycerol, 0.5 mM EDTA, 5 µg/ml poly-dIdC [Sigma P4929]) to make 25 µl of total volume. Protein–DNA complexes were incubated with 150 µl of Promega binding buffer (without poly-dIdC) containing 10 µl Ni Sepharose 6 Fast Flow resin (GE Healthcare 17-5318-01) equilibrated in binding buffer for 60 min with a gentle shaking after 30 min of reaction at room temperature. Subsequently, the unbound ligands were separated from the bound beads through washing with a gentle shaking for 12 times with 200 μl of Promega binding buffer (without poly-dIdC). After the washes, the beads were suspended by using 200 µl double-distilled water. Finally, the DNA libraries were built by 18 cycles of PCR amplification using the 20 µL bound DNA as template and the primers in Supplementary file 1D. The obtained PCR products were used as selection ligands for the next cycle of HT-SELEX. After the fourth cycle, the purified PCR samples from each HT-SELEX cycle were pooled and sequenced using Illumina HiSeq Xten or BGI MGISEQ 2000 sequencer.

SELEX data analysis

Request a detailed protocol

Raw sequencing data were binned according to barcodes for each sample. Sequences from the 40-nt random region were further analyzed, the low quality reads with bases annotated as ‘N’ being filtered out. PWM models were generated using initial seeds identified using Autoseed (Nitta et al., 2015) that were refined by expert analysis as described in Jolma et al., 2013. Exact seeds, cycles, and multinomial model used are indicated in Supplementary file 2. All motif seqlogos were generated using the R package ggseqlogo (Wagih, 2017).

Network analysis of similarity between PWMs

Request a detailed protocol

We calculated the similarities of all pairs of 199 PWMs using SSTAT (Pape et al., 2008) (parameters: 50% GC-content, pseudocount regularization, type I threshold 0.01) as described in Jolma et al., 2013. We generated a network containing two types of nodes: one type representing TF-binding profiles and another type representing TF proteins. TF protein nodes were connected to their binding models, and the binding models were further connected to each other if their SSTAT similarity score (asymptotic covariance) was greater than 1.5e-5 as described in Jolma et al., 2013. Finally, the network was visualized using Cytoscape software v3.7.2 (Smoot et al., 2011).

Analysis of the transcriptional regulatory networks

Request a detailed protocol

We first comprehensively summarized nine virulence-associated pathways' gene lists based on literatures, and then scanned binding sites in P. aeruginosa PAO1 genome with all PWMs using FIMO (p<1.0e-5) (Grant et al., 2011). The p-value cutoff of 1.0e-5 predicted the binding sites that were closest to the peaks identified by ChIP-seq of PhoB which we had high-quality ChIP-seq data for. When we set the p-value cutoff of 1.0e-5, we obtained 22 binding sites, which is highly consistent with the ChIP-seq result. Therefore, we determined to use p<1.0e-5 as a standard cutoff for all FIMO prediction throughout our manuscript. Then bedtools (v2.25.0) (Quinlan and Hall, 2010) was used to annotate all TFBSs, which then intersected with the promoter regions of genes involved in nine virulence-associated pathways. Uncharacterised TFs involved in regulating the pathway genes are recognized as the pathway-associated TFs. We therefore generated transcriptional regulatory networks for nine important systems of P. aeruginosa which contained two types of nodes, one type representing TF proteins and another type representing targets. TF protein nodes were connected to their targets if the TFBSs were located in the promoter region of the targets. All networks were visualized using Cytoscape software v3.7.2 (Smoot et al., 2011).

Electrophoretic mobility shift assay

Request a detailed protocol

EMSA is conducted in vitro using recombinant proteins and synthesized DNA ligands. DNA probes were PCR-amplified using primers listed in Supplementary file 1A, ranging from 210 bp to 240 bp long. Each PCR template was acquired from P. aeruginosa PAO1 genome. The 30 ng probe was mixed with varying amounts of TFs in binding buffer (10 mM Tris-HCl, pH 7.4, 50 mM KCl, 5 mM MgCl2, 10% glycerol) with the final volume of 20 μl. Meanwhile, DNA probes of the negative controls in each group of reaction were chosen randomly and ensured without the corresponding TFBSs. After 30 min incubation at room temperature, the reactions were loaded and run by 6% polyacrylamide gel electrophoresis at 100 V for 1 hr. Then, the gels were subjected to nucleic acid dye for 5 min, and visualized and photographed using the gel imaging system (Bio-Rad). The assay was repeated at least twice with similar results.

Construction of TF-deficient P. aeruginosa strains

Request a detailed protocol

Gene deletions were constructed as previously described (Hoang et al., 1998). The principle of gene knockout mutants depends on a SacB-based strategy. The pEX18AP plasmid was digested by using HindIII and EcoRI. The upstream arm (~1000 bp) and downstream arm (~1500 bp) of a TF gene were amplified from P. aeruginosa PAO1 genome and digested with XbaI (for detailed primers information, see Supplementary file 1A). Then the XbaI digested upstream and downstream fragments were ligated with T4 DNA ligase (NEB). The ligated DNA products were inserted into the EcoRI and HindIII digested pEX18AP plasmid using ClonExpress MultiS One Step Cloning Kit (Vazyme, China) to yield the pEX18AP-Up-Down plasmid. Then, pEX18AP-Up-Down of each TF was digested by XbaI and ligated with a 0.9 kb XbaI-digested gentamicin resistance cassette, generating pEX18AP-Up-Down-Gm plasmid, which was transformed into P. aeruginosa PAO1 WT competent cells with electroporation and cultured on the agar plate. Colonies were selected for gentamicin resistance and then transferred to LB agar plates containing 5% sucrose, which typically happens a double-cross-over event and thus gene replacement. Each TF mutant was further confirmed by PCR to detect the DNA and RT-qPCR to detect the mRNA level.

Reverse-transcription quantitative polymerase chain reaction (RT-qPCR)

Request a detailed protocol

P. aeruginosa PAO1 WT cells and its derivatives were cultured until OD600 to 0.6. To harvest the bacterial cells, the cultures were centrifuged at 6000 rpm for 3 min. RNA extraction and purification were performed using RNeasy minikit (Qiagen) following the manufacturer’s instruction. RNA concentration was measured by Nanodrop 2000 spectrophotometer (ThermoFisher). The synthesis of cDNA was carried out using the FastKing RT Kit (Tiangen Biotech). RT-qPCR was performed by SuperReal Premix Plus Kit (SYBR Green, Tiangen Biotech) following the manufacturer’s instruction. Each reaction was performed in triplicates in 20 μl reaction volume with 20 ng cDNA and 16S rRNA as an internal control. For each reaction, 100 nM primers (Supplementary file 1B) were used for RT-qPCR. The reactions were run at the program on a PCR machine (42°C for 15 min, 95°C for 3 min, and then kept at 4°C). The fold change represents relative expression level of mRNA relative to the 16S rRNA control gene, which can be estimated by the values of 2-(ΔΔCt). All the reactions were conducted with two biological repeats.

Biofilm formation assay

Request a detailed protocol

Biofilm production was detected as previously reported in minor modifications (King et al., 1954). In brief, overnight bacterial cultures of P. aeruginosa PAO1 WT and TF mutants were transferred to a 10 ml borosilicate tube containing 1 ml LB medium (without antibiotics) with the original concentration OD600 = 0.1. Then, the cultures grow statically at 37°C for 12–24 hr. Then, 0.1% crystal violet was used to stain the biofilm adhered to the tube tightly for 30 min and other components bound to tube loosely was washed off with distilled deionized water (ddH2O). Borosilicate tubes were washed for more than three times with ddH2O gently, and the remaining crystal violet was fully dissolved in 1 ml 95% ethanol with constantly shaking after photograph. 100 µl of this eluate was transferred to a transparent 96-well plate to measure its optical density at 590 nm using Biotek microplate reader. The experiment was repeated using three independent bacterial cultures.

GO analysis

Request a detailed protocol

GO enrichment analysis of TFs target genes was catalogued using DAVID version 6.7 (https://david-d.ncifcrf.gov/). The GO term with p-value < 0.05 was defined as significantly enriched term.

Statistical analysis

Request a detailed protocol

Two-tailed Student’s t-tests were performed using Microsoft Office Excel 2010. *p < 0.05, **p < 0.01, and ***p < 0.001 and error bars represent means ± standard deviation (SD). All experiments were repeated for at least twice. Statistical graphs were drawn using the GraphPad Prism 8, R ggplot2, and Python Matplotlib packages. All motif logos are drawn using R package ggseqlogo. All network was visualized by using Cytoscape software v3.7.2 (128).

Data availability

Sequencing data has been deposited in GEO under accession code GSE151518.

The following data sets were generated
    1. Yan J
    2. Deng X
    3. Wang T
    4. Sun W
    5. Fan L
    6. Hua C
    (2020) NCBI Gene Expression Omnibus
    ID GSE151518. An Atlas of the Binding Specificities of Transcription Factors in Pseudomonas aeruginosa Directs Prediction of Novel Regulators in Virulence.
The following previously published data sets were used
    1. Bielecki P
    2. Jensen V
    3. Schulze W
    4. Gödeke J
    5. Strehmel J
    6. Eckweiler D
    7. Nicolai T
    8. Bielecka A
    9. Wille T
    10. Gerlach RG
    11. Häussler S
    (2015) NCBI Gene Expression Omnibus
    ID GSE64056. Cross-regulation between the response regulators PhoB and TctD allows for the integration of diverse environmental signals in Pseudomonas aeruginosa.

References

    1. Honda E
    2. Homma JY
    3. Abe C
    4. Tanamoto K
    5. Noda H
    6. Yanagawa R
    (1977)
    Effects of the common protective antigen (OEP) and toxoids of protease and elastase from Pseudomonas aeruginosa on protection against hemorrhagic pneumonia in mink
    Zentralblatt Fur Bakteriologie, Parasitenkunde, Infektionskrankheiten Und Hygiene. Erste Abteilung Originale. Reihe A: Medizinische Mikrobiologie Und Parasitologie 237:297–309.
    1. King EO
    2. Ward MK
    3. Raney DE
    (1954)
    Two simple media for the demonstration of pyocyanin and fluorescin
    The Journal of Laboratory and Clinical Medicine 44:301–307.
    1. Sattar SA
    (2007)
    Reducing the health impact of infectious agents: the significance of preventive strategies
    GMS Krankenhaushygiene Interdisziplinar 2:Doc06.

Decision letter

  1. Christina L Stallings
    Reviewing Editor; Washington University School of Medicine, United States
  2. Bavesh D Kana
    Senior Editor; University of the Witwatersrand, South Africa
  3. Yu Zhang
    Reviewer; Shanghai Institute of Plant Physiology and Ecology, China

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

This manuscript provides a global view of transcription factor interactions in Pseudomonasaeruginosa and represents an important resource for the research community. With the use of novel assays, the work has uncovered transcription factor binding specificities for nearly half of the annotated transcription factors and forms a good foundation for future studies. The study will be of interest to microbiologists and those interested in bacterial metabolism.

Decision letter after peer review:

Thank you for submitting your article "The Binding Specificity Atlas of Pseudomonasaeruginosa Transcription Factors Reveals Novel Regulators in Virulence" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Bavesh Kana as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Yu Zhang (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

As the editors have judged that your manuscript is of interest, but as described below that additional experiments are required before it is published, we would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). First, because many researchers have temporarily lost access to the labs, we will give authors as much time as they need to submit revised manuscripts. We are also offering, if you choose, to post the manuscript to bioRxiv (if it is not already there) along with this decision letter and a formal designation that the manuscript is "in revision at eLife". Please let us know if you would like to pursue this option. (If your work is more suitable for medRxiv, you will need to post the preprint yourself, as the mechanisms for us to do so are still in development.)

Summary:

This manuscript analyzes the DNA binding sequence specificity of transcription factors (TFs) in the human pathogen Pseudomonasaeruginosa using a high-throughput DNA enrichment method. The consensus binding motifs of 182 (out of 371 in total) TFs were obtained and genomic loci were predicted to be targets of these TFs, some of which were experimentally validated. The work provides a useful resource for studying the transcription regulation network in Pseudomonasaeruginosa, and also provides DNA motif information for TFs in other bacteria. The reviewers all agree the work performed is generally of good quality, however, specific concerns raised by the reviewers regarding how the data are obtained, classified, and validated, need to be addressed.

Essential revisions:

1. One of the major conclusions is that most TFs bind as homodimers because their PWMs contain 'head-to-head' duplications of the same sequence. However, it could also be that two monomers bind a single double stranded DNA molecule in the HT-SELEX assays. It would be helpful if the authors could investigate whether homodimeric protein-protein interactions are known to occur for relevant TFs, or whether homodimers have been observed in structural studies.

2. The comparison of their HT-SELEX-derived PWMs with those obtained by ChIP should be expanded. After predicting binding events based on their PWM data, precision/recall analysis should be done with the ChIP data that compares individual binding events rather than the overall composite PWM binding motif. This will address the question of how many predicted binding events were also reported by ChIP and how many binding events detected by ChIP were also predicted. This is a key point.

3. The computational scanning for TF binding sites in the Pseudomonas genome is not sufficiently described: how was thresholding determined to call binding sites for each TF?

4. How a TF is associated with a biological pathway is not well described. How many binding events are required to call an association? Is this done one TF at a time, or one pathway at a time? How is a pathway defined?

5. The biological validation experiments appear ad hoc. It is not clear what this says about the entire dataset. For instance, only one new TF was validated experimentally out of 57 biofilm-associated TFs. The crystal violet assay can be easily performed in a 96-well plate format to validate 57 TFs that affect biofilm formation. PAO1 mutants can be obtained from Manoil lab (PAO1 transposon insertion library (https://www.gs.washington.edu/labs/manoil/libraryindex.htm)

6. It should be noted that the gel shift assays are similar to SELEX.

7. How many (predicted) binding events are functional? Analysis of (available) RNA-seq data would be key for this. For instance, Figure 1B. shows that the well-studied GacA and AmrZ have similar DNA binding specificity. Transcriptomic data are available for both these TFs (PMID: 31270321) and should be compared to the physical binding predictions.

8. Page 11. Lines 266-268, also Figure 5A. The authors state that ShpR shares six targets with PA4008 implying a cooperative (or redundant?) function in regulating T6SS. From the figure 5A, these 6 targets are: tagJ1, tssE1, tssF1, tssG1, clpV1, vgrG1. However these six genes are in the same operon and share a single promoter (https://Pseudomonas.com/feature/show/?id=102905&view=operons). The figure and the text imply that there are 6 independent TF binding sites located in 6 different promoters. However, there is a single promoter, and ShpR and PA4008 share only one target. Operon genes that share a single promoter are represented in the figures as independent genes. This is a major problem throughout the paper and figures and needs to be resolved.

9. Page 12. This paragraph is speculative as no functional motility assays were performed and no target genes were functionally validated. Thus the statement "in sum, 37 TFs were found involved (..) in regulating motility-related genes" is an overstatement.

10. Throughout the manuscript, the authors should be careful not to overstate their findings and be precise when biological function is predicted vs demonstrated experimentally.

11. The manuscript would benefit from thorough proofreading and editing.

12. Overall, 371 TFs entered in the pipeline and a result could be generated for 182 TFs (app. 50%). From an experimental design point of view, only 3 TFs have been used to benchmark the reproducibility of the binding site motifs which represents <1% of the input. This should be improved to provide a resource for the field. Since it is an in vitro assay, I would have expected to see the results validated systematically in duplicate. The author claim that the three replicates obtained "virtually identical binding specificity": but what does this mean statistically?

13. How did the authors select 182 TFs out of 371 TFs? Were there issues with protein purification for the others?

14. Line 167/Figure 2E: using the binding motifs, the authors predicted the involvement of 365 TFs in Pseudomonas virulence pathways. How is it possible to predict the role of 365 TFs while they obtained PWM for only 182 TFs?

15. Figure 3B/C: AmrZ binding site: it is not clear for the link between Figure 3B and 3C. Where is the motif in the psl operon?

16. The authors have already published a TF regulatory network paper (https://doi.org/10.1038/s41467-019-10778-w) covering 20 TFs : a supplementary figure that compares side-by-side the motifs found in the both papers could be interesting.

17. The author should explain why 201 sequence motifs (PWMs) were obtained for 182 TFs in the main text. For example, in the Data S1, the Module 37 of PA1241 has two sequence motifs containing almost the same consensus sequence. Why have two consensus sequence motifs have to be assigned?

18. The sequence motif for sig54 (rpoN) is conserved across bacteria and the core consensus sequence of GGN(10)GC has been confirmed in various studies. The sequence motif for sig54 by SELEX (Figure 1C) is inconsistent with previous literatures and should be reanalyzed and discussed.

19. The manuscript also describes interesting monomeric sequence motifs for 19 TFs. The discussion of these 19 TFs should be expanded.

20. The location of TF-binding sites relative to transcription start site (TSS) on the promoter DNA could help predict the activity of the corresponding TFs. For example, TF binding sites overlapping with the core promoter region (-35 to +1) and the proximal downstream region of a TSS may suggest transcriptional repression. As TSS profiles in Pseudomonasaeruginosa are published, it would strengthen the report to integrate the TSS information into the analysis herein.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "The Binding Specificity Atlas of Pseudomonasaeruginosa Transcription Factors Reveals Novel Regulators in Virulence" for further consideration by eLife. Your revised article has been evaluated by Bavesh Kana (Senior Editor) and a Reviewing Editor.

Summary:

This manuscript provides a global view of transcription factor interactions in Pseudomonasaeruginosa and represents an important resource for the research community. With the use of novel assays, the work has uncovered transcription factor binding specificities for nearly half of the annotated transcription factors and forms a good foundation for future studies. The study will be of interest to microbiologists and those interested in bacterial metabolism.

Although the manuscript has been improved, there are some remaining issues that need to be addressed, as outlined below:

1. There are still concerns about the interpretations of RpoN binding data. It is well established that RpoN doesn't function as a dimer to facilitate transcription initiation. It binds RNAP core enzyme in a 1:1 molar ratio to form an RNAP holoenzyme. Second, the sequence motif itself is not a direct repeat of two half-sites. The 5' 'GG' and 3' 'GC' consensuses of the GGN(10)GC motif are representative sequences of the '-24' and '-12' elements of a typical sigma 54 (encoded by rpoN)-initiated promoter. The two elements are recognized by two separate domains of RpoN, both in the presence and absence of RNAP holoenzyme. This issue of data interpretation and representation needs to be addressed. It is also recommended that the authors use a different example of head to tail binding.

2. The authors associate a TF with a pathway when a single predicted binding event occurs with any gene in that pathway. However, statistical analysis must be performed to determine the false discovery rate and whether their findings could occur via chance.

3. There are still points that the authors equate binding with function in the manuscript, please revise throughout to not make this overstatement.

https://doi.org/10.7554/eLife.61885.sa1

Author response

Essential revisions:

1. One of the major conclusions is that most TFs bind as homodimers because their PWMs contain 'head-to-head' duplications of the same sequence. However, it could also be that two monomers bind a single double stranded DNA molecule in the HT-SELEX assays. It would be helpful if the authors could investigate whether homodimeric protein-protein interactions are known to occur for relevant TFs, or whether homodimers have been observed in structural studies.

Thanks for the constructive comment. Many of the TF homodimers have been reported before by independent works, such as LasR (1, 2), RsaL (3), PsrA (4, 5), FleQ (6), and QscR (7). We have now cited these studies in our manuscript.

Furthermore, it should also be noted that unlike other cytosolic proteins, many TF dimer formations are mediated by sequence specific DNA molecules which they bound to and therefore may not be appropriately detected by protein-protein interaction assays. This has been shown before for both TF homodimers and heterodimers (8, 9). If the “head-to-head” motif had been indeed caused by “two monomers bind a single double stranded DNA molecule in the HT-SELEX assays”, the spacing between the monomer motif would have not been specific. In other words, two independent monomers could bind to a double stranded DNA in ANY orientation or spacing. Here in our study, we have shown that the TF homodimers displayed strong enrichment for particular orientation and spacing between two monomers. We have added the discussion to the revised manuscript. Please see Lines 154159.

2. The comparison of their HT-SELEX-derived PWMs with those obtained by ChIP should be expanded. After predicting binding events based on their PWM data, precision/recall analysis should be done with the ChIP data that compares individual binding events rather than the overall composite PWM binding motif. This will address the question of how many predicted binding events were also reported by ChIP and how many binding events detected by ChIP were also predicted. This is a key point.

Thanks for the suggestion. We agree that this is a very good way to evaluate the quality of our data. There are 10 TFs that were analyzed with both ChIP-seq (10) and HT-SELEX, including AlgR, CdpR, ExsA, FleQ, GacA, MexT, PchR, PhoB, RsaL, and SphR. We used their PWM models to predict the binding events for each TF in the Pseudomonas aeruginosa genome and assessed the performance with precision-recall analysis suggested by the reviewer. We found that most of TFs showed AUC (area under the precision-recall curve) over 0.5, demonstrating that our PWMs could well predict most of the in vivo binding (see Figure 2—figure supplement 11B). We also observed that prediction for some TFs is below 0.5, which is likely due to the fact that the binding is affected by chromatin accessibility, protein-protein interaction or other unknown mechanisms, which were quite common in higher species. The results are consistent with our direct comparison between motifs generated with ChIP-seq and HT-SELEX. We have discussed the comparison between ChIP-seq and PWM in the revised manuscript, please see Lines 419-429.

3. The computational scanning for TF binding sites in the Pseudomonas genome is not sufficiently described: how was thresholding determined to call binding sites for each TF?

Thanks for pointing out the missing information. The operation parameters of FIMO

(p<1.0E-5) to the manuscript, and other parameters take default values. This p value is the most commonly used cutoff when applying FIMO to scan TF binding sites in the genome. When we used the FIMO default p value cutoff (p<1.0E-4), an excessive number of peaks were predicted far beyond what we observed from the ChIP-seq data. For example, the highquality PhoB ChIP-seq detected 64 (34 of them were located upstream of a gene) binding peaks in vivo (10). When we set the p value cutoff of 1.0E-5, we obtained 22 binding sites, which is highly consistent with ChIP-seq result. Therefore, we determined to use p<1.0E-5 as a standard cutoff for FIMO prediction throughout our manuscript.

As the suggestion of the reviewer, we have expanded the Methods section by explicitly describing the threshold determination. Please see Lines 540-544.

4. How a TF is associated with a biological pathway is not well described. How many binding events are required to call an association? Is this done one TF at a time, or one pathway at a time? How is a pathway defined?

We apologize for the unclear description. We defined “association” as TF binding to at least one gene involved in a pathway or the TF itself was previously known to be implicated in the pathway. To predict direct interactions between TFs and DNA sequences, we first used the FIMO software (11) to scan genome-wide TF-binding sites (TFBSs) for the 182 TFs with available PWMs in the P. aeruginosa PAO1 reference genome (p<1.0E-5). Since the vast majority of binding within the gene body appeared to have little or no effect on transcription levels in prokaryotes according to previous studies (12), we primarily focused on predicted TFBSs in the intergenic regions in the subsequent analyses. In total, 33,709 significant TFBSs (p<1.0e-5) enriched in the intergenic regions were predicted. We are particularly interested in the transcriptional regulatory program in relation to the 9 virulence pathways of P. aeruginosa. Therefore, we comprehensively summarized 9 virulence-involved gene lists based on the last decades of studies, respectively. Then bedtools (v2.25.0) (87) was introduced to annotate all predicted TFBSs, which were intersected with the promoter regions of genes involved in these 9 virulence pathways. According to our analyses, TFs whose predicted binding sites were located in these genes are recognized as the corresponding pathway-associated TFs. We have revised the manuscript to include the explicit detail, please see Lines 189-195.

5. The biological validation experiments appear ad hoc. It is not clear what this says about the entire dataset. For instance, only one new TF was validated experimentally out of 57 biofilm-associated TFs. The crystal violet assay can be easily performed in a 96-well plate format to validate 57 TFs that affect biofilm formation. PAO1 mutants can be obtained from Manoil lab (PAO1 transposon insertion library (https://www.gs.washington.edu/labs/manoil/libraryindex.htm)

Thanks for suggesting the resource to evaluate our finding. We have obtained mutants and carried out the advised experiments. Among the 57 biofilm-associated TFs, 53 TF mutants were obtained from the Manoil lab except 4 mutants (PA0225, PA2121, PA2118, and PA2028). PA0225 has already been analyzed in our study while the mutant strains of PA2121, PA2118, and PA2028 were not available in their collection. A recent study had revealed that TF PA2121 functioned as a novel biofilm synthesis repressor in P. aeruginosa recently (13).

The crystal violet assay demonstrated that the biofilm production of the 10 TFs (PA0191, PA0491, PA0784, FleQ, PA1437, GacA, PA3249, AmrZ, SphR, and PhoB) had significant difference compared to the wild-type strain (P<0.05) (see Figure 2—figure supplement 4B), also consistent with the previous studies:

(1) FleQ, as c-di-GMP effector, positively regulates the expression of biofilm component- Pel exopolysaccharide in P. aeruginosa (14-16). Our crystal violet assay showed that FleQ mutant produced less biofilm compared to the wild-type, consistent to the previous study;

(2) AmrZ represses biofilm architecture by directly repressing transcription of the psl operon in P. aeruginosa (17, 18). Consistently, our crystal violet assay confirmed the point and the accurate binding site on the psl operon was successfully identified by AmrZ PWM generated from HT-SELEX (Coordinates: 2,453,511-2,453,531) (see also Figure 3B);

(3) GacA is a regulatory element in P. aeruginosa biofilm formation. A gacA mutant contains a deficit biofilm development revealed by scanning electron microscopy (19), consistent to our crystal violet result;

(4) PhoB negatively modulates the biofilm production of P. aeruginosa by controlling the Pho regulon (20), consistent to our crystal violet assay result that the phoB mutant strain produced more biofilm production compared to the wild-type strain.

Although the altered biofilm production of the remaining 43 TFs mutant strains were not significant in our assay, 6 TFs, including RsaL (21), CprR (22), AlgB (23, 24), CdpR (25), BqsR (Synonym: CarR) (26, 27), and PA3782 (28) had been reported to influence biofilm production in previous studies.

(1) The rsaL mutant is impaired in biofilm formation (21);

(2) CprR is involved to biofilm formation on indwelling medical devices such as endotracheal tubes (ETTs) (22);

(3) AlgB is required for the repression of colony biofilm formation by repressing biofilmrelated genes, while the algB mutant had a colony biofilm phenotype indistinguishable from the wild-type (24);

(4) A ∆cdpR mutant results in increased production of biofilm (25);

(5) BqsR (or named CarR) modulates Ca2+ signaling to influence biofilm production (26, 27); (6) The inactivation of PA3782 caused defects in biofilm formation in static and flowing systems (28).

We reasoned that unlike the clear deletion, the transposon insertion mutants could influence the expression of more than one genes because of the polar effects on the expression of downstream genes, or that mutating one TF gene could not change the biofilm phenotype owing to the complexity of regulation in vivo, e.g. the case of AlgB mentioned above.

6. It should be noted that the gel shift assays are similar to SELEX.

Thanks for the advice. We have noted that EMSA and HT-SELEX are similar in that they are both conducted in vitro using recombinant proteins and synthesized DNA ligands in the revised manuscript. However, we would like also to highlight the fundamental difference between these two experiments: 1) HT-SELEX introduces a large pool of DNA sequences for competition and selection, while EMSA can only analyzed one sequence at a time; 2) HTSELEX is based on exponential enrichment of binding sequences along the repeated cycles while EMSA is less quantitative in terms of binding affinity; 3) HT-SELEX can be easily adapted for high-throughput study, investigating binding of multiple TFs with multiple DNA sequences, while EMSA is difficult to be carried out in a high-throughput manner. Therefore, EMSA could be used as an orthogonal method to validate the individual binding events. In addition to EMSA, we also used other experiments, such as ChIP-seq to validate our findings.

7. How many (predicted) binding events are functional? Analysis of (available) RNA-seq data would be key for this. For instance, Figure 1B. shows that the well-studied GacA and AmrZ have similar DNA binding specificity. Transcriptomic data are available for both these TFs (PMID: 31270321) and should be compared to the physical binding predictions.

We appreciate the reviewer for suggesting multiple ways of validating the quality of our HTSELEX data. Accordingly, we have downloaded Huang et al. (10) RNA-seq data to compare the overlapping of the changed expression upon deletion. The list of GacA and AmrZ manipulation caused differentially expressed genes (defined as > 2 times fold change of the expression) was intersected with the list of GacA and AmrZ target genes predicted in this paper respectively (see Author response image 1). The results showed that only a small fraction of predicted target genes were dysregulated upon knockdown of the TF in cells. This could be due to several factors: 1) many TFs co-bound to the promoter region and therefore loss of one TF binding was not sufficient to significantly change the expression of the following gene by twofold; 2) Huang et al. did RNA-seq at the log phase (OD=0.6) only, whereas other growth phases could show a very different list of DEGs; 3) complex regulatory mechanism existed for the transcription, involving protein-protein interactions, chromatin accessibility, so that the changed transcription did not reflect cellular response to a simple one-to-one relationship between TF and target gene. Predicting transcription output is still a very challenging task in the field, even if we integrate the information of chromatin epigenetic information, protein abundance, and 3D genome folding, etc. Therefore, we avoided using the TF depletion followed by RNA-seq to assess the PWM quality in our manuscript. We have included the discussion in the revision.

Author response image 1
Venn diagrams show the comparison of predicted GacA (left) or AmrZ (right) target genes (red) and differentially expressed genes upon deletion of GacA or AmrZ (blue).

8. Page 11. Lines 266-268, also Figure 5A. The authors state that ShpR shares six targets with PA4008 implying a cooperative (or redundant?) function in regulating T6SS. From the figure 5A, these 6 targets are: tagJ1, tssE1, tssF1, tssG1, clpV1, vgrG1. However these six genes are in the same operon and share a single promoter (https://Pseudomonas.com/feature/show/?id=102905&view=operons). The figure and the text imply that there are 6 independent TF binding sites located in 6 different promoters. However, there is a single promoter, and ShpR and PA4008 share only one target. Operon genes that share a single promoter are represented in the figures as independent genes. This is a major problem throughout the paper and figures and needs to be resolved.

Thanks a lot for pointing out this issue and the constructive suggestion. To avoid confusion, we have revised the sentence to emphasize the point in manuscript, please see Lines 327 to 333. We also have merged all the genes sharing the same operon into one box in the network illustration to avoid the confusion. Please see revised Figure 3, 4 and 5 and Figure 2—figure supplement 4, 5, 6, 7, 8, 9, 10.

9. Page 12. This paragraph is speculative as no functional motility assays were performed and no target genes were functionally validated. Thus the statement "in sum, 37 TFs were found involved (..) in regulating motility-related genes" is an overstatement.

Thanks a lot for pointing out this issue. To avoid overstatement and confusion, we revised the statement from “In sum, 37 TFs were found involved in regulating motility-related genes.” to “In sum, 37 TFs were predicted to regulate motility-related genes.” Please see Line 358-359.

We have also gone through the manuscript and tuned down the overstatement of our findings.

10. Throughout the manuscript, the authors should be careful not to overstate their findings and be precise when biological function is predicted vs demonstrated experimentally.

Thanks a lot for pointing out the overstatement issue in this manuscript. We have carefully checked the manuscript and clearly differentiated predicted and experimentally verified TFBS to avoid confusion.

11. The manuscript would benefit from thorough proofreading and editing.

Thanks a lot. A scientist who is a native speaker of English has carefully checked the grammatical issues and edited the text throughout the manuscript.

12. Overall, 371 TFs entered in the pipeline and a result could be generated for 182 TFs (app. 50%). From an experimental design point of view, only 3 TFs have been used to benchmark the reproducibility of the binding site motifs which represents <1% of the input. This should be improved to provide a resource for the field. Since it is an in vitro assay, I would have expected to see the results validated systematically in duplicate. The author claim that the three replicates obtained "virtually identical binding specificity": but what does this mean statistically?

According to our previous experience, HT-SELEX is a highly quantitative and reproducible method. In our previous publication (8), we have systematically compared the technical replicates (replicates using the same protein but different DNA input libraries) and biological replicates (replicates using different protein clones, including DNA-binding domains and full length proteins of the same TFs) and demonstrated that HT-SELEX resulted in virtually the same results. Therefore, we only picked up three TFs as QC controls when we carried out HT-SELEX in different batches and expectedly these replicates turned out identical motifs.

To validate the results, we also used orthologous methods, e.g. we carried out hundreds of EMSA to validate individual binding sites predicted by PWM and also showed the consistency between HT-SELEX generated motifs and ChIP-seq generated motifs for some TFs for which both data are available.

In order to demonstrate the reproducibility in an acceptable timeframe of revision, we randomly chose 13 TFs and carried out HT-SELEX with a different batch protein expression and newly synthesized input oligos, and also with a different experimental researcher. Of the 13 TFs, nine TFs showed explicit motifs in our first batch of experiment while the other 4 did not. In the second batch of experiment, we successfully obtained identical motifs for all the 9 TFs (see Author response image 2).

Author response image 2
A table shows comparison of motifs discovered in different replicative experiments of TF HT-SELEX.

Note that the proteins and input DNA were both independently synthesized and all motifs for the same TF were identical.

For the other 4 TFs that we were not able to observe sequence specificity (GbdR, SoxR,

VqsR and QscR), we did not recover any motif either. We then checked the protein expression and found that except QscR, the other TF proteins were very abundant, suggesting that failure of motif enriched was unlikely to be caused by lack of protein in the reaction (see Author response image 3). The same question was also discussed below in response to the next point of reviewer’s comment.

Author response image 3
SDS-PAGE results detecting the TF protein expression.

The three TFs, GbdR, SoxR, and VqsR showing abundant expression did not recover any specific DNA binding motifs.

13. How did the authors select 182 TFs out of 371 TFs? Were there issues with protein purification for the others?

We apologize for confusing the reviewer due to our poor wording. We actually did not “select” any TFs. We have performed HT-SELEX for all 371 TFs, out of which 182 TF showed DNA binding specificities. We don’t think the failure was caused by protein purification issue as we carefully inspected protein amount for each HT-SELEX experiment. We ran SDS-PAGE gel for all TF proteins used in our paper (371 TFs) and measured the concentrations carefully. We used 100-200 ng per TF in each round of SELEX. The enrichment of specific motifs is not related to the protein abundance. For example, PA3122, PA3260 and PA3932, have a clear and bright protein band in SDS-PAGE gel, but HTSELEX did not detect any specific DNA sequence motifs (see Author response image 4).

Author response image 4
SDS-PAGE gel result for 23 purified TFs (Left).

All TFs expressed successfully. Only 13 TFs had HT-SELEX-generated motifs.

14. Line 167/Figure 2E: using the binding motifs, the authors predicted the involvement of 365 TFs in Pseudomonas virulence pathways. How is it possible to predict the role of 365 TFs while they obtained PWM for only 182 TFs?

As stated above, we apologize for the poor wording that caused confusion. The 365 TFs included redundant counting for TFs involved in multiple different pathways. Of these 365 TFs, only 127 TFs are unique, among which 92 TFs had never been characterized before. We have revised the manuscript, and noted the number clearly. Please see Lines 33-34, and 199200 in the revised manuscript.

15. Figure 3B/C: AmrZ binding site: it is not clear for the link between Figure 3B and 3C. Where is the motif in the psl operon?

The binding site “TAGCTATCACAAAGCCACTAT” was identified by scanning the Pseudomonas aeruginosa PAO1 genome with the PWM model of AmrZ generated in this study (FIMO score 8.75E-06). To clarify, we have now merged the former Figure 3 C with Figure 3 Band highlighted the predicted AmrZ-binding site.

16. The authors have already published a TF regulatory network paper (https://doi.org/10.1038/s41467-019-10778-w) covering 20 TFs : a supplementary figure that compares side-by-side the motifs found in the both papers could be interesting.

Thank you. This is a similar question to Point 2 above. We have provided a supplementary table (Figure 2—figure supplement 11A) to compare the ChIP-seq motif and HT-SELEX motif side-by-side, in total 13 TFs with both datasets available, as suggested by this reviewer (see Author response image 5). By comparison, 11 TFs shared identical or similar motifs derived from the two methods, including PhoB, RsaL, AmrZ, LasR, FleQ, PchR, ExsA, AlgR, CdpR, GacA and SphR. For BfmR, ChIP-seq didn’t recover any motif while HT-SELEX reported two weak but similar motifs (TANNNTA or TANNNNNNNNTA). As we discussed above, these two methods are used to investigate different problems, conducted in different experimental settings: ChIP-seq is mainly used to survey the in vivo binding sites which could be affected by protein-protein interactions, genome accessibility, chromatin 3D organizations (enhancer-promoter loop), etc while HT-SELEX primarily focused on biochemical binding affinity between TF protein and DNA sequences. In our previous studies, we have shown that in many cases they were not completely identical (29). We have added the comparison data and a brief discussion to the revised manuscript. See Lines 419438.

17. The author should explain why 201 sequence motifs (PWMs) were obtained for 182 TFs in the main text. For example, in the Data S1, the Module 37 of PA1241 has two sequence motifs containing almost the same consensus sequence. Why have two consensus sequence motifs have to be assigned?

Thanks for the advice. Some TFs bind to the DNA in a homodimeric mode with different spacing and monomer orientation (8). Therefore, multiple PWMs were assigned to one TF to respectively describe its different spacing and orientation. This has been discussed before in our previous studies (8, 30). To avoid confusion, we have noted this in the revised manuscript. Please see Lines 91-96.

18. The sequence motif for sig54 (rpoN) is conserved across bacteria and the core consensus sequence of GGN(10)GC has been confirmed in various studies. The sequence motif for sig54 by SELEX (Figure 1C) is inconsistent with previous literatures and should be reanalyzed and discussed.

We deeply appreciate this reviewer for pointing out a potential issue of the RpoN PWM we reported in the first submission. In our first round of HT-SELEX, we have identified a motif that was different from conventional and highly conserved RpoN motif reported for different bacterial species, including Vibrio cholerae, Salmonella enterica Serovar Typhimurium, Yersinia pseudotuberculosis, Agrobacterium tumefaciens, Geobacter sulfurreducens, etc. Then, we repeated the HT-SELEX experiment for RpoN using independently purified protein lysates and input DNA oligos for 4 times, we have now detected the GGN(10)GC conventional motif as the most enriched one (see Author response image 5), although the unknown RpoN motif was still observed in one of the replicative experiment. By carefully inspecting this unknown motif, we found that it might result from a mixed mode of two tandem half sites (TGC) on the 3’-side while the 5’-side half site (GGC) was missing. We can also see that the 5’-half site sometimes shows much weaker specificity, e.g. in both replication 1 and replication 3, compared to 3’-half site. So we reason that the folding of RpoN proteins was sometimes flexible in binding to DNA, which gives more DNA binding flexibility when it forms homodimer or heterodimers with other proteins. We have added discussion on RpoN binding in revised manuscript, Line 124-130: “We reason that RpoN proteins could sometimes form dimers while binding to DNA, and in this form, the 5′-side half site binds to another protein molecule. This result demonstrates that HT-SELEX is a highly sensitive method for detecting any change in protein topology.”

Author response image 5
Table shows comparison of RpoN motifs.

Original motif shows the PWM logo of RpoN binding we obtained in original submission while the current motifs shows the replicative experiments in revision. Note that we repeated the HT-SELEX for 4 times with different input DNA ligands. Except in the replication 1 that we detected both conventional RopN motif and the motif we obtained in first batch, the other 3 replicates only detected the conventional RopN motif.

19. The manuscript also describes interesting monomeric sequence motifs for 19 TFs. The discussion of these 19 TFs should be expanded.

LysR and AraC are the top 2 families with the largest TFs in Pseudomonas aeruginosa

PAO1. Among 182 TFs with successful SELEX motifs, 43% belongs to LysR and AraC family (see Author response image 6). In general, the monomeric sites are shorter and more frequent in the genome, indicating their broad regulatory roles of these TFs. Meanwhile, we also observed that there were also prevalent homodimer binding modes for TFs belonging to these two families in addition to these monomeric TFs. The availability of both monomer and homodimer binding modes reflected the diverse amino acid sequences for DNA binding domains of TFs in these two families and could be used for further classification of these TFs. We have noted the broad target sites and the high diversity of DNA binding specificities for these TFs in the discussion. Please see Lines 147 to 151 in the revised manuscript.

Author response image 6

20. The location of TF-binding sites relative to transcription start site (TSS) on the promoter DNA could help predict the activity of the corresponding TFs. For example, TF binding sites overlapping with the core promoter region (-35 to +1) and the proximal downstream region of a TSS may suggest transcriptional repression. As TSS profiles in Pseudomonas aeruginosa are published, it would strengthen the report to integrate the TSS information into the analysis herein.

Thanks for your advice. We also agree that it would be very useful information to annotate the genome with the location of TF-binding sites relative to transcription start site (TSS) on the promoter. Unfortunately, the published TSS information of Pseudomonas aeruginosa corresponded to the UCBPP-PA14 strain (31) which was different from the Pseudomonas aeruginosa PAO1 strain used in this study. We could not find a proper tool to liftover the two reference genomes and therefore to avoid confusing future users, we chose not to include such information. Researchers may have their own TFBS prediction in the PA14 strain (using PWM from the current study) to associate the TFs with corresponding target genes if interested.

To compare the location of TFs binding sites relative to TSS in the PAO1 strain, we illustrated several examples in Author response image 7. Based on the TSS information originated from PA14, we deduced the corresponding TSS of PAO1 by sequence alignment. We have now added the TSS information (31) of pel operon (see Author response image 7). For example, PA0225 and PA2497 were predicted to bind to the upstream region of pelA and qslA TSS, respectively. Our RTqPCR and crystal violet results validated that PA0225 positively regulated the expression of pelA. Similar cases, such as PA2497, PA1241, and PA1234 were predicted to bind to the upstream region of qslA, anvM, and algB TSS, respectively (see Author response image 7).

Author response image 7
The association between predicted TSS and TF binding sites for PA0225 and PA2497.

References:

1. M. J. Bottomley, E. Muraglia, R. Bazzo, A. Carfi, Molecular insights into quorum sensing in the human pathogen Pseudomonas aeruginosa from the structure of the virulence regulator LasR bound to its autoinducer. J Biol Chem 282, 13592-13600 (2007).

2. H. Fan et al., QsIA disrupts LasR dimerization in antiactivation of bacterial quorum sensing. Proc Natl Acad Sci U S A 110, 20765-20770 (2013).

3. H. Kang et al., Crystal structure of Pseudomonasaeruginosa RsaL bound to promoter DNA reaffirms its role as a global regulator involved in quorum-sensing. Nucleic Acids Res 45, 699-710 (2017).

4. Y. Kang et al., The long-chain fatty acid sensor, PsrA, modulates the expression of rpoS and the type III secretion exsCEBA operon in Pseudomonas aeruginosa. Mol Microbiol 73, 120-136 (2009).

5. M. Kojic, C. Aguilar, V. Venturi, TetR family member psrA directly binds the Pseudomonas rpoS and psrA promoters. J Bacteriol 184, 2324-2330 (2002).

6. T. Su et al., The REC domain mediated dimerization is critical for FleQ from Pseudomonasaeruginosa to function as a c-di-GMP receptor and flagella gene regulator. J Struct Biol 192, 1-13 (2015).

7. C. L. Wysoczynski-Horita et al., Mechanism of agonism and antagonism of the Pseudomonasaeruginosa quorum sensing regulator QscR with non-native ligands. Mol Microbiol 108, 240-257 (2018).

8. A. Jolma et al., DNA-binding specificities of human transcription factors. Cell 152, 327-339 (2013).

9. A. Jolma et al., DNA-dependent formation of transcription factor pairs alters their binding specificity. Nature 527, 384-388 (2015).

10. H. Huang et al., An integrated genomic regulatory network of virulence-related transcriptional factors in Pseudomonas aeruginosa. Nat Commun 10, 2931 (2019).

11. C. E. Grant, T. L. Bailey, W. S. Noble, FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017-1018 (2011).

12. T. Shimada, A. Ishihama, S. J. Busby, D. C. Grainger, The Escherichia coli RutR transcription factor binds at targets within genes as well as intergenic regions. Nucleic Acids Res 36, 3950-3955 (2008).

13. X. Yang et al., A putative LysR-type transcriptional regulator inhibits biofilm synthesis in Pseudomonas aeruginosa. Biofouling 35, 541-550 (2019).

14. C. Baraquet, C. S. Harwood, Cyclic diguanosine monophosphate represses bacterial flagella synthesis by interacting with the Walker A motif of the enhancer-binding protein FleQ. Proc Natl Acad Sci U S A 110, 18478-18483 (2013).

15. J. W. Hickman, C. S. Harwood, Identification of FleQ from Pseudomonasaeruginosa as a c-di-GMP-responsive transcription factor. Mol Microbiol 69, 376-389 (2008).

16. C. Baraquet, K. Murakami, M. R. Parsek, C. S. Harwood, The FleQ protein from Pseudomonasaeruginosa functions as both a repressor and an activator to control gene expression from the pel operon promoter in response to c-di-GMP. Nucleic Acids Res 40, 7207-7218 (2012).

17. C. J. Jones et al., ChIP-seq and RNA-Seq reveal an AmrZ-mediated mechanism for cyclic di-GMP synthesis and biofilm development by Pseudomonas aeruginosa. PLoS Pathog 10, e1003984 (2014).

18. C. J. Jones, C. R. Ryder, E. E. Mann, D. J. Wozniak, AmrZ modulates Pseudomonasaeruginosa biofilm architecture by directly repressing transcription of the psl operon.

J Bacteriol 195, 1637-1644 (2013).

19. M. D. Parkins, H. Ceri, D. G. Storey, Pseudomonasaeruginosa GacA, a factor in multihost virulence, is also essential for biofilm formation. Mol Microbiol 40, 12151226 (2001).

20. R. D. Monds, M. W. Silby, H. K. Mahanty, Expression of the Pho regulon negatively regulates biofilm formation by Pseudomonas aureofaciens PA147-2. Mol Microbiol 42, 415-426 (2001).

21. G. Rampioni, M. Schuster, E. P. Greenberg, E. Zennaro, L. Leoni, Contribution of the RsaL global regulator to Pseudomonasaeruginosa virulence and biofilm formation. FEMS Microbiol Lett 301, 210-217 (2009).

22. D. Badal, A. V. Jayarani, M. A. Kollaran, A. Kumar, V. Singh, Pseudomonasaeruginosa biofilm formation on endotracheal tubes requires multiple two-component systems. J Med Microbiol 69, 906-919 (2020).

23. N. S. Chand, A. E. Clatworthy, D. T. Hung, The two-component sensor KinB acts as a phosphatase to regulate Pseudomonasaeruginosa Virulence. J Bacteriol 194, 65376547 (2012).

24. S. Mukherjee, M. Jemielita, V. Stergioula, M. Tikhonov, B. L. Bassler, Photosensing and quorum sensing are integrated to control Pseudomonasaeruginosa collective behaviors. PLoS Biol 17, e3000579 (2019).

25. J. Zhao et al., Structural and Molecular Mechanism of CdpR Involved in QuorumSensing and Bacterial Virulence in Pseudomonasaeruginosa. PLoS Biol 14, e1002449 (2016).

26. M. Guragain et al., The Pseudomonasaeruginosa PAO1 Two-Component Regulator CarSR Regulates Calcium Homeostasis and Calcium-Induced Virulence Factor Production through Its Regulatory Targets CarO and CarP. J Bacteriol 198, 951-963 (2016).

27. S. Sarkisova, M. A. Patrauchan, D. Berglund, D. E. Nivens, M. J. Franklin, Calciuminduced virulence factors associated with the extracellular matrix of mucoid Pseudomonasaeruginosa biofilms. J Bacteriol 187, 4327-4337 (2005).

28. A. Finelli, C. V. Gallant, K. Jarvi, L. L. Burrows, Use of in-biofilm expression technology to identify genes involved in Pseudomonasaeruginosa biofilm development. J Bacteriol 185, 2700-2710 (2003).

29. J. Yan et al., Transcription factor binding in human cells occurs in dense clusters formed around cohesin anchor sites. Cell 154, 801-813 (2013).

30. Y. Yin et al., Impact of cytosine methylation on DNA binding specificities of human transcription factors. Science 356, (2017).

31. O. Wurtzel et al., The single-nucleotide resolution transcriptome of Pseudomonasaeruginosa grown in body temperature. PLoS Pathog 8, e1002945 (2012).

32. K. Poole et al., Expression of the multidrug resistance operon mexA-mexB-oprM in Pseudomonasaeruginosa: mexR encodes a regulator of operon expression. Antimicrob Agents Chemother 40, 2021-2028 (1996).

33. A. Kawalek, M. Modrzejewska, B. Zieniuk, A. A. Bartosik, G. Jagura-Burdzy, Interaction of ArmZ with the DNA-binding domain of MexZ induces expression of mexXY multidrug efflux pump genes and antimicrobial resistance in Pseudomonasaeruginosa. Antimicrob Agents Chemother, (2019).

34. U. H. Ha et al., An in vivo inducible gene of Pseudomonasaeruginosa encodes an anti-ExsA to suppress the type III secretion system. Mol Microbiol 54, 307-320 (2004).

35. W. Wu, S. Jin, PtrB of Pseudomonasaeruginosa suppresses the type III secretion system under the stress of DNA damage. J Bacteriol 187, 6058-6068 (2005).

36. Y. Jin, H. Yang, M. Qiao, S. Jin, MexT regulates the type III secretion system through MexS and PtrC in Pseudomonasaeruginosa. J Bacteriol 193, 399-410 (2011).

37. A. Daddaoua, A. Corral-Lugo, J. L. Ramos, T. Krell, Identification of GntR as regulator of the glucose metabolism in Pseudomonasaeruginosa. Environ Microbiol 19, 3721-3733 (2017).

38. S. A. Chugani et al., QscR, a modulator of quorum-sensing signal synthesis and virulence in Pseudomonasaeruginosa. Proc Natl Acad Sci U S A 98, 2752-2757 (2001).

39. J. Yan et al., Systems-level analysis of NalD mutation, a recurrent driver of rapid drug resistance in acute Pseudomonasaeruginosa infection. PLoS Comput Biol 15, e1007562 (2019).

40. G. Dieppois, V. Ducret, O. Caille, K. Perron, The transcriptional regulator CzcR modulates antibiotic resistance and quorum sensing in Pseudomonasaeruginosa. PLoS One 7, e38148 (2012).

41. O. E. Petrova, K. Sauer, A novel signaling network essential for regulating Pseudomonasaeruginosa biofilm development. PLoS Pathog 5, e1000668 (2009).

42. I. Vallet et al., Biofilm formation in Pseudomonasaeruginosa: fimbrial cup gene clusters are controlled by the transcriptional regulator MvaT. J Bacteriol 186, 28802890 (2004).

43. R. K. Ernst et al., Specific lipopolysaccharide found in cystic fibrosis airway Pseudomonasaeruginosa. Science 286, 1561-1565 (1999).

44. E. L. Macfarlane, A. Kwasnicka, M. M. Ochs, R. E. Hancock, PhoP-PhoQ homologues in Pseudomonasaeruginosa regulate expression of the outer-membrane protein OprH and polymyxin B resistance. Mol Microbiol 34, 305-316 (1999).

45. M. M. Ramsey, M. Whiteley, Pseudomonasaeruginosa attachment and biofilm development in dynamic environments. Mol Microbiol 53, 1075-1087 (2004).

46. J. B. McPhee et al., Contribution of the PhoP-PhoQ and PmrA-PmrB two-component regulatory systems to Mg2+-induced gene regulation in Pseudomonasaeruginosa. J Bacteriol 188, 3995-4006 (2006).

47. H. Liang, L. Li, Z. Dong, M. G. Surette, K. Duan, The YebC family protein PA0964 negatively regulates the Pseudomonasaeruginosa quinolone signal system and pyocyanin production. J Bacteriol 190, 6217-6227 (2008).

48. G. G. Nicastro, A. L. Boechat, C. M. Abe, G. H. Kaihami, R. L. Baldini,

Pseudomonasaeruginosa PA14 cupD transcription is activated by the RcsB response regulator, but repressed by its putative cognate sensor RcsC. FEMS Microbiol Lett 301, 115-123 (2009).

49. H. Mikkelsen, G. Ball, C. Giraud, A. Filloux, Expression of Pseudomonasaeruginosa CupD fimbrial genes is antagonistically controlled by RcsB and the EAL-containing PvrR response regulators. PLoS One 4, e6018 (2009).

50. S. Mukherjee, D. Moustafa, C. D. Smith, J. B. Goldberg, B. L. Bassler, The RhlR quorum-sensing receptor controls Pseudomonasaeruginosa pathogenesis and biofilm development independently of its canonical homoserine lactone autoinducer. PLoS Pathog 13, e1006504 (2017).

51. F. H. Damron, D. Qiu, H. D. Yu, The Pseudomonasaeruginosa sensor kinase KinB negatively controls alginate production through AlgW-dependent MucA proteolysis. J Bacteriol 191, 2285-2295 (2009).

52. W. Kong et al., ChIP-seq reveals the global regulator AlgR mediating cyclic di-GMP synthesis in Pseudomonasaeruginosa. Nucleic Acids Res 43, 8268-8282 (2015).

53. V. Deretic, W. M. Konyecsni, A procaryotic regulatory factor with a histone H1-like carboxy-terminal domain: clonal variation of repeats within algP, a gene involved in regulation of mucoidy in Pseudomonasaeruginosa. J Bacteriol 172, 5544-5554 (1990).

54. D. Balasubramanian et al., The regulatory repertoire of Pseudomonasaeruginosa AmpC ss-lactamase regulator AmpR includes virulence genes. PLoS One 7, e34067 (2012).

55. J. H. Hammond, E. F. Dolben, T. J. Smith, S. Bhuju, D. A. Hogan, Links between Anr and Quorum Sensing in Pseudomonasaeruginosa Biofilms. J Bacteriol 197, 2810-2820 (2015).

56. O. E. Petrova, K. Sauer, The novel two-component regulatory system BfiSR regulates biofilm development by controlling the small RNA rsmZ through CafA. J Bacteriol 192, 5275-5288 (2010).

57. O. E. Petrova, J. R. Schurr, M. J. Schurr, K. Sauer, The novel Pseudomonasaeruginosa two-component regulator BfmR controls bacteriophage-mediated lysis and DNA release during biofilm development through PhdA. Mol Microbiol 81, 767-783 (2011).

58. A. T. Yeung, M. Bains, R. E. Hancock, The sensor kinase CbrA is a global regulator that modulates metabolism, virulence, and antibiotic resistance in Pseudomonasaeruginosa. J Bacteriol 193, 918-931 (2011).

59. G. Wells, S. Palethorpe, E. C. Pesci, PsrA controls the synthesis of the Pseudomonasaeruginosa quinolone signal via repression of the FadE homolog, PA0506. PLoS One 12, e0189331 (2017).

60. I. Blus-Kadosh, A. Zilka, G. Yerushalmi, E. Banin, The effect of pstS and phoB on quorum sensing and swarming motility in Pseudomonasaeruginosa. PLoS One 8, e74444 (2013).

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

[…] Although the manuscript has been improved, there are some remaining issues that need to be addressed, as outlined below:

1. There are still concerns about the interpretations of RpoN binding data. It is well established that RpoN doesn't function as a dimer to facilitate transcription initiation. It binds RNAP core enzyme in a 1:1 molar ratio to form an RNAP holoenzyme. Second, the sequence motif itself is not a direct repeat of two half-sites. The 5' 'GG' and 3' 'GC' consensuses of the GGN(10)GC motif are representative sequences of the '-24' and '-12' elements of a typical sigma 54 (encoded by rpoN)-initiated promoter. The two elements are recognized by two separate domains of RpoN, both in the presence and absence of RNAP holoenzyme. This issue of data interpretation and representation needs to be addressed. It is also recommended that the authors use a different example of head to tail binding.

We deeply appreciate the reviewers for pointing out the potential issue of the data interpretation and representation of RpoN motif we included in previous submissions. We apologize for not carefully comparing the two similar motifs of RpoN identified in HT-SELEX. Putting the so-called “degenerated dimer” motif of RpoN in a reverse complementary order, we found that the motif was also composed of a “GG” and a “GC” half site. The difference lies on the fact that “GC” core sequence is on the 5’ position relative to the 3’ “GG” core sequence in the unknown motif. However, this previously uncharacterized motif, as explained by the reviewers, was not reproduced by any in vivo ChIP-seq data and may result from a possible topology in the in vitro biochemical condition (without interaction with other proteins) and therefore, we decided not to overstate the finding. Suggested by the reviewers, we replaced Figure 1C illustration of RpoN with another confident example of head-to-tail binding TF, PhoB, to avoid confusing readers. Agreed by the data from HT-SELEX and ChIP-seq by us (1) and by others (2), PhoB binds to DNA as a homodimer in a head-to-tail consecutive orientation, with a “GTCA(C/T)” monomer sequence preference spaced by a stretch of 6-bp ATrich nucleotides (see new Figure 1C). We have now added the text to the manuscript in Line 139-142.

2. The authors associate a TF with a pathway when a single predicted binding event occurs with any gene in that pathway. However, statistical analysis must be performed to determine the false discovery rate and whether their findings could occur via chance.

Thanks for the reviewers’ suggestion. We have now carried out enrichment analysis of TF binding sites located in promoter of genes involved in various functional pathways by the hypergeometric test. The P value was corrected by multiple test to obtain false discovery rate (method="fdr"). Then we associated a TF with a pathway when its binding sites were significantly enriched in promoters of genes in that pathway (FDR < 0.05). Finally, we counted the number of TFs significantly involved in each virulence pathway, and updated the numbers in the new Figure 2E. We have also modified the relevant description in the text, as shown in Lines 183-186. In addition, to highlight and better illustrate the major TFs involved in each pathway, we visualized the enrichment analysis results for each pathway using radar plots, as shown in "Figure 2—figure supplement 12" (cited in Lines 188, 343). In the plot, the radius represents the minus logarithm-transformed FDR, meaning that the longer the radius is, the more significantly the TF is associated with the pathway.

3. There are still points that the authors equate binding with function in the manuscript, please revise throughout to not make this overstatement.

Thanks a lot for pointing out this issue. We have now thoroughly gone through the text and revised the overstatements mixing the biological function with the putative TF binding. Please see changes highlighted in red font in the revised manuscript file.

References:

1. H. Huang et al., An integrated genomic regulatory network of virulence-related transcriptional factors in Pseudomonasaeruginosa. Nat Commun 10, 2931 (2019).

2. P. Bielecki et al., Cross talk between the response regulators PhoB and TctD allows for the integration of diverse environmental signals in Pseudomonasaeruginosa. Nucleic Acids Res 43, 6413-6425 (2015).

https://doi.org/10.7554/eLife.61885.sa2

Article and author information

Author details

  1. Tingting Wang

    Department of Biomedical Sciences, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
    Contribution
    Data curation, Validation, Investigation, Writing - original draft
    Contributed equally with
    Wenju Sun, Ligang Fan and Canfeng Hua
    Competing interests
    No competing interests declared
  2. Wenju Sun

    School of Medicine, Northwest University, Xi’an, China
    Contribution
    Software, Formal analysis, Visualization, Writing - original draft
    Contributed equally with
    Tingting Wang, Ligang Fan and Canfeng Hua
    Competing interests
    No competing interests declared
  3. Ligang Fan

    1. Department of Biomedical Sciences, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
    2. School of Medicine, Northwest University, Xi’an, China
    Contribution
    Data curation, Validation, Investigation, Writing - original draft
    Contributed equally with
    Tingting Wang, Wenju Sun and Canfeng Hua
    Competing interests
    No competing interests declared
  4. Canfeng Hua

    Department of Biomedical Sciences, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
    Contribution
    Data curation, Investigation, Writing - original draft
    Contributed equally with
    Tingting Wang, Wenju Sun and Ligang Fan
    Competing interests
    No competing interests declared
  5. Nan Wu

    School of Medicine, Northwest University, Xi’an, China
    Contribution
    Data curation, Validation
    Competing interests
    No competing interests declared
  6. Shaorong Fan

    School of Medicine, Northwest University, Xi’an, China
    Contribution
    Data curation, Validation
    Competing interests
    No competing interests declared
  7. Jilin Zhang

    Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Solna, Sweden
    Contribution
    Software, Writing - review and editing
    Competing interests
    No competing interests declared
  8. Xin Deng

    Department of Biomedical Sciences, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Writing - original draft, Project administration
    For correspondence
    xindeng@cityu.edu.hk
    Competing interests
    No competing interests declared
  9. Jian Yan

    1. Department of Biomedical Sciences, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
    2. School of Medicine, Northwest University, Xi’an, China
    Contribution
    Conceptualization, Supervision, Funding acquisition, Methodology, Writing - original draft, Project administration
    For correspondence
    jian.yan@cityu.edu.hk
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-1267-2870

Funding

National Natural Science Foundation of China (8187364)

  • Jian Yan

National Natural Science Foundation of China (31900443)

  • Wenju Sun

National Natural Science Foundation of China (31870116)

  • Xin Deng

Research Grants Council, University Grants Committee (21103018)

  • Xin Deng

Research Grants Council, University Grants Committee (21100420)

  • Jian Yan

Research Grants Council, University Grants Committee (11101619)

  • Xin Deng

China Postdoctoral Science Foundation (2019M663799)

  • Wenju Sun

China Postdoctoral Science Foundation (2019M663794)

  • Ligang Fan

City University of Hong Kong (9667188)

  • Jian Yan

City University of Hong Kong (7005314)

  • Jian Yan

National Natural Science Foundation of China (32070596)

  • Jian Yan

City University of Hong Kong (9610424)

  • Jian Yan

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

Funding: This work was supported by the National Natural Science Foundation of China (32070596 and 81873642 to JY, 31900443 to WS, 31870116 to XD), the Research Grants Council of Hong Kong SAR (21103018 and 11101619 to XD, 21100420 to JY), the City University of Hong Kong (9667188, 9610424 and 7005314 to JY), and the China Postdoctoral Science Foundation (2019M663794 to LF and 2019M663799 to WS).

Senior Editor

  1. Bavesh D Kana, University of the Witwatersrand, South Africa

Reviewing Editor

  1. Christina L Stallings, Washington University School of Medicine, United States

Reviewer

  1. Yu Zhang, Shanghai Institute of Plant Physiology and Ecology, China

Publication history

  1. Received: August 7, 2020
  2. Accepted: March 26, 2021
  3. Accepted Manuscript published: March 29, 2021 (version 1)
  4. Version of Record published: April 12, 2021 (version 2)

Copyright

© 2021, Wang et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 2,457
    Page views
  • 328
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Genetics and Genomics
    2. Microbiology and Infectious Disease
    Jessamyn I Perlmutter et al.
    Research Article

    Wolbachia are the most widespread bacterial endosymbionts in animals. Within arthropods, these maternally-transmitted bacteria can selfishly hijack host reproductive processes to increase the relative fitness of their transmitting females. One such form of reproductive parasitism called male killing, or the selective killing of infected males, is recapitulated to degrees by transgenic expression of the WO-mediated killing (wmk) gene. Here, we characterize the genotype-phenotype landscape of wmk-induced male killing in D. melanogaster using transgenic expression. While phylogenetically distant wmk homologs induce no sex-ratio bias, closely-related homologs exhibit complex phenotypes spanning no death, male death, or death of all hosts. We demonstrate that alternative start codons, synonymous codons, and notably a single synonymous nucleotide in wmk can ablate killing. These findings reveal previously unrecognized features of transgenic wmk-induced killing and establish new hypotheses for the impacts of post-transcriptional processes in male killing variation. We conclude that synonymous sequence changes are not necessarily silent in nested endosymbiotic interactions with life-or-death consequences.

    1. Genetics and Genomics
    Kevin R Costello et al.
    Research Article

    Transposable elements (TEs) are mobile genetic elements that make up a large fraction of mammalian genomes. While select TEs have been co-opted in host genomes to have function, the majority of these elements are epigenetically silenced by DNA methylation in somatic cells. However, some TEs in mice, including the Intracisternal A-particle (IAP) subfamily of retrotransposons, have been shown to display interindividual variation in DNA methylation. Recent work has revealed that IAP sequence differences and strain-specific KRAB zinc finger proteins (KZFPs) may influence the methylation state of these IAPs. However, the mechanisms underlying the establishment and maintenance of interindividual variability in DNA methylation still remain unclear. Here we report that sequence content and genomic context influence the likelihood that IAPs become variably methylated. IAPs that differ from consensus IAP sequences have altered KZFP recruitment that can lead to decreased KAP1 recruitment when in proximity of constitutively expressed genes. These variably methylated loci have a high CpG density, similar to CpG islands, and can be bound by ZF-CxxC proteins, providing a potential mechanism to maintain this permissive chromatin environment and protect from DNA methylation. These observations indicate that variably methylated IAPs escape silencing through both attenuation of KZFP binding and recognition by ZF-CxxC proteins to maintain a hypomethylated state.