Structural and Functional Aspects of G-quadruplex (G4) Structures and Pathogenicity Islands (PAIs).

(A) Schematic representation of a guanine tetrad stabilized by Hoogsteen base pairing and a positively charged central ion, illustrating the key elements of G4 structures. (B) Structural heterogeneity of G4s. G-quadruplex structures are polymorphic and can be categorized into different families, such as parallel or antiparallel, based on the orientation of the DNA strands. They can fold either intramolecularly or intermolecularly, leading to diverse structural configurations. (C) General sequence formula for G4s, highlighting the repeated guanine-rich tracts that form G-quartets. (D) Regulatory roles of G-quadruplexes in transcription. G4 structures can repress transcription by blocking RNA polymerase from binding to promoter sequences, or enhance it by promoting single-stranded DNA (ssDNA) formation. (E) General structure of a PAI. PAIs are characteristic regions of DNA found within the genomes of pathogenic bacteria, distinguishing them from nonpathogenic strains of the same or related species. Repeat sequences are DNA segments duplicated within the PAI and can serve as recognition sites for various enzymes involved in the integration and excision of the PAI from the bacterial chromosome. tRNA genes act as anchor points for the insertion of foreign DNA acquired through horizontal gene transfer. Virulence genes encode proteins or factors that play crucial roles in the virulence and pathogenicity of the bacterium, contributing to adhesion, invasion, immune evasion, toxin production, or other pathogenic mechanisms. Insertion elements include transposons, bacteriophages, or plasmids, enabling the PAI to be transferred between bacterial cells and potentially disseminated to different strains or species.
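The sequence formula in panel (C) is not spelled out in the caption text. As a minimal sketch, assuming the widely used canonical motif of four runs of at least three guanines separated by loops of 1 to 7 nucleotides, a putative G4 scan of a single strand could look as follows. The pattern, the loop limits, and the function name `find_g4_motifs` are assumptions for illustration, not necessarily the definition used for this figure.

```python
import re

# Assumed canonical motif: G{3+} N{1-7} G{3+} N{1-7} G{3+} N{1-7} G{3+}.
# The exact formula used in the figure is not given in the caption.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g4_motifs(seq):
    """Return (start, end, motif) tuples for non-overlapping matches.

    Only the given strand is scanned; G4s on the complementary strand
    would appear here as C-rich motifs and are not reported.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    for hit in find_g4_motifs("TTGGGAGGGTGGGAGGGTT"):
        print(hit)
```

Coordinates are 0-based with an exclusive end, following Python slicing conventions.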

Analysis of Pathogenicity Islands (PAIs) and G-quadruplexes (G4s) in Pathogen Genomes.

(A) Phylogenetic analysis of pathogen genomes based on 89 bacterial strains, showing the evolutionary relationships among species. Additional genomic information, including genome size, GC content, rRNA density, tRNA density, and PAI length, is provided. Strains of the same species share the same color. (B) Genomic location of specific PAIs in bacterial genomes, divided into ten regions. PAIs are represented by green triangles, and their names are indicated. The tRNA insertion sites are also marked. (C) Heatmap illustrating the relative abundance of G4s in bacterial genomes, divided into ten regions. Red indicates a higher relative abundance, while blue indicates a lower relative abundance. (D & E) Correlation analysis between the number of G4s, the frequency of G4s, and GC content in various genomic features, including the whole genome, genes, promoters, rRNA, and tRNA.
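The ten-region layout in panels (B) and (C) and the GC content used in panels (D & E) can be illustrated with a short sketch. It assumes that regions are equal-length bins along the genome and that relative abundance is the fraction of all G4s falling in each bin; the caption does not state the normalization actually used, and the function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def relative_abundance_by_region(starts, genome_length, n_regions=10):
    """Share of G4 start positions falling in each equal-length genome bin.

    Assumption: 'relative abundance' means count in bin / total count.
    """
    counts = [0] * n_regions
    for start in starts:
        # Integer arithmetic maps a 0-based position to its bin index;
        # min() guards against a position equal to genome_length.
        idx = min(start * n_regions // genome_length, n_regions - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


if __name__ == "__main__":
    print(gc_content("GGCCAATT"))
    print(relative_abundance_by_region([0, 5, 95, 99], genome_length=100))
```

The start positions passed in would come from whatever G4 prediction step precedes this analysis.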

Putative Origin and Functional Annotation of G-quadruplexes (G4s) within Pathogenicity Islands (PAIs) in Escherichia coli.

(A-E) Comparison of GC content (left panel) and GC frequency (right panel) between the genome and pathogenicity islands (PAIs), categorized into five regions (20%-30%, 30%-40%, 40%-50%, 50%-60%, and 60%-70%). */**/***/**** indicate significant differences (p < 0.05/0.01/0.001/0.0001, respectively). (F) Hypotheses on the origin of G4 structures within PAIs, involving horizontal gene transfer mechanisms (conjugation, transduction, and transformation). (G) Evolutionary relatedness of 10 types of PAIs (categorized into six main categories) in E. coli strains. (H & I) Examples of G4 structures within PAIs in E. coli strains. The grey bar represents the virulence region, the red box indicates a virulence gene, the blue box represents an insertion site region or repeat, the green box denotes an integrase, the purple triangle indicates a tRNA insertion site, and the yellow triangle indicates an effector. (J & K) Functional annotation analysis of G4-covered genes within PAIs in two E. coli strains, including biological process (BP), cellular component (CC), and molecular function (MF) categories.
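The asterisk notation in panels (A-E) maps each p-value to the strictest threshold it falls below. A minimal sketch of that mapping is given here; the function name is hypothetical, and the statistical test that produced the p-values is not stated in the caption.

```python
def significance_stars(p_value):
    """Return the asterisk label for a p-value, or '' if p >= 0.05."""
    # Thresholds are checked from strictest to loosest.
    thresholds = ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*"))
    for cutoff, stars in thresholds:
        if p_value < cutoff:
            return stars
    return ""


if __name__ == "__main__":
    for p in (0.2, 0.03, 0.004, 0.0002, 0.00005):
        print(p, significance_stars(p))
```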