Abstract
Summary
Streptococcus pyogenes causes mild human infections as well as life-threatening invasive diseases. Since the mutations known to enhance virulence to date account for only half of the severe invasive infections, additional mechanisms/mutations need to be identified. Here, we conducted a genome-wide association study of emm89 S. pyogenes strains to comprehensively identify pathology-related bacterial genetic factors (SNPs, indels, genes, or k-mers). Japanese (n=311) and global (n=666) cohort studies of strains isolated from invasive or non-invasive infections revealed 17 and 1,075 SNPs/indels and 2 and 169 genes, respectively, that displayed associations with invasiveness. We validated one of them, a non-invasiveness-related point mutation, fhuB T218C, by structure predictions and introducing it into a severe invasive strain and confirmed that the mutant showed slower growth in human blood. Thus, we report novel mechanisms that convert emm89 S. pyogenes to an invasive phenotype and a platform for establishing novel treatments and prevention strategies.
Introduction
Streptococcus pyogenes is a human-restricted gram-positive pathogen associated with a wide spectrum of diseases. While S. pyogenes often causes non-invasive diseases, including pharyngitis and impetigo in children, it is also known as a “flesh-eating bacterium” owing to its involvement in life-threatening invasive diseases, such as necrotizing fasciitis and streptococcal toxic shock syndrome (STSS)1,2. In 2005, more than 0.6 million people were estimated to have invasive S. pyogenes infections3, and the reported incidence of invasive S. pyogenes infections continues to increase globally1. In cases of severe infection, rapid bacterial growth and profound metabolic acidosis necessitate urgent surgical inspection and extended debridement with empiric antibacterial chemotherapy4,5. However, even with proper treatment, the mortality rate of patients with S. pyogenes infections remains 23–81%6. Moreover, although several protective vaccine candidates against S. pyogenes exist, no safe and effective commercial vaccine has yet been licensed for human use2,7.
S. pyogenes has been classified into at least 240 emm types based on the emm gene hypervariable region sequence8. Since the mid to late 2000s, emm89 strains have been increasingly isolated from samples obtained from patients with invasive diseases, becoming one of the most frequently identified lineages in developed countries9,10. For example, in the United Kingdom, no more than ten cases of invasive diseases caused by emm89 S. pyogenes were reported annually in the 1990s. In the 2010s, however, approximately 150 cases were reported out of the 1,000–1,500 invasive cases caused by all emm types of S. pyogenes in each year11. Similarly, in Japan, in the 2010s, there was an increase in the number of patients with emm89 S. pyogenes-induced severe invasive infections. In 2018, emm89 strains were isolated from 36% of patients diagnosed with S. pyogenes-induced severe invasive infections, following emm1 strains, which were isolated from 57% of patients9.
S. pyogenes emm89 strains have been genetically sub-clustered into three clades according to the nga promoter region patterns and the presence/absence of the hasABC locus, which is responsible for hyaluronan capsule synthesis. Clade 3 is distinct from clades 1 and 2 in terms of two features: overexpression of virulence factors NAD glycohydrolase (NADase) and streptolysin O (SLO), owing to mutations in the promoter region of the nga-ifs-slo operon, and the lack of a hyaluronan capsule11–13. Although clade 3 strains have frequently been isolated from invasive diseases, their numbers from non-invasive infections have also increased. We previously reported no difference in the isolation frequencies of clade 3 strains between invasive and non-invasive diseases, at least in Japan, and concluded that the mutations in clade 3 are not responsible for the gain of invasiveness14. Therefore, there must be other genetic features within the emm89 strains that determine their phenotypes.
During infections caused by emm1 S. pyogenes, mutations in the two-component system, CovR/S, promote high virulence15. These mutations cause the upregulation of DNase, hyaluronan capsule, IL-8 protease, C5a peptidase, streptokinase, NADase, SLO, and superantigen SpeA, as well as the downregulation of cysteine protease SpeB and streptolysin S16,17. The resulting mutants can prevent neutrophil death and subsequently promote tissue destruction and systemic infections. Epidemiologically, Ikebe et al. reported that nonsense mutations in covR and/or covS are present in 46.3% of S. pyogenes strains isolated from severe invasive infections in Japan, but only 1.69% of isolates from non-severe ones18. Moreover, these studies indicated that the covR/S mutation is not responsible for all invasive clinical strains. Thus, we hypothesize that other mechanisms are involved in the development of invasiveness.
In the present study, we aimed to explore novel hypervirulent mechanisms of S. pyogenes by performing a genome-wide association study (GWAS) on S. pyogenes. Our GWAS focused on the emm89 lineage of S. pyogenes to detect lineage-specific factors and minimize false positives due to lineage differentiation. For a comprehensive analysis, we constructed the core genome and pan-genome of emm89 S. pyogenes strains and evaluated the effect of single-nucleotide polymorphisms (SNPs) on core gene alignment and accessory clusters of orthologous genes (COGs). In addition, we performed a k-mers-based GWAS to detect SNPs in the intergenic regions and multiple mutations. We collected and sequenced emm89 clinical strains isolated in Japan during 2016–2021, in addition to public emm89 genome sequences. Using these sequences, we investigated the bacterial factors associated with severe invasive infections in Japan and globally, using GWAS. Based on the bacterial protein structural predictions, we then selected candidates with high potential relevance to the phenotype. Finally, we introduced an SNP related to non-invasiveness into a clinical strain isolated from a severe invasive infection and examined the alteration of the bacterial phenotype through an ex vivo infection assay.
Results
Collection of emm89 S. pyogenes clinical isolates in Japan and construction of cohorts
We collected clinical S. pyogenes strains isolated between 2016 and 2021 from patients with non-invasive and severe invasive infections in Japan. The Ministry of Health, Labour, and Welfare of Japan has defined the clinical criteria of severe invasive β-hemolytic streptococcal infections as STSS, based on the STSS 2010 Case Definition of the Centers for Disease Control and Prevention in the US, with minor modifications, including the addition of encompassing symptoms in the central nervous system (Table S1)19,20.
For the emm89 clinical isolates, we collected T serotype TB3264 and untypable strains, in addition to emm genotype-identified strains. T-typing is a serologically-based approach that is often used as an alternative or supplement to emm typing. T-antigens are trypsin-resistant surface antigens exhibiting extensive antigenic diversity21. Isolates of a given emm type frequently share the same T serotype pattern22,23. The T serotype TB3264 corresponds to the genotype emm89 or emm9421,24. A total of 207 clinical isolates were collected with the cooperation of the National Institute of Infectious Diseases and ten public health institutes nationwide (Tables S2 and S3). We performed draft genome sequencing of the strains and identified their emm types. In total, 150 of these were determined as emm89, followed by 24 and 19 strains as emm4 and emm12, respectively. To focus on the pathogenic mechanisms underlying severe invasive infections in the emm89 cohort, we used 150 emm89 strains for subsequent analyses (Figure 1A, Tables S2, and S3). We previously determined the draft genome sequences of 161 emm89 strains isolated in Japan between 2011 and 2019 and determined their phenotypes using the same criteria (Table S2)14. We combined these two sets and finally considered a total of 311 emm89 strains, including 135 severe invasive and 176 non-invasive isolates, as the Japanese cohort.
We also collected public genome sequences of emm89 S. pyogenes strains isolated from nine countries to further characterize the genetic properties of the Japanese cohort (Table S2)25–27. In this study, the phenotypes of these strains were considered invasive if the diagnoses included severe infections, STSS, invasive infections, necrotizing fasciitis, bacteremia, or sepsis, and isolation sites were described as normally sterile sites, such as the blood, brain, kidney, or muscular tissue. Consequently, we identified 666 strains in the global cohort, including 420 isolates from invasive cases and 246 from non-invasive ones (Table S2).
Pan-genome and phylogenetic analyses reveal both shared and distinct features in the Japanese and global cohorts
To determine the core genes and gene distribution in both cohorts, we performed pan-genome analyses. In the Japanese cohort, 1,417 core genes common to more than 99% of all isolates were determined out of the 3,334 different genes detected within the 311 strains. In contrast, the global cohort was more diverse, with 4,743 different genes, of which 1,327 were core genes (Figure 1B).
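The core-gene criterion described above (genes common to more than 99% of isolates) can be illustrated with a minimal Python sketch. The function name `core_genes`, the input layout (a mapping from gene name to the set of isolates carrying it), and the toy data are assumptions made for illustration; they do not reproduce the pan-genome software used in the study.

```python
from typing import Dict, List, Set


def core_genes(presence: Dict[str, Set[str]], n_isolates: int,
               threshold: float = 0.99) -> List[str]:
    """Return genes carried by more than `threshold` of all isolates.

    `presence` maps a gene name to the set of isolate IDs carrying it
    (a hypothetical layout chosen for this sketch).
    """
    if n_isolates <= 0:
        raise ValueError("n_isolates must be positive")
    return sorted(
        gene for gene, carriers in presence.items()
        if len(carriers) / n_isolates > threshold
    )


# Toy example with 200 hypothetical isolates.
isolates = [f"iso{i}" for i in range(200)]
toy = {
    "geneA": set(isolates),        # 200/200 isolates -> core
    "geneB": set(isolates[:199]),  # 199/200 = 99.5%  -> core
    "geneC": set(isolates[:150]),  # 150/200 = 75%    -> accessory
}
core = core_genes(toy, n_isolates=len(isolates))
accessory = sorted(set(toy) - set(core))
```

Genes that fail the threshold (here `geneC`) form the accessory set that the gene-based GWAS examines.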
Next, we calculated the phylogenetic relationships based on the maximum likelihood of the core gene sequences (Figure 1C and S2). The tree for the global cohort branched into four clusters: A, B1, B2, and B3. Cluster B3 included 640 genetically similar strains isolated mainly from Europe, North America, and Japan, whereas cluster A comprised 19 strains isolated from Oceanian countries, Kenya, Lebanon, the US, and Japan (Figure 1C). The phylogenetic tree for the Japanese cohort could be clustered in the same way as that for the global cohort, with no significant difference in the proportions of strains classified into each sub-cluster (chi-square test, p=0.13; Figure S2 and Table S4). Thus, we concluded that the overall phylogenetic features of emm89 strains were distributed similarly in Japan and other areas, especially Europe and North America. Within cluster B3, we identified a non-invasive strain from Japan whose nga promoter pattern matched none of the reported variations (Figure 1C)28. This pattern is likely a subtype of clade 3 as it shares the haplotype A–27G–22T–18, which is distinctive of clade 3, but has a mutation in the –10 box (Figure S2)28. Thus, we named this novel nga promoter variation type 3.428. Multilocus sequence typing (MLST) analysis revealed that in cluster B3, 522 strains (80.9%) were ST101, and 96 strains (14.8%) were ST646 (Figure 1C). Notably, ST646 was the second most dominant type and a Japan-specific lineage. Moreover, ST101 and ST646 differed only at the 295th nucleotide of the murI locus, one of the seven loci that determine MLST, suggesting that the two lineages are genetically closely related. Eight novel sequence types were determined (ST1450, 1451, 1452, 1454, 1455, 1456, 1461, and 1463), and 15 strains carrying novel sequence types were detected in the Japanese cohort (Figure S1).
Taken together, using phylogenetic approaches, we found that most strains from Japan and countries in Europe and North America share genetically close relationships, with only one unique lineage in Japan, ST646.
GWASes detect SNPs/indels associated with invasiveness that are both common and specific to Japan and other countries
To discover all types of genetic variants in whole genes within emm89 S. pyogenes associated with (severe) invasiveness, we applied pan-genome analysis and performed three types of independent GWASes targeting SNPs in core genes, the presence or absence of all genes, and other variants located in intergenic regions spanning several nucleotides.
We extracted SNPs and single-nucleotide indels from core gene alignments and detected 24,627 and 47,060 SNPs/indels in the Japanese and global cohorts, respectively. Subsequent GWASes identified SNPs/indels associated with severe invasiveness in the Japanese cohort and with invasiveness in the global cohort. To control for population bias, we calculated pairwise distance matrices and selected seven and three dimensions for the analyses of the Japanese and global cohorts, respectively (Figure S3A and S3B). For each cohort, we performed a permutation test by conducting 1,000 iterations of calculations with randomly permuted phenotypes, with the significance level set at the 5th percentile of the 1,000 minimal p-values (p=5.75×10−4 and p=5.75×10−4 for the Japanese and global cohorts, respectively). The GWAS of the Japanese cohort detected 17 SNPs/indels in 13 core genes (Figure 2A and Table S5). Of the 17 significant variants, 7 were single-nucleotide deletions (SNDs), 7 were SNPs causing non-synonymous amino acid substitutions, and 3 were SNPs causing synonymous substitutions. The covS gene (also known as csrS), encoding a sensor kinase of the two-component system CovR/S, contained the four SNDs with the lowest p-values (p=1.16×10−7 for the 39th, 40th, and 46th nucleotides, and p=1.15×10−6 for the 125th nucleotide). These four deletions were associated with severe invasive infections.
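The permutation-based threshold described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: a two-sided Fisher's exact test stands in for the association test, which this section does not specify, and the adjustment for population structure is omitted. The names `fisher_p`, `min_p`, and `permutation_threshold` and the toy data are hypothetical.

```python
import random
from math import comb
from typing import List, Sequence


def fisher_p(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def min_p(variants: Sequence[Sequence[int]], phenotype: Sequence[int]) -> float:
    """Smallest p-value over all variants (each a 0/1 vector over isolates)."""
    best = 1.0
    for v in variants:
        a = sum(1 for g, y in zip(v, phenotype) if g and y)
        b = sum(1 for g, y in zip(v, phenotype) if g and not y)
        c = sum(1 for g, y in zip(v, phenotype) if not g and y)
        d = len(phenotype) - a - b - c
        best = min(best, fisher_p(a, b, c, d))
    return best


def permutation_threshold(variants: Sequence[Sequence[int]],
                          phenotype: Sequence[int],
                          n_perm: int = 1000, alpha: float = 0.05,
                          seed: int = 0) -> float:
    """5th percentile (for alpha=0.05) of minimal p-values under shuffled labels."""
    rng = random.Random(seed)
    labels: List[int] = list(phenotype)
    minima = []
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any genotype-phenotype link
        minima.append(min_p(variants, labels))
    minima.sort()
    return minima[max(0, int(alpha * n_perm) - 1)]


# Toy data: 5 hypothetical variants across 20 isolates (10 cases, 10 controls).
_rng = random.Random(1)
toy_variants = [[_rng.randint(0, 1) for _ in range(20)] for _ in range(5)]
toy_phenotype = [1] * 10 + [0] * 10
threshold = permutation_threshold(toy_variants, toy_phenotype, n_perm=200)
```

A variant would then be called significant when its observed p-value falls below `threshold`. Because the threshold comes from the minimum over all variants in each permutation, it accounts for multiple testing across the whole variant set.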
We also performed a GWAS for the global cohort and detected 1,075 SNPs/indels significantly related to invasive infections among the 360 core genes (Figure 2B, 2C and Table S6). Among the significant SNPs/indels, 725 caused synonymous substitutions and 319 caused non-synonymous substitutions or frameshift mutations. Moreover, 19 SNPs induced nonsense mutations, whereas the effects of 12 SNPs/indels were unpredictable because of a lack of reference sequences (Table S3). Notably, 96 SNPs/indels accumulated in a single gene, murJ, which is involved in peptidoglycan biosynthesis, whereas 53 and 51 SNPs/indels were detected in murE and group_1008, respectively (Figure S3C). The SNP with the lowest p-value (p=1.35×10−14) was located in lacE, which encodes the EIICB component of the lactose-specific phosphotransferase system (Figure 2C). This mutation was associated with an invasive phenotype and was mainly observed in strains isolated in the US. Of the 17 significant SNPs/indels in the Japanese cohort, 10 were also detected in the global cohort, including 4 SNDs in covS and 1 SNP in each of 6 loci (Figure 2C). Deletions at the covS locus were common among strains from several countries, including Japan. In contrast, SNPs in six loci, gatA, group_1102, group_647, iscS_1, recU, and fhuB, were present exclusively in Japan (Figure 2C and Table S6). These results suggest that several bacterial mechanisms cause severe invasive S. pyogenes infections, and some prevail worldwide, such as covS mutations, whereas others are specific to Japan.
GWAS on COGs reveals 2 and 169 genes associated with severe invasiveness in the Japanese cohort and invasiveness in the global cohort, respectively
Next, we examined the associations of accessory COGs with severe invasiveness in the Japanese cohort and with invasiveness in the global cohort. A permutation test set the significance levels at p-values of 1.09×10−4 and 7.72×10−5 for the Japanese and global cohorts, respectively. Two significant genes were detected in the GWAS for the Japanese cohort: group_184, which encodes a hypothetical protein, and divIC, which encodes a septum formation initiator protein (p=8.81×10−6 and p=6.72×10−6, respectively; Figure 3A and Table S7). Although analysis of the global cohort revealed 169 genes that were significantly related to invasiveness, none was identical or homologous to the two genes detected in the Japanese cohort (Figure 3B, 3C, and Table S8). Among the 169 genes, 25 were phage-related and 14 encoded components of mobile genetic elements (MGEs), such as transposase, integrase, and recombinase. The gene with the lowest p-value was group_829, which encodes a transposase and is related to the invasive phenotype (Figure 3C and Table S8). In addition, group_2689, which encodes a multidrug efflux transporter permease, rhaR_3, which encodes a transcriptional regulator involved in rhamnose metabolism, and group_1829, which encodes a vitamin B12 import transporter permease, were also substantially associated with the invasive phenotype. We identified several gene distribution patterns associated with invasiveness, suggesting that multiple independent genetic factors cause invasive infections (Figure 3C). Taken together, genes associated with invasiveness were found to encode mobile genetic factors and transporters, whereas major virulence factors were not significantly associated with invasiveness.
K-mers-based GWAS detects both distinctive and identical variants compared to the SNPs- and COGs-based GWASes
To detect SNPs/indels and multiple mutations in the entire genome, we extracted 31-nt-length k-mers from whole genomes and performed a GWAS. The k-mers-based GWAS can handle polymorphisms spanning more than one base, such as indels, inversions, and translocations, in both the coding and non-coding regions.
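The extraction of 31-nt k-mers can be sketched in Python as follows. The functions `kmers` and `kmer_presence` and the toy sequences are hypothetical, and the sketch ignores reverse complements and ambiguous bases, which dedicated k-mer counting tools typically handle.

```python
from typing import Dict, Iterable, Set


def kmers(seq: str, k: int = 31) -> Set[str]:
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_presence(genomes: Dict[str, Iterable[str]],
                  k: int = 31) -> Dict[str, Set[str]]:
    """Map each k-mer to the set of isolates whose contigs contain it."""
    table: Dict[str, Set[str]] = {}
    for isolate, contigs in genomes.items():
        for contig in contigs:
            for km in kmers(contig, k):
                table.setdefault(km, set()).add(isolate)
    return table


# Toy example with k=4 so the result is easy to inspect.
toy_genomes = {
    "isoA": ["ACGTACGT"],
    "isoB": ["ACGTTCGT"],  # differs from isoA at a single base
}
presence = kmer_presence(toy_genomes, k=4)
shared = {km for km, iso in presence.items() if len(iso) == 2}
```

In this toy example the single-base difference between the two sequences changes every k-mer overlapping that position, so a variant appears as a group of k-mers present in one isolate and absent from the other. Each k-mer's presence/absence pattern across isolates is what is tested against the phenotype.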
In the Japanese cohort, the k-mers-based GWAS detected two regions containing causative variants associated with severe invasiveness (Table S9). In the de Bruijn graphs, overlapping k-mers are concatenated into a single node, and edges represent variability among the sequences (Figure 4A–F). The set of connected nodes and edges comprising each de Bruijn graph is called a complex. Nodes were considered significant if their q-value was less than 0.05. Significant nodes are shown in red and blue, indicating an association with severe invasive or non-invasive infections, respectively. The complex comprising the nodes with the lowest q-value (q=1.49×10−2) was Comp_11 in the covS locus (Table S9). The causative mutations in covS were an SND and 10 nucleotide polymorphisms (Figure 4A). This deletion resulted in a frameshift mutation that shortened CovS from 500 to 35 amino acids, leading to increased invasiveness, as previously reported in other emm types17. Another complex significantly associated with severe invasiveness was Comp_2 (q=4.22×10−2; Table S9). Comp_2 is a highly variable region containing eight hypothetical protein-coding genes, with high similarity within the first 75 bp. Significant k-mers were also mapped to the first 26 bp of group_184 and 20 bp upstream (Figure 4B). These findings suggest that group_184 may contribute to severe invasiveness not only through its presence but also through the sequence of its upstream region.
Next, we analyzed the global cohort and identified mutations that were significantly associated with invasiveness in five regions (Table S10). The mutation with the lowest q-value (q=1.90×10−2) was located in Comp_7 and was identical to the SND in covS detected in the k-mers-based GWAS of the Japanese cohort. Thus, this SND is distributed globally, as are the four SNDs detected in the SNPs/indels-based GWAS.
Two significant k-mers were present in Comp_6, which corresponded to a 270-bp intergenic region (q=1.90×10−2; Table S10 and Figure 4C). In Comp_24, a highly variable region containing group_141, group_142, and group_143, which encode transposases, the presence of a 281-bp sequence consisting of several k-mers was significantly associated with the invasive phenotype (q=1.90×10−2; Table S10 and Figure 4D). In Comp_10, n27458, in the sagG locus encoding the ATP-binding protein of the efflux transporter of SLS, was significantly correlated with the non-invasive phenotype (q=2.40×10−2; Table S10 and Figure 4E). However, the significant SNP in n27458, sagG (A882G), was found to cause a synonymous mutation. The other significant mutation was present in the fhuB locus, encoding a putative ferrichrome transport system permease (q=2.40×10−2; Table S10). This mutation was identical to SNP T218C detected in the SNPs/indels-based GWAS (Figure 4F) and changes the 73rd residue of FhuB from valine to alanine. Therefore, the k-mer approach identified multiple variants, including the mutation identified in the SNPs/indels-based GWAS, fhuB SNP T218C. In addition, while the mutation detected in covS differed from that detected in the SNPs/indels-based GWAS, both caused frameshift mutations.
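The relationship between a nucleotide position in a coding sequence and the affected residue (for example, fhuB T218C and the 73rd residue, or sagG A882G as a third-position change) can be checked with a short Python sketch. Positions are assumed to be 1-based and counted from the first base of the start codon. The source does not give the actual codons, so the valine codon GTT used below is a hypothetical stand-in, and the function names are illustrative.

```python
from typing import Tuple

# Standard genetic code, with codons ordered by T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: _AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(_BASES)
    for j, b2 in enumerate(_BASES)
    for k, b3 in enumerate(_BASES)
}


def codon_coordinates(nt_pos: int) -> Tuple[int, int]:
    """Map a 1-based CDS position to (residue number, position within codon)."""
    if nt_pos < 1:
        raise ValueError("nt_pos must be 1-based")
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


def substitution_effect(ref_codon: str, nt_pos: int, alt: str) -> Tuple[str, int, str]:
    """Return (reference residue, residue number, altered residue) for a SNP."""
    residue, offset = codon_coordinates(nt_pos)
    alt_codon = ref_codon[:offset - 1] + alt + ref_codon[offset:]
    return CODON_TABLE[ref_codon.upper()], residue, CODON_TABLE[alt_codon.upper()]


# fhuB T218C: second base of codon 73; with a hypothetical GTT codon, V -> A.
fhub = substitution_effect("GTT", 218, "C")
# sagG A882G falls on the third base of codon 294.
sagg_position = codon_coordinates(882)
```

Because all four GTN codons encode valine and all four GCN codons encode alanine, a T-to-C change at the second position of any valine codon gives alanine, whichever third base the real codon carries.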
AlphaFold-based prediction of the impact of the identified mutations on function
To assess the impact of mutations on protein function, we predicted the protein structure using AlphaFold29. Here, we present structural predictions for three representative proteins: LacE, whose mutation was observed mainly in invasive strains from the US; CovS, whose invasion-related deletions prevailed worldwide; and FhuB, which carries a prominent mutation in the Japanese cohort and is associated with non-invasiveness.
The invasive-related SNP in lacE substitutes the 554th glycine in LacE with valine. LacE was predicted to be a membrane protein with nine transmembrane regions (Figure S4A). LacE is the EIICB component of the lactose-specific phosphotransferase system, and the EIIC and EIIB domains correspond to the transmembrane and intracellular regions, respectively. As shown in magenta in the model, the 554th residue is predicted to be in the intracellular EIIB domain (Figure S4B).
Next, we predicted a homodimerized CovS model using AlphaFold, as the CovS of S. pyogenes forms homodimers30. SOSUI predicted that CovS has two transmembrane regions (Figure 5A). Mutations detected in the SNP/indels- and k-mers-based GWASes were predicted to shorten the CovS protein to 35 and 45 amino acids, respectively. As the intracellular domain of CovS is in the C-terminal region and is involved in the phosphorylation of the transcriptional regulator CovR, frameshift mutations leading to CovS truncation would inactivate the protein, and thus, CovR function (Figure 5B).
As described above, the SNP T218C in the fhuB locus substitutes the 73rd valine of FhuB with alanine. FhuB is a component of an ATP-binding cassette transporter system that utilizes ferrichrome, which is one of the carriers of Fe3+. FhuB is predicted to localize to the cell membrane and form a channel with FhuG (Figure 5C). SOSUI suggested that FhuB and FhuG are 9-transmembrane proteins. The FhuBG complex can bind to one molecule of the extracellular ferrichrome-binding lipoprotein, FhuD, and two molecules of the intracellular ATP-binding protein, FhuC. Therefore, we constructed a structural model of the FhuBCCDG complex (Figure 5D), which implied that the 73rd residue of FhuB exists in a region adjacent to FhuD. The hydrophobicity of the side chain was attenuated by the mutation, which potentially affected ferrichrome transport (Figure 5E). The SNP G538A in fhuD was Japan-specific, significantly related to severe invasiveness, and caused the V180I mutation in FhuD (Figure S3C). The prediction suggested that the 180th residue is in an α-helix distant from the active site or the interactive sites with FhuB. Both valine and isoleucine are branched-chain amino acids, and the amino acid residue is located on a rigid structure, the α-helix (Figure 5D). Thus, we believe that this mutation is less likely to cause structural changes than the other non-synonymous mutations.
The fhuB T218C mutation inhibits the growth of a severe invasive strain in human blood
Based on the GWAS results and predicted protein structures, we focused on the SNP fhuB T218C. We constructed a mutant strain carrying the SNP fhuB T218C to further investigate its effect on virulence. We selected the strain TK02, which carries the wild-type (WT) allele T218 in fhuB and was originally isolated from a patient with severe invasive infection in Japan. We used TK02’, a derivative of TK02 obtained after several passages, as the WT strain and introduced the SNP fhuB T218C into it via allelic exchange mutagenesis with a thermo-sensitive shuttle vector. We then confirmed by whole-genome resequencing that the WT and fhuB T218C strains did not differ apart from the introduced SNP.
To reveal the effects of the SNP on invasiveness, we performed a transcriptomic analysis of the WT and fhuB T218C strains in THY broth and human blood. Principal component analysis revealed that the differences in the overall transcriptional profiles between the strains were more pronounced in blood than in THY (Figure 6A, 6B, and Tables S11–14). Among the overlapping genes, we found that the expression of CovR-regulated genes, including speB, the nga-ifs-slo operon, and the sag operon, was upregulated in blood compared with that in THY in both strains (Figure 6C, Tables S12, and S13). CovR has also been reported to regulate sda1, which plays a key role in invasive disease progression in emm1 strains; however, sda1 does not exist in the emm89 reference strain MGAS27061 or the WT strain15. In human blood, the mutant strain showed downregulation of mga and emm expression and upregulation of genes encoding virulence factors, such as speC, scpA/B, endoS, fba, ska, and sfbX, compared with the WT strain (Table S14). Although Mga regulates the expression of surface and secreted molecules important for colonization and immune evasion31, no strong expression changes were observed in the Mga regulon, except in the emm gene. Notably, fhuB, fhuC, and fhuD were upregulated in the fhuB mutant in both human blood and THY (Tables S11 and S14). Taken together, the transcriptomic analysis revealed that both the WT and fhuB T218C strains upregulated speB, the nga-ifs-slo operon, and the sag operon in human blood, and that the fhuB T218C mutation caused the upregulation of fhuBCD expression.
To determine whether fhuB T218C mutation affects iron transport, we measured the intracellular free ferric ion concentration in each strain and observed no differences among the strains in either environment (Figure 7A). Therefore, the upregulation of fhuBCD may compensate for the impaired function mediated by SNP T218C.
Next, to investigate the effects of the SNP on bacterial survival in human blood, we performed a bactericidal assay using human blood. At 2 and 3 h after incubation, the fhuB T218C mutant strain exhibited a significantly lower survival rate than the WT strain (Figure 7B). To further determine which blood components account for the attenuated survival of the mutant, we compared bacterial survival rates in erythrocytes, polymorphonuclear cells, plasma, heat-inactivated plasma, and brain heart infusion broth (Figure 7C–7G). Notably, after incubation with erythrocytes, polymorphonuclear cells, or plasma, the fhuB T218C mutant strain exhibited a significantly lower survival index than the WT strain at 2 and 3 h after incubation, as observed in whole blood (Figure 7C–7E and 7G). However, there were no significant differences between the survival rates of the WT and mutant strains in heat-inactivated plasma, suggesting that the mutant strain is susceptible to complement (Figure 7F). Taken together, the polymorphism T218C in fhuB impaired the survival of a severe invasive strain in human blood through interactions with multiple blood components, including erythrocytes, polymorphonuclear cells, and plasma.
Discussion
The present study focused on emm89 S. pyogenes causing globally expanding invasive infections. We constructed a workflow for bacterial GWAS and explored the bacterial genetic factors related to severe or invasive infections to reveal the plausible mechanisms of pathogenesis. We independently performed GWASes for Japanese and global cohorts to investigate the associations of variants with strictly defined phenotypes in severe invasive S. pyogenes infections and with phenotypes in a broader context of invasive infections. Several GWASes have been conducted on S. pyogenes to date. Davies et al. extracted k-mers from the genomes of 1,944 strains of S. pyogenes with broad emm genotypes and assessed their associations with invasiveness, and Kachroo et al. performed GWAS on k-mers using 442 emm28 strains isolated from either invasive or non-invasive infections27,32. However, no studies have described the combined analysis of SNPs, genes, and k-mers exclusively using emm89 strains that share genetically close genomes and, thus, a large number of core genes. Our analyses identified a large number of candidate variants associated with invasiveness, including not only SNPs in core genes but also accessory genes and insertions spanning intergenic regions.
Spontaneous mutations in covR/S genes potentiate the transition from localized to invasive infection by M1T1 S. pyogenes17. We demonstrated that some SNDs in the covS locus were significantly related to the phenotypes in both the SNPs/indels- and k-mers-based GWASes. CovS forms a two-component system with CovR that regulates the transcription of operons encoding virulence factors when phosphorylated by the histidine sensor kinase CovS33. Sumby et al. suggested that systemic invasive infections are caused by the overexpression of virulence factors through the inactivation of CovS, as a result of the introduction of a depletive mutation in the covS locus after infection15,16. Several mutation sites causing such overexpression, such as the 7 nt insertion in covS of emm1 S. pyogenes and the insertion of adenine at the 877th nucleotide, have been reported15,16. Ikebe et al. demonstrated variations in the lengths of mutant CovS proteins by analyzing the gene sequences of 164 STSS strains in Japan18. In line with their study, we found that the shortening of CovS that potentially occurred owing to frameshifts caused by SNDs is significantly associated with invasive infections, potentially contributing to the upregulation of virulence factors and invasiveness.
In contrast, we observed several factors associated with invasiveness to be independent of covR/S mutations. Based on conformational predictions, we selected SNPs with non-synonymous substitutions that were likely to affect protein function. Our transcriptomic analysis suggested that the Japan-specific fhuB mutation contributes to the growth rate of S. pyogenes in human blood by adapting to the environment. In addition, emm89 clade 3 carries the identical promoter region pattern of the nga operon as emm1 strains, and the pattern conferred similarly high expression of nga and slo11,12,28. Our RNA-seq data demonstrated that a severe invasive Japanese strain without covS mutations increased the expression of speB, nga, slo, and genes in the sag operon in the blood. Although covR/S mutations downregulate the expression of SpeB and SLS, SpeB and SLS act as virulence factors, allowing S. pyogenes to invade host tissues34–37. These data suggest that severe invasive infections have multiple gene expression profiles in addition to covR/S mutation-induced profiles within a single lineage, emm89, and that the synergy between optimizing bacterial survival in human blood and upregulating multiple virulence factors contributes to the development of severe invasive infections.
In addition, we propose two possible roles of the FhuB V73A mutation in the pathogenesis of severe invasive infections. First, the mutation in FhuB could increase bacterial susceptibility to free radicals generated in the presence of ferric ions in the blood. Generally, ferric ions are essential for the survival of almost all living organisms. Catalyzed by ferric ions, the Fenton and Haber–Weiss reactions generate hydroxyl radicals in erythrocytes38. Previously, we provided evidence that the iron in erythrocytes partially inhibits pneumococcal growth via a free radical-based mechanism7. Although no significant difference was observed in the intracellular ferric ion concentration in the present study, the structural changes in FhuB may have caused an increased generation of free radicals in bacterial cells and the prevention of bacterial survival in human blood. Second, the FhuB mutation could affect bacterial transcriptional profiles and result in reduced fitness for survival in human blood. We also observed a lower survival rate of the mutant strain with polymorphonuclear cells (Figure 7D). Lower emm transcription in the mutant may lead to attenuated immune evasion. The M protein can bind C4-binding protein and factor H and inactivate the deposited C4b and C3b, leading to limited surface opsonization39. Furthermore, complements could have inhibited the proliferation of the mutant strain, as suggested by the activation of the membrane attack complex (Figure 7D and 7E). Although the mechanisms of the FhuB V73A mutation in the interaction with each blood component must be further elucidated, our results indicated that the 73rd residue of FhuB is a key factor for bacterial survival in human blood during systemic infection.
Recently, He et al. reported that antibodies against the ferrichrome-binding protein FtsB, also known as FhuD, decrease blood bacterial burden and skin abscess formation in murine models infected with emm1 S. pyogenes40. FhuD is a lipoprotein that has been studied as a potential vaccine candidate against Staphylococcus aureus41. The detailed function and effect of the FhuBCCDG complex in S. pyogenes remain unknown; however, at least in emm1 strains, the complex can be expressed and localized abundantly on the cell membrane. Hence, this complex may be a promising target for vaccines against S. pyogenes.
Our gene-based GWAS revealed no significant correlation with the distribution of genes encoding virulence factors. This finding minimizes the possibility that invasive infections are the result of the acquisition of virulence factors by non-invasive strains and indirectly supports the hypothesis that changes in gene expression profiles cause invasive infections. Although no vaccine has been commercialized against S. pyogenes, given the possibility that the pathogen has multiple gene expression patterns, even within a single lineage, it may be difficult to develop a universal vaccine with a single antigen, and a vaccine containing multiple antigens may be effective.
In the global cohort, 169 genes were related to the phenotype, including 25 phage- and 14 MGE-related genes. The genome of S. pyogenes is rich in prophages, phage-like chromosomal islands, and MGEs42. Prophages and MGEs sometimes function as vectors of virulence factors and antibiotic-resistance genes through interspecies and intraspecific transmission. Common superantigen-coding genes, such as speA, speC, speG, speH, speI, speJ, speK, and speL, and DNase-coding genes, such as spd1, are exogenous genes derived from prophages43. In addition, the genome of S. pyogenes contains broad MGEs that carry erm and mefA, which are related to macrolide resistance, and tetM, which is involved in resistance to tetracyclines44. In the present study, we observed no direct association between invasiveness and virulence or antibiotic-resistance genes located in the prophages and MGEs. One possible explanation is that the significantly related genes reflect not only the gain but also the loss of factors affecting the fitness cost. Given that we also identified hypothetical genes associated with this phenotype, we assumed that there is an unknown mechanism contributing to the pathogenesis of invasive infections.
In the present study, MLST analysis revealed that 522 of the 666 strains belonged to ST101. Following ST101, ST646 was the second most prevalent lineage in the Japanese cohort and was not detected in any other country. In addition, the difference between ST101 and ST646 was found only at the 295th nucleotide in murI, among the seven genes subjected to MLST typing. Our phylogenetic trees also suggested that ST101 and ST646 have a close phylogenetic relationship. Five previous studies have described ST646 in S. pyogenes, and all of them reported emm89 ST646 strains isolated in Japan45–49. Furthermore, all ST646 strains were isolated from the 4,249 emm89 S. pyogenes strains in the MLST database PubMLST50. Therefore, ST646 appears to be specific to emm89 strains and a unique lineage in Japan. Notably, Ubukata et al. also reported that ST646 strains began increasing after 2012, whereas the ST101 lineage was dominant until 200849. Taken together, we conclude that ST646 is possibly a relatively new and Japan-specific lineage of emm89 S. pyogenes.
Our study had three limitations. First, because we sequenced bacterial genomes using short-read sequencing, we could not detect large bacterial genome rearrangements by comparing complete genome sequences. Thus, we were unable to investigate the effects of long genomic structural dynamics, such as inversions, on pathogenesis. Second, the clinical information associated with the isolates was limited; therefore, our analyses did not reflect host information, such as age, sex, and underlying health conditions. Third, we only used bacterial genetic distances based on core genome sequences as covariates in the GWAS. A combined GWAS of the host and pathogen in S. pyogenes infection would highlight the relationship between host risk factors and bacterial genetic variants.
In this study, we revealed the genotype-phenotype associations found in not only the known factors represented by covS but also factors that are related to invasiveness, including fhuB. Moreover, we experimentally validated the contribution of the fhuB mutation to bacterial survival in human blood. This study demonstrates the potential of our genomic statistical approach for elucidating the pathogenesis of invasive infections. Further analyses of the invasiveness-related factors identified in this study could provide a platform for establishing novel treatments and preventive strategies against invasive infections.
Acknowledgements
We would like to thank the NGS core facility of the Genome Information Research Center at the Research Institute for Microbial Diseases of Osaka University for their support in the DNA sequencing and data analysis and the Bioinformatic Research Unit of Osaka University Graduate School of Dentistry for their support in the bioinformatics analysis. This study was partially performed on the National Institute of Genetics (NIG) supercomputer at the Research Organization of Information and Systems National Institute of Genetics. This study was partly completed using SQUID at the Cybermedia Center, Osaka University, under the “Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures (JHPCN)” in Japan (Project ID: EX22701 and jh230035). Masayuki Ono and Kotaro Higashi were recipients of the Iwadare Scholarship from the Iwadare Scholarship Foundation. We wish to express our gratitude to Mami Tateshita (Sapporo City Institute of Public Health) as well as the medical institutions that participated in the collection of clinical strains.
This study was partly supported by AMED (JP19fk0108044, JP22fk0108130, and JP22wm0325001), the Japan Society for the Promotion of Science KAKENHI (grant numbers 20KK0210, 22H03262, 22K19618, 22K19619, 23H03073, and 23K19687), the Takeda Science Foundation, Naito Foundation, and Joint Research Program of the Research Center for GLOBAL and LOCAL Infectious Diseases, Oita University (2022B05). This study was conducted as part of “The Nippon Foundation - Osaka University Project for Infectious Disease Prevention.” This study was supported by JST SPRING (grant number JPMJSP2138). The funders had no role in the study design, data collection or analysis, decision to publish, or preparation of the manuscript.
Declaration of interests
The authors declare no competing interests.
Data availability
Data for the 207 sequenced S. pyogenes genomes were deposited in the DDBJ sequence read archive, under BioProject PRJDB16457. The DRR run numbers are DRR511668–DRR511874.
Materials and methods
Clinical isolates in Japan
Clinical isolates were collected from public health institutions in Tokyo, Osaka, Yamaguchi, Fukushima, Kobe, Kyoto, Amagasaki, Sapporo, and Niigata, Japan. We defined the strains collected as STSS according to the Infectious Diseases Control Law in Japan. Non-invasive strains were defined based on diagnostic names, including asymptomatic, pharyngitis, tonsillitis, or non-invasive infections. Non-STSS strains with no diagnostic names were defined as non-invasive based on the isolation sites. Information on all the strains included in this study is presented in Table S3. The collected S. pyogenes strains were cultured at 37°C in an atmosphere containing 5% CO2, in Todd Hewitt broth supplemented with 0.2% yeast extract (THY; both from BD Biosciences, Franklin Lakes, NJ, USA) and stored in THY broth with 30% glycerol (Nacalai Tesque, Kyoto, Japan), at –80°C.
Genomic DNA sequencing of the clinical isolates
The S. pyogenes strains were cultured until the exponential growth phase (OD600=0.3–0.4). Bacterial cells were lysed with T10E1N100 buffer (10 mM Tris-HCl buffer, 100 mM sodium chloride, and 1 mM EDTA), 10 units/mL mutanolysin (Sigma-Aldrich, St. Louis, MO, USA), 10 mg/mL lysozyme (Fujifilm Wako Pure Chemical Co., Osaka, Japan), 0.5 mg/mL achromopeptidase (Fujifilm Wako Pure Chemical Co.), and 0.3 mg/mL RNase A (Promega, Madison, WI, USA). Next, genomic DNA was extracted from each lysate using a Maxwell® RSC instrument (Promega), according to the manufacturer’s instructions, and 250 bp paired-end libraries were then generated from the extracted DNA using a Nextera XT DNA Kit (Illumina, San Diego, CA, USA). Libraries were sequenced using a NovaSeq 6000 system (Illumina) at the Genome Information Research Center, Research Institute for Microbial Diseases, Osaka University, Osaka, Japan. On average, the number of reads was 5,433,301 (range 3,437,124–9,117,301).
Collection of published genome sequences
We previously sequenced the draft genomes of 161 emm89 clinical isolates collected in Japan between 2011 and 201914. We defined strains derived from STSS as “severe invasive,” and those obtained from pharyngitis, tonsillitis, and superficial skin lesions as “non-invasive” phenotypes.
To obtain public genome sequences of emm89 strains isolated from other countries, we downloaded draft genome sequences in FASTA format from the National Center for Biotechnology Information (NCBI) database, using Fasterq-dump v.2.9.6. The phenotype of each strain, whether invasive or non-invasive, was defined according to the definitions in the respective references that reported the strains25–27.
Genomic data processing and pan-genome analysis
All processes and analyses were performed using the National Institute of Genetics (NIG) supercomputer and SQUID at the Cybermedia Center of Osaka University (Osaka, Japan). We constructed a workflow for bacterial GWAS and other bioinformatic processes (Figure 5). All collected sequences were subjected to quality checks using Fastp v.0.20.151. For newly isolated strains in Japan, emm typing was performed using the emm-typing-tool v.0.0.1, and only sequences of strains determined as emm89 were used for the following analyses52. All emm89 sequence data were then subjected to de novo assembly using SKESA v.2.4.0, with default parameters53. Next, the MLST of each sequence was performed using MLST v.2.19.054,55. Clade typing was performed using BLAST v.2.13.0, with reference to the three nga promoter region sequences13. After the genes were annotated with Prokka v.1.14.5, the pan-genome of all sequences was calculated using Roary v.3.12.0, with the parameters “-e -mafft -r -qc -cd 99”56,57. Roary generated a core gene alignment and the distribution of all genes among the strains. To extract SNPs/indels from core genes, including single-nucleotide indels, snp-sites v.2.5.1 with the option “-v” was used. The output files were further processed using BCFtools v.1.9, with the parameter “norm -m –”, enabling analysis of multiple alleles in the GWAS. In parallel, k-mers were extracted using DBGWAS v.0.5.3, and the length of k-mers was set as 31 nt using the “-k 31” parameter of DBGWAS58.
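To illustrate the k-mer genotype that was extracted in parallel with SNPs/indels and genes, the following minimal Python sketch enumerates 31-nt k-mers from assembled contigs and records which strains carry each one. It is not DBGWAS code: DBGWAS builds a compacted de Bruijn graph internally, and the function names, the collapsing of reverse complements into canonical k-mers, and the input layout used here are assumptions made for illustration only.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def canonical_kmers(contigs, k=31):
    """Collect the canonical k-mers (the lexicographically smaller of a k-mer
    and its reverse complement) found in a list of contig sequences."""
    kmers = set()
    for contig in contigs:
        contig = contig.upper()
        for i in range(len(contig) - k + 1):
            kmer = contig[i:i + k]
            # Skip windows that contain N or other ambiguity codes.
            if set(kmer) <= set("ACGT"):
                kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def presence_absence(strain_contigs, k=31):
    """Map each canonical k-mer to the sorted list of strains that carry it.

    strain_contigs: dict of strain name -> list of contig sequences
    (hypothetical input layout).
    """
    table = {}
    for strain, contigs in strain_contigs.items():
        for kmer in canonical_kmers(contigs, k):
            table.setdefault(kmer, []).append(strain)
    return {kmer: sorted(strains) for kmer, strains in table.items()}
```

A contig of length n contributes at most n - 30 k-mers at k = 31, and a single SNP changes every k-mer window that overlaps it, which is why k-mer presence or absence can capture SNPs, indels, and gene content in one genotype matrix.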
Phylogenetic analysis
Phylogenetic relationships were calculated from the core gene alignment, using IQ-tree v.1.16.1259, based on maximum likelihood. The substitution model was automatically selected considering the Akaike’s and Bayesian information criteria by setting the “-m MFP” parameter of IQ-tree60. Phylogenetic trees were constructed using iTOL v.6.661. The similarity of clustering in the two phylogenetic trees was statistically examined using Pearson’s chi-square test with R v.4.0.362, followed by post-hoc analysis using residual analysis adjusted with the Holm’s method, if p<0.05.
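The post-hoc step described above can be sketched as follows. This is a minimal Python illustration, not the R code used in the study: it computes adjusted standardized residuals for a contingency table, converts them to two-sided p-values under a standard normal approximation, and applies Holm's step-down correction. The function names and the example table are hypothetical.

```python
import math


def adjusted_residuals(table):
    """Adjusted standardized residuals for an r x c contingency table."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    residuals = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out_row.append((observed - expected) / math.sqrt(variance))
        residuals.append(out_row)
    return residuals


def two_sided_normal_p(z):
    """Two-sided p-value of a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


# Hypothetical 2 x 2 table of cluster membership in two trees.
example = [[10, 20], [20, 10]]
cell_p = [two_sided_normal_p(z) for row in adjusted_residuals(example) for z in row]
cell_p_holm = holm_adjust(cell_p)
```

Cells whose Holm-adjusted p-value falls below 0.05 would be the ones reported as deviating from independence.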
GWAS
To investigate the associations between phenotypes and genotypes, including SNPs/indels and genes, we performed a GWAS using pyseer v.1.3.463. The VCF file of SNPs/indels or the gene distribution matrix was designated as the genotype. To remove biases derived from lineages, we added information on the genetic distances between all pairs of strains as covariates using mash v.2.364. Briefly, de novo assemblies were compressed through conversion into minimum hash values using the command “mash sketch -s 10000.” Subsequently, the commands “mash dist” and “square_mash” were utilized to generate a genetic distance matrix expressed with Jaccard coefficients64. The obtained matrix underwent eigenvalue decomposition, and the number of eigenvalues used for multidimensional scaling was visually determined using the plot of the relationships between the eigenvalues and contribution ratios. The number of eigenvalues and the distance matrix were then added as pyseer parameters. The pyseer calculation was iterated 1,000 times with randomized phenotypes, and the 5th percentile of the minimal p-values obtained across these iterations was set as the significance threshold. Using R and the package ggplot2, the results were visualized as a Manhattan plot for the SNPs/indels-based GWAS and a volcano plot for the genes-based GWAS65. Heatmaps of the strains possessing significant variants were generated using Excel (v.16.66.1; Microsoft, Redmond, WA, USA).
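The permutation-based significance threshold can be sketched as follows. This minimal Python example is an illustration under stated assumptions rather than the pipeline used in the study: a two-sided Fisher's exact test stands in for the pyseer association model (which also includes population-structure covariates), and the function names and binary genotype encoding are hypothetical. The logic mirrors the text: shuffle the phenotypes, record the smallest p-value across all variants, repeat, and take the 5th percentile of those minima.

```python
import random
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def min_p_value(genotypes, phenotypes):
    """Smallest p-value over all variants (each variant is a 0/1 list per strain)."""
    best = 1.0
    for variant in genotypes:
        a = sum(1 for g, p in zip(variant, phenotypes) if g and p)
        b = sum(1 for g, p in zip(variant, phenotypes) if g and not p)
        c = sum(1 for g, p in zip(variant, phenotypes) if not g and p)
        d = len(phenotypes) - a - b - c
        best = min(best, fisher_exact_p(a, b, c, d))
    return best


def permutation_threshold(genotypes, phenotypes, n_perm=1000, quantile=0.05, seed=0):
    """Empirical threshold: the given quantile of per-permutation minimal p-values."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    minima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype link
        minima.append(min_p_value(genotypes, shuffled))
    minima.sort()
    return minima[max(0, int(quantile * n_perm) - 1)]
```

A variant is then called significant when its p-value on the real phenotypes is below the returned threshold, which controls the family-wise error rate at roughly 5% without assuming independence between variants.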
The k-mers-based GWAS was carried out using DBGWAS58. K-mers were considered significant at a false discovery rate (q-value) of <0.05. DBGWAS calculated complexes, which are regions encompassing the k-mers significantly related to pathology, and de Bruijn graphs were generated based on these complexes. The sequences of the k-mers were output and mapped to a reference sequence using Geneious Prime v.2022.0.1 (Biomatters, Auckland, New Zealand) to identify the mutations indicated by the k-mers. For the reference strain, we adopted MGAS27061, which was isolated from an invasive case in the USA and whose complete chromosomal sequence has been used as the reference sequence of emm89 clade 3 66.
Protein structure prediction
Significant variants found in the GWAS that resulted in non-synonymous substitutions in proteins were identified by converting nucleotide sequences into amino acid sequences using EMBOSS Transeq v.6.6.0.067. To assess whether these mutations affected the protein function, protein structure prediction models were constructed using AlphaFold v.2.2.229. The calculations were performed five times for each model, by setting the option multimer_predictions_per_model=5. We predicted multimer models using the option --model_preset=multimer if a protein was reported or anticipated to form a multimer. For each monomer, we selected the model with the best predicted local distance difference test (pLDDT; an indicator of local structural accuracy) score68. For each multimer, the model confidence calculated by AlphaFold was expressed as a weighted combination of the interface-predicted TM and predicted TM scores (ipTM + pTM). pTM is a metric for overall topological accuracy, and ipTM is used to measure the structural accuracy of the protein-protein interface69. The transmembrane regions of the proteins were predicted using SOSUI (https://harrier.nagahama-i-bio.ac.jp/sosui/mobile/)70. The structures of the obtained model were visualized using PyMOL v.2.5 (Schrödinger, LLC., New York, NY, USA).
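The weighted combination of ipTM and pTM can be written out explicitly. The sketch below assumes the weighting that AlphaFold-Multimer uses to rank its predictions, 0.8 × ipTM + 0.2 × pTM; the text above states only that a weighted combination was used, so these weights are an assumption about this study. The model names and scores are hypothetical.

```python
def multimer_confidence(iptm, ptm, iptm_weight=0.8):
    """Weighted combination of ipTM and pTM.

    The default weight of 0.8 on ipTM (0.2 on pTM) follows the
    AlphaFold-Multimer ranking convention and is an assumption here.
    """
    return iptm_weight * iptm + (1.0 - iptm_weight) * ptm


def best_model(scores):
    """Return the name of the prediction with the highest combined confidence.

    scores: dict of model name -> (ipTM, pTM); the layout is hypothetical.
    """
    return max(scores, key=lambda name: multimer_confidence(*scores[name]))


# Hypothetical scores for three predictions of the same complex.
example_scores = {
    "model_1_pred_0": (0.78, 0.81),
    "model_2_pred_3": (0.84, 0.80),
    "model_4_pred_1": (0.80, 0.86),
}
top = best_model(example_scores)
```

Because ipTM carries most of the weight, a model with a well-predicted interface is preferred over one with a slightly better overall fold but a less reliable interface.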
Construction of the fhuB T218C mutant strains
We used TK02’, a derivative of the S. pyogenes TK02 strain that had been passaged several times, as the WT strain. The TK02 strain was originally isolated from a patient with severe invasive infection14. The whole genome of TK02’ was sequenced, and the generated fasta file is available in Document S1. A point mutation, fhuB T218C, was introduced using the temperature-sensitive shuttle vector, pSET4s, as reported previously71. Sanger sequencing confirmed the presence of the point mutation. In addition, we resequenced the draft genome to confirm that there were no differences apart from the point mutation, as described above. The bacterial strains, primers, and plasmids used in this study are listed in Tables S15 and S16. Escherichia coli strain DH5α (Takara Bio, Shiga, Japan) was used as a host for the plasmid derivatives. All E. coli strains were cultured in Luria Bertani broth, at 37°C, with agitation. For selection and maintenance of mutants, antibiotics were added to the media at the following concentrations: carbenicillin (Nacalai Tesque), 100 µg/mL for E. coli; and spectinomycin (Fujifilm Wako Pure Chemical Corporation), 100 µg/mL for E. coli and S. pyogenes.
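The correspondence between the nucleotide change fhuB T218C and the amino acid change FhuB V73A follows from simple codon arithmetic, sketched below in Python. Nucleotide 218 is the second base of codon 73, and a T-to-C change at the second position of any valine codon (GTN) yields an alanine codon (GCN). The codon GTT used in the example is a hypothetical placeholder, because the actual codon sequence is not given in this section; the function names are likewise illustrative.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order (codon assignments are the same in the
# bacterial translation table).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_position(nt_pos):
    """Convert a 1-based coding-sequence position into
    (1-based codon number, 1-based position within the codon)."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


def substitution_effect(codon, pos_in_codon, new_base):
    """Return (reference amino acid, mutant amino acid) for a single-base change."""
    mutated = codon[:pos_in_codon - 1] + new_base + codon[pos_in_codon:]
    return CODON_TABLE[codon], CODON_TABLE[mutated]


codon_number, pos_in_codon = codon_position(218)   # codon 73, second base
# "GTT" is a hypothetical valine codon; any GTN codon gives the same result.
ref_aa, mut_aa = substitution_effect("GTT", pos_in_codon, "C")
label = f"{ref_aa}{codon_number}{mut_aa}"
```

With these inputs, `label` evaluates to `V73A`, matching the amino acid substitution discussed for FhuB.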
Transcriptomic analysis
The fhuB WT and mutant strains were harvested in 30 mL of THY broth, of which 1 mL was dispensed into 10 mL of THY and the remainder was centrifuged and resuspended in 2 mL of heparinized human blood. Bacterial mixtures with THY or blood were dispensed into three aliquots and incubated at 37°C for 3 h. THY samples were centrifuged and resuspended in RNA Shield (Zymo Research, Irvine, CA, USA). For each blood sample, 2 volumes (1 mL) of RNA protection bacteria reagent (Qiagen, Hilden, Germany) were added. L5, included in the PureLink™ Total RNA Blood Purification Kit (Thermo Fisher Scientific, Waltham, MA, USA), was added to remove erythrocytes. The bacterial cell wall was mechanically lysed in Lysing Matrix B using a MagNA Lyser (Roche, Basel, Switzerland). After centrifugation, the total bacterial RNA was extracted using a Quick-RNA™ Miniprep Kit (Zymo Research), according to the manufacturer’s instructions. Full-length cDNA was generated using the SMART-Seq® HT Kit (Takara Bio), according to the manufacturer’s instructions. Paired-end libraries were generated using a Nextera XT DNA Kit and sequenced using a NovaSeq 6000 system (both from Illumina, San Diego, CA, USA). Sequenced data were preprocessed using Trimmomatic v.0.33 and FastQC v.0.12.1. The reads were mapped to the complete MGAS27061 genome (NCBI reference sequence: NZ_CP013840.1) using STAR v.2.7.0a. After a second quality check using FastQC, read counting was performed using featureCounts v.1.5.272. Differentially expressed genes were identified using iDEP v.0.96, and gene annotations from NCBI and Prokka were combined73. Plots were created using iDEP and the R package ggplot2.
Intracellular ferric ion assay
Human plasma was obtained through centrifugation of heparinized human blood, after 30 min of incubation at 37°C. The WT and fhuB mutant strains were harvested at the exponential phase, resuspended into 1 mL of THY or serum, and then incubated at 37°C for 3 h. Viable bacterial cells were counted as colony-forming units (CFUs) by plating the diluted samples onto THY agar plates. Intracellular ferric ions were measured using a QuantiChrom™ Iron Assay Kit (BioAssay Systems, Hayward, CA, USA), according to the manufacturer’s instructions. Briefly, 50 µL of standards or samples in 96-well plates were mixed with 200 µL QuantiChrom™ Working Reagent and incubated at 20–24°C for an hour. The optical density at the wavelength of 600 nm was measured using an Infinite® 200 Pro F Plex Instrument (TECAN, Männedorf, Switzerland). The assay was repeated three times, and the results of the respective experiments were combined. Statistical analyses were performed using the Mann–Whitney U test.
Bactericidal assay
The bactericidal assay was performed as described previously, with minor modifications34,74–76. Briefly, whole blood was collected from healthy adults. Human neutrophils and erythrocytes were prepared using PolymorphPrep™ (Serumwerk Bernburg, Bernburg, Germany), according to the manufacturer’s instructions. Heparinized human blood was centrifuged at 500 × g for 30 min, to isolate erythrocytes and polymorphonuclear cells, which were then suspended in Roswell Park Memorial Institute (RPMI)-1640 medium containing L-glutamine and Phenol Red (Fujifilm Wako Pure Chemical Corporation). Heat-inactivated plasma was prepared at 56°C for 30 min.
Subsequently, 195 µL of heparinized human whole blood, erythrocytes in RPMI-1640, polymorphonuclear leukocytes in RPMI-1640, plasma, heat-inactivated plasma, or brain heart infusion broth (BD Biosciences), and 5 µL of early exponential phase bacteria with 0.9–2.0×10^4 CFUs/well were mixed in 96-well plates and incubated at 37°C, in an atmosphere containing 5% CO2, for 1, 2, and 3 h. Viable bacterial cells were counted as CFUs by plating the diluted samples onto THY agar plates. The growth index was calculated as the number of CFUs at a specified time point divided by the number of CFUs in the initial inoculum. The assay was repeated three times, and the results of the respective experiments were combined. Statistical analyses were performed using the Mann–Whitney U test. Differences were considered statistically significant at p<0.05, using Prism v.7.0c (GraphPad, La Jolla, CA, USA).
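The growth index defined above is a simple ratio, shown here as a minimal Python sketch. The function names and the CFU counts are hypothetical and serve only to make the calculation concrete.

```python
def growth_index(cfu_at_time, cfu_inoculum):
    """CFUs at a given time point divided by CFUs in the initial inoculum."""
    if cfu_inoculum <= 0:
        raise ValueError("inoculum CFU count must be positive")
    return cfu_at_time / cfu_inoculum


def growth_curve(cfu_by_hour, cfu_inoculum):
    """Growth index at each sampled time point, ordered by time.

    cfu_by_hour: dict of incubation time in hours -> CFUs per well.
    """
    return {
        hour: growth_index(cfu, cfu_inoculum)
        for hour, cfu in sorted(cfu_by_hour.items())
    }


# Hypothetical counts for one well inoculated with 1.5 x 10^4 CFUs.
curve = growth_curve({1: 1.5e4, 2: 3.0e4, 3: 6.0e4}, cfu_inoculum=1.5e4)
```

A growth index above 1 indicates net proliferation relative to the inoculum, and a value below 1 indicates net killing.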
Ethical approval
Studies involving human participants were reviewed and approved by the Institutional Review Board of Osaka University Graduate School of Dentistry (approval nos. H26-E43 and H29-E16-2). The donors provided written informed consent to participate in the human blood bactericidal assay. For the S. pyogenes collection, as we retrospectively obtained clinical isolates of S. pyogenes, we utilized an opt-out consent procedure instead of obtaining written informed consent from the patients.
References
- 1. Global Disease Burden of Streptococcus pyogenes. Streptococcus pyogenes: Basic Biology to Clinical Manifestations. University of Oklahoma Health Sciences Center
- 2. Pathogenesis, epidemiology and control of Group A Streptococcus infection. Nat. Rev. Microbiol 21:431–447. https://doi.org/10.1038/s41579-023-00865-7
- 3. The global burden of group A streptococcal diseases. Lancet Infect. Dis 5:685–694. https://doi.org/10.1016/S1473-3099(05)70267-X
- 4. Practice guidelines for the diagnosis and management of skin and soft tissue infections: 2014 update by the Infectious Diseases Society of America. Clin. Infect. Dis 59:e10–e52. https://doi.org/10.1093/cid/ciu444
- 5. Severe Group A Streptococcal Infections. Streptococcus pyogenes: Basic Biology to Clinical Manifestations. University of Oklahoma Health Sciences Center
- 6. Disease manifestations and pathogenic mechanisms of Group A Streptococcus. Clin. Microbiol. Rev 27:264–301. https://doi.org/10.1128/CMR.00101-13
- 7. Streptococcus pneumoniae invades erythrocytes and utilizes them to evade human innate immunity. PLOS One 8. https://doi.org/10.1371/JOURNAL.PONE.0077282
- 8. Molecular Epidemiology, Ecology, and Evolution of Group A Streptococci. Microbiol. Spectr 6. https://doi.org/10.1128/microbiolspec.CPP3-0009-2018
- 9. Molecular characterization and antimicrobial resistance of group A streptococcus isolates in streptococcal toxic shock syndrome cases in Japan from 2013 to 2018. Int. J. Med. Microbiol 311. https://doi.org/10.1016/j.ijmm.2021.151496
- 10. Patterns of Antibiotic Nonsusceptibility Among Invasive Group A Streptococcus Infections-United States, 2006–2017. Clin. Infect. Dis. 73:1957–1964. https://doi.org/10.1093/cid/ciab575
- 11. Emergence of a new highly successful acapsular group a Streptococcus clade of genotype emm89 in the United Kingdom. mBio 6. https://doi.org/10.1128/mBio.00622-15
- 12. A molecular trigger for intercontinental epidemics of group A Streptococcus. J. Clin. Invest 125:3545–3559. https://doi.org/10.1172/JCI82478
- 13. Trading Capsule for Increased Cytotoxin Production: Contribution to Virulence of a Newly Emerged Clade of emm89 Streptococcus pyogenes. mBio 6:e01378–e01315. https://doi.org/10.1128/mbio.01378-15
- 14. Genetic Characterization of Streptococcus pyogenes emm89 Strains Isolated in Japan From 2011 to 2019. Infect. Microbes Dis 2:160–166. https://doi.org/10.1097/IM9.0000000000000038
- 15. DNase Sda1 provides selection pressure for a switch to invasive group A streptococcal infection. Nat. Med 13:981–985. https://doi.org/10.1038/nm1612
- 16. Genome-wide analysis of group a streptococci reveals a mutation that modulates global phenotype and disease specificity. PLoS Pathog 2. https://doi.org/10.1371/journal.ppat.0020005
- 17. Molecular insight into invasive group A streptococcal disease. Nat. Rev. Microbiol 9:724–736. https://doi.org/10.1038/nrmicro2648
- 18. Highly Frequent Mutations in Negative Regulators of Multiple Virulence Genes in Group A Streptococcal Toxic Shock Syndrome Isolates. PLOS Pathog 6. https://doi.org/10.1371/journal.ppat.1000832
- 19. Severe Invasive Group A Streptococcal Infections.
- 20. Streptococcal Toxic Shock Syndrome (STSS) (Streptococcus pyogenes) 2010 Case Definition
- 21. Changing prevalent T serotypes and emm genotypes of Streptococcus pyogenes isolates from streptococcal toxic shock-like syndrome (TSLS) patients in Japan. Epidemiol. Infect 130:569–572
- 22. Streptococcal emm types associated with T-agglutination types and the use of conserved emm gene restriction fragment patterns for subtyping group A streptococci. J. Med. Microbiol 47:893–898. https://doi.org/10.1099/00222615-47-10-893
- 23. A review of the correlation of T-agglutination patterns and M-protein typing and opacity factor production in the identification of group A streptococci. J. Med. Microbiol 38:311–315. https://doi.org/10.1099/00222615-38-5-311
- 24. M protein gene (emm) typing of Streptococcus pyogenes. Kansenshogaku Zasshi 76:238–245. https://doi.org/10.11150/kansenshogakuzasshi1970.76.238
- 25. Genome sequence analysis of emm89 Streptococcus pyogenes strains causing infections in Scotland, 2010–2016. J. Med. Microbiol. 66:1765–1773. https://doi.org/10.1099/jmm.0.000622
- 26. Population and Whole Genome Sequence Based Characterization of Invasive Group A Streptococci Recovered in the United States during 2015. mBio 8:e01422–17. https://doi.org/10.1128/mBio.01422-17
- 27. Atlas of group A streptococcal vaccine candidates compiled using large-scale comparative genomics. Nat. Genet 51:1035–1043. https://doi.org/10.1038/s41588-019-0417-8
- 28. The Emergence of Successful Streptococcus pyogenes Lineages through Convergent Pathways of Capsule Loss and Recombination Directing High Toxin Expression. mBio 10:1–20. https://doi.org/10.1128/mBio.02521-19
- 29. Highly accurate protein structure prediction with AlphaFold. Nature :583–589. https://doi.org/10.1038/s41586-021-03819-2
- 30. The group A Streptococcus accessory protein RocA: regulatory activity, interacting partners and influence on disease potential. Mol. Microbiol 113:190–207. https://doi.org/10.1111/mmi.14410
- 31. Defining the Mga regulon: Comparative transcriptome analysis reveals both direct and indirect regulation by Mga in the group A streptococcus. Mol. Microbiol 62:491–508. https://doi.org/10.1111/j.1365-2958.2006.05381.x
- 32. Integrated analysis of population genomics, transcriptomics and virulence provides novel insights into Streptococcus pyogenes pathogenesis. Nat. Genet 51:548–559. https://doi.org/10.1038/S41588-018-0343-1
- 33. Virulence-Related Transcriptional Regulators of Streptococcus pyogenes. Streptococcus pyogenes: Basic Biology to Clinical Manifestations. University of Oklahoma Health Sciences Center
- 34. Group A streptococcal cysteine protease degrades C3 (C3b) and contributes to evasion of innate immunity. J. Biol. Chem 283:6253–6260. https://doi.org/10.1074/jbc.M704821200
- 35. Streptolysin S contributes to group A streptococcal translocation across an epithelial barrier. J. Biol. Chem 286:2750–2761. https://doi.org/10.1074/jbc.M110.171504
- 36. Cysteine proteinase from Streptococcus pyogenes enables evasion of innate immunity via degradation of complement factors. J. Biol. Chem 288:15854–15864. https://doi.org/10.1074/jbc.M113.469106
- 37. Streptolysin S inhibits neutrophil recruitment during the early stages of Streptococcus pyogenes infection. Infect. Immun 77:5190–5201. https://doi.org/10.1128/IAI.00420-09
- 38. Free radical metabolism in human erythrocytes. Clin. Chim. Acta 390:1–11. https://doi.org/10.1016/j.cca.2007.12.025
- 39. Catch Me if You Can: Streptococcus pyogenes Complement Evasion Strategies. J. Innate. Immun 11:3–12. https://doi.org/10.1159/000492944
- 40. Immunization with the lipoprotein FtsB stimulates protective immunity against Streptococcus pyogenes infection in mice. Front. Microbiol 13. https://doi.org/10.3389/fmicb.2022.969490
- 41. Lipoproteins of Gram-Positive Bacteria: Key Players in the Immune Response and Virulence. Microbiol. Mol. Biol. Rev 80:891–903. https://doi.org/10.1128/MMBR.00028-16
- 42. Molecular epidemiology and genomics of group A Streptococcus. Infect. Genet. Evol 33:393–418. https://doi.org/10.1016/j.meegid.2014.10.011
- 43. The Bacteriophages of Streptococcus pyogenes. Microbiol. Spectr 7. https://doi.org/10.1128/microbiolspec.GPP3-0059-2018
- 44. Deciphering mobile genetic elements disseminating macrolide resistance in Streptococcus pyogenes over a 21 year period in Barcelona, Spain. J. Antimicrob. Chemother 76:1991–2003. https://doi.org/10.1093/jac/dkab130
- 45. Prevalence of macrolide resistance among group A streptococci isolated from pharyngotonsillitis. Microb. Drug. Resist. 20:431–435. https://doi.org/10.1089/mdr.2013.0213
- 46. Characterisation of clinically isolated Streptococcus pyogenes from balanoposthitis patients, with special emphasis on emm89 isolates. J. Med. Microbiol 66:511–516. https://doi.org/10.1099/jmm.0.000460
- 47. Molecular epidemiology, antimicrobial susceptibility, and characterization of fluoroquinolone non-susceptible Streptococcus pyogenes in Japan. J. Infect. Chemother 26:280–284. https://doi.org/10.1016/j.jiac.2019.10.004
- 48. Molecular epidemiology, antimicrobial susceptibility, and characterization of macrolide-resistant Streptococcus pyogenes in Japan. J. Infect. Chemother 22:727–732. https://doi.org/10.1016/j.jiac.2016.06.013
- 49. Changes in epidemiologic characteristics and antimicrobial resistance of Streptococcus pyogenes isolated over 10 years from Japanese children with pharyngotonsillitis. J. Med. Microbiol 69:443–450. https://doi.org/10.1099/jmm.0.001158
- 50. Open-access bacterial population genomics: BIGSdb software, the PubMLST.org website and their applications. Wellcome Open Res. 3. https://doi.org/10.12688/wellcomeopenres.14826.1
- 51. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:i884–i890. https://doi.org/10.1093/bioinformatics/bty560
- 52. Whole genome sequencing of group A Streptococcus: development and evaluation of an automated pipeline for emm gene typing. PeerJ 5. https://doi.org/10.7717/peerj.3226
- 53. SKESA: strategic k-mer extension for scrupulous assemblies. Genome Biol 19. https://doi.org/10.1186/s13059-018-1540-z
- 54. BIGSdb: scalable analysis of bacterial genome variation at the population level. BMC Bioinformatics 11. https://doi.org/10.1186/1471-2105-11-595
- 55. MLST revisited: the gene-by-gene approach to bacterial genomics. Nat. Rev. Microbiol 11:728–736. https://doi.org/10.1038/nrmicro3093
- 56. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068–2069. https://doi.org/10.1093/bioinformatics/btu153
- 57. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics 31:3691–3693. https://doi.org/10.1093/bioinformatics/btv421
- 58. A fast and agnostic method for bacterial genome-wide association studies: Bridging the gap between k-mers and genetic events. PLOS Genet 14. https://doi.org/10.1371/journal.pgen.1007758
- 59. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol 32:268–274. https://doi.org/10.1093/molbev/msu300
- 60. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14:587–589. https://doi.org/10.1038/nmeth.4285
- 61. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res 49:W293–W296. https://doi.org/10.1093/nar/gkab301
- 62. A language and environment for statistical computing
- 63. pyseer: A comprehensive tool for microbial pangenome-wide association studies. Bioinformatics 34:4310–4312. https://doi.org/10.1093/bioinformatics/bty539
- 64. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol 17. https://doi.org/10.1186/s13059-016-0997-x
- 65. ggplot2: Elegant Graphics for Data Analysis. New York: Springer-Verlag. https://doi.org/10.1007/978-3-319-24277-4
- 66. Transcriptome remodeling contributes to epidemic disease caused by the human pathogen Streptococcus pyogenes. mBio 7:1–14. https://doi.org/10.1128/mBio.00403-16
- 67. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet 16:276–277. https://doi.org/10.1016/s0168-9525(00)02024-2
- 68. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics 29:2722–2728. https://doi.org/10.1093/bioinformatics/btt473
- 69. Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants. Protein Sci 31. https://doi.org/10.1002/pro.4379
- 70. SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics 14:378–379. https://doi.org/10.1093/bioinformatics/14.4.378
- 71. Thermosensitive suicide vectors for gene replacement in Streptococcus suis. Plasmid 46:140–148. https://doi.org/10.1006/PLAS.2001.1532
- 72. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30:923–930. https://doi.org/10.1093/bioinformatics/btt656
- 73. iDEP: an integrated web application for differential expression and pathway analysis of RNA-Seq data. BMC Bioinformatics 19. https://doi.org/10.1186/s12859-018-2486-6
- 74. Differentiation of group A streptococci with a common R antigen into three serological types, with special reference to the bactericidal test. J. Exp. Med 106:525–544. https://doi.org/10.1084/jem.106.4.525
- 75. Identification of evolutionarily conserved virulence factor by selective pressure analysis of Streptococcus pneumoniae. Commun. Biol 2. https://doi.org/10.1038/s42003-019-0340-7
- 76. Pneumococcal BgaA Promotes Host Organ Bleeding and Coagulation in a Mouse Sepsis Model. Front. Cell. Infect. Microbiol 12. https://doi.org/10.3389/fcimb.2022.844000
Article and author information
Copyright
© 2025, Ono et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.