Core genes can have higher recombination rates than accessory genes within global microbial populations
Abstract
Recombination is essential to microbial evolution, and is involved in the spread of antibiotic resistance, antigenic variation, and adaptation to the host niche. However, assessing the impact of homologous recombination on accessory genes, which are present in only a subset of strains of a given species, remains challenging due to their complex phylogenetic relationships. Quantifying homologous recombination for accessory genes (which are important for niche-specific adaptations) in comparison to core genes (which are present in all strains and have essential functions) is critical to understanding how selection acts on variation to shape species diversity and genome structures of bacteria. Here, we apply a computationally efficient, non-phylogenetic approach to measure homologous recombination rates in the core and accessory genome using >100,000 whole genome sequences from Streptococcus pneumoniae and several additional species. By analyzing diverse sets of sequence clusters, we show that core genes often have higher recombination rates than accessory genes, and for some bacterial species the associated effect sizes for these differences are pronounced. In a subset of species, we find that gene frequency and homologous recombination rate are positively correlated. For S. pneumoniae and several additional species, we find that while the recombination rate is higher for the core genome, the mutational divergence is lower, indicating that divergence-based homologous recombination barriers could contribute to differences in recombination rates between the core and accessory genome. Homologous recombination may therefore play a key role in increasing the efficiency of selection in the most conserved parts of the genome.
Data availability
Lists of SRA accession numbers corresponding to the raw reads used to build the multi-sequence alignments analyzed in this manuscript are included as Figure 2 - source data 1 and Figure 3 - source data 1. All SRA files, reference genomes, and complete genome assemblies are available through NCBI. All sequence collections used are listed in Supplementary File 5. For the PubMLST sequence collections, PubMLST was used to identify whole genome sequences (by filtering for strains in the 'Genome Collection' of each species where the sequence length is at least that of the reference genome), then the raw reads were downloaded from NCBI using their SRA numbers. Accession numbers for reference genomes used for each microbial species are also listed in Supplementary File 5.
All original code has been deposited at GitHub and is publicly available. Links are given below:
- https://github.com/kussell-lab/mcorr
- https://github.com/kussell-lab/mcorr-clustering
- https://github.com/kussell-lab/ReferenceAlignmentGenerator
- https://github.com/kussell-lab/PangenomeAlignmentGenerator
Article and author information
Funding
National Institutes of Health (R01-GM097356)
- Edo Kussell
Simons Foundation (Simons Foundation Awardee of the Life Sciences Research Foundation)
- Asher Preska Steinberg
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2022, Preska Steinberg et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.