Core genes can have higher recombination rates than accessory genes within global microbial populations
Abstract
Recombination is essential to microbial evolution, and is involved in the spread of antibiotic resistance, antigenic variation, and adaptation to the host niche. However, assessing the impact of homologous recombination on accessory genes, which are present in only a subset of strains of a given species, remains challenging due to their complex phylogenetic relationships. Quantifying homologous recombination for accessory genes (which are important for niche-specific adaptations) in comparison to core genes (which are present in all strains and have essential functions) is critical to understanding how selection acts on variation to shape the species diversity and genome structures of bacteria. Here, we apply a computationally efficient, non-phylogenetic approach to measure homologous recombination rates in the core and accessory genome using >100,000 whole-genome sequences from Streptococcus pneumoniae and several additional species. By analyzing diverse sets of sequence clusters, we show that core genes often have higher recombination rates than accessory genes, and that for some bacterial species the effect sizes of these differences are pronounced. In a subset of species, we find that gene frequency and homologous recombination rate are positively correlated. For S. pneumoniae and several additional species, we find that while the recombination rate is higher for the core genome, the mutational divergence is lower, indicating that divergence-based homologous recombination barriers could contribute to differences in recombination rates between the core and accessory genome. Homologous recombination may therefore play a key role in increasing the efficiency of selection in the most conserved parts of the genome.
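The core/accessory distinction used above can be illustrated with a minimal sketch. It assumes a hypothetical gene presence table (gene name mapped to the set of strains carrying it) and applies the strict definitions given in the abstract: core genes are present in all strains, accessory genes in only a subset. The function names and input layout are illustrative assumptions, not the authors' pipeline, which may apply a different frequency cutoff.

```python
from typing import Dict, Set, Tuple


def gene_frequencies(presence: Dict[str, Set[str]], n_strains: int) -> Dict[str, float]:
    """Return the fraction of strains that carry each gene."""
    return {gene: len(strains) / n_strains for gene, strains in presence.items()}


def split_core_accessory(
    presence: Dict[str, Set[str]], n_strains: int
) -> Tuple[Set[str], Set[str]]:
    """Split genes into core (in every strain) and accessory (in a subset).

    Assumption: strict definition from the abstract; a relaxed cutoff
    would need a threshold parameter that the text does not specify.
    """
    core: Set[str] = set()
    accessory: Set[str] = set()
    for gene, strains in presence.items():
        if len(strains) == n_strains:
            core.add(gene)
        else:
            accessory.add(gene)
    return core, accessory
```

For example, with three strains and a table in which a hypothetical `geneA` is carried by all three and `geneB` by one, `split_core_accessory` places `geneA` in the core set and `geneB` in the accessory set, and `gene_frequencies` gives the per-gene frequency that could then be compared against a recombination rate estimate.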
Data availability
Lists of SRA accession numbers corresponding to the raw reads used to build the multi-sequence alignments analyzed in this manuscript are included as Figure 2 - source data 1 and Figure 3 - source data 1. All SRA files, reference genomes, and complete genome assemblies are available through NCBI. All sequence collections used are listed in Supplementary File 5. For the PubMLST sequence collections, PubMLST was used to identify whole genome sequences (by filtering for strains in the 'Genome Collection' of each species where the sequence length is at least that of the reference genome), then the raw reads were downloaded from NCBI using their SRA numbers. Accession numbers for reference genomes used for each microbial species are also listed in Supplementary File 5.
All original code has been deposited at GitHub and is publicly available. Links are given below:
- https://github.com/kussell-lab/mcorr
- https://github.com/kussell-lab/mcorr-clustering
- https://github.com/kussell-lab/ReferenceAlignmentGenerator
- https://github.com/kussell-lab/PangenomeAlignmentGenerator
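The length filter described for the PubMLST collections (keep strains whose sequence length is at least that of the reference genome) might be sketched as follows. The function name and the input mapping of strain identifier to total sequence length are hypothetical; this does not call the PubMLST or NCBI services and is not taken from the repositories listed above.

```python
from typing import Dict, List


def filter_whole_genomes(seq_lengths: Dict[str, int], reference_length: int) -> List[str]:
    """Return strain IDs whose sequence length is at least the reference length.

    Assumption: seq_lengths maps a strain identifier to its total
    sequence length in base pairs, as exported from a sequence database.
    """
    return sorted(
        strain for strain, length in seq_lengths.items() if length >= reference_length
    )
```

Strains shorter than the reference are treated here as partial records and dropped; the retained identifiers would then be used to look up the corresponding SRA numbers.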
Article and author information
Author details
Funding
National Institutes of Health (R01-GM097356)
- Edo Kussell
Simons Foundation (Simons Foundation Awardee of the Life Sciences Research Foundation)
- Asher Preska Steinberg
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2022, Preska Steinberg et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.