Core genes can have higher recombination rates than accessory genes within global microbial populations

  1. Asher Preska Steinberg
  2. Mingzhi Lin
  3. Edo Kussell (corresponding author)
  1. New York University, United States


Recombination is essential to microbial evolution and is involved in the spread of antibiotic resistance, antigenic variation, and adaptation to the host niche. However, assessing the impact of homologous recombination on accessory genes, which are present in only a subset of strains of a given species, remains challenging because of their complex phylogenetic relationships. Quantifying homologous recombination for accessory genes (which are important for niche-specific adaptations) in comparison to core genes (which are present in all strains and have essential functions) is critical to understanding how selection acts on variation to shape the species diversity and genome structures of bacteria. Here, we apply a computationally efficient, non-phylogenetic approach to measure homologous recombination rates in the core and accessory genome using >100,000 whole-genome sequences from Streptococcus pneumoniae and several additional species. By analyzing diverse sets of sequence clusters, we show that core genes often have higher recombination rates than accessory genes, and that for some bacterial species the effect sizes associated with these differences are pronounced. In a subset of species, we find that gene frequency and homologous recombination rate are positively correlated. For S. pneumoniae and several additional species, we find that while the recombination rate is higher for the core genome, the mutational divergence is lower, indicating that divergence-based barriers to homologous recombination could contribute to the differences in recombination rates between the core and accessory genome. Homologous recombination may therefore play a key role in increasing the efficiency of selection in the most conserved parts of the genome.
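
The core/accessory distinction and the notion of gene frequency used above can be illustrated with a minimal sketch. It assumes a toy presence/absence table and treats a gene as core only when it occurs in every strain, following the definition in the abstract; the function names, the threshold parameter, and the toy data are all hypothetical and are not taken from the authors' pipeline, which may use a different cutoff.

```python
from typing import Dict, List, Set, Tuple


def gene_frequencies(presence: Dict[str, Set[str]], strains: List[str]) -> Dict[str, float]:
    """Return, for each gene, the fraction of strains that carry it."""
    n = len(strains)
    if n == 0:
        raise ValueError("need at least one strain")
    strain_set = set(strains)
    # Only count carriers that belong to the strain collection being analyzed.
    return {gene: len(carriers & strain_set) / n for gene, carriers in presence.items()}


def split_core_accessory(
    freqs: Dict[str, float], core_threshold: float = 1.0
) -> Tuple[List[str], List[str]]:
    """Genes at or above core_threshold are core; all others are accessory.

    core_threshold=1.0 (present in all strains) is an assumption that mirrors
    the abstract's wording, not a value confirmed by the paper's methods.
    """
    core = sorted(g for g, f in freqs.items() if f >= core_threshold)
    accessory = sorted(g for g, f in freqs.items() if f < core_threshold)
    return core, accessory


# Hypothetical toy input: gene name -> set of strains carrying that gene.
strains = ["s1", "s2", "s3", "s4"]
presence = {
    "geneA": {"s1", "s2", "s3", "s4"},
    "geneB": {"s1", "s3"},
    "geneC": {"s2"},
}

freqs = gene_frequencies(presence, strains)
core, accessory = split_core_accessory(freqs)
print(freqs)
print("core:", core, "accessory:", accessory)
```

In this toy example only geneA is carried by every strain, so it is the sole core gene, while geneB and geneC are accessory genes with frequencies of 0.5 and 0.25. A per-gene frequency of this kind is the quantity that would be compared against a recombination rate estimate when asking whether the two are correlated.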

Data availability

Lists of SRA accession numbers corresponding to the raw reads used to build the multi-sequence alignments analyzed in this manuscript are included as Figure 2 - source data 1 and Figure 3 - source data 1. All SRA files, reference genomes, and complete genome assemblies are available through NCBI. All sequence collections used are listed in Supplementary File 5. For the PubMLST sequence collections, PubMLST was used to identify whole genome sequences (by filtering for strains in the 'Genome Collection' of each species where the sequence length is at least that of the reference genome); the raw reads were then downloaded from NCBI using their SRA numbers. Accession numbers for the reference genomes used for each microbial species are also listed in Supplementary File 5. All original code has been deposited at GitHub and is publicly available.

Article and author information

Author details

  1. Asher Preska Steinberg

    Department of Biology, New York University, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
  2. Mingzhi Lin

    Department of Biology, New York University, New York, United States
  3. Edo Kussell

    Department of Biology, New York University, New York, United States
    For correspondence
    ORCID: 0000-0003-0590-4036


Funding

National Institutes of Health (R01-GM097356)

  • Edo Kussell

Simons Foundation (Simons Foundation Awardee of the Life Sciences Research Foundation)

  • Asher Preska Steinberg

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Paul B Rainey, Max Planck Institute for Evolutionary Biology, Germany

Publication history

  1. Received: March 10, 2022
  2. Accepted: June 30, 2022
  3. Accepted Manuscript published: July 8, 2022 (version 1)


© 2022, Preska Steinberg et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.


Cite this article

Asher Preska Steinberg, Mingzhi Lin, Edo Kussell (2022) Core genes can have higher recombination rates than accessory genes within global microbial populations. eLife 11:e78533.
