Major genetic discontinuity and novel toxigenic species in Clostridioides difficile taxonomy
Abstract
Clostridioides difficile infection (CDI) remains an urgent global One Health threat. The genetic heterogeneity seen across C. difficile underscores its wide ecological versatility and has driven the significant changes in CDI epidemiology seen in the last 20 years. We analysed an international collection of over 12,000 C. difficile genomes spanning the eight currently defined phylogenetic clades. Through whole-genome average nucleotide identity, and pangenomic and Bayesian analyses, we identified major taxonomic incoherence with clear species boundaries for each of the recently described cryptic clades CI-III. The emergence of these three novel genomospecies predates clades C1-5 by millions of years, rewriting the global population structure of C. difficile specifically and taxonomy of the Peptostreptococcaceae in general. These genomospecies all show unique and highly divergent toxin gene architecture, advancing our understanding of the evolution of C. difficile and close relatives. Beyond the taxonomic ramifications, this work may impact the diagnosis of CDI.
Data availability
All data generated or analysed during this study are included in the manuscript and Supplementary Data, which are hosted at Figshare: http://doi.org/10.6084/m9.figshare.12471461. Data files on Figshare include:
[1] Full MLST data for all 12,000+ C. difficile genomes (Fig 1).
[2] Whole-genome ANI analyses (Table 1, Fig 3, Fig 5).
[3] Tree files for phylogenetic analyses (Fig 2, Fig 4).
[4] Pangenome data (Fig 6).
[5] Pan-GWAS data (Table 2).
[6] Comparative genomic analysis of virulence gene architecture (Fig 7).
Previously published datasets: We retrieved the entire collection of C. difficile genomes (taxonomy ID 1496) held at the NCBI Sequence Read Archive [https://www.ncbi.nlm.nih.gov/sra/]. The raw dataset (as of 1st January 2020) comprised 12,621 genomes. These genomes originate from hundreds, possibly thousands, of publications. The individual accession numbers for all genomes analysed in this study are provided in the Supplementary Data at http://doi.org/10.6084/m9.figshare.12471461.
Article and author information
Funding
Raine Medical Research Foundation
- Daniel R Knight
National Health and Medical Research Council
- Daniel R Knight
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2021, Knight et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.