Research Article

Major genetic discontinuity and novel toxigenic species in Clostridioides difficile taxonomy

Murdoch University, Australia
University of Western Australia, Australia
University of Cape Town, South Africa
Universidad Andrés Bello, Chile
University of Warwick, United Kingdom
University of Oxford, United Kingdom
Universidad de Costa Rica, Costa Rica

Jun 11, 2021

Open access


Abstract

Clostridioides difficile infection (CDI) remains an urgent global One Health threat. The genetic heterogeneity seen across C. difficile underscores its wide ecological versatility and has driven the significant changes in CDI epidemiology seen in the last 20 years. We analysed an international collection of over 12,000 C. difficile genomes spanning the eight currently defined phylogenetic clades. Through whole-genome average nucleotide identity, and pangenomic and Bayesian analyses, we identified major taxonomic incoherence with clear species boundaries for each of the recently described cryptic clades CI-III. The emergence of these three novel genomospecies predates clades C1-5 by millions of years, rewriting the global population structure of C. difficile specifically and the taxonomy of the Peptostreptococcaceae in general. These genomospecies all show unique and highly divergent toxin gene architecture, advancing our understanding of the evolution of C. difficile and close relatives. Beyond the taxonomic ramifications, this work may impact the diagnosis of CDI.

Data availability

All data generated or analysed during this study are included in the manuscript and in the Supplementary Data, which are hosted at Figshare (http://doi.org/10.6084/m9.figshare.12471461). Data files on Figshare include:

[1] Full MLST data for all 12,000+ C. difficile genomes (Fig 1).
[2] Whole-genome ANI analyses (Table 1, Fig 3, Fig 5).
[3] Tree files for phylogenetic analyses (Fig 2, Fig 4).
[4] Pangenome data (Fig 6).
[5] Pan-GWAS data (Table 2).
[6] Comparative genomic analysis of virulence gene architecture (Fig 7).

This study also used previously published data. We retrieved the entire collection of C. difficile genomes (NCBI taxonomy ID 1496) held at the NCBI Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra/). The raw dataset (as of 1 January 2020) comprised 12,621 genomes, which derive from hundreds, possibly thousands, of publications. The individual accession numbers for all genomes analysed in this study are provided in the Supplementary Data at http://doi.org/10.6084/m9.figshare.12471461.

Article and author information

Author details

Daniel R Knight

Murdoch University, Murdoch, Australia

For correspondence
daniel.knight@murdoch.edu.au

Competing interests
No competing interests declared.

ORCID iD: 0000-0002-9480-4733
Korakrit Imwattana

School of Biomedical Sciences, University of Western Australia, Nedlands, Australia

Competing interests
No competing interests declared.

ORCID iD: 0000-0002-2538-9775
Brian Kullin

Department of Pathology, University of Cape Town, Cape Town, South Africa

Competing interests
No competing interests declared.

ORCID iD: 0000-0001-5460-1977
Enzo Guerrero-Araya

Microbiota-Host Interactions and Clostridia Research Group, Universidad Andrés Bello, Santiago, Chile

Competing interests
No competing interests declared.
Daniel Paredes-Sabja

Microbiota-Host Interactions and Clostridia Research Group, Universidad Andrés Bello, Santiago, Chile

Competing interests
No competing interests declared.
Xavier Didelot

University of Warwick, Coventry, United Kingdom

Competing interests
No competing interests declared.

ORCID iD: 0000-0003-1885-500X
Kate E Dingle

Nuffield Department of Clinical Medicine, University of Oxford, Oxford, United Kingdom

Competing interests
No competing interests declared.
David W Eyre

Big Data Institute, University of Oxford, Oxford, United Kingdom

Competing interests
David W Eyre declares lecture fees from Gilead, outside the submitted work.

ORCID iD: 0000-0001-5095-6367
César Rodríguez

Facultad de Microbiología & Centro de Investigación en Enfermedades Tropicales (CIET), Universidad de Costa Rica, San José, Costa Rica

Competing interests
No competing interests declared.

ORCID iD: 0000-0001-5599-0652
Thomas V Riley

School of Biomedical Sciences, University of Western Australia, Nedlands, Australia

For correspondence
thomas.riley@uwa.edu.au

Competing interests
No competing interests declared.

ORCID iD: 0000-0002-1351-3740

Funding

Raine Medical Research Foundation

Daniel R Knight

National Health and Medical Research Council

Daniel R Knight

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.