Regional sequencing collaboration reveals persistence of the T12 Vibrio cholerae O1 lineage in West Africa

  1. Eme Ekeng
  2. Serges Tchatchouang
  3. Blaise Akenji
  4. Bassira Boubacar Issaka
  5. Ifeoluwa Akintayo
  6. Christopher Chukwu
  7. Ibrahim Dan Dano
  8. Sylvie Melingui
  9. Sani Ousmane
  10. Michael Oladotun Popoola
  11. Ariane Nzouankeu
  12. Yap Boum
  13. Francisco Luquero
  14. Anthony Ahumibe
  15. Dhamari Naidoo
  16. Andrew Azman
  17. Justin Lessler
  18. Shirlee Wohl  Is a corresponding author
  1. Nigeria Centre for Disease Control, Nigeria
  2. Centre Pasteur du Cameroun, Cameroon
  3. National Public Health Laboratory, Cameroon
  4. Centre de Recherche Médicale et Sanitaire, Niger
  5. Department of Pharmaceutical Microbiology, Faculty of Pharmacy, University of Ibadan, Nigeria
  6. Epicentre, France
  7. World Health Organization Nigeria, Nigeria
  8. Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, United States

Abstract

Background:

Despite recent insights into cholera transmission patterns in Africa, regional and local dynamics in West Africa—where cholera outbreaks occur every few years—are still poorly understood. Coordinated genomic surveillance of Vibrio cholerae in the areas most affected may reveal transmission patterns important for cholera control.

Methods:

During a regional sequencing workshop in Nigeria, we sequenced 46 recent V. cholerae isolates from Cameroon, Niger, and Nigeria (37 from 2018 to 2019) to better understand the relationship between the V. cholerae bacterium circulating in these three countries.

Results:

From these isolates, we generated 44 whole Vibrio cholerae O1 sequences and analyzed them in the context of 1280 published V. cholerae O1 genomes. All sequences belonged to the T12 V. cholerae seventh pandemic lineage.

Conclusions:

Phylogenetic analysis of newly generated and previously published V. cholerae genomes suggested that the T12 lineage has been continuously transmitted within West Africa since it was first observed in the region in 2009, despite lack of reported cholera in the intervening years. The results from this regional sequencing effort provide a model for future regionally coordinated surveillance efforts.

Funding:

Funding for this project was provided by Bill and Melinda Gates Foundation OPP1195157.

Introduction

Molecular characterization of seventh pandemic Vibrio cholerae has led to new insights about global cholera transmission and has highlighted the important role of transmission within and between Asia and Africa (Weill et al., 2017). Combining genomic and epidemiological data may allow us to better understand the dynamics of ongoing outbreaks, and advances in sequencing technology have recently made whole genome sequencing more feasible even in low-resource settings.

Although recent studies have used whole genome sequences of V. cholerae O1 to better understand the global movement patterns of seventh pandemic cholera in sub-Saharan Africa and elsewhere (Domman et al., 2017; Mutreja et al., 2011; Weill et al., 2019; Wick et al., 2017), regional and local dynamics in West Africa—which reports cholera regularly—are still poorly understood. Epidemiological data suggest outbreaks across this region may be connected (UNICEF, 2013a; UNICEF, 2013b; UNICEF, 2013c), but the nature of recurring outbreaks in the region has yet to be understood. It is unclear if cases go unreported in years between outbreaks or if each represents a distinct pandemic V. cholerae introduction from outside the region. Discerning between these two transmission scenarios is important for the development of locally adapted and effective cholera control and prevention strategies.

One way to distinguish between transmission scenarios is to compare the specific V. cholerae lineages in circulation. Previous studies initially identified three waves of seventh pandemic V. cholerae (Mutreja et al., 2011), which were later broken down into 13 plausible introduction events from Asia into Africa (Weill et al., 2019; Wick et al., 2017), termed T1–T13. These studies used sequencing of historical isolates to identify the predominant lineages circulating in different regions of the African continent since 1970, providing important information about global and continent-wide circulation patterns. In this manuscript, we aim to use additional sequence data from recent cholera outbreaks to improve our understanding of more detailed V. cholerae transmission patterns within West Africa, and to understand if regionally coordinated surveillance and response may improve containment outcomes.

Approach: a regional sequencing collaboration

In October 2019, researchers from Cameroon, Niger, Nigeria, and the United States came together to discuss how whole genome sequencing could contribute to understanding of cholera spread in these countries. Participants brought V. cholerae isolates (collected from medically attended cholera cases in cholera treatment centers or health centers in their countries) to the one-week workshop in Abuja, Nigeria, and were trained in pathogen sequencing methods using the portable Oxford Nanopore MinION platform. The workshop cohort was predominantly laboratory scientists from regional and national health laboratories who specialize in V. cholerae and microbiology techniques; individuals who were well-poised to master the laboratory demands of whole genome sequencing and had a deep understanding of cholera in the region. Bioinformatic analysis was also a central component of the training, and participants focused on analyzing data generated during the workshop on specialized laptop computers. Sequencing and analysis of recent isolates during the training allowed for joint, regionally focused interpretation of the data during and after the workshop.

In addition to the practical, hands-on sequencing training, a key component of this workshop was bringing together personal experience and detailed epidemiological data from the three countries, which we used to select the specific V. cholerae isolates to be sequenced during the workshop — isolates that would provide the best understanding of the 2018–2019 outbreak. During this outbreak, Cameroon, Niger, and Nigeria all reported large numbers of cholera cases (2066, 3083, and 32,752 cases, respectively) (Nigeria Centre for Disease Control, 2019; Regional Cholera Platform in West and Central Africa, 2012) despite several years of low reported cholera incidence in the region (Figure 1A). In all countries, reported cases were defined according to the World Health Organization suspected case definition (Global Task Force on Cholera Control Surveillance Working Group, 2017): individuals with acute watery diarrhoea and severe dehydration in areas where a cholera outbreak has not been declared, and individuals with acute watery diarrhoea in areas where an outbreak has been declared.

Cholera cases and sequenced isolates.

(A) Weekly suspected cholera cases for Cameroon (orange), Niger (blue), and Nigeria (green) from 2010 through 2019 (‘Regional Cholera Platform in West and Central Africa,’ n.d.) (Figure 1—source data 1). Suspected cases are defined as recommended by the World Health Organization (Global Task Force on Cholera Control Surveillance Working Group, 2017). Red points: weeks with no more than five suspected cases reported across all three countries. Arrows: collection dates of isolates sequenced. Collection dates provided as year only (n=2) are plotted on January 1 of their given year. (B) Map of Cameroon, Niger, and Nigeria. Colors are as in (A). White points: location of sequenced isolates collected in 2018 and 2019 (Figure 1—source data 2). Black crosses: location of sequenced isolates collected prior to 2018. One isolate with unknown sub-country location is not shown.

Figure 1—source data 1

Case counts by epidemiological week in Niger, Nigeria, and Cameroon, 2010 to 2019.

https://cdn.elifesciences.org/articles/65159/elife-65159-fig1-data1-v1.zip
Figure 1—source data 2

Sample metadata for isolates sequenced in this study.

https://cdn.elifesciences.org/articles/65159/elife-65159-fig1-data2-v1.zip

To better understand the history of cholera transmission in the region, we spent the beginning of the workshop discussing previous sequencing efforts of historical V. cholerae isolates from the region and existing gaps in our understanding of transmission. Previously published sequences from the three countries include 45 sequences from Cameroon from 1970 to 2011, 15 sequences from Niger from 1970 to 2010, and 22 sequences from Nigeria from 1970 to 2014 (Weill et al., 2019; Wick et al., 2017). These data show that the T1 lineage was the predominant lineage circulating in Cameroon, Nigeria, and Nigeria in 1970, followed by the T7 and T9 lineages in the 1990s and early 2000s (Weill et al., 2019; Wick et al., 2017). Since 2009, only T12 has been observed in these countries, though at the start of the workshop prior sequences were available only through 2014. In any given year, the circulating V. cholerae lineage was the same in Cameroon, Niger, and Nigeria (Weill et al., 2019; Wick et al., 2017), providing additional evidence that outbreaks in these countries are connected. As five years had elapsed since the last published V. cholerae O1 genome in the region, we aimed to generate a contemporary snapshot of recent V. cholerae O1 outbreaks to determine if recent cases were due to a new introduction of the pathogen, and to gain new insights into regional cholera transmission dynamics and spread of antibiotic resistance. Simultaneously, we discussed plans for building a regional cholera genomics network to help inform cholera preparedness and outbreak response in the future.

Materials and methods

Sample collection and cholera confirmation

Request a detailed protocol

V. cholerae isolates were collected from clinically confirmed cholera cases in Cameroon, Niger, and Nigeria between 2011 and 2019 (Figure 1—source data 2). In Cameroon, isolates came from cholera treatment centers from throughout the country. In Niger, isolates came from health centers and cholera treatment centers during outbreaks in the three southern provinces (Dosso, Maradi and Zinder), as well as from the province of Tillabery. And in Nigeria, isolates came from cholera treatment centers in the Northeastern and Northwestern regions of the country. In all three countries, suspected cholera stools were cultured on TCBS medium. Phenotypic identification of V. cholerae colonies was based on morphology, motility, and biochemical characteristics (positive oxidase, saccharose, indole, and gelatinase). V. cholerae isolates were confirmed with agglutination tests with anti-O1 or anti-O139 serum (WHO antisera and Denka Seiken Agglutinating Sera Vibrio cholerae Antisera Set). Two isolates from Nigeria that did not show agglutination were identified as non-O1 Vibrio cholerae through Gram staining (Gram negative), oxidase testing (positive), and the API (Analytical Profile Index) 20E kit at the National Reference Laboratory in Nigeria.

DNA extraction and quantification

Request a detailed protocol

DNA extraction was performed at Centre Pasteur du Cameroun (Yaounde, Cameroun), Centre de Recherche Médicale et Sanitaire (Niamey, Niger), and at National Reference Laboratory (NRL; Nigeria Centre for Disease Control, Abuja, Nigeria). Confirmed V. cholerae isolates were cultured on selective TCBS, MH and/or HCK agar plates and incubated at 37°C overnight. In all countries, V. cholerae DNA was extracted following the standard protocol from the Qiagen QIAamp DNA Mini Kit with a final elution volume of 200 µL. Extracted DNA from V. cholerae isolates were stored in 1.5 mL Eppendorf tubes at 4°C or at room temperature until quantification during the workshop at NRL. DNA concentrations were measured with the Qubit Fluorometer 4.0 (Thermo Fisher) using the dsDNA HS assay standard protocol. Input concentrations for each isolate are recorded in Figure 2—source data 1.

Oxford nanopore library construction and sequencing

Request a detailed protocol

Library preparation and sequencing was performed at the NRL or at Johns Hopkins University (Baltimore, Maryland, United States). In both locations, quantified, extracted DNA was diluted to 1000 ng in 48 µL input material, as required by the SQK-LSK109 library preparation kit from Oxford Nanopore Technologies. For samples with a concentration below 20 ng/µL, 48 µL of extracted DNA was used, regardless of the final amount (<1000 ng).

DNA repair and end-prep was performed with 2 µL NEBNext FFPE DNA Repair Mix, 3.5 µL NEBNext FFPE DNA Repair Buffer, 3 µL Ultra II End-Prep Reaction Mix, and 3.5 µL Ultra II End-Prep Enzyme Buffer (New England Biolabs). The reaction was incubated at 20°C for 20 min and then 65°C for 10 min, and then purified with 1x AMPure XP beads (Beckman Coulter) and eluted in 25 µL Elution Buffer (Promega). A unique 2.5 µL Native Barcode (Oxford Nanopore EXP-NBD104) and 25 µL Blunt/TA Ligase Master Mix (New England Biolabs) were added to dA-tailed DNA and the reaction mixture was incubated for 30 min at room temperature. As before, the reaction was purified with 1x AMPure XP beads. The purified DNA was eluted in 10 µL Elution Buffer. Equimolar amounts of six to nine isolates were pooled to an approximate total DNA amount of 1000 ng, and the pool was diluted to 65 µL in Elution Buffer. The barcoded pool was added to 5 µL Adapter Mix II (Oxford Nanopore EXP-NBD104), 20 µL NEBNext Quick Ligation Reaction Buffer (New England Biolabs), and 10 µL Quick T4 DNA Ligase (New England Biolabs). The reaction was incubated at room temperature for 30 min and purified with 0.5x AMPure XP beads, using Long Fragment Buffer (Oxford Nanopore SQK-LSK109) instead of ethanol in the wash steps. The final library was eluted in 15 µL Elution Buffer and quantified using the Qubit Fluorometer 4.0 to confirm the process had been successful.

For sequencing, 190 ng pooled final library was diluted to 12 µL in Elution Buffer and combined with 37.5 µL Sequencing Buffer and 25.5 µL Loading Beads (Oxford Nanopore SQK-LSK109) before loading onto a primed R9.4.1 Oxford Nanopore MinION flowcell. For each sequencing run, the MinION was run for 48 hr using an offline-capable version of MinKNOW made available by Oxford Nanopore Technologies specifically for this project.

All sequencing files were basecalled using Guppy version 3.0.3 with the flip-flop model (dna_r9.4.1_450bps_fast.cfg). Adapters were removed from reads and reads were demultiplexed into individual samples with Porechop (Wick et al., 2017). Filtlong (https://github.com/rrwick/Filtlong) was used with the following options to reduce file size and filter out low quality sequencing reads: ‘--keep_percent 90 --target_bases 800000000.’ The resulting filtered FASTQ files were used as input for all subsequent analyses, including assembly.

Reference-based genome assembly

Request a detailed protocol

Reference-based assembly was performed with nanopolish version 0.11.1 (Loman et al., 2015) using the seventh pandemic O1 strain N16961 as a reference (accession: AE003852/AE003853). Briefly, reads were indexed with nanopolish, then mapped to the reference with minimap2 version 2.17 (Li, 2018), then sorted and indexed with samtools version 1.10 (Li et al., 2009). Variants to the reference genome were then called with nanopolish, and bcftools version 1.9 (Li, 2011) was used to filter out variants with less than 75% support and sites with less than 100x coverage. A custom Python3 script was used to create consensus genomes from the VCF files, keeping only sites with at least 100x coverage. The full reference-based genome assembly pipeline for V. cholerae is publicly available at: https://github.com/HopkinsIDD/minion-vc (copy archived at swh:1:dir:3180fe441b99128a760fac987cf02a5b4e9bbdaf).

Maximum likelihood estimation

Request a detailed protocol

A maximum likelihood tree was generated on a multiple sequence alignment of the genomes from Cameroon, Niger, and Nigeria as well as 1280 previously published V. cholerae whole genome sequences (Figure 2—source data 3Bwire et al., 2018; Weill et al., 2019; Wick et al., 2017). Recombinant sites were masked as previously described (Weill et al., 2017) (see https://figshare.com/s/d6c1c6f02eac0c9c871e for masking sites) and using Gubbins version 2.3.4 (Croucher et al., 2015). The maximum likelihood tree was generated on the resulting masked SNP alignment using IQ-TREE version 1.6.10 (Nguyen et al., 2015) with a GTR substitution model and 1000 bootstrap iterations. The tree was rooted on A6 (accession: ERR025382). Trees were visualized in FigTree version 1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/).

Vibrio cholerae typing

Request a detailed protocol

Reads from each isolate were mapped to reference sequences ctxA (accession: AF463401), wbeO1 (accession: KC152957), and wbfO139 (accession: AB012956), as described in Greig et al., 2018. Reads were mapped using minimap2 version 2.17 (Li, 2018) and then sorted with samtools version 1.10 (Li et al., 2009). The number of reads mapping at each reference gene site was counted using samtools.

Antibiotic susceptibility assays

Request a detailed protocol

Antibiotic susceptibility testing of V. cholerae isolates were done by disc diffusion on MH agar plates at Centre Pasteur du Cameroun, Centre de Recherche Médicale et Sanitaire, and NRL. The following antibiotics were tested for resistance: amoxicillin/clavulanic acid, nalidixic acid, polymyxin, tetracycline, ciprofloxacin, and chloramphenicol. The interpretation of resistance/susceptibility profiles was done according to the CASFM 2007 guidelines for Enterobacteriaceae.

Antibiotic resistance gene detection

Request a detailed protocol

Genes associated with antibiotic resistance were detected in two ways: first, by using the software abricate (https://github.com/tseemann/abricate (copy archived at swh:1:dir:822d8a8864813a025f248c85d422f5b6a82a5b2a)) with the NCBI (Feldgarden et al., 2019) and CARD (Jia et al., 2017) databases. And second, by translating the nucleotide sequences on the EMBOSS (Rice et al., 2000) website and aligning the resulting translations to target genes downloaded from https://pubmlst.org/vcholerae/ and https://www.ncbi.nlm.nih.gov/ with the CLUSTALW (Thompson et al., 1994) website. Resistance to quinolones was determined by comparing the assemblies to reference strains, as described in previous studies (Chien et al., 2016; Divya et al., 2014; Zhou et al., 2013).

Results

During isolate selection, we prioritized isolates from locations near the borders between Cameroon, Niger, and Nigeria, in order to increase the likelihood of capturing evidence of cross-border transmission. We selected 46 isolates from clinical and lab confirmed cases for sequencing (16 from Cameroon, 15 from Niger, 15 from Nigeria), 37 of which were from 2018 to 2019 cases (Figure 1B, Figure 1—source data 2).

During the workshop, we sequenced these isolates on the Oxford Nanopore MinION platform, producing 44 genomes covering at least 95% of the N16961 reference at >100x coverage (Figure 2, Figure 2—source data 1). We aligned these genomes to 1280 previously published V. cholerae O1 whole genome sequences (Bwire et al., 2018; Irenge et al., 2020; Weill et al., 2019; Wick et al., 2017), which provide a broad overview of global seventh pandemic O1 cholera diversity. The majority of these sequences are from Africa and South Asia, and reported cases in West and Central Africa (World Health Organization, 2019) are well represented by the sequence data (Figure 2—source data 2, Figure 2—figure supplement 1, Figure 2—figure supplement 2). We generated a maximum likelihood tree from these data, which showed that all 44 isolates belong to the T12 lineage. Two sequenced isolates did not align to the reference (NGA_148_2019, NGA_252_2019), and we found that the reads did not map to the wbe or wbf gene clusters associated with the O1 and O139 serogroups, respectively (or the ctxA cholera toxin) (Greig et al., 2018). These results are concordant with laboratory testing performed in Nigeria, which designated these isolates as non-O1 V. cholerae. We also performed antimicrobial resistance (AMR) gene detection on these sequences and compared the results to previously published sequences. AMR profiles were similar to prior sequences from this region, though we detected new mutations in genes (gyrA and gyrB) associated with quinolone resistance (see Supplementary file 1 for AMR results).

Figure 2 with 2 supplements see all
Phylogenetic tree of V. cholerae O1 sequences.

Left: maximum likelihood tree of global V. cholerae isolates. Samples generated in this study are shown in blue (see also, Figure 2—source data 1). Right: zoom view of a portion of the T12 lineage containing V. cholerae genomes generated in this study. Country codes: TGO, Togo; NER, Niger; GHA, Ghana; CMR, Cameroon; NGA, Nigeria; TCD, Chad. Scale bar unit: nucleotide substitutions per site.

Figure 2—source data 1

Table of sequencing metrics for V. cholerae O1 genomes.

https://cdn.elifesciences.org/articles/65159/elife-65159-fig2-data1-v1.xlsx
Figure 2—source data 2

Summary table of published V. cholerae O1 genomes included in this study.

https://cdn.elifesciences.org/articles/65159/elife-65159-fig2-data2-v1.xlsx
Figure 2—source data 3

Accession numbers, references, and basic metadata for sequences included in global phylogeny.

The majority of these data were compiled by Weill et al., and the majority of background sequences were originally published in Weill et al., 2017, Weill et al., 2017, Irenge et al., 2020, and Bwire et al., 2018, among others.

https://cdn.elifesciences.org/articles/65159/elife-65159-fig2-data3-v1.zip
Figure 2—source data 4

Cases reported to the World Health Organization by decade, continent, and African subregion (where applicable), compiled from weekly epidemiological records (World Health Organization, 2019).

https://cdn.elifesciences.org/articles/65159/elife-65159-fig2-data4-v1.zip

Discussion

We found no evidence for a new introduction of V. cholerae O1 from outside of West Africa between 2014—the date of the last previously published isolate from these countries—and 2018. This finding is based on the fact that all newly sequenced isolates from 2018 to 2019 (as well as the 2011–2016 isolates from Niger) fall within the T12 lineage, which was present in West (and parts of Central) Africa from 2009 to 2014. Instead, we found evidence for evolution of the bacterium within the region, as the 2018–2019 sequences are part of a sub-clade containing 2014 and 2016 isolates from countries in West Africa (Figure 2). Within this sub-clade, the 2018–2019 sequences are distinct. In concordance with the epidemiologic data, the high similarity between sequences (including shared mutations in known AMR genes, see Supplementary file 1A and 1B) from the three countries suggests that their outbreaks are linked.

The phylogeny also suggests that nearby countries such as Ghana and Togo may have outbreaks connected to those in Cameroon, Niger, and Nigeria, although it is unclear just how far this cholera transmission region extends. As of this writing, there are no published V. cholerae O1 genomes from the nearby countries of Benin, Burkina Faso, or Mali, for example, from after 2010. Sequences from Central African Republic, South Sudan and Democratic Republic of the Congo from 2011 to 2018 suggest that the T10 lineage is predominant in Central Africa (Irenge et al., 2020; Weill et al., 2019), but more recent data are needed to delineate the boundary between T10 and T12 circulation.

Even so, there is strong evidence that, at least in Cameroon, Niger, and Nigeria, T12 V. cholerae was present throughout 2009–2019 despite few reported cases in 2014–2017. This could be explained by underreporting or under-diagnosing of cases, movement of the same cholera lineage in and out of these countries during that time period, or an environmental reservoir of V. cholerae in the region (such as Lake Chad, although prior studies have provided evidence against this Debes et al., 2016) that is responsible for repeated reintroduction of the same lineage back into the population.

Expanded surveillance and additional sequencing data will be necessary to understand the interconnectedness of populations across countries and achieve global elimination goals such as those outlined in the Global Task Force for Cholera Control Roadmap (Global Task Force for Cholera Control, 2017 ). For example, limiting regional movement could be an ineffective strategy if new lineages are repeatedly and simultaneously introduced from outside the region. Additionally, sequencing data can be used to detect and track resistance to antibiotic treatments, and to identify potential new resistance-conferring mutations present in local isolates (Mashe et al., 2020). A regional cholera surveillance network could facilitate efforts to gather isolates, build the local capacity needed to generate sequencing data, and ensure that these data are interpreted within the regional context.

As a result of the workshop, scientists from Cameroon, Niger, and Nigeria have all been trained in V. cholerae sequencing with the Oxford Nanopore MinION platform, and scientists at Nigeria Centre for Disease Control National Reference Laboratory have used this expertise to continue sequencing V. cholerae and other pathogens. Challenges still exist to continued development of sequencing capacity in West Africa, including the need for additional bioinformatics training and the lack of established supply chains for sequencing reagents. Through subsequent training workshops and ongoing collaborations facilitated through online networks, we are continuing to build sequencing capacity in Cameroon, Niger, and Nigeria, while also expanding our network to include other West African countries. Through this continued training and collaboration, we hope all participating groups will be able to sequence isolates locally from future cholera outbreaks and generate data from other high-priority pathogens. Finally, expanding sequencing knowledge, capacity, and collaboration beyond the original workshop participants will facilitate generation of data needed to control cholera outbreaks in the region.

Data availability

Raw data for all sequenced isolates is available under NCBI BioProject accession: PRJNA616029. Source data files have been provided for Figures 1 and 2, and R code used to make figures has been provided as Source Code File 1.

The following data sets were generated

References

  1. Report
    1. Global Task Force on Cholera Control Surveillance Working Group
    (2017)
    Interim Guidance Document on Cholera Surveillance
    WHO.
  2. Report
    1. Nigeria Centre for Disease Control
    (2019)
    National Monthly Update for Cholera in Nigeria
    NCDC.
  3. Report
    1. UNICEF
    (2013a)
    Cholera Epidemiology and Response Factsheet Niger
    WHO.
  4. Report
    1. UNICEF
    (2013b)
    Cholera Epidemiology and Response Factsheet Nigeria
    WHO.
  5. Report
    1. UNICEF
    (2013c)
    Cholera Epidemiology Response Factsheet Cameroon
    WHO.

Decision letter

  1. Isabel Rodriguez-Barraquer
    Reviewing Editor; University of California, San Francisco, United States
  2. Eduardo Franco
    Senior Editor; McGill University, Canada
  3. Surajit Basak
    Reviewer
  4. Taj Azerian
    Reviewer

Our editorial process produces two outputs: i) public reviews designed to be posted alongside the preprint for the benefit of readers; ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Acceptance summary:

The paper "Regional sequencing collaboration reveals persistence of the T12 Vibrio cholerae O1 lineage in West Africa" presents results from sequencing and analyzing 46 Vibrio cholerae whole genome sequence data. The paper presents findings from a region with little genomic surveillance, and as such these data are valuable. The study was conducted in the context of a regional training, and as such adds value as a potential model for regionally coordinated genomic surveillance efforts in areas where surveillance is limited.

Decision letter after peer review:

Thank you for submitting your article "Regional sequencing collaboration reveals persistence of the T12 Vibrio cholerae O1 lineage in West Africa" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Surajit Basak (Reviewer #1); Taj Azerian (Reviewer #2).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

The paper "Regional sequencing collaboration reveals persistence of the T12 Vibrio cholerae O1 lineage in West Africa" presents results from sequencing and analyzing 46 Vibrio cholerae whole genome sequence data. The paper presents findings from a region with little genomic surveillance, and as such these data are valuable. While the analysis doesn't provide much novelty in terms of understanding cholera transmission, the study was conducted in the context of a regional training, and as such adds value as a potential model for regionally coordinated genomic surveillance efforts in areas where surveillance is limited. However even though it seems that the authors aim to present this as a surveillance model, the current focus of the paper is on the somewhat limited inference made about transmission. We ask the authors to revise the paper, focusing on the regional genomic surveillance effort. In particular we would like the authors to provide more detail on:

1) Where did the isolates come from (e.g., cholera treatment centers, hospitals, or broader active surveillance)?

2) Do they conduct environmental sampling and could this be part of future efforts?

3) Who attended the training? Were they members from regional ministry of health labs, academic institutes etc?

4) Were the attendees laboratorians, bioinformaticians, clinicians etc?

5) Was there an effort to analyze the data, particularly the bioinformatics portion, locally or did the rely 100% on the collaborators at JHU? If the latter, this is probably not a good sustainable model for ongoing genomic epidemiology. If the prior, then were local or regional computing resources used

6) Are they continuing to sequence isolates after the training?

In addition we would like you to address the following points regarding the analyses:

7) A sentence about the cholera case definition would be helpful to clarify what reported suspected cases include. Is this based on symptoms, culture, molecular data?

8) Last paragraph of the introduction – There needs to be a little more preamble before introducing the lineage designations. How are they defined in the population structure of the 7th pandemic?

9) The authors should include some brief statistics on the distribution of regions represented in the 1280 comparison genomes even if these data can be abstracted from the supplemental material. These data would allow the reader to assess the representativeness of the sample. Particular focus should be given to the proportion of the sample from West Africa and the rest of the continent.

10) In the first paragraph of the discussion, it is stated that there is no evidence of introductions during the time periods in question. How was this inferred? Was it completely subjective or based on lineages and genetic distance? This should be clarified. Also, the authors address the limitation of sampling from the region. In particular, there is generally low representation from African countries. Further the T12 lineage is generally prevalent in the region. Therefore, if the determination of importations was made solely on lineages, then it will be difficult to resolve migration among the neighboring countries, as the authors note. Either there should be a more detailed analysis of strains from Africa (e.g., a separate phylogeny like the expanded clade in Figure 2) or clarification that importations are referring to migration from other regions where cholera is endemic e.g., SE Asia.

Reviewer #1 (Recommendations for the authors):

The authors may consider the following suggestions:

1. Is there any characteristics variation of mutational landscape of isolates from three different countries?

2. Does the mutational landscape change yearwise?

Reviewer #2 (Recommendations for the authors):

1) A sentence about the cholera case definition would be helpful to clarify what reported suspected cases include. Is this based on symptoms, culture, molecular data?

2) Last paragraph of the introduction – There needs to be a little more preamble before introducing the lineage designations. How are they defined in the population structure of the 7th pandemic?

3) The abstract is excessively sparse and could be expanded. It would be worthwhile to note that the analysis included 1280 other genomes as well as including the motivation for the analysis.

4) The authors should include some brief statistics on the distribution of regions represented in the 1280 comparison genomes even if these data can be abstracted from the supplemental material. These data would allow the reader to assess the representativeness of the sample. Particular focus should be given to the proportion of the sample from West Africa and the rest of the continent.

5) In the first paragraph of the discussion, it is stated that there is no evidence of introductions during the time periods in question. How was this inferred? Was it completely subjective or based on lineages and genetic distance? This should be clarified. Also, the authors address the limitation of sampling from the region. In particular, there is generally low representation from African countries. Further the T12 lineage is generally prevalent in the region. Therefore, if the determination of importations was made solely on lineages, then it will be difficult to resolve migration among the neighboring countries, as the authors note. Either there should be a more detailed analysis of strains from Africa (e.g., a separate phylogeny like the expanded clade in Figure 2) or clarification that importations are referring to migration from other regions where cholera is endemic e.g., SE Asia.

https://doi.org/10.7554/eLife.65159.sa1

Author response

Essential revisions:

The paper "Regional sequencing collaboration reveals persistence of the T12 Vibrio cholerae O1 lineage in West Africa" presents results from sequencing and analyzing 46 Vibrio cholerae whole genome sequence data. The paper presents findings from a region with little genomic surveillance, and as such these data are valuable. While the analysis doesn't provide much novelty in terms of understanding cholera transmission, the study was conducted in the context of a regional training, and as such adds value as a potential model for regionally coordinated genomic surveillance efforts in areas where surveillance is limited. However even though it seems that the authors aim to present this as a surveillance model, the current focus of the paper is on the somewhat limited inference made about transmission. We ask the authors to revise the paper, focusing on the regional genomic surveillance effort. In particular we would like the authors to provide more detail on:

1) Where did the isolates come from (e.g., cholera treatment centers, hospitals, or broader active surveillance)?

We have added the following note to the methods describing where the isolates came from:

“In Cameroon, isolates came from cholera treatment centers from throughout the country. […] And in Nigeria, isolates came from cholera treatment centers in the Northeastern and Northwestern regions of the country.”

We have also added a mention of this in the newly added “Approach” section of the main text of the manuscript.

2) Do they conduct environmental sampling and could this be part of future efforts?

No environmental isolates were sequenced as part of this paper, as all included samples were collected as part of routine clinical surveillance. Sequencing from environmental samples could be a way to interrogate the hypothesis of an environmental reservoir. However, previous efforts have not been successful in identifying toxigenic O1 cholera in candidate reservoirs (e.g., in Lake Chad, as cited in the manuscript) so efforts here were focused on clinical isolates, as these have been shown to be more important for local and regional decision making. Building local capacity to sequence clinical isolates will likely prove more fruitful for understanding regional outbreaks in the short term, though sequencing from environmental samples will be an important future area of improvement for cholera research studies.

3) Who attended the training? Were they members from regional ministry of health labs, academic institutes etc?

Participants of the training included three individuals from Nigeria Centre for Disease Control National Reference Laboratory, one individual from Centre Pasteur du Cameroun, two individuals from the National Public Health Laboratory in Cameroon, two individuals from Centre de Recherche Médicale et Sanitaire in Niger, and one individual from the University of Ibadan in Nigeria. These participants brought with them a wealth of knowledge in regional cholera transmission, as well as microbiology techniques used in public health labs. We have restructured the manuscript to include an Approach section that more clearly discusses the participants and goals of the workshop, and we have added the following text to highlight the experience of the workshop participants:

“In October 2019, researchers from Cameroon, Niger, Nigeria, and the United States came together to discuss how whole genome sequencing could contribute to understanding of cholera spread in these countries. […] In all countries, reported cases were defined according to the World Health Organization suspected case definition (Global Task Force on Cholera Control Surveillance Working Group, 2017): individuals with acute watery diarrhoea and severe dehydration in areas where a cholera outbreak has not been declared, and individuals with acute watery diarrhoea in areas where an outbreak has been declared.”

4) Were the attendees laboratorians, bioinformaticians, clinicians etc?

As explained in the response above, most participants were microbiologists working in public health labs. All participants had technical laboratory skills, and a few had prior bioinformatics experience. We hope that the added Approach section of the manuscript provides additional details as to the goals and participants in the workshop experience.

5) Was there an effort to analyze the data, particularly the bioinformatics portion, locally or did the rely 100% on the collaborators at JHU? If the latter, this is probably not a good sustainable model for ongoing genomic epidemiology. If the prior, then were local or regional computing resources used

Absolutely. The workshop provided a basic introduction to sequencing processes (basecalling, demultiplexing, assembly) as well a tutorial on using the Linux command line to run bioinformatics processes. During the workshop, participants worked on one of two provided sequencing laptops (one of which remains at the National Reference Laboratory at Nigeria Centre for Disease Control) or own their own laptop booted from an external hard drive with all software installed.

It is very difficult to become a bioinformatics expert overnight, so the workshop focused on the key concepts needed to be conversant in whole genome sequencing and bioinformatics, in the hopes that this would seed future collaborations between the workshop participants and with other groups. Indeed, Nigeria Centre for Disease Control is currently involved in several pathogen sequencing projects (notably including ongoing SARS-CoV-2 sequencing projects), several using the laptop and MinION (and bioinformatics protocols) provided during the workshop. The workshop also focused on using publicly available phylogenetic tools and on interpretation of phylogenetic trees, and we spent the last day of the workshop discussing the conclusions from a phylogenetic tree that included the sequences generated during the training.

Additional follow up phylogenetic analysis for this manuscript did rely on collaborators at JHU, but all participating groups have been invited to an ongoing follow-up workshop, which will again teach key bioinformatics skills and analysis techniques. All of these analyses are to be conducted on laptop computers provided to participating groups (or acquired independently, with advice and setup support from JHU collaborators), with the goal of working towards a more sustainable model for ongoing genomic epidemiology that does not rely on JHU.

Finally, all antibiotic resistance analyses presented in the manuscript were conducted entirely independently from the JHU authors, using the FASTA files generated during the workshop and local expertise.

As shown in the expert copied above, we have aimed to revise the manuscript to highlight the emphasis on bioinformatics training and collaborative analysis during the workshop period. We have also added additional text to the discussion highlighting some of the points above:

“As a result of the workshop, scientists from Cameroon, Niger, and Nigeria have all been trained in V. cholerae sequencing with the Oxford Nanopore MinION platform, and scientists at Nigeria Centre for Disease Control National Reference Laboratory have used this expertise to continue sequencing V. cholerae and other pathogens. […] Finally, expanding sequencing knowledge, capacity, and collaboration beyond the original workshop participants will facilitate generation of data needed to control cholera outbreaks in the region.”

6) Are they continuing to sequence isolates after the training?

As mentioned briefly above, Nigeria Centre for Disease control has continued to sequence isolates after the training. They briefly attempted cholera sequencing but have mostly transitioned to SARS-CoV-2 sequencing in light of the ongoing pandemic. Individuals from Centre Pasteur du Cameroun, the Cameroon National Public Health Lab, and Centre de Recherche Médicale et Sanitaire in Niger recently received sequencing equipment (including a laptop for bioinformatic analysis) and will begin sequencing SARS-CoV-2 as part of an ongoing follow-up workshop. The hope is that additional training in pathogen sequencing will contribute to cholera surveillance after the imminent needs of the COVID-19 pandemic have subsided. As shown in the previous response, we have updated the manuscript to mention ongoing training and opportunities to build sequencing capacity (despite the brief pivot away from cholera in light of the ongoing COVID-19 pandemic).

In addition we would like you to address the following points regarding the analyses:

7) A sentence about the cholera case definition would be helpful to clarify what reported suspected cases include. Is this based on symptoms, culture, molecular data?

We have added a sentence on how cholera cases were defined to the Approach section and the Figure 1 caption. Specifically, we specify that case counts are clinically based, using the WHO definition of suspected cases:

“In all countries, reported cases were defined according to the World Health Organization suspected case definition (Global Task Force on Cholera Control Surveillance Working Group, 2017): individuals with acute watery diarrhoea and severe dehydration in areas where a cholera outbreak has not been declared, and individuals with acute watery diarrhoea in areas where an outbreak has been declared.”

8) Last paragraph of the introduction – There needs to be a little more preamble before introducing the lineage designations. How are they defined in the population structure of the 7th pandemic?

This is an excellent point, and we realize we did not provide enough information about the existing cholera lineages in our first submission. We have restructured the introduction of the manuscript entirely, now including an Approach section, but have made sure to include this background information in the introduction:

“One way to distinguish between transmission scenarios is to compare the specific V. cholerae lineages in circulation. […] In this manuscript, we aim to use additional sequence data from recent cholera outbreaks to improve our understanding of more detailed V. cholerae transmission patterns within West Africa, and to understand if regionally coordinated surveillance and response may improve containment outcomes.”

9) The authors should include some brief statistics on the distribution of regions represented in the 1280 comparison genomes even if these data can be abstracted from the supplemental material. These data would allow the reader to assess the representativeness of the sample. Particular focus should be given to the proportion of the sample from West Africa and the rest of the continent.

The majority of the background samples included in this study come from Genomic history of the seventh pandemic of cholera in Africa published by Weill et al. in 2017. The Weill et al. publication includes a breakdown of the 1070 sequences included in that paper by introduction (Figure 3) and by continent and year (Figure S1). They also include a plot of the reported cholera cases in Africa by year in Figure S2, for purposes of comparison.

To facilitate understanding of the background dataset in this manuscript, we have added Figure 2—figure supplement 2, which breaks down the included sequences by decade and continent, and further enumerates sequences from the regions and countries relevant to this manuscript. We have also added Figure 2—figure supplements 3 and 4 in order to compare the sequences included in this study to cases reported to the WHO. In each, we report the distribution of cases across regions for the relevant time period, and plot this alongside the distribution of sequences across regions for the relevant time period. As shown in these figures, the African continent is well represented by sequence data across all relevant time periods. Additionally, Western and Central Africa – the regions most relevant to this study – are consistently well represented by the included sequence data (though the total number of sequences are of course limited in number).

To the Results section, we have added the following description:

“We aligned these genomes to 1280 previously published V. cholerae O1 whole genome sequences (Bwire et al., 2018; Irenge et al., 2020; Weill et al., 2019, 2017), which provide a broad overview of global seventh pandemic O1 cholera diversity. The majority of these sequences are from Africa and South Asia, and reported cases in West and Central Africa (World Health Organization, 2019) are well represented by the sequence data (Figure 2—figure supplements 2–4).”

10) In the first paragraph of the discussion, it is stated that there is no evidence of introductions during the time periods in question. How was this inferred? Was it completely subjective or based on lineages and genetic distance? This should be clarified. Also, the authors address the limitation of sampling from the region. In particular, there is generally low representation from African countries. Further the T12 lineage is generally prevalent in the region. Therefore, if the determination of importations was made solely on lineages, then it will be difficult to resolve migration among the neighboring countries, as the authors note. Either there should be a more detailed analysis of strains from Africa (e.g., a separate phylogeny like the expanded clade in Figure 2) or clarification that importations are referring to migration from other regions where cholera is endemic e.g., SE Asia.

We have rephrased the first sentence of the discussion to clarify that lack of evidence for introductions is based on the fact that all newly sequenced isolates from the region fall clearly within the T12 lineage, which was the lineage last reported in the region. We have also specified that we mean introduction from outside of West Africa, since the reviewer correctly notes that we are not able to resolve transmission within the region:

“We found no evidence for a new introduction of V. cholerae O1 from outside of West Africa between 2014 – the date of the last published isolate from these countries – and 2018. […] In concordance with the epidemiologic data, the high similarity between sequences (including shared mutations in known AMR genes, see Supplementary File 1A and 1B) from the three countries suggests that their outbreaks are linked.”

Reviewer #1 (Recommendations for the authors):

The authors may consider the following suggestions:

1. Is there any characteristics variation of mutational landscape of isolates from three different countries?

As shown in the phylogeny, 2018–2019 sequences from Cameroon, Niger, and Nigeria form a distinct cluster. This cluster shares 4 mutations not seen in any other T12 sequences. However, the fact that these mutations are not shared with previously published sequences from the same countries suggests that these mutations are not characteristic to the countries, but rather the result of evolution of the bacterium over time (as noted in the Discussion section). Additionally, the sequences from the three countries are interspersed within the 2018–2019 cluster, indicating there are no mutations characteristic of one country and not seen in the others (confirmed by analysis of individual variants in these sequences). These observations support our conclusion that outbreaks in these three countries are likely linked.

2. Does the mutational landscape change yearwise?

While this is an important question, we do not have enough samples from each year to provide a definitive answer. We do note in the discussion, however, that samples from 2018–2019 form a distinct clade on the phylogeny (distinct from 2014 and 2016), which suggests evolution of the mutational landscape over this time period.

Reviewer #2 (Recommendations for the authors):

The abstract is excessively sparse and could be expanded. It would be worthwhile to note that the analysis included 1280 other genomes as well as including the motivation for the analysis.

The reviewer is correct. We have revised the abstract accordingly. It now reads:

“Despite recent insights into cholera transmission patterns in Africa, regional and local dynamics in West Africa – where cholera outbreaks occur every few years – are still poorly understood. […] The results from this regional sequencing effort provide a model for future regionally coordinated surveillance efforts.”

https://doi.org/10.7554/eLife.65159.sa2

Article and author information

Author details

  1. Eme Ekeng

    Nigeria Centre for Disease Control, Abuja, Nigeria
    Contribution
    Conceptualization, Resources, Formal analysis, Investigation, Project administration, Writing - review and editing
    Contributed equally with
    Serges Tchatchouang
    Competing interests
    No competing interests declared
  2. Serges Tchatchouang

    Centre Pasteur du Cameroun, Yaoundé, Cameroon
    Contribution
    Conceptualization, Formal analysis, Investigation, Writing - original draft, Writing - review and editing
    Contributed equally with
    Eme Ekeng
    Competing interests
    No competing interests declared
  3. Blaise Akenji

    National Public Health Laboratory, Yaoundé, Cameroon
    Contribution
    Conceptualization, Formal analysis, Investigation, Writing - original draft, Writing - review and editing
    Competing interests
    No competing interests declared
  4. Bassira Boubacar Issaka

    Centre de Recherche Médicale et Sanitaire, Niamey, Niger
    Contribution
    Conceptualization, Formal analysis, Investigation, Writing - review and editing
    Competing interests
    No competing interests declared
  5. Ifeoluwa Akintayo

    Department of Pharmaceutical Microbiology, Faculty of Pharmacy, University of Ibadan, Ibadan, Nigeria
    Contribution
    Conceptualization, Formal analysis, Investigation, Writing - review and editing
    Competing interests
    No competing interests declared
  6. Christopher Chukwu

    Nigeria Centre for Disease Control, Abuja, Nigeria
    Contribution
    Conceptualization, Formal analysis, Investigation, Writing - review and editing
    Competing interests
    No competing interests declared
  7. Ibrahim Dan Dano

    Centre de Recherche Médicale et Sanitaire, Niamey, Niger
    Contribution
    Conceptualization, Formal analysis, Investigation, Writing - original draft, Writing - review and editing
    Competing interests
    No competing interests declared
  8. Sylvie Melingui

    National Public Health Laboratory, Yaoundé, Cameroon
    Contribution
    Conceptualization, Formal analysis, Investigation, Writing - review and editing
    Competing interests
    No competing interests declared
  9. Sani Ousmane

    Centre de Recherche Médicale et Sanitaire, Niamey, Niger
    Contribution
    Supervision, Investigation, Writing - review and editing
    Competing interests
    No competing interests declared
  10. Michael Oladotun Popoola

    Nigeria Centre for Disease Control, Abuja, Nigeria
    Contribution
    Conceptualization, Formal analysis, Investigation, Writing - review and editing
    Competing interests
    No competing interests declared
  11. Ariane Nzouankeu

    Centre Pasteur du Cameroun, Yaoundé, Cameroon
    Contribution
    Investigation, Writing - review and editing
    Competing interests
    No competing interests declared
  12. Yap Boum

    Epicentre, Paris, France
    Contribution
    Supervision, Writing - review and editing
    Competing interests
    No competing interests declared
  13. Francisco Luquero

    Epicentre, Paris, France
    Contribution
    Supervision, Writing - review and editing
    Competing interests
    No competing interests declared
  14. Anthony Ahumibe

    Nigeria Centre for Disease Control, Abuja, Nigeria
    Contribution
    Resources, Supervision, Project administration, Writing - review and editing
    Competing interests
    No competing interests declared
  15. Dhamari Naidoo

    World Health Organization Nigeria, Abuja, Nigeria
    Contribution
    Resources, Supervision, Project administration, Writing - review and editing
    Competing interests
    No competing interests declared
  16. Andrew Azman

    Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Funding acquisition, Investigation, Visualization, Project administration, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-8662-9077
  17. Justin Lessler

    Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Funding acquisition, Investigation, Visualization, Project administration, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-9741-8109
  18. Shirlee Wohl

    Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, United States
    Contribution
    Conceptualization, Software, Formal analysis, Validation, Investigation, Visualization, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    swohl@jhu.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-0311-3348

Funding

Bill and Melinda Gates Foundation (OPP1195157)

  • Andrew Azman
  • Justin Lessler

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This project was made possible by the Nigeria Centre for Disease Control, who graciously hosted researchers at the National Reference Laboratory in Abuja, Nigeria for training and V. cholerae sequencing. We thank Aicha Omar, Zanguina Jibir, and Moussa Soussou Amadou for laboratory assistance and strain isolation, Alama Keita and the West Africa Cholera Platform for sharing incidence data, David Mohr for laboratory assistance and space, and Daryl Domman for providing V. cholerae alignments used as a background phylogenetic dataset. We also thank the original authors of the sequences used in this background dataset (references provided in Figure 2—source data 3). Funding for this project was provided by Bill and Melinda Gates Foundation OPP1195157.

Senior Editor

  1. Eduardo Franco, McGill University, Canada

Reviewing Editor

  1. Isabel Rodriguez-Barraquer, University of California, San Francisco, United States

Reviewers

  1. Surajit Basak
  2. Taj Azerian

Version history

  1. Received: November 25, 2020
  2. Accepted: June 5, 2021
  3. Version of Record published: June 18, 2021 (version 1)

Copyright

© 2021, Ekeng et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 699
    Page views
  • 96
    Downloads
  • 6
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Eme Ekeng
  2. Serges Tchatchouang
  3. Blaise Akenji
  4. Bassira Boubacar Issaka
  5. Ifeoluwa Akintayo
  6. Christopher Chukwu
  7. Ibrahim Dan Dano
  8. Sylvie Melingui
  9. Sani Ousmane
  10. Michael Oladotun Popoola
  11. Ariane Nzouankeu
  12. Yap Boum
  13. Francisco Luquero
  14. Anthony Ahumibe
  15. Dhamari Naidoo
  16. Andrew Azman
  17. Justin Lessler
  18. Shirlee Wohl
(2021)
Regional sequencing collaboration reveals persistence of the T12 Vibrio cholerae O1 lineage in West Africa
eLife 10:e65159.
https://doi.org/10.7554/eLife.65159

Further reading

    1. Epidemiology and Global Health
    Charumathi Sabanayagam, Feng He ... Ching Yu Cheng
    Research Article Updated

    Background:

    Machine learning (ML) techniques improve disease prediction by identifying the most relevant features in multidimensional data. We compared the accuracy of ML algorithms for predicting incident diabetic kidney disease (DKD).

    Methods:

    We utilized longitudinal data from 1365 Chinese, Malay, and Indian participants aged 40–80 y with diabetes but free of DKD who participated in the baseline and 6-year follow-up visit of the Singapore Epidemiology of Eye Diseases Study (2004–2017). Incident DKD (11.9%) was defined as an estimated glomerular filtration rate (eGFR) <60 mL/min/1.73 m2 with at least 25% decrease in eGFR at follow-up from baseline. A total of 339 features, including participant characteristics, retinal imaging, and genetic and blood metabolites, were used as predictors. Performances of several ML models were compared to each other and to logistic regression (LR) model based on established features of DKD (age, sex, ethnicity, duration of diabetes, systolic blood pressure, HbA1c, and body mass index) using area under the receiver operating characteristic curve (AUC).

    Results:

    ML model Elastic Net (EN) had the best AUC (95% CI) of 0.851 (0.847–0.856), which was 7.0% relatively higher than by LR 0.795 (0.790–0.801). Sensitivity and specificity of EN were 88.2 and 65.9% vs. 73.0 and 72.8% by LR. The top 15 predictors included age, ethnicity, antidiabetic medication, hypertension, diabetic retinopathy, systolic blood pressure, HbA1c, eGFR, and metabolites related to lipids, lipoproteins, fatty acids, and ketone bodies.

    Conclusions:

    Our results showed that ML, together with feature selection, improves prediction accuracy of DKD risk in an asymptomatic stable population and identifies novel risk factors, including metabolites.

    Funding:

    This study was supported by the National Medical Research Council, NMRC/OFLCG/001/2017 and NMRC/HCSAINV/MOH-001019-00. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    1. Epidemiology and Global Health
    C Kim, Benjamin Chen ... RECOVER Mechanistic Pathways Task Force
    Review Article

    The NIH-funded RECOVER study is collecting clinical data on patients who experience a SARS-CoV-2 infection. As patient representatives of the RECOVER Initiative’s Mechanistic Pathways task force, we offer our perspectives on patient motivations for partnering with researchers to obtain results from mechanistic studies. We emphasize the challenges of balancing urgency with scientific rigor. We recognize the importance of such partnerships in addressing post-acute sequelae of SARS-CoV-2 infection (PASC), which includes ‘long COVID,’ through contrasting objective and subjective narratives. Long COVID’s prevalence served as a call to action for patients like us to become actively involved in efforts to understand our condition. Patient-centered and patient-partnered research informs the balance between urgency and robust mechanistic research. Results from collaborating on protocol design, diverse patient inclusion, and awareness of community concerns establish a new precedent in biomedical research study design. With a public health matter as pressing as the long-term complications that can emerge after SARS-CoV-2 infection, considerate and equitable stakeholder involvement is essential to guiding seminal research. Discussions in the RECOVER Mechanistic Pathways task force gave rise to this commentary as well as other review articles on the current scientific understanding of PASC mechanisms.