High resolution species assignment of Anopheles mosquitoes using k-mer distances on targeted sequences
Abstract
The ANOSPP amplicon panel is a genus-wide targeted sequencing panel designed to facilitate large-scale monitoring of Anopheles species diversity. Combining information from the 62 nuclear amplicons in the panel allows for a more nuanced species assignment than single-gene (e.g. COI) barcoding, which is desirable in light of permeable species boundaries. Here, we present NNoVAE, a method using Nearest Neighbours (NN) and Variational Autoencoders (VAE), which we apply to k-mers derived from the ANOSPP amplicon sequences in order to assign species identity hierarchically. The NN step assigns a sample to a species group by comparing the k-mers arising from each haplotype’s amplicon sequence to a reference database. The VAE step is required to distinguish between closely related species, and also has sufficient resolution to reveal population structure within species. In tests on independent samples with over 80% amplicon coverage, NNoVAE correctly classifies to species level 98% of samples within the An. gambiae complex and 89% of samples outside the complex. We apply NNoVAE to over two thousand new samples from Burkina Faso and Gabon, identifying unexpected species in Gabon. NNoVAE presents an approach that may be of value to other targeted sequencing panels. It will be used to survey Anopheles species diversity and Plasmodium transmission patterns through space and time on a large scale, with plans to analyse half a million mosquitoes in the next five years.
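To make the NN step concrete, the following is a minimal sketch of assigning one amplicon sequence to its nearest reference by k-mer distance. It is an illustration under stated assumptions, not the NNoVAE implementation: the function names, the choice of k, the Manhattan distance on k-mer counts, and the toy reference sequences are all hypothetical, and the sketch handles a single sequence rather than combining evidence across the 62 amplicons.

```python
from collections import Counter


def kmer_counts(seq, k=8):
    """Count all overlapping k-mers in a sequence (k=8 is an assumed default)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a, b):
    """Manhattan distance between two k-mer count tables (assumed metric)."""
    # Counter returns 0 for missing keys, so k-mers absent from one table
    # contribute their full count from the other.
    return sum(abs(a[key] - b[key]) for key in a.keys() | b.keys())


def nearest_group(query_seq, references, k=8):
    """Return (label, distance) of the reference closest to the query.

    `references` is a list of (label, sequence) pairs; the labels stand in
    for species groups in a hypothetical reference database.
    """
    query = kmer_counts(query_seq, k)
    best_label, best_dist = None, None
    for label, ref_seq in references:
        dist = kmer_distance(query, kmer_counts(ref_seq, k))
        if best_dist is None or dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Toy data only: these sequences are invented for illustration.
references = [
    ("group_A", "ACGTACGTACGTAAAC"),
    ("group_B", "TTTTGGGGCCCCAAAA"),
]
label, dist = nearest_group("ACGTACGTACGTAAAG", references, k=4)
print(label, dist)
```

A single substitution changes at most k of the overlapping k-mers, so closely related haplotypes stay close under this kind of distance, which is the property a nearest-neighbour assignment relies on.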
Data availability
Raw sequencing data will be made available on ENA (accession to be confirmed). Pipelines and analysis code, together with processed target haplotypes, are available on GitHub: https://github.com/mariloubodde/NNoVAE.
Article and author information
Funding
Wellcome Trust (206194/Z/17/Z)
- Mara KN Lawniczak
Wellcome Trust (RG92770)
- Marilou Boddé
Wellcome Trust (WT207492)
- Richard Durbin
Agence Nationale de la Recherche (ANR-18-CE35-0002-01 - WILDING)
- Diego Ayala
Institut de Recherche pour le Développement (Bourse ARTS/IRD)
- Lemonde Bouafou
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Reviewing Editor
- Daniel R Matute, University of North Carolina, Chapel Hill, United States
Publication history
- Received: March 18, 2022
- Preprint posted: March 20, 2022
- Accepted: October 11, 2022
- Accepted Manuscript published: October 12, 2022 (version 1)
- Version of Record published: November 10, 2022 (version 2)
Copyright
© 2022, Boddé et al.
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.