Ribosomal RNA (rRNA) sequences from 33 globally distributed mosquito species for improved metagenomics and species identification

  1. Cassandra Koh  Is a corresponding author
  2. Lionel Frangeul
  3. Hervé Blanc
  4. Carine Ngoagouni
  5. Sébastien Boyer
  6. Philippe Dussart
  7. Nina Grau
  8. Romain Girod
  9. Jean-Bernard Duchemin
  10. Maria-Carla Saleh  Is a corresponding author
  1. Institut Pasteur, Université Paris Cité, CNRS UMR3569, Viruses and RNA Interference Unit, F-75015, France
  2. Institut Pasteur de Bangui, Medical Entomology Laboratory, Central African Republic
  3. Institut Pasteur du Cambodge, Medical and Veterinary Entomology Unit, Cambodia
  4. Institut Pasteur du Cambodge, Virology Unit, Cambodia
  5. Institut Pasteur de Madagascar, Medical Entomology Unit, Madagascar
  6. Institut Pasteur de la Guyane, Vectopôle Amazonien Emile Abonnenc, French Guiana

Abstract

Total RNA sequencing (RNA-seq) is an important tool in the study of mosquitoes and the RNA viruses they vector as it allows assessment of both host and viral RNA in specimens. However, there are two main constraints. First, as with many other species, abundant mosquito ribosomal RNA (rRNA) serves as the predominant template from which sequences are generated, meaning that the desired host and viral templates are sequenced far less. Second, mosquito specimens captured in the field must be correctly identified, in some cases to the sub-species level. Here, we generate mosquito rRNA datasets which will substantially mitigate both of these problems. We describe a strategy to assemble novel rRNA sequences from mosquito specimens and produce an unprecedented dataset of 234 full-length 28S and 18S rRNA sequences of 33 medically important species from countries with known histories of mosquito-borne virus circulation (Cambodia, the Central African Republic, Madagascar, and French Guiana). These sequences will allow both physical and computational removal of rRNA from specimens during RNA-seq protocols. We also assess the utility of rRNA sequences for molecular taxonomy and compare phylogenies constructed using rRNA sequences versus those created using the gold standard for molecular species identification of specimens—the mitochondrial cytochrome c oxidase I (COI) gene. We find that rRNA- and COI-derived phylogenetic trees are incongruent and that 28S and concatenated 28S+18S rRNA phylogenies reflect evolutionary relationships that are more aligned with contemporary mosquito systematics. This significant expansion to the current rRNA reference library for mosquitoes will improve mosquito RNA-seq metagenomics by permitting the optimization of species-specific rRNA depletion protocols for a broader range of species and streamlining species identification by rRNA sequence and phylogenetics.

Editor's evaluation

Mosquitoes are an important vector for viruses and other pathogens worldwide. However, significant genomic resources are scarce for the study of these species. In this work, the authors create a significant genomic resource that will enable the study of mosquitoes and the pathogens that they carry.

https://doi.org/10.7554/eLife.82762.sa0

Introduction

Mosquitoes top the list of vectors for arthropod-borne diseases, being implicated in the transmission of many human pathogens responsible for arboviral diseases, malaria, and lymphatic filariasis (WHO, 2017). Mosquito-borne viruses circulate in sylvatic (between wild animals) or urban (between humans) transmission cycles driven by different mosquito species with their own distinct host preferences. Although urban mosquito species are chiefly responsible for amplifying epidemics in dense human populations, sylvatic mosquitoes maintain the transmission of these viruses among forest-dwelling animal reservoir hosts and are involved in spillover events when humans enter their ecological niches (Valentine et al., 2019). Given that mosquito-borne virus emergence is preceded by such spillover events, continuous surveillance and virus discovery in sylvatic mosquitoes is integral to designing effective public health measures to pre-empt or respond to mosquito-borne viral epidemics.

Metagenomics on field specimens is a powerful method in our toolkit to understand mosquito-borne disease ecology through the One Health lens (Webster et al., 2016). With next-generation sequencing becoming more accessible, such studies have provided unprecedented insights into the interfaces among mosquitoes, their environment, and their animal and human hosts. As mosquito-associated viruses are mostly RNA viruses, RNA sequencing (RNA-seq) is especially informative for surveillance and virus discovery. However, working with lesser studied mosquito species poses several problems.

First, metagenomics studies based on RNA-seq are bedevilled by overabundant ribosomal RNAs (rRNAs). These non-coding RNA molecules comprise at least 80% of the total cellular RNA population (Gale and Crampton, 1989). Due to their length and their abundance, they are a sink for precious next-generation sequencing reads, decreasing the sensitivity of pathogen detection unless depleted during library preparation. Yet the most common rRNA depletion protocols require prior knowledge of rRNA sequences of the species of interest as they involve hybridizing antisense oligos to the rRNA molecules prior to removal by ribonucleases (Fauver et al., 2019; Phelps et al., 2021) or by bead capture (Kukutla et al., 2013). Presently, reference sequences for rRNAs are limited to only a handful of species from three genera: Aedes, Culex, and Anopheles (Ruzzante et al., 2019). The lack of reliable rRNA depletion methods could deter mosquito metagenomics studies from expanding their sampling diversity, resulting in a gap in our knowledge of mosquito vector ecology. The inclusion of lesser studied yet medically relevant sylvatic species is therefore imperative.

Second, species identification based on morphology is notoriously complicated for members of certain species subgroups. This is especially the case among Culex subgroups. Sister species are often sympatric and show at least some competence for a number of viruses, such as Japanese encephalitis virus, St Louis encephalitic virus, and Usutu virus (Nchoutpouen et al., 2019). Although they share many morphological traits, each of these species have distinct ecologies and host preferences, thus the challenge of correctly identifying vector species can affect epidemiological risk estimation for these diseases (Farajollahi et al., 2011). DNA molecular markers are often employed to a limited degree of success to distinguish between sister species (Batovska et al., 2017; Zittra et al., 2016).

To address the lack of full-length rRNA sequences in public databases, we sought to determine the 28S and 18S rRNA sequences of a diverse set of Old and New World sylvatic mosquito species from four countries representing three continents: Cambodia, the Central African Republic, Madagascar, and French Guiana. These countries, due to their proximity to the equator, contain high mosquito biodiversity (Foley et al., 2007) and have had long histories of mosquito-borne virus circulation (Desdouits et al., 2015; Halstead, 2019; Héraud et al., 2022; Jacobi and Serie, 1972; Ratsitorahina et al., 2008; Saluzzo et al., 2017; Zeller et al., 2016). Increased and continued surveillance of local mosquito species could lead to valuable insights on mosquito virus biogeography. Using a unique score-based read filtration strategy to remove interfering non-mosquito rRNA reads for accurate de novo assembly, we produced a dataset of 234 novel full-length 28S and 18S rRNA sequences from 33 mosquito species, 30 of which have never been recorded before.

We also explored the functionality of 28S and 18S rRNA sequences as molecular markers by comparing their performance to that of the mitochondrial cytochrome c oxidase subunit I (COI) gene for molecular taxonomic and phylogenetic investigations. The COI gene is the most widely used DNA marker for molecular species identification and forms the basis of the Barcode of Life Data System (BOLD) (Hebert et al., 2003; Ratnasingham and Hebert, 2007). Presently, full-length rRNA sequences are much less represented compared to other molecular markers. However, given the availability of relevant reference sequences, 28S and concatenated 28S+18S rRNA sequences can be the better approach for molecular taxonomy and phylogenetic studies. We hope that our sequence dataset, with its species diversity and eco-geographical breadth, and the assembly strategy we describe would further facilitate the use of rRNA as markers. In addition, this dataset enables the design of species-specific oligos for cost-effective rRNA depletion for a broader range of mosquito species and streamlined molecular species identification during RNA-seq.

Results

Poor rRNA depletion using a non-specific depletion method

During library preparations of mosquito samples for RNA-seq, routinely used methods for depleting rRNA are commercial kits optimised for human or mice samples (Belda et al., 2019; Bishop-Lilly et al., 2010; Chandler et al., 2015; Kumar et al., 2012; Weedall et al., 2015; Zakrzewski et al., 2018) or through 80–100 base pair antisense probe hybridisation followed by ribonuclease digestion (Fauver et al., 2019; Phelps et al., 2021). In cases where the complete reference rRNA sequence of the target species is not known, oligos would be designed based on the rRNA sequence of the closest related species (25, this study). These methods should deplete reads from the conserved regions of rRNA sequences. However, reads from the variable regions remain at abundances high enough to compromise RNA-seq output. In our hands, we have found that using probes designed for the Ae. aegypti rRNA sequence followed by RNase H digestion according to the protocol published by Morlan et al., 2012, produced poor depletion in Aedes albopictus, and in Culicine and Anopheline species (Figure 1), in which between 46% and 94% of reads post-depletion were ribosomal. Additionally, the lack of full-length reference rRNA sequences compromises the in silico clean-up of remaining rRNA reads from sequencing data, as reads belonging to variable regions would not be removed. To solve this and to enable RNA-seq metagenomics on a broader range of mosquito species, we performed RNA-seq to generate reference rRNA sequences for 33 mosquito species representing 10 genera from Cambodia, the Central African Republic, Madagascar, and French Guiana. Most of these species are associated with vector activity for various pathogens in their respective ecologies (Table 1). In parallel, we sequenced the mitochondrial COI gene to perform molecular species identification of our samples and to comparatively evaluate the use of rRNA as a molecular marker (Figure 2).

Percentage of rRNA reads in mosquito total RNA sequencing (RNA-seq) data after depletion using probes antisense to Aedes aegypti sequences.

Pools of five individual mosquitoes from genera Aedes (Ae), Culex (Cx), Mansonia (Ma), and Anopheles (An) were ribodepleted by probe hybridisation followed by RNase H digestion according to the protocol by Morlan et al., 2012. Y-axis depicts percentages of remaining rRNA reads calculated as the number of rRNA reads over total reads per sample pool. Depletion efficiency decreases with taxonomic distance from Ae. aegypti underlining the need for reference sequences for species of interest.

Table 1
Mosquito species represented in this study and their vector status.
Mosquito taxonomyOrigin*Collection site (ecosystem type)Vector forReference
Aedes (Fredardsius) vittatusCFRural (village)ZIKV, CHIKV, YFVDiallo et al., 2020
Aedes (Ochlerotatus) scapularisGFRural (village)YFVVasconcelos et al., 2001
Aedes (Ochlerotatus) serratusGFRural (village)YFV, OROVCardoso et al., 2010; Romero-Alvarez and Escobar, 2018
Aedes (Stegomyia) aegyptiCFUrbanDENV, ZIKV, CHIKV, YFVKraemer et al., 2019
Aedes (Stegomyia) albopictusCF, KHRural (village, nature reserve)DENV, ZIKV, CHIKV, YFV, JEVAuerswald et al., 2021; Kraemer et al., 2019
Aedes (Stegomyia) simpsoniCFRural (village)YFVMukwaya et al., 2000
Anopheles (Anopheles) baezaiKHRural (nature reserve)Unreported
Anopheles (Anopheles) coustaniMG, CFRural (village)RVFV, malariaMwangangi et al., 2013; Nepomichene et al., 2018; Ratovonjato et al., 2011
Anopheles (Cellia) funestusMG, CFRural (village)ONNV, malariaLutomiah et al., 2013; Tabue et al., 2017
Anopheles (Cellia) gambiaeMG, CFRural (village)ONNV, malariaBrault et al., 2004
Anopheles (Cellia) squamosusMGRural (village)RVFV, malariaRatovonjato et al., 2011; Stevenson et al., 2016
Coquillettidia (Rhynchotaenia) venezuelensisGFRural (village)OROVTravassos da Rosa et al., 2017
Culex (Culex) antennatusMGRural (village)RVFVNepomichene et al., 2018; Ratovonjato et al., 2011
Culex (Culex) duttoniCFRural (village)Unreported
Culex (Culex) neaveiMGRural (village)USUVNikolay et al., 2011
Culex (Culex) orientalisKHRural (nature reserve)JEVKim et al., 2015
Culex (Culex) perexiguusMGRural (village)WNV, USUVVezenegho et al., 2022
Culex (Culex) pseudovishnuiKHRural (nature reserve)JEVAuerswald et al., 2021
Culex (Culex) quinquefasciatusMG, CF, KHRural (village, nature reserve)ZIKV, JEV, WNV, DENV, SLEV, RVFV, Wuchereria bancroftiBhattacharya and Basu, 2016; Maquart et al., 2021; Ndiaye et al., 2016; Serra et al., 2016
Culex (Culex) tritaeniorhynchusMG, KHRural (village, nature reserve)JEV, WNV, RVFVAuerswald et al., 2021; Hayes et al., 1980; Jupp et al., 2002
Culex (Melanoconion) spissipesGFRural (village)VEEVWeaver et al., 2004
Culex (Melanoconion) portesiGFRural (village)VEEV, TONVTalaga et al., 2021; Weaver et al., 2004
Culex (Melanoconion) pedroiGFRural (village)EEEV, VEEV, MADVTalaga et al., 2021; Turell et al., 2008
Culex (Oculeomyia) bitaeniorhynchusMG, KHRural (village, nature reserve)JEVAuerswald et al., 2021
Culex (Oculeomyia) poicilipesMGRural (village)RVFVNdiaye et al., 2016
Eretmapodites intermediusCFRural (village)Unreported
Limatus durhamiiGFRural (village)ZIKVBarrio-Nuevo et al., 2020
Mansonia (Mansonia) titillansGFRural (village)VEEV, SLEVHoyos-López et al., 2015; Turell, 1999
Mansonia (Mansonioides) indianaKHRural (nature reserve)JEVArunachalam et al., 2004
Mansonia (Mansonioides) uniformisMG, CF, KHRural (village, nature reserve)RVFV, Wuchereria bancroftiLutomiah et al., 2013; Ughasi et al., 2012
Mimomyia (Etorleptiomyia) mediolineataMGRural (village)Unreported
Psorophora (Janthinosoma) feroxGFRural (village)ROCVMitchell et al., 1986
Uranotaenia (Uranotaenia) geometricaGFRural (village)Unreported
  1. *

    Dengue virus, DENV; Zika virus, ZIKV; chikungunya virus, CHIKV; Yellow Fever virus, YFV; Oropouche virus, OROV; Japanese encephalitis virus, JEV; Rift Valley Fever virus, RVFV; O’Nyong Nyong virus, ONNV; Usutu virus, USUV; West Nile virus, WNV; St Louis encephalitis virus, SLEV; Venezuelan equine encephalitis virus, VEEV; Tonate virus, TONV; Eastern equine encephalitis virus, EEEV; Madariaga virus, MADV; Rocio virus, ROCV.

  2. Origin countries are listed as their ISO alpha-2 codes: Central African Republic, CF; Cambodia, KH; Madagascar, MG; French Guiana, GF.

  3. Subgenus indicated in brackets.

Novel mosquito rRNA sequences were obtained using a unique reads filtering method.

(A) Schematic of sequencing and bioinformatics analyses performed in this study to obtain full-length 18S and 28S rRNA sequences as well as cytochrome c oxidase I (COI) DNA sequences. Nucleic acids were isolated from mosquito specimens for next-generation (for rRNA) or Sanger (for COI) sequencing. Two in-house libraries were created from the SILVA rRNA gene database: Insecta and Non-Insecta, which comprises 8,585 sequences and 558,185 sequences, respectively. Following BLASTn analyses against these two libraries, each RNA-sequencing (RNA-seq) read is assigned a ratio of BLASTn scores to describe their relative nucleotide similarity to insect rRNA sequences. Based on these ratios of scores, RNA-seq reads can then be filtered to remove non-mosquito reads prior to assembly with SPAdes to give full-length 18S and 28S rRNA sequences. Image created with https://biorender.com/. (B) Based on their ratio of scores, reads can be segregated into four categories, as shown on this ratio of scores versus number of reads plot for the representative specimen ‘CF S27’: (i) reads with hits only in the Insecta library (shaded in green), (ii) reads with a higher score against the Insecta library (shaded in blue), (iii) reads with a higher score against the Non-Insecta library (shaded in yellow), and (iv) reads with no hits in the Insecta library (shaded in red). We applied a conservative threshold at 0.8, indicated by the black horizontal line, where only reads above this threshold are used in the assembly with SPAdes. For this given specimen, 175,671 reads (96.3% of total reads) passed the ≥0.8 cut-off, 325 reads (0.18% of total reads) had ratios of scores <0.8, while 6,423 reads (3.52%) did not have hits against the Insecta library.

rRNA reads filtering and sequence assembly

Assembling Illumina reads to reconstruct rRNA sequences from total mosquito RNA is not a straightforward task. Apart from host rRNA, total RNA samples also contain rRNA from other organisms associated with the host (microbiota, external parasites, or ingested diet). As rRNA sequences share high homology in conserved regions, Illumina reads (150 bp) from non-host rRNA can interfere with the contig assembly of host 28S and 18S rRNA.

Our score-based filtration strategy, described in detail in the Materials and methods section, allowed us to bioinformatically remove interfering rRNA reads and achieve successful de novo assembly of 28S and 18S rRNA sequences for all our specimens. Briefly, for each Illumina read, we computed a ratio of BLAST scores against an Insecta library over scores against a Non-Insecta library (Figure 2A). Based on their ratio of scores, reads could be segregated into four categories (Figure 2B): (i) reads mapping only to the Insecta library, (ii) reads mapping better to the Insecta relative to Non-Insecta library, (iii) reads mapping better to the Non-Insecta relative to the Insecta library, and (iv) reads mapping only to the Non-Insecta library. By applying a conservative threshold at 0.8 to account for the non-exhaustiveness of the SILVA database, we removed reads that likely do not originate from mosquito rRNA. Notably, 15 of our specimens were engorged with vertebrate blood, a rich source of non-mosquito rRNA (Appendix 1—table 1). The successful assembly of complete 28S and 18S rRNA sequences for these specimens demonstrates that this strategy performs as expected even with high amounts of non-host rRNA reads. This is particularly important in studies on field-captured mosquitoes as females are often sampled already having imbibed a blood meal or captured using the human landing catch technique.

We encountered challenges for three specimens morphologically identified as Mansonia africana (Specimen ID S33–S35) (Appendix 1—table 1). COI amplification by PCR did not produce any product, hence COI sequencing could not be used to confirm species identity. In addition, the genome assembler SPAdes (Bankevich et al., 2012) was only able to assemble partial length rRNA contigs, despite the high number of reads with high scores against the Insecta library. Among other Mansonia specimens, these partial length contigs shared the highest similarity with contigs obtained from sample ‘Ma uniformis CF S51’. We then performed a guided assembly using the 28S and 18S sequences of this specimen as references, which successfully produced full-length contigs. In two of these specimens (Specimen ID S34 and S35), our assembly initially produced two sets of 28S and 18S rRNA sequences, one of which was similar to mosquito rRNA with low coverage and another with 10-fold higher coverage and 95% nucleotide sequence similarity to a water mite of genus Horreolanus known to parasitize mosquitoes. Our success in obtaining rRNA sequences for mosquito and water mite shows that our strategy can be applied to metabarcoding studies where the input material comprises multiple insect species, provided that appropriate reference sequences of the target species or of a close relative are available.

Altogether, we were able to assemble 122 28S and 114 18S full-length rRNA sequences for 33 mosquito species representing 10 genera sampled from four countries across three continents. This dataset contains, to our knowledge, the first records for 30 mosquito species and for seven genera: Coquillettidia, Mansonia, Limatus, Mimomyia, Uranotaenia, Psorophora, and Eretmapodites. Individual GenBank accession numbers for these sequences and specimen information are listed in Appendix 1—table 1.

Comparative phylogeny of novel rRNA sequences relative to existing records

To verify the assembly accuracy of our rRNA sequences, we constructed a comprehensive phylogenetic tree from the full-length 28S rRNA sequences generated from our study and included relevant rRNA sequences publicly available from GenBank (Figure 3). We applied a search criterion for GenBank sequences with at least 95% coverage of our sequence lengths (~4000 bp), aiming to represent as many species or genera as possible. Although we rarely found records for the same species included in our study, the resulting tree showed that our 28S sequences generally clustered according to their respective species and subgenera, supported by moderate to good bootstrap support at terminal nodes. Species taxa generally formed monophyletic clades, with the exception of An. gambiae and Cx. quinquefasciatus. An. gambiae 28S rRNA sequences formed a clade with closely related sequences from Anopheles arabiensis, Anopheles merus, and Anopheles coluzzii, suggesting unusually high interspecies homology for Anophelines or other members of subgenus Cellia (Figure 3, in purple, subgenus Cellia). Meanwhile, Cx. quinquefasciatus 28S rRNA sequences formed a taxon paraphyletic to sister species Culex pipiens (Figure 3, in coral, subgenus Culex).

Figure 3 with 2 supplements see all
28S sequences generated from this study clustered with conspecifics or congenerics from existing GenBank records.

A rooted phylogenetic tree based on full-length 28S sequences (3,900 bp) from this study and from GenBank was inferred using the maximum-likelihood method and constructed to scale in MEGA X (Kumar et al., 2018) using an unknown Horreolanus species found among our samples as an outgroup. Values at each node indicate bootstrap support (%) from 500 replications. Sequences from GenBank are annotated with filled circles and their accession numbers are shown. For sequences from this study, each specimen label contains information on taxonomy, origin (in two-letter country codes), and specimen ID number. Some specimens produced up to two consensus 28S sequences; this is indicated by the numbers 1 or 2 at the beginning of the specimen label. Specimen genera are indicated by colour: Culex in coral, Anopheles in purple, Aedes in dark blue, Mansonia in dark green, Culiseta in maroon, Limatus in light green, Coquillettidia in light blue, Psorophora in yellow, Mimomyia in teal, Uranotaenia in pink, and Eretmapodites in brown. Scale bar at 0.05 is shown.

Figure 3—source data 1

Multiple sequence alignment of 169 28S rRNA sequences from this study and from GenBank (FASTA).

https://cdn.elifesciences.org/articles/82762/elife-82762-fig3-data1-v2.zip

28S rRNA sequence-based phylogenetic reconstructions (Figure 3, with GenBank sequences; Figure 4—figure supplement 1, this study only) showed marked incongruence to that of 18S rRNA sequences (Figure 4—figure supplement 2). Although all rRNA trees show the bifurcation of family Culicidae into subfamilies Anophelinae (genus Anopheles, in purple) and Culicinae (all other genera), the recovered intergeneric phylogenetic relationships vary between the 28S and 18S rRNA trees and are weakly supported. The 18S rRNA tree also exhibited several taxonomic anomalies: (i) the lack of definitive clustering by species within the Culex subgenus (in coral); (ii) the lack of distinction between 18S rRNA sequences of Cx. pseudovishnui and Cx. tritaeniorhynchus (in coral); (iii) the placement of Ma sp.3 CF S35 (in dark green) within a Culex clade; and (iv) the lack of a monophyletic Mimomyia clade (in teal) (Figure 4—figure supplement 2). However, 28S and 18S rRNA sequences are encoded by linked loci in rDNA clusters and should not be analysed separately.

Indeed, when concatenated 28S+18S rRNA sequences were generated from the same specimens (Figure 4), the phylogenetic tree resulting from these sequences more closely resembles the 28S tree (Figure 3) with regard to the basal position of the Mimomyia clade (in teal) within the Culicinae subfamily with good bootstrap support in either tree (84% in 28S rRNA tree, 100% in concatenated 28S+18S rRNA tree). For internal nodes, bootstrap support values were higher in the concatenated tree compared to the 28S tree. Interestingly, the 28S+18S rRNA tree formed an Aedini tribe-clade encompassing taxa from genera Psorophora (in yellow), Aedes (in dark blue), and Eretmapodites (in brown), possibly driven by the inclusion of 18S rRNA sequences. Concatenation also resolved the anomalies found in the 18S rRNA tree and added clarity to the close relationship between Culex (in coral) and Mansonia (in dark green) taxa. Of note, relative to the 28S tree (Figure 3) the Culex and Mansonia genera are no longer monophyletic in the concatenated 28S+18S rRNA tree (Figure 4). Genus Culex is paraphyletic with respect to subgenus Mansonoides of genus Mansonia (Figure 3). Ma. titillans and Ma sp.4, which we suspect to be Mansonia pseudotitillans, always formed a distinct branch in 28S or 18S rRNA phylogenies, thus possibly representing a clade of subgenus Mansonia.

Figure 4 with 2 supplements see all
Concatenating 28S and 18S rRNA sequences produces phylogenetic relationships that are concordant with classical Culicidae systematics with higher bootstrap support than 28S sequences alone.

This phylogenetic tree based on concatenated 28S+18S rRNA sequences (3,900+1,900 bp) generated from this study was inferred using the maximum-likelihood method and constructed to scale using MEGA X (Kumar et al., 2018) using an unknown Horreolanus species found among our samples as an outgroup. Values at each node indicate bootstrap support (%) from 500 replications. Each specimen label contains information on taxonomy, origin (as indicated in two-letter country codes), and specimen ID number. Some specimens produced up to two consensus 28S+18S rRNA sequences; this is indicated by the numbers 1 or 2 at the beginning of the specimen label. Specimen genera are indicated by colour: Culex in coral, Anopheles in purple, Aedes in dark blue, Mansonia in dark green, Limatus in light green, Coquillettidia in light blue, Psorophora in yellow, Mimomyia in teal, Uranotaenia in pink, and Eretmapodites in brown. Scale bar at 0.05 is shown.

Figure 4—source data 1

Multiple sequence alignment of 122 28S rRNA sequences, including two sequences from Horreolanus sp. (FASTA).

https://cdn.elifesciences.org/articles/82762/elife-82762-fig4-data1-v2.zip
Figure 4—source data 2

Multiple sequence alignment of 114 18S rRNA sequences, including two sequences from Horreolanus sp. (FASTA).

https://cdn.elifesciences.org/articles/82762/elife-82762-fig4-data2-v2.zip

The concatenated 28S+18S rRNA tree (Figure 4) recapitulates what is classically known about the systematics of our specimens, namely (i) the early divergence of subfamily Anophelinae from subfamily Culicinae, (ii) the division of genus Anopheles (in purple) into two subgenera, Anopheles and Cellia, (iii) the division of genus Aedes (in dark blue) into subgenera Stegomyia and Ochlerotatus, (iv) the divergence of the monophyletic subgenus Melanoconion within the Culex genus (in coral) (Harbach, 2007; Harbach and Kitching, 2016).

rRNA as a molecular marker for taxonomy and phylogeny

We sequenced a 621 bp region of the COI gene to confirm morphological species identification of our specimens and to compare the functionality of rRNA and COI sequences as molecular markers for taxonomic and phylogenetic investigations. COI sequences were able to unequivocally determine the species identity in most specimens except for the following cases. An. coustani COI sequences from our study, regardless of specimen origin, shared remarkably high nucleotide similarity (>98%) with several other Anopheles species such as An. rhodesiensis, An. rufipes, An. ziemanni, An. tenebrosus, although An. coustani remained the most frequent and closest match. In the case of Ae. simpsoni, three specimens had been morphologically identified as Ae. opok although their COI sequences showed 97–100% similarity to that of Ae. simpsoni. As GenBank held no records of Ae. opok COI at the time of this study, we instead aligned the putative Ae. simpsoni COI sequences against two sister species of Ae. opok: Ae. luteocephalus and Ae. africanus. We found they shared only 90% and 89% similarity, respectively. Given this significant divergence, we concluded these specimens to be Ae. simpsoni. Ambiguous results were especially frequent among Culex specimens belonging to the Cx. pipiens or Cx. vishnui subgroups, where the query sequence differed with either of the top two hits by a single nucleotide. For example, between Cx. quinquefasciatus and Cx. pipiens of the Cx. pipiens subgroup, and between Cx. vishnui and Cx. tritaeniorhynchus of the Cx. vishnui subgroup.

Among our three specimens of Ma. titillans, two appeared to belong to a single species that is different from but closely related to Ma. titillans. We surmised that these specimens could instead be Ma. pseudotitillans based on morphological similarity but were not able to verify this by molecular means as no COI reference sequence is available for this species. These specimens are hence putatively labelled as ‘Ma sp.4’.

Phylogenetic reconstruction based on the COI sequences showed clustering of all species taxa into distinct clades, underlining the utility of the COI gene in molecular taxonomy (Figure 5; Hebert et al., 2003; Ratnasingham and Hebert, 2007). However, species delineation among members of Culex subgroups were not as clear-cut, although sister species were correctly placed as sister taxa (Figure 5, in coral). This is comparable to the 28S+18S rRNA tree (Figure 4, in coral) and is indicative of lower intraspecies distances relative to interspecies distances.

Cytochrome c oxidase I (COI) sequences cluster by species but show phylogenetic relationships that contrast those derived from rRNA trees.

A phylogenetic tree based on COI sequences (621–699 bp) was inferred using the maximum-likelihood method and constructed to scale using MEGA X (Kumar et al., 2018) with three water mite species to serve as outgroups. Outgroup sequences obtained from GenBank are annotated with filled circles and their accession numbers are shown. Values at each node indicate bootstrap support (%) from 500 replications. Each specimen label contains information on taxonomy, origin (as indicated in two-letter country codes), and specimen ID. Specimen genera are indicted by colour: Culex in coral, Anopheles in purple, Aedes in dark blue, Mansonia in dark green, Limatus in light green, Coquillettidia in light blue, Psorophora in yellow, Mimomyia in teal, Uranotaenia in pink, and Eretmapodites in brown. Scale bar at 0.05 is shown.

Figure 5—source data 1

Multiple sequence alignment of 106 cytochrome c oxidase I (COI) sequences (FASTA).

https://cdn.elifesciences.org/articles/82762/elife-82762-fig5-data1-v2.zip

To evaluate the utility of 28S and 18S rRNA sequences for molecular taxonomy, we used the 28S+18S rRNA tree to discern the identity of six specimens for which COI sequencing could not be performed. These specimens include three unknown Mansonia species (Specimen ID S33–S35), a Ma. uniformis (Specimen ID S51), an An. gambiae (Specimen ID S47), and a Ur. geometrica (Specimen ID S113) (Appendix 1—table 1). Their positions in the 28S+18S rRNA tree relative to adjacent taxa confirms the morphological identification of all six specimens to the genus level and, for three of them, to the species level (Figure 4; Mansonia in dark green, Anopheles in purple, Uranotaenia in pink).

The phylogenetic relationships indicated by the COI tree compared to the 28S+18S rRNA tree present only few points of similarity, with key differences summarised in Table 2. COI-based phylogenetic inference indeed showed clustering of generic taxa into monophyletic clades albeit with very weak bootstrap support, except for genera Culex and Mansonia (Figure 5; Culex in coral, Mansonia in dark green). Contrary to the 28S+18S rRNA tree (Figure 4), Culex subgenus Melanoconion was depicted as a polyphyletic taxon with Cx. spissipes being a part of the greater Culicini clade with members from subgenera Oculeomyia and Culex while Cx. pedroi and Cx. portesi formed a distantly related clade. Among the Mansonia specimens, the two unknown Ma sp.4 specimens were not positioned as the nearest neighbours of Ma. titillans and instead appeared to have diverged earlier from most of the other taxa from the Culicidae family. Notably, the COI sequences of genus Anopheles (Figure 5, in purple) is not basal to the other members of Culicidae and is instead shown to be sister to Culex COI sequences (8% bootstrap support). This is a direct contrast to what is suggested by the rRNA phylogenies (Figures 3 and 4, Figure 4—figure supplements 1 and 2; Anopheles in purple), which suggests Culex (in coral) rRNA sequences to be among the most recently diverged. Bootstrap support for the more internal nodes of the COI trees were remarkably low compared to those of rRNA-based trees.

Table 2
Summary of differences between rRNA and cytochrome c oxidase I (COI) phylogenies.
Taxa28S+18S rRNA phylogeny (Figure 4)COI phylogeny (Figure 5)
The Anopheles genusForms a clade that is basal to the all other members of family Culicidae; interspecies branch lengths are notably longForms a sister clade to the Culex genus, and is depicted to have diverged more recently; interspecies branch lengths are comparable to that of other genera
The Ur. geometrica speciesForms a clade within the Culicinae subfamily lineageForms a clade that is basal to the all other members of family Culicidae
The Aedini tribeForms a monophyletic clade comprising the genera Aedes, Eretmapodites, and Psorophora, with the latter being an early divergent lineageDoes not form a monophyletic clade; the Psorophora clade is placed among Aedes taxa and the Eretmapodites clade is sister to a Culex subgenus Melanoconion clade
The Culex genusSplits into two monophyletic clades with the three French Guyanese species forming a closely related minor cladeSplits into two clades with two out of three French Guyanese species (Cx. pedroi and Cx. portesi) forming a distantly related minor clade, while the third (Cx. spissipes) is a part of the greater clade
The Mansonia genusIs a polyphyletic group comprising two clades with the two French Guyanese taxa forming a distantly related minor clade; the major clade is placed among Culex taxaForms a subgenus Mansonoides clade as per the 28S+18S rRNA tree but the French Guyanese taxa do not cluster together; is depicted to have diverged earlier relative to other taxa in the assemblage
The Ma sp.4 speciesForms a sister clade to Ma. titillans as part of a minor French Guyanese Mansonia cladeDoes not form a sister clade to Ma. titillans; instead is shown to have diverged earlier than all other members of family Culicidae after Ur. geometrica

In all rRNA trees, it is clear that the interspecific and intersubgeneric evolutionary distances within the genus Anopheles are high relative to any other genera, indicating a greater degree of divergence (Figure 3, Figure 3—figure supplement 1, Figure 4, Figure 4—figure supplements 1 and 2; Anopheles in purple). This is evidenced by the longer branch lengths connecting Anopheline species-clades to the node of the most recent common ancestor for subgenera Anopheles and Cellia. This feature is not evident in the COI tree, where the Anopheline interspecies distances are comparable to those within the Culex, Aedes, and Mansonia taxa (Figure 5; Anopheles in purple, Culex in coral, Aedes in dark blue, Mansonia in dark green).

On Culex subgroups

Culex (subgenus Culex) specimens of this study comprise several closely related sister species belonging to the Cx. vishnui and Culex univittatus subgroups, which are notoriously difficult to differentiate based on morphology. Accordingly, in the 28S+18S rRNA (Figure 4, in coral) and COI (Figure 5, in coral) trees these species and their known sister species were clustered together within the Culex (subgenus Culex) clade: Cx. tritaeniorhynchus with Cx. pseudovishnui (Cx. vishnui subgroup); Cx. perexiguus with Cx. neavei (Cx. univittatus subgroup).

The use of the COI sequence to distinguish between members of the Culex subgroups was limited. For example, for the two Cx. quinquefasciatus samples in our taxonomic assemblage (Specimen ID S74 and S75) (Appendix 1—table 1), BLAST analyses of their COI sequences revealed they are a single nucleotide apart from Cx. pipiens or Cx. quinquefasciatus COI sequences (Appendix 2—table 1). In the 28S rRNA tree with GenBank sequences (Figure 3), two Cx. pipiens GenBank sequences formed a clade sister to another containing three Cx. quinquefasciatus GenBank sequences and the ‘Cx quinquefasciatus MG S74’ sequence with 78% bootstrap support. This is in accordance with other studies examining mitochondrial sequences (Sun et al., 2019) and morphological attributes (Harbach et al., 2017). This shows that the 28S rRNA sequence can distinguish the two species and confirms that ‘Cx quinquefasciatus MG S74’ is indeed a Cx. quinquefasciatus specimen. However, ‘Cx quinquefasciatus MG S75’ is shown to be basal from other sequences within this Cx. pipiens subgroup-clade with 100% bootstrap support. Given that Cx. quinquefasciatus and Cx. pipiens are known to interbreed, it is plausible that this individual is a hybrid of the two species (Farajollahi et al., 2011).

Discussion

RNA-seq metagenomics on field-captured sylvatic mosquitoes is a valuable tool for tracking mosquito viruses through surveillance and virus discovery. However, the lack of reference rRNA sequences hinders good oligo-based depletion and efficient clean-up of RNA-seq data. Additionally, de novo assembly of rRNA sequences is complicated due to regions that are highly conserved across all distantly related organisms that could be present in a single specimen, that is, microbiota, parasites, or vertebrate blood meal. Hence, we established a method to bioinformatically filter out non-host rRNA reads for the accurate assembly of novel 28S and 18S rRNA reference sequences.

We found that phylogenetic reconstructions based on 28S sequences or concatenated 28S+18S rRNA sequences were able to correctly cluster mosquito taxa according to species and corroborate current mosquito classification. This demonstrates that our bioinformatics methodology reliably generates bona fide 28S and 18S rRNA sequences, even in specimens parasitized by water mites or engorged with vertebrate blood. Further, we were able to use 28S+18S rRNA sequence taxonomy for molecular species identification when COI sequences were unavailable or ambiguous, thus supporting the use of rRNA sequences as a molecular marker. In RNA-seq metagenomics applications, they have the advantage of circumventing the need to additionally isolate and sequence DNA from specimens, as RNA-seq reads can be directly mapped against reference sequences. In our hands, there are sufficient numbers of remaining reads post-depletion (5–10% of reads per sample) to assemble complete rRNA contigs (unpublished data).

Phylogenetic inferences based on 28S or 18S rRNA sequences alone do not recover the same interspecific relationships (Figure 4—figure supplements 1 and 2). Relative to 28S sequences, we observed more instances where multiple specimens have near-identical 18S rRNA sequences. This can occur for specimens belonging to the same species, but also for conspecifics sampled from different geographic locations, such as An. coustani, An. gambiae, or Ae. albopictus. More rarely, specimens from the same species subgroup, such as Cx. pseudovishnui and Cx. tritaeniorhynchus, also shared 18S rRNA sequences. This was surprising given that the 18S rRNA sequences in our dataset is 1,900 bp long. Concatenation of 28S and 18S rRNA sequences resolved this issue, enabling species delineation even among sister species of Culex subgroups, where morphological identification meets its limits.

In Cambodia and other parts of Asia, the Cx. vishnui subgroup includes Cx. tritaeniorhynchus, Cx. vishnui, and Cx. pseudovishnui, which are important vectors of JEV (Maquart and Boyer, 2022). The former two were morphologically identified in our study but later revealed by COI sequencing to be a sister species. Discerning sister species of the Cx. pipiens subgroup is further complicated by interspecific breeding, with some populations showing genetic introgression to varying extents (Cornel et al., 2003). The seven sister species of this subgroup are practically indistinguishable based on morphology and require molecular methods to discern (Farajollahi et al., 2011; Zittra et al., 2016). Indeed, the 621 bp COI sequence amplified in our study did not contain enough nucleotide divergence to allow clear identification, given that the COI sequence of Cx. quinquefasciatus specimens differed from that of Cx. pipiens by a single nucleotide. Batovska et al., 2017, found that even the Internal Transcribed Spacer 2 (ITS2) rDNA region, another common molecular marker, could not differentiate the two species. Other DNA molecular markers such as nuclear Ace-2 or CQ11 genes (Aspen and Savage, 2003; Zittra et al., 2016) or Wolbachia pipientis infection status (Cornel et al., 2003) are typically employed in tandem. In our study, 28S rRNA sequence-based phylogeny validated the identity of specimen ‘Cx quinquefasciatus MG S74’ (Figure 3, in coral) and suggested that specimen ‘Cx quinquefasciatus MG S75’ might have been a pipiens-quinquefasciatus hybrid. These examples demonstrate how 28S rRNA sequences, concatenated with 18S rRNA sequences or alone, contain enough resolution to differentiate between Cx. pipiens and Cx. quinquefasciatus. rRNA-based phylogeny thus allows for more accurate species identification and ecological observations in the context of disease transmission. Additionally, tracing the genetic flow across hybrid populations within the Cx. pipiens subgroup can inform estimates of vectorial capacity for each species. As only one or two members from the Cx. pipiens and Cx. vishnui subgroups were represented in our taxonomic assemblage, an explicit investigation including all member species of these subgroups in greater sample numbers is warranted to further test the degree of accuracy with which 28S and 18S rRNA sequences can delineate sister species.

Our study included French Guianese Culex species Cx. spissipes (group Spissipes), Cx. pedroi (group Pedroi), and Cx. portesi (group Vomerifer). These species belong to the New World subgenus Melanoconion, section Spissipes, with well-documented distribution in North and South Americas (Sirivanakarn, 1982) and are vectors of encephalitic alphaviruses EEEV and VEEV among others (Talaga et al., 2021; Turell et al., 2008; Weaver et al., 2004). Indeed, our rooted rRNA and COI trees showed the divergence of the three Melanoconion species from the major Culex clade comprising species broadly found across Africa and Asia (Auerswald et al., 2021; Farajollahi et al., 2011; Nchoutpouen et al., 2019; Takhampunya et al., 2011). The topology of the concatenated 28S+18S rRNA tree places the Cx. portesi and Cx. pedroi species-clades as sister groups (92% bootstrap support), with Cx. spissipes as a basal group within the Melanoconion clade (100% bootstrap support) (Figure 4, in coral). This corroborates the systematics elucidated by Navarro and Weaver, 2004, using the ITS2 marker, and those by Sirivanakarn, 1982 and Sallum and Forattini, 1996 based on morphology. Curiously, in the COI tree, Cx. spissipes sequences were clustered with unknown species Cx. sp.1, forming a clade sister to another containing other Culex (Culex) and Culex (Oculeomyia) species, albeit with very low bootstrap support (Figure 5, in coral). Previous phylogenetic studies based on the COI gene have consistently placed Cx. spissipes or the Spissipes group basal to other groups within the Melanoconion subgenus (Torres-Gutierrez et al., 2016; Torres-Gutierrez et al., 2018). However, these studies contain only Culex (Melanoconion) species in their assemblage, apart from Cx. quinquefasciatus to act as an outgroup. This clustering of Cx. spissipes with non-Melanoconion species in our COI phylogeny could be an artefact of a much more diversified assemblage rather than a true phylogenetic link.

Taking advantage of our multi-country sampling, we examined whether rRNA or COI phylogeny can be used to distinguish conspecifics originating from different geographies. Our assemblage contains five of such species: An. coustani, An. funestus, An. gambiae, Ae. albopictus, and Ma. uniformis. Among the rRNA trees, the concatenated 28S+18S and 28S rRNA trees were able to discriminate between Ma. uniformis specimens from Madagascar, Cambodia, and the Central African Republic (in dark green), and between An. coustani specimens from Madagascar and the Central African Republic (in purple) (100% bootstrap support). In the COI tree, only Ma. uniformis was resolved into geographical clades comprising specimens from Madagascar and specimens from Cambodia (in dark green) (72% bootstrap support). No COI sequence was obtained from one Ma. uniformis specimen from the Central African Republic. The 28S+18S rRNA sequences ostensibly provided more population-level genetic information than COI sequences alone with better support. The use of rRNA sequences in investigating the biodiversity of mosquitoes should therefore be explored with a more comprehensive taxonomic assemblage.

The phylogenetic reconstructions based on rRNA or COI sequences in our study are hardly congruent (Table 2), but two principal differences stand out. First, the COI phylogeny does not recapitulate the early divergence of Anophelinae from Culicinae (Figure 5). This is at odds with other studies estimating mosquito divergence times based on mitochondrial genes (Logue et al., 2013; Lorenz et al., 2021) or nuclear genes (Reidenbach et al., 2009). The second notable feature in the rRNA trees is the remarkably large interspecies and intersubgeneric evolutionary distances within genus Anopheles relative to other genera in the Culicinae subfamily (Figure 3, Figure 3—figure supplement 1, Figure 4, Figure 4—figure supplements 1 and 2; Anopheles in purple) but this is not apparent in the COI tree. The hyperdiversity among Anopheles taxa may be attributed to the earlier diversification of the Anophelinae subfamily in the early Cretaceous period compared to that of the Culicinae subfamily—a difference of at least 40 million years (Lorenz et al., 2021). The differences in rRNA and COI tree topologies indicate a limitation in using COI alone to determine evolutionary relationships. Importantly, drawing phylogenetic conclusions from short DNA markers such as COI has been cautioned against due to its weak phylogenetic signal (Hajibabaei et al., 2006). The relatively short length of our COI sequences (621–699 bp) combined with the 100-fold higher nuclear substitution rate of mitochondrial genomes relative to nuclear genomes (Arctander, 1995) could result in homoplasy (Danforth et al., 2005), making it difficult to clearly discern ancestral sequences and correctly assign branches into lineages, as evidenced by the poor nodal bootstrap support at genus-level branches. Indeed, in the study by Lorenz et al., 2021, a phylogenetic tree constructed using a concatenation of all 13 protein-coding genes of the mitochondrial genome was able to resolve ancient divergence events. This affirms that while COI sequences can be used to reveal recent speciation events, longer or multi-gene molecular markers are necessary for studies into deeper evolutionary relationships (Danforth et al., 2005).

In contrast to Anophelines where 28S rRNA phylogenies illustrated higher interspecies divergence compared to COI phylogeny, two specimens of an unknown Mansonia species, ‘Ma sp.4 GF S103’ and ‘Ma sp.4 GF S104’, provided an example where interspecies relatedness based on their COI sequences is greater than that based on their rRNA sequences in relation to ‘Ma titillans GF S105’. While all rRNA trees placed ‘Ma titillans GF S105’ as a sister taxon with 100% bootstrap support, the COI tree placed M sp.4 basal to all other species except Ur. geometrica (Figure 5; Mansonia in dark green, Uranotaenia in pink). This may hint at a historical selective sweep in the mitochondrial genome, whether arising from geographical separation, mutations, or linkage disequilibrium with inherited symbionts (Hurst and Jiggins, 2005), resulting in the disparate mitochondrial haplogroups found in French Guyanese Ma sp.4 and Ma. titillans. In addition, both haplogroups are distant from those associated with members of subgenus Mansonoides. To note, the COI sequences of ‘M sp.4 GF S103’ and ‘M sp.4 GF S104’ share 87.12% and 87.39% nucleotide similarity, respectively, to that of ‘Ma titillans GF S105’. Interestingly, the endosymbiont Wo. pipientis has been detected in Ma. titillans sampled from Brazil (de Oliveira et al., 2015), which may contribute to the divergence of ‘Ma titillans GF S105’ COI sequence away from those of Ma sp.4. This highlights other caveats of using a mitochondrial DNA marker in determining evolutionary relationships (Hurst and Jiggins, 2005), which nuclear markers such as 28S and 18S rRNA sequences may be immune to.

Conclusions

Total RNA-seq is a valuable tool for surveillance and virus discovery in sylvatic mosquitoes but it is impeded by the lack of full-length rRNA reference sequences. Here, we presented an rRNA sequence assembly strategy and a dataset of 234 newly generated mosquito 28S and 18S rRNA sequences. Our work has expanded the current mosquito rRNA reference library by providing, to our knowledge, the first full-length rRNA records for 30 species in public databases and paves the way for the assembly of many more. These novel rRNA sequences can improve mosquito metagenomics based on RNA-seq by enabling physical and computational removal of rRNA from specimens and streamlined species identification using rRNA markers.

Given that a reference sequence is available, rRNA markers could serve as a better approach for mosquito taxonomy and phylogeny than COI markers. In analysing the same set of specimens based on their COI and rRNA sequences, we showed that rRNA sequences can discriminate between members of a species subgroup as well as conspecifics from different geographies. Phylogenetic inferences from a tree based on 28S rRNA sequences alone or on concatenated 28S+18S rRNA sequences are more aligned with contemporary mosquito systematics, showing evolutionary relationships that agree with other phylogenetic studies. While COI-based phylogeny can reveal recent speciation events, rRNA sequences may be better suited for investigations of deeper evolutionary relationships as they are less prone to selective sweeps and homoplasy. The advantages and disadvantages of rRNA and COI sequences as molecular markers are summarised in Table 3. Further studies are necessary to reveal how rRNA sequences compare against other nuclear or mitochondrial DNA marker systems (Batovska et al., 2017; Beebe, 2018; Behura, 2006; Ratnasingham and Hebert, 2007; Reidenbach et al., 2009; Vezenegho et al., 2022).

Table 3
Comparison of 28S or concatenated 28S+18S rRNA and cytochrome c oxidase I (COI) sequences as molecular markers.
28S+18S rRNA
AdvantagesDisadvantages
  • In RNA-seq metagenomics studies, molecular taxonomy of specimens based on rRNA sequences can be done from RNA-seq data without additional sample preparation or sequencing.

  • 28S rRNA and concatenated 28S+18S rRNA sequences can resolve the identity of specimens where COI sequences were ambiguous, particularly between members of species subgroups.

  • 28S rRNA and concatenated 28S+18S rRNA sequences can distinguish conspecifics from different geographies for certain species.

  • Phylogenetic inferences based on 28S rRNA and concatenated 28S+18S rRNA sequences show relationships that are more concordant to contemporary mosquito systematics elucidated by other studies and may be a more suitable marker to study deep evolutionary relationships.

  • Being longer and nuclear-encoded, 28S or concatenated 28S+18S rRNA sequences are immune to homoplasy or to selective sweeps that may affect genomes of inherited symbionts such as mitochondria.

  • RNA-seq costs more than Sanger sequencing.

  • Reference rRNA sequences are currently much more limited in breadth compared to other established molecular markers.

COI
AdvantagesDisadvantages
  • With a larger reference database, the COI is a versatile marker for molecular taxonomy.

  • Being a shorter DNA marker, the COI gene is cost- and time-effective to amplify, sequence, and characterise.

  • Universal primer sets to amplify the COI marker have been developed and tested for many diverse species.

  • All species taxa clustered into distinct clades but with weaker bootstrap support at internal nodes relative to those of the 28S+18S rRNA tree.

  • For An. coustani, and members of Culex species subgroups such as Cx. quinquefasciatus and Cx. tritaeniorhynchus, COI sequences are unable to unequivocally confirm species identity as species can differ by just one nucleotide. Other molecular markers are often used in tandem.

Materials and methods

Sample collection

Request a detailed protocol

Mosquito specimens were sampled from 2019 to 2020 by medical entomology teams from the Institut Pasteur de Bangui (Central African Republic, Africa; CF), Institut Pasteur de Madagascar (Madagascar, Africa; MG), Institut Pasteur du Cambodge (Cambodia, Asia; KH), and Institut Pasteur de la Guyane (French Guiana, South America; GF). Adult mosquitoes were sampled using several techniques including CDC light traps, BG sentinels, and human-landing catches. Sampling sites are sylvatic locations including rural settlements in the Central African Republic, Madagascar, and French Guiana and national parks in Cambodia. Mosquitoes were morphologically identified using taxonomic identification keys (Edwards, 1941; Grjebine, 1966; Huang and Ward, 1981; Oo et al., 2006; Rattanarithikul et al., 2007; Rattanarithikul et al., 2010; Rattanarithikul et al., 2005a; Rattanarithikul et al., 2005b; Rattanarithikul et al., 2006a; Rattanarithikul et al., 2006b; Rueda, 2004) on cold tables before preservation by flash freezing in liquid nitrogen and transportation in dry ice to Institut Pasteur Paris for analysis. A list of the 112 mosquito specimens included in our taxonomic assemblage and their related information are provided in Appendix 1—table 1. To note, specimen ID S53, S80, and S81 were removed from our assemblage as their species identity could not be determined by COI or rRNA sequences.

RNA and DNA isolation

Request a detailed protocol

Nucleic acids were isolated from mosquito specimens using TRIzol reagent according to the manufacturer’s protocol (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA). Single mosquitoes were homogenised into 200 µL of TRIzol reagent and other of the reagents within the protocol were volume-adjusted accordingly. Following phase separation, RNA were isolated from the aqueous phase while DNA were isolated from the remaining interphase and phenol-chloroform phase. From here, RNA is used to prepare cDNA libraries for next-generation sequencing while DNA is used in PCR amplification and Sanger sequencing of the mitochondrial COI gene as further described below.

Probe depletion of rRNA

Request a detailed protocol

We tested a selective rRNA depletion protocol by Morlan et al., 2012 on several mosquito species from the Aedes, Culex, and Anopheles genera. We designed 77 tiled 80 bp DNA probes antisense to the Ae. aegypti 28S, 18S, and 5.8S rRNA sequences. A pool of probes at a concentration of 0.04 µM were prepared. To bind probes to rRNA, 1 µL of probes and 2 µL of Hybridisation Buffer (100 mM Tris-HCl and 200 mM NaCl) were added to rRNA samples to a final volume of 20 µL and subjected to a slow-cool incubation starting at 95°C for 2 min, then cooling to 22°C at a rate of 0.1°C per second, ending with an additional 5 min at 22°C. The resulting RNA:DNA hybrids were treated with 2.5 µL Hybridase Thermostable RNase H (Epicentre, Illumina, Madison, WI, USA) and incubated at 37°C for 30 min. To remove DNA probes, the mix was treated with 1 µL DNase I (Invitrogen) and purified with Agencourt RNAClean XP Beads (Beckman Coulter, Brea, CA, USA). The resulting RNA is used for total RNA-seq to check depletion efficiency.

Total RNA-seq

Request a detailed protocol

To obtain rRNA sequences, RNA samples were quantified on a Qubit Fluorometer (Invitrogen) using the Qubit RNA BR Assay kit (Invitrogen) for concentration adjustment. Non-depleted total RNA was used for library preparation for next-generation sequencing using the NEBNext Ultra II RNA Library Preparation Kit for Illumina (New England Biolabs, Ipswich, MA, USA) and the NEBNext Multiplex Oligos for Illumina (Dual Index Primers Set 1) (New England Biolabs). Sequencing was performed on a NextSeq500 sequencing system (Illumina, San Diego, CA, USA). Quality control of fastq data and trimming of adapters were performed with FastQC and cutadapt, respectively.

28S and 18S rRNA assembly

Request a detailed protocol

To obtain 28S and 18S rRNA contigs, we had to first clean our fastq library by separating the reads representing mosquito rRNA from all other reads. To achieve this, we used the SILVA RNA sequence database to create two libraries: one containing all rRNA sequences recorded under the ‘Insecta’ node of the taxonomic tree, the other containing the rRNA sequences of many other nodes distributed throughout the taxonomic tree, hence named ‘Non-Insecta’ (Quast et al., 2013). Each read was aligned using the nucleotide Basic Local Alignment Search Tool (BLASTn, https://blast.ncbi.nlm.nih.gov/) of the National Center for Biotechnology Information (NCBI) against each of the two libraries and the scores of the best high-scoring segment pairs from the two BLASTns are subsequently used to calculate a ratio of Insecta over Non-Insecta scores (Altschul et al., 1990). Only reads with a ratio greater than 0.8 were used in the assembly. The two libraries being non-exhaustive, we chose this threshold of 0.8 to eliminate only reads that were clearly of a non-insect origin. Selected reads were assembled with the SPAdes genome assembler using the ‘-rna’ option, allowing more heterogeneous coverage of contigs and kmer lengths of 31, 51, and 71 bases (Bankevich et al., 2012). This method successfully assembled rRNA sequences for all specimens, including a parasitic Horreolanus water mite (122 sequences for 28S and 114 sequences for 18S).

Initially, our filtration technique had two weaknesses. First, there is a relatively small number of complete rRNA sequences in the Insecta library from SILVA. To compensate for this, we carried out several filtration cycles, each time adding in the complete sequences produced in previous cycles to the Insecta library. Second, when our mosquito specimens were parasitized by other insects, it was not possible to bioinformatically filter out rRNA reads belonging to the parasite. For these rare cases, we used the ‘ --trusted-contigs’ option of the SPAdes assembler (Bankevich et al., 2012), giving it access to the 28S and 18S rRNA sequences of the mosquito closest in terms of taxonomic distance. By doing this, the assembler was able to reconstruct the rRNA of the mosquito as well as the rRNA of the parasitizing insect. All assembled rRNA sequences from this study have been deposited in GenBank with accession numbers OM350214–OM350327 for 18S rRNA sequences and OM542339–OM542460 for 28S rRNA sequences.

COI amplicon sequencing

Request a detailed protocol

The mitochondrial COI gene was amplified from DNA samples using the universal ‘Folmer’ primer set LCO1490 (5’- GGTCAACAAATCATAAAGATATTGG -3’) and HCO2198 (5’-TAAACTTCAGGGTGACCAAAAAATCA-3’), as per standard COI marker sequencing practices, producing a 658 bp product (Folmer et al., 1994). PCRs were performed using Phusion High-Fidelity DNA Polymerase (Thermo Fisher Scientific). Every 50 µL reaction contained 10 µL of 5× High Fidelity buffer, 1 µL of 10 mM dNTPs, 2.5 µL each of 10 mM forward (LCO1490) and reverse (HCO2198) primer, 28.5 µL of water, 5 µL of DNA sample, and 0.5 µL of 2 U/µL Phusion DNA polymerase. A three-step cycling incubation protocol was used: 98°C for 30 s; 35 cycles of 98°C for 10 s, 60°C for 30 s, and 72°C for 15 s; 72°C for 5 min ending with a 4°C hold. PCR products were size-verified using gel electrophoresis and then gel-purified using the QIAquick Gel Extraction Kit (Qiagen, Hilden, Germany). Sanger sequencing of the COI amplicons were performed by Eurofins Genomics, Ebersberg, Germany.

COI sequence analysis

Request a detailed protocol

Forward and reverse COI DNA sequences were end-trimmed to remove bases of poor quality (Q score <30). At the 5’ ends, sequences were trimmed at the same positions such that all forward sequences start with 5’-TTTTGG and all reverse sequences start with 5’-GGNTCT. Forward and reverse sequences were aligned using BLAST to produce a 621 bp consensus sequence. In cases where good quality sequences extends beyond 621 bp, forward and reverse sequences were assembled using Pearl (https://www.gear-genomics.com/pearl/) and manually checked for errors against trace files (Rausch et al., 2019; Rausch et al., 2020). We successfully assembled a total of 106 COI sequences. All assembled COI sequences from this study have been deposited in GenBank with accession numbers OM630610–OM630715.

COI validation of morphology-based species identification

Request a detailed protocol

We analysed assembled COI sequences with BLASTn against the nucleotide collection (nr/nt) database to confirm morphology-based species identification. BLAST analyses revealed 32 cases where top hits indicated a different species identity, taking <95% nucleotide sequence similarity as the threshold to delineate distinct species (Appendix 2—table 1). In these cases, the COI sequence of the specimen was then BLAST-aligned against a GenBank record representing the morphological species to verify that the revised identity is a closer match by a significant margin, that is, more than 2% nucleotide sequence similarity. All species names reported hereafter reflect identities determined by COI sequence except for cases where COI-based identities were ambiguous, in which case morphology-based identities were retained. In cases where matches were found within a single genus but of multiple species, specimens were indicated as an unknown member of their genus (e.g., Culex sp.). Information of the highest-scoring references for all specimens, including details of ambiguous BLASTn results, are recorded in Appendix 2—table 1.

Within our COI sequences, we found six unidentified Culex species (including two that matched to GenBank entries identified only to the genus level), four unidentified Mansonia species, and one unidentified Mimomyia species. For An. baezai, no existing GenBank records were found at the time this analysis was performed.

Phylogenetic analysis

Request a detailed protocol

Multiple sequence alignment (MSA) were performed on assembled COI and rRNA sequences using the MUSCLE software (Edgar, 2004; Madeira et al., 2019). As shown in Figure 3—figure supplement 2, the 28S rRNA sequences contain many blocks of highly conserved nucleotides, which makes the result of multiple alignment particularly evident. We therefore did not test other alignment programs. The multiple alignment of the COI amplicons is even more evident since no gaps are necessary for this alignment.

Phylogenetic tree reconstructions were performed with the MEGA X software using the maximum-likelihood method (Kumar et al., 2018). Default parameters were used with bootstrapping with 500 replications to quantify confidence level in branches. For rRNA trees, sequences belonging to an unknown species of parasitic water mite (genus Horreolanus) found in our specimens served as an outgroup taxon. In addition, we created and analysed a separate dataset combining our 28S rRNA sequences and full-length 28S rRNA sequences from GenBank totalling 169 sequences from 58 species (12 subgenera). To serve as outgroups for the COI tree, we included sequences obtained from GenBank of three water mite species, Horreolanus orphanus (KM101004), Sperchon fuxiensis (MH916807), and Arrenurus sp. (MN362807).

Appendix 1

Appendix 1—table 1
Taxonomic and sampling information on mosquito specimens and associated accession numbers of their cytochrome c oxidase I (COI), 18S rRNA, and 28S rRNA sequences (XLSX).
Sequence IDTaxonomy [Genus (subgenus) species]OriginCollection siteCollection periodBlood engorged (Y/N)Sample IDCOI accession number18S rRNA accession number28S rRNA accession number
Ae_albopictus_KH_S1Aedes (Stegomyia) albopictusCambodiaRattanakiriDec 2019N1OM630613OM350214OM542460
Ae_albopictus_KH_S2Aedes (Stegomyia) albopictusCambodiaRattanakiriDec 2019N2OM630614OM350220OM542373
Ae_albopictus_KH_S3Aedes (Stegomyia) albopictusCambodiaRattanakiriDec 2019N3OM630615OM350316OM542374
An_baezai_KH_S4Anopheles (Anopheles) baezaiCambodiaKoh KongMar 2019N4OM630631OM350327OM542357
An_baezai_KH_S5Anopheles (Anopheles) baezaiCambodiaKoh KongMar 2019N5OM630632OM350233OM542440
An_baezai_KH_S6Anopheles (Anopheles) baezaiCambodiaKoh KongMar 2019N6OM630633OM350234OM542358
Cx_pseudovishnui_KH_S7Culex (Culex) pseudovishnuiCambodiaRattanakiriDec 2019N7OM630689OM350285OM542413
Cx_pseudovishnui_KH_S8Culex (Culex) pseudovishnuiCambodiaRattanakiriDec 2019N8OM630690OM350286OM542414
Cx_pseudovishnui_KH_S9Culex (Culex) pseudovishnuiCambodiaRattanakiriDec 2019N9OM630691OM350287OM542415
Ma_indiana_KH_S10Mansonia (Mansonioides) indianaCambodiaBattambongNov 2019N10OM630698OM350295OM542422
Ma_uniformis_KH_S11Mansonia (Mansonioides) uniformisCambodiaBattambongNov 2019N11OM630699OM350296OM542423
Ma_indiana_KH_S12Mansonia (Mansonioides) indianaCambodiaBattambongNov 2019N12OM630700OM350297OM542424
Cx_sp.1_KH_S13Culex sp.1CambodiaPrek ToalFeb 2019N13OM630672OM350267OM542395
Cx_orientalis_KH_S14Culex (Culex) orientalisCambodiaPrek ToalFeb 2019N14OM630673OM350268OM542396
Ma_uniformis_KH_S15Mansonia (Mansonioides) uniformisCambodiaBattambongNov 2019N15OM630705OM350303OM542430
Ma_uniformis_KH_S16Mansonia (Mansonioides) uniformisCambodiaBattambongNov 2019N16OM630706OM350305OM542432
Ma_uniformis_KH_S17Mansonia (Mansonioides) uniformisCambodiaBattambongNov 2019N17OM630707OM350304OM542431
Cx_bitaeniorhynchus_KH_S18Culex (Oculeomyia) bitaeniorhynchusCambodiaBattambongNov 2019N18OM630656OM350255OM542381
Cx_bitaeniorhynchus_KH_S19Culex (Oculeomyia) bitaeniorhynchusCambodiaBattambongNov 2019N19OM630657OM350256OM542382
Cx_bitaeniorhynchus_KH_S20Culex (Oculeomyia) bitaeniorhynchusCambodiaBattambongNov 2019N20OM630658OM350257OM542383, OM542384
Cx_tritaeniorhynchus_KH_S21Culex (Culex) tritaeniorhynchusCambodiaBattambongNov 2019N21OM630680OM350277OM542404
Cx_tritaeniorhynchus_KH_S22Culex (Culex) tritaeniorhynchusCambodiaBattambongNov 2019N22OM630681OM350278OM542405
Cx_tritaeniorhynchus_KH_S23Culex (Culex) tritaeniorhynchusCambodiaBattambongNov 2019N23OM630682OM350279OM542406
Ae_aegypti_CF_S24Aedes (Stegomyia) aegyptiCentral African RepublicPissaJun 2019N24OM630610OM350314OM542339
Ae_aegypti_CF_S25Aedes (Stegomyia) aegyptiCentral African RepublicPissaJun 2019N25OM630611OM350215OM542340
Ae_aegypti_CF_S26Aedes (Stegomyia) aegyptiCentral African RepublicPissaJun 2019N26OM630612OM350216OM542341
Ae_simpsoni_CF_S27Aedes (Stegomyia) simpsoniCentral African RepublicPissaJun 2019N27OM630619OM350221OM542345
Ae_simpsoni_CF_S28Aedes (Stegomyia) simpsoniCentral African RepublicPissaJun 2019N28OM630620OM350222OM542346
Ae_simpsoni_CF_S29Aedes (Stegomyia) simpsoniCentral African RepublicPissaJun 2019N29OM630621OM350223OM542347
Ae_vittatus_CF_S30Aedes (Fredwardsius) vittatusCentral African RepublicGbozoAug 2019Y30OM630628OM350230OM542439
Ae_vittatus_CF_S31Aedes (Fredwardsius) vittatusCentral African RepublicGbozoAug 2019N31OM630629OM350231OM542355
Ae_vittatus_CF_S32Aedes (Fredwardsius) vittatusCentral African RepublicGbozoAug 2019N32OM630630OM350232OM542356
Ma_sp.1_CF_S33Mansonia sp.1Central African RepublicBayangaNov 2019Y33N/AOM350294OM542449
Ma_sp.2_CF_S34Mansonia sp.2Central African RepublicBayangaNov 2019Y34N/AOM350322OM542450, OM542456
Ho_sp.1_CF_S34Horreolanus sp.1Central African RepublicBayangaNov 201934N/AOM350325OM542457
Ho_sp.2_CF_S34Horreolanus sp.2Central African RepublicBayangaNov 201934N/AOM350326OM542458
Ma_sp.3_CF_S35Mansonia sp.3Central African RepublicBayangaNov 2019Y35N/AOM350323OM542451
Ae_albopictus_CF_S36Aedes (Stegomyia) albopictusCentral African RepublicPissaJun 2019N36OM630616OM350217OM542342
Ae_albopictus_CF_S37Aedes (Stegomyia) albopictusCentral African RepublicPissaJun 2019N37OM630617OM350218OM542343
Ae_albopictus_CF_S38Aedes (Stegomyia) albopictusCentral African RepublicPissaJun 2019N38OM630618OM350219OM542344
An_coustani_CF_S39Anopheles (Anopheles) coustaniCentral African RepublicPissaJan 2020N39OM630634OM350235OM542359
An_coustani_CF_S40Anopheles (Anopheles) coustaniCentral African RepublicPissaJan 2020N40OM630635OM350236OM542360
An_coustani_CF_S41Anopheles (Anopheles) coustaniCentral African RepublicPissaJan 2020N41OM630636OM350237OM542361
An_funestus_CF_S42Anopheles (Cellia) funestusCentral African RepublicPissaJun 2019Y42OM630640OM350241OM542365
An_funestus_CF_S43Anopheles (Cellia) funestusCentral African RepublicPissaJun 2019Y43OM630641OM350242OM542366
An_funestus_CF_S44Anopheles (Cellia) funestusCentral African RepublicPissaJun 2019Y44OM630642OM350243OM542367
An_gambiae_CF_S45Anopheles (Cellia) gambiaeCentral African RepublicPissaJun 2019Y45OM630645OM350245OM542369, OM542370
An_gambiae_CF_S46Anopheles (Cellia) gambiaeCentral African RepublicPissaJun 2019Y46OM630646OM350246OM542371
An_gambiae_CF_S47Anopheles (Cellia) gambiaeCentral African RepublicPissaJun 2019Y47N/AOM350247OM542372
Cx_sp.2_CF_S48Culex sp.2Central African RepublicBayangaNov 2019Y48OM630669OM350269OM542446
Cx_sp.2_CF_S49Culex sp.2Central African RepublicBayangaNov 2019Y49OM630670OM350315OM542397
Cx_sp.2_CF_S50Culex sp.2Central African RepublicBayangaNov 2019Y50OM630671OM350270OM542398
Ma_uniformis_CF_S51Mansonia (Mansonioides) uniformisCentral African RepublicBouarMay 2019Y51N/AOM350301OM542428
Cx_duttoni_CF_S52Culex (Culex) duttoniCentral African RepublicMbaikiJan 2019Y52OM630704OM350302OM542429
Er_intermedius_CF_S54Eretmapodites intermediusCentral African RepublicPissaJun 2019N54OM630692OM350288OM542416
Er_intermedius_CF_S55Eretmapodites intermediusCentral African RepublicPissaJun 2019N55OM630693OM350289OM542417
Er_intermedius_CF_S56Eretmapodites intermediusCentral African RepublicPissaJun 2019N56OM630694OM350290OM542418
Cx_antennatus_MG_S57Culex (Culex) antennatusMadagascarAmbato BoenyFeb 2019N57OM630653OM350253OM542379
Cx_antennatus_MG_S58Culex (Culex) antennatusMadagascarAmbato BoenyFeb 2019N58OM630654OM350319OM542444
Cx_antennatus_MG_S59Culex (Culex) antennatusMadagascarAmbato BoenyFeb 2019N59OM630655OM350254OM542380
Cx_perexiguus_MG_S60Culex (Culex) perexiguusMadagascarAmparafaravolaFeb 2019N60OM630660OM350258OM542386
Cx_sp.3_MG_S61Culex sp.3MadagascarAmbato BoenyAug 2019N61OM630661OM350259OM542387
Cx_sp.4_MG_S62Culex sp.4MadagascarAmbato BoenyAug 2019N62OM630662OM350260OM542388
Cx_sp.3_MG_S63Culex sp.3MadagascarAmbato BoenyFeb 2019N63OM630686OM350282OM542410
Mi_sp.1_MG_S64Mimomyia sp.1MadagascarAmbato BoenyFeb 2019N64OM630687OM350283OM542411
Cx_sp.3_MG_S65Culex sp.3MadagascarAmbato BoenyFeb 2019N65OM630688OM350284OM542412
An_coustani_MG_S66Anopheles (Anopheles) coustaniMadagascarAmbato BoenyFeb 2019N66OM630637OM350238OM542362
An_coustani_MG_S67Anopheles (Anopheles) coustaniMadagascarAmbato BoenyFeb 2019N67OM630638OM350239OM542363
An_squamosus_MG_S68Anopheles (Cellia) squamosusMadagascarAmbato BoenyFeb 2019N68OM630639OM350240OM542364
An_funestus_MG_S69Anopheles (Cellia) funestusMadagascarAmbato BoenyFeb 2019N69OM630643OM350244OM542368
An_funestus_MG_S70Anopheles (Cellia) funestusMadagascarAmbato BoenyFeb 2020N70OM630644OM350317OM542441
An_gambiae_MG_S71Anopheles (Cellia) gambiaeMadagascarAmbato BoenyFeb 2019N71OM630647OM350249OM542442
An_gambiae_MG_S72Anopheles (Cellia) gambiaeMadagascarAmbato BoenyFeb 2019N72OM630648OM350248OM542443
An_gambiae_MG_S73Anopheles (Cellia) gambiaeMadagascarAmbato BoenyFeb 2019N73OM630649OM350318OM542459
Cx_quinquefasciatus_MG_S74Culex (Culex) quinquefasciatusMadagascarAmparafaravolaFeb 2019N74OM630674OM350271OM542399
Cx_quinquefasciatus_MG_S75Culex (Culex) quinquefasciatusMadagascarAmparafaravolaFeb 2019N75OM630675OM350272OM542447
Cx_perexiguus_MG_S76Culex (Culex) perexiguusMadagascarMampikonyAug 2019N76OM630676OM350273OM542400
Ma_uniformis_MG_S77Mansonia (Mansonioides) uniformisMadagascarAmbato BoenyFeb 2019N77OM630708OM350306OM542433
Ma_uniformis_MG_S78Mansonia (Mansonioides) uniformisMadagascarAmbato BoenyFeb 2019N78OM630709OM350307OM542434
Ma_uniformis_MG_S79Mansonia (Mansonioides) uniformisMadagascarAmbato BoenyFeb 2019N79OM630710OM350308OM542435
Cx_poicilipes_MG_S82Culex poicilipesMadagascarMampikonyFeb 2019N82OM630659OM350320OM542385, OM542445
Mi_mediolineata_MG_S83Mimomyia mediolineataMadagascarAmbato BoenyFeb 2019N83OM630683OM350280OM542407
Cx_neavei_MG_S84Culex (Culex) neaveiMadagascarAmbato BoenyFeb 2019N84OM630684OM350281OM542408, OM542409
Ae_scapularis_GF_S85Aedes (Ochlerotatus) scapularisFrench GuianaHameau PrefontaineJul 2019N85OM630624OM350224OM542348, OM542349
Ae_scapularis_GF_S86Aedes (Ochlerotatus) scapularisFrench GuianaHameau PrefontaineJul 2019N86OM630622OM350225OM542350
Ae_scapularis_GF_S87Aedes (Ochlerotatus) scapularisFrench GuianaHameau PrefontaineJul 2019N87OM630623OM350226OM542351
Ae_serratus_GF_S88Aedes (Ochlerotatus) serratusFrench GuianaHameau PrefontaineNov 2020N88OM630625OM350227OM542352
Ae_serratus_GF_S89Aedes (Ochlerotatus) serratusFrench GuianaHameau PrefontaineNov 2020N89OM630626OM350228OM542353
Ae_serratus_GF_S90Aedes (Ochlerotatus) serratusFrench GuianaHameau PrefontaineNov 2020N90OM630627OM350229OM542354
Cq_venezuelensis_GF_S91Coquillettidia venezuelensisFrench GuianaHameau PrefontaineJul 2019N91OM630650OM350250OM542375
Cq_venezuelensis_GF_S92Coquillettidia venezuelensisFrench GuianaHameau PrefontaineJul 2019N92OM630651OM350251OM542376
Cq_venezuelensis_GF_S93Coquillettidia venezuelensisFrench GuianaHameau PrefontaineJul 2019N93OM630652OM350252OM542377, OM542378
Cx_portesi_GF_S94Culex sp. BTLHVDV-2014French GuianaHameau PrefontaineJul 2019N94OM630666OM350264OM542392
Cx_portesi_GF_S95Culex sp. BTLHVDV-2014French GuianaHameau PrefontaineJul 2019N95OM630667OM350265OM542393
Cx_portesi_GF_S96Culex sp. BTLHVDV-2014French GuianaHameau PrefontaineJul 2019N96OM630668OM350266OM542394
Cx_spissipes_GF_S97Culex (Melanoconion) sp. DJS-2020French GuianaHameau PrefontaineJul 2019N97OM630677OM350274OM542401
Cx_spissipes_GF_S98Culex (Melanoconion) sp. DJS-2020French GuianaHameau PrefontaineJul 2019N98OM630678OM350275OM542402
Cx_spissipes_GF_S99Culex (Melanoconion) sp. DJS-2020French GuianaHameau PrefontaineJul 2019N99OM630679OM350276OM542403
Li_durhamii_GF_S100Limatus durhamiiFrench GuianaHameau PrefontaineJul 2019N100OM630695OM350291OM542419
Li_durhamii_GF_S101Limatus durhamiiFrench GuianaHameau PrefontaineJul 2019N101OM630696OM350292OM542420
Li_durhamii_GF_S102Limatus durhamiiFrench GuianaHameau PrefontaineJul 2019N102OM630697OM350293OM542421
Ma_sp.4_GF_S103Mansonia sp.4French GuianaHameau PrefontaineJan 2020N103OM630701OM350298OM542425
Ma_sp.4_GF_S104Mansonia sp.4French GuianaHameau PrefontaineJan 2020N104OM630702OM350299OM542426
Ma_titillans_GF_S105Mansonia (Mansonia) titillansFrench GuianaHameau PrefontaineJan 2020N105OM630703OM350300OM542427
Cx_pedroi_GF_S106Culex (Melanoconion) pedroiFrench GuianaHameau PrefontaineNov 2020N106OM630663OM350261OM542389
Cx_pedroi_GF_S107Culex (Melanoconion) pedroiFrench GuianaHameau PrefontaineNov 2020N107OM630664OM350262OM542390
Cx_pedroi_GF_S108Culex (Melanoconion) pedroiFrench GuianaHameau PrefontaineNov 2020N108OM630665OM350263OM542391
Ps_ferox_GF_S109Psorophora feroxFrench GuianaIracoubo2009N109OM630711OM350309OM542436
Ps_ferox_GF_S110Psorophora feroxFrench GuianaIracoubo2009N110OM630712OM350310OM542437
Ps_ferox_GF_S111Psorophora feroxFrench GuianaIracoubo2009N111OM630713OM350324OM542452
Ur_geometrica_GF_S112Uranotaenia (Uranotaenia) geometricaFrench Guiana2010N112OM630714OM350311OM542453
Ur_geometrica_GF_S113Uranotaenia (Uranotaenia) geometricaFrench Guiana2010N113N/AOM350312OM542454
Ur_geometrica_GF_S114Uranotaenia (Uranotaenia) geometricaFrench Guiana2010N114OM630715OM350313OM542438, OM542455
Cx_tritaeniorhynchus_MG_S115Culex (Culex) tritaeniorhynchusMadagascarAmbato BoenyFeb 2019N115OM630685OM350321OM542448

Appendix 2

Appendix 2—table 1
Cytochrome c oxidase I (COI) sequence BLAST analyses summary (XLSX).
Sequence IDSequence lengthMorphological identificationBLASTn top hit speciesBLASTn top hit accessionQuery coverage% identityComments
Ae_albopictus_KH_S1699Aedes albopictusAedes albopictusMK714006.199%99.71%
Ae_albopictus_KH_S2695Aedes albopictusAedes albopictusMK714006.1100%99.71%
Ae_albopictus_KH_S3695Aedes albopictusAedes albopictusMK714006.1100%99.71%
An_baezai_KH_S4658Anopheles baezaiAnopheles darlingiMF381626.1100%92.71%Anopheles baezai not found in GenBank
An_baezai_KH_S5670Anopheles baezaiAnopheles darlingiMF381626.199%92.81%Anopheles baezai not found in GenBank
An_baezai_KH_S6659Anopheles baezaiAnopheles darlingiMF381626.1100%92.72%Anopheles baezai not found in GenBank
Cx_pseudovishnui_KH_S7660Culex vishnuiCulex pseudovishnuiMW321882.198%98.92%95% similarity to Culex vishnui, 94% similarity to Culex tritaeniorhynchus
Cx_pseudovishnui_KH_S8659Culex vishnuiCulex pseudovishnuiMW321882.198%99.38%95% similarity to Culex vishnui, 94% similarity to Culex tritaeniorhynchus
Cx_pseudovishnui_KH_S9659Culex vishnuiCulex pseudovishnuiMW321882.198%98.92%95% similarity to Culex vishnui, 94% similarity to Culex tritaeniorhynchus
Ma_indiana_KH_S10660Mansonia indianaMansonia indianaMK637632.198.00%99.54%
Ma_uniformis_KH_S11686Mansonia indianaMansonia uniformisMK757484.199%99.71%89.99% similarity to Mansonia indiana MK637632.1
Ma_indiana_KH_S12693Mansonia indianaMansonia indianaMK637632.197%99.41%
Cx_sp.1_KH_S13687Culex quinquefasciatusCulex (Lophoceraomyia) sp.5 HY-2020MW321904.198%94.39%90% similarity to Culex quinquefasciatus GU188856.2
Cx_orientalis_KH_S14662Culex quinquefasciatusCulex orientalisMW228488.197%98.29%
Ma_uniformis_KH_S15658Mansonia uniformisMansonia uniformisMK757484.1100.00%99.54%
Ma_uniformis_KH_S16654Mansonia uniformisMansonia uniformisMK757484.1100.00%99.39%
Ma_uniformis_KH_S17657Mansonia uniformisMansonia uniformisMK757484.199.00%99.54%
Cx_bitaeniorhynchus_KH_S18658Culex bitaeniorhynchusCulex bitaeniorhynchusHQ398898.197.00%99.69%
Cx_bitaeniorhynchus_KH_S19650Culex bitaeniorhynchusCulex bitaeniorhynchusHQ398898.198.00%99.84%
Cx_bitaeniorhynchus_KH_S20652Culex bitaeniorhynchusCulex bitaeniorhynchusHQ398898.198.00%99.38%
Cx_tritaeniorhynchus_KH_S21695Culex tritaeniorhynchusCulex vishnui or Culex tritaeniorhynchusMH374857.1100%99.57%99.69% similarity to Culex tritaeniorhynchus MF179213.1
Cx_tritaeniorhynchus_KH_S22690Culex tritaeniorhynchusCulex vishnui or Culex tritaeniorhynchusMT876103.1100%99.57%
Cx_tritaeniorhynchus_KH_S23663Culex tritaeniorhynchusCulex tritaeniorhynchusMT876103.199%98.79%
Ae_aegypti_CF_S24689Aedes aegyptiAedes aegyptiMN299016.1100%99.56%
Ae_aegypti_CF_S25660Aedes aegyptiAedes aegyptiMN299024.1100.00%99.70%
Ae_aegypti_CF_S26660Aedes aegyptiAedes aegyptiMN299024.1100.00%99.70%
Ae_simpsoni_CF_S27644Aedes opokAedes simpsoniLC473669.197.00%97.77%Aedes opok not found in GenBank, sequence has 90% and 89% similarity to Aedes luteocephalus and Aedes africanus, sister species of Aedes opok.
Ae_simpsoni_CF_S28649Aedes opokAedes simpsoniMN552302.199.00%100.00%Aedes. opok not found in GenBank, sequence has 90% and 89% similarity to Aedes luteocephalus and Aedes africanus, sister species of Aedes opok.
Ae_simpsoni_CF_S29627Aedes opokAedes simpsoniMN552302.198.00%98.87%Aedes opok not found in GenBank, sequence has 90% and 89% similarity to Aedes luteocephalus and Aedes africanus, sister species of Aedes opok.
Ae_vittatus_CF_S30623Aedes vittatusAedes vittatusMN552298.1100.00%99.84%
Ae_vittatus_CF_S31622Aedes vittatusAedes vittatusMN552298.1100.00%99.68%
Ae_vittatus_CF_S32621Aedes vittatusAedes vittatusMN552298.1100.00%99.68%
Ma_sp.1_CF_S33Mansonia africanaNo COI obtained
Ma_sp.2_CF_S34Mansonia africanaNo COI obtained
Ma_sp.3_CF_S35Mansonia africanaNo COI obtained
Ae_albopictus_CF_S36627Aedes albopictusAedes albopictusMK995332.1100.00%99.84%
Ae_albopictus_CF_S37621Aedes albopictusAedes albopictusMK995332.1100.00%100.00%
Ae_albopictus_CF_S38621Aedes albopictusAedes albopictusMK995332.1100.00%100.00%
An_coustani_CF_S39621Anopheles coustaniAnopheles coustaniMK585968.1100.00%99.84%
An_coustani_CF_S40621Anopheles coustaniAnopheles coustaniMK585959.1100.00%99.03%
An_coustani_CF_S41699Anopheles coustaniAnopheles coustaniMK585968.194.00%99.70%
An_funestus_CF_S42696Anopheles funestusAnopheles funestusMK300231.1100.00%99.71%
An_funestus_CF_S43660Anopheles funestusAnopheles funestusMT375215.1100.00%99.85%
An_funestus_CF_S44658Anopheles funestusAnopheles funestusMT375215.1100.00%99.70%
An_gambiae_CF_S45660Anopheles gambiaeAnopheles gambiaeMG930895.186.00%99.79%
An_gambiae_CF_S46659Anopheles gambiaeAnopheles gambiaeMT375223.189.00%100.00%
An_gambiae_CF_S47Anopheles gambiaeNo COI obtained
Cx_sp.2_CF_S48653Culex quinquefasciatusCulex cornigerKM593015.1100.00%94.95%94% similarity to all other Culex species
Cx_sp.2_CF_S49660Culex quinquefasciatusCulex nigripalpusKM593058.199.00%94.65%94% similarity to all other Culex species
Cx_sp.2_CF_S50658Culex quinquefasciatusCulex bidensMH931446.1100.00%94.68%94% similarity to all other Culex species
Ma_uniformis_CF_S51Mansonia uniformisNo COI obtained
Cx_duttoni_CF_S52621Mansonia uniformisCulex duttoniLC473629.1100.00%99.68%
Er_intermedius_CF_S54620Eretmapodites sp.Eretmapodites intermediusMN552305.1100.00%99.52%
Er_intermedius_CF_S55621Eretmapodites sp.Eretmapodites intermediusMN552305.1100.00%99.68%
Er_intermedius_CF_S56621Eretmapodites sp.Eretmapodites intermediusMN552305.1100.00%99.68%
Cx_antennatus_MG_S57621Culex antennatusCulex antennatusLC473659.1100.00%100.00%
Cx_antennatus_MG_S58621Culex antennatusCulex antennatusLC473659.1100.00%100.00%
Cx_antennatus_MG_S59621Culex. antennatusCulex antennatusLC473659.1100.00%100.00%
Cx_perexiguus_MG_S60621Culex decensCulex perexiguusLC473634.1100.00%99.84%
Cx_sp.3_MG_S61685Culex decensUnknown Culex speciesKU380436.196.00%96.05%
Cx_sp.4_MG_S62687Culex decensUnknown Culex speciesMT993494.199.00%95.63%
Cx_sp.3_MG_S63687Culex univittatusUnknown Culex speciesKU380436.195.00%96.50%
Mi_sp.1_MG_S64694Culex univittatusMimomyia mimomyiaformisLC473719.194.00%92.55%Unknown Mimomyia species
Cx_sp.3_MG_S65691Culex univittatusUnknown Culex speciesKU380436.195.00%96.66%
An_coustani_MG_S66669Anopheles coustaniAnopheles coustaniNC_050693.199.00%99.40%
An_coustani_MG_S67659Anopheles coustaniAnopheles coustaniNC_050693.199.00%99.08%
An_squamosus_MG_S68653Anopheles coustaniAnopheles squamosusMK776741.1100.00%100.00%
An_funestus_MG_S69654Anopheles funestusAnopheles funestusMT375215.1100.00%99.85%
An_funestus_MG_S70654Anopheles funestusAnopheles funestusMG742199.1100.00%99.69%
An_gambiae_MG_S71654Anopheles gambiaeAnopheles gambiaeMT375222.1100.00%99.85%
An_gambiae_MG_S72654Anopheles gambiaeAnopheles gambiaeMT375222.1100.00%99.85%
An_gambiae_MG_S73622Anopheles gambiaeAnopheles gambiaeMT375222.1100.00%100.00%
Cx_quinquefasciatus_MG_S74654Culex quinquefasciatusCulex pipiensMT199095.1100.00%100.00%99.85% similarity to Culex quinquefasciatus
Cx_quinquefasciatus_MG_S75647Culex quinquefasciatusCulex quinquefasciatusMH423504.1100.00%98.15%Also 98% similarity to Culex pipiens
Cx_perexiguus_MG_S76621Culex quinquefasciatusCulex perexiguusLC473634.1100.00%99.52%Same SNPs to Culex pipiens MH374861.1
Ma_uniformis_MG_S77621Mansonia uniformisMansonia uniformisKU187165.1100.00%100.00%
Ma_uniformis_MG_S78621Mansonia uniformisMansonia uniformisKU187165.1100.00%100.00%
Ma_uniformis_MG_S79626Mansonia uniformisMansonia uniformisKU187157.1100.00%99.68%
Cx_poicilipes_MG_S82689Culex bitaeniorhynchusCulex poicilipesLC473618.195.00%99.70%
Mi_mediolineata_MG_S83694Culex tritaeniorhynchusMimomyia mediolineataLC473723.194.00%99.39%
Cx_neavei_MG_S84671Culex tritaeniorhynchusCulex neaveiLC473635.198.00%99.85%
Ae_scapularis_GF_S85659Aedes scapularisAedes scapularisMN997484.197.00%98.76%
Ae_scapularis_GF_S86658Aedes scapularisAedes scapularisMF172265.197.00%99.38%
Ae_scapularis_GF_S87654Aedes scapularisAedes scapularisMF172265.198.00%99.22%
Ae_serratus_GF_S88660Aedes serratusAedes serratusMF172269.197.00%98.91%
Ae_serratus_GF_S89660Aedes serratusAedes serratusMF172268.197.00%99.22%
Ae_serratus_GF_S90654Aedes serratusAedes serratusMF172268.198.00%99.07%
Cq_venezuelensis_GF_S91658Coquillettidia venezuelensisCoquillettidia venezuelensisMN997703.197.00%97.98%
Cq_venezuelensis_GF_S92621Coquillettidia venezuelensisCoquillettidia. venezuelensisMN997703.1100.00%98.07%
Cq_venezuelensis_GF_S93621Coquillettidia venezuelensisCoquillettidia venezuelensisMN997703.1100.00%97.75%
Cx_portesi_GF_S94653Culex portesiCulex portesiin-house reference library98.5–100%Reference sequence provided by Amandine Guidez, IP Guyane
Cx_portesi_GF_S95693Culex portesiCulex portesiin-house reference library98.5–100%Reference sequence provided by Amandine Guidez, IP Guyane
Cx_portesi_GF_S96687Culex portesiCulex portesiin-house reference library98.5–100%Reference sequence provided by Amandine Guidez, IP Guyane
Cx_spissipes_GF_S97672Culex spissipesCulex spissipesin-house reference library98.5–100%Reference sequence provided by Amandine Guidez, IP Guyane
Cx_spissipes_GF_S98663Culex spissipesCulex spissipesin-house reference library98.5–100%Reference sequence provided by Amandine Guidez, IP Guyane
Cx_spissipes_GF_S99660Culex spissipesCulex spissipesin-house reference library98.5–100%Reference sequence provided by Amandine Guidez, IP Guyane
Li_durhamii_GF_S100653Lmatus durhamiiLimatus durhamiiMF172330.198.00%99.84%
Li_durhamii_GF_S101621Limatus durhamiiLimatus durhamiiMF172330.1100.00%100.00%
Li_durhamii_GF_S102699Limatus durhamiiLimatus durhamiiMF172330.194.00%100.00%
Ma_sp.4_GF_S103621Mansonia titillansMansonia sp.MT329066.1100.00%99.84%87.12% similarity to Mansonia titillans MN968244.1
Ma_sp.4_GF_S104695Mansonia titillansMansonia sp.MT329066.195.00%99.85%87.39% to Mansonia titillans MN968244.1
Ma_titillans_GF_S105669Mansonia titillansMansonia titillansMN968244.198.00%99.70%
Cx_pedroi_GF_S106653Culex pedroiCulex pedroiKX779887.198.00%98.60%
Cx_pedroi_GF_S107661Culex pedroiCulex pedroiKX779887.197.00%98.76%
Cx_pedroi_GF_S108621Culex pedroiCulex pedroiKX779887.199.00%98.87%
Ps_ferox_GF_S109633Psorophora feroxPsorophora feroxMF172349.1100.00%99.68%
Ps_ferox_GF_S110621Psorophora feroxPsorophora feroxMF172349.1100.00%99.68%
Ps_ferox_GF_S111621Psorophora feroxPsorophora feroxMF172347.199.00%99.51%
Ur_geometrica_GF_S112621Uranotaenia geometricaUranotaenia geometricaNC_044662.1100.00%100.00%
Ur_geometrica_GF_S113Uranotaenia geometricaNo COI obtained
Ur_geometrica_GF_S114621Uranotaenia geometricaUranotaenia geometricaNC_044662.1100.00%100.00%
Cx_tritaeniorhynchus_MG_S115653Culex tritaeniorhynchusCulex tritaeniorhynchusMK861440.1100.00%98.77%

Data availability

Multiple sequence alignment files are included as source data files. All sequences generated in this study have been deposited in GenBank under the accession numbers OM350214–OM350327 for 18S rRNA sequences, OM542339–OM542460 for 28S rRNA sequences, and OM630610–OM630715 for COI sequences.

References

    1. Aspen S
    2. Savage HM
    (2003)
    Polymerase chain reaction assay identifies North American members of the Culex pipiens complex based on nucleotide sequence differences in the acetylcholinesterase gene Ace.2
    Journal of the American Mosquito Control Association 19:323–328.
    1. Bhattacharya S
    2. Basu P
    (2016)
    The Southern House Mosquito, Culex quinquefasciatus: profile of a smart vector
    Journal of Entomology and Zoology Studies JEZS 4:73–81.
  1. Book
    1. Edwards FW
    (1941)
    Mosquitoes of the Ethiopian Region: III
    Culicine Adults and Pupae. Order of the Trustees.
    1. Folmer O
    2. Black M
    3. Hoeh W
    4. Lutz R
    5. Vrijenhoek R
    (1994)
    DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates
    Molecular Marine Biology and Biotechnology 3:294–299.
  2. Book
    1. Grjebine A
    (1966)
    Insectes Diptères Culicidae Anophelinae
    ORSTOM / CNRS.
  3. Book
    1. Héraud JM
    2. Andriamandimby SF
    3. Olive MM
    4. Guis H
    5. Miatrana Rasamoelina V
    6. Tantely L
    (2022)
    Arthropod-borne viruses of Madagascar
    In: Goodman SM, editors. The New Natural History of Madagascar. Princeton University Press. pp. 285–291.
  4. Book
    1. Huang Y
    2. Ward RA
    (1981)
    A Pictorial Key for the Identification of the Mosquitoes Associated with Yellow Fever in Africa
    Mosquito Systematics.
    1. Jacobi JC
    2. Serie C
    (1972)
    Prevalence of group B arbovirus infections in French Guiana in 1967-69
    Medecine d’Afrique Noire 19:225–226.
    1. Rattanarithikul R
    2. Harbach RE
    3. Harrison BA
    4. Panthusiri P
    5. Jones JW
    6. Coleman RE
    (2005a)
    Illustrated keys to the mosquitoes of Thailand
    II Genera Culex and Lutzia: The Southeast Asian Journal of Tropical Medicine and Public Health.
    1. Rattanarithikul R
    2. Harrison BA
    3. Panthusiri P
    4. Coleman RE
    (2005b)
    Illustrated keys to the mosquitoes of Thailand I. Background; geographic distribution; lists of genera, subgenera, and species; and a key to the genera
    The Southeast Asian Journal of Tropical Medicine and Public Health 36 Suppl 1:1–80.
    1. Rattanarithikul R
    2. Harrison BA
    3. Harbach RE
    4. Panthusiri P
    5. Coleman RE
    6. Panthusiri P
    (2006a)
    Illustrated keys to the mosquitoes of Thailand. IV. anopheles
    The Southeast Asian Journal of Tropical Medicine and Public Health 37 Suppl 2:1–128.
    1. Rattanarithikul R
    2. Harrison BA
    3. Panthusiri P
    4. Peyton EL
    (2006b)
    Illustrated keys to the mosquitoes of Thailand: III. Genera Aedeomyia, Ficalbia, Mimomyia, Hodgesia, Coquillettidia, Mansonia, and Uranotaenia
    Southeast Asian Journal of Tropical Medicine and Public Health 37:1–10.
    1. Rattanarithikul R
    2. Harbach RE
    3. Harrison BA
    4. Panthusiri P
    5. Coleman RE
    (2007)
    Illustrated keys to the mosquitoes of Thailand V. Genera Orthopodomyia, Kimia, Malaya, Topomyia, Tripteroides, and Toxorhynchites
    Suppl 38:1–65.
    1. Rattanarithikul R
    2. Harbach RE
    3. Harrison BA
    4. Panthusiri P
    5. Coleman RE
    6. Richardson JH
    (2010)
    Illustrated keys to the mosquitoes of Thailand. VI. Tribe Aedini
    The Southeast Asian Journal of Tropical Medicine and Public Health 41 Suppl 1:1–225.
    1. Sallum MAM
    2. Forattini OP
    (1996)
    Revision of the spissipes section of Culex (Melanoconion) (Diptera:culicidae)
    Journal of the American Mosquito Control Association 12:517–600.
    1. Saluzzo JF
    2. Evidera TV
    3. Veas F
    4. Gonzalez JPJ
    (2017)
    Arbovirus discovery in Central African Republic (1973-1993): Zika, Bozo, Bouboui, and more
    Annals of Infectious Disease and Epidemiology 2:.
    1. Sirivanakarn S
    (1982)
    A review of the systematics and a proposed scheme of internal classification of the New World subgenus Melanoconion of Culex (Diptera, Culicidae)
    Mosquito Systematics 14:265–333.
  5. Report
    1. WHO
    (2017)
    Global vector control response 2017–2030
    In World Health Organization.

Article and author information

Author details

  1. Cassandra Koh

    Institut Pasteur, Université Paris Cité, CNRS UMR3569, Viruses and RNA Interference Unit, F-75015, Paris, France
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    cassandra.koh@pasteur.fr
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-2466-6731
  2. Lionel Frangeul

    Institut Pasteur, Université Paris Cité, CNRS UMR3569, Viruses and RNA Interference Unit, F-75015, Paris, France
    Contribution
    Conceptualization, Resources, Data curation, Formal analysis, Investigation, Visualization, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  3. Hervé Blanc

    Institut Pasteur, Université Paris Cité, CNRS UMR3569, Viruses and RNA Interference Unit, F-75015, Paris, France
    Contribution
    Conceptualization, Data curation, Investigation, Methodology
    Competing interests
    No competing interests declared
  4. Carine Ngoagouni

    Institut Pasteur de Bangui, Medical Entomology Laboratory, Bangui, Central African Republic
    Contribution
    Resources, Writing – review and editing
    Competing interests
    No competing interests declared
  5. Sébastien Boyer

    Institut Pasteur du Cambodge, Medical and Veterinary Entomology Unit, Phnom Penh, Cambodia
    Contribution
    Resources, Formal analysis, Writing – review and editing
    Competing interests
    No competing interests declared
  6. Philippe Dussart

    Institut Pasteur du Cambodge, Virology Unit, Phnom Penh, Cambodia
    Contribution
    Resources, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-1931-3037
  7. Nina Grau

    Institut Pasteur de Madagascar, Medical Entomology Unit, Antananarivo, Madagascar
    Contribution
    Resources, Writing – review and editing
    Competing interests
    No competing interests declared
  8. Romain Girod

    Institut Pasteur de Madagascar, Medical Entomology Unit, Antananarivo, Madagascar
    Contribution
    Resources, Writing – review and editing
    Competing interests
    No competing interests declared
  9. Jean-Bernard Duchemin

    Institut Pasteur de la Guyane, Vectopôle Amazonien Emile Abonnenc, Cayenne, French Guiana
    Contribution
    Resources, Formal analysis, Writing – review and editing
    Competing interests
    No competing interests declared
  10. Maria-Carla Saleh

    Institut Pasteur, Université Paris Cité, CNRS UMR3569, Viruses and RNA Interference Unit, F-75015, Paris, France
    Contribution
    Conceptualization, Supervision, Funding acquisition, Investigation, Project administration, Writing – review and editing
    For correspondence
    carla.saleh@pasteur.fr
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-8593-4117

Funding

Defense Advanced Research Projects Agency (Cooperative Agreement HR001118S0017)

  • Maria-Carla Saleh

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank members of the Saleh lab for valuable discussions and Dr Louis Lambrechts for critical reading of the manuscript. We especially thank all medical entomology staff of IP Bangui, IP Cambodge (Sony Yean, Kimly Heng, Kalyan Chhuoy, Sreynik Nhek, Moeun Chhum, Kimhuor Sour, and Pierre-Olivier Maquart), IP Madagascar, and IP Guyane for assistance in field missions, laboratory work, and logistics, and Inès Partouche from IP Paris for laboratory assistance. We are also grateful to Dr Catherine Dauga for advice on phylogenetic analyses, and to Amandine Guidez for providing a French Guiana-specific COI reference library. Finally, we thank our Reviewers, including Dr Leslie Vosshall and Dr Katherine Young, and Editor Dr Sara Sawyer for constructive reviews and comments. This work was supported by the Defence Advanced Research Projects Agency PREEMPT program managed by Dr Rohit Chitale and Dr Kerri Dugan (Cooperative Agreement HR001118S0017) (the content of the information does not necessarily reflect the position or the policy of the US government, and no official endorsement should be inferred).

Copyright

© 2023, Koh et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,343
    views
  • 198
    downloads
  • 1
    citations

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Cassandra Koh
  2. Lionel Frangeul
  3. Hervé Blanc
  4. Carine Ngoagouni
  5. Sébastien Boyer
  6. Philippe Dussart
  7. Nina Grau
  8. Romain Girod
  9. Jean-Bernard Duchemin
  10. Maria-Carla Saleh
(2023)
Ribosomal RNA (rRNA) sequences from 33 globally distributed mosquito species for improved metagenomics and species identification
eLife 12:e82762.
https://doi.org/10.7554/eLife.82762

Share this article

https://doi.org/10.7554/eLife.82762

Further reading

    1. Computational and Systems Biology
    2. Microbiology and Infectious Disease
    Gaetan De Waele, Willem Waegeman, Gerben Menschaert
    Research Article

    Timely and effective use of antimicrobial drugs can improve patient outcomes, as well as help safeguard against resistance development. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is currently routinely used in clinical diagnostics for rapid species identification. Mining additional data from said spectra in the form of antimicrobial resistance (AMR) profiles is, therefore, highly promising. Such AMR profiles could serve as a drop-in solution for drastically improving treatment efficiency, effectiveness, and costs. This study endeavors to develop the first machine learning models capable of predicting AMR profiles for the whole repertoire of species and drugs encountered in clinical microbiology. The resulting models can be interpreted as drug recommender systems for infectious diseases. We find that our dual-branch method delivers considerably higher performance compared to previous approaches. In addition, experiments show that the models can be efficiently fine-tuned to data from other clinical laboratories. MALDI-TOF-based AMR recommender systems can, hence, greatly extend the value of MALDI-TOF MS for clinical diagnostics. All code supporting this study is distributed on PyPI and is packaged at https://github.com/gdewael/maldi-nn.

    1. Immunology and Inflammation
    2. Microbiology and Infectious Disease
    Megan E Amason, Cole J Beatty ... Edward A Miao
    Research Article

    Granulomas are defined by the presence of organized layers of immune cells that include macrophages. Granulomas are often characterized as a way for the immune system to contain an infection and prevent its dissemination. We recently established a mouse infection model where Chromobacterium violaceum induces the innate immune system to form granulomas in the liver. This response successfully eradicates the bacteria and returns the liver to homeostasis. Here, we sought to characterize the chemokines involved in directing immune cells to form the distinct layers of a granuloma. We use spatial transcriptomics to investigate the spatial and temporal expression of all CC and CXC chemokines and their receptors within this granuloma response. The expression profiles change dynamically over space and time as the granuloma matures and then resolves. To investigate the importance of monocyte-derived macrophages in this immune response, we studied the role of CCR2 during C. violaceum infection. Ccr2–/– mice had negligible numbers of macrophages, but large numbers of neutrophils, in the C. violaceum-infected lesions. In addition, lesions had abnormal architecture resulting in loss of bacterial containment. Without CCR2, bacteria disseminated and the mice succumbed to the infection. This indicates that macrophages are critical to form a successful innate granuloma in response to C. violaceum.