Main text

Kim BY, Wang JR, Miller DE, Barmina O, Delaney E, Thompson A, Comeault AA, Peede D, D'Agostino ERR, Pelaez J, Aguilar JM, Haji D, Matsunaga T, Armstrong EE, Zych M, Ogawa Y, Stamenković-Radak M, Jelić M, Veselinović MS, Tanasković M, Erić P, Gao J-J, Katoh TK, Toda MJ, Watabe H, Watada M, Davis JS, Moyle LC, Manoli G, Bertolini E, Košťál V, Hawley RS, Takahashi A, Jones CD, Price DK, Whiteman N, Kopp A, Matute DR, Petrov DA. 2021. Highly contiguous assemblies of 101 drosophilid genomes. eLife 10:e66405. doi: 10.7554/eLife.66405.

Published 19 July 2021

This review was prepared by Bernard Kim, Diler Haji, Noah Whiteman, Artyom Kopp, Daniel Matute, and Dmitri Petrov.

This correction revises the species identification of the Drosophila nebulosa 14030–0761.01 line from this study. Here, we show that strain 14030–0761.01 is not D. nebulosa but D. sucinea, and that the line is likely misidentified at the stock center. It is unknown whether line 14030–0761.01 was originally D. nebulosa. Although we did not assemble the genome of D. sucinea 14030–0791.00, our analyses of genomic data from this strain also show that it is likely a misidentified D. paulistorum.

Several genomes of drosophilid species belonging to the willistoni species group were assembled for this study. Among these were two closely related species purchased from the National Drosophila Species Stock Center (NDSSC) in December 2019: D. sucinea 14030–0791.01 and “D. nebulosa” 14030–0761.01. For clarity, we will henceforth refer to misidentified strains with the species name in quotes. After the publication of this manuscript, we were notified that our assembly of “D. nebulosa” 14030–0761.01 resembled D. sucinea more than other willistoni group species (pers. comm. Christopher Sottolano, Anthony Geneva, and Nir Yakoby). Indeed, our “D. nebulosa” genome appears as a sister taxon to D. sucinea in our phylogeny, rather than to the other willistoni group species we sequenced (Figure 5 of the original manuscript), which is inconsistent with other phylogenies inferred from molecular data (e.g., Finet et al., 2021).

We first wished to eliminate the possibility that we unknowingly sequenced D. sucinea multiple times due to sample mishandling. If so, the variation present in the long and short read datasets should not be consistent with two genetically distinct lines. To test this, we built new genome assemblies to obtain a consensus sequence of the variation represented by each set of reads, then inferred the phylogenetic relationships of the new assemblies.

In addition to the willistoni group assemblies already generated through the hybrid approach in our previous work (including D. sucinea and “D. nebulosa”), we newly assembled our Nanopore and Illumina reads for D. sucinea 14030–0791.01 and “D. nebulosa” 14030–0761.01, and Illumina reads for “D. sucinea” 14030–0791.00 and D. nebulosa 14030–0761.00 (Khallaf et al., 2021; available from NCBI BioProject PRJNA669609). However, “D. sucinea” 14030–0791.00 was excluded for reasons covered below. We also obtained an unpublished draft assembly of D. nebulosa 14030–0761.06 courtesy of Christopher Sottolano and Nir Yakoby. Finally, our D. saltans assembly was used as an outgroup. Short read datasets were assembled with SPAdes v3.15.3 (Prjibelski et al., 2020). Nanopore reads were assembled with Flye v2.9 (Kolmogorov et al., 2019), then polished once with Oxford Nanopore’s Medaka software (v1.4.4). The 125 BUSCO genes (Manni et al., 2021) that were the most complete across all assemblies were used to build an ASTRAL tree (Zhang et al., 2018), using the methods from our study (Figure 1).
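The gene-selection step described above can be illustrated with a minimal sketch. It assumes that each assembly's BUSCO results have already been reduced to a mapping from BUSCO ID to a status string (the values found in the Status column of BUSCO's full_table.tsv). The function name, the toy IDs and assembly names, and the alphabetical tie-breaking rule are illustrative assumptions, not the procedure used in the study.

```python
from collections import Counter


def most_complete_buscos(status_by_assembly, n_genes):
    """Return the n_genes BUSCO IDs that are 'Complete' in the most assemblies.

    status_by_assembly: {assembly_name: {busco_id: status}} (hypothetical layout).
    Ties are broken alphabetically by BUSCO ID so the result is deterministic.
    """
    complete_counts = Counter()
    for statuses in status_by_assembly.values():
        for busco_id, status in statuses.items():
            # Only 'Complete' (single-copy) hits add to the count; 'Duplicated',
            # 'Fragmented' and 'Missing' add zero but still register the ID.
            complete_counts[busco_id] += status == "Complete"
    ranked = sorted(complete_counts, key=lambda b: (-complete_counts[b], b))
    return ranked[:n_genes]


# Toy input with made-up BUSCO IDs and assembly names.
toy = {
    "asm_A": {"g1": "Complete", "g2": "Complete", "g3": "Missing"},
    "asm_B": {"g1": "Complete", "g2": "Fragmented", "g3": "Complete"},
    "asm_C": {"g1": "Complete", "g2": "Complete", "g3": "Duplicated"},
}
selected = most_complete_buscos(toy, n_genes=2)
print(selected)
```

In a real run, n_genes would be 125 and the per-gene sequences of the selected IDs would then go into gene-tree inference before the ASTRAL step.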

Figure 1. A phylogenetic tree constructed with genomic data from various willistoni group species suggests that line 14030–0761.01 is misidentified.

The ASTRAL tree is constructed from 125 randomly selected complete single-copy BUSCOs from each tip genome. Node confidence values are the local posterior probabilities of each node.

The phylogenetic relationships between the various samples showed us two important things (Figure 1). First, “D. nebulosa” 14030–0761.01 is indeed more closely related to D. sucinea than to the other D. nebulosa assemblies, confirming our suspicions that this line is misidentified. Second, the sequences from “D. nebulosa” 14030–0761.01 and D. sucinea 14030–0791.01 form clusters distinct from each other, meaning the samples were properly handled for sequencing.
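The sister-group reasoning can be made concrete with a small sketch that reads a tree written as nested tuples and reports which tips form the sister group of a given sample. The topology below is a deliberately simplified illustration that encodes only the relationships stated in the text; it is not the Figure 1 tree, it omits the other willistoni group species, and the function names are hypothetical.

```python
def leaf_set(node):
    """Collect the tip labels under a node (a str is a tip, a tuple is a clade)."""
    if isinstance(node, str):
        return {node}
    tips = set()
    for child in node:
        tips |= leaf_set(child)
    return tips


def sister_tips(node, target):
    """Return the tips in the sister group of `target`, or None if it is absent."""
    if isinstance(node, str):
        return None
    for i, child in enumerate(node):
        if child == target:
            # Every other child of this clade belongs to the sister group.
            found = set()
            for sibling in node[:i] + node[i + 1:]:
                found |= leaf_set(sibling)
            return found
    for child in node:
        result = sister_tips(child, target)
        if result is not None:
            return result
    return None


# Simplified, illustrative topology (an assumption, not the published tree).
toy_tree = (
    (
        ('"D. nebulosa" 14030-0761.01', "D. sucinea 14030-0791.01"),
        ("D. nebulosa 14030-0761.00", "D. nebulosa 14030-0761.06"),
    ),
    "D. saltans",
)
print(sister_tips(toy_tree, '"D. nebulosa" 14030-0761.01'))
```

If the line were a true D. nebulosa, its sister group would be expected to contain the other D. nebulosa samples rather than D. sucinea.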

While we originally downloaded “D. sucinea” 14030–0791.00 data from NCBI for these analyses, we found those data also to be inconsistent with the phylogeny and excluded them from this analysis. COI sequences extracted from these reads and queried at the Barcode of Life Database (Ratnasingham and Hebert, 2007) suggested these reads were instead from D. paulistorum. Mapping these reads against our willistoni group assemblies was consistent with this species prediction: only 27.9% of reads mapped to our D. sucinea assembly, while the best-mapping assembly was D. paulistorum 14030–0771.06, with 97.5% of reads mapped.
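The mapping comparison amounts to picking the reference assembly with the highest fraction of mapped reads and checking whether it matches the labelled species. A minimal sketch follows, using only the two percentages reported above; the function name and the dictionary layout are illustrative assumptions.

```python
def best_mapping_reference(mapped_percent):
    """Return (reference, percent) for the assembly with the highest mapping rate."""
    if not mapped_percent:
        raise ValueError("no mapping rates supplied")
    return max(mapped_percent.items(), key=lambda item: item[1])


# The two mapping rates reported in the text (percent of reads mapped).
rates = {
    "D. sucinea": 27.9,
    "D. paulistorum 14030-0771.06": 97.5,
}
labelled_species = "D. sucinea"

reference, percent = best_mapping_reference(rates)
label_matches = reference == labelled_species
print(f"best reference: {reference} ({percent}% mapped); label matches: {label_matches}")
```

In practice the dictionary would hold one entry per willistoni group assembly, with rates taken from the read mapper's summary statistics.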

Although strain misidentification seems to explain the anomaly in our data, we sought to further clarify the species identity of the strain and the origin of the misidentification or contamination. To eliminate the possibility of contamination in strains maintained in our labs, we ordered nine fresh lines from the NDSSC: four D. nebulosa lines (14030–0761.00, 14030–0761.01, 14030–0761.03, and 14030–0761.06); three D. sucinea lines (14030–0791.00, 14030–0791.01, and 14030–0791.02); D. capricorni 14030–0721.01; and D. willistoni 14030–0811.17. Sanger sequencing of the COI marker locus was performed for each strain, and wings were examined for an anterior dark spot (Figure 2), a distinguishing characteristic of D. nebulosa (pers. comm. A Kopp).

Figure 2. Wing pigmentation and inferred COI sequence relationships indicate species misidentifications.

The wing coloration of “D. nebulosa” 14030–0761.01 is not consistent with the darker pigmentation observed in other D. nebulosa strains. Similarly, a maximum likelihood phylogeny of 18 COI sequences shows that “D. nebulosa” 14030–0761.01 and “D. sucinea” 14030–0791.00 are likely to be misidentified.

As expected, “D. nebulosa” 14030–0761.01 lacks the anterior pigmentation found in true D. nebulosa lines (Figure 2). A maximum likelihood phylogeny constructed with COI sequences (Figure 2) further supports our suspicions that the misidentified “D. nebulosa” line is D. sucinea and that “D. sucinea” 14030–0791.00 is not a D. sucinea line. The consistency between our data, data sequenced by others and uploaded to NCBI (Khallaf et al., 2021), and the freshly obtained lines indicates that the “D. nebulosa” 14030–0761.01 and “D. sucinea” 14030–0791.00 strains are misidentified at the NDSSC. We have notified the NDSSC and recommend these strains be used with caution.

Apart from revisions to table and figure text that correct the species misidentification, this issue does not affect any of the results presented in this work.

References to D. nebulosa are now revised to D. sucinea** in Figures 1, 2, 3, and 5. Figure legends and the underlying data have not changed.

The corrected Figure 1 is shown here:

For reference, the originally published Figure 1 is shown:

The corrected Figure 2 is shown here:

For reference, the originally published Figure 2 is shown:

The corrected Figure 3 is shown here:

For reference, the originally published Figure 3 is shown:

The corrected Figure 5 is shown here:

For reference, the originally published Figure 5 is shown:

Lastly, any references to D. nebulosa in Supplementary Files 1, 2, 3, 4, and 6, and Table 1, are now revised to D. sucinea**. No other entries in these tables are changed.

The article has been corrected accordingly.

Data accessibility: Wing photographs are available on Dryad. NCBI accession numbers for new COI sequences are listed in Table 1.

Table 1
GenBank accession numbers for new COI sequences.

Species                  NDSSC Stock #    GenBank accession
Drosophila capricorni    14030–0721.01    OK393688
Drosophila nebulosa      14030–0761.00    OK393689
“Drosophila nebulosa”    14030–0761.01    OK393690
Drosophila nebulosa      14030–0761.03    OK393691
Drosophila nebulosa      14030–0761.06    OK393692
“Drosophila sucinea”     14030–0791.00    OK393693
Drosophila sucinea       14030–0791.01    OK393694
Drosophila sucinea       14030–0791.02    OK393695
Drosophila willistoni    14030–0811.17    OK393696


Article and author information

Version history

  1. Received: March 11, 2022
  2. Accepted: March 11, 2022
  3. Version of Record published: March 18, 2022 (version 1)


© 2022, Kim et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


Cite this article as: Bernard Y Kim, Jeremy R Wang, Danny E Miller, Olga Barmina, Emily Delaney, Ammon Thompson, Aaron A Comeault, David Peede, Emmanuel RR D'Agostino, Julianne Pelaez, Jessica M Aguilar, Diler Haji, Teruyuki Matsunaga, Ellie Armstrong, Molly Zych, Yoshitaka Ogawa, Marina Stamenković-Radak, Mihailo Jelić, Marija Savić Veselinović, Marija Tanasković, Pavle Erić, Jian-Jun Gao, Takehiro K Katoh, Masanori J Toda, Hideaki Watabe, Masayoshi Watada, Jeremy S Davis, Leonie C Moyle, Giulia Manoli, Enrico Bertolini, Vladimír Košťál, R Scott Hawley, Aya Takahashi, Corbin D Jones, Donald K Price, Noah Whiteman, Artyom Kopp, Daniel R Matute, Dmitri A Petrov (2022) Correction: Highly contiguous assemblies of 101 drosophilid genomes. eLife 11:e78579.
