High-quality carnivoran genomes from roadkill samples enable comparative species delineation in aardwolf and bat-eared fox

Abstract

In a context of ongoing biodiversity erosion, obtaining genomic resources from wildlife is essential for conservation. The thousands of yearly mammalian roadkill provide a useful source material for genomic surveys. To illustrate the potential of this underexploited resource, we used roadkill samples to study the genomic diversity of the bat-eared fox (Otocyon megalotis) and the aardwolf (Proteles cristatus), both having subspecies with similar disjunct distributions in Eastern and Southern Africa. First, we obtained reference genomes with high contiguity and gene completeness by combining Nanopore long reads and Illumina short reads. Then, we showed that the two subspecies of aardwolf might warrant species status (P. cristatus and P. septentrionalis) by comparing their genome-wide genetic differentiation to pairs of well-defined species across Carnivora with a new Genetic Differentiation index (GDI) based on only a few resequenced individuals. Finally, we obtained a genome-scale Carnivora phylogeny including the new aardwolf species.

Introduction

In the context of worldwide erosion of biodiversity, obtaining large-scale genomic resources from wildlife is essential for biodiversity assessment and species conservation. An underexploited, but potentially useful, source of material for genomics is the many thousands of annual wildlife fatalities due to collisions with cars. In particular, mammalian roadkill is unfortunately so frequent that several citizen-science surveys have been implemented on this subject in recent decades (Périquet et al., 2018; Shilling and Perkins, 2015). For example, in South Africa alone, over 12,000 wildlife road mortality incidents were recorded by The Endangered Wildlife Trust’s "Wildlife and Roads Project" from 1949 to 2017 (Endangered Wildlife Trust 2017). Initially developed to measure the impact of roads on wildlife, these web-based systems highlight the numbers of car-wildlife collisions. The possibility of retrieving DNA from roadkill tissue samples (Etherington et al., 2020; Maigret, 2019) could provide new opportunities in genomics by giving access not only to a large number of specimens of commonly encountered species, but also to more elusive and endangered species that might be difficult to sample otherwise.

Recent advances in the development of high-throughput sequencing technologies have made the sequencing of hundreds or thousands of genetic loci cost-efficient and have offered the possibility of using ethanol-preserved tissues, old DNA extracts, and museum specimens (Blaimer et al., 2016; Guschanski et al., 2013). In the meantime, third-generation long-read sequencing technologies, such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) sequencing, have increased the sizes of the sequenced molecules from several kilobases to several megabases. The relatively high level of sequencing errors (10–15%) associated with these technologies can be compensated by sequencing at a high depth-of-coverage to avoid sequencing errors in de novo genome assembly and thus obtain reference genomes with high base accuracy, contiguity, and completeness (Koren et al., 2017; Shafin et al., 2020; Vaser et al., 2017). Originally designed to allow direct sequencing of DNA molecules with simplified library preparation procedures, ONT instruments, such as the MinION (Jain et al., 2016), have been co-opted as a portable sequencing method in the field that proved useful in a diversity of environmental conditions (Blanco et al., 2019; Parker et al., 2017; Pomerantz et al., 2018; Srivathsan et al., 2018). This approach is particularly suitable for sequencing roadkill specimens, for which it is notoriously difficult to obtain a large amount of high-quality DNA because of postmortem DNA degradation processes in high ambient environmental temperatures. Furthermore, it is possible to correct errors in ONT long reads by combining them with Illumina short reads, either to polish de novo long-read-based genome assemblies (Batra et al., 2019; Jain et al., 2018; Nicholls et al., 2019; Walker et al., 2014) or to construct hybrid assemblies (Di Genova et al., 2018; Gan et al., 2019; Tan et al., 2018; Zimin et al., 2013).
In hybrid assembly approaches, the accuracy of short reads with high depth-of-coverage (50–100x) allows the use of long reads at lower depths of coverage (10–30x) essentially for scaffolding (Armstrong et al., 2020; Kwan et al., 2019). A promising hybrid assembly approach, combining short- and long-read sequencing data has been implemented in MaSuRCA software (Zimin et al., 2017; Zimin et al., 2013). This approach consists of transforming large numbers of short reads into a much smaller number of longer highly accurate ‘super reads’, allowing the use of a mixture of read lengths. Furthermore, this method is designed to tolerate a significant level of sequencing error. Initially developed to address short reads from Sanger sequencing and longer reads from 454 Life Sciences instruments, this method has already shown promising results for combining Illumina and ONT/PacBio sequencing data in several taxonomic groups, such as plants (Scott et al., 2020; Wang et al., 2019; Zimin et al., 2017), birds (Gan et al., 2019), and fishes (Jiang et al., 2019; Kadobianskyi et al., 2019; Tan et al., 2018), but not yet in mammals.

Here, we studied two of the most frequently encountered mammalian roadkill species in South Africa (Périquet et al., 2018): the bat-eared fox (Otocyon megalotis, Canidae) and the aardwolf (Proteles cristatus, Hyaenidae). These two species are among several African vertebrate taxa presenting disjunct distributions between Southern and Eastern Africa that are separated by more than a thousand kilometres (e.g. ostrich, Miller et al., 2011; ungulates, Lorenzen et al., 2012). Diverse biogeographical scenarios, involving the survival and divergence of populations in isolated savanna refugia during the climatic oscillations of the Pleistocene, have been proposed to explain these disjunct distributions in ungulates (Lorenzen et al., 2012). Among the Carnivora, subspecies have been defined based on this peculiar allopatric distribution not only for the black-backed jackal (Lupulella mesomelas; Walton and Joly, 2003) but also for both the bat-eared fox (Clark, 2005) and the aardwolf (Koehler and Richardson, 1990; Figure 1). The bat-eared fox is divided into the Southern bat-eared fox (O. megalotis megalotis) and the Eastern bat-eared fox (O. megalotis virgatus) (Clark, 2005), and the aardwolf is divided into the Southern aardwolf (P. cristatus cristatus) and the Eastern aardwolf (P. cristatus septentrionalis) (Koehler and Richardson, 1990). However, despite known differences in behaviour between the subspecies of both species groups (Wilson et al., 2009), no genetic or genomic assessment of population differentiation has been conducted to date. In other taxa, similar allopatric distributions have led to genetic differences between populations and several studies reported substantial intraspecific genetic structuration between Eastern and Southern populations (Atickem et al., 2018; Barnett et al., 2006; Dehghani et al., 2008; Lorenzen et al., 2012; Miller et al., 2011; Rohland et al., 2005). 
Here, with a novel approach based on a few individuals, we investigate whether significant genetic structuration and population differentiation have occurred between subspecies of bat-eared fox and aardwolf using whole genome data.

Figure 1

Disjunct distributions of the aardwolf (*Proteles cristatus*) and the bat-eared fox (*Otocyon megalotis*) in Eastern and Southern Africa.

Within each species, two subspecies have been recognized based on their distributions and morphological differences (Clark, 2005; Koehler and Richardson, 1990). Picture credits: Southern aardwolf (*P. cristatus cristatus*) copyright Dominik Käuferle; Southern bat-eared fox (*O. megalotis megalotis*) copyright Derek Keats.

To evaluate the taxonomic status of the proposed subspecies within both O. megalotis and P. cristatus, we first sequenced and assembled two reference genomes from roadkill samples by combining ONT long reads and Illumina short reads using the MaSuRCA hybrid assembler. The quality of our genome assemblies was assessed by comparison to available mammalian genome assemblies. Then, to estimate the genetic diversity of these species and to perform comparative genome-scale species delineation analyses, two additional individuals from the disjunct South African and Tanzanian populations of both species were resequenced at high depth-of-coverage using Illumina short reads. Using these additional individuals, we estimated the genetic diversity and differentiation of each subspecies pair via an FST-like measure, which we called the genetic differentiation index (GDI), and compared the results with the genetic differentiation among pairs of well-established carnivoran sister species. Based on measures of genetic differentiation, we found that the two subspecies of P. cristatus warrant separate species status, whereas the subspecies of O. megalotis do not show such differentiation. Our results show that high-quality reference mammalian genomes could be obtained through a combination of short- and long-read sequencing methods providing opportunities for large-scale population genomic studies of mammalian wildlife using (re)sequencing of samples collected from roadkill.

Results

Mitochondrial diversity within the Carnivora

The first dataset, composed of complete carnivoran mitogenomes available in GenBank combined with the newly generated sequences of the two subspecies of P. cristatus, the two subspecies of O. megalotis, Parahyaena brunnea, Speothos venaticus and Vulpes vulpes, plus the sequences extracted from Ultra Conserved Elements (UCE) libraries for Bdeogale nigripes, Fossa fossana, and Viverra tangalunga (see Materials and methods for more details), comprised 142 species or subspecies representing all families of Carnivora. Maximum likelihood (ML) analyses reconstructed a robust mitogenomic phylogeny, with 91.4% of the nodes (128 out of 140) recovered with bootstrap support higher than 95% (Figure 2a). The patristic distances, extracted from the phylogenetic tree inferred with complete mitogenomes between the allopatric subspecies of aardwolf and bat-eared fox, were 0.045 and 0.020 substitutions per site, respectively (Supplementary file 1). These genetic distances are comparable to those observed between different well-defined species of Carnivora, such as the red fox (Vulpes vulpes) and the fennec (V. zerda) (0.029) or the steppe polecat (Mustela eversmanii) and the Siberian weasel (M. sibirica) (0.034) (see Supplementary file 1).
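As an illustration of what a patristic distance measures, the sketch below sums branch lengths along the path joining two tips of a small rooted tree. The tree, its tip names, and its branch lengths are invented for this example and are not taken from the mitogenomic phylogeny.

```python
# Toy rooted tree stored as child -> (parent, branch length in substitutions per site).
# Topology and branch lengths are hypothetical, for illustration only.
TREE = {
    "A": ("n1", 0.010),
    "B": ("n1", 0.012),
    "n1": ("root", 0.005),
    "C": ("root", 0.030),
}


def path_to_root(tip, tree):
    """Return {node: cumulative distance from tip} for the tip and all its ancestors."""
    dist, node, out = 0.0, tip, {tip: 0.0}
    while node in tree:
        parent, length = tree[node]
        dist += length
        out[parent] = dist
        node = parent
    return out


def patristic_distance(tip_a, tip_b, tree):
    """Sum of branch lengths along the path joining two tips."""
    up_a = path_to_root(tip_a, tree)
    up_b = path_to_root(tip_b, tree)
    # Among shared ancestors, the most recent one minimizes the summed distance.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


dist_ab = patristic_distance("A", "B", TREE)
dist_ac = patristic_distance("A", "C", TREE)
```

In practice such distances would be extracted from the inferred ML tree with a phylogenetics library rather than from a hand-written dictionary.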

Figure 2

Representation of the mitochondrial genetic diversity within the Carnivora with (a) the mitogenomic phylogeny inferred from 142 complete Carnivora mitogenomes, including those of the two populations of aardwolf (*Proteles cristatus*) and bat-eared fox (*Otocyon megalotis*) and (b) intraspecific (orange) and the interspecific (red) genetic diversities observed for the two mitochondrial markers COX1 and CYTB.

Silhouettes from http://phylopic.org/.

To further assess the genetic distances between the two pairs of subspecies and compare them to both polymorphism and divergence values observed across Carnivora, two supplemental datasets, including at least two individuals per species, were assembled by retrieving all COX1 and CYTB sequences, which are the two widely sequenced mitochondrial markers for carnivorans, available on GenBank. These datasets include 3,657 COX1 sequences for 150 species and 6,159 CYTB sequences for 203 species of Carnivora, including 5 O. megalotis and 10 P. cristatus individuals, respectively. After adding the corresponding sequences from the newly assembled mitogenomes, ML phylogenetic inference was conducted on each dataset. The patristic distances between all tips of the resulting phylogenetic trees were measured and classified into two categories: (i) intraspecific variation (polymorphism) for distances inferred among individuals of the same species and (ii) interspecific divergence for distances inferred among individuals of different species. Despite an overlap between polymorphism and divergence in both mitochondrial genes, this analysis revealed a threshold between polymorphism and divergence of approximately 0.02 substitutions per site for Carnivora (Figure 2b). With a patristic distance of 0.054 for both COX1 and CYTB, the genetic distance observed between the two subspecies of aardwolf (Proteles ssp.) was higher than the majority of the intraspecific distances observed across Carnivora. However, with patristic distances of 0.020 for COX1 and 0.032 for CYTB, the genetic distances observed between the two subspecies of bat-eared fox (Otocyon ssp.) were clearly in the ambiguous zone and did not provide a clear indication of the specific taxonomic status of these populations.
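The classification step described above can be sketched as follows: each pairwise distance is assigned to the intraspecific or interspecific category according to the species labels of its two tips. The tip names, species labels, and distances are hypothetical toy values; only the approximate 0.02 threshold comes from the text.

```python
def split_distances(distances, species_of):
    """Split pairwise distances into intraspecific and interspecific lists.

    distances:  {(tip_a, tip_b): patristic distance}
    species_of: {tip: species name}
    """
    intra, inter = [], []
    for (tip_a, tip_b), dist in distances.items():
        if species_of[tip_a] == species_of[tip_b]:
            intra.append(dist)
        else:
            inter.append(dist)
    return intra, inter


# Hypothetical toy data: two individuals for each of two species.
species_of = {"x1": "sp_X", "x2": "sp_X", "y1": "sp_Y", "y2": "sp_Y"}
distances = {
    ("x1", "x2"): 0.004, ("y1", "y2"): 0.011,
    ("x1", "y1"): 0.051, ("x1", "y2"): 0.049,
    ("x2", "y1"): 0.053, ("x2", "y2"): 0.050,
}

intra, inter = split_distances(distances, species_of)

# Approximate polymorphism/divergence threshold reported in the text.
THRESHOLD = 0.02
gap_is_clean = max(intra) < THRESHOLD < min(inter)
```

With real data the two distributions overlap, as noted above, so the threshold is read from the histograms rather than derived from a clean gap as in this toy case.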

Finally, to test whether the two pairs of allopatric subspecies diverged synchronously or in two different time periods, Bayesian molecular dating inferences were performed on the 142-taxon ML mitogenomic tree. The resulting divergence times were slightly different depending on the clock model used (strict clock [CL], autocorrelated [LN or TK02] and uncorrelated [UGAM or UCLM]) despite the convergence of the MCMC chains for all models. Cross-validation analyses resulted in the selection of the LN and UGAM models as the models with the best fit based on a higher cross-likelihood score than that of CL (LN and UGAM versus CL mean scores = 35 ± 8). Unfortunately, these two statistically indistinguishable models provided different divergence times for the two pairs of subspecies, with LN favoring a synchronous divergence (approximately 1 Mya [95% credibility interval (CI): 6.72–0.43]; Supplementary file 2), while UGAM favored an asynchronous divergence (~0.6 [CI: 0.83–0.39] Mya for O. megalotis ssp. and ~1.3 [CI: 1.88–0.93] Mya for P. cristatus ssp.; Supplementary file 2). However, the three chains performed with the UGAM model recovered highly similar ages for the two nodes of interest with narrow 95% CIs, whereas the three chains performed with the LN model recovered less similar ages between chains and wide 95% CIs (Supplementary file 2).

Assembling reference genomes from roadkill

Considering the DNA quality and purity required to perform single-molecule sequencing with ONT, a specific protocol to extract DNA from roadkill was developed (Tilak et al., 2020). This protocol was designed to specifically select the longest DNA fragments present in the extract, which also contained short degraded fragments resulting from postmortem DNA degradation processes. This protocol increased the median size of the sequenced raw DNA fragments threefold in the case of aardwolf (Tilak et al., 2020). In total, after high-accuracy basecalling, adapter trimming, and quality filtering, 27.3 Gb of raw Nanopore long reads were sequenced using 16 MinION flow cells for the Southern aardwolf (P. c. cristatus) and 33.0 Gb using 13 flow cells for the Southern bat-eared fox (O. m. megalotis) (Table 1). Owing to quality differences among the extracted tissues for both species, the N50 of the DNA fragment size for P. cristatus (9,175 bp) was about two times higher than the N50 of the DNA fragment size obtained for O. megalotis (4,393 bp). The quality of the reads base-called with the high accuracy option of Guppy was significantly higher than the quality of those translated with the fast option, which led to better assemblies (see Appendix 1—figure 1). Complementary Illumina sequencing returned 522.8 and 584.4 million quality-filtered reads per species corresponding to 129.5 Gb (expected coverage = 51.8 x) and 154.8 Gb (expected coverage = 61.6 x) for P. c. cristatus and O. m. megalotis, respectively. Regarding the resequenced individuals of each species, on average 153.5 Gb were obtained with Illumina resequencing (Table 1).
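Expected coverage is simply the number of sequenced bases divided by the genome size. The sketch below assumes a genome size of 2.5 Gb, which is not stated in the text; it is chosen here because it is consistent with the aardwolf coverage values reported above.

```python
# Assumption: a 2.5 Gb genome size, not stated in the text.
ASSUMED_GENOME_SIZE_GB = 2.5


def expected_coverage(total_bases_gb, genome_size_gb=ASSUMED_GENOME_SIZE_GB):
    """Expected depth-of-coverage = sequenced bases / genome size."""
    return total_bases_gb / genome_size_gb


# Aardwolf (P. c. cristatus) sequencing yields reported in the text.
illumina_aardwolf = expected_coverage(129.5)  # Illumina short reads
nanopore_aardwolf = expected_coverage(27.3)   # Nanopore long reads
```

Under that assumption, the Illumina yield rounds to 51.8x and the Nanopore yield to 10.9x.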

Table 1

Summary of sequencing and assembly statistics of the genomes generated in this study.

Individuals			Illumina				Oxford Nanopore Sequencing					Assembly statistics
Species	Subspecies	Voucher	Raw reads (M)	Cleaned reads (M)	Nbr of gigabases	Estimated coverage	Nbr of flowcells	Nbr of bases (Gb)	N50	Average size	Estimated coverage	Genome size (Gb)	Nbr of scaff.	N50 (kb)	Busco score	OMM genes	Missing data (%)
Proteles cristatus	cristatus	NMB 12641	716.7	522.8	129.50	51.8	16	27.3	9,175	5,555	10.9	2.39	5,669	1,309	92.8	12,062	22.43
Proteles cristatus	cristatus	NMB 12667	663.8	526.1	140.73	56.3	NA					NA				NA
Proteles cristatus	septentrionalis	NMS.Z.2018.54	750.9	516.2	132.44	53.0	NA					NA				12,050	22.96
Otocyon megalotis	megalotis	NMB 12639	710.2	584.4	154.81	61.6	13	33	4,393	3,092	13.2	2.75	11,081	728	92.9	11,981	22.02
Otocyon megalotis	megalotis	NMB 12640	861.2	820	240.71	96.3	NA					NA
Otocyon megalotis	virgatus	FMNH 158128	661.7	554.1	100.30	40.1	NA					NA

The two reference genomes were assembled using MinION long reads and Illumina short reads in combination with MaSuRCA v3.2.9 (Zimin et al., 2013). Hybrid assemblies for both species were obtained with a high degree of contiguity with only 5,669 scaffolds and an N50 of 1.3 Mb for the aardwolf (P. cristatus) and 11,081 scaffolds and an N50 of 728 kb for the bat-eared fox (O. megalotis) (Table 1). Our two new genomes compared favorably with the available carnivoran genome assemblies in terms of (i) contiguity showing slightly less than the median N50 and a lower number of scaffolds than the majority of the other assemblies (Appendix 1—figure 2, Supplementary file 3) and (ii) completeness showing high BUSCO scores (see Appendix 1—figure 3 and Supplementary file 4 for BUSCO score comparisons among carnivoran genomes). Comparison of two hybrid assemblies with Illumina-only assemblies obtained with SOAPdenovo illustrated the positive effect of introducing Nanopore long reads even at moderate coverage by reducing the number of scaffolds from 409,724 to 5,669 (aardwolf) and from 433,209 to 11,081 (bat-eared fox), while increasing the N50 from 17.3 kb to 1.3 Mb (aardwolf) and from 22.3 kb to 728 kb (bat-eared fox).
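For readers unfamiliar with the contiguity metric used here, the N50 is the scaffold length such that scaffolds of that length or longer contain at least half of the assembled bases. A minimal sketch, using invented scaffold lengths:

```python
def n50(lengths):
    """Return the N50 of a list of scaffold (or read) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("cannot compute N50 of an empty list")


# Hypothetical scaffold lengths (arbitrary units), total = 290.
toy_scaffolds = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_scaffolds)  # 80 + 70 = 150 >= 145, so N50 = 70
```

The same function applies to read lengths, which is how the N50 of the Nanopore fragment sizes would be obtained.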

Genome-wide analyses of population structure and differentiation

To evaluate the population structure between the subspecies of P. cristatus and O. megalotis, the number of shared heterozygous sites, unique heterozygous sites, and homozygous sites between individuals was computed to estimate an FST-like statistic (hereafter called the genetic differentiation index or GDI, see Materials and methods and Appendix for more details). Since we were in possession of two individuals for the Southern subspecies and only one for the Eastern subspecies of both species, the genetic differentiation between the two individuals within the Southern subspecies and between the Southern and Eastern subspecies was computed. To account for the variation across the genome, 10 replicates of 100 regions with a length of 100 kb were randomly chosen to estimate genetic differentiation. Interestingly, in both species, the mean heterozygosity was higher in the Southern subspecies than in the Eastern subspecies. For the aardwolf, the mean heterozygosity was 0.189 per kb (sd = 0.010) in the Southern population and 0.121 per kb (sd = 0.008) in the Eastern population. For the bat-eared fox, the mean heterozygosity was 0.209 per kb (sd = 0.013) in the Southern population and 0.127 per kb (sd = 0.003) in the Eastern population. This heterozygosity level is low compared to that of other large mammals (Díez-Del-Molino et al., 2018) and is comparable to that of the Iberian lynx, the cheetah or the brown hyaena, which have notoriously low genetic diversity (Abascal et al., 2016; Casas-Marce et al., 2013; Westbury et al., 2018).
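The replicate sampling scheme (10 replicates of 100 regions of 100 kb) can be sketched as below. This is an illustrative assumption about how such windows might be drawn, not the procedure actually used: scaffold names and lengths are invented, windows are drawn with replacement and may overlap, and the heterozygosity helper only converts a count of heterozygous sites into a per-kb rate.

```python
import random


def sample_windows(scaffold_lengths, n_windows=100, window=100_000, rng=None):
    """Draw random (scaffold, start, end) windows from scaffolds long enough to hold one."""
    rng = rng or random.Random()
    eligible = [(name, length) for name, length in scaffold_lengths.items()
                if length >= window]
    # Weight each scaffold by its number of possible window start positions.
    weights = [length - window + 1 for _, length in eligible]
    picks = rng.choices(eligible, weights=weights, k=n_windows)
    windows = []
    for name, length in picks:
        start = rng.randint(0, length - window)  # inclusive bounds
        windows.append((name, start, start + window))
    return windows


def heterozygosity_per_kb(n_het_sites, n_callable_sites):
    """Heterozygous sites per kilobase of callable sequence."""
    return 1000 * n_het_sites / n_callable_sites


# Hypothetical scaffold lengths in bp; scaffold_3 is too short to host a window.
scaffolds = {"scaffold_1": 1_300_000, "scaffold_2": 728_000, "scaffold_3": 60_000}
rng = random.Random(42)
replicates = [sample_windows(scaffolds, rng=rng) for _ in range(10)]
```

Each replicate would then yield one estimate of heterozygosity and of genetic differentiation, and the mean and standard deviation across the 10 replicates summarize the variation across the genome.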

Since we had very limited power to fit the evolution of the genetic differentiation statistics with a hypothetical demographic scenario because of our limited sample size (n = 3), we chose a comparative approach and applied the same analyses to four well-defined species pairs of carnivorans, for which similar individual sampling was available. The genetic differentiation estimates between the two individuals belonging to the same subspecies (Southern populations in both cases) were on average equal to 0.005 and 0.014 for P. c. cristatus and O. m. megalotis, respectively. This indicated that the polymorphism observed in the two individuals within the Southern subspecies of each species was comparable (genetic differentiation index close to 0) and thus that these two subpopulations are likely panmictic (Figure 3). In contrast, the genetic differentiation estimates for the two pairs of individuals belonging to the different subspecies were respectively equal on average to 0.533 and 0.294 for P. cristatus ssp. and O. megalotis ssp., indicating that the two disjunct populations are genetically structured. To contextualize these results, the same genetic differentiation measures were estimated using three individuals for four other well-defined species pairs (Figure 3). First, the comparison of the polymorphism of two individuals of the same species led to intraspecific GDIs ranging from 0.029 on average for polar bear (Ursus maritimus) to 0.137 for lion (Panthera leo). As expected, comparing the polymorphisms of two individuals between closely related species led to a higher interspecific GDI ranging from 0.437 on average for the wolf/golden jackal (Canis lupus/Canis aureus) pair to 0.760 for the lion/leopard (Panthera leo/P. pardus) pair (Figure 3). The genetic differentiation indexes between the gray wolf (C. lupus) and the golden jackal (C. aureus) averaged 0.44, indicating that the two subspecies of aardwolf (GDI = 0.533) are genetically more differentiated than these two well-defined species, and only slightly less differentiated than the brown bear (Ursus arctos) and the polar bear (Ursus maritimus). Conversely, the genetic differentiation obtained between the bat-eared fox subspecies (GDI = 0.294) was lower than the genetic differentiation estimates obtained for any of the four reference species pairs evaluated here (Figure 3). We verified that differences in depth-of-coverage among individuals did not bias our genetic differentiation estimates by subsampling reads at 15x (Figure 3—figure supplement 1). We also checked that randomly sampling only three individuals was enough to accurately estimate genetic differentiation in the case of the brown vs. polar bear comparison (Figure 3—figure supplement 2).
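The comparative reasoning can be made explicit with the values quoted in this paragraph. The sketch below places each subspecies pair relative to the lowest interspecific GDI among the reference species pairs; using that lowest value as a cut-off is an illustrative simplification, not a formal criterion from the study.

```python
# Interspecific GDI bounds quoted in the text for the reference species pairs.
reference_pairs = {
    "wolf / golden jackal": 0.437,  # lowest interspecific value reported
    "lion / leopard": 0.760,        # highest interspecific value reported
}

# Mean GDI between subspecies, as quoted in the text.
candidates = {
    "aardwolf subspecies": 0.533,
    "bat-eared fox subspecies": 0.294,
}

lowest_reference = min(reference_pairs.values())

# True when a subspecies pair is at least as differentiated as
# the least differentiated pair of well-defined species.
exceeds_lowest = {name: gdi > lowest_reference for name, gdi in candidates.items()}
```

Under this simplified rule, the aardwolf pair falls within the interspecific range while the bat-eared fox pair falls below it, matching the interpretation given above.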

Figure 3

Genetic differentiation indexes obtained from a comparison of intraspecific (orange values) and interspecific (red values) polymorphisms in four pairs of well-defined Carnivora species and for the subspecies of aardwolf (*Proteles cristatus*) and bat-eared fox (*Otocyon megalotis*) (gray values).

Silhouettes from http://phylopic.org/.

Effective population size reconstructions

We used the pairwise sequential Markovian coalescent (PSMC) model to estimate the ancestral effective population size (Ne) trajectory over time for each sequenced individual. For both the aardwolf and the bat-eared fox, the individual from Eastern African populations showed a continuous decrease in Ne over time, leading to the recent Ne being lower than that in Southern African populations (Figure 4). This is in agreement with the lower heterozygosity observed in the Eastern individuals of both species. For the bat-eared fox, the trajectories of the three sampled individuals were synchronized approximately 200 kya (Figure 4a), which could correspond to the time of divergence between the Southern and Eastern populations. In contrast, Ne trajectories for the aardwolf populations did not synchronize over the whole period (~2 Myrs). Interestingly, the Southern populations of both species showed a marked increase in population size between ~10 and 30 kya before sharply decreasing in more recent times (Figure 4).
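PSMC reports time and population size in scaled units, which are converted to years and to Ne with a mutation rate and a generation time. The sketch below applies the usual PSMC scaling with the mutation rate and generation time given in the Figure 4 legend. The bin size of 100 bp is assumed here as the PSMC default, and the theta0, scaled-time, and scaled-size values are invented for illustration.

```python
MU = 1e-8        # mutations per site per generation (Figure 4 legend)
GEN_TIME = 2     # generation time in years (Figure 4 legend)
BIN_SIZE = 100   # assumption: PSMC default bin size in bp, not stated in the text


def rescale_psmc(theta0, t_scaled, lambda_scaled, mu=MU, g=GEN_TIME, s=BIN_SIZE):
    """Convert scaled PSMC output to years and effective population sizes.

    N0 = theta0 / (4 * mu * s); time in years = 2 * N0 * t * g; Ne = N0 * lambda.
    """
    n0 = theta0 / (4 * mu * s)
    years = [2 * n0 * t * g for t in t_scaled]
    ne = [n0 * lam for lam in lambda_scaled]
    return years, ne


# Hypothetical scaled output, chosen so that N0 = 100 for easy checking.
years, ne = rescale_psmc(theta0=0.0004, t_scaled=[0.0, 0.5, 1.0],
                         lambda_scaled=[1.0, 2.0, 0.5])
```

Because both axes depend on the chosen mutation rate and generation time, absolute dates such as the approximate 200 kya synchronization shift if either assumption changes, whereas the relative shapes of the trajectories do not.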

Figure 4

PSMC estimates of changes in effective population size over time for the Eastern (orange) and Southern (blue and purple) populations of (a) bat-eared fox and (b) aardwolf.

mu = mutation rate of 10⁻⁸ mutations per site per generation and g = generation time of 2 years. Vertical red lines indicate 20 kyrs and 40 kyrs. Silhouettes from http://phylopic.org/.

Phylogenomics of the Carnivora

Phylogenetic relationships within the Carnivora were inferred from a phylogenomic dataset comprising 52 carnivoran species (including the likely new Proteles septentrionalis species), representing all but two families of the Carnivora (Nandiniidae and Prionodontidae). The non-annotated genome assemblies of these different species were annotated with a median of 18,131 functional protein-coding genes recovered for each species. Then, single-copy orthologous gene identification resulted in a median of 12,062 out of the 14,509 single-copy orthologues extracted from the OrthoMaM database for each species, ranging from a minimum of 6,305 genes for the California sea lion (Zalophus californianus) to a maximum of 13,808 for the dog (Canis familiaris) (Supplementary file 5). Our new hybrid assemblies allowed the recovery of 12,062 genes for the Southern aardwolf (P. c. cristatus), 12,050 for the Eastern aardwolf (P. c. septentrionalis), and 11,981 for the Southern bat-eared fox (O. m. megalotis) (Table 1). These gene sets were used to create a supermatrix consisting of 14,307 genes representing a total of 24,041,987 nucleotide sites with 6,495,611 distinct patterns (27.0%) and 22.8% gaps or undetermined nucleotides.
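The supermatrix summary statistics quoted above (number of sites, distinct site patterns, and proportion of gaps or undetermined nucleotides) can be illustrated on a toy alignment. The sketch assumes that a distinct pattern is a unique alignment column, which is the usual definition, and that `-`, `N`, and `?` mark missing data; the three sequences are invented.

```python
def supermatrix_stats(alignment, missing_chars="-N?"):
    """Return (n_sites, n_distinct_patterns, missing_fraction) for an alignment.

    alignment: {taxon name: aligned sequence}, all sequences of equal length.
    """
    seqs = list(alignment.values())
    n_sites = len(seqs[0])
    if any(len(seq) != n_sites for seq in seqs):
        raise ValueError("sequences must all have the same aligned length")
    columns = list(zip(*seqs))          # one tuple of characters per site
    n_patterns = len(set(columns))      # unique columns = distinct site patterns
    n_missing = sum(ch in missing_chars for seq in seqs for ch in seq)
    return n_sites, n_patterns, n_missing / (n_sites * len(seqs))


# Hypothetical three-taxon alignment.
toy_alignment = {
    "sp1": "ACGTAC-A",
    "sp2": "ACGTACNA",
    "sp3": "ACTTAC-A",
}
n_sites, n_patterns, missing_fraction = supermatrix_stats(toy_alignment)
```

For the actual supermatrix, 6,495,611 distinct patterns out of 24,041,987 sites corresponds to the 27.0% reported above.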

Phylogenomic inference was first performed on the whole supermatrix using ML. The resulting phylogenetic tree was highly supported, with all but one node being supported by maximum bootstrap (UFBS) values (Figure 5). To further dissect the phylogenetic signal underlying this ML concatenated topology, we measured gene concordance (gCF) and site concordance (sCF) factors to complement traditional bootstrap node-support values. For each node, the proportion of genes (gCF) or sites (sCF) that supported the node inferred with the whole supermatrix was compared to the proportion of the genes (gDF) or sites (sDF) that supported an alternative resolution of the node (Figure 5). Finally, a coalescent-based approximate species tree inference was performed using ASTRAL-III based on individual gene trees. Overall, the three different analyses provided well-supported and almost identical results (Figure 5). The order Carnivora was divided into two distinct suborders: a cat-related clade (Feliformia) and a dog-related clade (Caniformia). Within the Feliformia, the first split separated the Felidae (felids) from the Viverroidea, a clade composed of the four families Viverridae (civets and genets), Eupleridae (fossa), Herpestidae (mongooses), and Hyaenidae (hyaenas). In hyaenids, the two species of termite-eating aardwolves (P. cristatus and P. septentrionalis) were the sister-group of a clade composed of the carnivorous spotted (Crocuta crocuta) and striped (Hyaena hyaena) hyaenas. Congruent phylogenetic relationships among Feliformia families and within hyaenids were also retrieved with the mitogenomic data set (Figure 2a). 
The short internal nodes of the Felidae were the principal source of incongruence among the three different analyses with concordance factor analyses pointing to three nodes for which many sites and genes support alternative topologies (Figure 5), including one node for which the coalescent-based approximate species tree inference supported an alternative topology to the one obtained with ML on the concatenated supermatrix. In the Viverroidea, the Viverridae split early from the Herpestoidea, regrouping the Hyaenidae, Herpestidae, and Eupleridae, within which the Herpestidae and Eupleridae formed a sister clade to the Hyaenidae. Within the Caniformia the Canidae (canids) was recovered as a sister group to the Arctoidea. Within the Canidae, in accordance with the mitogenomic phylogeny, the Vulpini tribe, represented by O. megalotis and V. vulpes, was recovered as the sister clade of the Canini tribe, represented here by Lycaon pictus and C. familiaris. The Arctoidea was recovered as a major clade composed of eight families grouped into three subclades: Ursoidea (Ursidae), Pinnipedia (Otariidae, Odobenidae, and Phocidae), and Musteloidea, composed of Ailuridae (red pandas), Mephitidae (skunks), Procyonidae (raccoons), and Mustelidae (badgers, martens, weasels, and otters). Within the Arctoidea, the ML phylogenetic inference on the concatenation provided support for grouping the Pinnipedia and the Musteloidea to the exclusion of the Ursidae (bears) with maximum bootstrap support (Figure 5), as in the mitogenomic tree (Figure 2a). However, the concordance factor analyses revealed that many sites and many genes actually supported alternative topological conformations for this node characterized by a very short branch length (sCF = 34.1, sDF1 = 29.2, sDF2 = 36.7, gCF = 46.9, gDF1 = 18.6, gDF2 = 18.2, gDFP = 16.3) (Figure 5).
In the Pinnipedia, the clade Odobenidae (walruses) plus Otariidae (eared seals) was recovered to the exclusion of the Phocidae (true seals), which was also in agreement with the mitogenomic scenario (Figure 2a). Finally, within the Musteloidea, the Mephitidae represented the first offshoot, followed by the Ailuridae, and a clade grouping the Procyonidae and the Mustelidae. Phylogenetic relationships within Musteloidea were incongruent with the mitogenomic tree, which alternatively supported the grouping of the Ailuridae and the Mephitidae (Figure 2a).
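The concordance and discordance factors reported for the Pinnipedia plus Musteloidea node can be read as percentages that partition the decisive sites (sCF, sDF1, sDF2) and gene trees (gCF, gDF1, gDF2, gDFP). The sketch below checks that partition with the values quoted in the text and identifies which resolution is backed by the most sites; the interpretation of the factors as complementary percentages follows their standard definition and is assumed here.

```python
# Concordance and discordance factors (in %) quoted in the text for the
# node grouping Pinnipedia and Musteloidea.
node = {
    "sCF": 34.1, "sDF1": 29.2, "sDF2": 36.7,
    "gCF": 46.9, "gDF1": 18.6, "gDF2": 18.2, "gDFP": 16.3,
}

site_keys = ("sCF", "sDF1", "sDF2")
gene_keys = ("gCF", "gDF1", "gDF2", "gDFP")

site_total = sum(node[k] for k in site_keys)
gene_total = sum(node[k] for k in gene_keys)

# Resolution supported by the largest share of sites and of gene trees.
best_site = max(site_keys, key=node.get)
best_gene = max(gene_keys, key=node.get)
```

With these values, the concatenated topology is the best supported resolution among gene trees, whereas slightly more sites favor the second alternative, which illustrates why this short branch is flagged as uncertain despite maximum bootstrap support.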

Figure 5

Phylogenomic tree reconstructed from the nucleotide supermatrix composed of 14,307 single-copy orthologous genes for 52 species of Carnivora plus one outgroup (*Manis javanica*).

The family names in the legend are ordered as in the phylogeny. Silhouettes from http://phylopic.org/.

Discussion

High-quality mammalian genomes from roadkill using MaSuRCA hybrid assembly

With an increasing number of species being threatened worldwide, obtaining genomic resources from mammalian wildlife can be difficult. We decided to test the potential of using roadkill samples, an abundant and valuable resource in ecological studies (Schwartz et al., 2020) but a currently underexploited source material for genomics (Etherington et al., 2020; Maigret, 2019). Roadkill are indeed relatively easy to survey and the potential coordination with ongoing monitoring and citizen science projects (e.g. Périquet et al., 2018; Waetjen and Shilling, 2017) could potentially give access to large numbers of tissue samples for frequently encountered species. Even though roadkill may represent a biased sample of species populations (Brown and Bomberger Brown, 2013; Loughry and McDonough, 1996), they can also be relevant to generate reference genomes for elusive species that could hardly be sampled otherwise. Despite limited knowledge and difficulties associated with de novo assembly of non-model species (Etherington et al., 2020), we designed a protocol to produce DNA extracts of suitable quality for Nanopore long-read sequencing from roadkill (Tilak et al., 2020). Additionally, we tested the impact of the accuracy of the MinION basecalling step on the quality of the resulting MaSuRCA hybrid assemblies. In line with previous studies (Wenger et al., 2019; Wick et al., 2019), we found that using the high accuracy option rather than the fast option of Guppy 3.1.5 leads to more contiguous assemblies by increasing the N50 value. By relying on this protocol, we were able to generate two hybrid assemblies by combining Illumina reads at relatively high coverage (50-60x) and MinION long reads at relatively moderate coverage (11-13x), which provided genomes with high contiguity and completeness. These represent the first two mammalian genomes obtained with such a hybrid Illumina/Nanopore approach using the MaSuRCA assembler for non-model carnivoran species: the aardwolf (P. cristatus) and the bat-eared fox (O. megalotis). Despite the use of roadkill samples, our assemblies compare favorably, in terms of both contiguity and completeness, with the best carnivoran genomes obtained so far from classical genome sequencing approaches that do not rely on complementary optical mapping or chromatin conformation approaches. Overall, our carnivoran hybrid assemblies are fairly comparable to those obtained using the classic Illumina-based genome sequencing protocol involving the sequencing of both paired-end and mate-paired libraries (Li et al., 2010). The benefit of adding Nanopore long reads is demonstrated by the fact that our hybrid assemblies are of better quality than all the draft genome assemblies generated using the DISCOVAR de novo protocol based on a PCR-free single Illumina 250 bp paired-end library (Weisenfeld et al., 2014) used in the 200 Mammals Project of the Broad Genome Institute (Zoonomia Consortium, 2020). These results confirm the capacity of the MaSuRCA hybrid assembler to produce high quality assemblies for large and complex genomes by leveraging the power of long Nanopore reads (Wang et al., 2020). Moreover, these two hybrid assemblies could form the basis for future chromosome-length assemblies by adding complementary HiC data (van Berkum et al., 2010) as proposed in initiatives such as the Vertebrate Genome Project (Koepfli et al., 2015) and the DNA Zoo (Dudchenko et al., 2017). Our results demonstrate the feasibility of producing high-quality mammalian genome assemblies at moderate cost (5,000–10,000 USD for each of our Carnivora genomes) using roadkill and should encourage genome sequencing of non-model mammalian species in ecology and evolution laboratories.

Genomic evidence for two distinct species of aardwolves

The mitogenomic distances inferred between the subspecies of O. megalotis and P. cristatus were comparable to those observed for other well-defined species within the Carnivora. Furthermore, by comparing the genetic diversity between several well-defined species (divergence) and several individuals of the same species (polymorphism) based on the COX1 and CYTB genes across Carnivora, we were able to pinpoint a threshold of approximately 0.02 substitutions per base separating divergence from polymorphism, which is in accordance with a recent study of naturally occurring hybrids in Carnivora (Allen et al., 2020). This method, also known as the barcoding-gap method (Meyer and Paulay, 2005), allowed us to show that the two subspecies of P. cristatus present a genetic divergence greater than the threshold, whereas the divergence is slightly lower for the two subspecies of O. megalotis. These results seem to indicate that the subspecies P. c. septentrionalis should be elevated to species level (P. septentrionalis). Conversely, for O. megalotis, this first genetic indicator seems to confirm the distinction at the subspecies level. However, mitochondrial markers have some well-identified limitations (Galtier et al., 2009), and it is difficult to properly determine a threshold between polymorphism and divergence across the Carnivora. The measure of mtDNA sequence distances can thus be seen only as a first useful indicator for species delineation. The examination of variation at multiple genomic loci in a phylogenetic context, combined with morphological, behavioral and ecological data, is required to establish accurate species boundaries.

The newly generated reference genomes allowed us to perform genome-wide evaluation of the genetic differentiation between subspecies using short-read resequencing data of a few additional individuals of both species. Traditionally, the reduction in polymorphism in two subdivided populations (p within) compared to the population at large (p between) is measured with several individuals per population (FST; Hudson et al., 1992). However, given that the two alleles of one individual are the results of the combination of two a priori non-related individuals of the population (i.e. the parents), with a large number of SNPs, the measurement of heterozygosity can be extended to estimation of the (sub)population polymorphism. Furthermore, in a panmictic population with recombination along the genome, different chromosomal regions can be considered to be independent and can be used as replicates for heterozygosity estimation. In this way, genome-wide analyses of heterozygosity provide a way to assess the level of polymorphism in a population and a way to compare genetic differentiation between two populations. If we hypothesize that the two compared populations are panmictic, picking one individual or another of the population has no effect (i.e. there is no individual with excess homozygous alleles due to mating preference across the population), and the population structure can be assessed by comparing the heterozygosity of the individuals of each population compared to the heterozygosity observed for two individuals of the same population (see Methods). Such an index of genetic differentiation, by measuring the level of population structure, could provide support to establish accurate species boundaries. In fact, delineating species has been and still is a complex task in evolutionary biology (Galtier, 2019; Ravinet et al., 2016; Roux et al., 2016). 
Given that accurately defining the species taxonomic level is essential for a number of research fields, such as macroevolution (Faurby et al., 2016) or conservation (Frankham et al., 2012), defining thresholds to discriminate between populations or subspecies in different species is an important challenge in biology. However, due to the disagreement on the definition of species, the different routes of speciation observed in natura and the different amounts of data available among taxa, adapting a standardized procedure for species delineation seems complicated (Galtier, 2019).
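The comparison of within-population and between-population polymorphism described above can be illustrated with a short FST-like calculation. This is only a minimal sketch under stated assumptions: the function name, the per-window heterozygosity values, and the use of a simple mean across windows are all hypothetical, and the ratio shown is the classical (between minus within) over between form rather than necessarily the exact GDI estimator used in this study.

```python
from statistics import mean


def differentiation_index(pi_within, pi_between):
    """Return the FST-like ratio (pi_between - pi_within) / pi_between."""
    if pi_between <= 0:
        raise ValueError("pi_between must be positive")
    return (pi_between - pi_within) / pi_between


# Hypothetical heterozygosity values per genomic window (not data from the study).
# Windows are treated as independent replicates, as assumed for a panmictic
# population with recombination along the genome.
within_pop_a = [0.0010, 0.0012, 0.0008]
within_pop_b = [0.0011, 0.0009, 0.0010]
between_pops = [0.0040, 0.0038, 0.0042]

pi_within = mean(within_pop_a + within_pop_b)
pi_between = mean(between_pops)
index = differentiation_index(pi_within, pi_between)
print(f"pi_within={pi_within:.4f} pi_between={pi_between:.4f} index={index:.2f}")
```

An index near zero would indicate that two populations share most of their polymorphism, whereas values approaching one would indicate strong differentiation.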

As proposed by Galtier, 2019, we decided to test the taxonomic level of the P. cristatus and O. megalotis subspecies by comparing the genetic differentiation observed between Eastern and Southern populations within these species to the genetic differentiation measured for well-defined Carnivora species. Indeed, estimation of the genetic differentiation either within well-defined species (polymorphism) or between two closely related species (divergence) allowed us to define a threshold between genetic polymorphism and genetic divergence across the Carnivora (Figure 5, see Materials and methods). Given these estimates, and in accordance with mitochondrial data, the two subspecies of P. cristatus (1) present more genetic differentiation between each other than the two well-defined species of golden jackal (Canis aureus) and wolf (C. lupus), and (2) present more genetic differentiation than the most polymorphic species in the dataset, the lion (Panthera leo). Despite known cases of natural hybridization reported between C. aureus and C. lupus (Galov et al., 2015; Gopalakrishnan et al., 2018), the taxonomic rank of these two species is well accepted. In that sense, given the species used as a reference, both subspecies of P. cristatus seem to deserve to be elevated to species level. The situation is less clear regarding the subspecies of O. megalotis. Indeed, while the genetic differentiation observed between the two subspecies is significantly higher than the polymorphic distances observed for all the well-defined species of the dataset, no pair of closely related species in our dataset exhibits an equivalent or lower genetic divergence. This illustrates the limits of delineating closely related species due to the continuous nature of the divergence process (De Queiroz, 2007). The subspecies of O. 
megalotis fall into the ‘gray zone’ of the speciation continuum (De Queiroz, 2007; Roux et al., 2016) and are likely undergoing incipient speciation due to their vicariant distributions. To be congruent with the genetic divergence observed across closely related species of the Carnivora (according to our dataset), we thus propose that (1) the taxonomic level of the P. cristatus subspecies be reconsidered by elevating the two subspecies P. c. cristatus and P. c. septentrionalis to species level, and (2) the taxonomic level for the two subspecies of O. megalotis be maintained.

Although there is a distinct genetic difference between Eastern and Southern aardwolves, the evidence for a clear morphological difference is less obvious (Figure 6, Appendix 2—figures 1, 2, Supplementary file 6, Supplementary file 7). The earliest available name for the East African aardwolf subspecies is P. c. septentrionalis (Rothschild, 1902). This subspecies was first distinguished based on pelage characteristics of a specimen from Somaliland, which has a creamy white pelage without any gray tinge, but washed slightly with buff in the neck and side of the rump (Rothschild, 1902). Also, the striping pattern is less well defined and breaks up into spots on the neck. In contrast, the Southern aardwolf subspecies P. c. cristatus was described as ashy gray, front and sides of neck grayish white, black stripes broad and well defined (Rothschild, 1902). Drake-Brockman, 1910 also described Somali aardwolves as pale buff with a dark grayish-buff head, but Cabrera, 1910 was the first to ascribe diagnostic characters to distinguish between the Eastern and Southern populations. He described a new subspecies P. c. pallidior from Suakin (Sudan) as a very pale yellowish cream, almost white ventrally and on the forehead. This contrasts with the grizzled gray of the forehead of P. c. cristatus (Figure 6). Cabrera, 1910 also described how the fur of P. c. pallidior is unicolored and lacks the brown base of P. c. cristatus. This latter character appears to be consistent in an Ethiopian specimen compared with three skins of Namibian and South African origin in the collections of National Museums Scotland, although it would appear to be a difference in the coloration of the underfur. However, a further specimen from Zimbabwe also has pale underfur. 
In reviewing georeferenced photographs of aardwolves from throughout the range, the striping pattern appeared to be variable, but overall East African specimens tended to be paler, with more contrasting stripes with a pale forehead compared with the longer, grayer or ochre-gray fur in Southern African specimens, which have less distinctive stripes (A.C.K. pers. obs.). However, fur length and hence stripe distinctiveness may just be a phenotypic response to lower temperatures at higher latitudes compared with equatorial East African specimens. Cabrera, 1910 also proposed differences in a skull measurement between Eastern and Southern African aardwolves. Three specimens from Eastern Africa had a wider inter-orbital breadth than two from Southern Africa. However, his measurements also showed that Eastern African aardwolves have larger postorbital breadths, brain case widths, and maxillary widths at the canines. Adding in measurements of skulls from the literature (Allen et al., 1909; Heller, 1913; Hollister, 1924; Roberts, 1951; Roberts, 1932) confirmed that postorbital breadth is significantly greater in P. c. septentrionalis than P. c. cristatus but revealed no significant differences between other skull measurements including condylobasal length of skull (Appendix 2—figure 2, Supplementary file 7). However, as noted above from skins, sample sizes are very limited and thus these morphological differences remain tentative subject to examination of a much larger sample with more powerful geometric morphometrics methods. These preliminary observations should nevertheless prompt a deeper investigation of morphological and behavioural differences that have been reported between the two proposed subspecies of aardwolf to formally validate our newly proposed taxonomic arrangement. 
Our results might also have conservation implications, as the status of the two distinct aardwolf species will have to be re-evaluated separately in the International Union for Conservation of Nature (IUCN) Red List of Threatened Species (IUCN, 2020).

Figure 6

Phenotypic comparisons, highlighting the differences in fur coloration and stripe pattern, between captive individuals of Eastern (P. septentrionalis) and Southern (P. cristatus) aardwolves held at Hamerton Zoo Park (UK).

All pictures copyright and used with permission from Robb Cadd.

Population size variation and environmental change

The Pairwise Sequentially Markovian Coalescent (PSMC) analyses revealed that the Southern and Eastern African populations have different effective population-size estimates over time, confirming that they have been genetically isolated for several thousand years, with this isolation being more pronounced for the aardwolf than for the bat-eared fox. This supports the hypothesis of two separate events leading to the same disjunct distributions for the two taxa, in accordance with mitochondrial dating. Nevertheless, the population trends are rather similar and are characterized by continuous declines between 1 Mya and 100–200 kya, followed by an increase between 30 and 10 kya that is much more pronounced in the Southern populations of both species. The similar trajectories exhibited by both species suggest that they were under the influence of similar environmental factors, such as climate and vegetation variations.

Aardwolves and bat-eared foxes live in open environments including short-grass plains, shrubland, and open and tree savannas, and both are highly dependent on herbivorous termites for their diet. Therefore, the fluctuation of their populations could reflect the evolution of these semi-arid ecosystems, which determines prey abundance, during the last million years. However, the global long-term Plio-Pleistocene African climate is still debated. For Eastern Africa, some studies have suggested an evolution toward increased aridity (deMenocal, 2004; deMenocal, 1995), whereas others have proposed the opposite (Grant et al., 2017; Maslin et al., 2014; Trauth et al., 2009). The population declines we inferred are more consistent with the latter hypothesis, as a global long-term tendency toward a wetter climate in East Africa could have been less favorable for species living in open environments.

Southern populations exhibit a similar decreasing trend between 1 Mya and 100 kya. Once again, the relevant records appear contradictory. This could be the result of regional variation across South Africa, with aridification in the Southwestern part and wetter conditions in the Southeast (Caley et al., 2018; Johnson et al., 2016). Finally, the 30–10 kya period appears to have been more humid (Chase et al., 2019; Chevalier and Chase, 2015; Lim et al., 2016). This seems inconsistent with the large population increase detected in Southern populations of both species; however, the large regions of the Namib Desert that are currently unsuitable could have been more favorable in wetter conditions.

The global decrease in population size detected in the Southern and Eastern populations could also reflect the fragmentation of a continuous ancestral range. The global trend toward a wetter climate may have favored the development of the tropical rainforest in central Africa, creating a belt of unsuitable habitat. This is in line with previous studies describing diverse biogeographical scenarios involving the survival and divergence of ungulate populations in isolated savanna refuges during Pleistocene climatic oscillations (Lorenzen et al., 2012). In this respect, it could be interesting to study population trends in other species living in semi-arid environments and having a similar range as disconnected populations. Interestingly, several bird species also have similar distributions including the Orange River francolin (Scleroptila gutturalis), the greater kestrel (Falco rupicoloides), the double-banded courser (Smutsornis africanus), the red-fronted tinkerbird (Pogoniulus pusillus), the Cape crow (Corvus capensis), and the black-faced waxbill (Estrilda erythronotos), supporting the role of the environment in the appearance of these disjunct distributions. Finally, these new demographic results, showing recent population size declines in both regions in both species, might be taken into account when assessing the conservation status of the two distinct aardwolf species and bat-eared fox subspecies.

Genome-scale phylogeny of the Carnivora

In this study, we provide a new phylogeny of Carnivora including the newly recognized species of aardwolf (P. septentrionalis). The resulting phylogeny is fully resolved with all nodes supported with UFBS values greater than 95% and is congruent with previous studies (Doronina et al., 2015; Eizirik et al., 2010; Figure 5). Across the Carnivora, the monophyly of all superfamilies is strongly supported (Flynn et al., 2010), and these superfamilies are divided into two distinct suborders: a cat-related clade (Feliformia) and a dog-related clade (Caniformia). On the one hand, within the Feliformia, the different families and their relative relationships are well supported and are in accordance with previous studies (Eizirik et al., 2010). There is one interesting point regarding the Felidae. While almost all the nodes of the phylogeny were recovered as strongly supported from the three phylogenetic inference analyses (ML inferences, concordance factor analyses and coalescent-based inferences), one third of the nodes (three out of nine) within the Felidae show conflicting node support. This result is not surprising and is consistent with previous studies arguing for ancient hybridization among the Felidae (Li et al., 2019; Li et al., 2016). Another interesting point regarding the Feliformia and particularly the Hyaenidae is the relationship of the two aardwolves. The two species, P. cristatus and P. septentrionalis, form a sister clade to the clade composed of the striped hyaena (H. hyaena) and the spotted hyaena (C. crocuta), in accordance with previous studies (Koepfli et al., 2006; Westbury et al., 2018) and the two subfamilies Protelinae and Hyaeninae that have been proposed for these two clades, respectively. 
However, although the phylogenetic inferences based on the supermatrix of 14,307 single-copy orthologues led to a robust resolution of this node according to the bootstrap supports, both concordance factors and coalescent-based analyses revealed conflicting signals with support for alternative topologies. In this sense, the description and acceptance of the Hyaeninae and Protelinae subfamilies still require further analyses, including genomic data for the brown hyaena (Parahyaena brunnea) (Westbury et al., 2018).

On the other hand, within the Caniformia, the first split separates the Canidae from the Arctoidea. Within the Canidae the bat-eared fox (O. megalotis) is grouped with the red fox (Vulpes vulpes) and the other representative of the Vulpini, but with a very short branch, and concordance analyses indicate conflicting signals on this node. Regarding the Arctoidea, historically the relationships between the three superfamilies of arctoids have been contradictory and debated. The least supported scenario from the literature is that in which the clade Ursoidea/Musteloidea is a sister group of the Pinnipedia (Flynn and Nedbal, 1998). Based on different types of phylogenetic characters, previous studies found support for both the clade Ursoidea/Pinnipedia (Agnarsson et al., 2010; Meredith et al., 2011; Rybczynski et al., 2009) and the clade Pinnipedia/Musteloidea (Arnason et al., 2007; Eizirik et al., 2010; Flynn et al., 2005; Sato et al., 2009; Sato et al., 2006; Schröder et al., 2009). However, investigations of the insertion patterns of retroposed elements revealed the occurrence of incomplete lineage sorting (ILS) at this node (Doronina et al., 2015). With a phylogeny inferred from 14,307 single-copy orthologous genes, our study, based on both gene trees and supermatrix approaches, gives support to Pinnipedia/Musteloidea excluding the Ursoidea as the best supported conformation for the Arctoidea tree (Doronina et al., 2015; Eizirik et al., 2010; Sato et al., 2006). Interestingly, in agreement with Doronina et al., 2015, our concordance factor analysis supports the idea that the different conformations of the Arctoidea tree are probably due to incomplete lineage sorting by finding almost the same number of sites supporting each of the three conformations (34.11%, 29.61%, and 36.73%). 
However, although trifurcation of this node is supported by these proportions of sites, the largest share of genes taken independently (gene concordance factors: 6,624 out of 14,307 genes) and the coalescent-based species tree approach (quartet posterior probabilities q1 = 0.53, q2 = 0.24, q3 = 0.24) support the clade Pinnipedia/Musteloidea, excluding the Ursoidea. Considering these results, the resolution of this trifurcation among the Carnivora (Delisle and Strobeck, 2005) has likely been contradictory because of the ILS observed among these three superfamilies (Doronina et al., 2015), which led to different phylogenetic scenarios depending on the methods (Peng et al., 2007) or markers (Yu and Zhang, 2006) used. Another controversial point, likely due to ILS (Doronina et al., 2015) within the Carnivora, is the question regarding which of the Ailuridae and Mephitidae is the most basal family of the Musteloidea (Doronina et al., 2015; Eizirik et al., 2010; Flynn et al., 2005; Sato et al., 2009). Interestingly, our phylogenetic reconstruction based on mitogenomic data recovered the clade Ailuridae/Mephitidae as a sister clade to all other Musteloidea families. In contrast, the phylogenomic inferences based on the genome-scale supermatrix recovered the Mephitidae as the most basal family of the Musteloidea. This result is supported by both coalescent-based inferences and concordance factors. In that sense, despite incomplete lineage sorting (Doronina et al., 2015), at the genomic level, it seems that the Mephitidae is the sister-group to all other Musteloidea families.

Overall, the phylogenomic inference based on 14,307 single-copy orthologous genes provides a new vision of the evolution of Carnivora. The addition of information from both concordance factor analyses (Minh et al., 2020) and coalescent-based inference (Zhang et al., 2018) supports previous analyses showing controversial nodes in the Carnivora phylogeny. Indeed, this additional information seems essential in phylogenomic analyses based on thousands of markers, which can lead to highly resolved and well-supported phylogenies despite support for alternative topological conformations for controversial nodes (Allio et al., 2020b; Jeffroy et al., 2006; Kumar et al., 2012).

Conclusions

The protocol developed here to extract the highest-quality DNA fraction from roadkill samples provides a practical way to obtain genomic data from wildlife. Combining Illumina short reads and Oxford Nanopore long reads using the MaSuRCA hybrid assembler allowed us to generate high-quality reference genomes for the Southern aardwolf (P. c. cristatus) and the Southern bat-eared fox (O. m. megalotis). This cost-effective strategy provides opportunities for large-scale population genomic studies of mammalian wildlife using resequencing of samples collected from roadkill and opportunistic field collection. Indeed, by defining a genetic differentiation index (called GDI) based on only three individuals, we illustrated the potential of the approach for comparative genome-scale species delineation in two species for which subspecies have been defined based on disjunct distributions and morphological differences. Our results, based on both mitochondrial and nuclear genome analyses, indicate that the two subspecies of aardwolf warrant elevation to species level (P. cristatus and P. septentrionalis), but the O. megalotis subspecies do not warrant this status. Hence, by generating reference genomes with high contiguity and completeness, this study shows a practical application for genomics of roadkill samples.

Materials and methods

Biological samples

We conducted fieldwork in the Free State province of South Africa in October 2016 and October 2018. While driving along the roads, we opportunistically collected tissue samples from four roadkill specimens, from which we sampled ear tissues preserved in 95% ethanol: two Southern bat-eared foxes (O. megalotis megalotis NMB 12639, GPS: 29°1’52”S, 25°9’38”E and NMB 12640, GPS: 29°2’33”S, 25°10’26”E), and two Southern aardwolves (P. cristatus cristatus NMB 12641, GPS: 29°48’45”S, 26°15’0”E and NMB 12667, GPS: 29°8’42”S, 25°39’4”E). As aardwolf specimen NMB 12641 was still very fresh, we also sampled muscle and salivary gland and preserved them in RNAlater stabilization solution (Thermo Fisher Scientific). These roadkill specimens were sampled under standing permit number S03016 issued by the Department of National Affairs in Pretoria (South Africa) granted to the National Museum, Bloemfontein. These samples have been sent to France under export permits (JM 3007/2017 and JM 5042/2018) issued by the Free State Department of Economic, Small Business Development, Tourism and Environmental Affairs (DESTEA) in Bloemfontein (Free State, South Africa). All tissue samples collected in this study have been deposited in the mammalian tissue collection of the National Museum, Bloemfontein (Free State, South Africa). Additional tissue samples for an Eastern aardwolf (P. c. septentrionalis) male neonate (NMS.Z.2018.54) stillborn from Tanzanian parents in 2015 at Hamerton Zoo Park (UK) have been provided by the National Museums Scotland (Edinburgh, UK), and for an Eastern bat-eared fox (O. m. virgatus) from Tanzania (FMNH 158128) by the Field Museum of Natural History (Chicago, USA). 
As these two species are classified as Least Concern by the IUCN, and thus do not require CITES permits for international transport, the samples were transferred to France under import permits issued by the Direction régionale de l'environnement, de l'aménagement et du logement (DREAL) Occitanie in Toulouse (France).

Mitochondrial barcoding and phylogenetics

Mitogenomic dataset construction

To assemble a mitogenomic dataset for assessing mitochondrial diversity among P. cristatus and O. megalotis subspecies, we generated seven new Carnivora mitogenomes using Illumina shotgun sequencing (Supplementary file 8). Briefly, we extracted total genomic DNA using the DNeasy Blood and Tissue Kit (Qiagen) for P. c. cristatus (NMB 12641), P. c. septentrionalis (NMS.Z.2018.54), O. m. megalotis (NMB 12639), O. m. virgatus (FMNH 158128), Speothos venaticus (ISEM T1624), Vulpes vulpes (ISEM T3611), and Parahyaena brunnea (ISEM FD126), prepared Illumina libraries following the protocol of Tilak et al., 2015, and sent libraries to the Montpellier GenomiX platform for single-end 100 bp sequencing on an Illumina HiSeq 2500 instrument to obtain about 5–10 million reads per sample. We then assembled and annotated mitogenomes from these single-read shotgun sequencing data with MitoFinder v1.0.2 (Allio et al., 2020a) using default parameters. We also used MitoFinder to extract three additional mitogenomes from paired-end Illumina capture libraries of ultra-conserved elements (UCEs) available from the Sequence Read Archive (SRA) of NCBI for Viverra tangalunga, Bdeogale nigripes, and Fossa fossana. Additional read mappings were done with Geneious (Kearse et al., 2012) to close gaps when the mitochondrial genome was fragmented. Finally, we downloaded all RefSeq carnivoran mitogenomes available in GenBank (135 species as of July 1st, 2019) and the mitogenome of the Malayan pangolin (Manis javanica) to use as an outgroup.

Mitogenomic phylogenetics and dating

Mitochondrial protein-coding genes were individually aligned using MACSE v2 (Ranwez et al., 2018) with default parameters, and ribosomal RNA genes using MAFFT (Katoh and Standley, 2013) algorithm FFT-NS-2 with option --adjustdirection. A nucleotide supermatrix was created by concatenating protein-coding and ribosomal RNA genes for the 142 taxa (140 species and two subspecies). Phylogenetic inferences were performed with maximum likelihood (ML) as implemented in IQ-TREE 1.6.8 (Nguyen et al., 2015) with the GTR+G4+F model. Using the resulting topology, divergence time estimation was performed using Phylobayes v4.1c (Lartillot et al., 2013) with strict clock (CL), autocorrelated (LN or TK02), and uncorrelated (UGAM or UCLM) models combined with 18 fossil calibrations (Supplementary file 9). Three independent Markov chain Monte Carlo (MCMC) analyses starting from a random tree were run for 10,000 cycles, with trees and associated model parameters sampled every cycle. A burn-in of 25% was applied before constructing the majority-rule Bayesian consensus tree with the readdiv subprogram. Finally, to determine the best-fitting clock model, cross-validation analyses were performed with Phylobayes by splitting the dataset randomly into two parts. Parameters of one model were estimated on the first part of the dataset (here representing 90%), and the parameter values were then used to compute the likelihood of the second part of the dataset (10%). This procedure was repeated 10 times for each model. The likelihood of each repeated test was then computed and summed for each model with the readcv and sumcv subprograms, respectively. The molecular clock model with the highest cross-likelihood score was considered the best fitting.
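The final model-selection step of the cross-validation can be sketched as follows. This is a minimal illustration, not the Phylobayes implementation: the function name and the per-replicate scores are hypothetical, and only three replicates per model are shown instead of the 10 used here.

```python
def best_clock_model(cv_scores):
    """Sum per-replicate cross-validation scores and return (best_model, totals)."""
    totals = {model: sum(scores) for model, scores in cv_scores.items()}
    best = max(totals, key=totals.get)  # highest summed cross-likelihood wins
    return best, totals


# Hypothetical log-likelihood scores of the held-out 10% for each replicate.
cv_scores = {
    "CL": [-5.0, -4.0, -6.0],
    "LN": [-1.0, -2.0, -1.5],
    "UGAM": [-2.0, -2.5, -3.0],
}

best, totals = best_clock_model(cv_scores)
print(best, totals)
```

With these made-up scores, the autocorrelated LN model would be retained because its summed score is the highest.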

Mitochondrial diversity and barcoding gap analyses

To check whether a threshold between intraspecific variation and interspecific divergence could be determined across the Carnivora (Meyer and Paulay, 2005), two mitochondrial barcoding datasets were assembled from all COX1 and CYTB sequences available for Carnivora plus the corresponding sequences for each of the two subspecies of O. megalotis and P. cristatus, respectively. After aligning each barcoding dataset with MACSE v2, ML phylogenetic inferences were performed with IQ-TREE 1.6.6 using the optimal substitution model as determined by ModelFinder (Kalyaanamoorthy et al., 2017). Then, pairwise patristic distances between all individuals were calculated from the resulting ML phylogram. Finally, based on the current taxonomic assignment, patristic distances were classified as intraspecific variation when the two individuals belonged to the same species and as interspecific divergence when they belonged to different species.
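The classification of pairwise distances into intraspecific and interspecific sets can be sketched as below. This is a minimal illustration under stated assumptions: the distances are supplied directly as a hypothetical dictionary rather than computed from an ML phylogram, and the individual identifiers, species labels, and values are invented for the example.

```python
from itertools import combinations


def barcoding_gap(distances, species_of):
    """Return (max intraspecific distance, min interspecific distance).

    distances maps frozenset({id1, id2}) to a pairwise distance;
    species_of maps each individual id to its species label.
    """
    intra, inter = [], []
    for a, b in combinations(sorted(species_of), 2):
        d = distances[frozenset((a, b))]
        if species_of[a] == species_of[b]:
            intra.append(d)
        else:
            inter.append(d)
    return max(intra), min(inter)


# Hypothetical individuals and distances (substitutions per site).
species_of = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
distances = {
    frozenset(("a1", "a2")): 0.005,
    frozenset(("a1", "b1")): 0.30,
    frozenset(("a1", "c1")): 0.31,
    frozenset(("a2", "b1")): 0.30,
    frozenset(("a2", "c1")): 0.31,
    frozenset(("b1", "c1")): 0.04,
}

max_intra, min_inter = barcoding_gap(distances, species_of)
has_gap = max_intra < min_inter
print(max_intra, min_inter, has_gap)
```

A gap exists when the largest intraspecific distance is smaller than the smallest interspecific distance; a candidate threshold would then lie between the two values.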

Short reads and long reads hybrid assembly of reference genomes

Sampling

To construct reference assemblies with high contiguity for the two focal species, we selected the best-preserved roadkill samples: NMB 12639 for O. megalotis and NMB 12641 for P. cristatus (Table 1, Supplementary file 8). Total genomic DNA extractions were performed separately for Illumina short-read sequencing and MinION long-read sequencing.

Illumina short-read sequencing

Total genomic DNA extractions were performed from ear tissue samples from two individuals using the DNeasy Blood and Tissue Kit (Qiagen) following the manufacturer’s instructions. A total amount of 1.0 μg DNA per sample was sent as input material for Illumina library preparation and sequencing to Novogene Europe (Cambridge, UK). Sequencing libraries were generated using the NEBNext DNA Library Prep Kit following the manufacturer’s recommendations, and indices were added to each sample. Genomic DNA was randomly fragmented to a size of 350 bp by shearing, then DNA fragments were end-polished, A-tailed, and ligated with the NEBNext adapter for Illumina sequencing, and further PCR enriched by P5 and indexed P7 oligos. The PCR products were purified (AMPure XP system) and the resulting libraries were analysed for size distribution by Agilent 2100 Bioanalyzer and quantified using real-time PCR. Since the genome sizes for these two species were estimated to be about 2.5 Gb, Illumina paired-end 250 bp sequencing was run on HiSeqX10 and NovaSeq instruments to obtain about 200 Gb per sample, corresponding to a genome depth-of-coverage of about 80x.

MinION long-read sequencing

Considering the DNA quality required to perform sequencing with Oxford Nanopore Technologies (ONT), a specific protocol to extract DNA from roadkill was designed (Tilak et al., 2020). First, genomic DNA was extracted using the classical phenol-chloroform method. Then, we evaluated the cleanliness of the extractions using (1) a binocular magnifying glass to check for the absence of suspended particles (e.g. hair pieces) and (2) both Nanodrop and Qubit/Nanodrop ratio. To select the longest DNA fragments, we applied a 0.4x ratio of AMPure beads (Tilak et al., 2020). Extracted-DNA size was then homogenized using Covaris g-TUBEs to optimize sequencing yield. Finally, long-read ONT sequencing was performed on MinION flowcells (FLO-MIN-106) using libraries prepared with the ONT Ligation Sequencing kit SQK-LSK109. For both species, we ran MinION sequencing until about 30 Gb per sample were obtained, to reach a genome depth-of-coverage of about 12x.
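The depth-of-coverage figures quoted for the short-read and long-read data follow from dividing the sequencing yield by the estimated genome size. The short Python sketch below only reproduces that arithmetic; the function and variable names are illustrative.

```python
def depth_of_coverage(total_bases_gb, genome_size_gb):
    """Expected mean depth-of-coverage = sequenced bases / genome size."""
    return total_bases_gb / genome_size_gb


GENOME_SIZE_GB = 2.5  # estimated genome size stated in the text

illumina_depth = depth_of_coverage(200, GENOME_SIZE_GB)  # short reads, about 200 Gb
minion_depth = depth_of_coverage(30, GENOME_SIZE_GB)     # long reads, about 30 Gb

print(f"Illumina: {illumina_depth:.0f}x, MinION: {minion_depth:.0f}x")
```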

Hybrid assembly of short and long reads

Short reads were cleaned using Trimmomatic 0.33 (Bolger et al., 2014) by removing low-quality bases from their beginning (LEADING:3) and end (TRAILING:3), and by removing reads shorter than 50 bp (MINLEN:50). Quality was measured for sliding windows of four base pairs and had to be greater than 15 on average (SLIDINGWINDOW:4:15). For MinION sequencing, basecalling of fast5 files was performed using Guppy v3.1.5 (developed by ONT) with the high accuracy option, which takes longer but is more accurate than the standard fast model (Appendix 1—figure 1). Long-read adapters were removed using Porechop v0.2.3 (https://github.com/rrwick/Porechop). To take advantage of both the high accuracy of Illumina short-read sequencing and the length of MinION long reads, assemblies were performed using the MaSuRCA hybrid genome assembler (Zimin et al., 2013). This method transforms large numbers of paired-end reads into a much smaller number of longer ‘super-reads’ and permits assembling Illumina reads of differing lengths together with longer ONT reads. To illustrate the advantage of using short reads and long reads conjointly, assemblies were also performed with short reads only using SOAPdenovo (Luo et al., 2012) (kmer size = 31, default parameters), and gaps between contigs were closed using the abundant paired relationships of short reads with GapCloser 1.12 (Luo et al., 2012). To evaluate genome quality, traditional measures such as the number of scaffolds, the contig N50, and the mean and maximum lengths were computed for 503 mammalian genome assemblies retrieved from NCBI (https://www.ncbi.nlm.nih.gov/assembly) on August 13th, 2019 with the filters ‘Exclude derived from surveillance project’, ‘Exclude anomalous’, and ‘Exclude partial’, and using only the RefSeq assembly for Homo sapiens. Finally, we assessed the gene completeness of our assemblies by comparison with the 63 carnivoran assemblies available at NCBI on August 13th, 2019 using Benchmarking Universal Single-Copy Orthologs (BUSCO) v3 (Waterhouse et al., 2018) with the Mammalia OrthoDB 9 BUSCO gene set (Zdobnov et al., 2017) through the gVolante web server (Nishimura et al., 2017).
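Since N50 is the main contiguity measure used here, a minimal Python sketch of its usual definition may help. It is not the tool the authors used to compute assembly statistics, and the contig lengths are made up.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


toy_contigs = [80, 70, 50, 40, 30, 20, 10]  # made-up lengths, total = 300
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half of 300, so N50 = 70
```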

Comparative species delineation based on genomic data

Sampling and resequencing

To assess the genetic diversity in P. cristatus, we sampled an additional roadkill individual of the South African subspecies P. c. cristatus (NMB 12667) and an individual of the East African subspecies P. c. septentrionalis (NMS.Z.2018.54) born in a zoo from wild Tanzanian parents (Table 1). A similar sampling was done for O. megalotis, with an additional roadkill individual of the South African subspecies O. m. megalotis (NMB 12640) and an individual of the East African subspecies O. m. virgatus (FMNH 158128) from Tanzania (Table 1). DNA extractions were performed with the DNeasy Blood and Tissue Kit (Qiagen), following manufacturer’s instructions and a total amount of 1.0 μg DNA per sample was outsourced to Novogene Europe (Cambridge, UK) for Illumina library preparation and Illumina paired-end 250 bp sequencing on HiSeqX10 and NovaSeq instruments to obtain about 200 Gb per sample (genome depth-of-coverage of about 80x). The resulting reads were cleaned using Trimmomatic 0.33 with the same parameters as described above.

Heterozygosity and genetic differentiation estimation

In a panmictic population, alleles observed in one individual are shared randomly with other individuals of the same population and the frequencies of homozygous and heterozygous alleles should follow Hardy-Weinberg expectations. However, any structure in subpopulations leads to a deficiency of heterozygotes (relative to Hardy-Weinberg expectations) in these subpopulations due to inbreeding (Holsinger and Weir, 2009; Wahlund, 2010) and thus decreases the polymorphism within the inbred subpopulations with respect to the polymorphism of the global population. Given that, Hudson et al., 1992 defined the FST as a measure of polymorphism reduction in two subdivided populations (π_within) compared to the population at large (π_between).

To assess the π_within and π_between of the two subspecies of each species (P. cristatus and O. megalotis), we compared the heterozygous alleles (SNPs) of two individuals of the same subspecies and the SNPs of two individuals of different subspecies by computing an FST-like statistic (hereafter called Genetic Differentiation Index: GDI) (Appendix 3—figure 1). Polymorphic sites can be classified into four categories: (1) fixed in one individual (e.g. AA/TT); (2) shared by both individuals (e.g. AT/AT); (3) specific to individual 1 (e.g. AT/AA); and (4) specific to individual 2 (e.g. AA/AT). Using these four categories, it is possible to estimate the polymorphism of each individual (1 and 2) and thus to estimate the GDI between two individuals of the same population A and the GDI between two individuals of different populations A and B as follows:

\mathrm{GDI}_{intraA} = 1 - \frac{(\pi_{A1} + \pi_{A2}) / 2}{\pi_{totA}}

\mathrm{GDI}_{interAB} = 1 - \frac{(\pi_{A1} + \pi_{B1}) / 2}{\pi_{totAB}}
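As a rough illustration of how the four site categories could feed into these formulas, the Python sketch below computes a GDI for one pair of individuals. The way each π is derived from the category counts is an assumption made for this example (heterozygous sites per individual for the individual π, all polymorphic sites of the pair for the total π); the text does not spell out these estimators, and the function name and counts are hypothetical.

```python
def gdi(n_fixed, n_shared, n_private1, n_private2, n_sites):
    """GDI = 1 - mean(pi_1, pi_2) / pi_tot for a pair of individuals.

    Assumed (not confirmed by the text) estimators:
      pi_1   = heterozygous sites in individual 1 / n_sites
      pi_2   = heterozygous sites in individual 2 / n_sites
      pi_tot = all polymorphic sites observed in the pair / n_sites
    """
    pi_1 = (n_shared + n_private1) / n_sites
    pi_2 = (n_shared + n_private2) / n_sites
    pi_tot = (n_fixed + n_shared + n_private1 + n_private2) / n_sites
    if pi_tot == 0:
        raise ValueError("no polymorphic site in the pair")
    return 1 - ((pi_1 + pi_2) / 2) / pi_tot


# Made-up counts for a 100,000 bp region
same_population = gdi(n_fixed=0, n_shared=50, n_private1=25, n_private2=25,
                      n_sites=100_000)
different_populations = gdi(n_fixed=60, n_shared=10, n_private1=15, n_private2=15,
                            n_sites=100_000)
```

Under these assumptions, many fixed differences and few shared heterozygous sites push the index towards 1, whereas mostly shared heterozygous sites keep it close to 0.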

For each species, cleaned short reads of all individuals (the one used to construct the reference genome and the two resequenced from each population) were aligned to their reference genome using BWA-MEM (Li, 2013). BAM files were created and merged using SAMtools (Li et al., 2009). Likely contaminant contigs identified using BlobTools (Laetsch and Blaxter, 2017; Appendix 4—figure 1, Supplementary file 10, Supplementary file 11) and contigs likely belonging to the X chromosome following LASTZ (Rahmani et al., 2011) alignments were removed (only contigs that aligned to cat or dog autosomes and not to the X chromosome were retained). Then, 100 regions of 100,000 bp were randomly sampled among contigs longer than 100,000 bp and 10 replicates of this sampling were performed (i.e. 10 × 100 × 100,000 bp = 100 Mb) to assess statistical variance in the estimates. Genotyping of these regions was performed with freebayes v1.3.1–16 (git commit id: g85d7bfc) (Garrison and Marth, 2012) using the parallel mode (Tange, 2011). Only SNPs with freebayes-estimated quality higher than 10 were considered for further analyses. A first GDI, comparing the average of the private polymorphisms of the two southern individuals (π_within A) with the total polymorphism of the two individuals (π_between A), was estimated to check that no genetic structure was present in the Southern subspecies. Then, a global GDI, comparing the private polymorphisms of individuals from the two populations (π_within AB) with the total polymorphism of the species (the two populations, π_between AB), was estimated with one individual from each population (Appendix 3—figure 1). Finally, the two GDI were compared to check whether the Southern populations were more structured than the entire populations.
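The region-sampling scheme can be sketched in Python as follows. This is only an illustration: the contig names and lengths are invented, and drawing contigs uniformly (rather than, say, proportionally to their length) and allowing overlapping regions are assumptions of the sketch, not details given in the text.

```python
import random


def sample_regions(contig_lengths, n_regions=100, region_len=100_000, seed=None):
    """Draw random fixed-length regions from contigs longer than region_len.

    contig_lengths : dict mapping contig name -> length in bp
    Returns a list of (contig, start, end), 0-based and end-exclusive.
    Regions are drawn independently, so overlaps are possible in this sketch.
    """
    rng = random.Random(seed)
    eligible = [(name, length) for name, length in contig_lengths.items()
                if length > region_len]
    if not eligible:
        raise ValueError("no contig longer than region_len")
    regions = []
    for _ in range(n_regions):
        name, length = rng.choice(eligible)
        start = rng.randint(0, length - region_len)
        regions.append((name, start, start + region_len))
    return regions


# Hypothetical contigs; contig_3 is too short to be sampled
toy_contigs = {"contig_1": 1_500_000, "contig_2": 250_000, "contig_3": 80_000}
replicates = [sample_regions(toy_contigs, seed=i) for i in range(10)]
total_bp = sum(end - start for rep in replicates for _, start, end in rep)
```

With 10 replicates of 100 regions each, `total_bp` matches the 100 Mb quoted in the text.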

To contextualize these results, the same GDI measures were estimated for well-defined species of Carnivora. The species pairs used to make the comparison and thus help gauge the taxonomic status of the bat-eared fox and aardwolf subspecies were selected according to the following criteria: (1) the two species had to be as closely related as possible, (2) they had both reference genomes and short reads available, (3) their estimated coverage for the two species had to be greater than 15x, and (4) short-read sequencing data had to be available for two individuals for one species of the pair. Given that, four species pairs were selected: (1) Canis lupus / Canis aureus (Canis lupus: SRR8926747, SRR8926748; Canis aureus: SRR7976426; vonHoldt et al., 2016; reference genome: GCF_000002285.3 ; Lindblad-Toh et al., 2005); (2) Ursus maritimus / Ursus arctos (Ursus maritimus PB43: SRR942203, SRR942290, SRR942298; Ursus maritimus PB28: SRR942211, SRR942287, SRR942295; Ursus arctos: SRR935591, SRR935625, SRR935627; Liu et al., 2014); (3) Lynx pardinus / Lynx lynx (Lynx pardinus LYNX11: ERR1255591-ERR1255594; Lynx lynx LYNX8: ERR1255579-ERR1255582; Lynx lynx LYNX23: ERR1255540-ERR1255549; Abascal et al., 2016); and (4) Panthera leo / Panthera pardus (Panthera leo: SRR10009886, SRR836361; Panthera pardus: SRR3041424; Kim et al., 2016). Raw reads for the three individuals of each species pair were downloaded, cleaned, and mapped as described above. Then, the same GDI estimation protocol was applied to each species pair by estimating the GDI within species, using two individuals of the same species, and the GDI between species, using one individual of each species of the pair.

To check the robustness of the genetic differentiation index estimation, two additional analyses were conducted. First, given that the estimation could be biased by the depth-of-coverage used for the genotype calling, the reads used for all individuals were randomly subsampled to obtain a homogenized depth-of-coverage of about 15x. Based on these new datasets, genetic differentiation indexes were re-estimated for each group. Second, to show that the results remain consistent when few individuals are used for the estimates, a permuted subsampling approach drawing from a larger dataset was performed. Using the species pair Ursus maritimus/Ursus arctos, for which sequencing data were available for 10 individuals of each species, genetic differentiation indexes were estimated for all possible combinations of either two Ursus arctos individuals or one individual of each species (i.e. 45 Ursus arctos/Ursus arctos and 100 Ursus arctos/Ursus maritimus combinations). Given the number of possible combinations, estimates were performed on only five replicates (instead of 10) of 100 regions of 100,000 bp for each combination (Figure 3—figure supplement 2).
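The numbers of combinations quoted for the bear dataset can be verified with a few lines of Python. The sample labels are placeholders, not the actual accession names.

```python
from itertools import combinations, product

arctos = [f"arctos_{i}" for i in range(1, 11)]        # 10 hypothetical labels
maritimus = [f"maritimus_{i}" for i in range(1, 11)]  # 10 hypothetical labels

within_pairs = list(combinations(arctos, 2))      # two U. arctos individuals
between_pairs = list(product(arctos, maritimus))  # one individual per species

print(len(within_pairs), len(between_pairs))
```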

Demographic analyses

Historical demographic variations in effective population size were estimated using the Pairwise Sequentially Markovian Coalescent (PSMC) model implemented in the software PSMC (https://github.com/lh3/psmc) (Li and Durbin, 2011). As described above, cleaned short reads were mapped against the corresponding reference genome using BWA-MEM (Li, 2013) and genotyping was performed using Freebayes v1.3.1–16 (git commit id: g85d7bfc) (Garrison and Marth, 2012) for the three individuals of each species. VCF files were converted to fasta format using a custom python script, excluding positions with quality below 20 and a depth-of-coverage below 10x or higher than 200x. Diploid sequences in fasta format were converted into PSMC fasta format using a C++ program written using the BIO++ library (Guéguen et al., 2013) with a block length of 100 bp and excluding blocks containing more than 20% missing data as implemented in ‘fq2psmcfa’ (https://github.com/lh3/psmc).
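The block-encoding step can be sketched in Python. This is not the authors' C++ program. It is a simplified illustration of the PSMC input encoding, assuming a diploid consensus in which heterozygous sites are written as IUPAC ambiguity codes and missing data as `N`; the function name and toy sequence are hypothetical.

```python
HET_CODES = set("RYSWKM")  # IUPAC codes for heterozygous biallelic genotypes


def to_psmcfa(seq, block_len=100, max_missing=0.2):
    """Encode a diploid consensus sequence as a PSMC-style string.

    Each block becomes 'N' (more than max_missing missing data),
    'K' (at least one heterozygous site) or 'T' (no heterozygous site).
    """
    out = []
    for i in range(0, len(seq), block_len):
        block = seq[i:i + block_len].upper()
        missing = block.count("N") / len(block)
        if missing > max_missing:
            out.append("N")
        elif any(base in HET_CODES for base in block):
            out.append("K")
        else:
            out.append("T")
    return "".join(out)


# Toy sequence: one homozygous block, one block with a heterozygous site (R),
# and one block with 30% missing data
toy_seq = "A" * 100 + "A" * 50 + "R" + "A" * 49 + "N" * 30 + "A" * 70
encoded = to_psmcfa(toy_seq)
```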

PSMC analyses were run for all other populations, testing several -t and -p parameters including -p ‘4+30*2+4+6+10’ (Nadachowska-Brzyska et al., 2013) and -p ‘4+25*2+4+6’ (Kim et al., 2016) but also -p ‘4+10*3+4’, -p ‘4+20*2+4’ and -p ‘4+20*3+4’. Overall, the tendencies were similar, but some parameters led to unrealistic differences between the two individuals from the South African population of Otocyon megalotis. We chose to present the results obtained using the parameters -t15 -r4 -p ‘4+10*3+4’. For this parameter setting, the variance in ancestral effective population size was estimated by bootstrapping the scaffolds 100 times. To scale PSMC results, based on several previous studies on large mammals, a mutation rate of 10⁻⁸ mutations/site/generation (Ekblom et al., 2018; Gopalakrishnan et al., 2017) and a generation time of 2 years (Clark, 2005; Koehler and Richardson, 1990; van Jaarsveld, 1993) were selected. Results were plotted in R v3.6.3 (R Development Core Team, 2020) using the function ‘psmc.results’ (https://doi.org/10.5061/dryad.0618v/4) (Liu and Hansen, 2017) modified using ggplot2 (Wickham, 2016) and cowplot (Wilke, 2016).
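To make the scaling step concrete, the Python sketch below applies the usual PSMC convention, in which the reference size is N0 = θ0 / (4 μ s), time is expressed in units of 2 N0 generations, and population size in units of N0. It uses the mutation rate and generation time stated above and a bin size s of 100 bp. The function name and the input values are illustrative and are not results from this study.

```python
def scale_psmc(theta0, t_k, lambda_k, mu=1e-8, gen_time=2.0, bin_size=100):
    """Convert scaled PSMC output to years and effective population size.

    theta0   : scaled mutation rate per bin estimated by PSMC
    t_k      : scaled time of an interval boundary (units of 2*N0 generations)
    lambda_k : relative population size in that interval (units of N0)
    """
    n0 = theta0 / (4 * mu * bin_size)  # reference effective population size
    years = 2 * n0 * t_k * gen_time    # generations converted to years
    ne = lambda_k * n0                 # effective population size
    return years, ne


# Made-up PSMC estimates for illustration only
years, ne = scale_psmc(theta0=0.04, t_k=0.5, lambda_k=1.5)
```

With these toy inputs, N0 is 10,000, so the interval boundary falls at about 20,000 years with an effective size of about 15,000.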

Phylogenomic inferences

To infer the Carnivora phylogenetic relationships, all carnivoran genomes available on GenBank, the DNAZoo website (https://www.dnazoo.org), and the OrthoMaM database (Scornavacca et al., 2019) as of February 11th, 2020 were downloaded (Supplementary file 12). In cases where more than one genome was available per species, the assembly with the best BUSCO scores was selected. Then, we annotated our two reference genome assemblies and the other unannotated assemblies using MAKER2 (Holt and Yandell, 2011) following the recommendations of the DNA Zoo protocol (https://www.dnazoo.org/post/the-first-million-genes-are-the-hardest-to-make-r). In the absence of available transcriptomic data, this method leveraged the power of homology combined with the thorough knowledge accumulated on the gene content of mammalian genomes. As advised, a mammal-specific subset of UniProtKB/Swiss-Prot, a manually annotated, non-redundant protein sequence database, was used as a reference for this annotation step (Boutet et al., 2016). Finally, the annotated coding sequences (CDSs) recovered for the Southern aardwolf (P. c. cristatus) were used to assemble those of the Eastern aardwolf (P. c. septentrionalis) by mapping the resequenced Illumina reads using BWA-MEM (Li, 2013).

Orthologous genes were extracted following the orthology delineation process of the OrthoMaM database (OMM) (Scornavacca et al., 2019). First, for each orthologous-gene alignment of OMM, an HMM profile was created via hmmbuild, using default parameters of the HMMER toolkit (Eddy, 2011), and all HMM profiles were concatenated and summarized using hmmpress to construct an HMM database. Then, for each CDS newly annotated by MAKER, hmmscan was used on the HMM database to retrieve the best hits among the orthologous gene alignments. For each orthologous gene alignment, the most similar sequences for each species were detected via hmmsearch. Outputs from hmmsearch and hmmscan were discarded if the first-hit score was not substantially better than the second, that is, hits were kept only when hit₂ < 0.9 hit₁. This ensures that our orthology predictions for the newly annotated CDSs are robust. Then, the cleaning procedure of the OrthoMaM database was applied to the set of orthologous genes obtained. This process, implemented in a singularity image (Kurtzer et al., 2017) named OMM_MACSE.sif (Ranwez et al., 2021), is composed of several steps including nucleotide sequence alignment at the amino acid level with MAFFT (Katoh and Standley, 2013), refining alignments to handle frameshifts with MACSE v2 (Ranwez et al., 2018), cleaning of non-homologous sequences, and masking of erroneous/dubious parts of gene sequences with HMMcleaner (Di Franco et al., 2019). Finally, the last step of the cleaning process was to remove sequences that generated abnormally long branches during gene tree inferences. This was done by reconstructing gene trees using IQ-TREE v1.6.8 (Nguyen et al., 2015) with the MFP option to select the best-fitting model for each gene. Then, the sequences generating abnormally long branches were identified and removed by PhylteR (https://github.com/damiendevienne/phylter). This software allows detection and removal of outliers in phylogenomic datasets by iteratively removing taxa in genes and optimising a concordance score between individual distance matrices.
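The hit-filtering rule used during orthology delineation can be written as a small Python function. This is a sketch of the criterion as read from the text (the second-best score must be below 0.9 times the best score); parsing of the actual hmmscan and hmmsearch outputs is omitted, and the function name and scores are hypothetical.

```python
def keep_best_hit(scores, ratio=0.9):
    """Return True when the best hit is unambiguous.

    scores : iterable of positive hit scores for one query (any order)
    A single hit is kept; otherwise the second-best score must be
    strictly lower than ratio * best score.
    """
    ranked = sorted(scores, reverse=True)
    if not ranked:
        return False
    if len(ranked) == 1:
        return True
    return ranked[1] < ratio * ranked[0]


clear_case = keep_best_hit([500.0, 300.0])      # 300 < 450: kept
ambiguous_case = keep_best_hit([500.0, 480.0])  # 480 >= 450: discarded
```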

Phylogenomic analyses were performed using maximum likelihood (ML) with IQ-TREE 1.6.8 (Nguyen et al., 2015) on the supermatrix resulting from the concatenation of all orthologous genes previously recovered, with the TESTNEW option to select the best-fitting model for each partition. Two partitions per gene were defined to separate the first two codon positions from the third codon positions. Node supports were estimated with 100 non-parametric bootstrap replicates. Furthermore, gene concordance (gCF) and site concordance (sCF) factors were measured to complement traditional bootstrap node-support measures as recommended in Minh et al., 2020. For each orthologous gene alignment, a gene tree was inferred using IQ-TREE with model selection, and gCF and sCF were calculated using the specific options -scf and -gcf in IQ-TREE (Minh et al., 2020). The gene trees obtained with this analysis were also used to perform a coalescent-based species tree inference using ASTRAL-III (Zhang et al., 2018).
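The two-partitions-per-gene scheme can be illustrated with a Python sketch that writes a NEXUS sets block separating codon positions 1+2 from position 3 for each gene of a concatenated alignment. The gene names and lengths are invented, and the charset layout is an assumption about how such a partition file could be written. It is not the authors' actual file.

```python
def codon_partitions(gene_lengths):
    """Build a NEXUS 'sets' block with two charsets per gene.

    gene_lengths : list of (gene_name, length_in_bp) in concatenation order;
                   each length is assumed to be a multiple of three.
    """
    lines = ["#nexus", "begin sets;"]
    start = 1
    for name, length in gene_lengths:
        end = start + length - 1
        # Positions 1 and 2 of every codon share one partition
        lines.append(
            f"    charset {name}_pos12 = {start}-{end}\\3 {start + 1}-{end}\\3;"
        )
        # Position 3 of every codon forms the second partition
        lines.append(f"    charset {name}_pos3 = {start + 2}-{end}\\3;")
        start = end + 1
    lines.append("end;")
    return "\n".join(lines)


partition_text = codon_partitions([("geneA", 300), ("geneB", 150)])
print(partition_text)
```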

Data access

Genome assemblies, associated Illumina and Nanopore sequence reads, and mitogenomes have been submitted to the National Center for Biotechnology Information (NCBI) and will be available after publication under BioProject number PRJNA681015. The full analytical pipeline, phylogenetic datasets (mitogenomic and genomic), corresponding trees, and other supplementary materials are available from zenodo.org (DOI: 10.5281/zenodo.4479226).

Appendix 1

Difference between Fast and High accuracy modes of Guppy basecaller

For MinION sequencing, basecalling of fast5 files was performed using Guppy v3.1.5 (developed by ONT) with the high accuracy option, which takes longer but is more accurate than the standard fast model (Appendix 1—figure 1).

Appendix 1—figure 1

Plot of the quality of Nanopore long reads base-called with either the *fast* or the *high accuracy* option of Guppy v3.1.5.

The quality of the base-calling step has a large impact on the final quality of the assemblies: more accurate base-calling reduces the number of contigs and increases the N50 value.

Genome quality assessments

Exhaustive comparisons with 503 available mammalian assemblies revealed a large heterogeneity among taxonomic groups and a wide variance within groups in terms of both number of scaffolds and N50 values (Appendix 1—figure 2, Supplementary file 3). Xenarthra was the group with the lowest quality genome assemblies, with a median number of scaffolds of more than one million and a median N50 of only 15 kb. Conversely, Carnivora contained genome assemblies of much better quality, with a median number of scaffolds of 15,872 and a median N50 of 4.6 Mb, although a large variance was observed among assemblies for both metrics (Appendix 1—figure 2, Supplementary file 3). Our two new genomes compared favorably with the available carnivoran genome assemblies in terms of contiguity showing slightly less than the median N50 and a lower number of scaffolds than the majority of the other assemblies (Appendix 1—figure 2, Supplementary file 3). Comparison of two hybrid assemblies with Illumina-only assemblies obtained with SOAPdenovo illustrated the positive effect of introducing Nanopore long reads even at moderate coverage by reducing the number of scaffolds from 409,724 to 5669 (aardwolf) and from 433,209 to 11,081 (bat-eared fox) while increasing the N50 from 17.3 kb to 1.3 Mb (aardwolf) and from 22.3 kb to 728 kb (bat-eared fox). With regard to completeness based on 4104 single-copy mammalian BUSCO orthologues, our two hybrid assemblies are among the best assemblies with more than 90% complete BUSCO genes and less than 4% missing genes (Appendix 1—figure 3, Supplementary file 4). As expected, the two corresponding Illumina-only assemblies were much more fragmented and had globally much lower BUSCO scores (Appendix 1—figure 3, Supplementary file 4).

Appendix 1—figure 2

Comparison of 503 mammalian genome assemblies from 12 taxonomic groups using bean plots of the (a) number of scaffolds, and (b) scaffold N50 values ranked by median values.

Thick black lines show the medians, dashed black lines represent individual data points, and polygons represent the estimated density of the data. Note the log scale on the Y axes. The bat-eared fox (*Otocyon megalotis megalotis*) and aardwolf (*Proteles cristatus cristatus*) assemblies produced in this study using SOAPdenovo and MaSuRCA are indicated by asterisks. Bean plots were computed using BoxPlotR (Spitzer et al., 2014).

Appendix 1—figure 3

BUSCO completeness assessment of 67 Carnivora genome assemblies visualized as bar charts representing percentages of complete single-copy (light blue), complete duplicated (dark blue), fragmented (yellow), and missing (red) genes ordered by increasing percentage of total complete genes.

The bat-eared fox (*Otocyon megalotis megalotis*) and aardwolf (*Proteles cristatus cristatus*) assemblies produced in this study using MaSuRCA and SOAPdenovo are indicated by asterisks.

Appendix 2

Morphological differences between Proteles taxa

Differences in fur coloration and markings

Cabrera, 1910 described how the fur of pallidior is unicolored and lacks the brown base of cristatus. This latter character appears to be consistent in an Ethiopian specimen in National Museums Scotland (NMS.Z.1877.15.5) compared with three skins of cristatus of Namibian and South African origin (NMS.Z.2020.44, NMS.Z.2020.46.1 and NMS.Z.2020.46.6) also in the collections of National Museums Scotland (Appendix 2—figure 1), although it would appear to be a difference in the coloration of the underfur. However, a Zimbabwean specimen (NMS.Z.1950.68) also had only pale underfur, which appears to contradict Cabrera, 1910, so the usefulness of this character is in doubt.

In reviewing georeferenced photographs of aardwolves from throughout the range, the striping pattern appeared to be variable, but overall East African specimens tended to be paler, with more contrasting stripes with a pale forehead compared with the longer, grayer or ochre-gray fur in Southern African specimens, which have broader less distinctive stripes (A.C.K. pers. obs.). However, fur length and hence stripe distinctiveness may just be a phenotypic response to lower temperatures at higher latitudes compared with equatorial East African specimens.

Additional preliminary observations were made on pelage coloration and markings based on the skins above and live specimens of both taxa kept at Hamerton Zoo Park, Cambridgeshire, UK. The live specimens offer a unique opportunity to examine these characters at the same latitude and environmental conditions, so that phenotypes should reflect genetic differences between taxa. Two pelage characters appear to be different between the two taxa. Firstly, the stripes in cristatus tend to be broader and less well defined, whereas in septentrionalis they are thinner, more contrasting and break up into spots on the neck. Secondly, the forehead coloration is dark grizzled gray in cristatus, but lighter yellowish-gray or creamy-gray in septentrionalis. Further investigation is required to examine pelage variation from throughout the ranges of both taxa to see if these characters are diagnostic and to determine additional diagnostic characters.

Appendix 2—figure 1

Unicolored fur of an Eastern aardwolf from Ethiopia (NMS.Z.1877.15.5) (A) and bicolored fur of a Southern aardwolf of South African origin (NMS.Z.2020.44) (B).

Skull morphometric analyses

In addition to skull measurements taken from specimens in the Natural History Museum, London (NHMUK), Museum of Vertebrate Zoology (MVZ) and National Museums Scotland (NMS), measurements of skulls were taken from the literature (Allen et al., 1909; Heller, 1913; Hollister, 1924; Roberts, 1932; Roberts, 1951; Supplementary file 6). Comparison of means confirmed that mean post-orbital breadth is significantly greater in septentrionalis than in cristatus (t_8,16=4.10, p<0.001) (Appendix 2—figure 2). However, there are no differences between the means of other skull measurements, including condylobasal length of skull (Appendix 2—figure 2), zygomatic width, inter-orbital breadth, brain-case width and mandible length (all p>0.05). As noted above with skins, sample sizes are small and thus the significant difference in mean post-orbital breadth between the two taxa remains tentative, subject to examination of a larger sample.

Appendix 2—figure 2

Box and jitter plot of (A) post-orbital breadths of *Proteles* taxa: *cristatus* (left) and *septentrionalis* (right) and (B) condylobasal lengths of skull of *Proteles* taxa: *cristatus* (left) and *septentrionalis* (right).

Graph generated with BoxPlotR (http://shiny.chemgrid.org/boxplotr/).

Appendix 3

Genetic differentiation index

To estimate the level of genetic differentiation between two populations, we developed a new index based on the heterozygosity of at least one individual of each population (Appendix 3—figure 1).

Appendix 3—figure 1

Definition of the genetic differentiation index (GDI) based on the F-statistic (FST).

The main difference between these two indexes is the use of heterozygous allele states for GDI rather than real polymorphism for the FST. Green = π_within, Orange = π_between, Blue = Population A, Red = Population A+B.

Appendix 4

Contigs selection for genetic differentiation analyses

Using BlobTools (Laetsch and Blaxter, 2017), we were able to specifically select the Carnivora contigs for further analyses (Appendix 4—figure 1, Supplementary file 10, Supplementary file 11). Additionally, contigs likely belonging to the X chromosome were identified and removed based on LASTZ (Rahmani et al., 2011) alignments (only contigs that aligned to cat or dog autosomes and not to the X chromosome were retained).

Appendix 4—figure 1

Graphical representation (BlobPlot) of the results of contamination analyses performed with BlobTools for (a) the aardwolf (*Proteles cristatus cristatus*) and (b) the bat-eared fox (*Otocyon megalotis megalotis*) genome assemblies.

Data availability

The following data sets were generated

1. Allio R
2. Tilak MK
3. Scornavacca C
4. Avenant N
5. Corre E
6. Nabholz B
7. Delsuc F
(2020) Zenodo
Zenodo.

https://doi.org/10.5281/zenodo.4479226
1. Allio R
2. Tilak MK
3. Scornavacca C
4. Avenant NL
5. Kitchener AC
6. Corre E
7. Nabholz B
8. Delsuc F
(2021) NCBI BioProject
ID PRJNA681015. Molecular data associated with the publication.

https://www.ncbi.nlm.nih.gov/bioproject/PRJNA681015

The following previously published data sets were used

(2016) Dryad
Admixture mapping identifies introgressed genomic regions in North American canids.

https://doi.org/10.5061/dryad.0mg54
1. Liu S
2. Lorenzen ED
3. Fumagalli M
4. Li B
5. Harris K
6. Xiong Z
7. Zhou L
8. Korneliussen TS
9. Somel M
10. Babbitt C
(2014) journal
Population genomics reveal recent speciation and rapid evolutionary adaptation in polar bears.

https://doi.org/10.1016/j.cell.2014.03.054
1. Abascal F
2. Corvelo A
3. Cruz F
4. Villanueva-Cañas JL
5. Vlasova A
6. Marcet-Houben M
7. Martínez-Cruz B
8. Cheng JY
9. Prieto P
10. Quesada V
11. Quilez J
12. Li G
13. García F
14. Rubio-Camarillo M
15. Frias L
16. Ribeca P
17. Capella-Gutiérrez S
18. Rodríguez JM
19. Câmara F
20. Lowy E
21. Cozzuto L
22. Erb I
23. Tress ML
24. Rodriguez-Ales JL
25. Ruiz-Orera J
26. Reverter F
27. Casas-Marce M
28. Soriano L
29. Arango JR
30. Derdak S
31. Galán B
32. Blanc J
33. Gut M
34. Lorente-Galdos B
35. Andrés-Nieto M
36. López-Otín C
37. Valencia A
38. Gut I
39. García JL
40. Guigó R
41. Murphy WJ
42. Ruiz-Herrera A
43. Marques-Bonet T
44. Roma G
45. Notredame C
46. Mailund T
47. Albà MM
48. Gabaldón T
49. Alioto T
50. Godoy JA
(2016) European Nucleotide Archive
ID PRJEB12609. Extreme genomic erosion after recurrent demographic bottlenecks in the highly endangered Iberian lynx.

http://www.ebi.ac.uk/ena
1. Kim S
2. Cho YS
3. Kim H
4. Chung O
5. Jho S
6. Seomun H
7. Kim J
8. Bang WY
9. Kim C
10. An J
11. Bae CH
12. Bhak Y
13. Jeon S
14. Yoon H
15. Kim Y
16. Jun J
17. Lee H
18. Cho S
19. Uphyrkina O
20. Kostyria A
21. Goodrich J
22. Miquelle D
23. Roelke M
24. Lewis J
25. Yurchenko A
26. Bankevich A
27. Cho J
28. Lee S
29. Edwards JS
30. Weber JA
31. Cook J
32. Kim S
33. Manica A
34. Lee I
35. O’Brien SJ
36. Bhak J
37. Yeo J
(2016) NCBI Sequence Read Archive
ID SRA321193. Comparison of carnivore, omnivore, and herbivore mammalian genomes with a new leopard assembly.

https://www.ncbi.nlm.nih.gov/sra/?term=SRA321193

References

1. Abascal F
2. Corvelo A
3. Cruz F
4. Villanueva-Cañas JL
5. Vlasova A
6. Marcet-Houben M
7. Martínez-Cruz B
8. Cheng JY
9. Prieto P
10. Quesada V
11. Quilez J
12. Li G
13. García F
14. Rubio-Camarillo M
15. Frias L
16. Ribeca P
17. Capella-Gutiérrez S
18. Rodríguez JM
19. Câmara F
20. Lowy E
21. Cozzuto L
22. Erb I
23. Tress ML
24. Rodriguez-Ales JL
25. Ruiz-Orera J
26. Reverter F
27. Casas-Marce M
28. Soriano L
29. Arango JR
30. Derdak S
31. Galán B
32. Blanc J
33. Gut M
34. Lorente-Galdos B
35. Andrés-Nieto M
36. López-Otín C
37. Valencia A
38. Gut I
39. García JL
40. Guigó R
41. Murphy WJ
42. Ruiz-Herrera A
43. Marques-Bonet T
44. Roma G
45. Notredame C
46. Mailund T
47. Albà MM
48. Gabaldón T
49. Alioto T
50. Godoy JA
(2016) Extreme genomic erosion after recurrent demographic bottlenecks in the highly endangered Iberian lynx
Genome Biology 17:251.

https://doi.org/10.1186/s13059-016-1090-1
(2010) Dogs, cats, and kin: a molecular species-level phylogeny of Carnivora
Molecular Phylogenetics and Evolution 54:726–745.

https://doi.org/10.1016/j.ympev.2009.10.033
1. Allen JA
2. Tjader R
3. Lang H
(1909) Mammals from British East Africa, collected by the Tjäder expedition of 1906
Bulletin of the AMNH 26:12.
1. Allen R
2. Ryan H
3. Davis BW
4. King C
5. Frantz L
6. Irving-Pease E
7. Barnett R
8. Linderholm A
9. Loog L
10. Haile J
11. Lebrasseur O
12. White M
13. Kitchener AC
14. Murphy WJ
15. Larson G
(2020) A mitochondrial genetic divergence proxy predicts the reproductive compatibility of mammalian hybrids
Proceedings of the Royal Society B: Biological Sciences 287:20200690.

https://doi.org/10.1098/rspb.2020.0690
(2020a) MitoFinder: efficient automated large-scale extraction of mitogenomic data in target enrichment phylogenomics
Molecular Ecology Resources 20:892–905.

https://doi.org/10.1111/1755-0998.13160
(2020b) Whole genome shotgun phylogenomics resolves the pattern and timing of swallowtail butterfly evolution
Systematic Biology 69:38–60.

https://doi.org/10.1093/sysbio/syz030
1. Armstrong EE
2. Taylor RW
3. Miller DE
4. Kaelin CB
5. Barsh GS
6. Hadly EA
7. Petrov D
(2020) Long live the king: chromosome-level assembly of the lion (Panthera leo) using linked-read, Hi-C, and long-read data
BMC Biology 18:3.

https://doi.org/10.1186/s12915-019-0734-5
(2007) Mitogenomic analyses of caniform relationships
Molecular Phylogenetics and Evolution 45:863–874.

https://doi.org/10.1016/j.ympev.2007.06.019
1. Atickem A
2. Stenseth NC
3. Drouilly M
4. Bock S
5. Roos C
6. Zinner D
(2018) Deep divergence among mitochondrial lineages in African jackals
Zoologica Scripta 47:1–8.

https://doi.org/10.1111/zsc.12257
- Google Scholar
(2006) The origin, current diversity and future conservation of the modern lion (Panthera leo)
Proceedings of the Royal Society B: Biological Sciences 273:2119–2125.

https://doi.org/10.1098/rspb.2006.3555
- Google Scholar
Preprint
1. Batra SS
2. Levy-Sakin M
3. Robinson J
4. Guillory J
5. Durinck S
6. Kwok P-Y
7. Cox LA
8. Seshagiri S
9. Song YS
10. Wall JD
(2019) Accurate assembly of the olive baboon (Papio anubis) genome using long-read and Hi-C data
bioRxiv.

https://doi.org/10.1101/678771
- Google Scholar
(2016) Phylogenomics, biogeography and diversification of obligate mealybug-tending ants in the genus Acropyga
Molecular Phylogenetics and Evolution 102:20–29.

https://doi.org/10.1016/j.ympev.2016.05.030
- PubMed
- Google Scholar
Preprint
(2019) Next-generation in situ conservation and educational outreach in Madagascar using a mobile genetics lab
bioRxiv.

https://doi.org/10.1101/650614
- Google Scholar
(2014) Trimmomatic: a flexible trimmer for Illumina sequence data
Bioinformatics 30:2114–2120.

https://doi.org/10.1093/bioinformatics/btu170
- PubMed
- Google Scholar
1. Boutet E
2. Lieberherr D
3. Tognolli M
4. Schneider M
5. Bansal P
6. Bridge AJ
7. Poux S
8. Bougueleret L
9. Xenarios I
(2016) UniProtKB/Swiss-Prot, the manually annotated section of the UniProt KnowledgeBase: how to use the entry view
Methods in Molecular Biology 1374:23–25.

https://doi.org/10.1007/978-1-4939-3167-5_2
- PubMed
- Google Scholar
1. Brown CR
2. Bomberger Brown M
(2013) Where has all the road kill gone?
Current Biology 23:R233–R234.

https://doi.org/10.1016/j.cub.2013.02.023
- PubMed
- Google Scholar
1. Cabrera A
(1910) LI.—On two new Carnivora from North-east Africa
Annals and Magazine of Natural History 6:461–465.

https://doi.org/10.1080/00222931008692876
- Google Scholar
1. Caley T
2. Extier T
3. Collins JA
4. Schefuß E
5. Dupont L
6. Malaizé B
7. Rossignol L
8. Souron A
9. McClymont EL
10. Jimenez-Espejo FJ
11. García-Comas C
12. Eynaud F
13. Martinez P
14. Roche DM
15. Jorry SJ
16. Charlier K
17. Wary M
18. Gourves PY
19. Billy I
20. Giraudeau J
(2018) A two-million-year-long hydroclimatic context for hominin evolution in southeastern Africa
Nature 560:76–79.

https://doi.org/10.1038/s41586-018-0309-6
- PubMed
- Google Scholar
(2013) Genetics at the verge of extinction: insights from the Iberian lynx
Molecular Ecology 22:5503–5515.

https://doi.org/10.1111/mec.12498
- PubMed
- Google Scholar
1. Chase BM
2. Niedermeyer EM
3. Boom A
4. Carr AS
5. Chevalier M
6. He F
7. Meadows ME
8. Ogle N
9. Reimer PJ
(2019) Orbital controls on Namib Desert hydroclimate over the past 50,000 years
Geology 47:867–871.

https://doi.org/10.1130/G46334.1
- Google Scholar
1. Chevalier M
2. Chase BM
(2015) Southeast African records reveal a coherent shift from high- to low-latitude forcing mechanisms along the east African margin across last glacial–interglacial transition
Quaternary Science Reviews 125:117–130.

https://doi.org/10.1016/j.quascirev.2015.07.009
- Google Scholar
1. Clark HO
(2005) Otocyon megalotis
Mammalian Species 766:1–5.

https://doi.org/10.1644/1545-1410(2005)766[0001:OM]2.0.CO;2
- Google Scholar
1. De Queiroz K
(2007) Species concepts and species delimitation
Systematic Biology 56:879–886.

https://doi.org/10.1080/10635150701701083
- PubMed
- Google Scholar
(2008) Phylogeography of the white-tailed mongoose (Herpestidae, Carnivora, Mammalia) based on partial sequences of the mtDNA control region
Journal of Zoology 276:385–393.

https://doi.org/10.1111/j.1469-7998.2008.00502.x
- Google Scholar
1. Delisle I
2. Strobeck C
(2005) A phylogeny of the Caniformia (order Carnivora) based on 12 complete protein-coding mitochondrial genes
Molecular Phylogenetics and Evolution 37:192–201.

https://doi.org/10.1016/j.ympev.2005.04.025
- PubMed
- Google Scholar
1. deMenocal PB
(1995) Plio-Pleistocene African climate
Science 270:53–59.

https://doi.org/10.1126/science.270.5233.53
- PubMed
- Google Scholar
1. deMenocal PB
(2004) African climate change and faunal evolution during the Pliocene–Pleistocene
Earth and Planetary Science Letters 220:3–24.

https://doi.org/10.1016/S0012-821X(04)00003-2
- Google Scholar
(2019) Evaluating the usefulness of alignment filtering methods to reduce the impact of errors on evolutionary inferences
BMC Evolutionary Biology 19:21.

https://doi.org/10.1186/s12862-019-1350-2
- PubMed
- Google Scholar
1. Di Genova A
2. Ruz GA
3. Sagot MF
4. Maass A
(2018) Fast-SG: an alignment-free algorithm for hybrid assembly
GigaScience 7:giy048.

https://doi.org/10.1093/gigascience/giy048
- Google Scholar
(2018) Quantifying temporal genomic erosion in endangered species
Trends in Ecology & Evolution 33:176–185.

https://doi.org/10.1016/j.tree.2017.12.002
- PubMed
- Google Scholar
1. Doronina L
2. Churakov G
3. Shi J
4. Brosius J
5. Baertsch R
6. Clawson H
7. Schmitz J
(2015) Exploring massive incomplete lineage sorting in arctoids (Laurasiatheria, Carnivora)
Molecular Biology and Evolution 29:msv188.

https://doi.org/10.1093/molbev/msv188
- Google Scholar
Book
1. Drake-Brockman RE
(1910)
The Mammals of Somaliland

Hurst and Blackett.
- Google Scholar
1. Dudchenko O
2. Batra SS
3. Omer AD
4. Nyquist SK
5. Hoeger M
6. Durand NC
7. Shamim MS
8. Machol I
9. Lander ES
10. Aiden AP
11. Aiden EL
(2017) De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds
Science 356:92–95.

https://doi.org/10.1126/science.aal3327
- PubMed
- Google Scholar
1. Eddy SR
(2011) Accelerated profile HMM searches
PLOS Computational Biology 7:e1002195.

https://doi.org/10.1371/journal.pcbi.1002195
- PubMed
- Google Scholar
1. Eizirik E
2. Murphy WJ
3. Koepfli K-P
4. Johnson WE
5. Dragoo JW
6. Wayne RK
7. O’Brien SJ
(2010) Pattern and timing of diversification of the mammalian order Carnivora inferred from multiple nuclear gene sequences
Molecular Phylogenetics and Evolution 56:49–63.

https://doi.org/10.1016/j.ympev.2010.01.033
- Google Scholar
1. Ekblom R
2. Brechlin B
3. Persson J
4. Smeds L
5. Johansson M
6. Magnusson J
7. Flagstad Ø
8. Ellegren H
(2018) Genome sequencing and conservation genomics in the Scandinavian wolverine population
Conservation Biology 32:1301–1312.

https://doi.org/10.1111/cobi.13157
- PubMed
- Google Scholar
1. Etherington GJ
2. Heavens D
3. Baker D
4. Lister A
5. McNelly R
6. Garcia G
7. Clavijo B
8. Macaulay I
9. Haerty W
10. Di Palma F
(2020) Sequencing smart: de novo sequencing and assembly approaches for a non-model mammal
GigaScience 9:giaa045.

https://doi.org/10.1093/gigascience/giaa045
- PubMed
- Google Scholar
(2016) Strong effects of variation in taxonomic opinion on diversification analyses
Methods in Ecology and Evolution 7:4–13.

https://doi.org/10.1111/2041-210X.12449
- Google Scholar
1. Flynn JJ
2. Finarelli JA
3. Zehr S
4. Hsu J
5. Nedbal MA
(2005) Molecular phylogeny of the Carnivora (Mammalia): assessing the impact of increased sampling on resolving enigmatic relationships
Systematic Biology 54:317–337.

https://doi.org/10.1080/10635150590923326
- PubMed
- Google Scholar
Book
(2010)
Phylogeny of the Carnivora and Carnivoramorpha, and the use of the fossil record to enhance understanding of evolutionary transformations

In: Goswami A, Friscia A, editors. Carnivoran Evolution. Cambridge: Cambridge University Press. pp. 25–63.
- Google Scholar
1. Flynn JJ
2. Nedbal MA
(1998) Phylogeny of the Carnivora (Mammalia): congruence vs incompatibility among multiple data sets
Molecular Phylogenetics and Evolution 9:414–426.

https://doi.org/10.1006/mpev.1998.0504
- PubMed
- Google Scholar
1. Frankham R
2. Ballou JD
3. Dudash MR
4. Eldridge MDB
5. Fenster CB
6. Lacy RC
7. Mendelson JR
8. Porton IJ
9. Ralls K
10. Ryder OA
(2012) Implications of different species concepts for conserving biodiversity
Biological Conservation 153:25–31.

https://doi.org/10.1016/j.biocon.2012.04.034
- Google Scholar
(2015) First evidence of hybridization between golden jackal (Canis aureus) and domestic dog (Canis familiaris) as revealed by genetic markers
Royal Society Open Science 2:150450.

https://doi.org/10.1098/rsos.150450
- PubMed
- Google Scholar
(2009) Mitochondrial DNA as a marker of molecular diversity: a reappraisal
Molecular Ecology 18:4541–4550.

https://doi.org/10.1111/j.1365-294X.2009.04380.x
- PubMed
- Google Scholar
1. Galtier N
(2019) Delineating species in the speciation continuum: a proposal
Evolutionary Applications 12:657–663.

https://doi.org/10.1111/eva.12748
- PubMed
- Google Scholar
1. Gan HM
2. Falk S
3. Morales HE
4. Austin CM
5. Sunnucks P
6. Pavlova A
(2019) Genomic evidence of neo-sex chromosomes in the eastern yellow robin
GigaScience 8:giz131.

https://doi.org/10.1093/gigascience/giz131
- PubMed
- Google Scholar
Preprint
1. Garrison E
2. Marth G
(2012) Haplotype-based variant detection from short-read sequencing
arXiv.

https://arxiv.org/abs/1207.3907
- Google Scholar
(2017) The wolf reference genome sequence (Canis lupus lupus) and its implications for Canis spp. population genomics
BMC Genomics 18:495.

https://doi.org/10.1186/s12864-017-3883-3
- PubMed
- Google Scholar
(2018) Interspecific gene flow shaped the evolution of the genus Canis
Current Biology 28:3441–3449.

https://doi.org/10.1016/j.cub.2018.08.041
- PubMed
- Google Scholar
1. Grant KM
2. Rohling EJ
3. Westerhold T
4. Zabel M
5. Heslop D
6. Konijnendijk T
7. Lourens L
(2017) A 3 million year index for North African humidity/aridity and the implication of potential pan-African humid periods
Quaternary Science Reviews 171:100–118.

https://doi.org/10.1016/j.quascirev.2017.07.005
- Google Scholar
1. Guéguen L
2. Gaillard S
3. Boussau B
4. Gouy M
5. Groussin M
6. Rochette NC
7. Bigot T
8. Fournier D
9. Pouyet F
10. Cahais V
11. Bernard A
12. Scornavacca C
13. Nabholz B
14. Haudry A
15. Dachary L
16. Galtier N
17. Belkhir K
18. Dutheil JY
(2013) Bio++: efficient extensible libraries and tools for computational molecular evolution
Molecular Biology and Evolution 30:1745–1750.

https://doi.org/10.1093/molbev/mst097
- PubMed
- Google Scholar
1. Guschanski K
2. Krause J
3. Sawyer S
4. Valente LM
5. Bailey S
6. Finstermeier K
7. Sabin R
8. Gilissen E
9. Sonet G
10. Nagy ZT
11. Lenglet G
12. Mayer F
13. Savolainen V
(2013) Next-generation museomics disentangles one of the largest primate radiations
Systematic Biology 62:539–554.

https://doi.org/10.1093/sysbio/syt018
- PubMed
- Google Scholar
Book
1. Heller E
(1913)
New Antelopes and Carnivores From British East Africa

Smithson Misc Collect.
- Google Scholar
1. Hollister N
(1924) East African mammals in the United States National Museum, part 1: Insectivora, Chiroptera, and Carnivora
Bulletin - United States National Museum. 99:1–184.

https://doi.org/10.5962/bhl.part.21107
- Google Scholar
1. Holsinger KE
2. Weir BS
(2009) Genetics in geographically structured populations: defining, estimating and interpreting F(ST)
Nature Reviews Genetics 10:639–650.

https://doi.org/10.1038/nrg2611
- PubMed
- Google Scholar
1. Holt C
2. Yandell M
(2011) MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects
BMC Bioinformatics 12:491.

https://doi.org/10.1186/1471-2105-12-491
- PubMed
- Google Scholar
(1992) Estimation of levels of gene flow from DNA sequence data
Genetics 132:583–589.

https://doi.org/10.1093/genetics/132.2.583
- PubMed
- Google Scholar
Software
1. IUCN
(2020) The IUCN Red List of Threatened Species, version 2020-1
The IUCN Red List of Threatened Species.

https://www.iucnredlist.org
1. Jain M
2. Olsen HE
3. Paten B
4. Akeson M
(2016) The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community
Genome Biology 17:1–11.

https://doi.org/10.1186/s13059-016-1103-0
- PubMed
- Google Scholar
1. Jain M
2. Koren S
3. Miga KH
4. Quick J
5. Rand AC
6. Sasani TA
7. Tyson JR
8. Beggs AD
9. Dilthey AT
10. Fiddes IT
11. Malla S
12. Marriott H
13. Nieto T
14. O'Grady J
15. Olsen HE
16. Pedersen BS
17. Rhie A
18. Richardson H
19. Quinlan AR
20. Snutch TP
21. Tee L
22. Paten B
23. Phillippy AM
24. Simpson JT
25. Loman NJ
26. Loose M
(2018) Nanopore sequencing and assembly of a human genome with ultra-long reads
Nature Biotechnology 36:338–345.

https://doi.org/10.1038/nbt.4060
- PubMed
- Google Scholar
(2006) Phylogenomics: the beginning of incongruence?
Trends in Genetics 22:225–231.

https://doi.org/10.1016/j.tig.2006.02.003
- PubMed
- Google Scholar
(2019) A hybrid de novo assembly of the sea pansy (Renilla muelleri) genome
GigaScience 8:giz026.

https://doi.org/10.1093/gigascience/giz026
- Google Scholar
1. Johnson TC
2. Werne JP
3. Brown ET
4. Abbott A
5. Berke M
6. Steinman BA
7. Halbur J
8. Contreras S
9. Grosshuesch S
10. Deino A
11. Scholz CA
12. Lyons RP
13. Schouten S
14. Damsté JS
(2016) A progressively wetter climate in southern East Africa over the past 1.3 million years
Nature 537:220–224.

https://doi.org/10.1038/nature19065
- PubMed
- Google Scholar
(2019) Hybrid genome assembly and annotation of Danionella translucida
Scientific Data 6:1–7.

https://doi.org/10.1038/s41597-019-0161-z
- PubMed
- Google Scholar
(2017) ModelFinder: fast model selection for accurate phylogenetic estimates
Nature Methods 14:587–589.

https://doi.org/10.1038/nmeth.4285
- PubMed
- Google Scholar
1. Katoh K
2. Standley DM
(2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability
Molecular Biology and Evolution 30:772–780.

https://doi.org/10.1093/molbev/mst010
- PubMed
- Google Scholar
1. Kearse M
2. Moir R
3. Wilson A
4. Stones-Havas S
5. Cheung M
6. Sturrock S
7. Buxton S
8. Cooper A
9. Markowitz S
10. Duran C
11. Thierer T
12. Ashton B
13. Meintjes P
14. Drummond A
(2012) Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data
Bioinformatics 28:1647–1649.

https://doi.org/10.1093/bioinformatics/bts199
- PubMed
- Google Scholar
1. Kim S
2. Cho YS
3. Kim HM
4. Chung O
5. Kim H
6. Jho S
7. Seomun H
8. Kim J
9. Young Bang W
10. Kim C
11. An J
12. Bae CH
13. Bhak Y
14. Jeon S
15. Yoon H
16. Kim Y
17. Jun J
18. Lee H
19. Cho S
20. Uphyrkina O
21. Kostyria A
22. Goodrich J
23. Miquelle D
24. Roelke M
25. Lewis J
26. Yurchenko A
27. Bankevich A
28. Cho J
29. Lee S
30. Edwards JS
31. Weber JA
32. Cook J
33. Kim S
34. Lee H
35. Manica A
36. Lee I
37. O'Brien SJ
38. Bhak J
39. Yeo JH
(2016) Comparison of carnivore, omnivore, and herbivore mammalian genomes with a new leopard assembly
Genome Biology 17:211.

https://doi.org/10.1186/s13059-016-1071-4
- PubMed
- Google Scholar
1. Koehler CE
2. Richardson PRK
(1990) Proteles cristatus
Mammalian Species 23:1–6.

https://doi.org/10.2307/3504197
- Google Scholar
(2006) Molecular systematics of the Hyaenidae: relationships of a relictual lineage resolved by a molecular supermatrix
Molecular Phylogenetics and Evolution 38:603–620.

https://doi.org/10.1016/j.ympev.2005.10.017
- PubMed
- Google Scholar
(2015) The genome 10K project: a way forward
Annual Review of Animal Biosciences 3:57–111.

https://doi.org/10.1146/annurev-animal-090414-014900
- PubMed
- Google Scholar
1. Koren S
2. Walenz BP
3. Berlin K
4. Miller JR
5. Bergman NH
6. Phillippy AM
(2017) Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation
Genome Research 27:722–736.

https://doi.org/10.1101/gr.215087.116
- PubMed
- Google Scholar
(2012) Statistics and truth in phylogenomics
Molecular Biology and Evolution 29:457–472.

https://doi.org/10.1093/molbev/msr202
- PubMed
- Google Scholar
(2017) Singularity: scientific containers for mobility of compute
PLOS ONE 12:e0177459.

https://doi.org/10.1371/journal.pone.0177459
- PubMed
- Google Scholar
1. Kwan HH
2. Culibrk L
3. Taylor GA
4. Leelakumari S
5. Tan R
6. Jackman SD
7. Tse K
8. MacLeod T
9. Cheng D
10. Chuah E
11. Kirk H
12. Pandoh P
13. Carlsen R
14. Zhao Y
15. Mungall AJ
16. Moore R
17. Birol I
18. Marra MA
19. Rosen DAS
20. Haulena M
21. Jones SJM
(2019) The genome of the Steller sea lion (Eumetopias jubatus)
Genes 10:486.

https://doi.org/10.3390/genes10070486
- Google Scholar
1. Laetsch DR
2. Blaxter ML
(2017) BlobTools: interrogation of genome assemblies
F1000Research 6:1287.

https://doi.org/10.12688/f1000research.12232.1
- Google Scholar
(2013) PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment
Systematic Biology 62:611–615.

https://doi.org/10.1093/sysbio/syt022
- PubMed
- Google Scholar
(2009) The Sequence Alignment/Map format and SAMtools
Bioinformatics 25:2078–2079.

https://doi.org/10.1093/bioinformatics/btp352
- PubMed
- Google Scholar
1. Li R
2. Fan W
3. Tian G
4. Zhu H
5. He L
6. Cai J
7. Huang Q
8. Cai Q
9. Li B
10. Bai Y
11. Zhang Z
12. Zhang Y
13. Wang W
14. Li J
15. Wei F
16. Li H
17. Jian M
18. Li J
19. Zhang Z
20. Nielsen R
21. Li D
22. Gu W
23. Yang Z
24. Xuan Z
25. Ryder OA
26. Leung FC
27. Zhou Y
28. Cao J
29. Sun X
30. Fu Y
31. Fang X
32. Guo X
33. Wang B
34. Hou R
35. Shen F
36. Mu B
37. Ni P
38. Lin R
39. Qian W
40. Wang G
41. Yu C
42. Nie W
43. Wang J
44. Wu Z
45. Liang H
46. Min J
47. Wu Q
48. Cheng S
49. Ruan J
50. Wang M
51. Shi Z
52. Wen M
53. Liu B
54. Ren X
55. Zheng H
56. Dong D
57. Cook K
58. Shan G
59. Zhang H
60. Kosiol C
61. Xie X
62. Lu Z
63. Zheng H
64. Li Y
65. Steiner CC
66. Lam TT
67. Lin S
68. Zhang Q
69. Li G
70. Tian J
71. Gong T
72. Liu H
73. Zhang D
74. Fang L
75. Ye C
76. Zhang J
77. Hu W
78. Xu A
79. Ren Y
80. Zhang G
81. Bruford MW
82. Li Q
83. Ma L
84. Guo Y
85. An N
86. Hu Y
87. Zheng Y
88. Shi Y
89. Li Z
90. Liu Q
91. Chen Y
92. Zhao J
93. Qu N
94. Zhao S
95. Tian F
96. Wang X
97. Wang H
98. Xu L
99. Liu X
100. Vinar T
101. Wang Y
102. Lam TW
103. Yiu SM
104. Liu S
105. Zhang H
106. Li D
107. Huang Y
108. Wang X
109. Yang G
110. Jiang Z
111. Wang J
112. Qin N
113. Li L
114. Li J
115. Bolund L
116. Kristiansen K
117. Wong GK
118. Olson M
119. Zhang X
120. Li S
121. Yang H
122. Wang J
123. Wang J
(2010) The sequence and de novo assembly of the giant panda genome
Nature 463:311–317.

https://doi.org/10.1038/nature08696
- PubMed
- Google Scholar
Preprint
1. Li H
(2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM
arXiv.

https://arxiv.org/abs/1303.3997
- Google Scholar
1. Li G
2. Davis BW
3. Eizirik E
4. Murphy WJ
(2016) Phylogenomic evidence for ancient hybridization in the genomes of living cats (Felidae)
Genome Research 26:1–11.

https://doi.org/10.1101/gr.186668.114
- PubMed
- Google Scholar
1. Li G
2. Figueiró HV
3. Eizirik E
4. Murphy WJ
(2019) Recombination-aware phylogenomics reveals the structured genomic landscape of hybridizing cat species
Molecular Biology and Evolution 36:2111–2126.

https://doi.org/10.1093/molbev/msz139
- PubMed
- Google Scholar
1. Li H
2. Durbin R
(2011) Inference of human population history from individual whole-genome sequences
Nature 475:493–496.

https://doi.org/10.1038/nature10231
- PubMed
- Google Scholar
1. Lim S
2. Chase BM
3. Chevalier M
4. Reimer PJ
(2016) 50,000 years of vegetation and climate change in the southern Namib Desert, Pella, South Africa
Palaeogeography, Palaeoclimatology, Palaeoecology 451:197–209.

https://doi.org/10.1016/j.palaeo.2016.03.001
- Google Scholar
1. Lindblad-Toh K
2. Wade CM
3. Mikkelsen TS
4. Karlsson EK
5. Jaffe DB
6. Kamal M
7. Clamp M
8. Chang JL
9. Kulbokas EJ
10. Zody MC
11. Mauceli E
12. Xie X
13. Breen M
14. Wayne RK
15. Ostrander EA
16. Ponting CP
17. Galibert F
18. Smith DR
19. DeJong PJ
20. Kirkness E
21. Alvarez P
22. Biagi T
23. Brockman W
24. Butler J
25. Chin CW
26. Cook A
27. Cuff J
28. Daly MJ
29. DeCaprio D
30. Gnerre S
31. Grabherr M
32. Kellis M
33. Kleber M
34. Bardeleben C
35. Goodstadt L
36. Heger A
37. Hitte C
38. Kim L
39. Koepfli KP
40. Parker HG
41. Pollinger JP
42. Searle SM
43. Sutter NB
44. Thomas R
45. Webber C
46. Baldwin J
47. Abebe A
48. Abouelleil A
49. Aftuck L
50. Ait-Zahra M
51. Aldredge T
52. Allen N
53. An P
54. Anderson S
55. Antoine C
56. Arachchi H
57. Aslam A
58. Ayotte L
59. Bachantsang P
60. Barry A
61. Bayul T
62. Benamara M
63. Berlin A
64. Bessette D
65. Blitshteyn B
66. Bloom T
67. Blye J
68. Boguslavskiy L
69. Bonnet C
70. Boukhgalter B
71. Brown A
72. Cahill P
73. Calixte N
74. Camarata J
75. Cheshatsang Y
76. Chu J
77. Citroen M
78. Collymore A
79. Cooke P
80. Dawoe T
81. Daza R
82. Decktor K
83. DeGray S
84. Dhargay N
85. Dooley K
86. Dooley K
87. Dorje P
88. Dorjee K
89. Dorris L
90. Duffey N
91. Dupes A
92. Egbiremolen O
93. Elong R
94. Falk J
95. Farina A
96. Faro S
97. Ferguson D
98. Ferreira P
99. Fisher S
100. FitzGerald M
101. Foley K
102. Foley C
103. Franke A
104. Friedrich D
105. Gage D
106. Garber M
107. Gearin G
108. Giannoukos G
109. Goode T
110. Goyette A
111. Graham J
112. Grandbois E
113. Gyaltsen K
114. Hafez N
115. Hagopian D
116. Hagos B
117. Hall J
118. Healy C
119. Hegarty R
120. Honan T
121. Horn A
122. Houde N
123. Hughes L
124. Hunnicutt L
125. Husby M
126. Jester B
127. Jones C
128. Kamat A
129. Kanga B
130. Kells C
131. Khazanovich D
132. Kieu AC
133. Kisner P
134. Kumar M
135. Lance K
136. Landers T
137. Lara M
138. Lee W
139. Leger JP
140. Lennon N
141. Leuper L
142. LeVine S
143. Liu J
144. Liu X
145. Lokyitsang Y
146. Lokyitsang T
147. Lui A
148. Macdonald J
149. Major J
150. Marabella R
151. Maru K
152. Matthews C
153. McDonough S
154. Mehta T
155. Meldrim J
156. Melnikov A
157. Meneus L
158. Mihalev A
159. Mihova T
160. Miller K
161. Mittelman R
162. Mlenga V
163. Mulrain L
164. Munson G
165. Navidi A
166. Naylor J
167. Nguyen T
168. Nguyen N
169. Nguyen C
170. Nguyen T
171. Nicol R
172. Norbu N
173. Norbu C
174. Novod N
175. Nyima T
176. Olandt P
177. O'Neill B
178. O'Neill K
179. Osman S
180. Oyono L
181. Patti C
182. Perrin D
183. Phunkhang P
184. Pierre F
185. Priest M
186. Rachupka A
187. Raghuraman S
188. Rameau R
189. Ray V
190. Raymond C
191. Rege F
192. Rise C
193. Rogers J
194. Rogov P
195. Sahalie J
196. Settipalli S
197. Sharpe T
198. Shea T
199. Sheehan M
200. Sherpa N
201. Shi J
202. Shih D
203. Sloan J
204. Smith C
205. Sparrow T
206. Stalker J
207. Stange-Thomann N
208. Stavropoulos S
209. Stone C
210. Stone S
211. Sykes S
212. Tchuinga P
213. Tenzing P
214. Tesfaye S
215. Thoulutsang D
216. Thoulutsang Y
217. Topham K
218. Topping I
219. Tsamla T
220. Vassiliev H
221. Venkataraman V
222. Vo A
223. Wangchuk T
224. Wangdi T
225. Weiand M
226. Wilkinson J
227. Wilson A
228. Yadav S
229. Yang S
230. Yang X
231. Young G
232. Yu Q
233. Zainoun J
234. Zembek L
235. Zimmer A
236. Lander ES
(2005) Genome sequence, comparative analysis and haplotype structure of the domestic dog
Nature 438:803–819.

https://doi.org/10.1038/nature04338
- PubMed
- Google Scholar
1. Liu S
2. Lorenzen ED
3. Fumagalli M
4. Li B
5. Harris K
6. Xiong Z
7. Zhou L
8. Korneliussen TS
9. Somel M
10. Babbitt C
11. Wray G
12. Li J
13. He W
14. Wang Z
15. Fu W
16. Xiang X
17. Morgan CC
18. Doherty A
19. O'Connell MJ
20. McInerney JO
21. Born EW
22. Dalén L
23. Dietz R
24. Orlando L
25. Sonne C
26. Zhang G
27. Nielsen R
28. Willerslev E
29. Wang J
(2014) Population genomics reveal recent speciation and rapid evolutionary adaptation in polar bears
Cell 157:785–794.

https://doi.org/10.1016/j.cell.2014.03.054
- PubMed
- Google Scholar
1. Liu S
2. Hansen MM
(2017) PSMC (pairwise sequentially Markovian coalescent) analysis of RAD (restriction site associated DNA) sequencing data
Molecular Ecology Resources 17:631–641.

https://doi.org/10.1111/1755-0998.12606
- PubMed
- Google Scholar
(2012) Comparative phylogeography of African savannah ungulates
Molecular Ecology 21:3656–3670.

https://doi.org/10.1111/j.1365-294X.2012.05650.x
- PubMed
- Google Scholar
1. Loughry WJ
2. McDonough CM
(1996) Are road kills valid indicators of armadillo population structure?
American Midland Naturalist 135:53–59.

https://doi.org/10.2307/2426871
- Google Scholar
1. Luo R
2. Liu B
3. Xie Y
4. Li Z
5. Huang W
6. Yuan J
7. He G
8. Chen Y
9. Pan Q
10. Liu Y
11. Tang J
12. Wu G
13. Zhang H
14. Shi Y
15. Liu Y
16. Yu C
17. Wang B
18. Lu Y
19. Han C
20. Cheung DW
21. Yiu SM
22. Peng S
23. Xiaoqian Z
24. Liu G
25. Liao X
26. Li Y
27. Yang H
28. Wang J
29. Lam TW
30. Wang J
(2012) SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler
GigaScience 1:18.

https://doi.org/10.1186/2047-217X-1-18
- PubMed
- Google Scholar
1. Maigret TA
(2019) Snake scale clips as a source of high quality DNA suitable for RAD sequencing
Conservation Genetics Resources 11:373–375.

https://doi.org/10.1007/s12686-018-1019-y
- Google Scholar
1. Maslin MA
2. Brierley CM
3. Milner AM
4. Shultz S
5. Trauth MH
6. Wilson KE
(2014) East African climate pulses and early human evolution
Quaternary Science Reviews 101:1–17.

https://doi.org/10.1016/j.quascirev.2014.06.012
- Google Scholar
1. Meredith RW
2. Janečka JE
3. Gatesy J
4. Ryder OA
5. Fisher CA
6. Teeling EC
7. Goodbla A
8. Eizirik E
9. Simão TL
10. Stadler T
11. Rabosky DL
12. Honeycutt RL
13. Flynn JJ
14. Ingram CM
15. Steiner C
16. Williams TL
17. Robinson TJ
18. Burk-Herrick A
19. Westerman M
20. Ayoub NA
21. Springer MS
22. Murphy WJ
(2011) Impacts of the Cretaceous Terrestrial Revolution and KPg extinction on mammal diversification
Science 334:521–524.

https://doi.org/10.1126/science.1211028
- PubMed
- Google Scholar
1. Meyer CP
2. Paulay G
(2005) DNA barcoding: error rates based on comprehensive sampling
PLOS Biology 3:e422.

https://doi.org/10.1371/journal.pbio.0030422
- PubMed
- Google Scholar
1. Miller JM
2. Hallager S
3. Monfort SL
4. Newby J
5. Bishop K
6. Tidmus SA
7. Black P
8. Houston B
9. Matthee CA
10. Fleischer RC
(2011) Phylogeographic analysis of nuclear and mtDNA supports subspecies designations in the ostrich (Struthio camelus)
Conservation Genetics 12:423–431.

https://doi.org/10.1007/s10592-010-0149-x
- Google Scholar
(2020) New methods to calculate concordance factors for phylogenomic datasets
Molecular Biology and Evolution 37:2727–2733.

https://doi.org/10.1093/molbev/msaa106
- PubMed
- Google Scholar
(2013) Demographic divergence history of pied flycatcher and collared flycatcher inferred from whole-genome re-sequencing data
PLOS Genetics 9:e1003942.

https://doi.org/10.1371/journal.pgen.1003942
- PubMed
- Google Scholar
(2015) IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies
Molecular Biology and Evolution 32:268–274.

https://doi.org/10.1093/molbev/msu300
- PubMed
- Google Scholar
1. Nicholls SM
2. Quick JC
3. Tang S
4. Loman NJ
(2019) Ultra-deep, long-read nanopore sequencing of mock microbial community standards
GigaScience 8:giz043.

https://doi.org/10.1093/gigascience/giz043
- PubMed
- Google Scholar
(2017) gVolante for standardizing completeness assessment of genome and transcriptome assemblies
Bioinformatics 33:3635–3637.

https://doi.org/10.1093/bioinformatics/btx445
- PubMed
- Google Scholar
(2017) Field-based species identification of closely-related plants using real-time nanopore sequencing
Scientific Reports 7:8345.

https://doi.org/10.1038/s41598-017-08461-5
- PubMed
- Google Scholar
1. Peng R
2. Zeng B
3. Meng X
4. Yue B
5. Zhang Z
6. Zou F
(2007) The complete mitochondrial genome and phylogenetic analysis of the giant panda (Ailuropoda melanoleuca)
Gene 397:76–83.

https://doi.org/10.1016/j.gene.2007.04.009
- PubMed
- Google Scholar
(2018) Testing the value of citizen science for roadkill studies: a case study from South Africa
Frontiers in Ecology and Evolution 6:15.

https://doi.org/10.3389/fevo.2018.00015
- Google Scholar
(2018) Real-time DNA barcoding in a rainforest using nanopore sequencing: opportunities for rapid biodiversity assessments and local capacity building
GigaScience 7:giy033.

https://doi.org/10.1093/gigascience/giy033
- Google Scholar
Software
1. R Development Core Team
(2020) R: A Language and Environment for Statistical Computing
R Foundation for Statistical Computing, Vienna, Austria.

http://www.r-project.org
Conference
(2011)
LastZ: an ultra optimized 3D networks-on-chip architecture

2011 14th Euromicro Conference on Digital System Design.
- Google Scholar
1. Ranwez V
2. Douzery EJP
3. Cambon C
4. Chantret N
5. Delsuc F
(2018) MACSE v2: toolkit for the alignment of coding sequences accounting for frameshifts and stop codons
Molecular Biology and Evolution 35:2582–2584.

https://doi.org/10.1093/molbev/msy159
- PubMed
- Google Scholar
(2021) Aligning protein-coding nucleotide sequences with MACSE
Methods in Molecular Biology 2231:51–70.

https://doi.org/10.1007/978-1-0716-1036-7_4
- PubMed
- Google Scholar
1. Ravinet M
2. Westram A
3. Johannesson K
4. Butlin R
5. André C
6. Panova M
(2016) Shared and nonshared genomic divergence in parallel ecotypes of Littorina saxatilis at a local scale
Molecular Ecology 25:287–305.

https://doi.org/10.1111/mec.13332
- PubMed
- Google Scholar
1. Roberts A
(1932)
Preliminary description of fifty-seven new forms of South African mammals

Ann Transvaal Museum 15:1–19.
- Google Scholar
Book
1. Roberts A
(1951)
The Mammals of South Africa

Mamm South Africa.
- Google Scholar
1. Rohland N
2. Pollack JL
3. Nagel D
4. Beauval C
5. Airvaux J
6. Pääbo S
7. Hofreiter M
(2005) The population history of extant and extinct hyenas
Molecular Biology and Evolution 22:2435–2443.

https://doi.org/10.1093/molbev/msi244
- PubMed
- Google Scholar
1. Rothschild LW
(1902)
Two new subspecies of Proteles

Novit Zool 9:443.
- Google Scholar
1. Roux C
2. Fraïsse C
3. Romiguier J
4. Anciaux Y
5. Galtier N
6. Bierne N
(2016) Shedding light on the grey zone of speciation along a continuum of genomic divergence
PLOS Biology 14:e2000234.

https://doi.org/10.1371/journal.pbio.2000234
- PubMed
- Google Scholar
(2009) A semi-aquatic Arctic mammalian carnivore from the Miocene epoch and origin of Pinnipedia
Nature 458:1021–1024.

https://doi.org/10.1038/nature07985
- PubMed
- Google Scholar
1. Sato JJ
2. Wolsan M
3. Suzuki H
4. Hosoda T
5. Yamaguchi Y
6. Hiyama K
7. Kobayashi M
8. Minami S
(2006) Evidence from nuclear DNA sequences sheds light on the phylogenetic relationships of Pinnipedia: single origin with affinity to Musteloidea
Zoological Science 23:125–146.

https://doi.org/10.2108/zsj.23.125
- PubMed
- Google Scholar
1. Sato JJ
2. Wolsan M
3. Minami S
4. Hosoda T
5. Sinaga MH
6. Hiyama K
7. Yamaguchi Y
8. Suzuki H
(2009) Deciphering and dating the red panda's ancestry and early adaptive radiation of Musteloidea
Molecular Phylogenetics and Evolution 53:907–922.

https://doi.org/10.1016/j.ympev.2009.08.019
- PubMed
- Google Scholar
(2009) Occurrence of Can-SINEs and intron sequence evolution supports robust phylogeny of pinniped carnivores and their terrestrial relatives
Gene 448:221–226.

https://doi.org/10.1016/j.gene.2009.06.012
- PubMed
- Google Scholar
(2020) The value of monitoring wildlife roadkill
European Journal of Wildlife Research 66:1–12.

https://doi.org/10.1007/s10344-019-1357-4
- Google Scholar
1. Scornavacca C
2. Belkhir K
3. Lopez J
4. Dernat R
5. Delsuc F
6. Douzery EJP
7. Ranwez V
(2019) OrthoMaM v10: scaling-up orthologous coding sequence and exon alignments with more than one hundred mammalian genomes
Molecular Biology and Evolution 36:861–862.

https://doi.org/10.1093/molbev/msz015
- PubMed
- Google Scholar
Preprint
1. Scott AD
2. Zimin A
3. Puiu D
4. Workman R
5. Britton M
6. Zaman S
7. Caballero M
8. Read AC
9. Bogdanove AJ
10. Burns E
11. Wegrzyn J
12. Timp W
13. Salzberg SL
14. Neale DB
(2020) The giant sequoia genome and proliferation of disease resistance genes
bioRxiv.

https://doi.org/10.1101/2020.03.17.995944
- Google Scholar
1. Shafin K
2. Pesout T
3. Lorig-Roach R
4. Haukness M
5. Olsen HE
6. Bosworth C
7. Armstrong J
8. Tigyi K
9. Maurer N
10. Koren S
11. Sedlazeck FJ
12. Marschall T
13. Mayes S
14. Costa V
15. Zook JM
16. Liu KJ
17. Kilburn D
18. Sorensen M
19. Munson KM
20. Vollger MR
21. Monlong J
22. Garrison E
23. Eichler EE
24. Salama S
25. Haussler D
26. Green RE
27. Akeson M
28. Phillippy A
29. Miga KH
30. Carnevali P
31. Jain M
32. Paten B
(2020) Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes
Nature Biotechnology 38:1044–1053.

https://doi.org/10.1038/s41587-020-0503-6
Book
1. Shilling F
2. Perkins SE
(2015) Wildlife/Roadkill Observation and Reporting Systems
In: Collinson W, editors. Handbook of Road Ecology. John Wiley & Sons, Ltd. pp. 492–501.

https://doi.org/10.1002/9781118568170.ch62
(2014) BoxPlotR: a web tool for generation of box plots
Nature Methods 11:121–122.

https://doi.org/10.1038/nmeth.2811
1. Srivathsan A
2. Baloğlu B
3. Wang W
4. Tan WX
5. Bertrand D
6. Ng AHQ
7. Boey EJH
8. Koh JJY
9. Nagarajan N
10. Meier R
(2018) A MinION-based pipeline for fast and cost-effective DNA barcoding
Molecular Ecology Resources 18:1035–1049.

https://doi.org/10.1111/1755-0998.12890
1. Tan MH
2. Austin CM
3. Hammer MP
4. Lee YP
5. Croft LJ
6. Gan HM
(2018) Finding Nemo: hybrid assembly with Oxford Nanopore and Illumina reads greatly improves the clownfish (Amphiprion ocellaris) genome assembly
GigaScience 7:gix137.

https://doi.org/10.1093/gigascience/gix137
1. Tange O
(2011) GNU Parallel: the command-line power tool
;login: The USENIX Magazine 36:42–47.
(2015) A cost-effective straightforward protocol for shotgun Illumina libraries designed to assemble complete mitogenomes from non-model species
Conservation Genetics Resources 7:37–40.

https://doi.org/10.1007/s12686-014-0338-x
Software
(2020) An Optimized Protocol for Sequencing Mammalian Roadkill Tissues with Oxford Nanopore Technology (ONT) V1 (Protocols.io.6bthann).
(2009) Trends, rhythms and events in Plio-Pleistocene African climate
Quaternary Science Reviews 28:399–411.

https://doi.org/10.1016/j.quascirev.2008.11.003
(2010) Hi-C: a method to study the three-dimensional architecture of genomes
Journal of Visualized Experiments 6:1869.

https://doi.org/10.3791/1869
1. van Jaarsveld AS
(1993) A comparative investigation of hyaenid and aardwolf life-histories, with notes on spotted hyaena mortality patterns
Transactions of the Royal Society of South Africa 48:219–232.

https://doi.org/10.1080/00359199309520272
(2017) Fast and accurate de novo genome assembly from long uncorrected reads
Genome Research 27:737–746.

https://doi.org/10.1101/gr.214270.116
(2016) Admixture mapping identifies introgressed genomic regions in North American canids
Molecular Ecology 25:2443–2453.

https://doi.org/10.1111/mec.13667
1. Waetjen DP
2. Shilling FM
(2017) Large extent volunteer roadkill and wildlife observation systems as sources of reliable data
Frontiers in Ecology and Evolution 5:89.

https://doi.org/10.3389/fevo.2017.00089
1. Wahlund S
(1928) Zusammensetzung von Populationen und Korrelationserscheinungen vom Standpunkt der Vererbungslehre aus betrachtet
Hereditas 11:65–106.

https://doi.org/10.1111/j.1601-5223.1928.tb02483.x
1. Walker BJ
2. Abeel T
3. Shea T
4. Priest M
5. Abouelliel A
6. Sakthikumar S
7. Cuomo CA
8. Zeng Q
9. Wortman J
10. Young SK
11. Earl AM
(2014) Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement
PLOS ONE 9:e112963.

https://doi.org/10.1371/journal.pone.0112963
1. Walton LR
2. Joly DO
(2003) Canis mesomelas
Mammalian Species 715:1–9.

https://doi.org/10.1644/715
Preprint
(2019) The draft nuclear genome assembly of Eucalyptus pauciflora: new approaches to comparing de novo assemblies
bioRxiv.

https://doi.org/10.1101/678730
(2020) The draft nuclear genome assembly of Eucalyptus pauciflora: a pipeline for comparing de novo assemblies
GigaScience 9:giz160.

https://doi.org/10.1093/gigascience/giz160
(2018) BUSCO applications from quality assessments to gene prediction and phylogenomics
Molecular Biology and Evolution 35:543–548.

https://doi.org/10.1093/molbev/msx319
1. Weisenfeld NI
2. Yin S
3. Sharpe T
4. Lau B
5. Hegarty R
6. Holmes L
7. Sogoloff B
8. Tabbaa D
9. Williams L
10. Russ C
11. Nusbaum C
12. Lander ES
13. MacCallum I
14. Jaffe DB
(2014) Comprehensive variation discovery in single human genomes
Nature Genetics 46:1350–1355.

https://doi.org/10.1038/ng.3121
1. Wenger AM
2. Peluso P
3. Rowell WJ
4. Chang PC
5. Hall RJ
6. Concepcion GT
7. Ebler J
8. Fungtammasan A
9. Kolesnikov A
10. Olson ND
11. Töpfer A
12. Alonge M
13. Mahmoud M
14. Qian Y
15. Chin CS
16. Phillippy AM
17. Schatz MC
18. Myers G
19. DePristo MA
20. Ruan J
21. Marschall T
22. Sedlazeck FJ
23. Zook JM
24. Li H
25. Koren S
26. Carroll A
27. Rank DR
28. Hunkapiller MW
(2019) Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome
Nature Biotechnology 37:1155–1162.

https://doi.org/10.1038/s41587-019-0217-9
1. Westbury MV
2. Hartmann S
3. Barlow A
4. Wiesel I
5. Leo V
6. Welch R
7. Parker DM
8. Sicks F
9. Ludwig A
10. Dalén L
11. Hofreiter M
(2018) Extended and continuous decline in effective population size results in low genomic diversity in the world's rarest hyena species, the brown hyena
Molecular Biology and Evolution 35:1225–1237.

https://doi.org/10.1093/molbev/msy037
1. Wick RR
2. Judd LM
3. Holt KE
(2019) Performance of neural network basecalling tools for Oxford Nanopore sequencing
Genome Biology 20:129.

https://doi.org/10.1186/s13059-019-1727-y
Book
1. Wickham H
(2016) ggplot2: Elegant Graphics for Data Analysis
Berlin, Germany: Springer.
Software
1. Wilke CO
(2016) cowplot: Streamlined Plot Theme and Plot Annotations for 'ggplot2'

https://rdrr.io/cran/cowplot/
Book
(2009) Handbook of the Mammals of the World
Lynx Edicions.
1. Yu L
2. Zhang YP
(2006) Phylogeny of the caniform Carnivora: evidence from multiple genes
Genetica 127:65–79.

https://doi.org/10.1007/s10709-005-2482-4
(2017) OrthoDB v9.1: cataloging evolutionary and functional annotations for animal, fungal, plant, archaeal, bacterial and viral orthologs
Nucleic Acids Research 45:D744–D749.

https://doi.org/10.1093/nar/gkw1119
1. Zhang C
2. Rabiee M
3. Sayyari E
4. Mirarab S
(2018) ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees
BMC Bioinformatics 19:15–30.

https://doi.org/10.1186/s12859-018-2129-y
1. Zimin AV
2. Marçais G
3. Puiu D
4. Roberts M
5. Salzberg SL
6. Yorke JA
(2013) The MaSuRCA genome assembler
Bioinformatics 29:2669–2677.

https://doi.org/10.1093/bioinformatics/btt476
1. Zimin AV
2. Puiu D
3. Luo MC
4. Zhu T
5. Koren S
6. Marçais G
7. Yorke JA
8. Dvořák J
9. Salzberg SL
(2017) Hybrid assembly of the large and highly repetitive genome of Aegilops tauschii, a progenitor of bread wheat, with the MaSuRCA mega-reads algorithm
Genome Research 27:787–792.

https://doi.org/10.1101/gr.213405.116
1. Zoonomia Consortium
(2020) A comparative genomics multitool for scientific discovery and conservation
Nature 587:240–245.

https://doi.org/10.1038/s41586-020-2876-6

Article and author information

Author details

Rémi Allio

Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France

Contribution
Conceptualization, Resources, Data curation, Software, Formal analysis, Investigation, Methodology, Writing - original draft, Writing - review and editing

For correspondence
rem.allio@yahoo.fr

Competing interests
No competing interests declared

ORCID iD: 0000-0003-3885-5410
Marie-Ka Tilak

Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France

Contribution
Conceptualization, Resources, Data curation, Investigation, Methodology, Writing - review and editing, Oxford Nanopore Sequencing

Competing interests
No competing interests declared
Celine Scornavacca

Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France

Contribution
Data curation, Software, Formal analysis, Methodology, Writing - review and editing

Competing interests
No competing interests declared
Nico L Avenant

National Museum and Centre for Environmental Management, University of the Free State, Bloemfontein, South Africa

Contribution
Resources, Funding acquisition, Writing - review and editing

Competing interests
No competing interests declared
Andrew C Kitchener

Department of Natural Sciences, National Museums Scotland, Edinburgh, United Kingdom

Contribution
Resources, Formal analysis, Writing - review and editing

Competing interests
No competing interests declared
Erwan Corre

CNRS, Sorbonne Université, ABiMS, Station Biologique de Roscoff, Roscoff, France

Contribution
Resources, Funding acquisition, Writing - review and editing

Competing interests
No competing interests declared
Benoit Nabholz
1. Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
2. Institut Universitaire de France (IUF), Paris, France
Contribution
Conceptualization, Data curation, Software, Formal analysis, Supervision, Methodology, Writing - review and editing

Competing interests
No competing interests declared
Frédéric Delsuc

Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France

Contribution
Conceptualization, Resources, Formal analysis, Supervision, Funding acquisition, Validation, Methodology, Writing - original draft, Project administration, Writing - review and editing

For correspondence
frederic.delsuc@umontpellier.fr

Competing interests
No competing interests declared

ORCID iD: 0000-0002-6501-6287

Funding

H2020 European Research Council (ERC‐2015‐CoG‐683257)

Frédéric Delsuc

Agence Nationale de la Recherche (ANR‐10‐LABX‐25‐01)

Rémi Allio
Marie-Ka Tilak
Celine Scornavacca
Benoit Nabholz
Frédéric Delsuc

Agence Nationale de la Recherche (ANR‐10‐LABX‐0004)

Rémi Allio
Marie-Ka Tilak
Celine Scornavacca
Benoit Nabholz
Frédéric Delsuc

Agence Nationale de la Recherche (ANR-11-INBS-0013)

Erwan Corre

National Research Foundation (86321)

Nico L Avenant

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We are indebted to the Broad Institute (http://www.broadinstitute.org), the DNA Zoo (http://www.dnazoo.org), and numerous other sequencing centres and institutions for making their mammalian genomic data publicly available. We would like to thank Rachid Koual and Amandine Magdeleine for technical help with DNA extractions and library preparations, Aude Caizergues and Nathalie Delsuc for fieldwork assistance, and Christian Fontaine, Jean-Christophe Vié (Faune Sauvage, French Guiana), Corine Esser (Fauverie du Mont Faron, Toulon, France), François Catzeflis (ISEM Mammalian Tissue Collection), Adam Ferguson and Bruce Patterson (Field Museum of Natural History, Chicago, USA), and Lily Crowley and Andrew Swales (Hamerton Zoo Park, UK) for access to tissue samples. The National Museum (Bloemfontein, Free State, South Africa) is thanked for its collaboration and for making tissues from the Mammal Collection available for the study. ACK thanks the Negaunee Foundation for their generous support of a curatorial preparator who sampled the East African aardwolf used in this study. We also acknowledge Pierre-Alexandre Gagnaire for helpful discussion on the genetic differentiation index, Brian Chase for providing references on African paleoclimate, and Sérgio Ferreira-Cardoso for taking measurements of aardwolf skulls. We also thank George Perry for handling this manuscript as Senior and Reviewing Editor and the two additional anonymous reviewers for providing helpful comments on a previous version of the manuscript. Robb Cadd kindly made available his aardwolf photographs taken at Hamerton Zoo Park. We thank the Montpellier GenomiX Platform (MGX), part of the France Génomique National Infrastructure, for sequencing data generation. Computational analyses benefited from the Montpellier Bioinformatics Biodiversity (MBB) computing platform. We are also grateful to the Institut Français de Bioinformatique and the Roscoff Bioinformatics platform ABiMS (http://abims.sb-roscoff.fr) for providing help with computing and storage resources. This is contribution ISEM 2021-033 of the Institut des Sciences de l’Evolution de Montpellier.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

Views: 2,522
Downloads: 276
Citations: 25

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Citations by DOI

25 citations for umbrella DOI https://doi.org/10.7554/eLife.63167

Cite this article

Rémi Allio, Marie-Ka Tilak, Celine Scornavacca, Nico L Avenant, Andrew C Kitchener, Erwan Corre, Benoit Nabholz, Frédéric Delsuc (2021) High-quality carnivoran genomes from roadkill samples enable comparative species delineation in aardwolf and bat-eared fox. eLife 10:e63167. https://doi.org/10.7554/eLife.63167

Categories and tags