Research Article

Genetics and Genomics

A new genus of horse from Pleistocene North America

University of California, Santa Cruz, United States
Tromsø University Museum, UiT - The Arctic University of Norway, Norway
Government of Yukon, Canada
American Museum of Natural History, United States
Cogstone Resource Management, Incorporated, United States
California State University San Bernardino, United States
Harvard University, United States
German Consortium for Translational Cancer Research, Germany
University of Alaska Fairbanks, United States
Natural History Museum of Denmark, Denmark
Université Paul Sabatier, Université de Toulouse, France
University of California, Irvine, United States
University of Alberta, Canada

Nov 28, 2017

https://doi.org/10.7554/eLife.29944

Open access
Copyright information

Version of Record: November 28, 2017



1. Part of Collection
Paleontology: A Collection of Articles

Edited by Ian Baldwin et al.


Abstract

The extinct ‘New World stilt-legged’, or NWSL, equids constitute a perplexing group of Pleistocene horses endemic to North America. Their slender distal limb bones resemble those of Asiatic asses, such as the Persian onager. Previous palaeogenetic studies, however, have suggested a closer relationship to caballine horses than to Asiatic asses. Here, we report complete mitochondrial and partial nuclear genomes from NWSL equids from across their geographic range. Although multiple NWSL equid species have been named, our palaeogenomic and morphometric analyses support the idea that there was only a single species of middle to late Pleistocene NWSL equid, and demonstrate that it falls outside of crown group Equus. We therefore propose a new genus, Haringtonhippus, for the sole species H. francisci. Our combined genomic and phenomic approach to resolving the systematics of extinct megafauna will allow for an improved understanding of the full extent of the terminal Pleistocene extinction event.

https://doi.org/10.7554/eLife.29944.001

eLife digest

The horse family – which also includes zebras, donkeys and asses – is often featured on the pages of textbooks about evolution. All living horses belong to a group, or genus, called Equus. The fossil record shows how the ancestors of these animals evolved from dog-sized, three-toed browsers to larger, one-toed grazers. This process took around 55 million years, and many members of the horse family tree went extinct along the way.

Nevertheless, the details of the horse family tree over the past 2.5 million years remain poorly understood. In North America, horses from this period – which is referred to as the Pleistocene – have been classed into two major groups: stout-legged horses and stilt-legged horses. Both groups became extinct near the end of the Pleistocene in North America, and it was not clear how they relate to one another. Based on their anatomy, many scientists suggested that stilt-legged horses were most closely related to modern-day asses living in Asia. Yet, other studies using ancient DNA placed the stilt-legged horses closer to the stout-legged horses.

Heintzman et al. set out to resolve where the stilt-legged horses sit within the horse family tree by examining more ancient DNA than the previous studies. The analyses showed that the stilt-legged horses were much more distinct than previously thought. In fact, contrary to all previous findings, these animals actually belonged outside of the genus Equus. Heintzman et al. named the new genus for the stilt-legged horses Haringtonhippus, and showed that all stilt-legged horses belonged to a single species within this genus, Haringtonhippus francisci.

Together these new findings provide a benchmark for reclassifying problematic fossil groups across the tree of life. A similar approach could be used to resolve the relationships in other problematic groups of Pleistocene animals, such as mammoths and bison. This would give scientists a more nuanced understanding of evolution and extinction during this period.

https://doi.org/10.7554/eLife.29944.002

Introduction

The family that includes modern horses, asses, and zebras, the Equidae, is a classic model of macroevolution. The excellent fossil record of this family clearly documents its ~55 million year evolution from dog-sized hyracotheres through many intermediate forms and extinct offshoots to present-day Equus, which comprises all living equid species (MacFadden, 1992). The downside of this excellent fossil record is that many dubious fossil equid taxa have been erected, a problem especially acute within Pleistocene Equus of North America (Macdonald et al., 1992). While numerous species are described from the fossil record, molecular data suggest that most belonged to, or were closely related to, a single, highly variable stout-legged caballine species that includes the domestic horse, E. caballus (Weinstock et al., 2005). The enigmatic and extinct ‘New World stilt-legged’ (NWSL) forms, however, exhibit a perplexing mix of morphological characters, including slender, stilt-like distal limb bones with narrow hooves reminiscent of extant Eurasian hemionines, the Asiatic wild asses (E. hemionus, E. kiang) (Eisenmann, 1992; Eisenmann et al., 2008; Harington and Clulow, 1973; Lundelius and Stevens, 1970; Scott, 2004), and dentitions that have been interpreted as more consistent with either caballine horses (Lundelius and Stevens, 1970) or hemionines (MacFadden, 1992).

On the basis of their slender distal limb bones, the NWSL equids have traditionally been considered as allied to hemionines (e.g. Eisenmann et al., 2008; Guthrie, 2003; Scott, 2004; Skinner and Hibbard, 1972). Palaeogenetic analyses based on mitochondrial DNA (mtDNA) have, however, consistently placed NWSL equids closer to caballine horses (Barrón-Ortiz et al., 2017; Der Sarkissian et al., 2015; Orlando et al., 2008, 2009; Vilstrup et al., 2013; Weinstock et al., 2005). The current mtDNA-based phylogenetic model therefore suggests that the stilt-legged morphology arose independently in the New and Old Worlds (Weinstock et al., 2005) and may represent convergent adaptations to arid climates and habitats (Eisenmann, 1985). However, these models have been based on two questionable data sources. The first consists of 15 short control region sequences (<1000 base pairs, bp; Barrón-Ortiz et al., 2017; Weinstock et al., 2005), a data type that can be unreliable for resolving the placement of major equid groups (Der Sarkissian et al., 2015; Orlando et al., 2009). The second consists of two mitochondrial genome sequences (Vilstrup et al., 2013) that are either incomplete or otherwise problematic (see Results). Given continuing uncertainty regarding the phylogenetic placement of NWSL equids, which impedes our understanding of Pleistocene equid evolution in general, we sought to resolve their position using multiple mitochondrial and partial nuclear genomes from specimens representing as many parts of late Pleistocene North America as possible.

The earliest recognized NWSL equid fossils date to the late Pliocene/early Pleistocene (~2–3 million years ago, Ma) of New Mexico (Azzaroli and Voorhies, 1993; Eisenmann, 2003; Eisenmann et al., 2008). Middle and late Pleistocene forms tended to be smaller in stature than their early Pleistocene kin, and ranged across southern and extreme northwestern North America (i.e. eastern Beringia, which includes Alaska, USA and Yukon Territory, Canada). NWSL equids have been assigned to several named species, such as E. conversidens Owen 1869, E. tau Owen 1869, E. francisci Hay (1915), E. calobatus Troxell 1915, and E. (Asinus) cf. kiang, but there is considerable confusion and disagreement regarding their taxonomy. Consequently, some researchers have chosen to refer to them collectively as Equus (Hemionus) spp. (Guthrie, 2003; Scott, 2004), or avoid a formal taxonomic designation altogether (Der Sarkissian et al., 2015; Vilstrup et al., 2013; Weinstock et al., 2005). Using our phylogenetic framework and comparisons between specimens identified by palaeogenomics and/or morphology, we attempted to determine the taxonomy of middle-late Pleistocene NWSL equids.

Radiocarbon (¹⁴C) dates from Gypsum Cave, Nevada, confirm that NWSL equids persisted in areas south of the continental ice sheets during the last glacial maximum (LGM; ~26–19 thousand years before present (ka BP); Clark et al., 2009) until near the terminal Pleistocene, ~13 thousand radiocarbon years before present (¹⁴C ka BP) (Weinstock et al., 2005), soon after which they became extinct, along with their caballine counterparts and most other coeval species of megafauna (Koch and Barnosky, 2006). This contrasts with dates from unglaciated eastern Beringia, where NWSL equids were seemingly extirpated locally during a relatively mild interstadial interval centered on ~31 ¹⁴C ka BP (Guthrie, 2003), thus prior to the LGM (Clark et al., 2009), final loss of caballine horses (Guthrie, 2003; 2006), and arrival of humans in the region (Guthrie, 2006). The apparently discrepant extirpation chronology between NWSL equids south and north of the continental ice sheets implies that their populations responded variably to demographic pressures in different parts of their range, which is consistent with results from some other megafauna (Guthrie, 2006; Zazula et al., 2014; Zazula et al., 2017). To further test this extinction chronology, we generated new radiocarbon dates from eastern Beringian NWSL equids.

We analyzed 26 full mitochondrial genomes and 17 partial nuclear genomes from late Pleistocene NWSL equids, which revealed that individuals from both eastern Beringia and southern North America form a single well-supported clade that falls outside the diversity of Equus and diverged from the lineage leading to Equus during the latest Miocene or early Pliocene. This novel and robust phylogenetic placement warrants the recognition of NWSL equids as a distinct genus, which we here name Haringtonhippus. After reviewing potential species names and conducting morphometric and anatomical comparisons, we determined that, based on the earliest-described specimen bearing diagnosable features, francisci Hay is the most well-supported species name. We therefore refer the analyzed NWSL equid specimens to H. francisci. New radiocarbon dates revealed that H. francisci was extirpated in eastern Beringia ~14 ¹⁴C ka BP. In light of our analyses, we review the Plio-Pleistocene evolutionary history of equids, and the implications for the systematics of equids and other Pleistocene megafauna.

Results

Phylogeny of North American late Pleistocene and extant equids

We reconstructed whole mitochondrial genomes from 26 NWSL equids and four New World caballine Equus (two E. lambei, two E. cf. scotti). Using these and mitochondrial genomes of representatives from all extant and several late Pleistocene equids, we estimated a mitochondrial phylogeny, using a variety of outgroups (Appendix 1, Appendix 2—tables 1–2, and Supplementary file 1). The resulting phylogeny is mostly consistent with previous studies (Der Sarkissian et al., 2015; Vilstrup et al., 2013), including confirmation of NWSL equid monophyly (Weinstock et al., 2005). However, we recover a strongly supported placement of the NWSL equid clade outside of crown group diversity (Equus), but closer to Equus than to Hippidion (Figure 1, Figure 1—figure supplement 1a, Figure 1—source data 1, and Appendix 2—tables 1–2). In contrast, previous palaeogenetic studies placed the NWSL equids within crown group Equus, closer to caballine horses than to non-caballine asses and zebras (Barrón-Ortiz et al., 2017; Der Sarkissian et al., 2015; Orlando et al., 2008, 2009; Vilstrup et al., 2013; Weinstock et al., 2005). To explore possible causes for this discrepancy, we reconstructed mitochondrial genomes from previously sequenced NWSL equid specimens and used a maximum likelihood evolutionary placement algorithm (Berger et al., 2011) to place these published sequences in our phylogeny a posteriori. These analyses suggested that previous results were likely due to a combination of outgroup choice and the use of short, incomplete, or problematic mtDNA sequences (Appendix 2 and Appendix 2—table 3).

Figure 1

Phylogeny of extant and middle-late Pleistocene equids, as inferred from the Bayesian analysis of full mitochondrial genomes.

Purple node-bars illustrate the 95% highest posterior density of node heights and are shown for nodes with >0.99 posterior probability support. The range of divergence estimates derived from our nuclear genomic analyses is shown by the thicker, lime green node-bars (Orlando et al., 2013; this study). Nodes highlighted in the main text are labeled with boxed numbers. All analyses were calibrated using as prior information a caballine/non-caballine *Equus* divergence estimate of 4.0–4.5 Ma (Orlando et al., 2013) at node 3, and, in the mitochondrial analyses, the known ages of included ancient specimens. The thicknesses of nodes 2 and 3 represent the range between the median nuclear and mitochondrial genomic divergence estimates. Branches are coloured based on species provenance and the most parsimonious biogeographic scenario given the data, with gray indicating ambiguity. Fossil record occurrences for major represented groups (including South American *Hippidion*, New World stilt-legged equids, and Old World Sussemiones) are represented by the geographically coloured bars, with fade indicating uncertainty in the first appearance datum (after Eisenmann et al., 2008; Forsten, 1992; O'Dea et al., 2016; Orlando et al., 2013; and references therein). The Asiatic ass species (*E. kiang*, *E. hemionus*) are not reciprocally monophyletic based on the analyzed mitochondrial genomes, and so the Asiatic ass clade is shown as ‘*E. kiang + hemionus*’. Daggers denote extinct taxa. NW: New World.

https://doi.org/10.7554/eLife.29944.003

Figure 1—source data 1. Bayesian time tree analysis results, with support and estimated divergence times for major nodes, and the tMRCAs for Haringtonhippus, E. asinus, and E. quagga summarized. All analyses supported topology one in Appendix 2—figure 3. HPD: highest posterior density. https://doi.org/10.7554/eLife.29944.007
Figure 1—source data 2. Statistics from the phylogenetic inference analyses of nuclear genomes using all four approaches. (A) Read mapping statistics. (B) Relative transversion frequencies for approaches 1–3. (C) Relative private transversion frequencies for approach 4. DNA extraction 1: (Rohland et al., 2010); DNA extraction 2: (Dabney et al., 2013b); library preparation 1: (Meyer and Kircher, 2010; Heintzman et al., 2015); library preparation 2: (Meyer and Kircher, 2010; Vilstrup et al., 2013). In (C), data in length bins with fewer than 200,000 called sites are italicized. https://doi.org/10.7554/eLife.29944.008
Figure 1—source data 3. Summary of nuclear genome data from all 17 NWSL equids pooled together and analyzed using approach four. Minimum and maximum NWSL:Equus ratios between relative frequencies are in bold, and are used for the divergence estimates in Figure 1—figure supplement 3. Total and mean values are for the four longest bins only (90–99 to 120–129 bp). Mean values equally weight each length bin. bp: base pairs. https://doi.org/10.7554/eLife.29944.009

To confirm the mtDNA result that NWSL equids fall outside of crown group equid diversity, we sequenced and compared partial nuclear genomes from 17 NWSL equids to a caballine (horse) and a non-caballine (donkey) reference genome. After controlling for reference genome and ancient DNA fragment length artifacts (Appendices 1–2), we examined differences in relative private transversion frequency between these genomes (Appendix 1—figure 1). We found that the relative private transversion frequency for NWSL equids was ~1.4–1.5 times greater than that for horse or donkey (Appendix 2, Figure 1—source data 3, Figure 1—figure supplement 2, and Figure 1—source data 2). This result supports the placement of NWSL equids as sister to the horse-donkey clade (Figure 1—figure supplement 3), a clade that is representative of living Equus diversity (e.g. Der Sarkissian et al., 2015; Jónsson et al., 2014). The nuclear result is therefore congruent with the mitochondrial genomic analyses.
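As a rough illustration of the quantity being compared, the sketch below counts transversions that are private to one of three aligned sequences and divides by the number of called sites. The function names, the toy sequences, and the simple normalization are assumptions made for illustration only; the actual pipeline (read mapping, fragment-length binning, and reference-bias controls) is described in the Appendices and is not reproduced here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transversion(a, b):
    """True when a and b are a purine/pyrimidine mismatch."""
    return (a in PURINES and b in PYRIMIDINES) or (a in PYRIMIDINES and b in PURINES)


def private_transversion_freqs(seqs):
    """seqs maps a taxon name to an aligned sequence (all the same length).

    A transversion is 'private' to a taxon when every other taxon shares
    one base and the focal taxon differs from it by a transversion.
    Returns taxon -> private transversions per called site.
    """
    names = list(seqs)
    length = len(seqs[names[0]])
    counts = {name: 0 for name in names}
    called = 0
    for i in range(length):
        column = {name: seqs[name][i] for name in names}
        if any(base not in BASES for base in column.values()):
            continue  # skip gaps and ambiguous calls such as N
        called += 1
        for name in names:
            others = {column[other] for other in names if other != name}
            if len(others) == 1 and is_transversion(column[name], next(iter(others))):
                counts[name] += 1
    return {name: (counts[name] / called if called else 0.0) for name in names}


# Hypothetical toy alignment, not real data.
toy = {
    "horse": "ACGTACTTACGN",
    "donkey": "ACGTACGTGCGT",
    "nwsl": "AAGTTCGTAGGT",
}
freqs = private_transversion_freqs(toy)
```

In this toy alignment the donkey difference at one site is a transition and is therefore ignored, and the final column is skipped because the horse base is uncalled.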

Divergence times of Hippidion, NWSL equids, and Equus

We estimated the divergence times between the lineages leading to Hippidion, the NWSL equids, and Equus. We first applied a Bayesian time-tree approach to the whole mitochondrial genome data. This gave divergence estimates for the Hippidion-NWSL/Equus split (node 1) at 5.15–7.66 Ma, consistent with Der Sarkissian et al. (2015), the NWSL-Equus split (node 2) at 4.09–5.13 Ma, and the caballine/non-caballine Equus split (node 3) at 3.77–4.40 Ma (Figure 1 and Figure 1—source data 1). These estimates suggest that the NWSL-Equus mitochondrial split occurred only ~500 thousand years (ka) prior to the caballine/non-caballine Equus split. We then estimated the NWSL-Equus divergence time using relative private transversion frequency ratios between the nuclear genomes, assuming a caballine/non-caballine Equus divergence estimate of 4–4.5 Ma (Orlando et al., 2013) and a genome-wide strict molecular clock (following Heintzman et al., 2015). This analysis yielded a divergence estimate of 4.87–5.69 Ma (Figure 1—figure supplement 3), which overlaps with that obtained from the relaxed clock analysis of whole mitochondrial genome data (Figure 1). These analyses suggest that the NWSL equid and Equus clades diverged during the latest Miocene or early Pliocene (4.1–5.7 Ma; late Hemphillian or earliest Blancan).
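One way to see how a frequency ratio could translate into a date is sketched below. The model is an assumption of this sketch, not a formula stated in the text: in a three-taxon comparison without an outgroup, the NWSL private branch is taken to span 2·T_NWSL − T_Equus and the horse or donkey private branch to span T_Equus, so that under a strict clock T_NWSL = T_Equus·(1 + r)/2 for a ratio r. With the rounded ratios of ~1.4 and ~1.5 and the 4.0–4.5 Ma calibration, this gives values close to, but not identical with, the reported 4.87–5.69 Ma, which rests on the unrounded ratios in Figure 1—source data 3. The second half of the sketch checks the overlap and the combined span of the reported mitochondrial and nuclear intervals.

```python
def nwsl_equus_divergence(t_equus_ma, ratio):
    """Strict-clock divergence estimate in Ma.

    Assumed model (hypothetical, for illustration):
    T_nwsl = T_equus * (1 + ratio) / 2, where ratio is the NWSL:Equus
    relative private transversion frequency ratio.
    """
    if t_equus_ma <= 0 or ratio < 1:
        raise ValueError("expects a positive calibration and a ratio >= 1")
    return t_equus_ma * (1.0 + ratio) / 2.0


def interval_overlap(a, b):
    """Overlap of two (low, high) intervals, or None if they are disjoint."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


def interval_span(a, b):
    """Smallest interval covering both (low, high) intervals."""
    return (min(a[0], b[0]), max(a[1], b[1]))


# Rounded ratios from the text paired with the calibration bounds.
nuclear_sketch = (
    nwsl_equus_divergence(4.0, 1.4),
    nwsl_equus_divergence(4.5, 1.5),
)

# Intervals as reported in the text (Ma).
mito_node2 = (4.09, 5.13)
nuclear_reported = (4.87, 5.69)
shared = interval_overlap(mito_node2, nuclear_reported)
combined = interval_span(mito_node2, nuclear_reported)
```

The combined span of 4.09–5.69 Ma corresponds to the 4.1–5.7 Ma range quoted for the NWSL-Equus split.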

Systematic palaeontology

The genus Equus (Linnaeus, 1758) was named to include three living equid groups – horses (E. caballus), donkeys (E. asinus), and zebras (E. zebra) – whose diversity comprises all extant, or crown group, equids. Previous palaeontological and palaeogenetic studies have uniformly placed NWSL equids within the diversity of extant equids and therefore this genus (Barrón-Ortiz et al., 2017; Bennett, 1980; Der Sarkissian et al., 2015; Harington and Clulow, 1973; Orlando et al., 2008; 2009; Scott, 2004; Vilstrup et al., 2013; Weinstock et al., 2005). This, however, conflicts with the phylogenetic signal provided by palaeogenomic data, which strongly suggest that NWSL equids fall outside the confines of the equid crown group (Equus). Nor is there any morphological or genetic evidence warranting the assignment of NWSL equids to an existing extinct taxon such as Hippidion. We therefore erect a new genus for NWSL equids, Haringtonhippus, as defined and delimited below:

Order: Perissodactyla, Owen 1848

Family: Equidae, Linnaeus 1758

Subfamily: Equinae, Steinmann & Döderlein 1890

Tribe: Equini, Gray 1821

Genus: Haringtonhippus, gen. nov. urn:lsid:zoobank.org:act:35D901A7-65F8-4615-9E13-52A263412F67

Type species. Haringtonhippus francisci Hay 1915.

Etymology

The new genus is named in honor of C. Richard Harington, who first described NWSL equids from eastern Beringia (Harington and Clulow, 1973). ‘Hippus’ is from the Greek word for horse, and so Haringtonhippus is implied to mean ‘Harington’s horse’.

Holotype

A partial skeleton consisting of a complete cranium, mandible, and a stilt-legged third metatarsal (MTIII) (Figure 2a and Figure 2—figure supplement 1b), which is curated at the Texas Vertebrate Paleontology Collections at The University of Texas, Austin (TMM 34–2518). This specimen is the holotype of ‘E’. francisci, originally described by Hay (1915), and is from the middle Pleistocene Lissie Formation of Wharton County, Texas (Hay, 1915; Lundelius and Stevens, 1970).

Figure 2

Morphological analysis of extant and middle-late Pleistocene equids.

(A) Crania of *Haringtonhippus francisci*, upper: LACM(CIT) 109/156450 from Nevada, lower: TMM 34–2518 from Texas. (B) From upper to lower, third metatarsals of: *H. francisci* (YG 401.268), *E. lambei* (YG 421.84), and E. cf. *scotti* (YG 198.1) from Yukon. Scale bar is 5 cm. (C) Principal component analysis of selected third metatarsals from extant and middle-late Pleistocene equids, showing clear clustering of stilt-legged (hemionine *Equus* (orange) and *H. francisci* (green)) from stout-legged (caballine *Equus*; blue) specimens (see also Figure 2—source data 1). Symbol shape denotes the specimen identification method (DNA: square, triangle: DNA/morphology, circle: morphology). The first and second principal components explain 95% of the variance.

https://doi.org/10.7554/eLife.29944.010

Figure 2—source data 1. Measurement data for (A) equid third metatarsals, which were used in the morphometrics analysis, and (B) other NWSL equid elements. https://doi.org/10.7554/eLife.29944.015

Referred material

On the basis of mitochondrial and nuclear genomic data, we assign the following material confidently to Haringtonhippus: a cranium, femur, and MTIII (LACM(CIT): Nevada); three MTIIIs, three third metacarpals (MCIII), three premolar teeth, and a molar tooth (KU: Wyoming); two radii, 12 MTIIIs, three MCIIIs, a metapodial, and a first phalanx (YG: Yukon Territory); and a premolar tooth (University of Texas El Paso, UTEP: New Mexico) (Figure 2—figure supplements 1–4 and Supplementary file 1; Barrón-Ortiz et al., 2017; Weinstock et al., 2005). This material includes at least four males and at least six females (Appendix 2, Appendix 2—Table 4 and Appendix 2—Table 4—source data 1). We further assign MTIII specimens from Yukon Territory (n = 13), Wyoming (n = 57), and Nevada (n = 4) to Haringtonhippus on the basis of morphometric analysis (Figure 2c and Figure 2—source data 1). On the basis of short mitochondrial DNA sequences, we tentatively assign to Haringtonhippus a premolar tooth (LACM(CIT): Nuevo Leon); a premolar and a molar (UTEP: New Mexico); and a premolar (Royal Alberta Museum, RAM/PMA: Alberta) (Barrón-Ortiz et al., 2017). We also tentatively assign 19 NWSL equid metapodial specimens from the Fairbanks area, Alaska (Guthrie, 2003) to Haringtonhippus, but note that morphometric and/or palaeogenomic analysis would be required to confirm this designation.

Geographic and temporal distribution

Haringtonhippus is known only from the Pleistocene of North America (Figure 3). In addition to the middle Pleistocene holotype from Texas, Haringtonhippus is confidently known from the late Pleistocene of Yukon Territory (Klondike region), Wyoming (Natural Trap Cave), Nevada (Gypsum Cave, Mineral Hill Cave), and New Mexico (Dry Cave), and is tentatively registered as present in Nuevo Leon (San Josecito Cave), Alberta (Edmonton area), and Alaska (Fairbanks area) (Appendix 2, Supplementary file 1, and Appendix 2—table 3; [Barrón-Ortiz et al., 2017; Vilstrup et al., 2013; Weinstock et al., 2005]).

Figure 3


The geographic distribution of *Haringtonhippus*.

Blue circles are east Beringian localities (KL: Klondike region, Yukon Territory, Canada). Red circles are contiguous USA localities (NTC: Natural Trap Cave, Wyoming, USA; GC: Gypsum Cave, Nevada, USA; MHC: Mineral Hill Cave, Nevada, USA; DC: Dry Cave, New Mexico, USA; Barrón-Ortiz et al., 2017; Weinstock et al., 2005). Orange circles are localities with tentatively assigned *Haringtonhippus* specimens only (FB: Fairbanks, Alaska, USA; ED: Edmonton, Alberta, Canada; SJC: San Josecito Cave, Nuevo Leon, Mexico; Barrón-Ortiz et al., 2017; Guthrie, 2003). The green-star-labeled HT is the locality of the *francisci* holotype, Wharton County, Texas, USA. This figure was drawn using Simplemappr (Shorthouse, 2010).

https://doi.org/10.7554/eLife.29944.016

To investigate the last appearance date (LAD) of Haringtonhippus in eastern Beringia, we obtained new radiocarbon dates from 17 Yukon Territory fossils (Appendix 1 and Supplementary file 1). This resulted in three statistically-indistinguishable radiocarbon dates of ~14.4 ¹⁴C ka BP (derived from two independent laboratories) from a metacarpal bone (YG 401.235) of Haringtonhippus, which represents this taxon’s LAD in eastern Beringia (Supplementary file 1). The LAD for North America as a whole is based on two dates of ~13.1 ¹⁴C ka BP from Gypsum Cave, Nevada (Supplementary file 1; [Weinstock et al., 2005]).
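The text does not say which test was used to judge the three dates statistically indistinguishable. A minimal sketch of one common approach, a chi-squared homogeneity test on an error-weighted pooled mean, is given below. The three dates and their errors are hypothetical placeholders near ~14.4 ¹⁴C ka BP, not the measured values (those are in Supplementary file 1), and the function names are ours.

```python
import math

# Chi-squared critical values at the 0.05 level, keyed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def pooled_date(dates):
    """dates: list of (age, sigma) in radiocarbon years BP.

    Returns (weighted mean, pooled sigma, T), where T is the sum of
    squared deviations from the weighted mean, each scaled by its variance.
    """
    weights = [1.0 / sigma ** 2 for _, sigma in dates]
    mean = sum(w * age for w, (age, _) in zip(weights, dates)) / sum(weights)
    pooled_sigma = math.sqrt(1.0 / sum(weights))
    t_stat = sum((age - mean) ** 2 / sigma ** 2 for age, sigma in dates)
    return mean, pooled_sigma, t_stat


def indistinguishable(dates):
    """True when T is below the 0.05 critical value for n - 1 degrees of freedom."""
    _, _, t_stat = pooled_date(dates)
    return t_stat < CHI2_CRIT_05[len(dates) - 1]


# Hypothetical dates for illustration only.
hypothetical = [(14380, 60), (14420, 70), (14450, 80)]
mean, pooled_sigma, t_stat = pooled_date(hypothetical)
same_event = indistinguishable(hypothetical)
```

Dates that pass such a test can reasonably be pooled into a single estimate, whereas a pair such as the eastern Beringian and Gypsum Cave values (~14.4 versus ~13.1 ¹⁴C ka BP) would fail it for any plausible measurement error.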

Mitogenomic diagnosis

Haringtonhippus is the sister genus to Equus (equid crown group), with Hippidion being sister to the Haringtonhippus-Equus clade (Figure 1). Haringtonhippus can be differentiated from Equus and Hippidion by 178 synapomorphic positions in the mitochondrial genome, including four insertions and 174 substitutions (Appendix 1—Table 2 and Appendix 1—table 2—source data 1). We caution that these synapomorphies are tentative and will likely be reduced in number as a greater diversity of mitochondrial genomes for extinct equids become available.
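To make the notion of a synapomorphic position concrete, the sketch below scans a toy alignment for columns where every ingroup sequence shares a state that no outgroup sequence carries, and separates insertions (outgroup entirely gapped) from substitutions. The function name, the classification rule, and the sequences are illustrative assumptions, not the procedure or data used in the study.

```python
def synapomorphies(ingroup, outgroup):
    """Return (substitution_columns, insertion_columns) as 0-based indices.

    A column counts when the ingroup is fixed for a single called base
    that is absent from every outgroup sequence.
    """
    length = len(ingroup[0])
    substitutions, insertions = [], []
    for i in range(length):
        in_states = {seq[i] for seq in ingroup}
        out_states = {seq[i] for seq in outgroup}
        if len(in_states) != 1:
            continue  # ingroup is polymorphic at this column
        state = next(iter(in_states))
        if state in ("-", "N") or state in out_states:
            continue  # uncalled, gapped, or shared with the outgroup
        if out_states == {"-"}:
            insertions.append(i)  # base present only in the ingroup
        elif "-" not in out_states and "N" not in out_states:
            substitutions.append(i)
    return substitutions, insertions


# Hypothetical toy alignment.
ingroup = ["ACGTTAGC", "ACGTTGGC"]
outgroup = ["ACAT-AGT", "ACAT-ACT"]
subs, ins = synapomorphies(ingroup, outgroup)
total = len(subs) + len(ins)
```

The total is simply the sum of the two classes, mirroring how the 178 positions reported above break down into 174 substitutions and four insertions. Adding more outgroup or ingroup sequences can only remove columns from the result, which is why the reported count is expected to shrink as more extinct equid genomes are sampled.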

Morphological comparisons of third metatarsals

We used morphometric analysis of caballine/stout-legged Equus and stilt-legged equids (hemionine/stilt-legged Equus, Haringtonhippus) MTIIIs to determine how confidently these groups can be distinguished (Figure 2c). Using logistic regression on principal components, we find a strong separation that can be correctly distinguished with 98.2% accuracy (Appendix 2; Heintzman et al., 2017). Hemionine/stilt-legged Equus MTIIIs occupy the same morphospace as H. francisci in our analysis, although given a larger sample size, it may be possible to discriminate E. hemionus from the remaining stilt-legged equids. We note that Haringtonhippus seems to exhibit a negative correlation between latitude and MTIII length, and that specimens from the same latitude occupy similar morphospace regardless of whether DNA- or morphological-based identification was used (Figure 2c and Figure 2—source data 1).
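The general idea of logistic regression on principal components can be sketched in pure Python as follows. Everything specific here is an assumption for illustration: only two measurements are used (a hypothetical MTIII length and midshaft width, in mm), the data are simulated from made-up means and spreads, and accuracy is computed on the training data. The variables, sample, and validation scheme actually used are given in Appendix 2 and Figure 2—source data 1.

```python
import math
import random


def standardize(rows):
    """Z-score each column."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def pca_2d(rows):
    """Project centered two-column data onto its two principal axes."""
    n = len(rows)
    a = sum(r[0] * r[0] for r in rows) / (n - 1)
    b = sum(r[0] * r[1] for r in rows) / (n - 1)
    d = sum(r[1] * r[1] for r in rows) / (n - 1)
    if abs(b) < 1e-12:  # already uncorrelated: axes are the variables
        axes = [(1.0, 0.0), (0.0, 1.0)] if a >= d else [(0.0, 1.0), (1.0, 0.0)]
    else:  # closed-form eigenvectors of a symmetric 2x2 matrix
        mid, half = (a + d) / 2, math.sqrt(((a - d) / 2) ** 2 + b ** 2)
        axes = []
        for lam in (mid + half, mid - half):
            norm = math.hypot(b, lam - a)
            axes.append((b / norm, (lam - a) / norm))
    return [[r[0] * ax[0] + r[1] * ax[1] for ax in axes] for r in rows]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


def fit_logistic(xs, ys, lr=0.5, steps=2000):
    """Plain batch gradient descent on the logistic loss."""
    w, bias, n = [0.0] * len(xs[0]), 0.0, len(xs)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + bias) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        bias -= lr * grad_b / n
    return w, bias


# Simulated measurements (length, width) in mm; values are hypothetical.
rng = random.Random(0)
stilt = [[rng.gauss(270, 12), rng.gauss(28, 1.5)] for _ in range(40)]
stout = [[rng.gauss(265, 12), rng.gauss(36, 1.5)] for _ in range(40)]
labels = [1] * 40 + [0] * 40  # 1 = stilt-legged, 0 = stout-legged

scores = pca_2d(standardize(stilt + stout))
w, bias = fit_logistic(scores, labels)
predicted = [1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + bias) >= 0.5 else 0
             for x in scores]
accuracy = sum(p == y for p, y in zip(predicted, labels)) / len(labels)
```

Fitting the classifier on component scores rather than raw measurements keeps the predictors uncorrelated, which is the usual motivation for this two-step design when limb measurements covary strongly with overall size.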

Comments

On the basis of morphology, we assign all confidently referred material of Haringtonhippus to the single species H. francisci Hay 1915 (Appendix 2). Comparison between the cranial anatomical features of LACM(CIT) 109/156450 and TMM 34–2518 reveals some minor differences, which can likely be ascribed to intraspecific variation (Figure 2a, Figure 2—figure supplement 1, and Appendix 2). Further, the MTIII of TMM 34–2518 is comparable to the MTIIIs ascribed to Haringtonhippus by palaeogenomic data, and is consistent with the observed latitudinally correlated variation in MTIII length across Haringtonhippus (Figure 2c and Appendix 2).

This action is supported indirectly by molecular evidence, namely the lack of mitochondrial phylogeographic structure and the estimated time to most recent common ancestor (tMRCA) for sampled Haringtonhippus. The mitochondrial tree topology within Haringtonhippus does not exhibit phylogeographic structure (Figure 1—figure supplement 1b), which is consistent with sampled Haringtonhippus mitochondrial genomes belonging to the same species. Using Bayesian time-tree analysis, we estimated a tMRCA for the sampled Haringtonhippus mitochondrial genomes of ~200–470 ka BP (Figure 1 and Figure 1—source data 1; Heintzman et al., 2017). The MRCA of Haringtonhippus is therefore more recent than that of other extant equid species (such as E. asinus and E. quagga, which have a combined 95% HPD range: 410–1030 ka BP; Figure 1 and Figure 1—source data 1; Heintzman et al., 2017). Although the middle Pleistocene holotype TMM 34–2518 (~125–780 ka BP) may predate our Haringtonhippus mitochondrial tMRCA, this sample has no direct date and the range of possible ages falls within the tMRCA range of other extant equid species. We therefore cannot reject the hypothesis of its conspecificity with Haringtonhippus, as defined palaeogenomically. We attempted, but were unable, to recover either collagen or genomic data from TMM 34–2518 (Appendix 2), consistent with the taphonomic, stratigraphic, and geographic context of this fossil (Hay, 1915; Lundelius and Stevens, 1970). Altogether, the molecular evidence is consistent with the assignment of H. francisci as the type and only species of Haringtonhippus.

Discussion

Reconciling the genomic and fossil records of Plio-Pleistocene equid evolution

The suggested placement of NWSL equids within a taxon (Haringtonhippus) sister to Equus is a departure from previous interpretations, which variably place the former within Equus, as sister to hemionines or caballine horses (Figure 1). According to broadly accepted palaeontological interpretations, the earliest equids exhibiting morphologies consistent with NWSL and caballine attribution appear in the fossil record only ~2–3 and ~1.9–0.7 Ma (Eisenmann et al., 2008; Forsten, 1992), respectively, whereas our divergence estimates suggest that these lineages diverged between 4.1–5.7 and 3.8–4.5 Ma, most likely in North America. Dating incongruence might be attributed to an incomplete fossil record, but this seems unlikely given the density of the record for late Neogene and Pleistocene horses. Conversely, incongruence might be attributed to problems with estimating divergence using genomic evidence. However, we emphasize that the NWSL-Equus split is robustly calibrated to the caballine/non-caballine Equus divergence at 4.0–4.5 Ma, which is in turn derived from a direct molecular clock calibration using a middle Pleistocene horse genome (Orlando et al., 2013).

Other possibilities to explain the incongruence include discordance between the timing of species divergence and the evolution of diagnostic anatomical characteristics, or failure to detect or account for homoplasy (Forsten, 1992). For example, Pliocene Equus generally exhibits a primitive (‘plesippine’ in North America, ‘stenonid’ in the Old World) morphology that presages living zebras and asses (Forsten, 1988, 1992), with more derived caballine (stout-legged) and hemionine (stilt-legged) forms evolving in the early Pleistocene. The stilt-legged morphology appears to have evolved independently at least once in each of the Old and New Worlds, yielding the Asiatic wild asses and Haringtonhippus, respectively. We include the middle-late Pleistocene Eurasian E. hydruntinus within the Asiatic wild asses (following Bennett et al., 2017; Burke et al., 2003; Orlando et al., 2006), and note that the Old World sussemione E. ovodovi may represent another instance of independent stilt-legged origin, but its relation to Asiatic wild asses and other non-caballine Equus is currently unresolved (as depicted in Der Sarkissian et al., 2015; Orlando et al., 2009; Vilstrup et al., 2013; and Figure 1). It is plausible that features at the plesiomorphous end of the spectrum, such as those associated with Hippidion, survived after the early to middle Pleistocene at lower latitudes (South America, Africa; Figure 1). By contrast, the more derived hemionine and caballine morphologies evolved from, and replaced, their antecedents in higher latitude North America and Eurasia, perhaps as adaptations to the extreme ecological pressures perpetuated by the advance and retreat of continental ice sheets and correlated climate oscillations during the Pleistocene (Forsten, 1992, 1996). We note that this high-latitude replacement model is consistent with the turnover observed in regional fossil records for Pleistocene equids in North America (Azzaroli, 1992; Azzaroli and Voorhies, 1993) and Eurasia (Forsten, 1988, 1992, 1996). By contrast, in South America Hippidion co-existed with caballine horses until they both succumbed to extinction, together with much of the New World megafauna near the end of the Pleistocene (Forsten, 1996; Koch and Barnosky, 2006; O'Dea et al., 2016). This model helps to explain the discordance between the timings of the appearance of the caballine and hemionine morphologies in the fossil record and the divergence of lineages leading to these forms as estimated from palaeogenomic data.

Although we can offer no solution to the general problem of mismatches between molecular and morphological divergence estimators–an issue scarcely unique to equid systematics–this model predicts that some previously described North American Pliocene and early Pleistocene Equus species (e.g. E. simplicidens, E. idahoensis; [Azzaroli and Voorhies, 1993]), or specimens thereof, may be ancestral to extant Equus and/or late Pleistocene Haringtonhippus.

Temporal and geographic range overlap of Pleistocene equids in North America

Three new radiocarbon dates of ~14.4 ¹⁴C ka BP from a Yukon Haringtonhippus fossil greatly extend the known temporal range of this genus in eastern Beringia. This result demonstrates, contrary to its previous last appearance date (LAD) of 31,400 ± 1200 ¹⁴C years ago (AA 26780; [Guthrie, 2003]), that Haringtonhippus survived throughout the last glacial maximum in eastern Beringia (Clark et al., 2009) and may have come into contact with humans near the end of the Pleistocene (Goebel et al., 2008; Guthrie, 2006). These data suggest that populations of stilt-legged Haringtonhippus and stout-legged caballine Equus were sympatric, both north and south of the continental ice sheets, through the late Pleistocene and became extinct at roughly the same time. The near synchronous extinction of both horse groups across their entire range in North America suggests that similar causal mechanisms may have led each to their demise.

The sympatric nature of these equids raises questions of whether they managed to live within the same community without hybridizing or competing for resources. Extant members of the genus Equus vary considerably in the sequence of Prdm9, a gene involved in the speciation process, and chromosome number (karyotype) (Ryder et al., 1978; Steiner and Ryder, 2013), and extant caballine and non-caballine Equus rarely produce fertile offspring (Allen and Short, 1997; Steiner and Ryder, 2013). It is unlikely, therefore, that the more deeply diverged Haringtonhippus and caballine Equus would have been able to hybridize. Future analysis of high coverage nuclear genomes, ideally including an outgroup such as Hippidion, will make it possible to test for admixture that may have occurred soon after the lineages leading to Haringtonhippus and Equus diverged, as occurred between the early caballine and non-caballine Equus lineages (Jónsson et al., 2014). It may also be possible to use isotopic and/or tooth mesowear analyses to assess the potential of resource partitioning between Haringtonhippus and caballine Equus in the New World.

Fossil systematics in the palaeogenomics and proteomics era: concluding remarks

Fossils of NWSL equids have been known for more than a century, but until the present study their systematic position within Plio-Pleistocene Equidae was poorly characterized. This was not because of a lack of interest on the part of earlier workers, whose detailed anatomical studies strongly indicated that what we now call Haringtonhippus was related to Asiatic wild asses, such as Tibetan khulan and Persian onagers, rather than to caballine horses (Eisenmann et al., 2008; Guthrie, 2003; Scott, 2004; Skinner and Hibbard, 1972). That the cues of morphology have turned out to be misleading in this case underlines a recurrent problem in systematic biology, which is how best to discriminate authentic relationships within groups, such as Neogene equids, that were prone to rampant convergence. The solution we adopted here was to utilize both palaeogenomic and morphometric information in reframing the position of Haringtonhippus, which now clearly emerges as the closest known outgroup to all living Equus.

Our success in this regard demonstrates that an approach which incorporates phenomics with molecular methods (palaeogenomic as well as palaeoproteomic, e.g. [Welker et al., 2015]) is likely to offer a means for securely detecting relationships within speciose groups that are highly diverse ecomorphologically. All methods have their limits, with taphonomic degradation being the critical one for molecular approaches. However, proteins may persist significantly longer than ancient DNA (e.g. [Rybczynski et al., 2013]), and collagen proteomics may come to play a key role in characterizing affinities, as well as the reality, of several proposed Neogene equine taxa (e.g. Dinohippus, Pliohippus, Protohippus, Calippus, and Astrohippus; [MacFadden, 1998]) whose distinctiveness and relationships are far from settled (Azzaroli and Voorhies, 1993; Forsten, 1992). A reciprocally informative approach like the one taken here holds much promise for lessening the amount of systematic noise, due to oversplitting, that hampers our understanding of the evolutionary biology of other major late Pleistocene megafaunal groups such as bison and mammoths (Enk et al., 2016; Froese et al., 2017). This approach is clearly capable of providing new insights into just how extensive megafaunal losses were at the end of the Pleistocene, in what might be justifiably called the opening act of the Sixth Mass Extinction in North America.

Materials and methods

We provide an overview of methods here; full details can be found in Appendix 1.

Sample collection and radiocarbon dating

We recovered Yukon fossil material (17 Haringtonhippus francisci, two Equus cf. scotti, and two E. lambei; Supplementary file 1) from active placer mines in the Klondike goldfields near Dawson City. We further sampled seven H. francisci fossils from the contiguous USA that are housed in collections at the University of Kansas Biodiversity Institute (KU; n = 4), Los Angeles County Museum of Natural History (LACM(CIT); n = 2), and the Texas Vertebrate Paleontology Collections at The University of Texas (TMM; n = 1). We radiocarbon dated the Klondike fossils and the H. francisci cranium from the LACM(CIT) (Supplementary file 1).

Morphometric analysis of third metatarsals

For morphometric analysis, we took measurements of third metatarsals (MTIII) and other elements. We used a reduced data set of four MTIII variables for principal components analysis and performed logistic regression on the first three principal components, computed in R (R Development Core Team, 2008) (Source code 1).

DNA extraction, library preparation, target enrichment, and sequencing

We conducted all molecular biology methods prior to indexing PCR in the dedicated palaeogenomics laboratory facilities at either UC Santa Cruz or Pennsylvania State University (PSU). We extracted DNA from between 100 and 250 mg of bone powder following either Rohland et al. (2010) or Dabney et al. (2013a). We then converted DNA extracts to libraries following the Meyer and Kircher protocol (Meyer and Kircher, 2010), as modified by Heintzman et al. (2015), or the PSU method of Vilstrup et al. (2013). We enriched libraries for equid mitochondrial DNA. We then sequenced all enriched libraries and unenriched libraries from 17 samples using Illumina platforms.

Mitochondrial genome reconstruction and analysis

We prepared raw sequence data for alignment and mapped the filtered reads to the horse reference mitochondrial genome (Genbank: NC_001640.1) and a H. francisci reference mtDNA genome (Genbank: KT168321), resulting in mitogenomic coverage ranging from 5.8× to 110.7× (Supplementary file 1). We were unable to recover equid mtDNA from TMM 34–2518 (the francisci holotype) using this approach (Appendix 2). We supplemented our mtDNA genome sequences with 38 previously published complete equid mtDNA genomes. We constructed six alignment data sets and selected models of molecular evolution for the analyses described below (Appendix 1—table 1, and Supplementary file 1; Heintzman et al., 2017).

We tested the phylogenetic position of the NWSL equids (=H. francisci) using mtDNA data sets 1–3 and applying Bayesian (Ronquist et al., 2012) and maximum likelihood (ML; [Stamatakis, 2014]) analyses. We varied the outgroup, the inclusion or exclusion of the fast-evolving partitions, and the inclusion or exclusion of Hippidion sequences. Due to the lack of a globally supported topology across the Bayesian and ML phylogenetic analyses, we used an Evolutionary Placement Algorithm (EPA; [Berger et al., 2011]) to determine the a posteriori likelihood of phylogenetic placements for candidate equid outgroups using mtDNA data set four. We also used the same approach to assess the placement of previously published equid sequences (Appendix 2). To infer divergence times between the four major equid groups (Hippidion, NWSL equids, caballine Equus, and non-caballine Equus), we ran Bayesian timetree analyses (Drummond et al., 2012) using mtDNA data set five. We varied these analyses by including or excluding fast-evolving partitions, constraining the root height or not, and including or excluding the E. ovodovi sequence.

To facilitate future identification of equid mtDNA sequences, we constructed, using data set six, a list of putative synapomorphic base states, including indels and substitutions, that define the genera Hippidion, Haringtonhippus, and Equus at sites across the mtDNA genome.

Phylogenetic inference, divergence date estimation, and sex determination from nuclear genomes

To test whether our mtDNA genome-based phylogenetic hypothesis truly reflects the species tree, we compared the nuclear genomes of a horse (EquCab2), donkey (Orlando et al., 2013), and the shotgun sequence data from 17 of our NWSL equid samples (Figure 1—source data 2, Appendix 1, Appendix 1—figure 1, and Supplementary file 1). We applied four successive approaches, which controlled for reference genome and DNA fragment length biases (Appendix 1).

We estimated the divergence between the NWSL equids and Equus (horse and donkey) by fitting the branch length, or relative private transversion frequency, ratio between horse/donkey and NWSL equids into a simple phylogenetic scenario (Figure 1—figure supplement 3). We then multiplied the NWSL equid branch length by a previous horse-donkey divergence estimate (4.0–4.5 Ma; [Orlando et al., 2013]) to give the estimated NWSL equid-Equus divergence date, following Heintzman et al. (2015) and assuming a strict genome-wide molecular clock (Figure 1—figure supplement 3).
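The scaling step described above can be illustrated with a minimal Python sketch. It assumes the NWSL equid branch length has already been expressed relative to the horse-donkey divergence, and simply multiplies it by the 4.0–4.5 Ma calibration interval under a strict clock. The function name and the example ratio of 1.2 are hypothetical placeholders, not values from this study.

```python
def scale_divergence(relative_branch_length, calibration_ma=(4.0, 4.5)):
    """Scale a relative branch length by a calibration interval (in Ma).

    relative_branch_length: NWSL equid branch length expressed in units of
    the horse-donkey divergence (hypothetical input for this sketch).
    calibration_ma: lower and upper bounds of the horse-donkey divergence.
    """
    if relative_branch_length <= 0:
        raise ValueError("relative branch length must be positive")
    low, high = calibration_ma
    return relative_branch_length * low, relative_branch_length * high


# Hypothetical ratio, chosen only to show the arithmetic.
example_ratio = 1.2
low_ma, high_ma = scale_divergence(example_ratio)
print(f"Estimated divergence: {low_ma:.2f} to {high_ma:.2f} Ma")
```

Carrying both bounds of the calibration through the calculation keeps the uncertainty of the horse-donkey estimate visible in the final date range.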

We determined the sex of the 17 NWSL equid samples by comparing the relative mapping frequency of the autosomes to the X chromosome.

DNA damage analysis

We assessed the prevalence of mitochondrial and nuclear DNA damage in a subset of the equid samples using mapDamage (Jónsson et al., 2013).

Data availability

Repository details and associated metadata for curated samples can be found in Supplementary file 1. MTIII and other element measurement data are in Figure 2—source data 1, and the Rscript used for morphometric analysis is in the DRYAD database (Heintzman et al., 2017). MtDNA genome sequences have been deposited in Genbank under accessions KT168317-KT168336, MF134655-MF134663, and an updated version of JX312727. All mtDNA genome alignments (in NEXUS format) and associated XML and TREE files are in the DRYAD database (Heintzman et al., 2017). Raw shotgun sequence data used for the nuclear genomic analyses and raw shotgun and target enrichment sequence data for TMM 34–2518 (francisci holotype) have been deposited in the Short Read Archive (BioProject: PRJNA384940).

Nomenclatural act

The electronic edition of this article conforms to the requirements of the amended International Code of Zoological Nomenclature, and hence the new name contained herein is available under that Code from the electronic edition of this article. This published work and the nomenclatural act it contains have been registered in ZooBank, the online registration system for the ICZN. The ZooBank LSIDs (Life Science Identifiers) can be resolved and the associated information viewed through any standard web browser by appending the LSID to the prefix ‘http://zoobank.org/’. The LSID for this publication is: urn:lsid:zoobank.org:pub:8D270E0A-9148-4089-920C-724F07D8DC0B. The electronic edition of this work was published in a journal with an ISSN, and has been archived and is available from the following digital repositories: PubMed Central and LOCKSS.

Appendix 1

Supplementary methods

Yukon sample context and identification

Pleistocene vertebrate fossils are commonly recovered at placer mining localities, in the absence of stratigraphic context, as miners remove frozen sediments to access underlying gold-bearing gravel (Froese et al., 2009; Harington, 2011). We recovered H. francisci fossils along with other typical late Pleistocene (Rancholabrean) taxa, including caballine horses (Equus sp.), woolly mammoth (Mammuthus primigenius), steppe bison (Bison priscus), and caribou (Rangifer tarandus), which are consistent with our age estimates based on radiocarbon dating (Supplementary file 1). All Yukon fossil material consisted of limb bones that were taxonomically assigned based on their slenderness and are housed in the collections of the Yukon Government (YG).

Radiocarbon dating

We subsampled fossil specimens using handheld, rotating cutting tools and submitted them to the KECK Accelerator Mass Spectrometry (AMS) Laboratory at the University of California (UC), Irvine (UCIAMS), the Center for AMS (CAMS) at the Lawrence Livermore National Laboratory, or both. We extracted collagen from the fossil subsamples using ultrafiltration (Beaumont et al., 2010) and used this collagen for AMS radiocarbon dating. We were unable to recover collagen from TMM 34–2518 (francisci holotype), consistent with the probable middle Pleistocene age of this specimen (Lundelius and Stevens, 1970). We recovered finite radiocarbon dates from all other fossils, with the exception of the two Equus cf. scotti specimens. We calibrated AMS radiocarbon dates using the IntCal13 curve (Reimer et al., 2013) in OxCal v4.2 (https://c14.arch.ox.ac.uk/oxcal/OxCal.html) and report median calibrated dates in Supplementary file 1.

Morphometric analysis of third metatarsals

Third metatarsal (MTIII) and other elemental measurements were either taken by GDZ or ES or from the literature (Figure 2—source data 1). For morphometric analysis, we focused exclusively on MTIIIs, which exhibit notable differences in slenderness among equid groups (Figure 2—figure supplement 2a; [Weinstock et al., 2005]). Starting with a data set of 10 variables (following [Eisenmann et al., 1988]), we compared the loadings of all variables in principal components space in order to remove redundant measurements. This reduced the data set to four variables (GL: greatest length, Pb: proximal breadth, Mb: midshaft breadth, and DABm: distal articular breadth at midline). We visualized the reduced variables using principal components analysis, computed in R (Appendix 1—table 2—source data 1; [R Development Core Team, 2008]), and performed logistic regression on the first three principal components to test whether MTIII morphology can distinguish stilt-legged (hemionine Equus and H. francisci, n = 105) from stout-legged (caballine Equus, n = 187) equid specimens.

Target enrichment and sequencing

We enriched libraries for equid mitochondrial DNA following the MyBaits v2 protocol (Arbor Biosciences, Ann Arbor, MI), with RNA bait molecules constructed from the horse reference mitochondrial genome sequence (NC_001640.1). We then sequenced the enriched libraries for 2 × 150 cycles on the Illumina HiSeq-2000 platform at UC Berkeley or 2 × 75 cycles on the MiSeq platform at UC Santa Cruz, following the manufacturer’s instructions. We produced data for the nuclear genomic analyses by shotgun sequencing 17 of the unenriched libraries for 2 × 75 cycles on the MiSeq to produce ~1.1–6.4 million reads per library (Figure 1—source data 2).

Mitochondrial genome reconstruction

We initially reconstructed the mitochondrial genome for H. francisci specimen YG 404.663 (PH047). For sequence data enriched for the mitochondrial genome, we trimmed adapter sequences, merged paired-end reads (with a minimum overlap of 15 base pairs (bp) required), and removed merged reads shorter than 25 bp, using SeqPrep (St. John, 2013; https://github.com/jstjohn/SeqPrep). We then mapped the merged and remaining unmerged reads to the horse reference mitochondrial genome sequence using the Burrows-Wheeler Aligner aln (BWA-aln v0.7.5; [Li and Durbin, 2010]), with ancient parameters (-l 1024; [Schubert et al., 2012]). We removed reads with a mapping quality less than 20 and collapsed duplicated reads to a single sequence using SAMtools v0.1.19 rmdup (Li et al., 2009). We called consensus sequences using Geneious v8.1.7 (Biomatters, http://www.geneious.com; [Kearse et al., 2012]). We then re-mapped the reads to the same reference mitochondrial genome using the iterative assembler, MIA (Briggs et al., 2009). Consensus sequences from both alignment methods required each base position to be covered a minimum of three times, with a minimum base agreement of 67%. The two consensus sequences were then combined to produce a final consensus sequence for YG 404.663 (Genbank: KT168321), which we used as the H. francisci reference mitochondrial genome sequence.
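The consensus-calling thresholds described above (a minimum of three reads per position and at least 67% base agreement) can be sketched in Python as follows. This is only an illustration of the rule, not the Geneious or MIA implementation; the function names and the toy pileup columns are hypothetical, and uncalled positions are represented here by 'N'.

```python
from collections import Counter


def call_consensus_base(bases, min_depth=3, min_agreement=0.67):
    """Return the consensus base for one alignment column, or 'N'.

    bases: the read bases covering a single reference position.
    The 0.67 threshold is applied literally, so 2 of 3 reads (66.7%)
    does not pass; how the original software rounds is not known here.
    """
    observed = [b.upper() for b in bases if b.upper() in "ACGT"]
    if len(observed) < min_depth:
        return "N"  # too few reads cover this position
    base, count = Counter(observed).most_common(1)[0]
    if count / len(observed) >= min_agreement:
        return base
    return "N"  # reads disagree too much


def call_consensus(pileup_columns):
    """Call a consensus sequence from a list of per-position base strings."""
    return "".join(call_consensus_base(column) for column in pileup_columns)


# Hypothetical pileup: one string of read bases per reference position.
columns = ["AAA", "AAAG", "AG", "CCCT", "GGTT"]
print(call_consensus(columns))
```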

For the remaining newly analyzed 21 H. francisci, two E. cf. scotti, and two E. lambei samples, we merged and removed reads as described above. We then separately mapped the retained reads to the horse and H. francisci mitochondrial reference genome sequences using MIA. Consensus sequences from MIA analyses were called as described above. The two consensus sequences were then combined to produce a final consensus sequence for each sample, with coverage ranging from 5.8× to 110.7× (Supplementary file 1). We also reconstructed the mitochondrial genomes for four previously published samples: YG 401.268, LACM(CIT) 109/150807, KU 62158, and KU 62055 (Supplementary file 1; [Vilstrup et al., 2013; Weinstock et al., 2005]).

Mitochondrial genome alignments

We supplemented our 30 new mitochondrial genome sequences with 38 previously published complete equid mitochondrial genomes, which included all extant Equus species, and extinct Hippidion, E. ovodovi, and E. cf. scotti (‘equids’). We constructed six alignment data sets for the mitochondrial genome analyses: (1) equids and White rhinoceros (Ceratotherium simum; NC_001808) (n = 69); (2) equids and Malayan tapir (Tapirus indicus; NC_023838) (n = 69); (3) equids, six rhinos, two tapirs, and dog (Canis lupus familiaris; NC_002008) (n = 77); (4) equids, six rhinos, two tapirs, 19 published equid short fragments, and two published NWSL equid mitochondrial genome sequences (n = 88); (5) a reduced equid data set (n = 32); and (6) a full equid data set (n = 68) (Heintzman et al., 2017). For data sets three and four, we selected one representative from all rhino and tapir species for which full mitochondrial genome data are publicly available (Supplementary file 1).

For all six data sets, we first created an alignment using MUSCLE (v3.8.31; [Edgar, 2004]). We then manually scrutinized alignments for errors and removed a 253 bp variable number of tandem repeats (VNTR) part of the control region, corresponding to positions 16121–16373 of the horse reference mitochondrial genome. We partitioned the alignments into six partitions (three codon positions, ribosomal-RNAs, transfer-RNAs, and control region), using the annotated horse reference mitochondrial genome in Geneious, following Heintzman et al. (2015). We excluded the fast-evolving control region alignment for data set three, which included the highly-diverged dog sequence. For each partition, we selected models of molecular evolution using the Bayesian information criterion in jModelTest (v2.1.6; [Darriba et al., 2012]) (Appendix 1—table 1).

Appendix 1—table 1

Selected models of molecular evolution for partitions of the first five mtDNA genome alignment data sets.

All lengths are in base pairs. Reduced length excludes the Coding3 and CR partitions. For all RAxML analyses the GTR model was implemented. *The TrN model was selected, but this cannot be implemented in MrBayes and so the HKY model was used. EPA: evolutionary placement algorithm; CR: control region.

https://doi.org/10.7554/eLife.29944.020

Data set	Row	Coding1	Coding2	Coding3	rRNAs	tRNAs	CR	Total length (all)	Total length (reduced)
1. White rhino outgroup	Length	3803	3803	3803	2579	1529	1066	16583	11714
1. White rhino outgroup	Model	GTR + I + G	HKY + I + G	GTR + I + G	GTR + I + G	HKY + I + G	HKY* + I + G
2. Malayan tapir outgroup	Length	3803	3803	3803	2585	1530	1065	16589	11721
2. Malayan tapir outgroup	Model	GTR + I + G	HKY + I + G	GTR + I + G	GTR + I + G	HKY + I + G	HKY* + G
3. Dog + ceratomorphs outgroups	Length	3803	3803	3803	2615	1540	N/A	15564	11761
3. Dog + ceratomorphs outgroups	Model	GTR + I + G	HKY + I + G	GTR + I + G	GTR + I + G	HKY + I + G	N/A
4. EPA	Length	3803	3803	3803	2601	1534	1118	16662	11741
4. EPA	Model	GTR + I + G	TrN + I + G	GTR + I + G	GTR + I + G	HKY + I + G	HKY + I + G
5. Equids only	Length	3802	3802	3802	2571	1528	971	16476	11703
5. Equids only	Model	TrN + I + G	TrN + I + G	GTR + G	TrN + I + G	HKY + I	HKY + G

Phylogenetic analysis of mitochondrial genomes

To test the phylogenetic position of the NWSL equids, we conducted Bayesian and maximum likelihood (ML) phylogenetic analyses of data sets one, two, and three, under the partitioning scheme and selected models of molecular evolution described above. As outgroups, we selected White rhinoceros (data set one), Malayan tapir (data set two), or dog (data set three). For each of the data sets, we varied the analyses based on (a) inclusion or exclusion of the fast-evolving partitions (third codon positions and control region, where appropriate) and (b) inclusion or exclusion of the Hippidion sequences. We ran Bayesian analyses in MrBayes (v3.2.6, [Ronquist et al., 2012]) for two parallel runs of 10 million generations, sampling every 1,000, with the first 25% discarded as burn-in. We conducted ML analyses in RAxML (v8.2.4, [Stamatakis, 2014]), using the GTRGAMMAI model across all partitions, and selected the best of three trees. We evaluated branch support with both Bayesian posterior probability scores from MrBayes and 500 ML bootstrap replicates in RAxML.

Placement of outgroups and published sequences a posteriori

We used the evolutionary placement algorithm (EPA) in RAxML to determine the a posteriori likelihood of phylogenetic placements for eight candidate equid outgroups (two tapirs, six rhinos) relative to the four well supported major equid groups (Hippidion, NWSL equids, caballine Equus, non-caballine Equus). We first constructed an unrooted reference tree consisting only of the equids from data set four in RAxML. We then analyzed the placements of the eight outgroups and retained all placements up to a cumulative likelihood threshold of 0.99. We used the same approach to assess the placement of 21 previously published equid sequences derived from 13 NWSL equids (Barrón-Ortiz et al., 2017; Vilstrup et al., 2013; Weinstock et al., 2005), five Hippidion devillei (Orlando et al., 2009), and three E. ovodovi (Orlando et al., 2009) (Appendix 2—table 3).
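The idea of retaining placements up to a cumulative likelihood threshold can be sketched as below. This Python example assumes each candidate placement carries a likelihood weight and that placements are accepted in descending order of weight until the running total reaches 0.99. The branch labels and weights are invented for illustration, and the function does not reproduce RAxML's internals.

```python
def retain_placements(placements, threshold=0.99):
    """Keep the best placements until their summed weight reaches threshold.

    placements: list of (branch_label, likelihood_weight) pairs, where the
    weights for one query sequence are assumed to sum to roughly 1.
    """
    ranked = sorted(placements, key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for branch, weight in ranked:
        if cumulative >= threshold:
            break  # threshold already reached by earlier placements
        kept.append((branch, weight))
        cumulative += weight
    return kept


# Hypothetical likelihood weights for one query sequence.
candidate_placements = [
    ("branch_B", 0.08),
    ("branch_A", 0.90),
    ("branch_D", 0.005),
    ("branch_C", 0.015),
]
for branch, weight in retain_placements(candidate_placements):
    print(branch, weight)
```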

Divergence date estimation from mitochondrial genomes

To further investigate the topology of the four major equid groups, and to infer divergence times between them, we ran Bayesian timetree analyses in BEAST (v1.8.4; [Drummond et al., 2012]). Unlike the previous analyses, BEAST can resolve branching order in the absence of an outgroup, by using branch length and molecular clock methods. For BEAST analyses, we used data set five. We did not enforce monophyly. Where available, we used radiocarbon dates to tip date ancient samples. For two samples without available radiocarbon dates, we sampled the ages of tips. For the E. ovodovi sample (mtDNA genome: NC_018783), which was found in a cave that has been stratigraphically dated as late Pleistocene and includes other E. ovodovi remains that have been dated to ~45–50 ka BP (Eisenmann and Sergej, 2011; Orlando et al., 2009), we used the following lognormal prior (mean: 4.5 × 10⁴, log(stdev): 0.766, offset: 1.17 × 10⁴) to ensure that 95% of the prior fell within the late Pleistocene (11.7–130 ka BP). For the E. cf. scotti mitochondrial genome (KT757763), we used a normal prior (mean: 6.7 × 10⁵, stdev: 5.64 × 10⁴) to ensure that 95% of the prior fell within the proposed age range of this specimen (560–780 ka BP; [Orlando et al., 2013]). We further calibrated the tree using an age of 4–4.5 Ma for the root of crown group Equus (normal prior, mean 4.25 × 10⁶, stdev: 1.5 × 10⁵) (Orlando et al., 2013). To assess the impact of variables on the topology and divergence times, we either (a) included or excluded the fast-evolving partitions, (b) constrained the root height (lognormal prior: mean 1 × 10⁷, stdev: 1.0) or not, and (c) included or excluded the E. ovodovi sequence, which was not directly dated. We used the models of molecular evolution estimated by jModelTest (Appendix 1—table 1). We estimated the substitution and clock parameters for each partition, and estimated a single tree using all partitions. We implemented the birth-death serially sampled (BDSS) tree prior.
We ran two analyses for each variable combination. In each analysis, we ran the MCMC chain for 100 million generations, sampling trees and parameters every 10,000, and discarding the first 10% as burn-in. We checked log files for convergence in Tracer (v1.6; http://tree.bio.ed.ac.uk/software/tracer/). We combined trees from the two runs for each variable combination in LogCombiner (v1.8.4) and then calculated the maximum clade credibility (MCC) tree in TreeAnnotator (v1.8.4). We report divergence dates as 95% highest posterior probability credibility intervals of node heights.
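The normal priors given above can be checked with a short calculation. The Python sketch below computes the central interval of a normal distribution from its mean and standard deviation, using the values stated for the E. cf. scotti tip age and the crown group Equus root. The lognormal priors are left out because the text does not say whether their means are defined in real or log space; the function name is a hypothetical helper.

```python
from statistics import NormalDist


def central_interval(mean, stdev, mass=0.95):
    """Return the bounds enclosing the central `mass` of a normal prior."""
    tail = (1.0 - mass) / 2.0
    prior = NormalDist(mu=mean, sigma=stdev)
    return prior.inv_cdf(tail), prior.inv_cdf(1.0 - tail)


# Prior parameters as stated in the text (ages in years).
scotti_low, scotti_high = central_interval(6.7e5, 5.64e4)
root_low, root_high = central_interval(4.25e6, 1.5e5)

print(f"E. cf. scotti tip age: {scotti_low / 1e3:.0f} to {scotti_high / 1e3:.0f} ka")
print(f"Crown Equus root: {root_low / 1e6:.2f} to {root_high / 1e6:.2f} Ma")
```

For the E. cf. scotti prior, the 95% interval works out to approximately 559–781 ka, which agrees with the 560–780 ka range quoted for that specimen.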

Mitochondrial synapomorphy analysis

We first divided data set six, which consists of all available and complete equid mitogenomic sequences, into three data sets based on the genera Hippidion, Haringtonhippus, and Equus. For each of the three genus-specific alignments, we created a strict consensus sequence, whereby sites were only called if there was 100% sequence agreement, whilst including gaps and excluding ambiguous sites. We then compared the three genus-specific consensus sequences to determine sites where one genus exhibited a base state that is different to the other two genera, or, at five sites, where each genus has its own base state (Appendix 1—table 2—source data 1). In this analysis, we did not make any inference regarding the ancestral state for the identified synapomorphic base states. We identified 391 putative mtDNA genome synapomorphies for Hippidion, 178 for Haringtonhippus, and 75 for Equus (Appendix 1—table 2; Appendix 1—table 2—source data 1).
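The two steps described above (a strict consensus per genus, followed by a three-way comparison) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: sequences are already aligned, gaps ('-') are treated as valid states, any site without full agreement becomes 'N', and exactly three genera are compared. The toy sequences and function names are hypothetical.

```python
def strict_consensus(sequences):
    """Call a site only when every aligned sequence shares the same state."""
    consensus = []
    for column in zip(*sequences):
        states = {base.upper() for base in column}
        state = next(iter(states))
        if len(states) == 1 and state in "ACGT-":
            consensus.append(state)
        else:
            consensus.append("N")  # disagreement or ambiguity code
    return "".join(consensus)


def genus_specific_sites(consensus_by_genus):
    """List 1-based sites where one genus differs from the other two."""
    genera = list(consensus_by_genus)
    if len(genera) != 3:
        raise ValueError("this sketch compares exactly three genera")
    sites = {genus: [] for genus in genera}
    sites["all_differ"] = []
    for index, column in enumerate(zip(*consensus_by_genus.values()), start=1):
        if "N" in column:
            continue  # at least one genus has no strict consensus here
        distinct = set(column)
        if len(distinct) == 3:
            sites["all_differ"].append(index)
        elif len(distinct) == 2:
            for genus, state in zip(genera, column):
                if column.count(state) == 1:
                    sites[genus].append(index)
    return sites


# Hypothetical toy alignments (two sequences per genus).
toy = {
    "Hippidion": strict_consensus(["ACGTA-A", "ACGTA-A"]),
    "Haringtonhippus": strict_consensus(["ATGTACG", "ATGTCCG"]),
    "Equus": strict_consensus(["ACGAACT", "ACGAACT"]),
}
print(genus_specific_sites(toy))
```

As in the analysis described above, the sketch makes no inference about which state is ancestral; it only records where a genus is fixed for a state not shared with the other two.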

Appendix 1—table 2

Summary of the number and type of synapomorphic bases for each of the three examined equid genera.

A full list of these substitutions, and their position relative to the E. caballus reference mitochondrial genome (NC_001640), can be found in Appendix 1—table 2—source data 1. *Total includes a further five synapomorphic sites that have unique states in each genus.

https://doi.org/10.7554/eLife.29944.021

Substitution	Hippidion	Haringtonhippus	Equus
Transition	338	147	66
Transversion	43	22	4
Insertion	2	4	0
Deletion	3	0	0
Total*	391	178	75

Appendix 1—table 2—source data 1. A compilation of all 634 putative synapomorphic sites in the mitochondrial genome for Hippidion, Haringtonhippus, and Equus (A), with a comparison to the published MS272 mitochondrial genome sequence at the 140 sites with a base state that matches one of the three genera (B). The horse reference mtDNA has Genbank accession NC_001640.1. https://doi.org/10.7554/eLife.29944.022

Phylogenetic inference from nuclear genomes

We compared the genomes of a horse (E. caballus; EquCab2; GCA_000002305.1) and donkey (E. asinus; Willy, 12.4×; http://geogenetics.ku.dk/publications/middle-pleistocene-omics; [Orlando et al., 2013]) with shotgun sequence data from 17 of our NWSL equid samples (Figure 1—source data 2, and Supplementary file 1). We merged paired-end reads using SeqPrep as described above, except that we removed merged reads shorter than 30 bp. We further removed merged and remaining unmerged reads that had low sequence complexity, defined as a DUST score >7, using PRINSEQ-lite v0.20.4 (Schmieder and Edwards, 2011). We used four successive approaches to minimize the impact of mapping bias introduced from ancient DNA fragment length variation and reference genome choice.

We first followed a modified version of the approach outlined in Heintzman et al. (2015). We mapped the donkey genome to the horse genome by computationally dividing the donkey genome into 150 bp ‘pseudo-reads’ tiled every 75 bp, and aligned these pseudo-reads using Bowtie2-local v2.1.0 (Langmead and Salzberg, 2012) while allowing one seed mismatch and a maximum mismatch penalty of four to better account for ancient DNA specific damage (Appendix 1—figure 1, steps 1–3). We then mapped the filtered shotgun data from each of the NWSL equid samples to the horse genome using Bowtie2-local with the settings described above, and removed PCR duplicated reads and those with a mapping quality score of <30 in SAMtools. We called a pseudo-haploidized sequence for the donkey and NWSL equid alignments, by randomly picking a base with a base quality score ≥60 at each position, using SAMtools mpileup. We masked positions that had a coverage not equal to 2× (donkey) or 1× (NWSL equid), and those located on scaffolds shorter than 100 kb (Appendix 1—figure 1, step 4). As the horse, donkey, and NWSL equid genome sequences were all based on the horse genome coordinates, we compared the relative transversion frequency between the donkey or NWSL equids and the horse using custom scripts. We restricted our analyses to transversions to avoid the impacts of ancient DNA damage, which can manifest as erroneous transitions from the deamination of cytosine (e.g. Appendix 2—figure 1,2) (Dabney et al., 2013b). We repeated this analysis, but with the horse and NWSL equids mapped to the donkey genome (the donkey genome coordinate framework).
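The pseudo-read tiling described above (150 bp windows placed every 75 bp) can be illustrated with the following Python sketch. It is not the authors' script; the function names and the toy scaffold are hypothetical, and trailing bases that cannot fill a complete window are simply dropped here as a simplifying assumption.

```python
def tile_pseudo_reads(sequence, read_length=150, step=75):
    """Return (start, pseudo_read) tiles of fixed length along a sequence."""
    last_start = len(sequence) - read_length
    return [
        (start, sequence[start:start + read_length])
        for start in range(0, last_start + 1, step)
    ]


def tiling_depth(sequence_length, tiles):
    """Count how many tiles cover each position of the original sequence."""
    depth = [0] * sequence_length
    for start, read in tiles:
        for position in range(start, start + len(read)):
            depth[position] += 1
    return depth


# Hypothetical 375 bp scaffold, used only to show the tiling pattern.
scaffold = "ACGTA" * 75
tiles = tile_pseudo_reads(scaffold)
depth = tiling_depth(len(scaffold), tiles)
print(len(tiles), "pseudo-reads; depth at position 100 is", depth[100])
```

With a step of half the read length, interior positions are covered twice while the first and last 75 bp of each sequence are covered once, which is consistent with masking donkey positions whose coverage is not equal to 2×.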

Appendix 1—figure 1

An overview of the nuclear genome analysis pipeline.

A first reference genome sequence (red; step 1) is divided into 150 bp pseudo-reads, tiled every 75 bp for exactly 2 × genomic coverage (step 2). These pseudo-reads are then mapped to a second reference genome (blue; step 3), and a consensus sequence of the mapped pseudo-reads is called (step 4). Regions of the second reference genome that are not covered by the pseudo-reads are masked (step 5). For each NWSL equid sample, reads (orange) are mapped independently to the first reference consensus sequence (step 6a) and masked second reference genome (step 6b). Alignments from steps 6a and 6b are then merged (step 7). For alignment coordinates that have base calls for the first reference, second reference, and NWSL equid sample genomes, the relative frequencies of private transversion substitutions (yellow stars) for each genome are calculated (step 8). The co-ordinates from the second reference genome (blue) are used for each analysis.

https://doi.org/10.7554/eLife.29944.023

For the second approach and using the horse genome coordinate framework, we next masked sites in the horse reference genome that were not covered by donkey reads at a depth of 2×. This resulted in the horse genome and donkey consensus sequence being masked at the same positions (Appendix 1—figure 1, step 5). We then separately mapped the filtered NWSL equid shotgun data to scaffolds longer than 100 kb for the masked horse genome and donkey consensus sequence (Appendix 1—figure 1, step 6), called NWSL equid consensus sequences, and calculated relative transversion frequencies as described above. This analysis was repeated using the donkey genome coordinate framework.

Next, for each genome coordinate framework, we combined the two alignments for each NWSL equid sample from approach two to create a union of reads mappable to both the masked coordinate genome and alternate genome consensus sequence (Appendix 1—figure 1, step 7). If a NWSL equid read mapped to different coordinates between the two references, we selected the alignment with the higher mapping quality score and randomly selected between mappings of equal quality. We then called NWSL equid consensus sequences as above. As this third approach allowed for simultaneous comparison of the horse, donkey, and NWSL equid sequences, we calculated relative private transversion frequencies for each sequence, at sites where all three sequences had a base call, using tri-aln-report (Green et al., 2015; https://github.com/Paleogenomics/Chrom-Compare) (Appendix 1—figure 1, step 8).
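The idea of a private transversion can be sketched in Python as follows. This is not the tri-aln-report implementation; it is a minimal illustration that assumes a substitution is "private" to one sequence when the other two sequences agree with each other, and that any character other than A, C, G, or T marks a masked or uncalled site. The function names and toy sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transversion(a, b):
    """True if a and b differ by a purine <-> pyrimidine change."""
    return (a in PURINES and b in PYRIMIDINES) or (a in PYRIMIDINES and b in PURINES)


def private_transversions(horse, donkey, nwsl):
    """Count private transversions per sequence at sites where all three have a call."""
    counts = {"horse": 0, "donkey": 0, "nwsl": 0}
    compared = 0
    for h, d, n in zip(horse, donkey, nwsl):
        if not {h, d, n} <= set("ACGT"):
            continue  # masked or uncalled in at least one sequence
        compared += 1
        if d == n and is_transversion(h, d):
            counts["horse"] += 1
        elif h == n and is_transversion(d, h):
            counts["donkey"] += 1
        elif h == d and is_transversion(n, h):
            counts["nwsl"] += 1
    return counts, compared


# Toy aligned sequences in a shared coordinate system (hypothetical data).
counts, compared = private_transversions("AACGTNAT", "AACTTAGA", "CACGTAAA")
print(counts, compared)
```

Transitions, such as the A/G difference at the seventh site of the toy alignment, are ignored, which mirrors the rationale of restricting the analysis to transversions to limit the influence of cytosine deamination.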

Finally, as a fourth approach and for both genome coordinate frameworks, we repeated approach three with the exception that we divided the NWSL alignments by mapped read length. We split the alignments into 10 bp read bins ranging from 30–39 to 120–129 bp, and discarded longer reads and paired-end reads that were unmerged by SeqPrep. We called consensus sequences and calculated relative private transversion frequencies for each sequence as described above. We only used relative private transversion frequencies from the 90–99 to 120–129 bp bins for divergence date estimates (Appendix 2).
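The read-length binning can be sketched as below. The function name is hypothetical, and returning `None` for reads shorter than 30 bp is an assumption of this sketch (the text states only that longer reads were discarded); the bin boundaries and the bins retained for dating follow the description above.

```python
def length_bin(read_len, lowest=30, highest=129, width=10):
    """Return the 10 bp bin label for a mapped read length, or None if out of range."""
    if read_len < lowest or read_len > highest:
        return None
    start = lowest + ((read_len - lowest) // width) * width
    return f"{start}-{start + width - 1}"


# Bins used for divergence date estimates according to the text.
DATING_BINS = {"90-99", "100-109", "110-119", "120-129"}

for length in (29, 30, 95, 129, 130):
    label = length_bin(length)
    print(length, label, label in DATING_BINS)
```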

Sex determination from nuclear genomes

We used the alignments of the 17 NWSL equids to the horse genome, from approach one described above, to infer the probable sex of these individuals. For this, we determined the number of reads mapped to each chromosome using SAMtools idxstats. For each chromosome, we then calculated the relative mapping frequency by dividing the number of mapped reads by the length of the chromosome. We then compared the relative mapping frequency between the autosomes and X-chromosome. As males and females are expected to have one and two copies of the X chromosome, respectively, and two copies of every autosome, we inferred a male if the ratio between the autosomes and X-chromosome was 0.45–0.55 and a female if the ratio was 0.9–1.1.
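A minimal Python sketch of this calculation is given below. It parses the four tab-separated columns that SAMtools idxstats reports (reference name, sequence length, mapped reads, unmapped reads). Several details are assumptions of the sketch rather than statements from the text: the chromosome names (`chrX`, `chrY`, `chrM`), the pooling of all autosomes into a single frequency, and the `ambiguous` label for ratios outside the two stated ranges.

```python
def x_to_autosome_ratio(idxstats_text, x_name="chrX", excluded=("chrY", "chrM", "*")):
    """Ratio of the X-chromosome relative mapping frequency to the pooled autosomal one."""
    auto_reads = 0
    auto_length = 0
    x_frequency = None
    for line in idxstats_text.strip().splitlines():
        name, length, mapped, _unmapped = line.split("\t")
        length, mapped = int(length), int(mapped)
        if name == x_name:
            x_frequency = mapped / length
        elif name not in excluded and length > 0:
            auto_reads += mapped
            auto_length += length
    if x_frequency is None or auto_reads == 0:
        raise ValueError("X chromosome or autosomal reads missing from idxstats output")
    return x_frequency / (auto_reads / auto_length)


def classify_sex(ratio):
    """Apply the thresholds stated in the text; other values are left unresolved."""
    if 0.45 <= ratio <= 0.55:
        return "male"
    if 0.9 <= ratio <= 1.1:
        return "female"
    return "ambiguous"


# Toy idxstats output with hypothetical lengths and read counts.
toy = "chr1\t200\t2000\t0\nchr2\t100\t1000\t0\nchrX\t100\t500\t0\n*\t0\t0\t7"
ratio = x_to_autosome_ratio(toy)
print(ratio, classify_sex(ratio))
```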

DNA damage analysis

For a subset of nine samples, we realigned the filtered sequence data from the libraries enriched for equid mitochondrial DNA to either the H. francisci (for H. francisci samples) or horse (for E. lambei and E. cf. scotti samples) reference mitochondrial genome sequences using BWA-aln as described above. We also realigned the filtered unenriched sequence data to the horse reference genome (EquCab2) for a subset of six samples using the same approach. We then analyzed patterns of DNA damage in mapDamage v2.0.5 (Jónsson et al., 2013).

Appendix 2

Supplementary Results

Ancient DNA characterization

We selected a subset of samples for the analysis of DNA damage patterns. In all of these samples, we observe expected patterns of damage in both mitochondrial and nuclear DNA, including evidence of the deamination of cytosine residues at the ends of reads, depurination-induced strand breaks, and a short mean DNA fragment length (Dabney et al., 2013b) (Appendix 2—figure 1–2). We note that the sample with the greatest proportion of deaminated cytosines is E. cf. scotti (YG 198.1; Appendix 2—figure 1v-x), which is the oldest sample in the subset (Supplementary file 1).

Appendix 2—figure 1

Characterization of ancient mitochondrial DNA damage patterns from nine equid samples.

*H. francisci*: (**A–C**) JK166 (LACM(CIT) 109/150807; Nevada), (**D–F**) JK207 (LACM(CIT) 109/156450; Nevada), (**G–I**) JK260 (KU 47800; Wyoming), (**J–L**) PH013 (YG 130.6; Yukon), (**M–O**) PH047 (YG 404.663; Yukon), (**P–R**) MS272 (YG 401.268; Yukon), (**S–U**) MS349 (YG 130.55; Yukon); E. cf. *scotti*: (**V–X**) PH055 (YG 198.1; Yukon); *E. lambei*: (**Y–AA**) MS316 (YG 328.54; Yukon). Every third panel: (A) to (Y) DNA fragment length distributions; (B) to (Z) proportion of cytosines that are deaminated at fragment ends (red: cytosine → thymine; blue: guanine → adenine); and (C) to (AA) mean base frequencies immediately upstream and downstream of the 5’ and 3’ ends of mapped reads.

https://doi.org/10.7554/eLife.29944.025

Appendix 2—figure 2

Characterization of ancient nuclear DNA damage patterns from six *H. francisci* samples.

(**A–C**) JK166 (LACM(CIT) 109/150807; Nevada), (**D–F**) JK260 (KU 47800; Wyoming), (**G–I**) PH013 (YG 130.6; Yukon), (**J–L**) PH036 (YG 76.2; Yukon), (**M–O**) MS349 (YG 130.55; Yukon), (**P–R**) MS439 (YG 401.387; Yukon). Every third panel: (A) to (P) DNA fragment length distributions; (B) to (Q) proportion of cytosines that are deaminated at fragment ends (red: cytosine → thymine; blue: guanine → adenine); and (C) to (R) mean base frequencies immediately upstream and downstream of the 5’ and 3’ ends of mapped reads.

https://doi.org/10.7554/eLife.29944.026

Resolving the phylogenetic placement of NWSL equids using mitochondrial genomes

We ran Bayesian and ML phylogenetic analyses on mtDNA genome alignment data sets 1–3, whilst varying the outgroup, including (all) or excluding (reduced) the fast-evolving partitions (see Appendix 1), and including or excluding the Hippidion sequences. In all analyses, we recover four major equid groups (Hippidion, NWSL equids (=H. francisci), caballine Equus, and non-caballine Equus) with strong statistical support (Bayesian posterior probability (BPP): 1.000; ML bootstrap: 96–100%; Appendix 2—table 1), consistent with previous studies (e.g. [Der Sarkissian et al., 2015; Orlando et al., 2009]). The topology relating these four groups, however, conflicts between analyses, depending on the variables described above and the choice of phylogenetic algorithm (Appendix 2—figure 3; Appendix 2—table 1). Across all analyses, strong statistical support (BPP: ≥0.99; ML bootstrap: ≥95%) is only associated with topology 1 (Appendix 2—figure 3; Appendix 2—table 1), in which NWSL equids are placed outside of Equus, and Hippidion is placed outside of the NWSL equid-Equus clade. We note that the analyses with the strongest support include multiple outgroups (mtDNA data set three).

Appendix 2—figure 3

Seven phylogenetic hypotheses for the four major groups of equids with sequenced mitochondrial genomes.

These major groups are *Hippidion*, the New World stilt-legged equids (=*Haringtonhippus*), non-caballine *Equus* (asses, zebras, and *E. ovodovi*) and caballine *Equus* (horses). (A) imbalanced and (B) balanced hypotheses. The hypotheses presented in (C) and (D) are identical to (A) and (B), except that *Hippidion* is excluded. Node letters are referenced in Appendix 2—tables 1–2. We only list combinations that were recovered by our palaeogenomic, or previous palaeogenetic, analyses.

https://doi.org/10.7554/eLife.29944.027

Appendix 2—table 1

Topological shape and support values for the best supported trees.

These results are from the Bayesian and maximum likelihood (ML) analyses of mtDNA data sets 1–3, including either the all or reduced partition sets, and with Hippidion sequences either included or excluded. Topology numbers and node letters refer to those outlined in Appendix 2—figure 3. Bayesian posterior probability support of >0.99 and ML bootstrap support of >95% are in bold for nodes A and B. *support for nodes that are consistent with topology one in Appendix 2—figure 3. NCs: non-caballines.

https://doi.org/10.7554/eLife.29944.028

Outgroup	Partitions	Hippidion?	Tips	Analysis method	Topology	Support: Node A	Support: Node B	Support: Hippidion	Support: NWSL	Support: NCs	Support: Caballines
White rhino (Data set 1)	All	Excluded	63	Bayesian	1/2/3	0.996*	N/A	N/A	1.000	1.000	1.000
		Excluded	63	ML	1/2/3	71*	N/A	N/A	100	99	100
		Included	69	Bayesian	2	0.751	1.000*	1.000	1.000	1.000	1.000
		Included	69	ML	1	64*	96*	100	100	100	100
	Reduced	Excluded	63	Bayesian	1/2/3	1.000*	N/A	N/A	1.000	1.000	1.000
		Excluded	63	ML	1/2/3	100*	N/A	N/A	99	100	100
		Included	69	Bayesian	2	0.948	1.000*	1.000	1.000	1.000	1.000
		Included	69	ML	2	73	98*	100	99	100	100
Malayan tapir (Data set 2)	All	Excluded	63	Bayesian	5/7	0.971	N/A	N/A	1.000	1.000	1.000
		Excluded	63	ML	5/7	87	N/A	N/A	100	99	99
		Included	69	Bayesian	6	0.808	0.867	1.000	1.000	1.000	1.000
		Included	69	ML	6	55	63	100	100	100	100
	Reduced	Excluded	63	Bayesian	1/2/3	0.675*	N/A	N/A	1.000	1.000	1.000
		Excluded	63	ML	4/6	28	N/A	N/A	100	96	98
		Included	69	Bayesian	3	0.685	0.864*	1.000	1.000	1.000	1.000
		Included	69	ML	3	70	69	100	100	100	100
Dog + ceratomorphs (Data set 3)	All	Excluded	71	Bayesian	1/2/3	0.598*	N/A	N/A	1.000	1.000	1.000
		Excluded	71	ML	4/6	59	N/A	N/A	100	100	100
		Included	77	Bayesian	1	1.000*	1.000*	1.000	1.000	1.000	1.000
		Included	77	ML	1	94*	96*	100	100	100	100
	Reduced	Excluded	71	Bayesian	1/2/3	0.999*	N/A	N/A	1.000	1.000	1.000
		Excluded	71	ML	1/2/3	97*	N/A	N/A	100	100	100
		Included	77	Bayesian	1	1.000*	1.000*	1.000	1.000	1.000	1.000
		Included	77	ML	1	99*	100*	100	100	100	100

Appendix 2—table 2

The a posteriori phylogenetic placement likelihood for eight ceratomorph (rhino and tapir) outgroups.

These analyses used a ML evolutionary placement algorithm, whilst varying the partition set used (all or reduced), and either including or excluding Hippidion sequences. Likelihoods >0.95 are in bold. Topology numbers refer to those outlined in Appendix 2—figure 3. Genbank accession numbers are given in parentheses after outgroup names.

https://doi.org/10.7554/eLife.29944.029

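Partitions	Outgroup		Hippidion included: topology 1	Hippidion included: topology 2	Hippidion included: topology 3	Hippidion included: topology 6	Hippidion excluded: topology 1/2/3	Hippidion excluded: topology 4/6	Hippidion excluded: topology 5/7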
All	Tapirus terrestris (AJ428947)		0.456	0.317	0.205	0.018	0.549	0.313	0.139
	Tapirus indicus (NC023838)		0.275	0.105	0.225	0.389	0.050	0.908	0.042
	Coelodonta antiquitatis (NC012681)		0.998				0.248	0.451	0.301
	Dicerorhinus sumatrensis (NC012684)		0.981		0.009		0.155	0.553	0.292
	Rhinoceros unicornis (NC001779)		0.998				0.529	0.334	0.137
	Rhinoceros sondaicus (NC012683)		0.989	0.006			0.732	0.196	0.072
	Ceratotherium simum (NC001808)		0.448	0.499	0.053		0.949	0.018	0.033
	Diceros bicornis (NC012682)		0.917	0.065	0.018		0.851	0.073	0.076
Reduced	Tapirus terrestris (AJ428947)		0.410	0.391	0.199		0.987		0.012
	Tapirus indicus (NC023838)		0.536	0.298	0.166		0.995
	Coelodonta antiquitatis (NC012681)		0.411	0.554	0.035		1.000
	Dicerorhinus sumatrensis (NC012684)		0.983	0.015			1.000
	Rhinoceros unicornis (NC001779)		0.998				1.000
	Rhinoceros sondaicus (NC012683)		0.895	0.102			1.000
	Ceratotherium simum (NC001808)		0.296	0.704			1.000
	Diceros bicornis (NC012682)		0.996				1.000

We further investigated the effect of outgroup choice by using an evolutionary placement algorithm (EPA; [Berger et al., 2011]) to place the outgroup sequences into an unrooted ML phylogeny a posteriori using the same set of variables described above. We find that the outgroup placement likelihood is increased with the inclusion of Hippidion sequences, and that the only placements with a likelihood of ≥0.95 are consistent with topology one (Appendix 2—figure 3; Appendix 2—table 2), in agreement with the Bayesian and ML phylogenetic analyses. The phylogenetic and EPA analyses demonstrate that outgroup choice can greatly impact equid phylogenetic inference and that multiple outgroups should be used for resolving relationships between major equid groups.

We lastly ran Bayesian timetree analyses in BEAST in the absence of an outgroup, whilst including or excluding the fast-evolving partitions, including or excluding the E. ovodovi sequence, and constraining the root prior or not. All BEAST analyses yielded a maximum clade credibility tree that is consistent with topology one (Figure 1 and Appendix 2—figure 3) with Bayesian posterior probability support for the NWSL equid-Equus and Equus clades of 0.996–1.000 (Figure 1—source data 1). Altogether, the phylogenetic, EPA, and timetree analyses support topology one (Appendix 2—figure 3), with NWSL equids falling outside of Equus, and therefore the NWSL equids as a separate genus, Haringtonhippus.

Placement of previously published NWSL equid sequences

To confirm that all 15 previously published NWSL equid samples with available mtDNA sequence data (Barrón-Ortiz et al., 2017; Vilstrup et al., 2013; Weinstock et al., 2005) belong to H. francisci, we either reconstructed mitochondrial genomes for these samples (JW277, JW161; [Weinstock et al., 2005]), placed the sequences into a ML phylogeny a posteriori using the EPA whilst varying the partitioning scheme and inclusion or exclusion of Hippidion (Appendix 2—table 3), or both. For JW277 and JW161, the mitochondrial genomes were consistent with those derived from the newly analyzed samples (Figure 1—figure supplement 1). For eight other NWSL equid mitochondrial sequences (JW125, JW126, JW328, EQ3, EQ9, EQ13, EQ22, EQ41; [Barrón-Ortiz et al., 2017; Vilstrup et al., 2013; Weinstock et al., 2005]), including samples from Mineral Hill Cave and Dry Cave (Supplementary file 1), the EPA strongly supported a ML placement within the NWSL equid clade (cumulative likelihood of 0.974–1.000). The EPA placed four sequences from Dry Cave, San Josecito Cave, and the Edmonton area (EQ1, EQ4, EQ16, EQ30; [Barrón-Ortiz et al., 2017]) within the NWSL equid clade albeit with lower support (cumulative likelihood of 0.703–0.854). We note that in the case of EQ4 from Edmonton, this may be due to very limited available sequence data (117 bp). For EQ1, EQ16, and EQ30, the placement with the second greatest support is the branch leading to NWSL equids (cumulative likelihood of 0.138–0.259), which, assuming high fidelity of the sequence data, may indicate that these samples fall outside of, but close to, sampled NWSL equid mitochondrial diversity. However, the EPA placed the remaining sample (MS272; [Vilstrup et al., 2013]) on the branch leading to NWSL equids with strong support (likelihood: 1.000). We therefore explored whether this is real or if the published sequence for MS272 was problematic.

Appendix 2—table 3

The a posteriori phylogenetic placement likelihood for 21 published equid mitochondrial sequences.

These analyses used the ML evolutionary placement algorithm, whilst varying the partition set used (all or reduced), and either including or excluding Hippidion sequences. Sample names are given in parentheses after the species or group name. Localities are given for NWSL equids only. Likelihoods >0.95 are in bold. *Equus includes only caballines and non-caballine equids (NCE). **For EQ04 from Alberta, other placement likelihood values for the Hippidion included/excluded partitions were: Within caballines: 0.003/0.002, Sister to caballines: 0.002/0.002, Within NCE: 0.246/0.245, Sister to NCE: 0.004/0.003. No placements were returned for ‘within Hippidion’. bp: base pairs.

https://doi.org/10.7554/eLife.29944.030

Hippidion?	Partition	Published sample	Sequence length (bp)	Locality	Placement: sister to E. ovodovi	Placement: sister to Hippidion	Placement: within NWSL	Placement: sister to NWSL	Placement: sister to Equus*	Placement: other**
Included	All	E. ovodovi (ACAD2305)	688		1.000
		E. ovodovi (ACAD2302)	688		1.000
		E. ovodovi (ACAD2303)	688		1.000
		H. devillei (ACAD3615)	476		N/A	1.000
		H. devillei (ACAD3625)	543		N/A	1.000
		H. devillei (ACAD3627)	543		N/A	1.000
		H. devillei (ACAD3628)	543		N/A	0.999
		H. devillei (ACAD3629)	476		N/A	0.999
		NWSL equid (JW125)	720	Klondike, YT	N/A		0.996
		NWSL equid (JW126)	720	Klondike, YT	N/A		0.999
Included	All	NWSL equid (EQ01)	620	Dry Cave, NM	N/A		0.735	0.256
		NWSL equid (EQ03)	117	Dry Cave, NM	N/A	0.002	0.974	0.011	0.003
		NWSL equid (EQ04)	117	Edmonton, AB	N/A	0.004	0.703	0.014	0.007	0.255
		NWSL equid (EQ09)	620	Natural Trap Cave, WY	N/A		0.981	0.014
		NWSL equid (EQ13)	620	Natural Trap Cave, WY	N/A		0.992
		NWSL equid (EQ16)	464	Dry Cave, NM	N/A		0.854	0.138
		NWSL equid (EQ22)	620	Natural Trap Cave, WY	N/A		0.999
		NWSL equid (EQ30)	393	San Josecito Cave, MX-NL	N/A		0.792	0.198
		NWSL equid (EQ41)	398	Natural Trap Cave, WY	N/A		0.997
		NWSL equid (JW328)	mitogenome	Mineral Hill Cave, NV	N/A		1.000
		NWSL equid (MS272)	mitogenome	Klondike, YT	N/A			1.000
	Reduced	NWSL equid (JW328)	mitogenome	Mineral Hill Cave, NV	N/A		0.996
	Reduced	NWSL equid (MS272)	mitogenome	Klondike, YT	N/A			1.000
Excluded	All	E. ovodovi (ACAD2305)	688		1.000	N/A			N/A
		E. ovodovi (ACAD2302)	688		1.000	N/A			N/A
		E. ovodovi (ACAD2303)	688		1.000	N/A			N/A
		NWSL equid (JW125)	720	Klondike, YT	N/A	N/A	0.996		N/A
		NWSL equid (JW126)	720	Klondike, YT	N/A	N/A	0.999		N/A
		NWSL equid (EQ01)	620	Dry Cave, NM	N/A	N/A	0.731	0.259	N/A
		NWSL equid (EQ03)	117	Dry Cave, NM	N/A	N/A	0.980	0.010	N/A
		NWSL equid (EQ04)	117	Edmonton, AB	N/A	N/A	0.721	0.013	N/A	0.252
		NWSL equid (EQ09)	620	Natural Trap Cave, WY	N/A	N/A	0.987	0.008	N/A
		NWSL equid (EQ13)	620	Natural Trap Cave, WY	N/A	N/A	0.993		N/A
		NWSL equid (EQ16)	464	Dry Cave, NM	N/A	N/A	0.844	0.148	N/A
		NWSL equid (EQ22)	620	Natural Trap Cave, WY	N/A	N/A	0.999		N/A
		NWSL equid (EQ30)	393	San Josecito Cave, MX-NL	N/A	N/A	0.788	0.203	N/A
		NWSL equid (EQ41)	398	Natural Trap Cave, WY	N/A	N/A	0.995		N/A
		NWSL equid (JW328)	mitogenome	Mineral Hill Cave, NV	N/A	N/A	1.000		N/A
		NWSL equid (MS272)	mitogenome	Klondike, YT	N/A	N/A		1.000	N/A
	Reduced	NWSL equid (JW328)	mitogenome	Mineral Hill Cave, NV	N/A	N/A	0.995		N/A
	Reduced	NWSL equid (MS272)	mitogenome	Klondike, YT	N/A	N/A		1.000	N/A

We first tested the EPA on eight other equid mitochondrial sequences (E. ovodovi, n = 3; Hippidion devillei, n = 5), which grouped as expected from previous analyses (likelihood: 0.999–1.000; Appendix 2—table 3; [Orlando et al., 2009]). We then used our mitochondrial genome assembly pipeline to reconstruct a consensus for MS272 from the raw data used by Vilstrup et al. (2013), which resulted in a different sequence that was consistent with other NWSL equids. To confirm this new sequence, we used the original MS272 DNA extract for library preparation, target enrichment, and sequencing. The consensus from this analysis was identical to our new sequence.

We sought to understand the origins of the problems associated with the published MS272 sequence. We first applied our synapomorphy analysis. For the called bases, we found that the published MS272 sequence contained 0/384 diagnostic bases for Hippidion, 124/164 for Haringtonhippus, and 16/70 for Equus (Appendix 1—table 2—source data 1). We infer from this analysis that ~76% of the published MS272 sequence derives from Haringtonhippus and that ~23% originates from Equus. The presence of Equus synapomorphies could be explained by the fact that the enriched library for MS272 was sequenced on the same run as ancient caballine horses (Equus), thereby potentially introducing contaminating reads from barcode bleeding (Kircher et al., 2012), which may have been exacerbated by alignment to the modern horse reference mitochondrial genome with BWA-aln and consensus calling using SAMtools (Vilstrup et al., 2013). The presence of caballine horse sequence in the published MS272 mtDNA genome explains why previous phylogenetic analyses of mitochondrial genomes have recovered NWSL equids as sister to caballine Equus with strong statistical support (Der Sarkissian et al., 2015; Vilstrup et al., 2013).
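The percentages quoted above follow directly from the diagnostic base counts, as the short Python calculation below shows. The variable and function names are illustrative only; the counts are those reported in the text. Note that each percentage has its own denominator (the number of called diagnostic sites for that group), so the values are not expected to sum to 100%.

```python
# Diagnostic (synapomorphic) bases matched / called in the published MS272 sequence.
diagnostic_counts = {
    "Hippidion": (0, 384),
    "Haringtonhippus": (124, 164),
    "Equus": (16, 70),
}


def percent_matching(matched, called):
    """Percentage of called diagnostic sites carrying the group-specific base."""
    return 100 * matched / called


for group, (matched, called) in diagnostic_counts.items():
    print(f"{group}: {matched}/{called} = {percent_matching(matched, called):.1f}%")
```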

Resolving the phylogenetic placement of NWSL equids using nuclear genomes

The horse and donkey genomes are representative of total Equus genomic diversity (Jónsson et al., 2014), and so, if NWSL equids are Equus, we should expect their genomes to be more similar to either horse or donkey than to the alternative.

Initial analyses based on approach one (see Appendix 1) were inconclusive, with some NWSL equid samples appearing to fall outside of Equus (higher relative transversion frequency between the NWSL equid and the horse or donkey than between the horse and donkey) and others inconsistently placed in the phylogeny, appearing most closely related to horse when aligned to the horse genome and most closely related to donkey when aligned to the donkey genome (Figure 1—source data 2). We then used approaches two and three in an attempt to standardize between the horse and donkey reference genomes, and therefore reduce potential bias introduced from the reference genome. In the latter union-based approach, mapping should not be disproportionately sensitive to regions of the genome where NWSL equids are more horse- or donkey-like. These approaches were also unsuccessful, but we noted that the relative private transversion frequency for the coordinate genome and NWSL equid sequences correlated with mean DNA fragment length (Appendix 2—figure 4 and Figure 1—source data 2). We therefore used approach four to control for the large variation in mean DNA fragment length between NWSL equid sequences (Appendix 2—figure 2 and Figure 1—source data 2), which is likely due to a combination of DNA preservation and differences in the DNA extraction and library preparation techniques used (Figure 1—source data 2). This allowed for direct comparison between the NWSL equid samples, which showed a consistent pattern across read length bins (Figure 1—figure supplement 2, Figure 1—source data 1). The relative private transversion frequencies for both the coordinate genome and NWSL equid sequences increase with read length until the 90–99 bp bin, at which point the coordinate genome and alternate sequence relative private transversion frequencies converge (defined as a ratio of 0.95–1.05) and the NWSL equid relative private transversion frequencies reach a plateau at 1.40–1.56× greater than that of the horse or donkey (Figure 1—figure supplements 2–3, Figure 1—source data 1).
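The convergence criterion can be expressed as a simple check across read-length bins. In the Python sketch below, the function name, the direction of the ratio (coordinate genome divided by alternate sequence), and all frequency values are hypothetical and chosen only to illustrate the 0.95–1.05 window; they are not the measured values.

```python
def first_converged_bin(bins, lower=0.95, upper=1.05):
    """Return the label of the first bin whose coordinate/alternate ratio lies in [lower, upper].

    bins: list of (label, coordinate_frequency, alternate_frequency), ordered by read length.
    """
    for label, coordinate_freq, alternate_freq in bins:
        if alternate_freq > 0 and lower <= coordinate_freq / alternate_freq <= upper:
            return label
    return None


# Hypothetical relative private transversion frequencies per read-length bin.
toy_bins = [
    ("60-69", 1.30, 1.00),
    ("80-89", 1.10, 1.00),
    ("90-99", 1.02, 1.00),
]
print(first_converged_bin(toy_bins))
```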

Appendix 2—figure 4

A comparison of relative private transversion frequencies between the nuclear genomes of a caballine *Equus* (horse, *E. caballus*; green), a non-caballine *Equus* (donkey, *E. asinus*; red), and the 17 New World stilt-legged (NWSL) equid samples (=*Haringtonhippus francisci*; blue), using approach three (Appendix 1), with samples ordered by increasing mean mapped read length. Analyses are based on alignment to the horse (A) or donkey (B) genome coordinates.

https://doi.org/10.7554/eLife.29944.031

A greater relative private transversion frequency in NWSL equids, as compared to horse and donkey, is consistent with their being more diverged than the horse-donkey split (Equus) and therefore supports the hypothesis of NWSL equids as a separate genus (Haringtonhippus).

Sex determination from nuclear genomes

We inferred the sex of our 17 NWSL equid samples by calculating the ratio of relative mapping frequencies between the autosomes and X-chromosome (Appendix 2—table 4—source data 1). We find that at least four of our samples are male and at least eight are female (Appendix 2—table 4).

Appendix 2—table 4

Sex determination analysis of 17 NWSL equids.

Chromosome ratio is the relative mapping frequency ratio between all autosomes and the X-chromosome. Males are inferred if the ratio is 0.45–0.55 and females if the ratio is 0.9–1.1.

https://doi.org/10.7554/eLife.29944.032

Sample	Museum accession	Chromosome ratio	Inferred sex
AF037	YG 402.235	0.48	male
JK166	LACM(CIT) 109/150807	0.93	female
JK167	LACM(CIT) 109/149291	0.91	female
JK207	LACM(CIT) 109/156450	0.92	female
JK260	KU 47800	0.95	female
JK276	KU 53678	0.91	female
MS341	YG 303.1085	0.50	male
MS349	YG 130.55	0.48	male
MS439	YG 401.387	0.98	female
PH008	YG 404.205	0.90	female
PH013	YG 130.6	0.87	probable female
PH014	YG 303.371	0.46	male
PH015	YG 404.662	0.44	probable male
PH021	YG 29.169	0.83	probable female
PH023	YG 160.8	0.91	female
PH036	YG 76.2	0.81	probable female
PH047	YG 404.663	0.88	probable female

Appendix 2—table 4—source data 1 Data from the sex determination analyses of 17 NWSL equids, based on alignment to the horse genome (EquCab2).: https://doi.org/10.7554/eLife.29944.033

We note that all three Gypsum Cave samples are inferred to be female, have statistically indistinguishable radiocarbon dates, and identical mtDNA genome sequences (Figure 1—figure supplement 1b, Supplementary file 1). However, the skull was found in room four of the cave, whereas the femur and metatarsal were found in room three. The available evidence therefore suggests that these samples represent at least two individuals.

Intriguingly, we further note that, across all 17 NWSL equid samples, the relative mapping frequency for chromosomes 8 and 13 is appreciably greater than that for the remaining autosomes (Appendix 2—table 4—source data 1). This may suggest that duplicated regions of these chromosomes are present in NWSL equids, as compared to the horse (E. caballus).

Designation of a type species for Haringtonhippus

We sought to designate a type species for the NWSL equid genus, Haringtonhippus, using an existing name, in order to avoid adding to the unnecessarily extensive list of Pleistocene North American equid species names (Winans, 1985). For this, we scrutinized nine names that have previously been assigned to NWSL equids in order of priority (date the name was first described in the literature). We rejected names that were solely based on dentitions, as these anatomical features are insufficient for delineating between equid groups (Groves and Willoughby, 1981). The earliest named species with a valid, diagnostic holotype is francisci Hay (1915). On the basis of taxonomic priority, stratigraphic age, and cranial and metatarsal comparisons (see main results and below), we conclude that francisci Hay (1915) is the most appropriate name for Haringtonhippus. We note that this middle Pleistocene species is also small, like our late Pleistocene specimens.

The nine examined names were:

conversidens Owen, 1869: a small species based upon a partial palate from Tepeyac Mountain, northeast of Mexico City, Mexico. The type fossil has no reliably diagnostic features other than small size, and no more diagnostic topotypal remains are available. For this reason, the validity of the name has previously been challenged by some authors (e.g., Winans, 1985; MacFadden, 1992). However, Scott (2004) argued for retaining the name because of its long history of use and utility in promoting taxonomic stability; that study explicitly considered the species to be a small, stout-limbed equid, following the conventions of numerous previous investigations. Following this interpretation, the name conversidens would not be available for NWSL equids assigned herein to Haringtonhippus. We note in this context that Barrón-Ortiz et al. (2017) obtained mtDNA from an equid tooth (EQ30) from San Josecito Cave, Mexico, whose fossil equid assemblage has been assigned by earlier authors (e.g., Azzaroli, 1992; Scott, 2004) to Equus conversidens. Although this fossil assemblage consists of non-NWSL equids, the mtDNA obtained from the tooth indicated placement within the NWSL equid clade (see also Appendix 2—table 3). This finding led Barrón-Ortiz et al. (2017) to infer some degree of plasticity in the metapodial proportions of the NWSL equids, and to select conversidens as their preferred species name for them. We do not follow this interpretation for two reasons: (1) the holotype of the species conversidens is nondiagnostic; and (2) selecting a stout-limbed equid species for NWSL equids is problematic.

tau Owen, 1869: a small species erected based upon an upper cheek tooth series lacking the P² from the Valley of Mexico. Other than small size, the species has no reliably diagnostic features. The holotype specimen has been lost and no topotypal material is available, so determining whether or not the species represents a NWSL equid is impossible. Eisenmann et al. (2008) proposed a neotype specimen for the species, consisting of a cranium (FC 673), but this is rejected here on technical grounds: (1) the proposed neotype fossil was listed as being part of a private collection, which negates its use as a neotype; (2) ICZN rules require that a neotype be ‘consistent with what is known of the former name-bearing type from the original description and from other sources’ and derive ‘as nearly as practicable from the original type locality … and, where relevant, from the same geological horizon or host species as the original name-bearing type’.

semiplicatus Cope, 1893: based upon an isolated upper molar tooth from Rock Creek, Texas. The specimen has been interpreted to be derived from the same species as the holotype metatarsal of ‘E.’ calobatus Troxell (see below) (Azzaroli, 1995; Quinn, 1957; Sandom et al., 2014).

littoralis Hay, 1913: based upon an upper cheek tooth from Peace Creek, Florida. The tooth is small, but offers no diagnostic features.

francisci Hay, 1915: Named in April of 1915 based upon a partial skeleton, including the skull, mandible, and a broken MTIII (TMM 34–2518). Confidently determined to be a NWSL equid based upon reconstruction of the right MTIII by Lundelius and Stevens, 1970.

calobatus Troxell, 1915: Named in June of 1915 based upon limb bones. No holotype designated, but lectotype erected by Hibbard, 1953 (YPM 13470, right MTIII).

altidens Quinn, 1957: based upon a partial skeleton from Blanco Creek, Texas that exhibits elongate metapodials. Synonymized with francisci Hay by Winans, 1985.

zoyatalis Mooser, 1958: based upon a partial mandible including the symphyseal region and the right dentary with p2-m3. Synonymized with francisci Hay by Winans, 1985.

quinni Slaughter et al. 1962: based upon a MTIII (SMP 60578) and other referred elements from Texas. Synonymized with francisci Hay by Lundelius and Stevens, 1970 and Winans, 1985.

Anatomical comparison of the francisci holotype and Gypsum Cave crania

We compared the holotype of francisci Hay (TMM 34–2518) from Texas to the Gypsum Cave cranium (LACM(CIT) 109/156450) from Nevada, the latter of which was assigned to Haringtonhippus using palaeogenomic data (Figure 2—figure supplement 1). Although there are minor anatomical differences between the two crania, which are outlined below, we consider these to fall within the range of intraspecific variation.

The skull from Gypsum Cave (GCS) can be distinguished from that of the francisci holotype (fHS) by its slightly larger size, and markedly longer and more slender rostrum, both absolutely and as a percentage of the skull length. The rostrum of the GCS is also absolutely narrower; the fHS, despite being the smaller skull, is transversely broader at the i/3. The palatine foramina are positioned medial to the middle of the M² in the GCS, whereas they are medial to the M²-M³ junction in the fHS. Viewed laterally, the orbits of the GCS have more pronounced supraorbital ridges than those of the fHS. The latter skull also exhibits somewhat stronger basicranial flexion than the GCS. Dentally, the GCS exhibits arcuate protocones, with strong anterior heels and marked lingual troughs in P³-M³; the fHS has smaller, triangular protocones with less pronounced anterior heels and no lingual trough or groove. These characters are not thought to result from different ontogenetic stages, since both specimens appear to be of young adults (all teeth in wear and tall in the jaw). Both the GCS and the fHS have relatively simple enamel patterns on the cheek teeth, with few evident plications. The observed differences between these two specimens are also unlikely to result from sexual dimorphism, since both skulls appear to be from females given the absence of canine teeth. The inference of the GCS being female is further supported by palaeogenomic data (Appendix 2—table 4).

Attempt to recover DNA from the francisci holotype

We attempted to retrieve endogenous mitochondrial and nuclear DNA from the holotype of francisci Hay (TMM 34–2518), to directly link this anatomically-derived species name with our palaeogenomically-derived genus name Haringtonhippus, but were unsuccessful.

After sequencing a library enriched for equid mitochondrial DNA (see Appendix 1), we could align only 11 reads to the horse reference mitochondrial genome sequence with BWA. Using the basic local alignment search tool (BLASTn), we found that these reads are a 100% match to human sequences and therefore likely originate from contamination. We repeated this approach using MIA and aligned 166 reads, which were concentrated in 20 regions of the mitochondrial genome. We identified the sequences from these regions as human (n = 18, 96–100% identity), cow (n = 1, 100%), or Aves (n = 1, 100%), consistent with the absence of endogenous mitochondrial DNA in this sample.

We further generated ~800,000 reads from the unenriched library for TMM 34–2518 and followed a modified metagenomic approach, outlined in Graham et al. (2016), to assess whether any endogenous DNA was present. We mapped the reads to the horse reference genome (EquCab2) using the BWA-aln settings of Graham et al. (2016); 538 reads aligned. We then compared these aligned reads to the BLASTn database. None of the reads uniquely hit Equidae or had a higher score to Equidae than to non-Equidae, whereas 492 of the reads either uniquely hit non-Equidae or had a higher score to non-Equidae than to Equidae. These results are consistent with either a complete lack, or an ultra-low occurrence, of endogenous DNA in TMM 34–2518.
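The read-level decision rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline used in the study: the function names, the `(family, score)` hit representation, and the toy reads are all hypothetical, and the "ambiguous" label for reads with tied scores or no hits is an assumption, since the text does not say how such reads were treated.

```python
from collections import Counter


def classify_read(hits, target="Equidae"):
    """Classify one read from its database hits.

    `hits` is a list of (family, score) tuples (hypothetical format).
    A read counts as "target" if it hits only the target family or if its
    best target score exceeds its best non-target score, and as
    "non_target" in the mirror-image case. Ties and reads without hits
    are labelled "ambiguous" (an assumption of this sketch).
    """
    target_scores = [score for family, score in hits if family == target]
    other_scores = [score for family, score in hits if family != target]

    if target_scores and (not other_scores or max(target_scores) > max(other_scores)):
        return "target"
    if other_scores and (not target_scores or max(other_scores) > max(target_scores)):
        return "non_target"
    return "ambiguous"


def summarise(reads):
    """Count reads per category; `reads` maps read name -> list of hits."""
    return Counter(classify_read(hits) for hits in reads.values())


# Invented toy data, for illustration only.
toy_reads = {
    "read1": [("Hominidae", 92.0)],                      # unique non-equid hit
    "read2": [("Equidae", 80.0), ("Bovidae", 85.5)],     # non-equid scores higher
    "read3": [("Equidae", 77.0), ("Hominidae", 77.0)],   # tied scores
    "read4": [],                                         # no hits
    "read5": [("Equidae", 90.0), ("Bovidae", 60.0)],     # equid scores higher
}
counts = summarise(toy_reads)
print(dict(counts))
```

Under this rule, a sample with no reads in the "target" category and many in the "non_target" category would be read the same way as TMM 34–2518: no detectable endogenous equid DNA.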

Morphometric analysis of third metatarsals

Stilt- and stout-legged equids can be distinguished with high accuracy (98.2%; logistic regression) on the basis of third metatarsal (MTIII) morphology (Figure 2c, Appendix 1—table 2—source data 1, and Appendix 2—table 4—source data 1), which makes it possible to easily and confidently assign candidate specimens to either group prior to more costly genetic testing. We note that future genetic analysis of ambiguous specimens that cross the ‘middle ground’ between stilt- and stout-legged regions of morphospace could open the possibility of a simple length-vs-width definition for these two morphotypes. Furthermore, we can highlight potential misidentifications, such as the two putative E. lambei specimens that fall within stilt-legged morphospace (Figure 2c), which could then be tested by genetic analysis. Intriguingly, an Old World E. ovodovi (stilt-legged; MT no. 6; Eisenmann and Sergej, 2011) and a New World E. cf. scotti (stout-legged; CMN 29867) specimen directly overlap in a stout-legged region of morphospace (Figure 2c), which could indicate either that this E. ovodovi specimen was misidentified or that this species straddles the delineation between stilt- and stout-legged morphologies.
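To illustrate how a logistic regression can separate the two morphotypes from MTIII measurements, the following is a minimal sketch in plain Python. It is not the analysis reported here: the measurements are invented, only two variables (greatest length and midshaft width, in mm) are used, and the fitting routine is a simple gradient-descent implementation chosen for self-containment. All names (`SAMPLES`, `fit`, `predict`) are hypothetical.

```python
import math

# Hypothetical MTIII measurements in mm: (greatest length, midshaft width, label).
# Label 1 = stilt-legged, 0 = stout-legged. Values are invented for illustration.
SAMPLES = [
    (285.0, 28.0, 1), (290.0, 29.0, 1), (275.0, 27.0, 1), (300.0, 30.0, 1), (280.0, 27.0, 1),
    (255.0, 36.0, 0), (260.0, 38.0, 0), (270.0, 37.0, 0), (250.0, 35.0, 0), (265.0, 39.0, 0),
]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit(rows, labels, lr=0.5, steps=2000):
    """Fit a logistic regression by batch gradient descent on z-scored features."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) for c, m in zip(cols, means)]
    X = [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

    w, b, n = [0.0] * len(means), 0.0, len(X)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(X, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return {"w": w, "b": b, "means": means, "sds": sds}


def predict(model, row):
    """Return 1 (stilt-legged) or 0 (stout-legged) for one (length, width) pair."""
    x = [(v - m) / s for v, m, s in zip(row, model["means"], model["sds"])]
    z = sum(wi * xi for wi, xi in zip(model["w"], x)) + model["b"]
    return 1 if sigmoid(z) >= 0.5 else 0


rows = [s[:2] for s in SAMPLES]
labels = [s[2] for s in SAMPLES]
model = fit(rows, labels)
accuracy = sum(predict(model, r) == y for r, y in zip(rows, labels)) / len(rows)
print(f"training accuracy: {accuracy:.3f}")
```

Because the toy data are cleanly separable, the training accuracy here says nothing about real specimens; with real measurements, specimens in the ‘middle ground’ of morphospace would be the ones most likely to be misclassified.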

H. francisci occupies a region of morphospace distinct from caballine/stout-legged Equus, but overlaps considerably with hemionine/stilt-legged Equus (Figure 2c). The holotype of H. francisci (TMM 34–2518) is particularly slender; it has a greater MTIII length than most other H. francisci but slightly smaller width/breadth measurements. This holotype is surpassed in these dimensions only by the quinni Slaughter et al. holotype, which has itself previously been synonymized with francisci Hay (Lundelius and Stevens, 1970; Winans, 1985). This suggests a potentially larger range of MTIII morphology for H. francisci than is exhibited by the presently assigned specimens. We observe that this diversity may be influenced by geography, with H. francisci specimens from high-latitude Beringia having shorter MTIIIs than those from the lower-latitude contiguous USA.

We note that two New World caballine Equus from Yukon, E. cf. scotti and E. lambei, appear to separate in morphospace (Figure 2c), primarily by MTIII length, supporting the potential delineation of these two taxa using MTIII morphology alone.

Data availability

The following data sets were generated

(2017) Nuclear DNA sequences from 17 Haringtonhippus francisci fossils
Publicly available at NCBI Short Read Archive (accession no: PRJNA384940).

https://www.ncbi.nlm.nih.gov/bioproject/PRJNA384940
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 303.371
Publicly available at NCBI GenBank (accession no: KT168317).

https://www.ncbi.nlm.nih.gov/nuccore/KT168317
(2017) Mitochondrial genome sequence from YG 133.16
Publicly available at NCBI GenBank (accession no: KT168318).

https://www.ncbi.nlm.nih.gov/nuccore/KT168318
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 29.169
Publicly available at NCBI GenBank (accession no: KT168319).

https://www.ncbi.nlm.nih.gov/nuccore/KT168319
(2017) Mitochondrial genome sequence from YG 401.387
Publicly available at NCBI GenBank (accession no: KT168320).

https://www.ncbi.nlm.nih.gov/nuccore/KT168320
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 404.663
Publicly available at NCBI GenBank (accession no: KT168321).

https://www.ncbi.nlm.nih.gov/nuccore/KT168321
(2017) Mitochondrial genome sequence from YG 328.54
Publicly available at NCBI GenBank (accession no: KT168322).

https://www.ncbi.nlm.nih.gov/nuccore/KT168322
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 378.5
Publicly available at NCBI GenBank (accession no: KT168323).

https://www.ncbi.nlm.nih.gov/nuccore/KT168323
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 404.478
Publicly available at NCBI GenBank (accession no: KT168324).

https://www.ncbi.nlm.nih.gov/nuccore/KT168324
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 402.235
Publicly available at NCBI GenBank (accession no: KT168325).

https://www.ncbi.nlm.nih.gov/nuccore/KT168325
(2017) Mitochondrial genome sequence from YG 130.55
Publicly available at NCBI GenBank (accession no: KT168326).

https://www.ncbi.nlm.nih.gov/nuccore/KT168326
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 198.1
Publicly available at NCBI GenBank (accession no: KT168327).

https://www.ncbi.nlm.nih.gov/nuccore/KT168327
(2017) Mitochondrial genome sequence from YG 303.1085
Publicly available at NCBI GenBank (accession no: KT168328).

https://www.ncbi.nlm.nih.gov/nuccore/KT168328
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 130.6
Publicly available at NCBI GenBank (accession no: KT168329).

https://www.ncbi.nlm.nih.gov/nuccore/KT168329
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 417.13
Publicly available at NCBI GenBank (accession no: KT168330).

https://www.ncbi.nlm.nih.gov/nuccore/KT168330
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 76.2
Publicly available at NCBI GenBank (accession no: KT168331).

https://www.ncbi.nlm.nih.gov/nuccore/KT168331
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 160.8
Publicly available at NCBI GenBank (accession no: KT168332).

https://www.ncbi.nlm.nih.gov/nuccore/KT168332
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 404.662
Publicly available at NCBI GenBank (accession no: KT168333).

https://www.ncbi.nlm.nih.gov/nuccore/KT168333
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 404.480
Publicly available at NCBI GenBank (accession no: KT168334).

https://www.ncbi.nlm.nih.gov/nuccore/KT168334
(2017) Mitochondrial genome sequence from YG 401.235
Publicly available at NCBI GenBank (accession no: KT168335).

https://www.ncbi.nlm.nih.gov/nuccore/KT168335
1. Heintzman PD
2. Shapiro B
(2017) Mitochondrial genome sequence from YG 404.205
Publicly available at NCBI GenBank (accession no: KT168336).

https://www.ncbi.nlm.nih.gov/nuccore/KT168336
(2017) Mitochondrial genome sequence from LACM(CIT) 109 / 150807
Publicly available at NCBI GenBank (accession no: MF134655).

https://www.ncbi.nlm.nih.gov/nuccore/MF134655
(2017) Mitochondrial genome sequence from LACM(CIT) 109 / 149291
Publicly available at NCBI GenBank (accession no: MF134656).

https://www.ncbi.nlm.nih.gov/nuccore/MF134656
(2017) Mitochondrial genome sequence from LACM(CIT) 109 / 156450
Publicly available at NCBI GenBank (accession no: MF134657).

https://www.ncbi.nlm.nih.gov/nuccore/MF134657
(2017) Mitochondrial genome sequence from KU 47800
Publicly available at NCBI GenBank (accession no: MF134658).

https://www.ncbi.nlm.nih.gov/nuccore/MF134658
(2017) Mitochondrial genome sequence from KU 62055
Publicly available at NCBI GenBank (accession no: MF134659).

https://www.ncbi.nlm.nih.gov/nuccore/MF134659
(2017) Mitochondrial genome sequence from KU 33418
Publicly available at NCBI GenBank (accession no: MF134660).

https://www.ncbi.nlm.nih.gov/nuccore/MF134660
(2017) Mitochondrial genome sequence from KU 53678
Publicly available at NCBI GenBank (accession no: MF134661).

https://www.ncbi.nlm.nih.gov/nuccore/MF134661
(2017) Mitochondrial genome sequence from KU 50817
Publicly available at NCBI GenBank (accession no: MF134662).

https://www.ncbi.nlm.nih.gov/nuccore/MF134662
(2017) Mitochondrial genome sequence from KU 62158
Publicly available at NCBI GenBank (accession no: MF134663).

https://www.ncbi.nlm.nih.gov/nuccore/MF134663
(2017) Data from: A new genus of horse from Pleistocene North America
Available at Dryad Digital Repository under a CC0 Public Domain Dedication.

http://dx.doi.org/10.5061/dryad.8153g

References

1. Allen WR
2. Short RV
(1997) Interspecific and extraspecific pregnancies in equids: anything goes
Journal of Heredity 88:384–392.

https://doi.org/10.1093/oxfordjournals.jhered.a023123
1. Arnason U
2. Adegoke JA
3. Gullberg A
4. Harley EH
5. Janke A
6. Kullberg M
(2008) Mitogenomic relationships of placental mammals and molecular estimates of their divergences
Gene 421:37–51.

https://doi.org/10.1016/j.gene.2008.05.024
1. Azzaroli A
2. Voorhies MR
(1993)
The Genus Equus in North America. The blancan species

Palaeontographia Italica 80:175–198.
1. Azzaroli A
(1992)
Ascent and decline of monodactyl equids: a case for prehistoric overkill

Annales Zoologici Fennici 28:151–163.
1. Azzaroli A
(1995)
A synopsis of the Quaternary species of Equus in North America

Bollettino della Società Paleontologica Italiana 34:205–221.
(2017) Cheek tooth morphology and ancient mitochondrial DNA of late Pleistocene horses from the western interior of North America: Implications for the taxonomy of North American Late Pleistocene Equus
PLoS One 12:e0183045.

https://doi.org/10.1371/journal.pone.0183045
(2010) Bone preparation at the KCCAMS laboratory
Nuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms 268:906–909.

https://doi.org/10.1016/j.nimb.2009.10.061
1. Bennett DK
(1980) Stripes Do Not a Zebra Make, Part I: A Cladistic Analysis of Equus
Systematic Zoology 29:272–287.

https://doi.org/10.2307/2412662
1. Bennett EA
2. Champlot S
3. Peters J
4. Arbuckle BS
5. Guimaraes S
6. Pruvost M
7. Bar-David S
8. Davis SJM
9. Gautier M
10. Kaczensky P
11. Kuehn R
12. Mashkour M
13. Morales-Muñiz A
14. Pucher E
15. Tournepiche JF
16. Uerpmann HP
17. Bălăşescu A
18. Germonpré M
19. Gündem CY
20. Hemami MR
21. Moullé PE
22. Öztan A
23. Uerpmann M
24. Walzer C
25. Grange T
26. Geigl EM
(2017) Taming the late Quaternary phylogeography of the Eurasiatic wild ass through ancient and modern DNA
PLoS One 12:e0174216.

https://doi.org/10.1371/journal.pone.0174216
(2011) Performance, accuracy, and Web server for evolutionary placement of short sequence reads under maximum likelihood
Systematic Biology 60:291–302.

https://doi.org/10.1093/sysbio/syr010
1. Briggs AW
2. Good JM
3. Green RE
4. Krause J
5. Maricic T
6. Stenzel U
7. Lalueza-Fox C
8. Rudan P
9. Brajkovic D
10. Kucan Z
11. Gusic I
12. Schmitz R
13. Doronichev VB
14. Golovanova LV
15. de la Rasilla M
16. Fortea J
17. Rosas A
18. Pääbo S
(2009) Targeted retrieval and analysis of five neandertal mtDNA genomes
Science 325:318–321.

https://doi.org/10.1126/science.1174462
(2003) The systematic position of Equus hydruntinus, an extinct species of Pleistocene equid
Quaternary Research 59:459–469.

https://doi.org/10.1016/S0033-5894(03)00059-0
1. Clark PU
2. Dyke AS
3. Shakun JD
4. Carlson AE
5. Clark J
6. Wohlfarth B
7. Mitrovica JX
8. Hostetler SW
9. McCabe AM
(2009) The last glacial maximum
Science 325:710–714.

https://doi.org/10.1126/science.1172873
1. Dabney J
2. Knapp M
3. Glocke I
4. Gansauge MT
5. Weihmann A
6. Nickel B
7. Valdiosera C
8. García N
9. Pääbo S
10. Arsuaga JL
11. Meyer M
(2013a) Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments
PNAS 110:15758–15763.

https://doi.org/10.1073/pnas.1314445110
(2013b) Ancient DNA damage
Cold Spring Harbor Perspectives in Biology 5:a012567.

https://doi.org/10.1101/cshperspect.a012567
(2012) jModelTest 2: more models, new heuristics and parallel computing
Nature Methods 9:772.

https://doi.org/10.1038/nmeth.2109
1. Der Sarkissian C
2. Vilstrup JT
3. Schubert M
4. Seguin-Orlando A
5. Eme D
6. Weinstock J
7. Alberdi MT
8. Martin F
9. Lopez PM
10. Prado JL
11. Prieto A
12. Douady CJ
13. Stafford TW
14. Willerslev E
15. Orlando L
(2015) Mitochondrial genomes reveal the extinct Hippidion as an outgroup to all living equids
Biology Letters 11:20141058.

https://doi.org/10.1098/rsbl.2014.1058
(2012) Bayesian phylogenetics with BEAUti and the BEAST 1.7
Molecular Biology and Evolution 29:1969–1973.

https://doi.org/10.1093/molbev/mss075
1. Edgar RC
(2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput
Nucleic Acids Research 32:1792–1797.

https://doi.org/10.1093/nar/gkh340
Book
(1988)
Methodology

In: Woodburne M, Sondaar P, editors. Studying Fossil Horses, 1. Leiden: E. J. Brill. pp. 1–71.
(2008) Old World hemiones and New World slender species (Mammalia, Equidae)
Palaeovertebrata 36:159–233.

https://doi.org/10.18563/pv.36.1-4.159-233
1. Eisenmann V
2. Sergej V
(2011) Unexpected finding of a new Equus species (Mammalia, Perissodactyla) belonging to a supposedly extinct subgenus in late Pleistocene deposits of Khakassia (Southwestern Siberia)
Geodiversitas 33:519–530.

https://doi.org/10.5252/g2011n3a5
1. Eisenmann V
(1985)
Indications paléoécologiques fournies par les Equus (Mammalia, Perissodactyla) Plio-Pléistocènes d’Afrique

In: L’Environnement des Hominidés au Plio-Pléistocène. pp. 57–79.
1. Eisenmann V
(1992)
Origins, dispersals, and migrations of Equus (Mammalia, Perissodactyla)

CFS Courier Forschungsinstitut Senckenberg 153:161–170.
1. Eisenmann V
(2003)
Advances in Vertebrate Paleontology: Hen to Panta; a Tribute to Constantin Rădulescu and Petre Mihai Samson

Bucharest. pp. 31–40.
1. Enk J
2. Devault A
3. Widga C
4. Saunders J
5. Szpak P
6. Southon J
7. Rouillard J-M
8. Shapiro B
9. Golding GB
10. Zazula G
11. Froese D
12. Fisher DC
13. MacPhee RDE
14. Poinar H
(2016) Mammuthus population dynamics in late Pleistocene North America: divergence, phylogeography, and introgression
Frontiers in Ecology and Evolution 4:42.

https://doi.org/10.3389/fevo.2016.00042
1. Forsten A
(1988) Middle Pleistocene replacement of stenonid horses by caballoid horses — ecological implications
Palaeogeography, Palaeoclimatology, Palaeoecology 65:23–33.

https://doi.org/10.1016/0031-0182(88)90109-5
1. Forsten A
(1992)
Mitochondrial-DNA time-table and the evolution of Equus: comparison of molecular and palaeontological evidence

Annales Zoologici Fennici 28:301–309.
1. Forsten A
(1996)
Climate and the evolution of Equus (Perissodactyla, Equidae) in the Plio-Pleistocene of Eurasia

Acta Zoologica Cracoviensia 39:161–166.
1. Froese D
2. Stiller M
3. Heintzman PD
4. Reyes AV
5. Zazula GD
6. Soares AE
7. Meyer M
8. Hall E
9. Jensen BJ
10. Arnold LJ
11. MacPhee RD
12. Shapiro B
(2017) Fossil and genomic evidence constrains the timing of bison arrival in North America
PNAS 114:3457–3462.

https://doi.org/10.1073/pnas.1620754114
1. Froese DG
2. Zazula GD
3. Westgate JA
4. Preece SJ
5. Sanborn PT
6. Reyes AV
7. Pearce NJG
(2009) The Klondike goldfields and Pleistocene environments of Beringia
GSA Today 19:4.

https://doi.org/10.1130/GSATG54A.1
(2008) The late Pleistocene dispersal of modern humans in the Americas
Science 319:1497–1502.

https://doi.org/10.1126/science.1153569
1. Graham RW
2. Belmecheri S
3. Choy K
4. Culleton BJ
5. Davies LJ
6. Froese D
7. Heintzman PD
8. Hritz C
9. Kapp JD
10. Newsom LA
11. Rawcliffe R
12. Saulnier-Talbot É
13. Shapiro B
14. Wang Y
15. Williams JW
16. Wooller MJ
(2016) Timing and causes of mid-Holocene mammoth extinction on St. Paul Island, Alaska
PNAS 113:9310–9314.

https://doi.org/10.1073/pnas.1604903113
Software
1. Green RE
2. Vohr SH
3. Rice ES
(2015) tri-aln-report, version 4404df2
Github.

https://github.com/Paleogenomics/Chrom-Compare
1. Groves CP
2. Willoughby DP
(1981) Studies on the taxonomy and phylogeny of the genus Equus. 1. Subgeneric classification of the recent species
Mammalia 45:321–354.

https://doi.org/10.1515/mamm.1981.45.3.321
1. Guthrie RD
(2003) Rapid body size decline in Alaskan Pleistocene horses before extinction
Nature 426:169–171.

https://doi.org/10.1038/nature02098
1. Guthrie RD
(2006) New carbon dates link climatic change with human colonization and Pleistocene extinctions
Nature 441:207–209.

https://doi.org/10.1038/nature04604
1. Harington CR
2. Clulow FV
(1973) Pleistocene mammals from Gold Run Creek, Yukon Territory
Canadian Journal of Earth Sciences 10:697–759.

https://doi.org/10.1139/e73-069
Thesis
1. Harington CR
(1977)
PhD Thesis: Pleistocene mammals of the Yukon Territory

University of Alberta, Edmonton.
1. Harington CR
(2011) Pleistocene vertebrates of the Yukon Territory
Quaternary Science Reviews 30:2341–2354.

https://doi.org/10.1016/j.quascirev.2011.05.020
1. Hay OP
(1915) Contributions to the knowledge of the mammals of the Pleistocene of North America
Proceedings of the United States National Museum 48:515–575.

https://doi.org/10.5479/si.00963801.48-2086.515
(2015) Genomic data from extinct North American Camelops revise camel evolutionary history
Molecular Biology and Evolution 32:2433–2440.

https://doi.org/10.1093/molbev/msv128
1. Heintzman PD
2. Zazula GD
3. MacPhee RDE
4. Scott E
5. Cahill JA
6. McHorse BK
7. Kapp JD
8. Stiller M
9. Wooller MJ
10. Orlando L
11. Southon JR
12. Froese DG
13. Shapiro B
(2017)
Data from: A new genus of horse from Pleistocene North America

Dryad Digital Repository.
1. Hibbard CW
(1953) Equus (Asinus) calobatus Troxell and associated vertebrates from the Pleistocene of Kansas
Transactions of the Kansas Academy of Science 56:111–126.

https://doi.org/10.2307/3626201
(2013) mapDamage2.0: fast approximate bayesian estimates of ancient DNA damage parameters
Bioinformatics 29:1682–1684.

https://doi.org/10.1093/bioinformatics/btt193
(2014) Speciation with gene flow in equids despite extensive chromosomal plasticity
PNAS 111:18655–18660.

https://doi.org/10.1073/pnas.1412627111
1. Kearse M
2. Moir R
3. Wilson A
4. Stones-Havas S
5. Cheung M
6. Sturrock S
7. Buxton S
8. Cooper A
9. Markowitz S
10. Duran C
11. Thierer T
12. Ashton B
13. Meintjes P
14. Drummond A
(2012) Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data
Bioinformatics 28:1647–1649.

https://doi.org/10.1093/bioinformatics/bts199
1. Kim KS
2. Lee SE
3. Jeong HW
4. Ha JH
(1998) The complete nucleotide sequence of the domestic dog (Canis familiaris) mitochondrial genome
Molecular Phylogenetics and Evolution 10:210–220.

https://doi.org/10.1006/mpev.1998.0513
(2012) Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform
Nucleic Acids Research 40:e3.

https://doi.org/10.1093/nar/gkr771
1. Koch PL
2. Barnosky AD
(2006) Late quaternary extinctions: state of the debate
Annual Review of Ecology, Evolution, and Systematics 37:215–250.

https://doi.org/10.1146/annurev.ecolsys.34.011802.132415
1. Langmead B
2. Salzberg SL
(2012) Fast gapped-read alignment with Bowtie 2
Nature Methods 9:357–359.

https://doi.org/10.1038/nmeth.1923
1. Li H
2. Durbin R
(2010) Fast and accurate long-read alignment with Burrows-Wheeler transform
Bioinformatics 26:589–595.

https://doi.org/10.1093/bioinformatics/btp698
(2009) The sequence alignment/Map format and SAMtools
Bioinformatics 25:2078–2079.

https://doi.org/10.1093/bioinformatics/btp352
Book
1. Linnaeus C
(1758)
Systema Naturae Per Regna Tria Naturae, Secundum Classes, Ordines, Genera, Species, Cum Characteribus, Differentiis, Synonymis, Locis

Salvius: Holmiæ.
(2011) Whole mitochondrial genome sequencing of domestic horses reveals incorporation of extensive wild horse diversity during domestication
BMC Evolutionary Biology 11:328.

https://doi.org/10.1186/1471-2148-11-328
1. Lundelius EL
2. Stevens MS
(1970)
Equus francisci Hay, a small stilt-legged horse, middle Pleistocene of Texas

Journal of Paleontology 44:148–153.
1. Luo Y
2. Chen Y
3. Liu F
4. Jiang C
5. Gao Y
(2011) Mitochondrial genome sequence of the Tibetan wild ass (Equus kiang)
Mitochondrial DNA 22:6–8.

https://doi.org/10.3109/19401736.2011.588221
(1992)
The Species, Genera, and Tribes of the Living and Extinct Horses of the World 1758-1966

Book
1. MacFadden BJ
(1992)
Fossil Horses: Systematics, Paleobiology, and Evolution of the Family Equidae

Cambridge University Press.
1. MacFadden BJ
(1998)
Equidae

In: Evolution of Tertiary Mammals of North America. pp. 537–559.
1. Martin FM
2. Borrero LA
(2017) Climate change, availability of territory, and Late Pleistocene human exploration of Ultima Esperanza, South Chile
Quaternary International 428:86–95.

https://doi.org/10.1016/j.quaint.2015.06.023
1. Martin FM
2. Todisco D
3. Rodet J
4. San Román M
5. Morello F
6. Prevosti F
7. Stern C
8. Borrero LA
(2015) Nuevas excavaciones en cueva del medio: procesos de formación de la cueva y avances en los estudios de interacción entre cazadores-recolectores y fauna extinta (Pleistoceno final, Patagonia Meridional)
Magallania 43:165–189.

https://doi.org/10.4067/S0718-22442015000100010
1. Meyer M
2. Kircher M
(2010) Illumina sequencing library preparation for highly multiplexed target capture and sequencing
Cold Spring Harbor Protocols 2010:pdb.prot5448.

https://doi.org/10.1101/pdb.prot5448
(2016) The complete mitochondrial genome of the Asian tapirs (Tapirus indicus): the only extant Tapiridae species in the Old World
Mitochondrial DNA 27:413–415.

https://doi.org/10.3109/19401736.2014.898283
1. O'Dea A
2. Lessios HA
3. Coates AG
4. Eytan RI
5. Restrepo-Moreno SA
6. Cione AL
7. Collins LS
8. de Queiroz A
9. Farris DW
10. Norris RD
11. Stallard RF
12. Woodburne MO
13. Aguilera O
14. Aubry MP
15. Berggren WA
16. Budd AF
17. Cozzuol MA
18. Coppard SE
19. Duque-Caro H
20. Finnegan S
21. Gasparini GM
22. Grossman EL
23. Johnson KG
24. Keigwin LD
25. Knowlton N
26. Leigh EG
27. Leonard-Pingel JS
28. Marko PB
29. Pyenson ND
30. Rachello-Dolmen PG
31. Soibelzon E
32. Soibelzon L
33. Todd JA
34. Vermeij GJ
35. Jackson JB
(2016) Formation of the Isthmus of Panama
Science Advances 2:e1600883.

https://doi.org/10.1126/sciadv.1600883
1. Orlando L
2. Ginolhac A
3. Zhang G
4. Froese D
5. Albrechtsen A
6. Stiller M
7. Schubert M
8. Cappellini E
9. Petersen B
10. Moltke I
11. Johnson PL
12. Fumagalli M
13. Vilstrup JT
14. Raghavan M
15. Korneliussen T
16. Malaspinas AS
17. Vogt J
18. Szklarczyk D
19. Kelstrup CD
20. Vinther J
21. Dolocan A
22. Stenderup J
23. Velazquez AM
24. Cahill J
25. Rasmussen M
26. Wang X
27. Min J
28. Zazula GD
29. Seguin-Orlando A
30. Mortensen C
31. Magnussen K
32. Thompson JF
33. Weinstock J
34. Gregersen K
35. Røed KH
36. Eisenmann V
37. Rubin CJ
38. Miller DC
39. Antczak DF
40. Bertelsen MF
41. Brunak S
42. Al-Rasheid KA
43. Ryder O
44. Andersson L
45. Mundy J
46. Krogh A
47. Gilbert MT
48. Kjær K
49. Sicheritz-Ponten T
50. Jensen LJ
51. Olsen JV
52. Hofreiter M
53. Nielsen R
54. Shapiro B
55. Wang J
56. Willerslev E
(2013) Recalibrating Equus evolution using the genome sequence of an early middle Pleistocene horse
Nature 499:74–78.

https://doi.org/10.1038/nature12323
1. Orlando L
2. Male D
3. Alberdi MT
4. Prado JL
5. Prieto A
6. Cooper A
7. Hänni C
(2008) Ancient DNA clarifies the evolutionary history of American late Pleistocene equids
Journal of Molecular Evolution 66:533–538.

https://doi.org/10.1007/s00239-008-9100-x
1. Orlando L
2. Mashkour M
3. Burke A
4. Douady CJ
5. Eisenmann V
6. Hänni C
(2006) Geographic distribution of an extinct equid (Equus hydruntinus: Mammalia, Equidae) revealed by morphological and genetical analyses of fossils
Molecular Ecology 15:2083–2093.

https://doi.org/10.1111/j.1365-294X.2006.02922.x
1. Orlando L
2. Metcalf JL
3. Alberdi MT
4. Telles-Antunes M
5. Bonjean D
6. Otte M
7. Martin F
8. Eisenmann V
9. Mashkour M
10. Morello F
11. Prado JL
12. Salas-Gismondi R
13. Shockey BJ
14. Wrinn PJ
15. Vasil'ev SK
16. Ovodov ND
17. Cherry MI
18. Hopwood B
19. Male D
20. Austin JJ
21. Hänni C
22. Cooper A
(2009) Revising the recent evolutionary history of equids using ancient DNA
PNAS 106:21754–21759.

https://doi.org/10.1073/pnas.0903672106
1. Quinn JH
(1957)
Pleistocene Equidae of Texas

Bureau of Economic Geology, 33, University of Texas. pp. 1–51.

https://doi.org/10.23867/ri0033d
Software
1. R Development Core Team
(2008) R: A language and environment for statistical computing
R Foundation for Statistical Computing, Vienna, Austria.

http://www.R-project.org
1. Reimer PJ
2. Bard E
3. Bayliss A
4. Beck JW
5. Blackwell PG
6. Ramsey CB
7. Buck CE
8. Cheng H
9. Edwards RL
10. Friedrich M
11. Grootes PM
12. Guilderson TP
13. Haflidason H
14. Hajdas I
15. Hatté C
16. Heaton TJ
17. Hoffmann DL
18. Hogg AG
19. Hughen KA
20. Kaiser KF
21. Kromer B
22. Manning SW
23. Niu M
24. Reimer RW
25. Richards DA
26. Scott EM
27. Southon JR
28. Staff RA
29. Turney CSM
30. van der Plicht J
(2013) IntCal13 and Marine13 radiocarbon age calibration curves 0–50,000 years cal BP
Radiocarbon 55:1869–1887.

https://doi.org/10.2458/azu_js_rc.55.16947
(2010) A rapid column-based ancient DNA extraction method for increased sample throughput
Molecular Ecology Resources 10:677–683.

https://doi.org/10.1111/j.1755-0998.2009.02824.x
1. Ronquist F
2. Teslenko M
3. van der Mark P
4. Ayres DL
5. Darling A
6. Höhna S
7. Larget B
8. Liu L
9. Suchard MA
10. Huelsenbeck JP
(2012) MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space
Systematic Biology 61:539–542.

https://doi.org/10.1093/sysbio/sys029
(2013) Mid-Pliocene warm-period deposits in the High Arctic yield insight into camel evolution
Nature Communications 4:1550.

https://doi.org/10.1038/ncomms2516
(1978) Chromosome banding studies of the Equidae
Cytogenetic and Genome Research 20:323–350.

https://doi.org/10.1159/000130862
(2014) Global late Quaternary megafauna extinctions linked to humans, not climate change
Proceedings of the Royal Society B: Biological Sciences 281:20133254.

https://doi.org/10.1098/rspb.2013.3254
1. Schmieder R
2. Edwards R
(2011) Quality control and preprocessing of metagenomic datasets
Bioinformatics 27:863–864.

https://doi.org/10.1093/bioinformatics/btr026
(2012) Improving ancient DNA read mapping against modern reference genomes
BMC Genomics 13:178.

https://doi.org/10.1186/1471-2164-13-178
1. Scott E
(2004)
Pliocene and Pleistocene horses from Porcupine Cave

In: Biodiversity Response to Climate Change in the Middle Pleistocene. pp. 264–279.
1. Seguin-Orlando A
2. Gamba C
3. Der Sarkissian C
4. Ermini L
5. Louvel G
6. Boulygina E
7. Sokolov A
8. Nedoluzhko A
9. Lorenzen ED
10. Lopez P
11. McDonald HG
12. Scott E
13. Tikhonov A
14. Stafford TW
15. Alfarhan AH
16. Alquraishi SA
17. Al-Rasheid KA
18. Shapiro B
19. Willerslev E
20. Prokhortchouk E
21. Orlando L
(2015) Pros and cons of methylation-based enrichment methods for ancient DNA
Scientific Reports 5:11826.

https://doi.org/10.1038/srep11826
1. Shockey BJ
2. Salas-Gismondi R
3. Baby P
4. Guyot J-L
5. Baltazar MC
6. Huaman L
7. Flynn JJ
(2009)
New Pleistocene cave faunas of the Andes of central Peru: radiocarbon ages and the survival of low latitude, Pleistocene DNA

Palaeontologia Electronica 12:15A.
Website
1. Shorthouse DP
(2010) SimpleMappr
An Online Tool to Produce Publication-Quality Point Maps.

http://www.simplemappr.net
1. Skinner MF
2. Hibbard CW
(1972)
Early Pleistocene pre-glacial and glacial rocks and faunas of North-Central Nebraska

Bulletin of the American Museum of Natural History 148:1–148.
Software
1. St. John J
(2013) SeqPrep, version b5606dd
Github.

https://github.com/jstjohn/SeqPrep
1. Stamatakis A
(2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies
Bioinformatics 30:1312–1313.

https://doi.org/10.1093/bioinformatics/btu033
1. Steiner CC
2. Ryder OA
(2013) Characterization of Prdm9 in equids and sterility in mules
PLoS One 8:e61746.

https://doi.org/10.1371/journal.pone.0061746
1. Vilstrup JT
2. Seguin-Orlando A
3. Stiller M
4. Ginolhac A
5. Raghavan M
6. Nielsen SC
7. Weinstock J
8. Froese D
9. Vasiliev SK
10. Ovodov ND
11. Clary J
12. Helgen KM
13. Fleischer RC
14. Cooper A
15. Shapiro B
16. Orlando L
(2013) Mitochondrial phylogenomics of modern and ancient equids
PLoS One 8:e55950.

https://doi.org/10.1371/journal.pone.0055950
1. Weinstock J
2. Willerslev E
3. Sher A
4. Tong W
5. Ho SY
6. Rubenstein D
7. Storer J
8. Burns J
9. Martin L
10. Bravi C
11. Prieto A
12. Froese D
13. Scott E
14. Xulong L
15. Cooper A
(2005) Evolution, systematics, and phylogeography of Pleistocene horses in the New World: a molecular perspective
PLoS Biology 3:e241.

https://doi.org/10.1371/journal.pbio.0030241
1. Welker F
2. Collins MJ
3. Thomas JA
4. Wadsley M
5. Brace S
6. Cappellini E
7. Turvey ST
8. Reguero M
9. Gelfo JN
10. Kramarz A
11. Burger J
12. Thomas-Oates J
13. Ashford DA
14. Ashton PD
15. Rowsell K
16. Porter DM
17. Kessler B
18. Fischer R
19. Baessmann C
20. Kaspar S
21. Olsen JV
22. Kiley P
23. Elliott JA
24. Kelstrup CD
25. Mullin V
26. Hofreiter M
27. Willerslev E
28. Hublin JJ
29. Orlando L
30. Barnes I
31. MacPhee RD
(2015) Ancient proteins resolve the evolutionary history of Darwin's South American ungulates
Nature 522:81–84.

https://doi.org/10.1038/nature14249
1. Willerslev E
2. Gilbert MT
3. Binladen J
4. Ho SY
5. Campos PF
6. Ratan A
7. Tomsho LP
8. da Fonseca RR
9. Sher A
10. Kuznetsova TV
11. Nowak-Kemp M
12. Roth TL
13. Miller W
14. Schuster SC
(2009) Analysis of complete mitochondrial genomes from extinct and extant rhinoceroses reveals lack of phylogenetic resolution
BMC Evolutionary Biology 9:95.

https://doi.org/10.1186/1471-2148-9-95
Book
1. Winans MC
(1985)
Revision of North American Fossil Species of the Genus Equus (Mammalia: Perissodactyla: Equidae). Dissertation

Austin: Univ. of Texas.
1. Xu X
2. Arnason U
(1994)
The complete mitochondrial DNA sequence of the horse, Equus caballus: extensive heteroplasmy of the control region

Gene 148:357–362.
1. Xu X
2. Arnason U
(1997) The complete mitochondrial DNA sequence of the white rhinoceros, Ceratotherium simum, and comparison with the mtDNA sequence of the Indian rhinoceros, Rhinoceros unicornis
Molecular Phylogenetics and Evolution 7:189–194.

https://doi.org/10.1006/mpev.1996.0385
(1996b) The complete mitochondrial DNA (mtDNA) of the donkey and mtDNA comparisons among four closely related mammalian species-pairs
Journal of Molecular Evolution 43:438–446.

https://doi.org/10.1007/BF02337515
1. Xu X
2. Janke A
3. Arnason U
(1996a) The complete mitochondrial DNA sequence of the greater Indian rhinoceros, Rhinoceros unicornis, and the phylogenetic relationship among Carnivora, Perissodactyla, and Artiodactyla (+ Cetacea)
Molecular Biology and Evolution 13:1167–1173.

https://doi.org/10.1093/oxfordjournals.molbev.a025681
1. Zazula GD
2. MacPhee RD
3. Metcalfe JZ
4. Reyes AV
5. Brock F
6. Druckenmiller PS
7. Groves P
8. Harington CR
9. Hodgins GW
10. Kunz ML
11. Longstaffe FJ
12. Mann DH
13. McDonald HG
14. Nalawade-Chavan S
15. Southon JR
(2014) American mastodon extirpation in the arctic and subarctic predates human colonization and terminal Pleistocene climate change
PNAS 111:18460–18465.

https://doi.org/10.1073/pnas.1416072111
(2017) A case of early Wisconsinan “over-chill”: New radiocarbon evidence for early extirpation of western camel (Camelops hesternus) in eastern Beringia
Quaternary Science Reviews 171:48–57.

https://doi.org/10.1016/j.quascirev.2017.06.031

Article and author information

Author details

Peter D Heintzman
1. Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, United States
2. Tromsø University Museum, UiT - The Arctic University of Norway, Tromsø, Norway
Contribution
Conceptualization, Data curation, Software, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing

For correspondence
peteheintzman@gmail.com

Competing interests
No competing interests declared

ORCID iD: 0000-0002-6449-0219
Grant D Zazula

Yukon Palaeontology Program, Government of Yukon, Whitehorse, Canada

Contribution
Conceptualization, Resources, Investigation, Writing—original draft, Writing—review and editing

Competing interests
No competing interests declared
Ross DE MacPhee

Department of Mammalogy, Division of Vertebrate Zoology, American Museum of Natural History, New York, United States

Contribution
Resources, Validation, Investigation, Writing—original draft, Writing—review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0003-0688-0232
Eric Scott
1. Cogstone Resource Management, Incorporated, Riverside, United States
2. California State University San Bernardino, San Bernardino, United States
Contribution
Resources, Validation, Investigation, Writing—original draft, Writing—review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-2730-0893
James A Cahill

Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, United States

Contribution
Software, Formal analysis, Visualization, Writing—original draft

Competing interests
No competing interests declared

ORCID iD: 0000-0002-7145-0215
Brianna K McHorse

Department of Organismal and Evolutionary Biology, Harvard University, Cambridge, United States

Contribution
Data curation, Software, Formal analysis, Visualization, Methodology, Writing—original draft

Competing interests
No competing interests declared
Joshua D Kapp

Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, United States

Contribution
Validation, Investigation

Competing interests
No competing interests declared
Mathias Stiller
1. Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, United States
2. Department of Translational Skin Cancer Research, German Consortium for Translational Cancer Research, Essen, Germany
Contribution
Resources, Investigation, Methodology

Competing interests
No competing interests declared
Matthew J Wooller
1. College of Fisheries and Ocean Sciences, University of Alaska Fairbanks, Fairbanks, United States
2. Alaska Stable Isotope Facility, Water and Environmental Research Center, University of Alaska Fairbanks, Fairbanks, United States
Contribution
Resources, Funding acquisition, Writing—review and editing

Competing interests
No competing interests declared
Ludovic Orlando
1. Centre for GeoGenetics, Natural History Museum of Denmark, København K, Denmark
2. Université Paul Sabatier, Université de Toulouse, Toulouse, France
Contribution
Resources, Writing—review and editing

Competing interests
No competing interests declared
John Southon

Keck-CCAMS Group, Earth System Science Department, University of California, Irvine, Irvine, United States

Contribution
Resources, Investigation

Competing interests
No competing interests declared
Duane G Froese

Department of Earth and Atmospheric Sciences, University of Alberta, Edmonton, Canada

Contribution
Resources, Funding acquisition, Writing—review and editing

Competing interests
No competing interests declared
Beth Shapiro
1. Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, United States
2. UCSC Genomics Institute, University of California, Santa Cruz, Santa Cruz, United States
Contribution
Conceptualization, Resources, Supervision, Funding acquisition, Writing—original draft, Project administration, Writing—review and editing

For correspondence
bashapir@ucsc.edu

Competing interests
No competing interests declared

ORCID iD: 0000-0002-2733-7776

Funding

National Science Foundation (PLR-1417036)

Peter D Heintzman
James A Cahill
Joshua D Kapp
Mathias Stiller
Beth Shapiro

National Science Foundation (PLR-09090456)

Peter D Heintzman
James A Cahill
Joshua D Kapp
Mathias Stiller
Beth Shapiro

Danish Council for Independent Research Natural Sciences (4002-00152B)

Ludovic Orlando

Danmarks Grundforskningsfond

Ludovic Orlando

European Research Council (ERC-CoG-2015-681605)

Ludovic Orlando

Gordon and Betty Moore Foundation (GBMF3804)

Peter D Heintzman
James A Cahill
Joshua D Kapp
Mathias Stiller
Beth Shapiro

Norges Forskningsråd (250963)

Peter D Heintzman

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank the Klondike placer gold mining community of Yukon for their support and for providing access to their mines from which many of our Haringtonhippus fossils were recovered. We thank Matt Brown and Chris Sagebiel of the Texas Vertebrate Paleontology Collections at the University of Texas, Austin for access to a portion of TMM 34–2518, and also thank Sam McLeod, Vanessa Rhue, and Aimee Montenegro at the Los Angeles County Museum for access to the Gypsum Cave material for consumptive sampling. Thanks to Brent Breithaupt (Bureau of Land Management) for permitting the sampling of fossils from Natural Trap Cave that were originally recovered by Larry Martin, Miles Gilbert, and colleagues, and are presently curated by the University of Kansas Biodiversity Institute. We thank Chris Beard and David Burnham (University of Kansas) for facilitating access to these fossils. Thanks to Tom Guilderson, Andrew Fields, Dan Chang, and Samuel Vohr for technical assistance. Thanks to Greger Larson for providing the base map in Figure 1. We thank the reviewers whose comments improved this manuscript. This work used the Vincent J Coates Genomics Sequencing Laboratory at UC Berkeley, supported by NIH S10 Instrumentation Grants S10RR029668 and S10RR027303. PDH, JAC, JDK, MS, and BS were supported by NSF grants PLR-1417036 and PLR-09090456, and Gordon and Betty Moore Foundation Grant GBMF3804. PDH received support from Norway’s Research Council (Grant 250963: ‘ECOGEN’). LO was supported by the Danish Council for Independent Research Natural Sciences (Grant 4002-00152B); the Danish National Research Foundation (Grant); the ‘Chaires d'Attractivité 2014’ IDEX, University of Toulouse, France (OURASI); and the European Research Council (ERC-CoG-2015-681605).

Ethics

We received permission from three entities to destructively sample palaeontological specimens: the Texas Vertebrate Paleontology Collections at The University of Texas (granted to PDH and ES), the Los Angeles County Museum (granted to ES), and the US Department of the Interior Bureau of Land Management, Wyoming (granted to RDEM and BS; reference number: 8270(930)).

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.