A novel lineage of the Capra genus discovered in the Taurus Mountains of Turkey using ancient genomics

  1. Kevin G Daly  Is a corresponding author
  2. Benjamin S Arbuckle
  3. Conor Rossi
  4. Valeria Mattiangeli
  5. Phoebe A Lawlor
  6. Marjan Mashkour
  7. Eberhard Sauer
  8. Joséphine Lesur
  9. Levent Atici
  10. Cevdet Merih Erek
  11. Daniel G Bradley  Is a corresponding author
  1. Smurfit Institute of Genetics, Trinity College Dublin, Ireland
  2. Department of Anthropology, University of North Carolina at Chapel Hill, United States
  3. Centre National de Recherche Scientifique / Muséum national d'Histoire naturelle, Archéozoologie, Archéobotanique, France
  4. University of Tehran, Bioarchaeology Laboratory, (Central Laboratory), Archaeozoology section, Islamic Republic of Iran
  5. School of History, Classics and Archaeology, University of Edinburgh, United Kingdom
  6. Department of Anthropology, University of Nevada, Las Vegas, United States
  7. Department of Archeology, Department of Prehistoric Archeology, Faculty of Letters, Ankara Hacı Bayram Veli University, Turkey

Abstract

Direkli Cave, located in the Taurus Mountains of southern Turkey, was occupied by Late Epipaleolithic hunters-gatherers for the seasonal hunting and processing of game including large numbers of wild goats. We report genomic data from new and published Capra specimens from Direkli Cave and, supplemented with historic genomes from multiple Capra species, find a novel lineage best represented by a ~14,000 year old 2.59 X genome sequenced from specimen Direkli4. This newly discovered Capra lineage is a sister clade to the Caucasian tur species (Capra cylindricornis and Capra caucasica), both now limited to the Caucasus region. We identify genomic regions introgressed in domestic goats with high affinity to Direkli4, and find that West Eurasian domestic goats in the past, but not those today, appear enriched for Direkli4-specific alleles at a genome-wide level. This forgotten ‘Taurasian tur’ likely survived Late Pleistocene climatic change in a Taurus Mountain refuge and its genomic fate is unknown.

Introduction

The genus Capra includes the domestic goat (Capra hircus) as well as a variety of wild mountain-dwelling goat/ibex species distributed across Eurasia and North Africa including several listed as endangered or vulnerable (Pidancier et al., 2006; Shackleton, 1997). Nine species are currently recognized by the IUCN; however, taxonomic relationships are still under revision (Pidancier et al., 2006; Zheng et al., 2020). Among these, the status of the two species endemic to the Caucasus Mountains has been debated (Groves and Grubb, 2011; Parrini et al., 2009). The East Caucasian tur (Capra cylindricornis) has been considered either a species distinct from the West Caucasian tur (Capra caucasica) or they comprise a single species of two potentially-hybridizing populations (Heptner et al., 1961). Moreover the bezoar (Capra aegagrus), progenitor of domestic goat, has also been reported to hybridize with both tur varieties with which it shares seasonal grazing territories in the Caucasus region (Pfitzenmayer, 1915; Sarkisov, 1953; Weinberg, 2002). Interspecies Capra gene flow is well known (Kazanskaia et al., 2007; Manceau et al., 1999; Pidancier et al., 2006), and may explain discordant phylogenies across loci (Pidancier et al., 2006; Ropiquet and Hassanin, 2006). Such admixture may have shaped the evolution of domestic goat; for example, the tur has been identified as a putative source of a MUC6 allele driven to fixation in domestic populations and likely selected for gastrointestinal parasite resistance (Grossen et al., 2020; Zheng et al., 2020). Tur additionally shows differing affinity to domestic and wild goat genomes indicating a complex evolutionary history of the genus.

Although tur are currently restricted to the Caucasus region, ancient wild goat specimens recovered from Direkli Cave, a camp site used by Late Pleistocene hunters in the Central Taurus Mountains of southern Turkey (Figure 1A, Figure 1—figure supplements 1 and 2; Arbuckle and Erek, 2012), were found to carry a tur-like mitochondrial lineage, designated T (Daly et al., 2018). Three of these four reported ancient Capra specimen fall within the bezoar autosomal diversity, but a fourth - Direkli4, dated to 12,164–11,864 cal BCE (Bronk Ramsey, 2009; Reimer et al., 2020) and sequenced here to 2.59 X mean genome coverage (Table 1) - shows an excess of ancestral alleles in D statistic tests (Green et al., 2010) when paired with domestic/bezoar goat (Figure 1—figure supplement 3, Supplementary file 1B) implying Direkli4 carries ancestry basal to that clade. To explore this signal further, we generated low coverage genomes from historic and rare Capra samples, including a 20th century CE zoo-born East Caucasian tur (Tur2), tur specimens from the Dariali-Tamara Fort archaeological site near Kazbegi, Georgia (Mashkour et al., 2020), a zoo-born Walia ibex, and supplemented with published modern and ancient Capra genomes (Supplementary file 1A and C, Figure 1—figure supplement 4; Grossen et al., 2020; Zheng et al., 2020). Surprisingly, a neighbour joining tree from nuclear genome identity-by-state (IBS) information places Direkli4 as sister to a clade of both Caucasian tur taxa, a signal obtained using either goat- or sheep-aligned data (Figure 1B and Figure 1—figure supplement 5). The Direkli4 genome thus suggests a previously-unrecognized Capra lineage sister to both Caucasian tur inhabited the Taurus Mountains ~14,000 years ago.

Table 1
Sample provenance and sequencing summary.
SampleMorphological SpeciesOriginAgeSexNuclear Cov. mtDNA Cov.
Direkli4Capra spc.E4/8 A, Direkli Cave, Turkey12,164–11,864 cal BCE (2σ)M2.59642.23
Direkli9Capra spc.B6/5, Direkli Cave, TurkeyEst. 12,100–8,900 BCE*M0.00039.5
Direkli12Capra spc.B13/4 A, Direkli Cave, TurkeyEst. 12,100–8,900 BCEM0.0744.06
Direki13Capra spc.B13/7B, Direkli Cave, TurkeyEst. 12,100–8,900 BCEM0.005576.99
Direkli14Capra spc.D3/7, Direkli Cave, TurkeyEst. 12,100–8,900 BCEM0.000513.24
Direkli15Capra spc.B6/5, Direkli Cave, TurkeyEst. 12,100–8,900 BCEF0.000224.21
Direkli16Capra spc.B8/7, Direkli Cave, TurkeyEst. 12,100–8,900 BCEF0.0115.11
Direkli17Capra spc.E5/5, Direkli Cave, TurkeyEst. 12,100–8,900 BCEF0.00010.0845
Caucasus1Capra caucasicaTamara Fort, Kazbegi, Georgia4th-21st c. CE, probably 5th-15th c. CEM0.5564.63
Caucasus2Capra caucasicaTamara Fort, Kazbegi, Georgia4th-21st c. CE, probably 5th-15th c. CEM0.0021379.71
Caucasus3Capra caucasicaTamara Fort, Kazbegi, Georgia4th-21st c. CE, probably 5th-15th c. CEM0.004464.51
Falconeri1Capra falconeri hepteneri §Unknown, via Parc de la Haute-Touche20th Century CEM0.5872.11
Falconeri2Capra falconeriBorn at MNHN Zoo, Paris20th Century CEM0.0645.78
Ibex1Capra ibexUnknown20th Century CEF3.93179.03
Ibex2Capra ibexPointe de Calabre, Savoie20th Century CEM0.0521.4
Sibirica1Capra sibiricaBorn at MNHN Zoo, Paris20th Century CEM0.04163.16
Sibirica2Capra sibiricaBorn at MNHN Zoo, Paris20th Century CEF1.48205.78
Tur2Capra cylindricornisUnknown, via Vincennes Zoo20th Century CEF0.024.96
Walie1Capra walieBorn at MNHN Zoo, Paris20th Century CEM0.75103.12
Pyrenaica2Capra pyrenaicaUnknown20th Century CEM0.1651.2
Nubiana1Capra nubianaUnknown20th Century CEM1.25211.32
  1. *

    Estimated ages for Direkli material is based on calibrated ages from the cave stratigraphy.

  2. M=Male, F=Female.

  3. Cov. = coverage.

  4. §

    Falconeri1, a likely Barbary sheep (see Materials and methods).

Figure 1 with 7 supplements see all
Direkli Cave caprids in geographic and genetic context.

(A) Elevation map of southwest Asia. Key sites are indicated, with C. caucasica and C. cylindricornis distributions from Gavashelishvili et al., 2018 displayed. MSL = metres above mean sea level. (B) Neighbour joining phylogeny of genomes >0.5 X and the lower coverage Tur2 (0.02 X) and Direkli16 (0.01 X) genomes using 625,495 transversion sites and pairwise IBS, rooted on Sheep (not shown, as well as a likely Barbary Sheep sample Falconeri1, see Methods). Node support from 100 replicates using 50 5 Mb regions sampled without replacement shown when <100. Pink = Direkli4, green = other genomes first reported here.

Results

Our additional screening of Direkli Cave Capra remains identified seven with surviving DNA (Supplementary file 1D), with two genomes showing greater affinity to Direkli4 than bezoar from the same site (Figure 2A, Supplementary file 1E). An MDS plot of IBS distances (Figure 1—figure supplement 6) places two Direkli samples with sufficient coverage (Direkli4 and Direkli16) close to East and West Caucasian tur genome clusters, with a slight bias to the former. This tur affinity is unlikely to be driven by error as Direkli specimens have low error rates (0.026–0.195%, Supplementary file 1A and C) and do not show inflated distance-to-the-outgroup relative to modern genomes (Figure 1—figure supplement 7). A total of 3 out of the 11 Direkli Cave Capra specimens therefore are assigned to the tur-related clade, implying that while less numerous than bezoar, members of this clade were not rare in the region in the Late Pleistocene. Nuclear genome types (tur-like or bezoar) do not necessarily co-associate with mitochondrial lineages (tur T and bezoar F), with all combinations except ‘tur-like genome, tur-like mitochondria’ observed (Figure 2B, Figure 2—figure supplements 1 and 2, Supplementary file 1A and E), establishing that there was gene flow between these lineages. Additionally, there is little variation among Direkli T mtDNA (average 4.07 pairwise differences, compared to 67 among Direkli F mtDNA), suggesting a limited population size for this Direkli tur-like matrilineage.

Figure 2 with 5 supplements see all
Autosomal affinity and mtDNA profile of Direkli Cave caprids.

(A) D statistic test of affinity using specimens from Direkli Cave and a historic C.caucasica individual for reference. Positive values indicate sample X has greater affinity with the Tur-like Direkli4 genome; negative values indicate greater affinity with the C. aegagrus Direkli1-2. Error bars represent 3 standard errors, underlying site counts are presented in Supplementary file 1E. (B) ML phylogeny of mtDNA, abbreviated. Bootstrap node support values (100 replicates) are displayed when <100. The complete phylogeny including likely Barbary sheep Falconeri1 is displayed in Figure 2—figure supplement 1. T haplogroup is as defined by Daly et al., 2018 Low coverage sample Direkli17 is displayed in a highly reduced phylogeny Figure 2—figure supplement 2. C=collapsed. *=Direkli sample with discordant mtDNA and nuclear genome affinity.

A tur-like population in the Taurus Mountains is consistent with the high variability in body size of the Capra material at Direkli Cave where extremely large Capra remains have been reported alongside smaller bezoar-size individuals (Figure 2—figure supplement 3). Extant tur exhibit body weights 20–50% larger than bezoar (Castelló et al., 2016; Masseti, 2009) and it is plausible that the ‘large’ Capra from Direkli represent tur-lineage animals. Although there are clear differences between bezoar and tur horn morphologies (Pidancier et al., 2006), unfortunately, diagnostic horncore remains have not been recovered from Direkli. The cave was initially inferred to be occupied primarily during summer months (Arbuckle and Erek, 2012), with subsequent discoveries of architectural remains and zooarchaeological analyses indicating more intensive use (Arbuckle, 2019). The presence of tur-related goats may reflect use of the cave in the winter months when, based on Caucasian analogs (Gavashelishvili, 2009), tur would be expected to descend from the higher elevations surrounding the cave (>2000 m above sea level).

Given the Direkli4 genome was recovered together with bezoar specimens, the two lineages of Capra likely had proximate ranges and hybridised. We use D statistics (Green et al., 2010) to measure Direkli4 derived allele sharing relative to either a likely-hunted (Supplementary file 1G) or likely-herded (Supplementary file 1H, Pearson’s r>0.99)~10,000 year old goat from the Zagros Mountains. A Late Pleistocene wild goat from the Armenian Lesser Caucasus, Hovk1, shows highest affinity with Direkli4 (Figure 2—figure supplements 4 and 5). Bezoar goats from Direkli Cave also show high Direkli4 allele sharing, mirroring affinity measures with west Caucasian tur (Zheng et al., 2020). While directionality is uncertain, these statistics imply gene flow between the tur-like lineage and wild bezoar.

Examining domestic goats we find that Neolithic genomes from Europe show greater affinity to Direkli4 (Figure 2—figure supplement 4), but Neolithic Iranian goats do not, echoing the distribution of Direkli bezoar-related ancestry in West Eurasian populations (Daly et al., 2018). We account for possible gene flow from Caucasian tur into modern European goats using the statistic D(Tur1, Direkli4; X, Sheep) to compare relative affinity with Tur1 and Direkli4 (Supplementary file 1I). With the exception of two other tur samples, all examined domestic/bezoar goats show either a bias towards Direkli4 or gave a non-significant result, consistent with Direkli4-related admixture or a more complicated genetic history.

Genetic exchange between bezoar and the ancestors of Direkli4 could confound these measures of shared variation among domestic populations. We identified variants specific to Direkli4, conditioned on ancestral allele fixation in a range of defined groups (Figure 3A, see Materials and methods). Using this we calculate a statistic analogous to the D statistic, here termed the extended D or Dex. Dex measures the relative degree of allele sharing, derived specifically in a selected genome or group of genomes, and may have some utility in genera with complex admixture histories or admixture from ghost lineages. Relative to Neolithic Zagros goats, ancient domestic genomes from western Eurasia have an excess of Direkli4-specific variants (Figure 3B, Figure 3—figure supplement 1, Supplementary file 1J). This ‘Direkli4-specific’ allele sharing signal is absent in ancient goats from Iran-eastwards, and in all tested modern goat (Figure 3C). To control for possible reference biases, we calculated Dex ascertaining on variants segregating in sheep (Supplementary file 1K) and recovered similar results (Pearson’s r=0.9935). Repeating the analysis using other ancient/historic Capra ‘specific’ alleles shows somewhat correlated results (Supplementary file 1L), but the distinct patterns of allele sharing (Figure 3—figure supplements 2 and 3, Supplementary file 1M) imply that Direkli4 ancestry in domestic goat varies temporally and geographically.

Figure 3 with 22 supplements see all
Extended D calculation and application to Direkli4 specific allele sharing.

(A) Extended D statistic. To control for gene flow with the Capra genus, we condition on variants derived in Direkli4 (H3) and a genome X (H2), but ancestral in other populations (here: Sheep, non-bezoar Capra genomes, bezoar from Direkli Cave, and other bezoar). Values are calculated relative to a set of Neolithic goat from the Zagros Mountains, and normalized similarly to the D statistic. (B) Extended D statistic values for Direkli4 using transversions, (C) plotted through time. Each symbol is a test genome, with shape denoting time period. Black borders indicates a |Z| score ≥3, using 1000 bootstrap replicates and 5 Mb blocks.

We next identify alleles derived in Direkli4 also at a low frequency (>0%,≤10%) in other Capra and bezoar, and then measure their abundance in domestic goats. The west Caucasian tur (Tur1) most frequently shares derived alleles with Direkli4 and domestic goat (Figure 3—figure supplement 4), consistent with their cladal relationship (although this measure is sensitive to genome depth, see Methods). Ancient European domestic goats share a higher proportion of alleles with both Direkli4 and the high-coverage bezoar from Direkli Cave, Direkli1-2. In comparison, Modern European and African goats carry variation present in Direkli4 plus one of the two Caucasian tur (Tur1 and Caucasus1). This discrepancy could be explained by either gene flow from domestic goats into tur during the last 8000 years, or alternatively an increase in tur-Direkli4 related ancestry in European populations over time.

Investigating gene flow events within Capra, automated tree-based model exploration (Pickrell and Pritchard, 2012) detects admixture between the Direkli4/Tur lineage and the ancestors of the Late Pleistocene bezoar Hovk-1 (Figure 3—figure supplement 5). Residuals of this graph point to unmodeled affinity between Direkli4 and both Direkli Cave bezoar and with Neolithic Serbian domestic goat (Figure 3—figure supplement 6). Modern European goat do not show unmodeled Direkli4 affinity, supporting the interpretation that Direkli4-related ancestry has declined with time in west Eurasian goats. A reduced set of populations explored using ML network orientation (Molloy et al., 2021) reiterates the Tur1/Direkli4 and Direkli bezoar lineages admixture, and also between Direkli bezoar and domestic goat (Figure 3—figure supplement 7). Investigating admixture graph space (see Materials and methods, Maier et al., 2022) we find 2 admixture events best explain how a subset of populations (Sheep, Tur1, Direkli4, Direkli bezoar, Neolithic East Iran, and Neolithic Serbia) can be modeled. A majority (6/11) of graphs model Direkli bezoar as containing ancestry related to Direkli4 (median 1.5%, mean 5.2%; best fitting graph is shown in Figure 3—figure supplement 8), with a single graph modeling the opposite (2% Direkli bezoar ancestry). While the graph space explored is limited, these results suggest a greater degree of ‘Direkli4 to Direkli bezoar’ gene flow than ‘Direkli bezoar to Direkli4’.

We finally identify 3 out of 112 regions introgressed from other Capra species to domestic goat (Zheng et al., 2020) which show high affinity with Direkli4 (Figure 3—figure supplements 911, 21–22). A further nine regions appear to have most affinity with the Direkli4-tur clade (Figure 3—figure supplements 1220), including a locus encompassing MUC6, a target of selection in domestic goats during the last 10,000 years (Zheng et al., 2020), implicating the Direkli4 lineage in the makeup of domestic goat gene pool.

Discussion

Our results indicate that a lineage related to the Caucasian tur existed in the Taurus Mountains during the Late Pleistocene, as late as the 12th millennium cal BCE. Based on the current, limited genomic data from the Capra genus, which we improve on here, this lineage appears to be a sister group to the tur C. caucasica and C. cylindricornis. Similar to other mammalian groups (Gopalakrishnan et al., 2018; Palkopoulou et al., 2018; Zheng et al., 2020), admixture likely occurred among Capra lineages; the population reported here carries bezoar-associated mtDNA and a possible small amount of bezoar nuclear genome ancestry (2% from 1/12 graphs). This Taurasian tur population is itself a possible candidate for the source of Tur-like ancestry present in domestic goats, including an introgressed MUC6 allele fixed in modern populations which increases gastrointestinal parasite resistance (Zheng et al., 2020). Given the relative paucity of Capra genomic data available compared to other mammalian groups, additional genomes from the genus will help refine the history of divergences and gene flow events which shaped the group’s evolution.

We suggest this novel ‘Taurasian tur’ lineage be designated Capra taurensis following IUCN convention (Weinberg and Lortkipanidze, 2020) or Capra caucasica taurensis under a subspecies classification (De Queiroz, 2007; Wilson and Reeder, 2005). The Taurasian tur may have diverged from the Caucasian lineages 130-200kya based on mtDNA coalescent estimates (Bouckaert et al., 2014; Daly et al., 2018). The current distribution of Capra species is mostly discontinuous and is suggestive of climate-induced fragmentation (Shackleton, 1997). The ancestors of Caucasian tur likely extended over a broader range in Eurasia during the Late Pleistocene but may have been poorly captured by the fossil record (Crégut-Bonnoure, 1991; Uerpmann, 1987; Weinberg, 2002). The large variability and high upper size range of Capra remains are consistent with both smaller-bodied bezoar and larger-bodied tur-relatives being present within the faunal assemblage at Direkli as well as other sites in the central Taurus and Lebanese mountains (Üçagızlı cave, Ksar Akil, Saaide II), but not in the western Taurus where only bezoar are evident (Figure 2—figure supplement 3; Arbuckle and Erek, 2012). C. taurensis could have survived the Last Glacial Maximum within the central Taurus Mountains, a plausible refugia for Capra species in addition to the Pontic and Anti-Taurus ranges (Gavashelishvili et al., 2018) while experiencing a severe matrilineal bottleneck (Figure 2B). C. taurensis appears to have produced fertile offspring with other members of the Capra genus; the traces of shared ancestry in ancient bezoar and likely managed goat (Figure 3B) may be the consequence of direct gene flow or secondarily via admixed bezoar. Gene flow between early managed goats and Anatolian bezoar carrying C. taurensis ancestry could partially explain the divergence between Zagros and more westerly herds.

Given the tremendous pressure on Capra species via Anthropocene over-hunting and habitat disruption (Shackleton, 1997), it is assumed that C. taurensis is extinct, with its existence only now revealed via palaeogenomics. The Caucasian tur’s preference for snowier habitats (Gavashelishvili et al., 2018) combined with the lower altitude of the Taurus Mountains relative to the Caucasus (Figure 1A) may have rendered the lineage vulnerable to climatic change via Holocene warming and interspecific competition with bezoar, which are still found in the Taurus mountains (Gavashelishvili, 2009; Naderi et al., 2008), leading to its hypothesised extinction. As the history of C. taurensis following the Late Pleistocene is still unknown, further genomic surveys of Holocene Capra remains and present-day populations, such as the VarGoats project (Denoyelle et al., 2021), from this and adjacent regions may illuminate its genetic legacy.

Materials and methods

Materials and methods summary

Request a detailed protocol

DNA from 7 postcranial bone elements from Direkli Cave and 13 historic Capra specimen was extracted via standard aDNA protocols (Yang et al., 1998), with a 0.5% sodium hypochlorite pre-wash (Korlević et al., 2015) performed for the Direkli material. Following uracil excision (Rohland et al., 2015) and dsDNA library construction (Meyer and Kircher, 2010), libraries were subject to shotgun sequencing (Illumina HiSeq 2000 and NovaSeq 6000) or RNA-bait enrichment of mtDNA reads prior to shotgun sequencing. Additional sequencing data was also generated for specimen Direkli4.

Using bwa aln (Li et al., 2008) a relaxed alignment (Meyer et al., 2012) was performed against the goat reference ARS1 (Bickhart et al., 2017) or an outgroup genome (Oar_rambouillet_v1.0). Subsequent analyses were primarily performed in the ANGSD environment (Korneliussen et al., 2014) using single read sampling.

Direkli Cave

Request a detailed protocol

Direkli Cave (37°51'28.08"N, 36°39'13.98"E) is located in the Central Taurus mountains, north-west of Kahramanmaraş Province, near the village of Döngel, Turkey (Erek, 2010; Erek, 2012). It was discovered and first excavated by Kökten, 1960. The cave sits in a south-facing limestone escarpment at an elevation of 1100 m above sea level. It is located at the base of the Deli Höbek mountain and adjacent peaks which rise up to 2200 m immediately to the northeast. The cave is located above a small alluvial fan on the eastern side of the north-south trending valley draining the Tekir river (and the D825 highway), itself a tributary to the Ceyhan, a major river system which flows into the Mediterranean. The site is located on the edge of a highly dissected upland and within a corridor providing easy access both to the Mediterranean coast as well as the interior of southern Turkey. It is therefore situated along a convenient route for seasonal movements between higher and lower elevations for both humans and other animal species.

In 2007, new excavations under the direction of C. M. Erek were initiated with the support of the Turkish Ministry of Culture and Tourism in order to further explore the late Pleistocene deposits in the cave, a period that is poorly documented for the central Taurus mountains. Work carried out since that time has identified a stratigraphic sequence rich in artifacts and features dating to the late Pleistocene with lithic parallels both with the Levantine Natufian as well as sites on the Turkish Mediterranean coast. Radiocarbon dates derived from charcoal and animal bone from layers containing microlithic tools, date the main prehistoric occupation layers of the site to between 12,100–8900 years calibrated BCE. The majority of these dates place the occupation in the late Pleistocene prior to the Younger Dryas although the cave was used as late as the late 10th millennium BCE as well.

Excavations were carried out following the natural and cultural stratigraphic layers. Sediments of different colors in each plan square were collected in different buckets and passed through a triple screening system. The sediments, which were all subjected to water screening, were sieved through fine (1 mm), medium (2 mm), and coarse (4 mm) sieves and the materials in each sieve were dried in the shade. These recovery techniques have produced a rich faunal assemblage indicative of the local fauna exploited by the cave’s occupants. This faunal community is dominated by the remains of wild goats (Capra spp–the focus of this study) but also deer, boar and black bear as well as a variety of fur bearing carnivores.

Capra specimens sequenced in this study derive from excavation areas located in the North portion (interior) of the cave (Figure 1—figure supplement 1). Stratigraphic affiliation of each specimen is reported in Table 1. Specimens were recovered from stratigraphic layers 4–8 and all derive from deposits containing material culture (especially microliths) consistent with a late Pleistocene date. In general, Capra specimens are highly fragmented and many exhibit cut and percussion marks clearly indicating an anthropogenic origin for the assemblage. There is no evidence that Capra remains accumulated through natural processes or carnivore denning behaviours. Direct AMS dates were acquired for two specimens (Direkli4 [Beta-432464: 12130+/-40] and Direkli2 [Beta-425280: 11370+/-40]) confirming their late Pleistocene provenience (12,200–11,200 cal BCE). Four specimens derive from grid square B/13 (Figure 1—figure supplement 2) including stratigraphic levels 4 A, 4 C, 7B, 7 C providing a temporal sequence from youngest to oldest including Direkli2, Direkli15, Direkli16, Direkli1.

Sample preparation

Request a detailed protocol

Material from the Muséum national d'Histoire naturelle collections in Paris were sampled on-site. Newly screened specimens from Direkli Cave and Dariali-Tamara Fort were sampled in dedicated ancient DNA facilities in TCD, Dublin, following standard protocols (Pääbo et al., 2004).

Sampled materials were cleaned with a drill bit and then subject to UV for 30 min, flipping midway to decontaminate both sides, followed by pulverization using a Mixer Mill (MM 200, Retsch).

DNA extraction and library preparation

Request a detailed protocol

For MNHN and Tamara Fort material,~150 mg of bone powder was subject to EDTA-prewash and proteinase K 3 day extraction as previously described (Daly et al., 2018; Gamba et al., 2014; MacHugh et al., 2000; Yang et al., 1998). For newly screened Direkli Cave material, samples were subjected to a 0.5% hypochlorite wash prior to EDTA wash and proteinase K-based extraction (Boessenkool et al., 2017; Daly et al., 2021). Previously described protocols for DNA cleanup (Daly et al., 2018; Daly et al., 2021), Uracil-DNA-glycosylase treatment Rohland et al., 2015 of 16.25 µl DNA followed by dsDNA library construction (Meyer and Kircher, 2010). Control tubes were included for extraction and library steps and kept for subsequent analysis, but did not show evidence of contamination.

Screening, sequencing and nuclear genome alignment

Request a detailed protocol

Libraries were amplified (Accuprime Pfx, Thermofisher) with single indexes (MNHN and Tamara Fort specimen) or double indexes (newly-screened Direkli specimen) as described (Daly et al., 2018; Daly et al., 2021). Initial screening was performed using an Illumina MiSeq platform (50 bp SE; TrinSeq, Dublin), HiSeq 2000 (100 bp SE; Macrogen, Seoul) or NovaSeq 6000 (100bp PE; TrinSeq, Dublin); sample-platform mix is presented in Supplementary file 1D.

Following de-multiplexing, single-end fastq files were filtered for adaptors which were also trimmed, and reads <30 bp in length removed using cutadapt v1.9.1 (Martin, 2011) cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -O 1 m 30.

AdaptorRemoval v2.3.1 (Schubert et al., 2016) was used for pair-end data, for the above QC steps and also to collapse overlapping read pairs: AdapterRemoval -collapse --minadapteroverlap 1 --adapter1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC --adapter2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT --minlength 30 --gzip --trimns --trimqualities.

Trimmed (SE) or collapsed (PE) reads were aligned to ARS1 (Bickhart et al., 2017) using bwa aln (Li et al., 2008) relaxing parameters (Meyer et al., 2012) and handling sam/bam files using samtools v1.11 (Li et al., 2009). Read groups were assigned using bwa samse/sampe to differentiate sequencing libraries and PCR reactions. To estimate endogenous DNA %age, duplicate removal was performed using picard v2.22.1 (The Broad Institute, 2018) and mapQ ≥30 filtering with samtools, dividing QCed aligned reads by raw reads. Substitutions associated with ancient DNA were examined using mapDamage2 (Jónsson et al., 2013), and are presented in Figure 1—figure supplement 4.

Samples selected for deeper sequencing (typically >10% endogenous DNA, and including additional sequencing of Direkli4 originally reported in Daly et al., 2018) were sequenced on either a Illumina HiSeq 2000 (100 bp SE; Macrogen, Seoul) or NovaSeq 6000 (100bp PE; TrinSeq, Dublin). Deeper sequenced data was aligned as above, with a subsequent indel realignment step (The Broad Institute, 2018) and softclipping of reads by reducing the base quality of the first and last four bases to zero using a custom python script. Genome depth was estimated using GATK and presented along with alignment statistics in Supplementary file 1D.

For alignment to the sheep reference, the above pipeline was used substituting ARS1 with the sheep reference Oar_rambouillet_v1.0 available at https://www.ncbi.nlm.nih.gov/assembly/GCF_002742125.1/. Samples aligned to sheep are indicated in Supplementary file 1C.

Mitochondrial capture

Request a detailed protocol

Samples not selected for deep genomic sequencing were enriched for mammalian mtDNA sequences using an in-solution RNA hybridization approach using custom baits as previously described (Daly et al., 2018; Gnirke et al., 2009; Maricic et al., 2010; O’Sullivan et al., 2016) Daicel Arbor Biosciences, Ann Arbor, USA. Libraries enriched for mtDNA were subsequently sequenced using a MiSeq platform (50 bp SE; TrinSeq, Dublin).

Mitochondrial genome alignment and phylogeny

Request a detailed protocol

Reads were aligned to circularized (15 bp either end) mitochondrial reference genomes as previously described (Daly et al., 2021), realigning to closer mtDNA references to maximize sequence recovery. Sample fasta files were producing using ANGSD v0.922 (-doFasta2 -setMinDepth 3 -minQ 20 -minMapQ 30 -trim 4; Korneliussen et al., 2014) and decircularized by removing 15 bp at both ends of the resulting fasta sequence. Due to the low coverage of Direkli17, this sample fasta was produced with -setMinDepth 1 and analyzed independently of others.

A multiple sequence alignment of data generated here plus published Capra mtDNA sequences (Supplementary file 1E) was generated using MUSCLE (Edgar, 2004). A ML phylogeny with 100 bootstrap replicates was then computed using phyML (Guindon et al., 2010) using a BIC-selected substitution model, and visualized using Figtree (Rambaut, 2009).

Pairwise differences among aligned mtDNA sequences was calculated using a custom Python script, ignoring sites where one of the two samples were missing data (‘N’). For F lineage comparisons, Direkli17 was excluded due to its low coverage and the higher error rate implicit in the -setMinDepth 1 option used for that sample.

Modern alignment

Request a detailed protocol

Modern genomes (Supplementary file 1C) were aligned to ARS1 or Oar_rambouillet_v1.0 using bwa mem v0.7.5a-r405 (Li, 2013; Li et al., 2009; Li et al., 2008) and following GATK Best Practices, removing duplicates and reads with mapQ ≥30. Among these were 3 modern Tur genomes, which were included in the IBS calculation and nj tree (below) with permission from the VarGoats consortia (Denoyelle et al., 2021) prior to these samples’ in-depth analyses. Samples aligned to sheep are indicated in Supplementary file 1C.

ANGSD-based analyses

Request a detailed protocol

All ANGSD (v0.922; Korneliussen et al., 2014) analyses were performed using the following parameters: -minMapQ 30 -minQ 20 C 50 -remove_bads 1 -uniqueOnly 1 -rmTrans 1 and restricting analyses only to the autosomes. When necessary (-doMajorMinor 5 or -doAbbababa 1), the ancestral sequence was defined using a fasta file from a goat-aligned sheep (Daly et al., 2021). Genotype likelihoods were computed using -GL 2 -rmTrans 1 -SNP_pval 1e-6 -skipTriallelic 1.

Error estimation

Request a detailed protocol

To estimate the error rates of the Direkli specimen and Capra genomes reported here, the ANGSD ‘perfect genome’ approach was employed, using a high coverage Old Irish Goat (‘IOG’,~42 X): -doAncError 1 -ref IOG.fa, where IOG.fa was generated using ANGSD -doFasta 1 -doCounts 1 -setMinDepth 13 -setMaxDepth 76 C 50 -minQ 20 -minMapQ 30. Sheep was used as the outgroup. Error rates were reasonable, with all falling below 0.5% and the highest reaching 0.3428%; the highest Direkli genome error was just 0.1947%. Error rates are shown in Supplementary file 1A and C (for previously published Direkli bezoar and the Tur1 specimen).

We additionally assessed error rates of ancient vs modern individuals by plotting distance from the outgroup as calculated from Identity-By-State statistics (see below). We expect high error individuals, particularly ancient ones, to show inflated distances from the outgroup (sheep). Plotting distances (Figure 1—figure supplement 7), the majority ancient Direkli specimens do not show excessive distance to the outgroup, with the highest distances being observed in historic and modern European (alpine and Iberian) ibex. A single Direkli wild goat / Capra aegagrus shows elevated distance to the outgroup relative to modern Capra aegagrus (0.240308), but all other Direkli specimens have unremarkable distance values.

D statistics

Request a detailed protocol

D statistics were computed using -doAbbaBaba 1 -rmTrans 1 -doCounts 1 and using sheep as above to define the ancestral allele. Correlation between D statistical tests was measured by calculating Pearson’s correlation coefficient r using the cor() function of R Development Core Team, 2021. D test results are presented in Figure 2—figure supplements 4 and 5, and Supplementary file 1GH and I. A subset of D statistic tests were also repeated using sheep aligned-data (see above); Pearson’s r for the test indicated in Supplementary file 1FG and H were 0.927, 0.950, and 0.494, respectively. The latter is not concerning as the D statistics in question (Direkli4, Tur1; X, Sheep) are overwhelmingly large for goat and sheep-aligned data (i.e. |D|>>0.05) and are consistent in directionality.

Identity-by-state (IBS)

Request a detailed protocol

Pairwise IBS matrices were computed for reported genomes ≥0.5 X coverage, the lower coverage Tur2 genome (C. cylindricornis), a subset of modern and ancient Capra genomes, and a sheep outgroup using ANGSD (-makeMatrix 1 -doCov 1 -doMaf 1 -minFreq 0.05 -minInd 56), allowing ~5% missingness per site. Neighbour-joining phylogenies were constructed using the nj() function of the ape R package (Paradis and Schliep, 2019). Published genomes included in IBS calculations are indicated in Supplementary file 1C; sheep was used as the outgroup.

To calculate node support, pseudo-bootstrap datasets were generated by sampling-without-replacement 50 5 Mb regions a total of 100 times, and calculating IBS values for each of the 100 pseudoreplicates. Neighbour-joining trees were calculated as above and node supports applied to the base tree using RAxML (Stamatakis, 2014), and are presented in Figure 1B.

To control for possible reference genome effects, an additional IBS neighbour-joining tree was computed using sheep-aligned data using the following ANGSD settings, allowing for 90% site missingness due to the smaller number of genomes used: -doMajorMinor 4 GL 2 -minFreq 0.05 -minInd 41. Node support values were calculated as above, using 50x5 Mb regions per replicate. This tree is presented in Figure 1—figure supplement 5, with (A) and without (B) the lower coverage Direkli16 sample which falls within the ‘taurasian tur’ clade. Site counts for both trees are 96,930 and 110,672 respectively.

We additionally constructed a MDS plot using the sheep-aligned MDS data and R functions cmdscale(as.dist()) on the IBS.ibsMat file (Figure 1—figure supplement 6A), with an insert of the corner of the plot containing the taurensis tur lineage. The MDS places the nubian, alpine, and european ibex specimens, as well as tur, in a distinct part of the MDS plot away from a Capra aegagrus/ hircus group and a Capra sibirica/falconeri (Siberian ibex and markhor). Within the west Eurasian ibex & tur genomes, the relative affinities seen in the nj tree (Figure 1B) are reiterated, with the tur group being closer to the European ibex than to Nubian ibex. The taurasian tur genomes Direkli4 and Direkli16 clearly fall within the Tur group. Both eastern and western tur genomes form distinct groups in MDS space, with Direkli tur showing somewhat higher affinity to the eastern tur group. We additionally computed IBS using sheep-aligned, ancients & historic samples only, allowing 20% missingness to account for the relatively high number of low coverage samples (-minInd 21). From the resulting IBS matrix (computed from 464,799 biallelic transversion sites), an MDS plot was produced (Figure 1—figure supplement 6B). This plot offers less discriminatory power within the ‘ibex / tur’ group, as expected given the lower number of Capra genomes and overall lower coverage among the non Capra hircus / Capra aegagrus genomes. However, both Direkli ‘taurasian tur’ genomes fall clearly within the ‘ibex / tur’ group.

Haploid calling

Request a detailed protocol

To produce a haploid (random read sampling) callset of genomes with at least 0.5 X coverage for subsequent analyses the ANGSD -doHaplocall function was employed: -doHaploCall 1 -minInd 2 -minQ 20 -minMapQ 30 -remove_bads 1 -uniqueOnly -doCounts 1 C 50 -trim 4. Output files were then screened using a custom python to retain biallelic sites only and remove CpG sites (232,685,198 remaining), and then converted to plink format (Purcell et al., 2007) using the haploToPlink tool provided with ANGSD. An additional dataset consisting of genomes ≥2 X coverage was also generated for Treemix and Orientagraph analyses (below).

Outgroup/sheep ascertained sites

Request a detailed protocol

To create an outgroup-ascertained set of variant sites for Treemix/Orientagraph, Minor Allele Frequencies were computed dataset of 11 ARS1-aligned sheep (Supplementary file 1C).

-doMaf 1 -doCounts 1 -minMaf 0.05 -minQ 20 -minMapQ 30 C 50 -doMajorMinor 3 GL 2 -remove_bads 1 -uniqueOnly 1 -SNP_pval 1e-6 -setMinDepthInd 4 -minInd 6 -sites sites.txt; where sites.txt was a file describing biallelic sites in the 2 X haploid-called dataset (above). These 223,772 outgroup-varying sites were then extracted from the 2 X dataset using plink --extract. Twelve modern domestic goat samples (Alashan2, 3, and 4; Erlangshan1, 3, and 4; Aerbasi1, 3 and 6; Liaoning2, 3, and 5; see Supplementary file 1C) were then removed due to an excess of missingness (>0.15).

Introgressed regions

Request a detailed protocol

To examine possible sources of 112 haplotypes identified as introgressed in domestic goat (Zheng et al., 2020), IBS values were calculated using ANGSD (-doIBS 1 -doMajorMinor 5 -minInd 157 -minMaf 0.02). The -minInd parameter was set to ensure only sites with ≤10% missing data across the 175 individuals were considered, and -minMaf so that alleles found at least one Capra source and two domestic genomes were included. Genomes with ≥1 X mean coverage were included, plus the Capra samples of interest. Introgressed region coordinates were obtained from Data Supplementary file 1 of Zheng et al., 2020.

Heatmaps from pairwise ibs data were constructed using the R gplots heatmap.2() function (Warnes et al., 2019), first removing Nubiana1 and Caucasus1 due to coverage effects, and sheep samples due to distortion of ibs distance value scales. Heatmaps and hierarchical clustering of each putatively-introgressed region are displayed in Figure 3—figure supplements 922. Additionally, nj trees were constructed for each IBS matrix using the nj() function and visually assessed to determine if the Direkli4 lineage is a possible source of introgressed haplogroup. All 10 nj trees are shown in Figure 3—figure supplements 21 and 22. Assessing nj trees and heatmaps, 3 regions out of the 112 total are plausible as Direkli4 being the closest-to-domestic sample (1:104,150,173–104,349,720; 2:24,324,410–24,369,675; 13:66,710,508–66,749,824); a further 7 may fit with the Direkli4-Tur1 clade being closest to domestic. These assignments should be considered preliminary in lieu of higher quality genomic data for the two Caucasian tur and the Direkli4 lineages.

Identity of MNHN sample Falconeri1

Request a detailed protocol

Falconeri1, thought to be a Capra falconeri hepteneri (Tadjik or Bukharan markhor) from the Muséum national d'Histoire naturelle (MNHN-ZM-AC-2009–243), shows the genetic profile aberrant relative to its supposed species. The mitochondrial sequence of Falconeri1 groups with the Barbary sheep or aoudad (Ammotragus lervia), a member of the Caprini tribe but not of the genus Capra (Figure 2—figure supplement 1). The mtDNA does not form a clade with the two other markhor sequences included in the phylogeny. Consistent with this is the position of Falconeri1 in IBS-based nj trees, which places the sample basal to all other non-outgroup samples, and does not group with other markhor samples (Figure 1B, Figure 1—figure supplement 5). We conclude that the sample was misidentified during storage or sampling, and that the genetic sample labeled here as Falconeri1 is more likely to be a Barbary sheep. As such the sample was excluded from subsequent analyses.

Extended D/Direkli4-specific alleles

Request a detailed protocol

We extended the general idea of the D statistic (Green et al., 2010; Patterson et al., 2012) and group D statistic (Soraggi et al., 2018) to identify variants derived in a genome/population of interest, H3, and ancestral in a set of other defined genomes/populations and an outgroup (i.e. multiple ‘outgroups’, see Figure 3A). The number of times the derived allele (i.e. the H3 ‘specific’ allele) was observed in a reference individual/genome H1 was counted (nBABAex), as was the number of times the derived allele was observed in the target individual/genome H2 (nABBAex). The difference between nABBAex and nBABAex was then divided by the sum of nABBAex and nBABAex, analogous to the D formula:

Dex=(nABBAexnBABAex)/(nABBAex+nBABAex)

A Z value can then be computed from bootstraps estimate of the standard error using 5 Mb genome blocks and 1000 bootstrap replicantes, to normalize the Dex value.

Sites must be covered at least once in each defined group (‘outgroups’ or otherwise). The ancestral allele was also conditioned on being at 100% frequency in each of the ‘outgroup’ groups, but conceptually this criteria could be slackened to investigate patterns of derived allele sharing (i.e. derived variants in H3 that are present at some frequency e.g. >0%,≤10% in one of the defined ‘outgroups’). Calculations of Dex were based on the biallelic CpG-removed haploid-called sites (232,685,198 total), computing with and without transitions, and performed using a custom python script.

Initially, we examined the sharing of Direkli4-specific variants relative to a population of ~10,000 year old Neolithic domestic-like genomes from the Zagros Mountains (Daly et al., 2021), requiring the Direkli4-derived allele to be fixed for ancestral and covered at least once each in the following groups:

  • Direkli bezoar

  • Other bezoar (ancient and modern)

  • Other Capra

  • Sheep (using the 11 genomes defined in Supplementary file 1C) ultimately defining the ancestral state

This would therefore identify variants specific to the Direkli4 lineage, and exclude those shared with the other Capra genomes analyzed here (e.g. Caucasian Tur) or the Direkli bezoar (which have likely experienced gene flow with the Direkli4 lineage, see Figure 2—figure supplement 4). Dex values were highly similar when computed using transversions only (Figure 3B) or all variants (Figure 3—figure supplement 1, for both see Supplementary file 1J). To control for reference genome effects we also calculated Dex requiring the Direkli4-derived allele to segregate in sheep, with highly correlated results (Pearson’s r=0.9935, Supplementary file 1J).

We repeated the Dex calculation but varying H3 to be a different genome of interest (Figure 3—figure supplements 2 and 3, Supplementary file 1M). Capra aegagrus from Direkli Cave (Direkli1-2, Direkli5, Direkli6) show highest levels of shared-specific ancestry, occurring in European and African goat; this ancestry is only somewhat correlated with Direkli4-specific ancestry (r=0.6562–0.5731) and likely reflects gene flow from Anatolian bezoar to the ancestors of west Eurasian domestic goat (e.g. Figure 3—figure supplements 5 and 7). A single markhor (Cfalconeri04, SAMN10736157) may have east Asian related gene flow in its ancestry, while derived alleles signals of multiple Capra genomes in sub-Saharan goat imply gene flow from a Capra source into these domestic populations.

We observed positive Dex values for Direkli4 and ancient west Eurasian goats, but not modern goats (Figure 3B). To assess whether technical biases may artificially cause Direkli4 to share more alleles with ancient samples, we examined the correlation with Dex when other Capra genomes define the “specific” allele (Figure 3—figure supplements 2 and 3, Supplementary file 1L). Correlations of specific allele sharing values with C. caucasica, the Palaeolithic Armenian bezoar Hovk1, and medieval C. cylindricornis Caucasus1 were highest (Pearson’s r=0.9258, 0.9231, 0.8824 respectively) (Supplementary file 1L). The relatively high range of correlations with historic Capra genomes (r=0.7194–0.8363) suggests some technical bias, but does not completely explain the pattern of Direkli4-specific allele sharing.

As mentioned above this ‘extended D’ approach was amenable to allowing the H3-specific variant to also segregate at a defined rate among ‘outgroups’, effectively ‘disentangling’ patterns of shared derived alleles. We investigated Direkli4-specific allele sharing, cycling through different target H2 genomes, for variants also >0%,≤10% across ‘outgroups’ (but fixed as ancestral in sheep). For each target H2, the total number of times a given ‘Outgroup’ individual shared the ‘Direkli4-specific’ allele was recorded, expressing as a proportion of the total >0%,≤10% shared variants in a heatmap format (Figure 3—figure supplement 4). While sensitive to coverage (as deeper sequenced samples will on-average make up a greater proportion of the total number of shared ‘Direkli4-specific’ variants), differences are apparent between Direkli4-specific allele sharing between ancient and modern European goat; the former share more Direkli4 alleles with Direkli bezoar, while the latter share more Direkli4 alleles with the Caucasian tur genomes Tur1 and Caucasus1. This pattern could be explained by a turnover in European goat populations to one with a greater Caucasian tur-related ancestry, a technical bias that increases affinity between the Direkli bezoar and ancient European goat or Tur genomes and modern European goat, or gene flow between a population related to modern European goat and Caucasian tur. These hypotheses could be tested with finer temporal sampling of European goats, or a time series of Caucasian tur to determine if gene flow from domestic populations occurred.

Python scripts to run the extended D calculations and to ‘disentangle’ the pattern of derived allele sharing are available on GitHub (Daly, 2022, copy archived at swh:1:rev:f803deabaa929dad5cebeec67bb0ee3b83e3c4a9) and https://osf.io/3ecqd/.

To confirm our results were unlikely due to choice of reference genome, we recalculated a subset of Dex statistics with data aligned to sheep, with and without ascertaining in the sheep outgroup population and examining tests related to Direkli4-specific variants in domestic groups (Supplementary file 1J and K). Correlation with goat-aligned data was high (r=0.897 and 0.906 respectively using transversions-only), suggesting that a reference bias towards ARS1 was unlikely driving the observed patterns in Direkli4-sharing allele sharing. However, some Z scores differed by crossing the chosen significance threshold (| Z |>3) while retaining their Dex directionality. For example, with sheep-ascertained sites the modern European goats IOG and Italian4 obtain significant negative scores (having fewer Direkli4-specific variants than the reference Neolithic Zagros group) when using sheep-aligned data but not goat aligned (negative but not |Z|>3, Supplementary file 1K). This is likely due to the lower number of genomes included in the Dex calculation when using sheep-aligned data, a necessity due to the computational limitations of aligning available data to the sheep genome. Each additional genome include in the H4 “outgroup” (e.g. diverse bezoar, Capra specimen, other Direkli bezoar) should reduce the total number of sites in the Dex calculation, by filtering out variants otherwise assigned as ‘Direkli4-specific’. Fewer sites will reduce the sensitivity of the test (by decreasing the number of sites included in each bootstrap iteration) while increasing the specificity of the variants. As such it is the consistency of directionality and correlation of the Dex values between different data sets which should lend confidence to the goat-aligned results presented in Figure 3—figure supplements 13.

Graph-based modeling

Request a detailed protocol

To generate a graph-based admixture model of how individual genomes relate, Treemix (Pickrell and Pritchard, 2012) was employed on the 2X-sheep ascertained dataset. The number of migration events m was varied from 0 to 5, with other parameters set as -root Sheep -k 1000 --noss --global. Node support values were estimated using 50 bootstrap replicates and the -boot option, applying node support values to the base tree using RAxML (Stamatakis, 2014).

To maximize the possibility of detecting gene flow between the Direkli4/Tur lineage, Orientagraph (Molloy et al., 2021) was employed on a reduced set of populations and at a group level. Sites were filtered to retain those with at least one call per group and a MAF of ≥0.05, leaving 104,550 sites. m was varied from 0 to 4 due to computational limitations, running with Treemix settings as above but with the addition of -mlno to find the Maximum Likelihood Network Orientation, and k of 500 due to the lower SNP number.

Finally ADMIXTOOLS2 (Maier et al., 2022) was employed to explore admixture graph space in a complementary manner. To investigate the admixture status of the ‘taurasian tur’ lineage in a limited graph space, six populations/genomes were included:

  1. West Caucasus Tur Capra caucasica (Tur1).

  2. Direkli4 (DIR4).

  3. Direkli bezoar / Epipaleolithic Taurus (ETa).

  4. PPN/PN East Iran (NEI).

  5. Neolithic Serbia (NSe).

  6. A group of ARS1-aligned sheep were used as the outgroup (Sheep).

SNPs used were the 223,772 sheep-ascertained variants described above. We followed the approach suggested by Maier et al., 2022, fitting 50 graphs per complexity class (m=0–5) using find_graphs(stop_gen = 100, outpop = 'Sheep') using pre-computed f2 statistics calculated over all goat autosomes and allowing missingness (extract_f2(maxmiss = 1, auto_only = F)). At m=2 (two admixture edges), the preponderance of graphs fit with a worst f4 outlier | Z |<3.

find_graph() was rerun at m=2 for 50 iterations, with the constraint that the Neolithic Serbian group had to receive at least one admixture edge in its history, reflecting the established gene flow event from a population related to the Direkli bezoar (Daly et al., 2018; Daly et al., 2021). Duplicate graphs were removed, as were graphs with 100%/0% admixture events. The best fitting graph was compared to the remaining graphs by calculating bootstrap values for graph scores (qpgraph_resample_multi(nboot = 100) and compare_fits()$p_emp), with significantly worse fitting graphs removed; several as-good-as fitting graphs retained significant f4 outlier |Z|>3.

The remaining 11 graphs were then scored for the presence of (1) admixture edges from ETa into DIR4, an indication that the taurasian tur group has bezoar admixture, and (2) admixture edges from DIR4 into ETa, indicating that Direkli bezoar have ancestry related to the taurasian tur. In a majority of graphs (6/11), ETa is modeled as having DIR4 ancestry (median 1.5% DIR4 ancestry, mean 5.2%). Only one graph models DIR4 as a mixture of an ETa related clade and Tur1 (2% for the latter). Following the methodology suggested by Maier et al., 2022, we examined the best fitting graph at an additional complexity class (m=3) and found ETa again to be modeled as containing DIR4 ancestry (1%), demonstrating this feature to be a robust one. These results suggest that while the data does not exclude bezoar to taurasian tur gene flow (which is implied by other results, including mtDNA lineages Figure 2—figure supplement 1), ‘Direkli taurasian tur’ to ‘Direkli bezoar’ admixture was of greater consequence, with more graphs indicating Direkli bezoar received taurasian tur admixture.

All 12 graphs ‘as good as’ the best fitting graph for m=2 (displayed in Figure 3—figure supplement 8) are available at https://osf.io/3ecqd/, along with the distribution of log likelihood scores for m=0–5 and the best fitting graph for m=3.

Data, script, and code availability

Request a detailed protocol

Raw sequencing reads, aligned QCed final bam files, and mitochondrial fasta files have been deposited in ENA under the project accession PRJEB51668. Admixture graphs ‘as good as’ the best fitting graph are available at https://osf.io/3ecqd/. Capra taurensis has been registered under the Zoobank LSID urn:lsid:zoobank.org:act:1261A42B-B0C0-4571-87F4-8EC3B5381A88. Scripts for extended D calculation/disentangling derived allele sharing are available on GitHub and https://osf.io/3ecqd/.

Data availability

Raw sequencing reads, aligned QCed final bam files, and mitochondrial fasta files have been deposited in ENA under the project accession PRJEB51668. Admixture graphs "as good as" the best fitting graph are available at https://osf.io/3ecqd/. Capra taurensis has been registered under the Zoobank LSID urn:lsid:zoobank.org:act:1261A42B-B0C0-4571-87F4-8EC3B5381A88. Scripts for extended D calculation/disentangling derived allele sharing are available on GitHub, (copy archived at swh:1:rev:f803deabaa929dad5cebeec67bb0ee3b83e3c4a9).

The following data sets were generated
    1. Daly KG
    (2022) European Nucleotide Archive
    ID PRJEB51668. A novel lineage of the Capra genus discovered in the Taurus Mountains using ancient genomics.
    1. Daly KG
    (2022) Open Science Framework
    ID 3ecqd. Direkli Cave Capra lineage.
    1. Daly KG
    (2022) zoobank
    ID urn:lsid:zoobank.org:act:5C272CD4-F4A8-4694-A0DE-4AE9A56B8349. Zoobank LSID of Taurensis Tur.

References

  1. Thesis
    1. Açıkkol A
    (2006)
    Üçağızlı mağarası Faunasının zooarkeolojik açıdan analizi: Capra, Capreolus, Dama ve Cervusların morfometrik açıdan analiz
    Türkiye Cumhuriyeti, Ankara Üniversitesi.
    1. Arbuckle BS
    (2019)
    Zooarchaeology at Epipaleolithic Direkli cave, Kahramanmaraş, Turkey
    Journal of Anatolian Prehistoric Research/Anadolu Prehistorya Araştırmaları Dergisi 5:1–14.
  2. Book
    1. Bökönyi S
    (1977)
    The Animal Remains from Four Sites in the Kermanshah Valley
    British Archaeological Reports.
  3. Book
    1. Castelló JR
    2. Huffman B
    3. Groves C
    (2016)
    Bovids of the World: Antelopes, Gazelles, Cattle, Goats, Sheep, and Relatives
    Princeton University Press.
    1. Churcher CS
    (1994)
    The vertebrate fauna from the natufian level at Jebel es-Saaïdé (Saaïdé II), Lebanon
    Paléorient 20:35–58.
    1. Crégut-Bonnoure E
    (1991)
    Intérêt biostratigraphique de la morphologie dentaire de Capra (Mammalia, Bovidae)
    Annales Zoologici Fennici 28:273–290.
    1. Erek CM
    (2010)
    A new Epi-Paleolithic site in the Northeast Mediterranean region: Direkli cave (Kahramanmaraş, Turkey)
    Adalya 13:1–17.
  4. Book
    1. Gavashelishvili A
    (2009)
    Gis-Based habitat modeling of mountain ungulate species in the Caucasus hotspot
    In: Zazanashvili N, Mallon D, editors. Status and Protection of Globally Threatened Species in the Caucasus. CEPF, WWF. pp. 74–82.
  5. Book
    1. Groves C
    2. Grubb P
    (2011)
    Ungulate Taxonomy
    JHU Press.
  6. Book
    1. Heptner VG
    2. Nasimovich AA
    3. Bannikov AG
    (1961) Mammals of the Soviet Union
    In: Heptner VG, Naumov NP, editors. Vysshaya Shkola. Brill. pp. 1–679.
    https://doi.org/10.2307/1381452
  7. Thesis
    1. Hesse BC
    (1978)
    Evidence for husbandry from the early Neolithic site of Ganj Dareh in western Iran
    Columbia University.
  8. Book
    1. Horwitz LK
    (2003)
    The neolithic fauna
    In: Khalaily H, Marder O, editors. The Neolithic Site of Abu Gosh. The 1995 Excavations. Israeli Antiquities Authority Reports. pp. 87–101.
    1. Kersten AMP
    (1987)
    Age and sex composition of Epipalaeolithic fallow deer and wild goat from Ksar ‘Akil
    Palaeohistoria 29:119–131.
    1. Kökten K
    (1960)
    Anadolu Maras vilayetinde tarihten dip tarihe gidis
    Türk Arkeoloji Dergisi 10:42–53.
    1. MacHugh DE
    2. Edwards CJ
    3. Bailey JF
    4. Bancroft DR
    5. Bradley DG
    (2000)
    The extraction and analysis of ancient DNA from bone and teeth: a survey of current methodologies
    Ancient Biomolecules 3:81–103.
  9. Book
    1. Mashkour M
    2. Amiri S
    3. Fathi H
    4. Khazaeli R
    5. Debue K
    6. Decruyenaere D
    7. Beizaee Doost S
    8. Clavel B
    9. Kamjan S
    10. Jajanidze R
    11. Sauer EW
    (2020)
    Herding and Hunting in the highlands from the Sasanian to Late Medieval Periods. The Archaeozoology of the Dariali Gorge
    In: Sauer EW, editors. Dariali: The “Caspian Gates” in the Caucasus from Antiquity to the Age of the Huns and the Middle Ages: The Joint Georgian-British Dariali Gorge Excavations and Surveys 2013-2016. Oxbow Books. pp. 729–788.
    1. Pfitzenmayer EV
    (1915)
    Nekotoryye interesnyye ublyudki semeistva polorogikh iz zakavkaz ’ Ya
    Izvestiya Kavkazskogo Muzeya 8:251–266.
  10. Software
    1. R Development Core Team
    (2021) R: a language and environment for statistical computing
    R Foundation for Statistical Computing, Vienna, Austria.
    1. Rohland N
    2. Harney E
    3. Mallick S
    4. Nordenfelt S
    5. Reich D
    (2015) Partial uracil-DNA-glycosylase treatment for screening of ancient DNA
    Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 370:20130624.
    https://doi.org/10.1098/rstb.2013.0624
    1. Sarkisov AA
    (1953)
    0 pomesyakh polorogikh (on bovid crossbreeds)
    Priroda 42:113.
  11. Book
    1. Shackleton DM
    (1997)
    Wild Sheep and Goats and Their Relatives: Status Survey and Conservation Action Plan for Caprinae
    IUCN.
  12. Book
    1. Uerpmann HP
    (1987)
    The Ancient Distribution of Ungulate Mammals in the Middle East: Fauna and Archaeological Sites in Southwest Asia and Northeast Africa
    L. Reichert Verlag.
  13. Book
    1. Wilson DE
    2. Reeder DM
    (2005)
    Mammal Species of the World: A Taxonomic and Geographic Reference
    JHU Press.

Article and author information

Author details

  1. Kevin G Daly

    Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    For correspondence
    dalyk1@tcd.ie
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-5579-6144
  2. Benjamin S Arbuckle

    Department of Anthropology, University of North Carolina at Chapel Hill, Chapel Hill, United States
    Contribution
    Conceptualization, Resources, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Competing interests
    No competing interests declared
  3. Conor Rossi

    Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
    Contribution
    Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-4561-8878
  4. Valeria Mattiangeli

    Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
    Contribution
    Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-9785-1714
  5. Phoebe A Lawlor

    Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
    Contribution
    Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
  6. Marjan Mashkour

    1. Centre National de Recherche Scientifique / Muséum national d'Histoire naturelle, Archéozoologie, Archéobotanique, Paris, France
    2. University of Tehran, Bioarchaeology Laboratory, (Central Laboratory), Archaeozoology section, Tehran, Islamic Republic of Iran
    Contribution
    Resources, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-3630-9459
  7. Eberhard Sauer

    School of History, Classics and Archaeology, University of Edinburgh, Edinburgh, United Kingdom
    Contribution
    Resources, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
  8. Joséphine Lesur

    Centre National de Recherche Scientifique / Muséum national d'Histoire naturelle, Archéozoologie, Archéobotanique, Paris, France
    Contribution
    Resources, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
  9. Levent Atici

    Department of Anthropology, University of Nevada, Las Vegas, Las Vegas, United States
    Contribution
    Resources, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-4929-173X
  10. Cevdet Merih Erek

    Department of Archeology, Department of Prehistoric Archeology, Faculty of Letters, Ankara Hacı Bayram Veli University, Ankara, Turkey
    Contribution
    Resources, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-0259-5111
  11. Daniel G Bradley

    Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
    Contribution
    Conceptualization, Supervision, Funding acquisition, Writing – original draft, Writing – review and editing
    For correspondence
    dbradley@tcd.ie
    Competing interests
    No competing interests declared

Funding

University of North Carolina at Chapel Hill (URC grant from the Office of the Vice Chancellor for Research)

  • Benjamin S Arbuckle

H2020 European Research Council (885729-AncestralWeave)

  • Kevin G Daly
  • Conor Rossi
  • Valeria Mattiangeli
  • Daniel G Bradley

H2020 European Research Council (295729-CodeX)

  • Kevin G Daly
  • Conor Rossi
  • Valeria Mattiangeli
  • Daniel G Bradley

H2020 European Research Council (295375-Persia)

  • Eberhard Sauer

Science Foundation Ireland (21/PATH-S/9515)

  • Kevin G Daly

Baylor University (Office of Vice Provost for Research grants)

  • Benjamin S Arbuckle

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Matthew Teasdale and Amelie Scheu for their advice on interpretation of results and helpful discussions. Excavations at Direkli Cave are sponsored by T.C. Kültür ve Turizm Bakanlığı. Permission to export samples from Direkli Cave provided by T.C. Kültür Varlıkları ve Müzeler Genel Müdürlüğü and T.C. Ankara Valılığı, Il Kültür ve Turizm Müdürlüğü, Anadolu Medeniyetleri Müzesi Müdürlüğü (#70583208–160.99(06)–899). We thank the VarGoats consortia for use of the modern tur sequencing data in IBS analyses. Preprint version 5 of this manuscript has been peer-reviewed and recommended by Peer Community In Genomics (https://doi.org/10.24072/pci.genomics.100020), and we thank all of those involved for their time and expertise. Zooarchaeological work at Direkli has been supported by grants from the Office of Vice Provost for Research, Baylor University and a URC grant from the Office of the Vice Chancellor for Research at the University of North Carolina at Chapel Hill (Benjamin S Arbuckle). This work was supported by the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant numbers 885729-AncestralWeave, 295729-CodeX, 295375-Persia and its Neighbours) (Kevin G Daly, Conor Rossi, Valeria Mattiangeli, Daniel G Bradley, Eberhard Sauer); and supported in part by a Grant from Science Foundation Ireland under grant number 21/PATH-S/9515 (Kevin G Daly).

Version history

  1. Preprint posted: April 10, 2022 (view preprint)
  2. Received: August 25, 2022
  3. Accepted: September 9, 2022
  4. Version of Record published: October 3, 2022 (version 1)

Copyright

© 2022, Daly et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 891
    Page views
  • 168
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Kevin G Daly
  2. Benjamin S Arbuckle
  3. Conor Rossi
  4. Valeria Mattiangeli
  5. Phoebe A Lawlor
  6. Marjan Mashkour
  7. Eberhard Sauer
  8. Joséphine Lesur
  9. Levent Atici
  10. Cevdet Merih Erek
  11. Daniel G Bradley
(2022)
A novel lineage of the Capra genus discovered in the Taurus Mountains of Turkey using ancient genomics
eLife 11:e82984.
https://doi.org/10.7554/eLife.82984

Share this article

https://doi.org/10.7554/eLife.82984

Further reading

    1. Cell Biology
    2. Evolutionary Biology
    Jonathan E Phillips, Duojia Pan
    Research Advance

    The genomes of close unicellular relatives of animals encode orthologs of many genes that regulate animal development. However, little is known about the function of such genes in unicellular organisms or the evolutionary process by which these genes came to function in multicellular development. The Hippo pathway, which regulates cell proliferation and tissue size in animals, is present in some of the closest unicellular relatives of animals, including the amoeboid organism Capsaspora owczarzaki. We previously showed that the Capsaspora ortholog of the Hippo pathway nuclear effector Yorkie/YAP/TAZ (coYki) regulates actin dynamics and the three-dimensional morphology of Capsaspora cell aggregates, but is dispensable for cell proliferation control (Phillips et al., 2022). However, the function of upstream Hippo pathway components, and whether and how they regulate coYki in Capsaspora, remained unknown. Here, we analyze the function of the upstream Hippo pathway kinases coHpo and coWts in Capsaspora by generating mutant lines for each gene. Loss of either kinase results in increased nuclear localization of coYki, indicating an ancient, premetazoan origin of this Hippo pathway regulatory mechanism. Strikingly, we find that loss of either kinase causes a contractile cell behavior and increased density of cell packing within Capsaspora aggregates. We further show that this increased cell density is not due to differences in proliferation, but rather actomyosin-dependent changes in the multicellular architecture of aggregates. Given its well-established role in cell density-regulated proliferation in animals, the increased density of cell packing in coHpo and coWts mutants suggests a shared and possibly ancient and conserved function of the Hippo pathway in cell density control. Together, these results implicate cytoskeletal regulation but not proliferation as an ancestral function of the Hippo pathway kinase cascade and uncover a novel role for Hippo signaling in regulating cell density in a proliferation-independent manner.

    1. Evolutionary Biology
    2. Immunology and Inflammation
    Zachary Paul Billman, Stephen Bela Kovacs ... Edward A Miao
    Research Article

    Gasdermins oligomerize to form pores in the cell membrane, causing regulated lytic cell death called pyroptosis. Mammals encode five gasdermins that can trigger pyroptosis: GSDMA, B, C, D, and E. Caspase and granzyme proteases cleave the linker regions of and activate GSDMB, C, D, and E, but no endogenous activation pathways are yet known for GSDMA. Here, we perform a comprehensive evolutionary analysis of the gasdermin family. A gene duplication of GSDMA in the common ancestor of caecilian amphibians, reptiles, and birds gave rise to GSDMA–D in mammals. Uniquely in our tree, amphibian, reptile, and bird GSDMA group in a separate clade than mammal GSDMA. Remarkably, GSDMA in numerous bird species contain caspase-1 cleavage sites like YVAD or FASD in the linker. We show that GSDMA from birds, amphibians, and reptiles are all cleaved by caspase-1. Thus, GSDMA was originally cleaved by the host-encoded protease caspase-1. In mammals the caspase-1 cleavage site in GSDMA is disrupted; instead, a new protein, GSDMD, is the target of caspase-1. Mammal caspase-1 uses exosite interactions with the GSDMD C-terminal domain to confer the specificity of this interaction, whereas we show that bird caspase-1 uses a stereotypical tetrapeptide sequence to confer specificity for bird GSDMA. Our results reveal an evolutionarily stable association between caspase-1 and the gasdermin family, albeit a shifting one. Caspase-1 repeatedly changes its target gasdermin over evolutionary time at speciation junctures, initially cleaving GSDME in fish, then GSDMA in amphibians/reptiles/birds, and finally GSDMD in mammals.