Mitochondrial genomes of Pleistocene megafauna retrieved from recent sediment layers of two Siberian lakes

  1. Peter Andreas Seeber  Is a corresponding author
  2. Laura Batke
  3. Yury Dvornikov
  4. Alexandra Schmidt
  5. Yi Wang
  6. Kathleen Stoof-Leichsenring
  7. Katie Moon
  8. Samuel H Vohr
  9. Beth Shapiro
  10. Laura S Epp  Is a corresponding author
  1. Department of Biology, University of Konstanz, Germany
  2. Agroengineering Department/Department of Landscape Design and Sustainable Ecosystems, Agrarian and Technological Institute, RUDN University, Russian Federation
  3. Laboratory of Carbon Monitoring in Terrestrial Ecosystems, Institute of Physicochemical and Biological Problems of Soil Science of the Russian Academy of Sciences, Russian Federation
  4. Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Polar Terrestrial Environmental Systems, Germany
  5. Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, United States
  6. Howard Hughes Medical Institute, University of California, Santa Cruz, United States
  7. Embark Veterinary, Inc, United States

Abstract

Ancient environmental DNA (aeDNA) from lake sediments has yielded remarkable insights for the reconstruction of past ecosystems, including suggestions of late survival of extinct species. However, translocation and lateral inflow of DNA in sediments can potentially distort the stratigraphic signal of the DNA. Using three different approaches on two short lake sediment cores of the Yamal peninsula, West Siberia, with ages spanning only the past hundreds of years, we detect DNA and identified mitochondrial genomes of multiple mammoth and woolly rhinoceros individuals—both species that have been extinct for thousands of years on the mainland. The occurrence of clearly identifiable aeDNA of extinct Pleistocene megafauna (e.g. >400 K reads in one core) throughout these two short subsurface cores, along with specificities of sedimentology and dating, confirm that processes acting on regional scales, such as extensive permafrost thawing, can influence the aeDNA record and should be accounted for in aeDNA paleoecology.

eLife assessment

This work presents convincing evidence for the presence of wooly mammoth/rhinoceros ancient environmental DNA (aeDNA) far from the time likely to host living individuals: what is effectively a genetic version of a geological inclusion. These are important findings that will have ramifications for the interpretation and conclusions extracted from aeDNA more generally.

https://doi.org/10.7554/eLife.89992.3.sa0

Introduction

Sedimentary deposits constitute highly valuable archives of past ecosystem changes as they contain dateable layers with organismic remains including ancient DNA (aDNA). Such remains are typically assumed to represent the ecosystem of the time around which the respective stratum was deposited. aDNA from sediments has yielded remarkable insights regarding paleoecology, phylogeography, and extirpation and extinction events of keystone taxa such as mammoths (Haile et al., 2009; Boessenkool et al., 2012; Graham et al., 2016; Murchie et al., 2021b). Based on such ancient environmental DNA (aeDNA), a recent study proposed that the woolly mammoth (Mammuthus primigenius) may have survived in Eurasia for much longer than previously assumed, as the authors retrieved mammoth DNA sequences in sediment layers that were approximately 4.6–7 thousand years (kyr) younger than the most recent mammoth fossils Wang et al., 2021; however, in response to this interpretation, Miller and Simpson, 2022 opined that these results may be more likely due to taphonomic processes leading to release of aeDNA from the remains of long-dead organisms from permafrost, where it is well preserved.

Conclusions derived from aeDNA isolated from sediment cores rely on the stratified nature of the remains in question and dating of the respective layer by radiometric methods. However, in theory, various physical and geochemical processes such as translocation of DNA through sediment strata (Haile et al., 2007), re-deposition of older sediment carrying DNA of extinct organisms (Arnold et al., 2011), and preservation bias (Boere et al., 2011) can distort the biological signal of aeDNA and thus bias the accuracy of allocation of taxa to specific time periods (Armbrecht et al., 2019). For lake sediments, deposited under aquatic conditions, studies have suggested that leaching is not a concern (Parducci et al., 2017), but it has been observed in soils and cave sediments (Haile et al., 2007). The question of obtaining last appearance dates of extinct taxa using aeDNA in dated sediment layers (Haile et al., 2009) is under discussion, but the surprisingly young records published so far still date to multiple thousands of years before present and thus lie within a timeframe of possible late survival.

Results and discussion

In 2019, we retrieved short subsurface sediment cores from two Arctic thermokarst lakes (LK-001 and LK-007, located approximately 5 km apart, over massive permafrost; Table 1; Dvornikov et al., 2016) on the Yamal peninsula, Siberia, to extract DNA and assess changes in mammal abundances in the Arctic over the past decades and centuries. From lake LK-001, we collected a secondary core which was sliced in the field at 1 cm steps for Pb210 radiometric dating, which indicated that the sediments at the top of this core were deposited recently, and that the core spanned the past few centuries (Appendix 1—table 7). The cores for DNA extraction were closed in the field immediately after retrieval and were then transported to the dedicated aDNA laboratories of the University of Konstanz, Germany. In this lab facility, established in 2020, no other samples from the Arctic or from any large mammals had been processed previously. From core LK-001, we isolated DNA and produced genomic double-stranded libraries from 23 samples, from 1.5 to 80 cm core depth (Supplement section 1), according to standard procedures. The core was opened and all subsequent steps until index PCR setup were carried out under customary aDNA laboratory conditions. In particular, the core opening facilities and the lab are located in buildings separated from the downstream molecular genetic analyses, the ventilation of the aDNA lab is based on a HEPA filter system and positive air pressure, and the lab is subjected to nightly UV radiation. Work in the lab is conducted under strict aDNA precautions, adhering to established aDNA protocols (Fulton and Shapiro, 2019). We enriched the libraries for mammalian DNA using a custom RNA bait panel produced from complete mitogenome sequences of 17 mammal species that currently or previously occurred in the Arctic (adapted from Murchie et al., 2021a). The enriched libraries were sequenced, and we mapped the sequences against a database of 73 mammal mitogenomes, followed by BLASTn alignment against the complete NCBI nt database. We thus retrieved mitogenomic sequences of mammals that were expected during the age range covered by the core (Appendix 1—table 9), for example, reindeer (Rangifer tarandus), Arctic lemming (Dicrostonyx torquatus), and hare (Lepus); however, throughout the entire core, there were abundant sequences of two species that have been extinct for several thousand years, that is mammoth (Mammuthus primigenius) and woolly rhinoceros (Coelodonta antiquitatis). Twenty-two of the 23 LK-001 libraries produced >1000 reads, each, assigned to Mammuthus, with read counts ranging from 2852 to 72,919 (mean 21,140±17,296). Negative controls (extraction and library blanks) did not produce any reads assigned to mammals. In the sample with the highest M. primigenius read counts (31.5 cm depth, dated to 81 years), the coverage of the reference mitogenome (NCBI accession NC_007596.2) was 95.3%, (434 (±213)-fold). Across all samples, 465,080 reads assigned to mammoth were produced, with 98.3% coverage (2762 to ±1176 fold). Read lengths ranged from 28 to 289 bp (mean 100±44 bp; Figure 1). The number of woolly mammoth reads decreased from lower samples towards the top of the core (Figure 1). Signatures of post-mortem DNA decay were comparably minor (Figure 1), with reference to an M. primigenius genome downloaded from NCBI (accession NC_007596.2), and mapping suggested that the sequences throughout the core originated from multiple individuals. Further analyses of the three libraries with the most mammoth reads using mixemt (Vohr et al., 2017) identified a number of mitochondrial haplogroups in the sequences from the core, suggesting that they originated from a multiple individuals (Figure 2). The haplogroups identified were known to occupy the region, and it seems likely the sequences reflect a history of mammoth occupation at the core site. Twelve of the 23 libraries produced >100 reads, each, assigned to woolly rhinoceros, with a total of 2737 reads and a cumulative coverage of 44% (assembled to NC_012681.1 2). Further analyses are mostly focused on the mammoth sequences as these occurred in substantially higher numbers.

aDNA of Mammuthus in recent lake sediments.

(A) Read counts assigned to Mammuthus (square-root-transformed proportion of the respective number of raw reads per library) after hybridization capture enrichment of aeDNA of core LK-001 (shown are results of 22 libraries; one library was excluded as it did not produce any reads assigned to mammals); square-root transformation of percentage. Indicated are sample depths (in cm; 1.5–80 cm) and approximate ages as per 210Pb chronology (Appendix 1—table 7; to a maximum depth of 39.5 cm). The solid line indicates the general trend. Across the 22 libraries: (B) Fragment length distribution and (C) damage patterns (red indicates C-to-T transitions, blue G-to-A transitions. the Y-axis indicates the percentage of positions with a nucleotide change, the X-axis indicates the position along the fragment).

Locations of the sediment cores of the present study (Yamal peninsula, Siberia) and previously retrieved mammoth remains and their haplo(sub)groups (Appendix 1—table 6).

The bar chart indicates a maximum-likelihood estimate of the haplogroup proportions derived from the reads from the three sediment core libraries with the most mammoth reads.

Table 1
Sediment cores retrieved from two lakes on the Yamal peninsula, Siberia.
LakeCoordinatesm above sea levelAreaWater depthCore length
LK-00170°16'45.6" N, 68°53'02.8" E2838 ha17 m80 cm
LK-00770°16'02.8" N, 68°59'35.7" E3639 ha14 m75 cm

As these results were entirely incongruent with the temporal occurrence of these two species, we employed several methods to confirm their validity, that is conventional PCR and Sanger sequencing of a mammoth cytochrome c oxidase subunit I (COI) fragment, mammal metabarcoding, and droplet digital PCR (ddPCR) of a mammoth cytochrome b fragment. We performed these analyses on core LK-001 and, to evaluate whether this is a locally isolated phenomenon, on a second short core, LK-007, from a nearby lake, which had been opened and processed under the same conditions indicated above to prevent contamination. Additionally, we screened the core LK-001 for plant macrofossils, of which three were sent for 14C AMS dating.

Conventional PCR and Sanger sequencing confirmed amplification of a COI fragment of M. primigenius. Mammal metabarcoding produced mammoth sequences (74 bp) in 13 samples of core LK-001 and 9 samples of core LK-007. In the LK-001 core, the highest M. primigenius read count occurred at 31.5 cm (2,992 reads), and in the LK-007 core at 26 cm M. primigenius (3,580 reads). ddPCR produced M. primigenius sequences in 14 samples of each core. Metabarcoding and ddPCR patterns across the cores were similar, although not completely congruent, as ddPCR appeared to be more sensitive (Appendix 1—figure 1). The dates retrieved by radiocarbon dating were not congruent with the initial age inference suggested by Pb210. While the lowest and topmost sample (with ages of 1547±228 and 339±79 uncal. yrs BP respectively) suggest relatively young ages and agree in their temporal succession, the middle sample, at 51 cm, yielded a radiocarbon age of 8677±132 years.

The mammoth was abundant throughout most of Eurasia during the Pleistocene, but populations declined at the end of the Pleistocene, with the species going extinct in the mid-Holocene (Nogués-Bravo et al., 2008). The youngest fossils from mainland Siberia have been dated to 9650 years (Stuart et al., 2002). The woolly rhinoceros was a cold-adapted megaherbivore, which was abundant from western Europe to north-east Siberia during the Middle to Late Pleistocene (Kahlke and Lacombat, 2008). This species was predominantly grazing, probably resorting to browsing only due to seasonal vegetation restrictions (Kahlke and Lacombat, 2008; Rey-Iglesia et al., 2021; Stefaniak et al., 2021). The reasons underlying the extinction of this species at ca 14 ka BP are not entirely clear, but it is largely attributed to sudden climate warming and subsequent habitat unsuitability due to vegetation changes (Stuart and Lister, 2012; Lord et al., 2020; Wang et al., 2021), likely coupled with human influence (Fordham et al., 2022), in the Bølling-Allerød interstadial warming.

The present data, which implies frequent and abundant Pleistocene megafaunal DNA throughout a sediment core deposited in the lake over the past centuries suggests that physical processes, rather than presence of live organisms, are responsible for the recovery of this DNA. While not in itself fully conclusive, our data suggests the source of the DNA of Pleistocene mammals from older permafrost deposits, either from a carcass or from sedimentary materials carrying the DNA. The numbers of mammoth sequences increased with depth downcore, with comparably low abundances over the most recent 11 cm of the core (aged <30 years), pointing to a decrease of input of this DNA through time. The apparently rather limited extent of aDNA damage in the mammoth sequences suggests that the source of this DNA has been preserved exceptionally well, which also suggests an origin from permafrost, and the specific dynamics of thawing and re-deposition of material in the study area offer an explanation.

Here, the active thermokarst likely began during the climatic optimum of the Holocene (9000–3,000 BC; Savvichev et al., 2021). The LK-001 and LK-007 lake basins are interbedded mainly into the IVth marine plain, formed in the Kazantsevskaya, Marine Isotopic Stage 5. The permafrost of these lake catchments was formed no earlier than 70—60 kyr BP after the Kazantsevskaya transgression in more sub-aerial conditions and with mean annual temperatures 6 to 7 °C lower than modern temperatures (Baulin et al., 1981). As this area was under coastal-marine conditions for a long period, these lake basins may be paleo-marine remnants, or they were formed later as a result of thermokarst over the segregated or tabular (massive) ground ice during the Holocene climatic optimum (Kachurin, 1961). The area is also subject to abrupt permafrost thaw (thermo-denudation), resulting, for example, in the formation of retrogressive-thaw slumps (thermocirques) and the transport of a large amounts of thawed terrestrial material into the lake water (Dvornikov et al., 2018). Such abrupt permafrost thaw processes normally appear adjacent to lakes and can form specific geomorphological elements, that is, thermo-terraces (Kizyakov et al., 2006) within lake catchments and lakes; they are normally polycyclic processes appearing due to the extension of a seasonally thawed layer (active layer) up to the top of massive ice and more favorable thermal conditions within the existing forms (Leibman and Kizyakov, 2007; Kokelj et al., 2009). Traces of past thermo-denudation can be observed within both lake catchments. In catchments of five neighboring lakes, large retrogressive-thaw slumps appeared in recent years (2012–2013) accompanied by the thaw and lateral transport of modern and Late Pleistocene deposits into lakes. Additionally or alternatively, thermo-erosion of upper geomorphological levels and transport via stream networks could transport ancient material into the modern lacustrine sediments. However, the two studied lakes are headwater lakes (with outlet, no apparent inlet) and this option can only be considered in terms of small thermo-erosional valleys within the catchments.

An alternative mechanism for the redistribution of Late Pleistocene material in the sediments is related to subcap methane emission (bubbling) from degrading permafrost beneath the lake bottom. In-lake bubbling can be observed in a circum-Arctic scale: in North-East Siberia, Alaska and Canada. This is common especially in lakes with a depth exceeding two meters, which do not freeze entirely up to the bottom in winter, leading to the formation of a talik (a layer of year-round unfrozen ground that lies in permafrost area). The expansion of the talik may further trigger subcap methane emission, which can reach 40–70 kg yr–1 of pure (94–100%) methane in neighboring lakes (Kazantsev et al., 2020). The constant methane seepage does not allow ice to be formed in winter (whereas the normal winter ice thickness is approximately 1.5 m) and can potentially disturb the stratigraphy of lake sediments. Additionally, dramatic emissions of methane can form craters in terrestrial and lacustrine environments (Dvornikov et al., 2019). In this case, Late Pleistocene sediments will well be re-distributed within the water-body and the entire stratigraphy will be mixed.

In the case reported here, the simultaneous finding of a Pb210 chronology indicating recent sediment deposition and of plant macrofossils that dated to >8000 years BP in a sample from 36.5 cm, suggest lateral input of ancient material, including the mammalian DNA, putatively related to permafrost thawing processes. Numerous studies on aDNA discussed possible leaching through sedimentary strata of the DNA itself, yet it was typically considered not an issue as most of these studies were conducted under stable permafrost or similar conditions (e.g. Hansen et al., 2006; Haile et al., 2007, but see Andersen et al., 2012). Permafrost thawing and re-deposition of material adds a new dimension to this problem of temporal interpretation. The fact that we retrieved mammoth sequences from both cores of two lakes located approximately 5 km apart suggests that this is not an isolated phenomenon but occurs on a regional or even larger scale. Given the wide spread of abrupt permafrost thaw processes in the Arctic plains (West Siberia, Taimyr, Chukotka, Alaska, Canadian Arctic; Kizyakov et al., 2006 and references therein), the phenomena of disturbed stratigraphy of lacustrine sediments can potentially be observed at a pan-Arctic scale. While this indicates that temporal interpretation of sedimentary aeDNA records should be exercised with caution, our study also demonstrates that a careful evaluation of available information on the site and ecosystem in conjunction with the use of independent dating techniques can uncover incongruencies. This is more difficult in older time periods, where artefactual stratigraphies caused by equivalent processes acting for a limited time will not be detected as easily as in our case of long extinct species. The same applies to extant taxa or those which have undergone extinction or extirpation more recently, the presence of which cannot as easily be excluded as in the current example. However, we suggest that the inclusion of robust dating techniques and knowledge of local geophysical processes can provide good arguments to evaluate the reliability of aeDNA records.

Materials and methods

Field sites, DNA isolation and hybridization capture enrichment

Request a detailed protocol

In 2019, sediment cores (6 cm diameter; Table 1) were retrieved from two lakes (LK-001 and LK-007, respectively, Table 1) on the Yamal peninsula, Siberia, using a UWITEC piston corer (UWITEC, Mondsee, Austria). The lakes were located approximately 5 km apart. The cores were transported to the aDNA laboratories of the University of Konstanz, Germany; from lake LK-001, a secondary core was taken which was sliced in the field at 1 cm steps for radiometric dating from 0 to 39.5 cm depth, performed at the Environmental Radioactivity Research Centre, University of Liverpool (Supplement section 9). Additional 14 C dating of three specimens of plant remains, extracted at 36.5 cm, 51 cm, and 74 cm, was performed (Supplement section 9).

Sedimentary DNA was isolated from 23 samples of core LK-001 and from 16 samples of core LK-007 using commercially available kits with modified protocols (Supplement section 1). The extracts of core LK-001 were subjected to library preparation for capture enrichment. Enrichment probes were designed from mitogenomes of 17 herbivorous mammal species that currently or previously occurred in the Arctic (Appendix 1—table 2) and few lichen sequences (Appendix 1—table 3). Genomic libraries were produced according to Li et al., 2013 with some modifications (Seeber et al., 2019; Seeber et al., 2023; Li et al., 2023; Supplement section 2). Filtered reads were mapped to mammalian mitogenomes, followed by BLASTn alignment against the complete NCBI nucleotide database and subsequent metagenomic analyses using MEGAN (Huson et al., 2016). Reads assigned to Mammuthus were mapped to a complete M. primigenius reference mitogenome (NCBI accession NC_007596.2); reads assigned to Coelodonta antiquitatis were mapped to the NCBI reference mitogenome NC_012681.1 2. The reads mapped to mammoth from the top three libraries were assigned to haplogroup using mixemt (https://github.com/svohr/mixemt, Copy archieved at Vohr, 2024) with a custom-made representative panel of 15 mammoth mitogenomes (Figure 2; Appendix 1—table 6).

Conventional PCR, mammal metabarcoding, and ddPCR

Request a detailed protocol

Based on the enriched fragments with the highest coverage, PCR primers specific to M. primigenius were designed using Geneious Prime 2022.1.1 (Kearse et al., 2012), that is mamm801 (5`- CCCATGCAGGAGCTTCTGTAGA-3`) and mamm800r (5`-GACATAGCTGGAGGTTTTATGT-3`) to produce a 121 bp amplicon of the CO1 gene. The specific PCR conditions are described in Supplement section 6. Mammal metabarcoding PCRs were performed on 21 DNA extracts of core LK-001 and 27 samples of core LK-007, with eight independent replicates, each. Each batch of PCRs included one non-template control. Established metabarcoding primers were used (Giguet-Covex et al., 2014), and human blocking primers (Garcés-Pastor et al., 2021) were included. PCR conditions are described in the supplementary material. Sequencing was performed on an Illumina NovaSeq platform, with 2x150 reads. The raw data were processed as described in the supplement. We used the sample LK-001_66.5 of core LK-001 which had produced one of the highest Mammuthus read counts after enrichment. The specific target amplified by the primer pair were used to design a probe (5`-GGATACTCCTGCAAGGTGAAGTG-3`). With this probe, ddPCR was performed with 24 extracts of core LK-001 and with 30 extracts of core LK-007, with three replicates, each (Supplement section 8).

Appendix 1

Sampling and DNA isolation

The sediment cores were subsampled according to procedures described by Epp et al., 2019, extracting approximately 1–2 g sediment from the center of the core at close depth intervals (Appendix 1—table 1). Sampling was performed in a window-less cold room at 4 °C, adhering to standard aDNA precautions. DNA was extracted using two highly similar kits. Sample depths and respective extraction protocols for both cores are listed below (Appendix 1—table 1).

Appendix 1—table 1
Sample depths and extraction protocols.
(a) LK-001(b) LK-007
Sample depth (cm)Extraction protocol*Sample depth (cm)Extraction protocol*
1.5PowerLyzer2.0PowerLyzer
4.04.0PowerSoil Pro
7.06.0PowerLyzer
11.08.0PowerSoil Pro
12.010.0PowerLyzer
15,512.0PowerSoil Pro
18.014.0PowerSoil Pro
21.518.0PowerLyzer
23.020.PowerSoil Pro
28.024.0PowerLyzer
31.526.0PowerSoil Pro
35.030.0PowerSoil Pro
40.032.0PowerSoil Pro
44.036.0PowerLyzer
46.538.0PowerSoil Pro
49.042.0PowerSoil Pro
51.044.0PowerLyzer
62.048.0PowerLyzer
65.550.0PowerLyzer
66.554.0PowerLyzer
69.556.0PowerLyzer
73.560.0PowerLyzer
80.062.0PowerSoil Pro
64.0PowerLyzer
66.0PowerSoil Pro
68.0PowerLyzer
70.0PowerSoil Pro
  1. *

    Kits from Qiagen (Hilden, Germany).

The PowerLyzer DNA extraction was performed according to a previous study (Alsos et al., 2020), with some modifications (Appendix 1—table 2).

Appendix 1—table 2
PowerLyzer DNA extraction.
Day 1:
  • 1. Add 750 μL PowerBead Solution to the PowerBead Tube

  • 2. Transfer 0.25–0.35 g manually homogenized sediment to PowerBead Tube

  • 3. FastPrep bead beating: two times Quickprep protocol (20 s at 4.0 m/s); briefly centrifuge to eliminate foam

  • 4. Lysis mix (per sample):

    • Solution C1 60 µL

    • Proteinase K (20 mg/mL) 2 µL

    • 1 M DTT 25 µL

    • Vortex and briefly centrifuge.

  • 5. Add 87 μL lysis mix to each PowerBead Tube; vortex for 5 min; invert the tube, flick and vortex to dissolve pellet, if present.

  • 6. Incubate overnight at 56 °C and 12 rpm

Day 2:
  • 1. Remove PowerBead Tube from the incubator oven and allow to cool to room temperature

  • 2. Centrifuge at 10,000 x g for 1 min

  • 3. Add 250 μL Solution C2 to a 2 mL collection tube

  • 4. Avoiding the pellet, pour the supernatant from step 2 into the collection tube containing Solution C2; vortex for 5 s

  • 5. Incubate at 2–8 °C for 10 min

  • 6. Centrifuge at 10,000 x g for 1 min

  • 7. Label a new clean 2 mL collection tube and add 250 μL Solution C3

  • 8. Avoiding the pellet, pour up to 800 μL of supernatant to the collection tube containing Solution C3; vortex for 5 s

  • 9. Incubate at 2–8 °C for 10 min

  • 10. Centrifuge at 10,000 x g for 1 min

  • 11. Label a new clean 5 mL collection tube and add 1,400 μL Solution C4

  • 12. Avoiding the pellet, pour 880 μL of supernatant to the collection tube containing Solution C4; vortex for 5 s; briefly centrifuge

  • 13. Load 650 μL of the Solution C4-supernatant mix on a Spin Column

  • 14. Centrifuge the Spin Column at 10,000 x g for 1 min

  • 15. Transfer the Spin Column to a new 2 mL collection tube

  • 16. Repeat steps above until all Solution C4-supernatant mix has been loaded onto the Spin Column

  • 17. Centrifuge the Spin Column at 10,000 x g for 1 min

  • 18. Load 500 μL of Solution C5 on the Spin Column

  • 19. Centrifuge the Spin Column at 10,000 x g for 1 min

  • 20. Transfer the Spin Column to a new 1.5 mL collection tube

  • 21. Centrifuge the Spin Column at 10,000 x g for 1 min

  • 22. Transfer the Spin Column to a new, labelled, and sterile 1.5 mL collection tube

  • 23. Add 65 μL of elution buffer to the center of the filter membrane

  • 24. Incubate at room temperature for 10 min

  • 25. Centrifuge the Spin Column at 10,000 x g for 1 min

  • 26. Repeat steps above for a final elution volume of 120–125 μL

PowerSoil Pro DNA extraction was performed as follows:
Day 1:
  • 1. Spin the PowerBead Pro Tube briefly; add up to 500 mg soil and 800 μL Solution CD1. Vortex briefly to mix. Add 20 µL Proteinase K (2 mg/mL).

  • 2. FastPrep bead beating: two times 20 s at 4.0 m/s; briefly centrifuge to eliminate foam. Incubate at 56 °C overnight.

Day 2:
  • 3. Centrifuge the PowerBead Pro Tube at 15,000 x g for 1 min.

  • 4. Transfer the supernatant to a clean 2 mL microcentrifuge tube.

  • 5. Add 200 μL Solution CD2 and vortex for 5 s.

  • 6. Centrifuge at 15,000 x g for 1 min at room temperature. Transfer up to 700 μL supernatant to a clean 2 mL microcentrifuge tube.

  • 7. Add 600 μL Solution CD3 and vortex for 5 s.

  • 8. Load 650 μL of the lysate on an MB Spin Column and centrifuge at 15,000 x g for 1 min.

  • 9. Discard the flow-through and repeat step 8 to ensure that all of the lysate has passed through the MB Spin Column.

  • 10. Place the MB Spin Column in a clean 2 mL collection tube.

  • 11. Add 500 μL Solution EA to the MB Spin Column. Centrifuge at 15,000 x g for 1 min.

  • 12. Discard the flow-through and place the MB Spin Column back into the same collection tube.

  • 13. Add 500 μL Solution C5 to the MB Spin Column. Centrifuge at 15,000 x g for 1 min.

  • 14. Discard the flow-through and place the MB Spin Column in a new 2 mL collection tube

  • 15. Centrifuge at 16,000 x g for 2 min. Place the MB Spin Column in a new 1.5 mL elution tube.

  • 16. Add 50–100 μL Solution C6 to the center of the white filter membrane; incubate for 5 min, and centrifuge at 15,000 x g for 1 min.

Preparation of genomic libraries for hybridization capture

Double-stranded libraries were prepared from genomic DNA according to the protocol below and were visualized by gel electrophoresis. Each batch of libraries (N=12) was processed with a library blank to control for contamination during library prep. Indexing was performed using individual combinations of P5/P7 index primers.

Appendix 1—table 3
Library preparation.
End repairper library
Mix following components in a sterile low-binding PCR tube
NEBNext End Repair Buffer (10 X)5 µL
NEBNext End Repair Enzyme Mix2.5 µL
genomic DNA42.5 µL
Incubate in a thermal cycler for 30 mins at 20 °C.
Purify using QIAquick/MinElute PCR purification kit. Elution: add 32 µl buffer EB and incubate at 37 °C for 5 min before spinning down the DNA at 13,000 rpm for 1 min.
Adapter ligation
Mix following components in a sterile low-binding PCR tube
Quick Ligation Reaction Buffer (10 X)10 µL
nuclease-free water4 µL
P5/P7 adapter mix (50 µM stock)1 µL
DNA as purified in step above30 µL
Quick T4 DNA ligase4.8 µL
*final adapter concentrations for ancient samples should be between 0.25–0.5 µM.
**it is vital to add ligase after mixing DNA with adaptors.
Incubate for 15 min at 25 °C; purify using a QIAquick/MinElute PCR purification kit. Elution: 42 µL buffer EB and incubate at 37 ºC for 5 min before spinning down the DNA at 13,000 rpm for 1 min.
Fill-In Reaction
Add the following reagents into a low-binding PCR tube
ThermoPol Reaction Buffer5 µL
dNTPs (10 mM)2 µL
Bst DNA polymerase3 µL
DNA as eluted above40 µL
Incubate:
20 mins at 65 °C
20 mins at 80 °C
No purification is needed after this step.
Appendix 1—table 4
Indexing PCR.
ReagentµL
H2O3.45
Platinum Hifi Taq Buffer 10 X (Thermo Fisher Scientific)2.50
dNTPs (25 mM)0.25
bovine serum albumin (New England Biolabs)1.00
MgSO4 (50 mM)1.00
Platinum HiFi (5 U/µL; Thermo Fisher Scientific)0.20
total8.40
Index P5 (10 µM)0.80
Index P7 (10 µM)0.80
template DNA15.00
Thermocycling
°Ct
941 min
9415 s8 cycles
6020 s
6860 s
683 min
20store

Thereafter, libraries were pooled (four libraries/pool) and were purified using a MinElute PCR purification kit (Qiagen; elution: 2x20 µL). Concentrations were measured using a Qubit device (broad-range assay; Thermo Fisher Scientific).

Hybridization capture enrichment and sequencing

To design RNA oligonucleotide baits for hybridization capture, we compiled the mitochondrial genome sequences of 17 selected mammal species (comprising 282,168 nucleotides [nt]; Appendix 1—table 2) and ITS-1 and ITS-2 sequences of Cladonia rangiferina (8546 and 4750 nt, respectively; Appendix 1—table 3). The compiled mitochondrial sequences were submitted to Daicel Arbor Biosciences for custom design of 70 bp baits at threefold tiling, collapsed at 99% identity and 85% overlap, resulting in 9339 unique baits (747,120 nt).

Each purified library pool containing four libraries was subjected to one hybridization capture reaction, apart from libraries produced from extraction blanks (N=15), library blanks (N=15), and PCR non-template controls (N=30), all of which were pooled in one reaction due to the minute DNA concentrations (mostly below detection range).

Hybridization capture was performed according to the MYbaits protocol v. 5.00 (Daicel Arbor Biosciences) with the following modifications: a total amount of 150 ng baits per capture reaction was used, and library pools were incubated for hybridization with the baits at 58 °C for 24 hr. Post-capture, the enriched libraries were removed from the beads and were amplified similarly to the PCR protocol used for the indexing PCR and with primers IS5/IS6 (Illumina, San Diego, CA, USA) and 14 cycles. PCR products were then purified using the MinElute PCR Purification Kit (Qiagen), and the final library concentration was measured on an Agilent Bioanalyzer. For sequencing, the enriched libraries were pooled at equal concentrations, and two library pools (produced from 57 and 110 libraries, respectively) were sequenced on an Illumina NovaSeq platform using three Novaseq SP 2x150 flow cells.

Appendix 1—table 5
Mitogenome templates of 17 mammal species for design of RNA baits.
OrderSpeciesNCBI accession# of baits
ArtiodactylaBison bisonNC_012346.1446
Bos primigeniusNC_020746.1456
Saiga tataricaNC_013996.1538
Ovis canadensisNC_015889.1522
Ovibos moschatusNC_020631.1542
Cervus elaphusNC_007704.2523
Rangifer tarandusNC_007703.1526
Alces alcesNC_020677.1520
Camelus ferusNC_009629.2583
PerissodactylaEquus przewalskiiNC_024030.1575
Coelodonta antiquitatisNC_012681.1571
LagomorphaLepus arcticusNC_044769.1586
Ochotona collarisNC_003033.1591
ProboscideaMammuthus primigeniusNC_007596.2588
EulipotyphlaSorex tundrensisNC_025327.1584
RodentiaCastor canadensisNC_033912.1584
Dicrostonyx torquatusNC_034646.1575
9,310
Appendix 1—table 6
Cladonia rangiferina sequences for RNA bait design (shown are the NCBI accessions).
ITS-1
MN756840.1; DQ394367.1; JQ695919.1; MK179592.1; KP031549.1; KP001202.1; AF458306.1; KT792792.1; MK811970.1; KT792788.1; MK508944.1; GU169225.1; KP001197.1; KP001201.1; MK812260.1; MK811708.1; KY119381.1; MK508952.1; KT792789.1; EU266113.1; KY266884.1; KP001192.1; KP001190.1; JQ695918.1; KT792790.1; JQ695920.1; KP001191.1; AF458307.1; KP001200.1; MK508943.1; MK812460.1; MK508937.1; KT792791.1
resulting in 23 baits
ITS-2
KT792789.1; KP001190.1; AF458307.1; JQ695919.1; MK179592.1; DQ394367.1; KP001194.1; KP001193.1; KY266884.1; KP001199.1; KP001192.1; MK300750.1; MN756487.1; MK508937.1; KP001200.1; MK812460.1; MK508943.1; KP001191.1; KP031549.1; KP001202.1; AF458306.1; MK811970.1; KY119381.1; KP001198.1; MK812260.1; MK811708.1; GU169225.1; JQ695918.1; KP001201.1
resulting in 6 baits

Bioinformatics of capture enrichment data

Adapter sequences were removed and sequences were filtered using leeHom with the default ancient DNA settings (Renaud et al., 2014). Duplicates were removed, and filtered reads were mapped against a database of 73 mammal mitogenomes (Appendix 1—table 4) using BWA v. 0.7.17 (Li and Durbin, 2009) and were blastn-aligned against the complete NCBI Genbank nucleotide database (Benson et al., 2009; retrieved March 14 2022) with a maximum e-value of 0.01. All blast output files were processed using MEGAN Community Edition 6.19.8 using a weighted LCA algorithm (Huson et al., 2016). aDNA degradation was examined using mapDamage 2.2.1 (Ginolhac et al., 2011) against a reference genome of M. primigenius (NC_007596.2). The respective command lines are shown in Appendix 1—table 5. The reads blast-assigned to M. primigenius were mapped to the complete M. primigenius reference mitogenome to produce consensus sequences using Geneious Prime 2023.0.1 (Kearse et al., 2012).

Appendix 1—table 7
Reference mitogenomes used for mapping.
NC_020679.1Antilocapra americanaNC_018783.1Equus ovodovi
NC_012346.1Bison bisonHM118851.1Equus hemionus
NC_020746.1Saiga tataricaMK982180.1Equus asinus
NC_013996.1Bos primigeniusNC_012681.1Coelodonta antiquitatis
NC_015889.1Ovis canadensisNC_007596.2Mammuthus primigenius
NC_020630.1Oreamnos americanusFR691686.1Castor fiber
NC_020631.1Ovibos moschatusNC_033912.1Castor canadensis
NC_027233.1Bison priscusNC_034313.1Dicrostonyx groenlandicus
NC_009629.2Camelus ferusNC_034646.1Dicrostonyx torquatus
KR822422.1Camelops cf. hesternusJN181159.1Peromyscus leucopus
NC_013836.1Cervus elaphus xanthopygusNC_006853.1Bos taurus
NC_007704.2Cervus elaphusKM093871.1Capra hircus
NC_013840.1Cervus elaphus yarkandensisNC_015241.1Microtus fortis fortis
KP405229.1Alces alces cameloidesKP200876.1Vulpes lagopus
NC_020677.1Alces alcesHM236180.1Ovis aries
NC_020729.1Odocoileus hemionusKT448275.1Canis latrans
NC_015247.1Odocoileus virginianusJN632610.1Capreolus capreolus
NC_007703.1Rangifer tarandusKJ681493.1Capreolus pygargus
KY987554.1Platygonus compressusJN632629.1Dama dama
NC_002008.4Canis lupus familiarisKM982549.1Lynx lynx
NC_009686.1Canis lupus lupusKP202265.1Panthera pardus
NC_013445.1Cuon alpinusNC_026460.1Rhinolophus macrotis
NC_026529.1Vulpes lagopusY07726.1Ceratotherium simum
NC_028302.1Panthera leoNC_005089.1Mus musculus
NC_022842.1Panthera oncaAM711900.1Meles meles
NC_010642.1Panthera tigrisKM091450.1Mustela erminea
NC_014456.1Lynx rufusNC_005358.1Ochotona princeps
NC_020642.1Martes americanaNC_012095.1Sus scrofa domesticus
NC_020641.1Neovison visonDQ480489.1Canis lupus familiaris
NC_024942.1Mustela nigripesNC_020670.1Crocuta crocuta
NC_020664.1Martes pennantiNC_011116.1Arctodus simus
NC_020639.1Mustela nivalisNC_027963.1Sorex araneus
NC_009685.1Gulo guloNC_025327.1Sorex tundrensis
NC_011112.1Ursus spelaeusKJ397607.1Lepus arcticus
NC_003426.1Ursus americanusNC_001640.1Equus caballus
NC_003427.1Ursus arctosNC_024030.1Equus przewalskii
NC_003428.1Ursus maritimus
Appendix 1—table 8
Command lines and software used for initial mapping, processing, and taxonomic assignment.
adapter trimming, filtering, and merging: leeHom Renaud et al., 2014:src/leeHom -t 120 –ancientdna –auto -fq1 file_R1.fastq.gz -fq2 file_R2.fastq.gz -fqo leehom_out
mapping: BWA Li and Durbin, 2009 against 75 mammal mitogenomes Appendix 1—table 3:bwa index reference_mitogenomes.fasta
bwa aln reference_mitogenomes.fasta leehom_out.fq.gz -l 16,000 n 0.01 -O 2 -o 2 t 8>bwa_out.sai
bwa samse reference_mitogenomes.fasta bwa_out.sai leehom_out.fq.gz>bwa_out.sam
samtools view -q ≥ 30 S -b bwa_out.sam>bwa_out.bam
remove duplicates: samtools Li and Durbin, 2009samtools collate -o bwa_out_col.bam bwa_out.bam
samtools fixmate -m bwa_out_col.bam bwa_out_fixmate.bam
samtools sort -o bwa_out_pos.bam bwa_out_fixmate.bam
samtools markdup bwa_out_pos.bam bwa_out_mark.bam
samtools fastq bwa_out_mark.bam>bwa_out.fastq
fastq to fastased -n ‘1~4 s/^@/>/p;2~4 p’ bwa_out.fastq>bwa_out.fasta
#alignment: blastn Altschul et al., 1990
blastn -db ncbi_nt -query bwa_out.fasta -evalue 0.01 -out
blastn_out.fasta
aDNA damage: mapDamage Ginolhac et al., 2011bwa index reference.fasta
bwa aln reference.fasta sample.fasta >sample.sai
bwa samse referecne.fasta sample.sai sample.fasta >sample.sam samtools view -q 25 S -b sample.sam >sample.bam
mapDamage -i sample.bam -r reference.fasta
mapDamage -d results_ sample/ -y 0.1 --plot-only
mapDamage -i sample.bam -r reference.fasta –rescale
mapDamage -d results_sample / --forward --stats-only -v -r reference.fasta

Mammoth mitogenome haplogroup identification

The sequences mapping to mammoth from the libraries with the most mammoth reads were remapped to the mammoth reference (NC_007596) with minimap2 (https://github.com/lh3/minimap2, Copy archieved at Li, 2024) to generate a bam file. A panel of 15 mitogenome reference sequences (see Appendix 1—table 5) exemplifying the diversity of mammoths was used to identify variants representative of each haplogroup. In cases where there are multiple reference genomes for a haplogroup, the consensus variants unique to the haplogroup were used except for haplogroups B and D&E where haplosubgroups were also identified. mixemt (Vohr et al., 2017, https://github.com/svohr/mixemt) was then used to detect haplogroups present in the core and estimate the mixture proportions of these haplogroups, using only mapped reads with a map quality of >30 and with the requirement that 40% of the unique defining variants of each haplogroup must be observed to say it is present.

Appendix 1—table 9
Reference mammoth mitogenomes used for panel.
CladeHaplogroupSubgroupAccession numbersProportion of reads assigned to haplogroup
IIAEU153451, EU153450-
IIIBB0KX027526, KX027531-
B1KX0275260.1615
B2KX027531-
ICKX027498, KX027565, KX027567, JF912200, KX027499, KX027502-
ID&ED0DQ316067, EU153454, EU153447, EU153456, EU153449, EU153455,
EU153446
-
D1DQ316067-
D2EU153454-
D3EU1534470.2790
D4EU153456-
D5EU1534490.0765
D6EU1534550.4829
D7EU153446-
IFKX027503, KX027512, NC_015529, KX027511, KX027548, KX027547, KX027556, KX027559, KX027550-
Krestovka mammothKPRJEB42269 (European Nucleotide Archive)-
Conventional PCR
94 °C2 min
94 °C30 s55 cycles
54 °C30 s
68 °C20 s
68 °C1 min

Conventional PCR

The PCR reaction mix comprised 17.05 µL H2O, 0.2 µL Platinum Taq DNA-Polymerase High Fidelity (Thermo Fisher Scientific, Waltham, MA, USA), 2.5 µL 10 X reaction buffer, 0.25 µL dNTPs (25 mM), 1 µL bovine serum albumin (20 mg/mL; New England Biolabs, Ipswitch, MA, USA), 1 µL MgSO4 (50 mM), 1 µL of each primer (mamm801 and mamm800r; 10 µM), and 1 µL DNA extract. Thermocycling was performed as shown in Appendix 1—table 9.

Amplification products were visualized by gel electrophoresis and were purified from excised gel bands using a NucleoSpin Gel and PCR Clean-up kit (Macherey-Nagel, Düren, Germany), followed by Sanger sequencing.

Mammal metabarcoding

We used the primers MamP007F and MamP007R (Giguet-Covex et al., 2014) and blockers (Garcés-Pastor et al., 2021), with the following conditions:

Appendix 1—table 10
Conventional PCR.
ReagentµL
H2O14.85
Platinum Hifi Taq Buffer 10 X (Thermo Fisher Scientific)2.5
bovine serum albumin (New England Biolabs)0.2
MgSO4 (50 mM)1.00
dNTPs (25 mM)0.25
Platinum HiFi (5 U/µL; Thermo Fisher Scientific)0.20
Blocking primer R (1 µM)0.5
Blocking primer F (1 µM)0.5
total20.00
template DNA3.00
Thermocycling
°Ct
945 min
9430 s40 cycles
5030 s
6830 s
6810 min
RTstore

PCR products were purified using a MinElute kit (Qiagen). Sequencing was performed on an Illumina NovaSeq platform, with 2x150 reads. The raw data were processed as described below.

Bioinformatics of mammal metabarcoding data

The obtained datasets by Illumina Sequencing were then processed as follows:

Appendix 1—table 11
Conventional PCR.
Load dataobi import --quality-sanger file_R1.fastq reads1
obi import --quality-sanger file_R2.fastq reads2
Import tagsobi import --ngsfilter-input taglist.txt ngsfile
Align paired-end readsobi alignpairedend -R reads2 reads1 aligned_reads
Grep entries whose mode are alignmentobi grep -a mode:alignment aligned_reads good_sequences
Assign alignments to individual PCRsobi ngsfilter -t ngsfile -u unidentified_sequences good_sequences identified_sequences
Filter out sequencesobi grep -p "sequence[’score']>50" identified_sequences identified_sequences_filtered
obi grep -p "sequence[’score_norm']>0.9"
identified_sequences_filtered identified_sequences_filtered_adj
Dereplicate Sequencesobi uniq -m sample identified_sequences_filtered_adj dereplicated_sequences_filtered
Keep only useful tagsobi annotate -k COUNT -k MERGED_sample
dereplicated_sequences_filtered cleaned_metadata_sequences
Discard sequences that are shorter than 60 bp (based on primer pair)obi grep -p "len(sequence) ≥ 60 and sequence['COUNT'] ≥ 10" cleaned_metadata_sequences denoised_sequences
Clean the sequences from PCR/sequencing errorsobi clean -s MERGED_sample -r 0.05 H denoised_sequences cleaned_sequences
Load databasecp STD_MAM_1.dat.gz ~/edna_LauraB/master/mammalia/database/
Import it into DMSobi import /data/scc/edna/LauraBa/master/mammalia/database/STD_MAM_1.dat.gz database_mam
obi import --embl EMBL embl_refs
Download the taxonomywget https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz
Import the taxonomy in the DMSobi import --taxdump /data/scc/edna/LauraBa/master/mammalia/taxdump.tar.gz taxonomy/my_tax
Cleaning the database with in silico PCRobi ecopcr -e 3 l 50 L 150 F CGAGAAGACCCTATGGAGCT -R CCGAGGTCRCCCCAACC --taxonomy taxonomy/my_tax embl_refs mam_refs
Filter sequencesobi grep --require-rank=species --require-rank=genus --require-rank=family --taxonomy taxonomy/my_tax mam_refs
mam_refs_clean
Dereplicate identical sequencesobi uniq --taxonomy taxonomy/my_tax mam_refs_clean
mam_refs_uniq
Add taxid at the family levelobi grep --require-rank=family --taxonomy taxonomy/my_tax mam_refs_uniq
Build the reference databaseobi build_ref_db -t 0.97 --taxonomy taxonomy/my_tax mam_refs_uniq_clean mam_db_97
Assign each sequence to a taxonobi ecotag -m 0.97 --taxonomy taxonomy/my_tax -R mam_db_97 cleaned_sequences assigned_sequences
Align the sequencesobi align -t 0.95 assigned_sequences aligned_assigned_sequences
Export tables for downstream data analysisobi grep -A SCIENTIFIC_NAME assigned_sequences
assigned_for_metabR
Output two tables required by metabaRobi annotate -k MERGED_sample assigned_for_metabR assigned_for_metabR_reads_tableobi export --tab-output --output-na-string 0
assigned_for_metabR_reads_table >mam_reads_01.txtobi annotate --taxonomy taxonomy/my_tax \
--with-taxon-at-rank superkingdom \
--with-taxon-at-rank kingdom \
--with-taxon-at-rank phylum \
--with-taxon-at-rank subphylum \
--with-taxon-at-rank class \
--with-taxon-at-rank subclass \
--with-taxon-at-rank order \
--with-taxon-at-rank suborder \
--with-taxon-at-rank infraorder \
--with-taxon-at-rank superfamily \
--with-taxon-at-rank family \
--with-taxon-at-rank genus \
--with-taxon-at-rank species \
--with-taxon-at-rank subspecies \ assigned_for_metabR assigned_for_metabR_taxInfo
obi annotate \
-k BEST_IDENTITY -k TAXID -k SCIENTIFIC_NAME -k COUNT -k seq_length \
-k superkingdom_name \
-k kingdom_name \
-k phylum_name \
-k subphylum_name \
-k class_name \
-k subclass_name \
-k order_name \
-k suborder_name \
-k infraorder_name \
-k superfamily_name \
-k family_name \
-k genus_name \
-k species_name \
assigned_for_metabR_taxInfo assigned_for_metabR_motus
obi export --tab-output assigned_for_metabR_motus >mam_motus_01.txt
Further processing of the data sets was done using RStudio.
Editing files for metabarreads<- dt_reads %>%
dplyr::select(-c("DEFINITION", "NUC_SEQ"))%>% as.data.frame() %>%
janitor::row_to_names(row_number = 897, remove_rows_above = FALSE, remove_row = TRUE) %>% mutate_if(is.integer,as.numeric)
Assign name to first columnreads <- cbind(rownames(reads),reads)
rownames(reads) <- NULL
colnames(reads) <- c(names(reads))
colnames(reads)(1) <- "pcr_id"
Edit the names of the columnreads$pcr_id = strsplit(reads$pcr_id,"[.]") reads$pcr_id = sapply(reads$pcr_id, function(x) x[length(x)]) rownames(reads)<- reads$pcr_id
Organizing the MOTUs tablemotus<- dplyr::select(dt_motus, 'ID', 'NUC_SEQ', 'COUNT','BEST_IDENTITY', 'TAXID', 'SCIENTIFIC_NAME', ’superkingdom_name', ’species_name', 'class_name', 'order_name', 'family_name', 'genus_name', 'kingdom_name', 'phylum_name', ’subphylum_name', ’subclass_name', ’suborder_name')
names(motus)[names(motus) == 'NUC_SEQ'] <- ’sequence'

Droplet digital PCR (ddPCR)

ddPCR was performed using a Bio-Rad QX200 system (Bio-Rad Hercules, CA, USA) using the following conditions, and data were produced and analyzed as per the standard instructions of the manufacturer. Droplets were generated by using 21 µL of the PCR mixture and 70 µL ddPCR Droplet Reader Oil (Bio-Rad) according to the manufacturer’s instructions. The final volume for PCR amplification was 40 µL and was carried out in a C1000 Touch Thermo cycler (Bio-Rad). The products were analyzed using a QX200 Droplet Reader (Bio-Rad); the threshold was set manually to 3000.

Appendix 1—table 12
Conventional PCR.
ReagentµL
ddPCR Supermix for probes11
H2O (DEPC)6.8
20 x Target-Primers/Probe (FAM)1.1
20 x Target-Primers/Probe (HEX)1.1
total20
template DNA2.0
Thermocycling
°Ct
9510 min
9430 sec40 cycles
5030 sec
6030 sec
Appendix 1—figure 1
Comparison of transformed metabarcoding read counts (A) and ddPCR concentration estimations (B) of Mammuthus primigenius DNA in sediment cores LK-001 and LK-007 from the Yamal peninsula.

Sediment dating

The dating sections of core LK-001 (Appendix 1—table 13) were dried to constant weight at 65 °C and were then analyzed for 210Pb, 226Ra, and 137Cs by direct gamma assay at the Liverpool University Environmental Radioactivity Laboratory using Ortec HPGe GWL series well-type coaxial low background intrinsic germanium detectors. This allowed dating to a depth of 39.5 cm (Appendix 1—table 13).

Appendix 1—table 13
210Pb chronology of Yamal lake sediment core LK-001.
DepthChronologySedimentation Rate
DateAge
cmg cm–2ADy±g cm–2 y–1cm y–1
0.000.0201900
0.250.22018110.340.37
1.251.12016320.340.37
2.251.92013620.340.37
3.252.82010920.340.37
4.253.820081120.340.37
5.254.720051420.340.37
6.255.720021720.340.37
7.256.719992030.360.37
8.257.619962330.390.41
10.259.719922730.490.47
12.2511.819883140.530.49
14.2514.019843540.550.51
14.7514.519833640.550.51
15.5015.419813840.550.51
16.5016.519803940.550.51
17.5017.619784140.550.51
18.5018.619764340.550.51
20.5020.719714850.550.51
22.5022.719685160.550.51
23.5023.719665360.550.51
24.5024.819645560.550.51
25.5025.919625770.440.42
26.5027.019615870.370.36
27.5028.019576280.300.29
28.5029.019526790.280.22
29.5030.2194970100.230.19
31.5032.6193881100.230.19
33.5034.7193386100.230.19
35.5036.9192792110.230.19
37.5039.11913106130.230.19
39.5041.41895124170.230.19

Core LK-001 was additionally dated using radiocarbon. We collected plant remains throughout the core by first using a scalpel to cut 1 cm thick of bulk sediment at desired depths, followed by washing the sediment chunks with distilled water and filtering using 190 µm mesh sieves. We then collected leaves of deciduous trees and seeds, which were placed in glass vials and dried under a fume hood before sending to the Micadas (Alfred Wegener Institute, Bremerhaven, Germany) for radiocarbon dating. Sample preparation and measurement methodologies at the Micadas were performed as described previously (Mollenhauer et al., 2021). We sent three samples that had sufficient plant materials for dating and retained two dates after removing an inaccurate date caused by high inbuilt age from plant remains (Appendix 1—table 14).

Appendix 1—table 14
Radiocarbon dating results.
Sample labelF14C± (abs)Age (y)± (y)Weight (µg C)Comment
LK-001_36.5 cm0.95950.00953327935
LK-001_51 cm0.33960.00568677132148Off; lateral input
LK-001_74 cm0.82480.0234154722813

Mammal read numbers

The numbers of reads assigned to all mammals are indicated in Appendix 1—table 15 (core LK-001 only; hybridization capture experiment).

Appendix 1—table 15
Numbers of sequences assigned to mammals (core LK-001).
depth
sum1.54.07.011.012.015.518.021.523.028.031.535.040.046.549.051.062.065.566.569.573.580.0
Mammuthus primigenius19,64019116252429012056051946662778003697305118813741923631674519111010838141408
Rangifer tarandus18,05525363-3432868864418165459022186124015711574862842714563459911141,621
Dicrostonyx torquatus16,8702111123411112116142711035522104985012983147611933186562002763101114081928
Lepus63711481341052358424540369336630382231404799285141116126352330293298
Coelodonta antiquitatis273733--2811915334755-9712846729814127103152-2501444191
Homo sapiens387------38217--23-25-69-15-----
Ovibos moschatus moschatus145-69-------38---38--------
Castor fiber135----18----15---2751-61---17
Bos105--------------105-------
Cervinae92-------33----3128--------
Saiga tatarica91---------57-34----------
Sus scrofa cristatus85-------34------24----27--
Ochotona74-----------------14---60
Ovis aries musimon70----70-----------------
Bison54----24-30---------------
Crocuta crocuta53------------------53---
Chrotopterus auritus43-----43----------------
Sorex tundrensis32---------------32------
Equus32-----------------7---25
Peromyscus maniculatus bairdii29-------------29--------
Murinae28--------------------28-
Capra21---------------21------
Myopus schisticolor18-------18--------------
Lemmus trimucronatus1616---------------------

Data availability

Sequence data of the hybridization capture were made available as an NCBI BioProject (PRJNA1082062).

The following data sets were generated
    1. Seeber PA
    (2024) NCBI BioProject
    ID PRJNA1082062. Mitochondrial genomes of Pleistocene megafauna retrieved from recent sediment layers of two Siberian lakes.

References

  1. Book
    1. Baulin VV
    2. Chekhovskiy AL
    3. Sukhodolskiy SE
    (1981)
    Main stages of Permafrost development in the North-east of Western part of USSR and Western Siberia
    In: Dubikov GI, Baulin VV, editors. History of Development of Permafrost in Eurasia: On the Examples of Separate Regions. Nauka. pp. 41–60.
    1. Dvornikov YA
    2. Leibman MO
    3. Heim B
    4. Bartsch A
    5. Haas A
    6. Khomutov AV
    (2016)
    Geodatabase and WebGIS project for long-term permafrost monitoring at the Vaskiny Dachi Research Station, Yamal, Russia
    Polarforschung 85:107–115.
  2. Book
    1. Kachurin SP
    (1961)
    Thermokarst on the USSR Territory
    Academy of Sciences USSR.
    1. Kizyakov AI
    2. Leibman MO
    3. Perednya DD
    (2006)
    Destructive relief-forming processes at the coasts of the Arctic plains with tabular ground ice
    Kriosf Zemli 10:79–89.
  3. Book
    1. Leibman MO
    2. Kizyakov AI
    (2007)
    Cryogenic Landslides of Yamal and Yugorskiy Peninsulas
    Rosselkhozakademia.

Article and author information

Author details

  1. Peter Andreas Seeber

    Department of Biology, University of Konstanz, Konstanz, Germany
    Contribution
    Conceptualization, Data curation, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    For correspondence
    seeber.pa@gmail.com
    Competing interests
    Employed by and owns stock of Thermo Fisher Scientific
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-4401-8248
  2. Laura Batke

    Department of Biology, University of Konstanz, Konstanz, Germany
    Contribution
    Formal analysis, Methodology
    Competing interests
    No competing interests declared
  3. Yury Dvornikov

    1. Agroengineering Department/Department of Landscape Design and Sustainable Ecosystems, Agrarian and Technological Institute, RUDN University, Moscow, Russian Federation
    2. Laboratory of Carbon Monitoring in Terrestrial Ecosystems, Institute of Physicochemical and Biological Problems of Soil Science of the Russian Academy of Sciences, Pushchino, Russian Federation
    Contribution
    Writing – review and editing
    Competing interests
    No competing interests declared
  4. Alexandra Schmidt

    Department of Biology, University of Konstanz, Konstanz, Germany
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-9262-0941
  5. Yi Wang

    Department of Biology, University of Konstanz, Konstanz, Germany
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
  6. Kathleen Stoof-Leichsenring

    Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Polar Terrestrial Environmental Systems, Potsdam, Germany
    Contribution
    Resources
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-6609-3217
  7. Katie Moon

    1. Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, United States
    2. Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, United States
    Contribution
    Formal analysis, Writing – review and editing
    Competing interests
    No competing interests declared
  8. Samuel H Vohr

    Embark Veterinary, Inc, Boston, United States
    Contribution
    Formal analysis, Writing – review and editing
    Competing interests
    Employed by Embark Veterinary, Inc
  9. Beth Shapiro

    1. Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, United States
    2. Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, United States
    Contribution
    Writing – review and editing
    Competing interests
    No competing interests declared
  10. Laura S Epp

    Department of Biology, University of Konstanz, Konstanz, Germany
    Contribution
    Supervision, Funding acquisition, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    laura.epp@uni-konstanz.de
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-2230-9477

Funding

Deutsche Forschungsgemeinschaft (EP-98/3-1)

  • Laura S Epp

Biodiversa+ (BiodivScen ERA-Net COFUND)

  • Laura S Epp

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This research was funded through the 2017–2018 Belmont Forum and BiodivERsA joint call for research proposals, under the BiodivScen ERA-Net COFUND program, and with the funding organizations Deutsche Forschungsgemeinschaft (DFG grant EP-98/3–1 to LSE), Agence Nationale de la Recherche (ANR), Research Council of Norway (NFR), the Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning (Formas), Academy of Finland, National Science Foundation (NSF) and the Natural Sciences and Engineering Research Council of Canada (NSERC-CRSNG). Field work was enabled through a grant of the Young Scholar Fund of the University of Konstanz to LSE. YD was supported by the RUDN University Strategic Academic Leadership Program. Expedition logistics were supported by the Department of Science and Innovations of Yamalo-Nenets Autonomous Okrug and the Non-profit organization «Russian Center of Arctic Exploration». We thank PD Dr. Elena Marinova for assistance and discussions concerning dating of plant macrofossils and Patrick Bartolin for assistance in the wet lab.

Version history

  1. Preprint posted: June 16, 2023 (view preprint)
  2. Sent for peer review: June 28, 2023
  3. Preprint posted: September 1, 2023 (view preprint)
  4. Preprint posted: February 2, 2024 (view preprint)
  5. Version of Record published: March 15, 2024 (version 1)

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.89992. This DOI represents all versions, and will always resolve to the latest one.

Copyright

© 2023, Seeber et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,310
    views
  • 126
    downloads
  • 2
    citations

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Peter Andreas Seeber
  2. Laura Batke
  3. Yury Dvornikov
  4. Alexandra Schmidt
  5. Yi Wang
  6. Kathleen Stoof-Leichsenring
  7. Katie Moon
  8. Samuel H Vohr
  9. Beth Shapiro
  10. Laura S Epp
(2024)
Mitochondrial genomes of Pleistocene megafauna retrieved from recent sediment layers of two Siberian lakes
eLife 12:RP89992.
https://doi.org/10.7554/eLife.89992.3

Share this article

https://doi.org/10.7554/eLife.89992

Further reading

    1. Computational and Systems Biology
    2. Genetics and Genomics
    Ardalan Naseri, Degui Zhi, Shaojie Zhang
    Research Article Updated

    Runs-of-homozygosity (ROH) segments, contiguous homozygous regions in a genome were traditionally linked to families and inbred populations. However, a growing literature suggests that ROHs are ubiquitous in outbred populations. Still, most existing genetic studies of ROH in populations are limited to aggregated ROH content across the genome, which does not offer the resolution for mapping causal loci. This limitation is mainly due to a lack of methods for the efficient identification of shared ROH diplotypes. Here, we present a new method, ROH-DICE (runs-of-homozygous diplotype cluster enumerator), to find large ROH diplotype clusters, sufficiently long ROHs shared by a sufficient number of individuals, in large cohorts. ROH-DICE identified over 1 million ROH diplotypes that span over 100 single nucleotide polymorphisms (SNPs) and are shared by more than 100 UK Biobank participants. Moreover, we found significant associations of clustered ROH diplotypes across the genome with various self-reported diseases, with the strongest associations found between the extended human leukocyte antigen (HLA) region and autoimmune disorders. We found an association between a diplotype covering the homeostatic iron regulator (HFE) gene and hemochromatosis, even though the well-known causal SNP was not directly genotyped or imputed. Using a genome-wide scan, we identified a putative association between carriers of an ROH diplotype in chromosome 4 and an increase in mortality among COVID-19 patients (p-value = 1.82 × 10−11). In summary, our ROH-DICE method, by calling out large ROH diplotypes in a large outbred population, enables further population genetics into the demographic history of large populations. More importantly, our method enables a new genome-wide mapping approach for finding disease-causing loci with multi-marker recessive effects at a population scale.

    1. Chromosomes and Gene Expression
    2. Genetics and Genomics
    Lisa Baumgartner, Jonathan J Ipsaro ... Julius Brennecke
    Research Advance

    Members of the diverse heterochromatin protein 1 (HP1) family play crucial roles in heterochromatin formation and maintenance. Despite the similar affinities of their chromodomains for di- and tri-methylated histone H3 lysine 9 (H3K9me2/3), different HP1 proteins exhibit distinct chromatin-binding patterns, likely due to interactions with various specificity factors. Previously, we showed that the chromatin-binding pattern of the HP1 protein Rhino, a crucial factor of the Drosophila PIWI-interacting RNA (piRNA) pathway, is largely defined by a DNA sequence-specific C2H2 zinc finger protein named Kipferl (Baumgartner et al., 2022). Here, we elucidate the molecular basis of the interaction between Rhino and its guidance factor Kipferl. Through phylogenetic analyses, structure prediction, and in vivo genetics, we identify a single amino acid change within Rhino’s chromodomain, G31D, that does not affect H3K9me2/3 binding but disrupts the interaction between Rhino and Kipferl. Flies carrying the rhinoG31D mutation phenocopy kipferl mutant flies, with Rhino redistributing from piRNA clusters to satellite repeats, causing pronounced changes in the ovarian piRNA profile of rhinoG31D flies. Thus, Rhino’s chromodomain functions as a dual-specificity module, facilitating interactions with both a histone mark and a DNA-binding protein.