Introduction

Sedimentary deposits constitute highly valuable archives of past ecosystem changes as they contain dateable layers with organismic remains including ancient DNA (aDNA). Such remains are typically assumed to represent the ecosystem of the time around which the respective stratum was deposited. aDNA from sediments has yielded remarkable insights regarding paleoecology, phylogeography, and extirpation and extinction events of keystone taxa such as mammoths (Haile et al., 2009; Boessenkool et al., 2012; Graham et al., 2016; Murchie, Monteath, et al., 2021). Based on such ancient environmental DNA (aeDNA), a recent study proposed that the woolly mammoth (Mammuthus primigenius) may have survived in Eurasia for much longer than previously assumed, as the authors retrieved mammoth DNA sequences in sediment layers that were approximately 4.6–7 thousand years (kyr) younger than the most recent mammoth fossils (Wang et al., 2021); however, in response to this interpretation, Miller and Simpson (2022) opined that these results may be more likely due to taphonomic processes leading to release of aeDNA from the remains of long-dead organisms from permafrost, where it is well preserved.

Conclusions derived from aeDNA isolated from sediment cores rely on the stratified nature of the remains in question and dating of the respective layer by radiometric methods. However, in theory, various physical and geochemical processes such as translocation of DNA through sediment strata (Haile et al., 2007), re-deposition of older sediment carrying DNA of extinct organisms (Arnold et al., 2011), and preservation bias (Boere et al., 2011) can distort the biological signal of aeDNA and thus bias the accuracy of allocation of taxa to specific time periods (Armbrecht et al., 2019). For lake sediments, deposited under aquatic conditions, studies have suggested that leaching is not a concern (Parducci et al., 2017), but it has been observed in soils and cave sediments (Haile et al., 2007). The question of obtaining last appearance dates of extinct taxa using aeDNA in dated sediment layers (Haile et al., 2009) is under discussion, but the surprisingly young records published so far still date to multiple thousands of years before present and thus lie within a timeframe of possible late survival.

Results and Discussion

In 2019, we retrieved short subsurface sediment cores from two Arctic thermokarst lakes (LK-001 and LK-007, located approximately 5 km apart, over massive permafrost; Table 1; Dvornikov et al., 2016) on the Yamal peninsula, Siberia, to extract DNA and assess changes in mammal abundances in the Arctic over the past decades and centuries. From lake LK-001, we collected a secondary core which was sliced in the field at 1-cm steps for Pb210 radiometric dating, which indicated that the sediments at the top of this core were deposited recently, and that the core spanned the past few centuries (Table S7). The cores for DNA extraction were closed in the field immediately after retrieval and were then transported to the dedicated aDNA laboratories of the University of Konstanz, Germany. In this lab facility, established in 2020, no other samples from the Arctic or from any large mammals had been processed previously. From core LK-001, we isolated DNA and produced genomic double-stranded libraries from 23 samples, from 1.5 to 80 cm core depth (Supplement section 1), according to standard procedures. The core was opened and all subsequent steps until index PCR setup were carried out under customary aDNA laboratory conditions. In particular, the core opening facilities and the lab are located in buildings separated from the downstream molecular genetic analyses, the ventilation of the aDNA lab is based on a HEPA filter system and positive air pressure, and the lab is subjected to nightly UV radiation. Work in the lab is conducted under strict aDNA precautions, adhering to established aDNA protocols (Fulton and Shapiro, 2019). We enriched the libraries for mammalian DNA using a custom RNA bait panel produced from complete mitogenome sequences of 17 mammal species that currently or previously occurred in the Arctic (adapted from Murchie, Kuch, et al., 2021). The enriched libraries were sequenced, and we mapped the sequences against a database of 73 mammal mitogenomes, followed by BLASTn alignment against the complete NCBI nt database. We thus retrieved mitogenomic sequences of mammals that were expected during the age range covered by the core (Table S9), e.g., reindeer (Rangifer tarandus), Arctic lemming (Dicrostonyx torquatus), and hare (Lepus); however, throughout the entire core, there were abundant sequences of two species that have been extinct for several thousand years, i.e., mammoth (Mammuthus primigenius) and woolly rhinoceros (Coelodonta antiquitatis). Twenty-two of the 23 LK-001 libraries produced > 1,000 reads, each, assigned to Mammuthus, with read counts ranging from 2,852 to 72,919 (mean 21,140 ± 17,296). Negative controls (extraction and library blanks) did not produce any reads assigned to mammals. In the sample with the highest M. primigenius read counts (31.5 cm depth, dated to 81 years), the coverage of the reference mitogenome (NCBI accession NC_007596.2) was 95.3%, (434 (± 213)-fold). Across all samples, 465,080 reads assigned to mammoth were produced, with 98.3% coverage (2,762-± 1,176-fold). Read lengths ranged from 28 to 289 bp (mean 100 ± 44 bp; Fig. 1). The number of woolly mammoth reads decreased from lower samples towards the top of the core (Fig. 1). Signatures of post-mortem DNA decay were comparably minor (Fig. 1), with reference to an M. primigenius genome downloaded from NCBI (accession NC_007596.2), and mapping suggested that the sequences throughout the core originated from multiple individuals. Further analyses of the three libraries with the most mammoth reads using mixemt (Vohr et al., 2017) identified a number of mitochondrial haplogroups in the sequences from the core, suggesting that they originated from a multiple individuals (Fig. 2.). The haplogroups identified were known to occupy the region, and it seems likely the sequences reflect a history of mammoth occupation at the core site. Twelve of the 23 libraries produced > 100 reads, each, assigned to woolly rhinoceros, with a total of 2,737 reads and a cumulative coverage of 44% (assembled to NC_012681.1 2). Further analyses are mostly focused on the mammoth sequences as these occurred in substantially higher numbers.

Sediment cores retrieved from two lakes on the Yamal peninsula, Siberia.

aDNA of Mammuthus in recent lake sediments. (A) Read counts assigned to Mammuthus (square-root-transformed proportion of the respective number of raw reads per library) after hybridization capture enrichment of aeDNA of core LK-001 (shown are results of 22 libraries; one library was excluded as it did not produce any reads assigned to mammals); square-root transformation of percentage. Indicated are sample depths (in cm; 1.5 to 80 cm) and approximate ages as per 210 Pb chronology (Table S7; to a maximum depth of 39.5 cm). The solid line indicates the general trend. Across the 22 libraries: (B) Fragment length distribution and (C) damage patterns (red indicates C-to-T transitions, blue G-to-A transitions. the Y-axis indicates the percentage of positions with a nucleotide change, the X-axis indicates the position along the fragment).

>. Locations of the sediment cores of the present study (Yamal peninsula, Siberia) and previously retrieved mammoth remains and their haplo(sub)groups (Table S6). The bar chart indicates a maximum-likelihood estimate of the haplogroup proportions derived from the reads from the three sediment core libraries with the most mammoth reads.

As these results were entirely incongruent with the temporal occurrence of these two species, we employed several methods to confirm their validity, i.e., conventional PCR and Sanger sequencing of a mammoth cytochrome c oxidase subunit I (COI) fragment, mammal metabarcoding, and droplet digital PCR (ddPCR) of a mammoth cytochrome b fragment. We performed these analyses on core LK-001 and, to evaluate whether this is a locally isolated phenomenon, on a second short core, LK-007, from a nearby lake, which had been opened and processed under the same conditions indicated above to prevent contamination. Additionally, we screened the core LK-001 for plant macrofossils, of which three were sent for 14C AMS dating.

Conventional PCR and Sanger sequencing confirmed amplification of a COI fragment of M. primigenius. Mammal metabarcoding produced mammoth sequences (74 bp) in 13 samples of core LK-001 and 9 samples of core LK-007. In the LK-001 core, the highest M. primigenius read count occurred at 31.5 cm (2,992 reads), and in the LK-007 core at 26 cm M. primigenius (3,580 reads). ddPCR produced M. primigenius sequences in 14 samples of each core. Metabarcoding and ddPCR patterns across the cores were similar, although not completely congruent, as ddPCR appeared to be more sensitive (Fig. S1). The dates retrieved by radiocarbon dating were not congruent with the initial age inference suggested by Pb210. While the lowest and topmost sample (with ages of 1,547 ± 228 and 339 ± 79 uncal. yrs BP respectively) suggest relatively young ages and agree in their temporal succession, the middle sample, at 51 cm, yielded a radiocarbon age of 8,677 ± 132 yrs.

The mammoth was abundant throughout most of Eurasia during the Pleistocene, but populations declined at the end of the Pleistocene, with the species going extinct in the mid-Holocene (Nogués-Bravo et al., 2008). The youngest fossils from mainland Siberia have been dated to 9,650 years (Stuart et al., 2002). The woolly rhinoceros was a cold-adapted megaherbivore, which was abundant from western Europe to north-east Siberia during the Middle to Late Pleistocene (Kahlke and Lacombat, 2008). This species was predominantly grazing, probably resorting to browsing only due to seasonal vegetation restrictions (Kahlke and Lacombat, 2008; Rey-Iglesia et al., 2021; Stefaniak et al., 2021). The reasons underlying the extinction of this species at ca 14 ka BP are not entirely clear, but it is largely attributed to sudden climate warming and subsequent habitat unsuitability due to vegetation changes (Stuart and Lister, 2012; Lord et al., 2020; Wang et al., 2021), likely coupled with human influence (Fordham et al. 2022), in the Bølling-Allerød interstadial warming.

The present data, which implies frequent and abundant Pleistocene megafaunal DNA throughout a sediment core deposited in the lake over the past centuries suggests that physical processes, rather than presence of live organisms, are responsible for the recovery of this DNA. While not in itself fully conclusive, our data suggests the source of the DNA of Pleistocene mammals from older permafrost deposits, either from a carcass or from sedimentary materials carrying the DNA. The numbers of mammoth sequences increased with depth downcore, with comparably low abundances over the most recent 11 cm of the core (aged < 30 years), pointing to a decrease of input of this DNA through time. The apparently rather limited extent of aDNA damage in the mammoth sequences suggests that the source of this DNA has been preserved exceptionally well, which also suggests an origin from permafrost, and the specific dynamics of thawing and re-deposition of material in the study area offer an explanation.

Here, the active thermokarst likely began during the climatic optimum of the Holocene (9000–3000 BC; Savvichev et al., 2021). The LK-001 and LK-007 lake basins are interbedded mainly into the IVth marine plain, formed in the Kazantsevskaya, Marine Isotopic Stage 5. The permafrost of these lake catchments was formed no earlier than 70—60 kyr BP after the Kazantsevskaya transgression in more sub-aerial conditions and with mean annual temperatures 6 to 7 °C lower than modern temperatures (Baulin et al., 1981). As this area was under coastal-marine conditions for a long period, these lake basins may be paleo-marine remnants, or they were formed later as a result of thermokarst over the segregated or tabular (massive) ground ice during the Holocene climatic optimum (Kachurin, 1961). The area is also subject to abrupt permafrost thaw (thermo-denudation), resulting, for example, in the formation of retrogressive-thaw slumps (thermocirques) and the transport of a large amounts of thawed terrestrial material into the lake water (Dvornikov et al., 2018). Such abrupt permafrost thaw processes normally appear adjacent to lakes and can form specific geomorphological elements,. i.e., thermo-terraces (Kizyakov et al., 2006) within lake catchments and lakes; they are normally polycyclic processes appearing due to the extension of a seasonally thawed layer (active layer) up to the top of massive ice and more favorable thermal conditions within the existing forms (Leibman and Kizyakov, 2007; Kokelj et al., 2009). Traces of past thermo-denudation can be observed within both lake catchments. In catchments of five neighboring lakes, large retrogressive-thaw slumps appeared in recent years (2012-2013) accompanied by the thaw and lateral transport of modern and Late Pleistocene deposits into lakes. Additionally or alternatively, thermo-erosion of upper geomorphological levels and transport via stream networks could transport ancient material into the modern lacustrine sediments. However, the two studied lakes are headwater lakes (with outlet, no apparent inlet) and this option can only be considered in terms of small thermo-erosional valleys within the catchments.

An alternative mechanism for the redistribution of Late Pleistocene material in the sediments is related to subcap methane emission (bubbling) from degrading permafrost beneath the lake bottom. In-lake bubbling can be observed in a circum-Arctic scale: in North-East Siberia, Alaska and Canada. This is common especially in lakes with a depth exceeding two meters, which do not freeze entirely up to the bottom in winter, leading to the formation of a talik (a layer of year-round unfrozen ground that lies in permafrost area). The expansion of the talik may further trigger subcap methane emission, which can reach 40-70 kg yr-1 of pure (94%–100%) methane in neighboring lakes (Kazantsev et al., 2020). The constant methane seepage doesn’t allow ice to be formed in winter (whereas the normal winter ice thickness is approximately 1.5 m) and can potentially disturb the stratigraphy of lake sediments. Additionally, dramatic emissions of methane can form craters in terrestrial and lacustrine environments (Dvornikov et al., 2019). In this case Late Pleistocene sediments will well be re-distributed within the water-body and the entire stratigraphy will be mixed. In the case reported here, the simultaneous finding of a Pb210 chronology indicating recent sediment deposition and of plant macrofossils that dated to > 8000 yrs BP in a sample from 36.5 cm, suggest lateral input of ancient material, including the mammalian DNA, putatively related to permafrost thawing processes. Numerous studies on aDNA discussed possible leaching through sedimentary strata of the DNA itself, yet it was typically considered not an issue as most of these studies were conducted under stable permafrost or similar conditions (e.g., Hansen et al., 2006; Haile et al., 2007, but see Andersen et al. 2012). Permafrost thawing and re-deposition of material adds a new dimension to this problem of temporal interpretation. The fact that we retrieved mammoth sequences from both cores of two lakes located approximately 5 km apart suggests that this is not an isolated phenomenon but occurs on a regional or even larger scale. Given the wide spread of abrupt permafrost thaw processes in the Arctic plains (West Siberia, Taimyr, Chukotka, Alaska, Canadian Arctic; Kizyakov et al., 2006 and references therein), the phenomena of disturbed stratigraphy of lacustrine sediments can potentially be observed at a pan-Arctic scale. While this indicates that temporal interpretation of sedimentary aeDNA records should be exercised with caution, our study also demonstrates that a careful evaluation of available information on the site and ecosystem in conjunction with the use of independent dating techniques can uncover incongruencies. This is more difficult in older time periods, where artefactual stratigraphies caused by equivalent processes acting for a limited time will not be detected as easily as in our case of long extinct species. The same applies to extant taxa or those which have undergone extinction or extirpation more recently, the presence of which cannot as easily be excluded as in the current example. However, we suggest that the inclusion of robust dating techniques and knowledge of local geophysical processes can provide good arguments to evaluate the reliability of aeDNA records.

Materials and Methods

Field sites, DNA isolation and hybridization capture enrichment

In 2019, sediment cores (6 cm diameter; Table 1) were retrieved from two lakes (LK-001 and LK-007, respectively, Table 1) on the Yamal peninsula, Siberia, using a UWITEC piston corer (UWITEC, Mondsee, Austria). The lakes were located approximately 5 km apart. The cores were transported to the aDNA laboratories of the University of Konstanz, Germany; from lake LK-001, a secondary core was taken which was sliced in the field at 1-cm steps for radiometric dating from 0 to 39.5 cm depth, performed at the Environmental Radioactivity Research Centre, University of Liverpool (Supplement section 9). Additional 14C dating of three specimens of plant remains, extracted at 36.5 cm, 51 cm and 74 cm, was performed (Supplement section 9).

Sedimentary DNA was isolated from 23 samples of core LK-001 and from 16 samples of core LK-007 using commercially available kits with modified protocols (Supplement section 1). The extracts of core LK-001 were subjected to library preparation for capture enrichment. Enrichment probes were designed from mitogenomes of 17 herbivorous mammal species that currently or previously occurred in the Arctic (Table S2) and few lichen sequences (Table S3). Genomic libraries were produced according to Li et al., 2013 with some modifications (Seeber et al., 2019; Seeber et al. 2023; Supplement section 2). Filtered reads were mapped to mammalian mitogenomes, followed by BLASTn alignment against the complete NCBI nucleotide database and subsequent metagenomic analyses using MEGAN (Huson et al., 2016). Reads assigned to Mammuthus were mapped to a complete M. primigenius reference mitogenome (NCBI accession NC_007596.2); reads assigned to Coelodonta antiquitatis were mapped to the NCBI reference mitogenome NC_012681.1 2. The reads mapped to mammoth from the top three libraries were assigned to haplogroup using mixemt (https://github.com/svohr/mixemt) with a custom-made representative panel of 15 mammoth mitogenomes (Fig. 2; Table S6).

Conventional PCR, mammal metabarcoding, and ddPCR

Based on the enriched fragments with the highest coverage, PCR primers specific to M. primigenius were designed using Geneious Prime 2022.1.1 (Kearse et al., 2012), i.e., mamm801 (5’-CCCATGCAGGAGCTTCTGTAGA-3’) and mamm800r (5’-GACATAGCTGGAGGTTTTATGT-3’) to produce a 121-bp amplicon of the CO1 gene. The specific PCR conditions are described in Supplement section 6. Mammal metabarcoding PCRs were performed on 21 DNA extracts of core LK-001 and 27 samples of core LK-007, with eight independent replicates, each. Each batch of PCRs included one non-template control. Established metabarcoding primers were used (Giguet-Covex et al., 2014), and human blocking primers (Garcés-Pastor et al., 2021) were included. PCR conditions are described in the supplementary material. Sequencing was performed on an Illumina NovaSeq platform, with 2 x 150 reads. The raw data were processed as described in the supplement. We used the sample LK-001_66.5 of core LK-001 which had produced one of the highest Mammuthus read counts after enrichment. The specific target amplified by the primer pair were used to design a probe (5’-GGATACTCCTGCAAGGTGAAGTG-3’). With this probe, ddPCR was performed with 24 extracts of core LK-001 and with 30 extracts of core LK-007, with three replicates, each (Supplement section 8).

Supporting information

S 1

Data availability

Raw sequence data of the hybridization capture were made available as an NCBI BioProject (PRJNA933601).

Acknowledgements

This research was funded through the 2017-2018 Belmont Forum and BiodivERsA joint call for research proposals, under the BiodivScen ERA-Net COFUND program, and with the funding organizations Deutsche Forschungsgemeinschaft (DFG grant EP-98/3-1 to L.S.E.), Agence Nationale de la Recherche (ANR), Research Council of Norway (NFR), the Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning (Formas), Academy of Finland, National Science Foundation (NSF) and the Natural Sciences and Engineering Research Council of Canada (NSERC-CRSNG). Field work was enabled through a grant of the Young Scholar Fund of the University of Konstanz to L.S.E. Y.D. was supported by the RUDN University Strategic Academic Leadership Program. Expedition logistics were supported by the Department of Science and Innovations of Yamalo-Nenets Autonomous Okrug and the Non-profit organization «Russian Center of Arctic Exploration». We thank PD Dr. Elena Marinova for assistance and discussions concerning dating of plant macrofossils and Patrick Bartolin for assistance in the wet lab.

Competing interests

The authors have no competing interests to declare.