Stable population structure in Europe since the Iron Age, despite high mobility
Figures

Timeline of new and published genomes.
(A) 204 newly reported genomes (black circles) are shown alongside published genomes (gray circles), ordered by time and region (colored the same way as in B). (B) Sampling locations of newly reported (black) and published (gray) genomes are indicated by diamonds, sized according to the number of genomes at each location.

Detailed map of locations for newly reported samples.
Each circle represents a location, and the size of the circle corresponds to the number of individuals sampled from that location. Circles are colored by time period: Bronze Age is green (Pian Sultano), Iron Age is yellow (two recently reported sites, Tarquinia and Kerkouane), Imperial Rome and Late Antiquity are dark blue, and Medieval and Early Modern are light blue (Palazzo della Cancelleria, Velić, Gardun, Mirine-Fulfinum). Note that the Bronze Age and Iron Age sites were recently reported in Moots et al., 2022.

Armenia: two homogeneous genetic clusters distinguished by a temporal shift.
(A) Sampling locations of ancient genomes (open circles) colored by their genetic cluster identified using qpAdm modeling. (B) Date ranges for the genomes: each line represents the 95% confidence interval for the radiocarbon date or the upper and lower limit of the inferred date, and the point represents the midpoint of that range. (C) Projections of the genomes onto a PCA of present-day genomes (gray points labeled by their population). Present-day genomes from Armenia are shown with dark gray open circles.

Principal component analysis of present-day genomes from Europe and the Mediterranean.
PCA was performed on 829 individuals (480,712 SNPs) using smartpca v1600. The following parameters were used: 5 outlier iterations (numoutlieriter), 10 principal components along which to remove outliers (numoutlierevec), altnormstyle set to NO, and least squares projection turned on (lsqproject set to YES).
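As a rough illustration of how the settings in this caption map onto a smartpca parameter file, the sketch below builds the file text in Python. The four analysis settings are taken from the caption. The file paths, the function name `smartpca_parfile`, and the use of `poplistname` to restrict the PCA to present-day populations are assumptions made for illustration, not details reported by the authors.

```python
def smartpca_parfile(prefix: str, out_prefix: str, poplist: str) -> str:
    """Return the text of a smartpca parameter file (one 'key: value' per line)."""
    params = {
        # Hypothetical input/output paths in EIGENSTRAT format.
        "genotypename": f"{prefix}.geno",
        "snpname": f"{prefix}.snp",
        "indivname": f"{prefix}.ind",
        "evecoutname": f"{out_prefix}.evec",
        "evaloutname": f"{out_prefix}.eval",
        # Assumption: populations listed here define the PCs; all others are projected.
        "poplistname": poplist,
        # Settings stated in the caption.
        "numoutlieriter": 5,
        "numoutlierevec": 10,
        "altnormstyle": "NO",
        "lsqproject": "YES",
    }
    return "\n".join(f"{key}: {value}" for key, value in params.items()) + "\n"
```

The returned string can be written to a file and passed to smartpca with its `-p` option.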

Ancestry clusters identified within regions.
Each row displays data from a single study region. The first column shows a map with the sampling locations for the individuals, while columns two through four show the individuals projected onto a PCA space of present-day genomes (gray points; populations are labeled in the far right panel in row 1 and in Figure 2—figure supplement 1). Individual ancient genomes in the map and PCA panels are colored by ancestry clusters identified using qpAdm. Colors are not matched across regions. Star points are putative outliers, that is, individuals with ancestry that is underrepresented in the region. They are not colored by ancestry cluster, to reduce visual clutter.

SNP coverage comparison across cluster sizes and downstream outlier status.
(left) No significant correlation was detected between the median number of SNPs covered across the individuals in a cluster and cluster size. (right) There also was no significant difference in the number of SNPs covered between outlier and non-outlier clusters.

Southeastern Europe: highly heterogeneous Imperial Roman and Late Antiquity period population.
(A) Sampling locations of genetic clusters are represented by a single point per location. Outlier ancestries are black stars, all others are open circles colored by genetic cluster. (B) Colored bars span the minimum and maximum of the date ranges of samples (95% confidence interval from radiocarbon dating or archaeological range). Points are the mean of an individual’s date range. (C) Projections of the ancient genomes onto a PCA of present-day genomes (gray points). Population labels for the PCA reference space are shown in Figure 2C. Present-day genomes from Southeastern Europe are shown with dark gray open circles.

Population structure of Italy during the Imperial Roman and Late Antiquity period.
Ancient Italian genomes (colored points) from the Imperial Roman and Late Antiquity period were projected onto principal components of present-day genomes (gray points, populations labeled in Figure 2—figure supplement 1). Present-day Italian genomes are highlighted by a gray filled ellipse. Star points are outliers and circle points are non-outliers. Outlier clusters that can be modeled using contemporaneous populations are labeled with the potential source region.

Western Europe: heterogeneous Imperial Roman and Late Antiquity period population.
(A) Sampling locations of genetic clusters are represented by a single point per location. Outlier ancestries are black stars, all others are open circles colored by genetic cluster. (B) Colored bars span the minimum and maximum of the date ranges of samples (95% confidence interval from radiocarbon dating or archaeological range). Points are the mean of an individual’s date range. (C) Projections of the ancient genomes onto a PCA of present-day genomes (gray points). Population labels for the PCA reference space are shown in Figure 2C. Present-day genomes from Western Europe are shown with dark gray open circles.

Ancestry outliers and their potential sources.
(A) The proportions of outliers in each region were determined by individual pairwise qpAdm modeling followed by clustering. (B) Sources were inferred by one-component qpAdm modeling of the resulting clusters against all genetic clusters in the dataset. In the network visualizations, nodes are regions and directed edges are drawn from sources to outliers (i.e., potential migrants). The full source-to-outlier network is shown. (C) Examples of individual regions are shown in greater detail.

Lack of sex-bias amongst outliers with valid qpAdm sources.
The proportions of males and females do not differ significantly between outlier and non-outlier groups (p=0.4117). When outliers (with and without source) are treated as one group, there is still no significant association between outlier status and sex (p=0.633).

Distances of outliers to their candidate sources.
Geographic distance between the sampling locations of ‘outlier with source’ and the location of their putative source was calculated for each outlier. The mean distance was calculated if there were multiple putative sources.
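The distance calculation described here could be sketched as follows. The caption does not say which distance measure was used, so the haversine (great-circle) formula, the Earth radius constant, and the function names are assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def mean_distance_to_sources(outlier, sources):
    """Mean distance from one outlier (lat, lon) to each putative source (lat, lon)."""
    if not sources:
        raise ValueError("an 'outlier with source' needs at least one source location")
    return sum(haversine_km(*outlier, *source) for source in sources) / len(sources)
```

With a single putative source the mean reduces to the plain pairwise distance, which matches the caption's description.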

Example routes and travel times across the Roman Empire.
Routes and travel times were approximated using orbis.stanford.edu, a geospatial network model of the Roman Empire. Routes shown are the fastest summer routes for civilians, utilizing road, river, coastal sea, and open sea, and traveling by foot when on road. Routes for military individuals (not shown) are marginally faster.

Relatively stable population structure from Bronze Age to present-day.
(A) Overall genetic differentiation between populations (measured by FST) and its relationship to geographical distance (spatial structure) are similar from the Bronze Age onward. Confidence intervals were calculated through a bootstrap procedure, using 200 bootstrap replicates. (B) In PC space, each genome is represented by a point, colored by its origin (for present-day individuals) or sampling location (for historical samples). The PC space is established by present-day samples (bottom), onto which either historical period (middle) or prehistoric genomes (top) were projected. For projections, the present-day samples are shown in gray, and their extent is visualized by a gray polygon.
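For readers unfamiliar with bootstrap confidence intervals, a minimal percentile-bootstrap sketch with 200 replicates is given below. The caption does not state what unit was resampled or which interval type was used, so resampling a flat list of values, the percentile method, and the function name `bootstrap_ci` are all assumptions.

```python
import random


def bootstrap_ci(values, stat, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(values) from n_boot resamples."""
    rng = random.Random(seed)
    n = len(values)
    replicates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])  # resample with replacement
        for _ in range(n_boot)
    )
    lower_idx = round(alpha / 2 * n_boot)             # 5 for 200 replicates
    upper_idx = round((1 - alpha / 2) * n_boot) - 1   # 194 for 200 replicates
    return replicates[lower_idx], replicates[upper_idx]


def mean(xs):
    """Arithmetic mean, used here as the example statistic."""
    return sum(xs) / len(xs)
```

A call such as `bootstrap_ci(values, mean)` returns the lower and upper bounds of an approximate 95% interval.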

Simulation of population structure with and without long-range dispersal.
(A) A base model of spatial structure is established by calibrating per-generation dispersal rate to generate a maximum FST of ~0.03 across the maximal spatial distance, and visualized using PCA. In addition to this base dispersal, either 4% (B) or 8% (C) of individuals disperse longer distances, and the effect is tracked by analyzing spatial FST through time, as well as PCA after 120 generations of long-range dispersal.
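One way to picture the two dispersal regimes is as a mixture of a narrow and a wide kernel. The sketch below assumes Gaussian displacement in two dimensions and assumes that long-range dispersers draw from the wide kernel instead of the narrow one. Neither detail is stated in the caption. The default values 0.02 and 0.20 are the sigmaDisp and sigmaDispLR values reported for the simulations, and the function name `disperse` is hypothetical.

```python
import random


def disperse(x, y, sigma_disp=0.02, sigma_lr=0.20, frac_lr=0.04, rng=random):
    """Return (new_x, new_y, is_long_range) for one individual's dispersal step."""
    # With probability frac_lr the individual is a long-range disperser.
    is_long_range = rng.random() < frac_lr
    sigma = sigma_lr if is_long_range else sigma_disp
    # Assumed Gaussian displacement; no habitat-boundary handling in this sketch.
    return x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma), is_long_range
```

Setting `frac_lr=0.0` recovers the base model, while `frac_lr=0.04` and `frac_lr=0.08` correspond to the two long-range scenarios in panels B and C.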

A sigmaDisp - N parameter pair was chosen to closely approximate the observed FSTmax of ~0.03 using grid search across a range of parameter pairs.
We used the pair N=50,000 & sigmaDisp = 0.02 for all other simulations we report.

A sigmaDispLR parameter was chosen to qualitatively resemble long-range dispersal distances observed in the data, by comparing the distribution of distances under long-range dispersal (outliers) to randomly chosen distances given the spatial distribution of samples.
We used a value of 0.20 for all other simulations we report.

SNP coverage comparison between outliers and non-outliers in region-period pairings with “surprising” outliers (t-test p-value: 0.242).

PCA projection (left) and SNP coverage comparison (right) for “surprising” outliers and surrounding non-outliers in Italy_IRLA.

Comparison of pairwise PCA projection distance to outgroup-f3 similarity across all qpAdm reference population individuals.
PCA projection distance was calculated as the euclidean distance on the first two principal components. Outgroup-f3 statistics were calculated relative to Mbuti, which is itself also a qpAdm reference population. Both panels show the same data, but each point is colored by either of the two reference populations involved in the pairwise comparison.

Comparing geographic distance to PCA distance between pairs of historical and pre-historical individuals matched by geographic space.
For each historical period individual we selected the closest pre-historical individual by geographic distance in an effort to match the distributions of pairwise geographic distance across the two time periods (left). For these distributions of individuals matched by geographic distance, we then queried the euclidean distance between their projection locations in the first two principal components (right).
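The matching procedure can be sketched as a nearest-neighbor search followed by a Euclidean distance on the first two principal components. The record layout (dictionaries with `lat`, `lon`, `pc1`, `pc2` keys), the great-circle distance, and the function names are assumptions for illustration only.

```python
import math


def great_circle_km(a, b, radius_km=6371.0):
    """Haversine distance in km between two (lat, lon) points in decimal degrees."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    dphi = phi2 - phi1
    dlambda = math.radians(b[1] - a[1])
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(h)))


def match_and_compare(historical, prehistoric):
    """For each historical individual, pair it with the geographically closest
    prehistoric individual and return (geographic_km, pca_distance) tuples."""
    pairs = []
    for h in historical:
        h_loc = (h["lat"], h["lon"])
        nearest = min(prehistoric, key=lambda p: great_circle_km(h_loc, (p["lat"], p["lon"])))
        geo_km = great_circle_km(h_loc, (nearest["lat"], nearest["lon"]))
        # Euclidean distance on the first two principal components.
        pca_dist = math.hypot(h["pc1"] - nearest["pc1"], h["pc2"] - nearest["pc2"])
        pairs.append((geo_km, pca_dist))
    return pairs
```

Note that this sketch allows the same prehistoric individual to be matched more than once. Whether the original analysis permitted that is not stated.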
Additional files
- Supplementary file 1
  Archaeological context for sampling locations.
  https://cdn.elifesciences.org/articles/79714/elife-79714-supp1-v1.docx
- Supplementary file 2
  Metadata for all newly reported individuals.
  https://cdn.elifesciences.org/articles/79714/elife-79714-supp2-v1.zip
- Supplementary file 3
  AMS and calibration results.
  https://cdn.elifesciences.org/articles/79714/elife-79714-supp3-v1.zip
- Supplementary file 4
  Published samples that contributed to this study.
  https://cdn.elifesciences.org/articles/79714/elife-79714-supp4-v1.zip
- MDAR checklist
  https://cdn.elifesciences.org/articles/79714/elife-79714-mdarchecklist1-v1.docx