Figures and data

A Faroese whole genome reference.
A) Map of the Faroe Islands, colored by the six sampling regions. The number of minimally related FarGen participants from each region selected for whole genome sequencing is indicated. B) Principal component analysis (PCA) of Faroese genomes jointly called with relevant 1000 Genomes reference data shows separation of European groups by PCs 3 and 4 (FARO, Faroese; CEU, Central Europeans; GBR, British; FIN, Finnish; IBS, Iberian; TSI, Tuscan; CHB, Han Chinese; YRI, Yoruban). C) Faroese-enriched putatively functional alleles visualized by minor allele count, CADD score, and Variant Effect Predictor (VEP) consequence. Variants shown are those with CADD > 30, at least two minor alleles observed in Faroese individuals, and no minor alleles observed in Finnish or Northern European reference individuals. D) HLA-B allele frequencies for alleles detected at least twice in Faroese individuals. In this cohort, 1 minor allele corresponds to an allele frequency of 1.25%.
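
The selection rule in panel C and the count-to-frequency conversion in panel D can be sketched as below. This is a minimal illustration, not the pipeline used in the study: the `Variant` record and its field names are hypothetical, and the cohort size of 40 diploid genomes is taken from the other captions.

```python
from dataclasses import dataclass

# Number of diploid Faroese genomes, as stated in the captions.
N_FAROESE_GENOMES = 40


@dataclass
class Variant:
    # Hypothetical record layout, used only for this illustration.
    variant_id: str
    cadd: float    # CADD score
    mac_faro: int  # minor allele count in Faroese individuals
    mac_fin: int   # minor allele count in Finnish reference individuals
    mac_neu: int   # minor allele count in Northern European reference individuals


def is_faroese_enriched(v: Variant) -> bool:
    """Apply the three panel C criteria: CADD > 30, at least two Faroese
    minor alleles, and no minor alleles in either reference group."""
    return v.cadd > 30 and v.mac_faro >= 2 and v.mac_fin == 0 and v.mac_neu == 0


def allele_frequency(minor_allele_count: int,
                     n_genomes: int = N_FAROESE_GENOMES) -> float:
    """Convert a minor allele count to a frequency over 2 * n_genomes chromosomes."""
    return minor_allele_count / (2 * n_genomes)
```

With 40 diploid genomes there are 80 chromosomes, so `allele_frequency(1)` gives 1/80, the 1.25% quoted in panel D.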

Runs of homozygosity by group.
Amount of the genome (Mb) contained in runs of homozygosity (ROH), stratified by group. The top panel shows the total amount of the genome contained within ROH; the other panels show this total split by ROH length (short, medium, and long).

Selection scan results for Faroese and British cohorts.
A) Log-transformed two-tailed p-value of the standardized integrated haplotype score (iHS) in the 40 Faroese genomes (FARO). B) Log-transformed two-tailed p-value of the standardized iHS for 90 British WGS samples from 1000 Genomes (GBR). C) Log-transformed two-tailed p-value for the standardized cross-population expected haplotype homozygosity (XP-EHH) for FARO vs GBR (only positive values, which indicate selection in FARO, are plotted). Some genes in the top loci are indicated on each plot. The p-value cutoffs that correspond to a False Discovery Rate (FDR) of 0.01 and 0.001 are indicated by the red dotted line and the blue dashed line, respectively, in each plot. A) For iHS in FARO, these cutoffs are p = 2.72 × 10⁻⁶ (FDR = 0.01) and p = 9.20 × 10⁻⁸ (FDR = 0.001). B) For iHS in GBR, the cutoffs are p = 2.78 × 10⁻⁶ (FDR = 0.01) and p = 1.75 × 10⁻⁷ (FDR = 0.001). C) For XP-EHH in FARO vs GBR, the cutoffs are p = 2.35 × 10⁻⁶ (FDR = 0.01) and p = 3.01 × 10⁻⁸ (FDR = 0.001). See Methods for details on p-value and FDR estimation.
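
For orientation, one common way to turn a standardized score into a two-tailed p-value and to find an FDR cutoff is sketched below. The study's own procedure is given in its Methods and may differ: this sketch assumes a standard normal null distribution for the standardized score and uses the Benjamini-Hochberg step-up rule, and the function names are illustrative.

```python
import math


def two_tailed_p(z: float) -> float:
    """Two-tailed p-value for a standardized score z, assuming a
    standard normal null: p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def neg_log10_p(z: float) -> float:
    """Log-transformed p-value as plotted on a Manhattan-style axis.
    Note: for extremely large |z| the p-value underflows to 0."""
    return -math.log10(two_tailed_p(z))


def bh_cutoff(p_values, fdr):
    """Largest p-value that passes the Benjamini-Hochberg step-up rule
    at the given FDR, or None if no p-value passes."""
    ranked = sorted(p_values)
    m = len(ranked)
    cutoff = None
    for rank, p in enumerate(ranked, start=1):
        # BH keeps every p-value up to the largest rank with p <= fdr * rank / m.
        if p <= fdr * rank / m:
            cutoff = p
    return cutoff
```

Under these assumptions, every site with a p-value at or below `bh_cutoff(p_values, 0.01)` would lie above the corresponding horizontal line in the plot.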

Haplotype visualizations for the LCT/MCM6 locus.
A) Decay in Expected Haplotype Homozygosity (EHH) and B) haplotype furcation plot for FARO centered on lactase persistence allele rs4988235; chr2_135851076_G_A. C) Decay in EHH for GBR and D) haplotype furcation for GBR centered on the same allele. E) Haplostrips visualization of haplotype structure in the region chr2:135677850-135986443. In this panel, columns correspond to segregating alleles, and rows correspond to individuals. In the haplotype furcation plots (panels B & D), the haplotypes for the reference allele (G) are in blue, and those for the alternate allele (A) are in red.

PCA of 616 ancient imputed genomes from Europe and 40 present-day Faroese genomes.
Each individual is depicted as a pie chart, showing ancestry proportions estimated using Haplonet. Ancestry proportions for ancient individuals were estimated unsupervised, while those for present-day Faroese individuals were estimated semi-supervised using ancient genomes as references. The five colors represent different ancestral sources: orange for West Europe, green for North Europe, blue for Steppe, purple for the Levant and East Mediterranean, and red for East Europe. The geographical distribution (bottom-right) highlights historical samples (250 years BP) in red, this study’s samples in black, and an 800-year-old individual sample in blue.

Quality control of whole genome sequencing data.
A-D) Distributions of QC metrics across the sequenced Faroese samples (mean depth, median insert size, mapping rate, duplicate rate). E) Mean WGS depth versus number of genotype calls with GQ > 20. Singletons and calls with GQ <= 20 were discarded before imputation. F) Principal component analysis (PCA) of Faroese genomes jointly called with relevant 1000 Genomes reference data captures African ancestry in the first component and East Asian ancestry in the second component (FARO, Faroese; CEU, Central Europeans; GBR, British; FIN, Finnish; IBS, Iberian; TSI, Tuscan; CHB, Han Chinese; YRI, Yoruban).

Kinship matrices between modern populations.
A) Kinship estimated by popkin, including global reference populations from the 1000 Genomes Project and the Faroese WGS cohort (FARO). B) Kinship estimated by popkin for the Faroese WGS cohort only. The matrix is rescaled after subsetting the individuals, so although the scales are different, the overall structure remains the same. The matrices are symmetrical and ordered by population or region label as indicated by the colored bars along the rows and columns. The diagonal of each matrix is the estimated inbreeding coefficient. The Faroes region labels are: VM = Vágar and Mykines; SR = Suðuroy; SM = Suðurstreymoy; SD = Sandoy, Skúvoy, Stóra Dímun; NG = Norðoyggjar; and EN = Eysturoy and Norðstreymoy.

Haplotype visualizations for top XP-EHH variant in SLC10A1 / SRSF5 locus.
A) Decay in Expected Haplotype Homozygosity per Site (EHHS) for chr14_69775276_C_T, comparing Faroese (FARO, teal) and British (GBR, red) haplotypes. B) Lengths for distinct haplotypes spanning chr14_69775276_C_T, comparing FARO (teal) and GBR (red). C) Haplotype furcation plot for FARO centered on chr14_69775276_C_T. D) Haplotype furcation plot for GBR centered on the same allele. In the haplotype furcation plots (panels C & D), haplotypes for the reference allele (C) are in blue, and those for the alternate allele (T) are in red.

Haplotype visualizations for top XP-EHH variant in POLQ locus.
A) Decay in Expected Haplotype Homozygosity per Site (EHHS) for chr3_121526194_G_A, comparing Faroese (FARO, teal) and British (GBR, red) haplotypes. B) Lengths for distinct haplotypes spanning chr3_121526194_G_A, comparing FARO (teal) and GBR (red). C) Haplotype furcation plot for FARO centered on chr3_121526194_G_A. D) Haplotype furcation plot for GBR centered on the same allele. In the haplotype furcation plots (panels C & D), haplotypes for the reference allele (G) are in blue, and those for the alternate allele (A) are in red.

Admixture plot showing proportions for 616 imputed ancient genomes from Europe together with 40 present-day Faroese genomes from this study.
Ancient individual groups are categorized based on patterns of IBD clustering as inferred in Allentoft et al. 2024. The plot uses five colors to represent different ancestral sources, which are maximized in individuals from different regions of Europe: orange for “West Europe”, green for “North Europe”, blue for “Steppe”, purple for the “Levant and East Mediterranean”, and red for “Eastern Europe”.

Map illustrating the geographical distribution of ancestry proportions for 616 ancient imputed individuals from Europe and 40 modern Faroese genomes from this study.
The map is divided into panels to capture both geographical and temporal variations across Europe. Each pie chart on the map represents admixture proportions with five colors: green, orange, blue, red, and purple maximize Northern European, Celtic, Steppe, Eastern European, and Levant / East Mediterranean ancestries, respectively, reflecting different ancestral sources.

Q-Q plots and p-value histograms for selection statistics.
Histograms of p-value distributions for A) integrated haplotype score (iHS) in the Faroese (FARO) haplotypes, C) iHS in the British (GBR) haplotypes, and E) cross-population expected haplotype homozygosity (XP-EHH) comparing FARO and GBR. Q-Q plots of observed versus expected log-transformed p-values for B) iHS in FARO, D) iHS in GBR, and F) XP-EHH between FARO and GBR. The estimated lambda inflation value is shown in each Q-Q plot.
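
The caption does not state how lambda was estimated. A minimal sketch of the conventional genomic-control definition follows, given here as an assumption rather than the study's method: each two-tailed p-value is converted back to a 1-degree-of-freedom chi-square statistic, and the median of those statistics is divided by the median expected under the null (about 0.455). The function name is illustrative.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.4549),
# obtained as the square of the 75th percentile of the standard normal.
_CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def inflation_lambda(p_values):
    """Genomic inflation factor from two-tailed p-values in (0, 1].

    Each p-value maps to chi2 = z**2 with z = Phi^-1(p / 2); lambda is the
    median observed chi2 divided by the median expected under the null.
    """
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / _CHI2_1DF_MEDIAN
```

Under this definition, well-calibrated p-values give a lambda close to 1, and values above 1 indicate an excess of small p-values across the genome.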