Figures and data

A Faroese whole genome reference.
A) Map of the Faroe Islands, colored by the six sampling regions. The number of minimally related FarGen participants from each region selected for whole genome sequencing is indicated. B) Principal component analysis (PCA) of Faroese genomes jointly called with relevant 1000 Genomes reference data shows separation of European groups by PCs 3 and 4 (FARO, Faroese; CEU, Central Europeans; GBR, British; FIN, Finnish; IBS, Iberian; TSI, Tuscan; CHB, Han Chinese; YRI, Yoruban). C) Faroese-enriched putatively functional alleles visualized by minor allele count, CADD score, and Variant Effect Predictor (VEP) consequence. Variants shown have CADD > 30, at least two minor alleles observed in Faroese individuals, and no minor alleles observed in Finnish or Northern European reference individuals. D) HLA-B allele frequencies for alleles detected at least twice in Faroese individuals. In this cohort, 1 minor allele corresponds to an allele frequency of 1.25% (1 of 80 allele copies).
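The allele-frequency conversion in panel D is simple arithmetic: 40 diploid genomes carry 80 allele copies, so one observed copy is 1/80. A minimal Python sketch, assuming every individual is called at the site (the function name is illustrative and not taken from the study):

```python
def allele_frequency(minor_allele_count, n_individuals, ploidy=2):
    """Frequency of an allele seen `minor_allele_count` times in a cohort.

    Assumes complete genotype calls, so the denominator is simply
    n_individuals * ploidy allele copies.
    """
    total_copies = n_individuals * ploidy
    return minor_allele_count / total_copies


# One minor allele among 40 diploid genomes (80 allele copies).
print(f"{allele_frequency(1, 40):.2%}")
```

With missing genotypes the denominator would shrink accordingly, so the 1.25% step size holds only at fully called sites.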

Runs of homozygosity by group.
Total length of the genome (Mb) contained in runs of homozygosity (ROH), stratified by group. The top panel shows the total across all ROH; the remaining panels split this total by ROH length class (short, medium, and long).
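The per-individual quantities plotted here can be sketched as a sum of ROH segment lengths, overall and by length class. The class boundaries below are hypothetical placeholders, since the caption does not state the thresholds used; the function name and input layout are likewise assumptions for illustration only.

```python
def roh_totals_mb(segments_bp, short_max_bp=500_000, medium_max_bp=1_500_000):
    """Sum ROH lengths (in Mb) overall and by length class.

    segments_bp: iterable of (start, end) coordinates in base pairs.
    short_max_bp / medium_max_bp: HYPOTHETICAL class boundaries, not the
    thresholds used in the study.
    """
    totals = {"short": 0.0, "medium": 0.0, "long": 0.0}
    for start, end in segments_bp:
        length = end - start
        if length < short_max_bp:
            cls = "short"
        elif length < medium_max_bp:
            cls = "medium"
        else:
            cls = "long"
        totals[cls] += length / 1e6  # convert bp to Mb
    totals["total"] = totals["short"] + totals["medium"] + totals["long"]
    return totals


# Toy segments for one individual: 0.3 Mb, 1.0 Mb, and 3.0 Mb.
example = [(0, 300_000), (1_000_000, 2_000_000), (5_000_000, 8_000_000)]
print(roh_totals_mb(example))
```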

Selection scan results for Faroese and British cohorts.
A) Log-transformed two-tailed p-value of the standardized integrated haplotype score (iHS) in the 40 Faroese genomes (FARO). B) Log-transformed two-tailed p-value of the standardized iHS for 90 British WGS samples from 1000 Genomes (GBR). C) Log-transformed two-tailed p-value of the standardized cross-population extended haplotype homozygosity (XPEHH) for FARO vs GBR (only positive values, which indicate selection in FARO, are plotted). Some genes in the top loci are indicated on each plot. In each plot, the p-value cutoffs corresponding to a False Discovery Rate (FDR) of 0.01 and 0.001 are indicated by the red dotted line and the blue dashed line, respectively. A) For iHS in FARO, these cutoffs are p = 2.72 × 10⁻⁶ (FDR = 0.01) and p = 9.20 × 10⁻⁸ (FDR = 0.001). B) For iHS in GBR, the cutoffs are p = 2.78 × 10⁻⁶ (FDR = 0.01) and p = 1.75 × 10⁻⁷ (FDR = 0.001). C) For XPEHH in FARO vs GBR, the cutoffs are p = 2.35 × 10⁻⁶ (FDR = 0.01) and p = 3.01 × 10⁻⁸ (FDR = 0.001). See Methods for details on p-value and FDR estimation.
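As an illustration of how a standardized score maps to the plotted quantity, the sketch below converts a score to a two-tailed p-value and its negative log10. It assumes the standardized scores are approximately standard normal under neutrality, which is a common convention but is not confirmed by this caption; the authors' exact procedure is described in their Methods and may differ. Function names are illustrative.

```python
import math


def two_tailed_p(z):
    """Two-tailed p-value of a standardized score z.

    ASSUMPTION: z follows a standard normal distribution under the null.
    Uses P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


def neg_log10_p(z):
    """-log10 of the two-tailed p-value, the scale typically used on the y-axis."""
    return -math.log10(two_tailed_p(z))


for z in (1.96, 3.0, 5.0):
    print(z, two_tailed_p(z), neg_log10_p(z))
```

Under this assumption, a score passes a cutoff such as p = 2.72 × 10⁻⁶ when `two_tailed_p(z)` falls below it.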

Haplotype visualizations for the LCT/MCM6 locus.
A) Decay in extended haplotype homozygosity (EHH) and B) haplotype furcation plot for FARO centered on the lactase persistence allele rs4988235 (chr2_135851076_G_A). C) Decay in EHH and D) haplotype furcation plot for GBR centered on the same allele. E) Haplostrips visualization of haplotype structure in the region chr2:135677850-135986443. In this panel, columns correspond to segregating alleles, and rows correspond to individuals. In the haplotype furcation plots (panels B and D), haplotypes carrying the reference allele (G) are in blue, and those carrying the alternate allele (A) are in red.
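The EHH decay curves in panels A and C can be read with the standard definition in mind: among haplotypes carrying a given core allele, EHH at a marker is the probability that two randomly chosen haplotypes are identical from the core out to that marker. The following is a minimal sketch on toy data, not the software or parameters used for the figure; haplotypes are plain strings and all names are illustrative.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, position):
    """EHH between the core index and `position` (inclusive).

    haplotypes: equal-length strings, all carrying the same core allele.
    Returns the fraction of haplotype pairs that are identical over the
    interval spanning core and position.
    """
    n = len(haplotypes)
    if n < 2:
        return 0.0
    lo, hi = sorted((core, position))
    counts = Counter(h[lo:hi + 1] for h in haplotypes)
    return sum(comb(c, 2) for c in counts.values()) / comb(n, 2)


# Toy haplotypes sharing core allele "A" at index 0.
haps = ["AGGT", "AGGT", "AGCT", "ATCT"]
for pos in range(4):
    print(pos, ehh(haps, core=0, position=pos))
```

EHH starts at 1 at the core and can only decrease with distance; a slow decay on the derived-allele haplotypes relative to the ancestral ones is the pattern that iHS summarizes.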

PCA of 616 imputed ancient genomes from Europe and 40 present-day Faroese genomes.
Each individual is depicted as a pie chart showing ancestry proportions estimated using HaploNet. Ancestry proportions for ancient individuals were estimated unsupervised, while those for present-day Faroese individuals were estimated semi-supervised using ancient genomes as references. The five colors represent different ancestral sources: orange for West Europe, green for North Europe, blue for Steppe, purple for the Levant and East Mediterranean, and red for East Europe. The geographical distribution (bottom right) highlights historical samples (250 years BP) in red, this study’s samples in black, and an 800-year-old individual in blue.