Abstract
The Faroe Islands are home to descendants of a North Atlantic founder population with a unique history shaped by both migration and periods of relative isolation. Here, we investigate the genetic diversity, population structure, and demographic history of the islands by analyzing whole genome sequencing data from 40 participants in the Faroe Genome Project. This represents the first whole genome sequencing panel of this size from the Faroe Islands. We observed numerous putatively functional private alleles, including stop gain variants and high impact missense variants in the cohort. Faroese individuals had a higher proportion of their genomes contained in long runs of homozygosity than other European groups, including Finnish, suggesting a more recent or stronger bottleneck in the Faroese population. Signals of positive selection were identified at loci containing genes that play roles in vitamin D and dietary fat absorption and DNA repair, while increased diversity on lactase persistence haplotypes was observed. Fine-scale analysis of haplotype structure in modern and ancient European genomes revealed genetic affinities with ancient Iron Age individuals from the North and West of Europe, providing evidence for potential contributions to the Faroese gene pool from Celtic and Viking populations as well as information about the temporal order in which these events happened. This study highlights the impact of evolutionary processes, such as ancient admixture, founder events, and positive selection, on the present-day genetic architecture of North Atlantic founder populations like the Faroe Islands.
Introduction
The Faroe Islands, nestled in the North Atlantic Ocean between Iceland and Norway, are home to the descendants of a North Atlantic founder population with a rich cultural heritage and a unique history shaped by both migration and periods of relative isolation. The exact settlement history of the islands is unclear, though historical records and analysis of Y-chromosome microsatellite markers point to a few founders most likely having arrived primarily from Scandinavia and the British Isles starting around the 9th century C.E.1–3 However, archeological evidence supports a possible earlier settlement of the islands around the 4th-6th centuries C.E.4 Studies of mtDNA reveal an excess of maternal ancestry from the British Isles, while Y chromosome studies reveal an excess of paternal ancestry from Scandinavia, suggesting sex-biased admixture between these ancestral groups during the founding of the population.2,5 Since settlement and early waves of migration, the Faroe Islands have been mostly isolated and have experienced minimal immigration and population growth until recent years, with a census size of about 4,000 people in the 1700s6 increasing to over 54,000 as of August 2023 (https://hagstova.fo/en/population/population/population). Patterns of genetic diversity and linkage disequilibrium from few genetic markers reflect the founder event, followed by a historically small population size and subsequent rapid expansion.7
Overall, despite the small size and remote location of the Faroe Islands, the above evidence suggests that the genetic makeup of Faroese people may have been influenced by waves of early migration and admixture from various northwest European and Scandinavian populations. However, no population genomic studies have yet been carried out on whole-genome sequencing data, with the exception of one study investigating relatedness and autozygosity for a limited sample size of eight individuals.8 An in-depth analysis of the genomic architecture of the Faroese may reveal how the islands’ demographic history has contributed to present-day health and disease in this population. This is of particular interest as the Faroese have a high burden of certain diseases relative to global and other European populations, such as inflammatory bowel disease and type 2 diabetes, among others.9–18 The Faroe Genome (FarGen) Project set out to understand how the genetic diversity of the Faroese contributes to health.19
Moreover, studying the genetic diversity of the Faroese provides potential insight into human migration, adaptation, and population structure in the North Atlantic region. Recently developed haplotype-based methods can provide fine resolution for inferring shared ancestry among individuals and the detection of population-specific haplotypes, but they require panels of whole-genome sequencing data.20 These methods serve to better account for recent patterns of population structure and cryptic relatedness in population-based genetic studies, particularly in populations like the Faroe Islands with strong founder effects.
In this study, we present the first whole genomes sequenced as part of the FarGen Project. A set of 40 individuals were selected to optimally represent the genetic diversity of the broader Faroese population, with the aid of genealogical data reaching back to approximately 1650 CE. Whole genome sequencing (WGS) data was generated and used to identify putatively functional alleles enriched in the Faroese population, assess ancestry patterns within contemporary genomes, map signals of recent positive selection, and analyze local ancestry in a combined dataset of ancient and contemporary genomes of European descent spanning a period of 3,000 years.
This study provides insight into the genetic variation, demographic history, and selection landscape of the Faroese population. These first whole genomes from FarGen may serve as a useful reference for studies on the broad implications of various evolutionary genetic processes, including bottlenecks, ancient admixture, and positive selection. More in-depth studies may expand this current study by further investigating the genetic architecture of the Faroese to unravel the demographic and evolutionary history of the population and its impact on complex traits and diseases in the islands.
Results
Whole Genome Sequencing of Faroese Individuals
The Faroese Multi Generation Register (https://fargen.fo/research/multi-generation-registry) was used to reconstruct a single connected genealogical tree for the 1,541 participants in the first phase of the FarGen cohort.19,21 Individuals with fewer than six direct ancestors (two parents and four grandparents) recorded in the registry were excluded. Pairwise kinship coefficients were estimated from the genealogical tree and used to perform relatedness pruning with a threshold of 2^-6, resulting in 332 minimally related individuals. We defined six geographical regions of the Faroe Islands using language dialects, and assigned individuals to these regions based on their place of birth (Fig. 1A). A total of 40 minimally related individuals were selected for inclusion in the study, with five to eight individuals sampled from each region.
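The pruning step can be illustrated with a minimal sketch. The source does not state which pruning algorithm was used, so the greedy ordering, the strict "less than" comparison, the helper name `prune_related`, and the toy kinship values below are all assumptions made for illustration only.

```python
KINSHIP_THRESHOLD = 2 ** -6  # 0.015625, the threshold stated in the text


def prune_related(ids, kinship, threshold=KINSHIP_THRESHOLD):
    """Greedily keep individuals whose kinship with every kept individual is < threshold.

    `kinship` maps frozenset({a, b}) -> pairwise kinship coefficient; pairs that
    are absent are treated as unrelated (0.0). Hypothetical helper, not FarGen code.
    """
    kept = []
    for candidate in ids:
        if all(kinship.get(frozenset((candidate, k)), 0.0) < threshold for k in kept):
            kept.append(candidate)
    return kept


# Toy example: A and B are first cousins (kinship 1/16); C is only distantly related to A.
toy_kinship = {
    frozenset(("A", "B")): 1 / 16,
    frozenset(("A", "C")): 1 / 256,
    frozenset(("B", "C")): 0.0,
}
minimally_related = prune_related(["A", "B", "C"], toy_kinship)
```

In this toy case B is dropped because its kinship with A exceeds the threshold, while C is retained. A greedy pass depends on input order, so a real analysis might prefer a method that maximizes the number of retained individuals.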

A Faroese whole genome reference.
A) Map of the Faroe Islands, colored by the six sampling regions. The number of minimally related FarGen participants from each region selected for whole genome sequencing is indicated. B) Principal component analysis (PCA) of Faroese genomes jointly called with relevant 1000 Genomes reference data shows separation of European groups by PCs 3 and 4 (FARO, Faroese, CEU, Central Europeans, GBR, British, FIN, Finnish, IBS, Iberian, TSI, Tuscan, CHB, Han Chinese, YRI, Yoruban). C) Faroese enriched putatively functional alleles visualized by minor allele count, CADD score, and Variant Effect Predictor (VEP) consequence. Variants shown are those with CADD > 30 and at least two minor alleles observed in Faroese individuals, and no minor alleles observed in Finnish or Northern European reference individuals. D) HLA-B allele frequencies for alleles detected at least twice in Faroese individuals. In this cohort, 1 minor allele corresponds to an allele frequency of 1.25%.
The genomes of the 40 individuals were sequenced to a median depth of 20x in the Faroe Islands at the FarGen laboratory, and they all passed quality control metrics (Fig. S1, Table S1). Variant calling was performed jointly with relevant reference genome panels, including 1000 Genomes high-coverage data from Northern European (CEU & GBR, N = 190), Southern European (TSI & IBS, N = 214), Finnish (FIN, N = 99), East Asian (CHB, N = 103), and West African (YRI, N = 108) individuals, and imputation was performed within the cohort using an approach we have previously described.22,23 The first component of principal component analysis (PCA) on the jointly called genotype data captured the cline between West African (YRI) and all other individuals, and the second component captured the cline between East Asian (CHB) and all other individuals (Fig. S1). Principal components three and four separated European individuals, with Faroese individuals forming a distinct cluster from Finnish, Northern European, and Southern European reference groups (Fig. 1B).
Variant calls were annotated with predicted functional impact and allele frequency across the reference groups. Using these annotations, we identified 35 putatively functional alleles present in our Faroese panel that are unobserved in the European mainland reference panels included in this study (CADD > 30 and at least two minor alleles observed in Faroese individuals, and no minor alleles observed in Finnish or Northern European reference individuals, Table S2). These included 13 stop gain variants and 18 missense variants, with a maximum minor allele count of 5, corresponding to a frequency of 6.25% in the cohort (Fig. 1C).
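The filter and the count-to-frequency conversion described above can be sketched as follows. The `Variant` record, its field names, and the function names are hypothetical; only the thresholds (CADD > 30, at least two Faroese minor alleles, none in the Finnish or Northern European references) and the cohort size of 40 come from the text.

```python
from dataclasses import dataclass

N_FAROESE = 40                 # diploid individuals in the cohort
N_CHROMOSOMES = 2 * N_FAROESE  # 80 sampled chromosomes


@dataclass
class Variant:
    variant_id: str
    cadd: float
    mac_faroese: int         # minor allele count in the Faroese cohort
    mac_finnish: int         # minor allele count in FIN
    mac_north_european: int  # minor allele count in CEU + GBR


def is_faroese_enriched(v: Variant) -> bool:
    """Apply the three criteria stated in the text."""
    return (
        v.cadd > 30
        and v.mac_faroese >= 2
        and v.mac_finnish == 0
        and v.mac_north_european == 0
    )


def cohort_frequency(minor_allele_count: int, n_chromosomes: int = N_CHROMOSOMES) -> float:
    """Convert a minor allele count to an allele frequency in the cohort."""
    return minor_allele_count / n_chromosomes


# Made-up records, for illustration only.
examples = [
    Variant("toy_variant_1", cadd=35.0, mac_faroese=5, mac_finnish=0, mac_north_european=0),
    Variant("toy_variant_2", cadd=35.0, mac_faroese=1, mac_finnish=0, mac_north_european=0),
    Variant("toy_variant_3", cadd=40.0, mac_faroese=3, mac_finnish=1, mac_north_european=0),
]
enriched_ids = [v.variant_id for v in examples if is_faroese_enriched(v)]
```

With 80 chromosomes, one minor allele corresponds to 1/80 = 1.25% and five minor alleles to 5/80 = 6.25%, matching the figures quoted in the text.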
HLA-B Allele Frequencies
Observational evidence from the FarGen project found that ankylosing spondylitis (AS) may be at a higher prevalence in the Faroe Islands (unpublished data); however, this has not been confirmed by follow-up epidemiological studies. The major histocompatibility complex (MHC) plays a role in various autoimmune diseases that may be at higher prevalence in the Faroes, including ankylosing spondylitis and other more common diseases like inflammatory bowel disease.24,25 In particular, HLA-B*27 is associated with AS, with approximately 80-90% of AS patients carrying the HLA-B*27 allele.25 It explains about 30% of the heritability, and ∼6-8% of European populations are carriers of HLA-B*27.25–28 Using the WGS data, we genotyped human leukocyte antigen (HLA) alleles with HLA*LA.29 We provide HLA-B allele counts and allele frequencies in the Faroese cohort as well as the allele frequencies in 1000 Genomes British (GBR), Central European (CEU), and Finnish (FIN) individuals (Table S3). To the best of our knowledge, there have not been any larger studies of HLA-B allele frequencies in Faroese individuals, and none are currently recorded in the Allele Frequency Net Database.30
The most frequent HLA-B allele in the Faroese cohort is HLA-B*07:02 (17.5% of haplotypes), which is a common haplotype in European populations (Fig. 1D).31,32 We did not observe a substantial difference in HLA-B*27 allele frequency in Faroese individuals (6.25% with four B*27:05 and one B*27:01 calls) as compared to other European reference groups (2-7.6%). While 80-90% of people of European ancestry with AS carry the HLA-B*27 allele, only ∼6% of HLA-B*27 carriers in the US and Europe have AS.33–35 The low frequency of the HLA-B*27 allele in this Faroese cohort, and by extension in the population more broadly, suggests that if AS is at a higher prevalence, there may be other underlying genetic or environmental factors that explain some of the increased risk.
Population Structure and Relatedness
Pairwise kinship was calculated in the cohort using popkin.36–38 Clustering by geography was observed when including global reference populations in the kinship calculation, however the Faroese individuals do not show obvious clustering by region, which is expected given the relatedness pruning during sample selection (Fig. S2A-B).
We also looked at runs of homozygosity (ROHs) in the Faroese and reference cohorts (Fig. 2), which can provide insights into the demographic history of the population. As the Faroese population likely experienced a founder event during the settlement of the islands followed by rapid population size expansion in recent generations, we would expect to see more of the genome contained in ROHs compared to other global populations that have not experienced as strong a bottleneck.39 When looking at the sum total amount of the genome in ROHs, we found overall elevated levels of ROH in the European and Asian groups included in this analysis, most likely reflecting ancestral out-of-Africa bottlenecks for Eurasian populations (Fig. 2, top panel). However, the Faroese population did not have an elevated total amount of the genome contained in ROHs compared to other European groups.

Runs of homozygosity by group.
Amount of the genome (Mb) contained in runs of homozygosity (ROH) stratified by group. Top panel is the sum total of the genome contained within ROH, with the other panels showing this split by length (short, medium, and long).
To explore this further, we calculated the sum total amounts of ROH at different size categories (short, medium, long) (Fig. 2, bottom 3 panels). We see that, on average, the Faroese individuals have less of the genome contained in short (< 300 Kb) and medium (> 300 Kb and <= 1 Mb) ROHs compared to other European groups, but more of the genome contained in long (> 1 Mb) ROHs. Short and medium ROH are chunks inherited from older ancestors and reflect older events, for example, an ancient population bottleneck or founder event that has resulted in lower overall haplotype diversity, yet with enough time for recombination to break up haplotypes, while long ROH reflects chunks inherited from recent ancestors and can reflect more recent bottleneck events.39 Interestingly, the average amount of an individual’s genome that is contained in ROHs extending over 1 Mb in length is higher in the Faroese population (∼82.5 Mb) than the Finnish reference individuals (∼63.9 Mb) and any of the other groups analyzed, which is consistent with a more recent or stronger bottleneck or founder event.
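The per-individual summary by length class can be sketched as below, assuming the short/medium boundary is 300 Kb and the medium/long boundary is 1 Mb. The function names, the (start, end) segment representation, and the choice to place a segment of exactly 300 Kb in the short class are assumptions; the ROH caller and its output format are not specified in this section.

```python
SHORT_MAX_BP = 300_000     # assumed upper bound of the "short" class
MEDIUM_MAX_BP = 1_000_000  # upper bound of the "medium" class (<= 1 Mb)


def roh_class(length_bp: int) -> str:
    """Assign an ROH segment to a length class."""
    if length_bp <= SHORT_MAX_BP:
        return "short"
    if length_bp <= MEDIUM_MAX_BP:
        return "medium"
    return "long"


def roh_totals_mb(segments):
    """Sum ROH length (in Mb) per class for one individual.

    `segments` is an iterable of (start_bp, end_bp) tuples for that individual.
    """
    totals_bp = {"short": 0, "medium": 0, "long": 0}
    for start_bp, end_bp in segments:
        length_bp = end_bp - start_bp
        totals_bp[roh_class(length_bp)] += length_bp
    totals_mb = {name: bp / 1e6 for name, bp in totals_bp.items()}
    totals_mb["total"] = sum(totals_bp.values()) / 1e6
    return totals_mb


# Toy individual with one segment in each class.
toy_segments = [(0, 200_000), (1_000_000, 1_500_000), (5_000_000, 7_500_000)]
toy_totals = roh_totals_mb(toy_segments)
```

Averaging the "long" entry across individuals in a group would give a quantity comparable to the per-group means reported in the text.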
Signals of Positive Selection
We investigated signals of recent or ongoing positive selection in this Faroese cohort by calculating the integrated haplotype score (iHS)40 using hapbin.41 The sample size of the WGS cohort is relatively small (n=40), so our ability to detect signals of selection is limited. For comparison, we also calculated iHS in British individuals from the 1000 Genomes (GBR) that were included in joint calling and passed subject-level filters (n=90). The iHS values were standardized genome-wide and two-tailed p-values were computed according to the standard normal distribution. We additionally calculated q-values (the minimum False Discovery Rate (FDR) should a test be considered significant) for each test,42 and determined p-value significance thresholds at which FDR < 0.01 and FDR < 0.001 for each population (see Methods). We observed that a number of significant selection signals were shared between the Faroese and British cohorts, which is not unexpected given the relationship between these two populations (Fig. 3A-B). The strength of these signals did differ from one population to the other, though this may be due to differences in sample size or changes in selection pressure after the populations diverged. To better identify population-specific signals, we also calculated the cross-population expected haplotype homozygosity (XPEHH) comparing Faroese and British haplotypes, and identified significance cutoffs following the same approach described above (Fig. 3C).43 Any extreme positive values of this statistic indicate longer haplotypes at a focal marker in the Faroese cohort compared to the British cohort, while extreme negative values indicate the reverse. Therefore, positive values are indicative of selection signals in the Faroese cohort. Across both tests, we highlight 20 loci with the most extreme values for these statistics, serving as evidence of positive selection in the Faroese genomes at those loci (Table S4).
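The path from raw scores to p-values and FDR-adjusted values can be sketched with the Python standard library. This is an illustration under stated assumptions, not the authors' pipeline: the standardization here is a plain genome-wide z-score, and the Benjamini-Hochberg adjustment is used as a simple stand-in for the q-value method cited in the text, which may differ. All function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def standardize(scores):
    """Genome-wide z-scores: subtract the mean, divide by the standard deviation."""
    mu = mean(scores)
    sd = pstdev(scores)
    return [(s - mu) / sd for s in scores]


def two_tailed_p(z):
    """Two-tailed p-value under a standard normal: P(|Z| >= |z|) = erfc(|z| / sqrt(2))."""
    return math.erfc(abs(z) / math.sqrt(2))


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (a stand-in for q-values)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def p_cutoff_at_fdr(pvalues, adjusted, fdr):
    """Largest p-value whose adjusted value is below `fdr`, or None if there is none."""
    passing = [p for p, q in zip(pvalues, adjusted) if q < fdr]
    return max(passing) if passing else None


# Toy p-values, for illustration only.
toy_p = [0.01, 0.04, 0.03]
toy_adjusted = bh_adjust(toy_p)
```

The cutoff helper mirrors the idea of reporting a p-value threshold corresponding to a chosen FDR level, as is done for the dashed and dotted lines in the selection scan figure.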

Selection scan results for Faroese and British cohorts.
A) Log transformed two-tailed p-value of the standardized integrated haplotype score (iHS) in the 40 Faroese genomes (FARO). B) Log transformed two-tailed p-value of the standardized iHS for 90 British WGS samples from 1000 Genomes (GBR). C) Log transformed two-tailed p-value for the standardized cross-population expected haplotype homozygosity (XPEHH) for FARO vs GBR (only positive values, which indicate selection in FARO, are plotted). Some genes in the top loci are indicated on each plot. The p-value cutoffs which correspond to a False Discovery Rate (FDR) at 0.01 and 0.001 are respectively indicated by the red dotted line and blue dashed line in each plot. A) For iHS in FARO, these cutoffs are p = 2.72 x 10^-6 (FDR = 0.01) and p = 9.20 x 10^-8 (FDR = 0.001). B) For iHS in GBR, the cutoffs are p = 2.78 x 10^-6 (FDR = 0.01) and p = 1.75 x 10^-7 (FDR = 0.001). C) For XPEHH in FARO vs GBR, the cutoffs are p = 2.35 x 10^-6 (FDR = 0.01) and p = 3.01 x 10^-8 (FDR = 0.001). See Methods for details on p-value and FDR estimation.
One signal that has been consistently observed across northern European populations is in the LCT/MCM6 region, corresponding to positive selection for lactase persistence alleles.40,43–45 Interestingly, this region showed strong iHS signals (|standardized iHS| > 8) in GBR and was considered genome-wide significant in our analysis (minimum p = 5.95 x 10^-19, q = 7.82 x 10^-14) (Fig. 3B), but the signal is weaker in the Faroese (|standardized iHS| > 4) and was not considered significant in our analysis (minimum p = 3.57 x 10^-6, q = 0.0118) (Fig. 3A). To investigate the haplotype structure further, we plotted the decay in expected haplotype homozygosity (EHH)46 and haplotype furcation around one of the lactase persistence alleles (rs4988235; chr2_135851076_G_A) for the Faroese and British cohorts using the rehh package (Fig. 4A-D).47 The decay and furcation plots are centered around the focal marker, and a furcation occurs when unique haplotypes arise at an allele, similar to a tree splitting into branches. Thicker branches in the furcation plot indicate higher frequency of that haplotype in the population. Significant differences between the furcation patterns for an “ancestral” (reference) and “derived” (alternate) allele correspond to extreme iHS values and therefore are indicative of strong positive selection. For an alternative view of the region, we used Haplostrips to visualize the haplotype structure from chr2:135677850-135986443 (Fig. 4E).48 From these plots, we observed far less diversity on the lactase persistence haplotype in GBR, consistent with a stronger selection signal. This may be explained by shared selection on the ancestral northern European branch followed by either relaxed selection for lactase persistence or population-specific drift in the Faroes after the population split from other northern European groups and settled the archipelago.
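To make the EHH decay concrete, the sketch below computes EHH at a focal marker using the common definition: among haplotypes carrying a given core allele, the probability that two randomly chosen haplotypes are identical across the interval from the focal marker to a second marker. The function name, the string encoding of haplotypes, and the toy data are assumptions; the analysis in the text used the rehh package, whose implementation details may differ.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, focal, core_allele, marker):
    """EHH for haplotypes carrying `core_allele` at index `focal`, out to index `marker`.

    `haplotypes` is a list of equal-length strings of allele codes (e.g. "0"/"1").
    """
    carriers = [h for h in haplotypes if h[focal] == core_allele]
    n = len(carriers)
    if n < 2:
        return 0.0
    lo, hi = sorted((focal, marker))
    # Group carriers by their sequence across the interval [lo, hi].
    groups = Counter(h[lo:hi + 1] for h in carriers)
    identical_pairs = sum(comb(count, 2) for count in groups.values())
    return identical_pairs / comb(n, 2)


# Toy phased haplotypes; the focal marker is at index 2.
toy_haplotypes = ["00100", "00100", "00101", "01101", "10000", "11001"]
decay = [ehh(toy_haplotypes, focal=2, core_allele="1", marker=m) for m in range(5)]
```

EHH equals 1 at the focal marker and falls as distinct haplotypes arise with distance, which is the decay plotted in the EHH panels. Each drop corresponds to a split in the furcation plot, and iHS contrasts the integrated decay curves of the two alleles.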

Haplotype visualizations for the LCT/MCM6 locus.
A) Decay in Expected Haplotype Homozygosity (EHH) and B) haplotype furcation plot for FARO centered on lactase persistence allele rs4988235; chr2_135851076_G_A. C) Decay in EHH for GBR and D) haplotype furcation for GBR centered on the same allele. E) Haplostrips visualization of haplotype structure in the region chr2:135677850-135986443. In this panel, columns correspond to segregating alleles, and rows correspond to individuals. In the haplotype furcation plots (panels B & D), the haplotypes for the reference allele (G) are in blue, and those for the alternate allele (A) are in red.
One of the top XPEHH signals in the Faroese WGS cohort included variants in SLC10A1 (Fig. S3), a sodium/bile acid transporter that plays a role in circulating bile salts to and from the liver and small intestine for the absorption of dietary fat and fat-soluble vitamins such as vitamin D.49–52 SLC10A1 deficiency has been associated with familial hypercholanemia, or elevated concentrations of bile acids, which can lead to fat malabsorption and vitamin D deficiencies among other secondary health conditions.50,53,54 Another top XPEHH signal included variants in POLQ, encoding for a DNA polymerase which plays a role in DNA repair (Fig. S4).55–59 POLQ has been shown to be involved in various cancers in mice and humans, in particular skin, stomach, lung, breast, and colon cancers.55,59–62
Fine-Scale Structure and Connections to Ancient Genomes
Given that early Faroese settlers have documented historical relations to both Northern European Vikings and Northwestern European Celtic communities, we sought to study fine-scale genome-wide ancestry relationships between the sequenced Faroese genomes and publicly available ancient genomes from Iron Age and Viking Age Europe. We downloaded 616 ancient imputed genomes from Allentoft et al. 2024, spanning from the late Bronze Age to the present day and focusing specifically on West and North Europe, including ancient Faroese genomes.63 We incorporated these genomes into a combined panel including our present-day Faroese dataset and used the software HaploNet to infer fine-scale population structure based on patterns of haplotype similarity across the genome.20
HaploNet identified five ancient sources through unsupervised ancestry estimation. We then used HaploNet to model both ancient and present-day Faroese genomes as composites of any of the five ancestries through supervised ancestry estimation. We used the ancestral haplotype cluster frequency estimates from individuals not found in the Faroe Islands in order to estimate the admixture proportions of these sources in the European mainland. We used these admixture proportions to label the sources based on the locations on the map where these ancestries tend to be maximized in the Iron Age and Viking Age periods. The resulting labels were: “Steppe”, “East Europe”, “Levant and East Mediterranean”, “West Europe” and “North Europe”. For example, the ancestry labeled “West Europe” is maximized in individuals predominantly found in Celtic contexts (e.g. Roman and Iron Age Britain, Iron Age France) while the ancestry “North Europe” is maximized in individuals characteristic of historically Viking or pre-Viking contexts (e.g. Iron Age individuals from Denmark, as well as Viking Age individuals from Denmark, Norway, Sweden, and Estonia). However, we note there is no one-to-one correspondence between archaeological context and genetically inferred ancestry, and that many mainland individuals contain inferred ancestries from diverse sources. Indeed, Margaryan et al. 2020 showed that Viking-context individuals can derive ancestries from multiple Bronze and Iron Age sources across Europe.64
We then focused on the frequency of the ancestry sources in the Faroese individuals. We find that the present-day Faroese individuals are predominantly composed of roughly equal proportions of “West” and “North Europe” ancestry, while “East” and “South Europe” ancestries are detected at much lower frequencies. Present-day ancestry proportions are similar to those found in the Faroese ancient samples from the Sandur church site in Sandoy, dated to the 17th or 18th centuries based on their archaeological contexts (Fig. S5).64 Margaryan et al. also sequenced a Faroese sample that was excavated from the á Bønhúsfløtu site in the village of Hvalbøur on Suðuroy, and was contextually dated to be approximately 800 years old. This individual is inferred to be almost entirely composed of “West Europe” ancestry (Fig. S6).
We utilized haplotype cluster likelihoods to explore population structure in the same set of samples. When plotting ancient European samples together with the Faroese samples (Fig. 5), we observe that present-day Faroese individuals (circled in black) separate from the ancient Europeans along the second principal component (PC2), as do the older Faroese samples from the 17th and 18th centuries (circled in red). This perhaps suggests a bottleneck process that differentiates the 17th/18th-century and present-day Faroese from the rest of the ancient European samples, in concordance with the above ROH results. Notably, the 800-year-old sample of a Faroese individual with predominantly “West Europe” ancestry does not fall along the Faroese PC2 cline, suggesting that this individual might predate the bottleneck.

PCA analysis of 616 ancient imputed genomes from Europe and 40 present-day Faroese genomes.
Each individual is depicted as a pie chart, showing ancestry proportions estimated using HaploNet. Ancestry proportions for ancient individuals were estimated unsupervised, while those for present-day Faroese individuals were estimated semi-supervised using ancient genomes as references. The five colors represent different ancestral sources: orange for West Europe, green for North Europe, blue for Steppe, purple for the Levant and East Mediterranean, and red for East Europe. The geographical distribution (bottom-right) highlights historical samples (250 years BP) in red, this study’s samples in black, and an 800-year-old individual sample in blue.
Discussion
Here we presented the first whole-genome sequence data from 40 minimally related individuals from across the Faroe Islands, a North Atlantic founder population. The Faroese have a high prevalence of several diseases in comparison to other European or global populations, several of which have been of particular interest in epidemiological studies of the region (e.g. inflammatory bowel disease, type 2 diabetes, multiple sclerosis, and ankylosing spondylitis).9–18 We investigated the frequency of HLA-B27 in the Faroe Islands, an allele which has been previously associated with ankylosing spondylitis (AS). We found that, despite observational evidence of a high prevalence of AS in the Faroe Islands, there was no evidence of increased HLA-B27 allele frequency compared to other European populations. However, a stop gain variant in SERPINB10 was among those enriched in the Faroese cohort, and it may contribute to increased AS risk (rs138084090, AF = 2.5%). Rare variant analysis of this gene in the UK Biobank found that an independent stop gain variant in the same gene is nominally associated with increased AS risk (rs145346731, p = 2.5e-4), which was the most significant phenotypic association for the gene.65 SERPINB10 is most highly expressed in neutrophilic metamyelocytes,66 a cell type with disease relevance to AS.67 CCDC168, another gene with a Faroese enriched stop gain variant, had highly significant rare variant associations in the UK Biobank with corneal hysteresis and intraocular pressure (rs1361247423, p = 1.79e-57, p = 5.21e-13, respectively).
Finally, a Faroese enriched stop gain variant in AGL identified in this study (rs113994128) has been previously reported as causing glycogen storage disease type IIIA in the Faroe Islands, which is estimated to have the highest prevalence of the disease world-wide.68 These findings emphasize the importance of further research into the role that unique genetic variation in the Faroe Islands may play in the incidence of diseases that, while common in the Faroes, have a global burden as well.
We found elevated amounts of the genome contained in long runs of homozygosity (ROHs) compared to other European reference cohorts, including another founder population from Finland. The higher proportion of the Faroese genome contained in these long ROHs suggests a stronger or more recent bottleneck in the Faroese population history. With the second phase of the FarGen study, a larger sample size will facilitate investigation into the timing and severity of the bottleneck(s). Additionally, founder populations such as the Finnish have been a common focus for studies of founder events, genetic isolation, and the effects of haplotype sharing on aspects of human health and disease.69–72 As such, the Faroese population may serve as another useful global reference for studying the influence of demographic history on genetic variation and trait architecture. Long ROHs, in particular, can be enriched for deleterious variation or be of interest in understanding the genetic architecture of health-related traits in human populations.39,73 Studying these long ROHs may be relevant for future studies of health outcomes or other traits of interest in the Faroese population.
We also detected several regions under recent positive selection in the population. We found that many of the top selection signals were shared between the Faroese and British cohorts, which is unsurprising given the recent divergence between these two populations. We did find differences in the strength of these signals; for example, there is more diversity on the Faroese LCT/MCM6 lactase persistence haplotypes. The lactase persistence allele rs4988235 (chr2_135851076_G_A) has been inferred to be under strong selection at least until the medieval period in northwestern European groups.74,75 Patterson et al. 2021 found that the rate of increase in allele frequency may have slowed in recent periods; however, this does not exclude the possibility of continued or fluctuating selective pressure as this is consistent with the expected sigmoidal trajectory for an allele under ongoing selection.76 While the increased diversity on the Faroese lactase persistence haplotypes may be simply explained by population-specific drift, this result could also indicate relaxed selection for lactase persistence alleles after settlement of the Faroe Islands, possibly due to changes in dietary habits in the new environment. The traditional diet of the Faroe Islands consisted of a higher reliance on animal and marine fats such as sheep tallow, whale blubber, and liver from codfishes, while dairy products such as milk and cheese, and particularly that from cattle, were more limited in availability.77 Based on the ancient genomes for which we have available data, we can attest that the selected variant at rs4988235 was present at high frequencies already in the 17th/18th centuries, and imputation further suggests the haplotype containing the allele to at least have been present in the islands 800 years ago.
We detected selection targets that were specific to the Faroese population using the XPEHH statistic with the British cohort as the comparison population. One top selection signal is in POLQ, which plays a role in DNA repair and various cancers. Without collecting relevant phenotypes or environmental factors, it is difficult to hypothesize what selection pressure may be driving the strong signal in POLQ, so this will be an important area of follow-up for future studies. In another top signal, we find SLC10A1 which plays a role in fat and vitamin D absorption. Positive selection related to differences in dietary fat intake has been hypothesized in many human populations, such as the Inuit population in Greenland.78,79 Also situated in a far northern latitude, the Faroese diet is similar to that of the Inuit population, relying on animal and marine fats.77,80 The relationship between SLC10A1 and vitamin D levels may also be relevant, as the northern latitudes of the Faroe Islands and minimal UV exposure can lead to vitamin D deficiency, which has been hypothesized to be a strong selection pressure in populations in extreme latitudes.81,82
Although we hypothesize that these results suggest possible adaptations to environmental pressures of diet or UV exposure in northern latitudes, we cannot draw definitive conclusions based on this current study. It is certainly possible that variants that have risen to high frequencies due to past or ongoing positive selection now play a role in health outcomes in modern populations. For example, the gene TBC1D4, which was shown to be under positive selection in the Greenlandic Inuit population likely due to a historical diet low in carbohydrates, has been associated with type 2 diabetes and insulin resistance in the same population.80,83 The prevalences of some diseases enriched in the Faroese population may be related to genomic regions under positive selection. For example, the results of several studies have suggested a role of vitamin D deficiency in the development of multiple sclerosis.84–86 Future studies could involve collection of relevant phenotypes and focus on characterizing selective pressures and fine-mapping targets of selection as have been done in studies that more thoroughly characterized selection signals related to dietary adaptation and UV exposure and their functional consequences in other northern latitude populations.79–81,83
We have inferred ancestry tracts in the present-day Faroese genomes that were inherited from ancient populations throughout Europe. We found that present-day Faroese individuals have similar relative ancestry contributions from past “North” and “West Europe” Iron Age populations. The most ancient genome available from the Faroe Islands matches ancestry patterns found in Iron Age West Europe. This suggests that the “North Europe” ancestry may have arrived in the islands afterwards. This could have occurred either via a mixture of the original “West Europe” ancestry with individuals of predominantly “North Europe” ancestry, or by a replacement with individuals that were already of mixed ancestry at the time of arrival in the islands (such individuals are not uncommon in Viking Age mainland Europe). Our analysis also suggests a bottleneck or a more progressive differentiation process in the islands relative to the mainland, which may postdate the most ancient Faroese genome currently available (approximately 800 years old). The most ancient Faroese sample from Margaryan et al., composed almost entirely of “West Europe” ancestry, is a male individual found in a chapel-site in Suðuroy. Consistent with this, a local legend suggests this site may have been occupied by Irish monks.87 More ancient and present-day samples from the islands could shed further light on the history of the Faroese population.
This study focused on population genomic analyses such as selection scans, population structure, kinship, and ancestry. Given the unique settlement history and genetic architecture of the Faroe Islands, future studies which combine genomic data with relevant phenotype data could provide useful insight into the underlying genetic mechanisms of those traits. In particular, larger-scale genomic studies in the Faroese could investigate genetic risk factors which contribute to the high prevalence of autoimmune and metabolic disease on the islands. This is a focus of the second phase of the FarGen study, which is currently ongoing.
Methods
Sample Selection and Cryptic Relatedness
FarGen cohort
The participants in this study voluntarily enrolled in the FarGen project (The Faroe Genome Project: https://www.fargen.fo/en/home/). The 1,541 subjects are extensively described in Apol et al. 2022.21 Participant inclusion criteria for the FarGen project are that participants must live in the Faroe Islands or be of Faroese descent. Apol et al. report that 96.4% of the participants have between one and four Faroese grandparents. The cohort has a mixed health status composition, with 75% of the participants self-reporting that they have a confirmed diagnosis. Apol et al. found that the cohort is somewhat biased in terms of geographical representation, with the capital region being substantially over-represented.
Reconstruction of genealogy
The Multi-Generation Register at the Faroese Health Authority describes the ancestry of inhabitants of the Faroe Islands (http://fargen.fo/research/multi-generation-registry). The lineages can be traced back to approximately 1650 C.E. The register records birth date, parent identities, parents’ residence at the time of birth, and more. The Legacy Family Tree (https://legacyfamilytree.com) genealogy software is used to manage the digitized records. We reconstructed a genealogical tree of all the individuals in the FarGen cohort by looking up each individual and recursively looking up their parents until no further ancestors were recorded. After reconstructing the genealogy of each individual two generations into the past, we discarded any individual who had fewer than 6 direct ancestors recorded in the Multi-Generation Register (2 parents and 4 grandparents). We note that although an individual is recorded in the register, there is no guarantee they were born in the Faroe Islands.
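The lookup and the two-generation completeness check described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the register layout (a mapping from a person ID to a father/mother pair) and all function names are assumptions, and the check counts recorded ancestor slots, which is one possible reading of "fewer than 6 direct ancestors".

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical register layout: person ID -> (father ID, mother ID).
# None marks a parent who is not recorded.
Register = Dict[str, Tuple[Optional[str], Optional[str]]]


def all_ancestors(register: Register, person: str) -> Set[str]:
    """Follow parent links upward until no recorded ancestors remain."""
    found: Set[str] = set()
    stack = [person]
    while stack:
        current = stack.pop()
        for parent in register.get(current, (None, None)):
            if parent is not None and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def count_recorded(register: Register, person: str, generations: int) -> int:
    """Count recorded ancestor slots within `generations` generations."""
    if generations == 0:
        return 0
    total = 0
    for parent in register.get(person, (None, None)):
        if parent is not None:
            total += 1 + count_recorded(register, parent, generations - 1)
    return total


def has_complete_two_generations(register: Register, person: str) -> bool:
    """True when both parents and all four grandparents are recorded."""
    return count_recorded(register, person, 2) >= 6
```

Individuals for whom `has_complete_two_generations` returns False would be the ones discarded under this reading of the criterion.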
Geographical stratification through dialect
We defined six geographical regions of the Faroe Islands as annotated in Fig. 1A: Norðoyggjar; Eysturoy and Norðstreymoy; Suðurstreymoy; Vágar and Mykines; Sandoy, Skúvoy, and Stóra Dímun; Suðuroy, and placed individuals within these regions based on birth place. The boundaries of the regions were defined using isoglosses (i.e. boundaries where we see changes in dialect) as described in Þráinsson 2012.88 For example, the isogloss for "á", which may be pronounced either as [a:] or [ɔa], separates Norðoyggjar in the north from the rest of the islands.
Calculating pairwise kinship coefficients
In order to avoid sequencing highly related samples, we used the large constructed genealogy to account for cryptic relatedness. We calculated pairwise kinship coefficients between all pairs of individuals using kinship2.89 This method assumes that the founders (individuals in the pedigree without recorded parents) are unrelated to one another and that each founder’s parents were not related to each other, which may not always be the case in this population.
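To make the founder assumption concrete, the standard recursive definition of the pedigree kinship coefficient can be sketched in a few lines. This is an illustrative re-implementation under the same assumptions (founders unrelated and non-inbred), not the kinship2 code; the pedigree layout and the name `make_kinship` are hypothetical.

```python
from functools import lru_cache
from typing import Callable, Dict, Optional, Tuple

# Hypothetical pedigree layout: person ID -> (father ID, mother ID).
Pedigree = Dict[str, Tuple[Optional[str], Optional[str]]]


def make_kinship(pedigree: Pedigree) -> Callable[[Optional[str], Optional[str]], float]:
    """Return a function phi(a, b) giving the pedigree kinship coefficient."""

    @lru_cache(maxsize=None)
    def depth(person: str) -> int:
        # Founders have depth 0; everyone else is one deeper than their deepest parent.
        parents = [p for p in pedigree.get(person, (None, None)) if p is not None]
        return 0 if not parents else 1 + max(depth(p) for p in parents)

    @lru_cache(maxsize=None)
    def phi(a: Optional[str], b: Optional[str]) -> float:
        if a is None or b is None:
            return 0.0  # an unrecorded parent is treated as an unrelated founder
        if a == b:
            father, mother = pedigree.get(a, (None, None))
            return 0.5 * (1.0 + phi(father, mother))
        if depth(a) < depth(b):
            a, b = b, a  # recurse through the deeper individual, who cannot be an ancestor of the other
        father, mother = pedigree.get(a, (None, None))
        if father is None and mother is None:
            return 0.0  # two distinct founders are assumed unrelated
        return 0.5 * (phi(father, b) + phi(mother, b))

    return phi
```

Under this definition full siblings and parent-child pairs have a kinship of 1/4, first cousins 1/16, and second cousins 1/64. If founders are in fact related, as may happen in a small population, these pedigree-based values underestimate the true kinship.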
Sample selection using graph theory
We constructed a relationship graph with nodes representing individuals, and connected two nodes by an edge if their kinship coefficient was above a given threshold, as described below. Ideally, we would remove nodes such that all edges are removed while keeping as many nodes as possible, which is known as the maximum independent set problem. Obtaining an exact solution to the maximum independent set problem is NP-hard (no polynomial-time algorithm is known), making it infeasible for our application. Instead, we obtained an approximate maximum independent set using an algorithm described in Boppana et al. 1990.90 We performed relatedness pruning with a threshold of 2^-6 (1/64) on 1,294 FarGen individuals who were not missing a birth region, resulting in 332 individuals. This was the minimum threshold for which we could have enough sampling from each region. From these 332 individuals, we sampled 5 to 8 from each region, yielding 40 individuals for whole genome sequencing and subsequent analyses.
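The pruning idea can be illustrated with a small sketch. Note that the code below uses a simple minimum-degree greedy heuristic as a stand-in; it is not the subgraph-exclusion algorithm of Boppana et al. used in the study, and the function names and the `kinship` callable are assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, Hashable, Iterable, Set

Graph = Dict[Hashable, Set[Hashable]]


def relatedness_graph(
    individuals: Iterable[Hashable],
    kinship: Callable[[Hashable, Hashable], float],
    threshold: float,
) -> Graph:
    """Connect two individuals when their kinship exceeds the threshold."""
    nodes = list(individuals)
    graph: Graph = {node: set() for node in nodes}
    for a, b in combinations(nodes, 2):
        if kinship(a, b) > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def greedy_independent_set(graph: Graph) -> Set[Hashable]:
    """Greedy stand-in: repeatedly keep the node with the fewest remaining relatives."""
    remaining = {node: set(nbrs) for node, nbrs in graph.items()}
    chosen: Set[Hashable] = set()
    while remaining:
        # Ties are broken by the string form of the ID to keep the result deterministic.
        node = min(remaining, key=lambda n: (len(remaining[n]), str(n)))
        chosen.add(node)
        removed = remaining[node] | {node}
        for r in removed:
            remaining.pop(r, None)
        for nbrs in remaining.values():
            nbrs -= removed
    return chosen
```

The returned set contains no pair of individuals joined by an edge, so no retained pair exceeds the kinship threshold. A greedy choice does not guarantee the largest possible set, which is why an approximation algorithm with known guarantees is preferable in practice.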
Bioinformatics
Sequencing and quality control
DNA samples from the 40 selected individuals were sequenced at FarGen (using TruSeq PCR-free libraries on Illumina NextSeq 500 instruments) to an average depth of 19.2x, ranging from 9.9x to 31.5x per sample. Sequencing QC was investigated with FASTQC, Picard (including CollectWgsMetrics for coverage) and VerifyBamID for contamination.
Variant calling and imputation
Reads were processed with Variant Bio’s in-house processing pipeline based on the GATK Best Practices (CCDG functional equivalence version).91 Joint genotyping was performed with GATK (version 4.2.0.0) including 714 genomes from the 1000 Genomes Project (503 Europeans, 103 Han Chinese, 108 Yoruba) and followed by VQSR (--truth-sensitivity-filter-level to 99.8 for SNPs and 99.0 for indels). Only PASS variants in GIAB high-confidence regions (∼80% of GRCh38) were retained.92 Genotypes with GQ<=20 were filtered (set to missing) and imputation was performed within the full cohort of 754 genomes using Beagle v5.1. Variants enriched in the Faroese cohort and with predicted functional impact (Table S2) were additionally hard-filtered with VQSLOD>20.
HLA typing
We ran HLA*LA (v1.0.3) on the mapped reads to determine HLA types for the 40 Faroese individuals as well as for the GBR, CEU, and FIN reference population individuals from 1000 Genomes.29 Benchmarking the method on 1000 Genomes Project data, where HLA types are known, we estimate overall HLA-B typing accuracy at 93.4%, 88.8% and 89.9% for the GBR (N=91), CEU (N=99), and FIN (N=99) reference populations. Accuracy of B27 detection specifically is 100% (N=10), 83.3% (N=6) and 100% (N=15) based on these three reference cohorts, respectively, with one B*27:05 allele mis-identified as B*27:26 in CEU. Table S3 contains counts, mean quality scores, and frequencies of all HLA-B alleles detected among the 40 Faroese as well as the GBR, CEU, and FIN reference populations.
Population Genetics
Individual and variant-level filtering
Beginning with a total of 21,837,577 variants and 754 individuals, we applied various individual-level and variant-level quality control filters for downstream analyses. We filtered 8 individuals with mismatched sex based on genetics and reported information. We additionally filtered 5 individuals that were, for any of PCs 1-10, further than 7 standard deviations from the mean. We then filtered 481,403 variants that were not in GIAB high confidence regions, had MSQ rank sum not equal to zero or failed gnomAD v3 QC.93 We removed 3,627 variants with a minor allele count of less than 1 after individual-level filters were applied. We removed 695,947 variants that were not autosomal. This final dataset included 20,656,600 variants and 741 individuals.
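As a quick consistency check, the filtering counts reported above can be tallied. The sketch below only restates the numbers in the text; the dictionary labels are shorthand introduced here.

```python
# Counts as reported in the text above.
variants_start = 21_837_577
variants_removed = {
    "outside GIAB regions / rank-sum / gnomAD QC": 481_403,
    "minor allele count < 1 after individual filters": 3_627,
    "non-autosomal": 695_947,
}

individuals_start = 754
individuals_removed = {
    "mismatched sex": 8,
    "PCA outlier (> 7 SD on any of PCs 1-10)": 5,
}

variants_final = variants_start - sum(variants_removed.values())
individuals_final = individuals_start - sum(individuals_removed.values())

print(variants_final, individuals_final)
```

The tallies agree with the final dataset size stated in the text (20,656,600 variants and 741 individuals).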
Selection scans
iHS and XP-EHH were calculated using the hapbin software (https://github.com/evotools/hapbin) with options --max-extend 1000000 and --minmaf 0.05.41 All other options were set to default. To compute P-values, we used the method by Fariello et al. (2013), exploiting the fact that detectable regions under strong selection affect a small portion of the genome.94 For both statistics, values were standardized genome-wide in 2% allele frequency bins, as allele frequency is correlated with allele age and therefore haplotype length.40,43 We first computed an outlier-robust mean and standard deviation with the rlm() function from the MASS package in R, to reduce the influence of outliers.94,95 The standardized values of these summary statistics represent z-scores. We calculated two-tailed p-values using these z-scores, giving the probability that we observe these data by chance compared to null expectations for the standard normal distribution. Q-Q plots and histograms of p-values for each statistic are provided in the Supplementary Materials (Fig. S7). For each summary statistic distribution, we also calculated the p-value cutoffs that correspond to a False Discovery Rate (FDR) of less than 0.01 and 0.001, using the q-value R package (https://github.com/StoreyLab/qvalue).42
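The standardization and p-value steps can be sketched as below. This is a simplified illustration, not the analysis code: the study used rlm() from the R MASS package for the robust mean and standard deviation, whereas this sketch swaps in the median and the scaled median absolute deviation (MAD) as a simpler robust estimator. Function names are hypothetical, and the FDR step is not shown.

```python
import math
from collections import defaultdict
from statistics import median
from typing import List, Sequence, Tuple


def robust_center_scale(values: Sequence[float]) -> Tuple[float, float]:
    """Median and scaled MAD; a stand-in for the robust fit used in the study."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    return center, 1.4826 * mad  # 1.4826 makes the MAD comparable to a normal SD


def standardize_by_frequency_bin(
    scores: Sequence[float], freqs: Sequence[float], bin_width: float = 0.02
) -> List[float]:
    """Convert raw scores to z-scores within allele frequency bins."""
    n_bins = int(round(1.0 / bin_width))
    bins = defaultdict(list)
    for idx, freq in enumerate(freqs):
        # Small epsilon guards against floating-point error at bin edges.
        bins[min(int(freq / bin_width + 1e-9), n_bins - 1)].append(idx)
    z = [float("nan")] * len(scores)
    for members in bins.values():
        center, scale = robust_center_scale([scores[i] for i in members])
        if scale > 0:  # bins with no spread are left as NaN
            for i in members:
                z[i] = (scores[i] - center) / scale
    return z


def two_tailed_p(z: float) -> float:
    """Two-tailed p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

Because the center and scale are estimated robustly, a handful of extreme scores in a bin (the candidate selection signals) barely shift the null distribution against which they are compared.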
EHH decay plots and haplotype furcations for the LCT locus were calculated using the rehh R package (https://cran.r-project.org/web/packages/rehh/index.html).47 We visualized the haplotype structure of the genomic region chr2:135701076-136009184 using haplostrips with plot option -S 3 (https://bitbucket.org/dmarnetto/haplostrips).48
Kinship and runs of homozygosity
The kinship matrix in the WGS cohort was calculated using the popkin software (https://github.com/StoreyLab/popkin).36–38 We restricted the analyses to biallelic SNPs with a minor allele frequency of at least 0.01 in at least one subpopulation (i.e. YRI, CHB, FIN, CEU, GBR, TSI, IBS, or FARO), resulting in a dataset of 15,206,409 variants. When calculating the kinship matrix for the Faroese WGS cohort only, we used the rescale_kinship() function, which will change the most recent common ancestor and give different absolute values, but the overall relationship structure in the subpopulation remains the same. Using the same data set, we calculated runs of homozygosity (ROHs) for each individual using bcftools/RoH.96 “Short” ROHs were classified as ROH less than or equal to 300 kb, “medium” as greater than 300kb and less than or equal to 1 Mb, and “long” as greater than 1 Mb.
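The ROH length classes defined above translate directly into a small helper. The sketch assumes lengths are given in base pairs, with 1 kb taken as 1,000 bp and 1 Mb as 1,000,000 bp; the function names are illustrative.

```python
from typing import Dict, Iterable


def classify_roh(length_bp: int) -> str:
    """Assign an ROH to the short, medium, or long class by its length."""
    if length_bp <= 300_000:
        return "short"
    if length_bp <= 1_000_000:
        return "medium"
    return "long"


def total_length_by_class(lengths_bp: Iterable[int]) -> Dict[str, int]:
    """Sum ROH lengths per class for one individual."""
    totals = {"short": 0, "medium": 0, "long": 0}
    for length in lengths_bp:
        totals[classify_roh(length)] += length
    return totals
```

Dividing the "long" total by the length of the analyzed genome gives the per-individual proportion of the genome in long ROH.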
Fine-scale structure estimation using ancient genomes
A panel of 616 imputed ancient genomes from Allentoft et al. 2024 (downloaded from https://doi.org/10.17894/ucph.d71a6a5a-8107-4fd9-9440-bdafdfe81455), representing individuals from several European regions (southern Europe, western Europe, northern Europe, eastern Europe, and central Europe), was used for analyses.63 Only samples estimated to be no more than 3,000 years old were used. Of these ancient samples, 11 were excavated in the Faroe Islands: 10 are historical samples dated to approximately 250 years old, and 1 is dated to approximately 800 years old.64 The sample location, approximate age, and sources for these samples are listed in Table S5. To consolidate the two panels, we first performed a liftover of the ancient genome VCF files to the GRCh38 reference genome. Following this, we applied quality filters to the dataset (bi-allelic sites with MAF > 0.05 and imputation INFO >= 0.5).
We used HaploNet, a neural network-based method for performing window-based haplotype clustering across the genome, for fine-scale population structure inference on the combined panel. HaploNet uses a hidden Markov model to find an optimal window-based local ancestry “painting” across a genome, given estimated haplotype cluster likelihoods, haplotype cluster frequencies, and global ancestry proportions.97 The Faroese panel’s haplotype frequencies are very homogeneous and highly differentiated from mainland Europeans. For this reason, under an initial round of unsupervised ancestry estimation, we found that the Faroese individuals captured a major component at the first split (K=2). We therefore implemented and utilized a semi-supervised ancestry estimation feature in HaploNet.20 We performed haplotype clustering in non-overlapping windows of 512 SNPs, and we used the resulting haplotype cluster likelihoods to perform principal component analysis (PCA) and estimate both global and local ancestry in the Faroese individuals.
We performed global ancestry estimation in HaploNet using its EM algorithm to find the maximum likelihood estimates using only ancient European individuals (excluding the Faroese individuals).20 We used the EM algorithm a second time to estimate the ancestry proportions (Q matrix) in the Faroese individuals. The estimated haplotype cluster frequencies (F matrix) were kept fixed, which means that the semi-supervised approach can be seen as modeling the Faroese individuals using inferred haplotype clusters from the ancient European individuals.
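The second, semi-supervised EM step can be illustrated with a deliberately simplified sketch: ancestry proportions for one haplotype are updated while the haplotype cluster frequencies stay fixed. This is not HaploNet's implementation; the data layout, the function name, and the single-haplotype mixture model are assumptions made for illustration.

```python
from typing import List, Sequence


def estimate_q_fixed_f(
    cluster_lik: Sequence[Sequence[float]],
    cluster_freq: Sequence[Sequence[Sequence[float]]],
    n_iter: int = 200,
    tol: float = 1e-9,
) -> List[float]:
    """EM for ancestry proportions q with cluster frequencies F held fixed.

    cluster_lik[w][c]     likelihood of the haplotype in window w under cluster c
    cluster_freq[w][k][c] frequency of cluster c in ancestry k at window w (fixed)
    Assumes every window has non-zero likelihood under at least one ancestry.
    """
    n_windows = len(cluster_lik)
    n_anc = len(cluster_freq[0])
    # Likelihood of each window under each ancestry, marginalized over clusters.
    anc_lik = [
        [sum(f * l for f, l in zip(cluster_freq[w][k], cluster_lik[w])) for k in range(n_anc)]
        for w in range(n_windows)
    ]
    q = [1.0 / n_anc] * n_anc
    for _ in range(n_iter):
        new_q = [0.0] * n_anc
        for w in range(n_windows):
            # E-step: posterior probability that window w derives from each ancestry.
            weights = [q[k] * anc_lik[w][k] for k in range(n_anc)]
            total = sum(weights)
            for k in range(n_anc):
                # M-step: q is the average posterior across windows.
                new_q[k] += weights[k] / total / n_windows
        converged = max(abs(x - y) for x, y in zip(new_q, q)) < tol
        q = new_q
        if converged:
            break
    return q
```

Because only q is updated, the target individuals are described entirely in terms of haplotype clusters learned from the reference panel, which is the sense in which the approach is semi-supervised.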
Data availability
Variant-level summary statistics and genome-wide selection scan results for iHS and XP-EHH are available upon request for research that is in line with informed consent and ethical approval. Genetic data and metadata from this study are stored at the Faroese Health Authority.
Access to individual-level data is available for research upon participants’ re-consent. Researchers will be granted access to de-identified genetic data and metadata, provided that the project protocol has been approved by the Faroese Scientific Ethical Committee and a template material/data transfer agreement has been signed with the Faroese Health Authority in compliance with GDPR (see Gregersen et al., 2021). Requests should be made to Noomi O. Gregersen (noomi@fargen.fo).
Supplementary Figures

Quality control of whole genome sequencing data.
A-D) Distributions of QC metrics across Faroese samples sequenced (mean depth, median insert size, mapping rate, duplicate rate). E) Mean WGS depth versus number of genotype calls with GQ > 20. Singletons and calls with GQ <= 20 were discarded before imputation. F) Principal component analysis (PCA) of Faroese genomes jointly called with relevant 1000 Genomes reference data captures African ancestry in the first component and East Asian ancestry in the second component (FARO, Faroese; CEU, Central Europeans; GBR, British; FIN, Finnish; IBS, Iberian; TSI, Tuscan; CHB, Han Chinese; YRI, Yoruban).

Kinship matrices between modern populations.
A) Kinship estimated by popkin, including global reference populations from the 1000 Genomes and the Faroese WGS cohort (FARO). B) Kinship estimated by popkin for the Faroese WGS cohort only. The matrix is rescaled after subsetting the individuals, so although the scales are different, the overall structure remains the same. The matrices are symmetrical and ordered by population or region label as indicated by the colored bars along the rows and columns. The diagonal of each matrix is the estimated inbreeding coefficient. The Faroes region labels are: VM = Vágar and Mykines; SR = Suðuroy; SM = Suðurstreymoy; SD = Sandoy, Skúvoy, Stóra Dímun; NG = Norðoyggjar; and EN = Eysturoy og Norðstreymoy.

Haplotype visualizations for top XP-EHH variant in SLC10A1 / SRSF5 locus.
A) Decay in Expected Haplotype Homozygosity per Site (EHHS) for chr14_69775276_C_T, comparing Faroese (FARO, teal) and British (GBR, red) haplotypes. B) Lengths for distinct haplotypes spanning chr14_69775276_C_T, comparing FARO (teal) and GBR (red). C) Haplotype furcation plot for FARO centered on chr14_69775276_C_T. D) Haplotype furcation plot for GBR centered on the same allele. In the haplotype furcation plots (panels C & D), haplotypes for the reference allele (C) are in blue, and those for the alternate allele (T) are in red.

Haplotype visualizations for top XP-EHH variant in POLQ locus.
A) Decay in Expected Haplotype Homozygosity per Site (EHHS) for chr3_121526194_G_A, comparing Faroese (FARO, teal) and British (GBR, red) haplotypes. B) Lengths for distinct haplotypes spanning chr3_121526194_G_A, comparing FARO (teal) and GBR (red). C) Haplotype furcation plot for FARO centered on chr3_121526194_G_A. D) Haplotype furcation plot for GBR centered on the same allele. In the haplotype furcation plots (panels C & D), haplotypes for the reference allele (G) are in blue, and those for the alternate allele (A) are in red.

Admixture plot showing proportions for 616 imputed ancient genomes from Europe together with 40 present-day Faroese genomes from this study.
Ancient individual groups are categorized based on patterns of IBD clustering as inferred in Allentoft et al. 2024. The plot uses five colors to represent different ancestral sources, which are maximized in individuals in different regions of Europe: orange for “West Europe”, green for “North Europe”, blue for “Steppe”, purple for the “Levant and East Mediterranean”, and red for “Eastern Europe”.

Map illustrating the geographical distribution of ancestry proportions for 616 ancient imputed individuals from Europe and 40 modern Faroese genomes from this study.
The map is divided into panels to capture both geographical and temporal variations across Europe. Each pie chart on the map represents admixture proportions with five colors: green, orange, blue, red, and purple maximize Northern European, Celtic, Steppe, Eastern European, and Levant / East Mediterranean ancestries, respectively, reflecting different ancestral sources.

Q-Q plots and p-value histograms for selection statistics.
Histograms of p-value distributions for A) integrated haplotype score (iHS) in the Faroese (FARO) haplotypes, C) iHS in the British (GBR) haplotypes, and E) cross-population expected haplotype homozygosity (XP-EHH) comparing FARO and GBR. Q-Q plots for observed versus expected log-transformed p-values for B) iHS in FARO, D) iHS in GBR, and F) XP-EHH between FARO and GBR. The estimated lambda inflation value is shown in each Q-Q plot.
Acknowledgements
We thank the participants of the FarGen project. FarGen is supported by the Government of the Faroe Islands. F.R. is supported by a Novo Nordisk Fonden Data Science Ascending Investigator Award (NNF22OC0076816) and by the European Research Council (ERC) under the European Union’s Horizon Europe programme (grant agreements 101077592 and 951385). We also thank Victor Lee for assistance while working with ancient genomic data.
Additional information
Author Contributions
S.E.C., K.A.W., F.R., and N.O.G. designed the study. N.O.G. and K.D.A. secured ethical permissions, facilitated the inclusion of participants, and oversaw the compliance with ethical standards and protocols. G.A. prepared the data for the genealogy analysis. Ó.M. performed relatedness pruning and sample selection for WGS. L.N.L. prepared samples for WGS and performed the WGS analyses. A.E. processed sequencing data, performed imputation, and carried out quality control. I.H. conducted kinship, ROH, and selection scan analyses. A.R.M. carried out ancient admixture and local ancestry analysis. J.M. added functionalities to the HaploNet software for semi-supervised admixture and fine-structure analysis. S.E.C., N.M., and F.R. supervised analyses. I.H., S.E.C., and A.R.M. interpreted results and wrote the manuscript. All authors reviewed and contributed to the writing of this manuscript.
Materials & Correspondence
Correspondence to N.O.G., S.E.C., and F.R.
References
- 1. The Faroe Islanders Saga. Canada: Oberon Books
- 2. The origin of the isolated population of the Faroe Islands investigated using Y chromosomal markers. Hum. Genet 115:19–28
- 3. From the Vikings to the Reformation: A Chronicle of the Faroe Islands Up to 1538. Nám
- 4. The Vikings were not the first colonizers of the Faroe Islands. Quat. Sci. Rev 77:228–232
- 5. Highly discrepant proportions of female and male Scandinavian and British Isles ancestry within the isolated population of the Faroe Islands. Eur. J. Hum. Genet 14:497–504
- 6. Faroe: The Emergence of a Nation. London: C. Hurst
- 7. Linkage disequilibrium and demographic history of the isolated population of the Faroe Islands. Eur. J. Hum. Genet 10:381–387
- 8. SNP heterozygosity, relatedness and inbreeding of whole genomes from the isolated population of the Faroe Islands. BMC Genomics 24
- 9. Both rare and common genetic variants contribute to autism in the Faroe Islands. Npj Genomic Med 4:1–10
- 10. Carnitine levels in 26,462 individuals from the nationwide screening program for primary carnitine deficiency in the Faroe Islands. J. Inherit. Metab. Dis 37:215–222
- 11. Diabetes trends in Europe. Diabetes Metab. Res. Rev 18:S3–8
- 12. Prevalence of type 2 diabetes and prediabetes in the Faroe Islands. Diabetes Res. Clin. Pract 140:162–173
- 13. East–West gradient in the incidence of inflammatory bowel disease in Europe: the ECCO-EpiCom inception cohort. Gut 63:588–597
- 14. Global prevalence of ankylosing spondylitis. Rheumatology 53:650–657
- 15. High incidence of cystic fibrosis on The Faroe Islands: a molecular and genealogical study. Hum. Genet 95:703–706
- 16. Multiple sclerosis: variation of incidence of onset over time in the Faroe Islands. Mult. Scler. J 17:241–244
- 17. The Faroese IBD Study: Incidence of Inflammatory Bowel Diseases Across 54 Years of Population-based Data. J. Crohns Colitis 10:934–942
- 18. Whole-exome sequencing implicates DGKH as a risk gene for panic disorder in the Faroese population. Am. J. Med. Genet. B Neuropsychiatr. Genet 171:1013–1022
- 19. FarGen: Bioresource From the Faroe Genome Project. Open Journal of Bioresources 8:1. https://doi.org/10.5334/ojb.71
- 20. Haplotype and population structure inference using neural networks in whole-genome sequencing data. Genome Res. gr 276813. https://doi.org/10.1101/gr.276813.122
- 21. FarGen – participants in the genetic research infrastructure of the Faroe Islands. Scand. J. Public Health 50:980–987
- 22. Mid-pass whole genome sequencing enables biomedical genetic studies of diverse populations. BMC Genomics 22
- 23. A global reference for human genetic variation. Nature 526:68–74
- 24. High-density mapping of the MHC identifies a shared role for HLA-DRB1*01:03 in inflammatory bowel diseases and heterozygous advantage in ulcerative colitis. Nat. Genet 47:172–179
- 25. Fifty years after the discovery of the association of HLA B27 with ankylosing spondylitis. RMD Open 9:e003102
- 26. HLA-B27 Syndromes. In: StatPearls. Treasure Island (FL): StatPearls Publishing
- 27. HLA-B27. Annu. Rev. Immunol 33:29–48
- 28. Genetics and the causes of ankylosing spondylitis. Rheum. Dis. Clin. North Am 43:401–414
- 29. HLA*LA—HLA typing from linearly projected graph alignments. Bioinformatics 35:4394–4396
- 30. Allele frequency net database (AFND) 2020 update: gold-standard data classification, open access genotype data and new query tools. Nucleic Acids Res 48:D783–D788
- 31. Common, intermediate and well-documented HLA alleles in world populations: CIWD version 3.0.0. Hla 95:516–531
- 32. IPD-IMGT/HLA Database. Nucleic Acids Res 48:D948–D955
- 33. The United States National Health and Nutrition Examination Survey and the epidemiology of ankylosing spondylitis. Am. J. Med. Sci 341:281–283
- 34. Major histocompatibility complex associations of ankylosing spondylitis are complex and involve further epistasis with ERAP1. Nat. Commun 6
- 35. HLA class I associations of ankylosing spondylitis in the white population in the United Kingdom. Ann. Rheum. Dis 55:268–270
- 36. Estimating FST and kinship for arbitrary population structures. PLOS Genet 17:e1009241
- 37. New kinship and FST estimates reveal higher levels of differentiation in the global human population. bioRxiv 653279. https://doi.org/10.1101/653279
- 38. FST and kinship for arbitrary population structures I: Generalized definitions. bioRxiv 083915. https://doi.org/10.1101/083915
- 39. Runs of homozygosity: windows into population history and trait architecture. Nat. Rev. Genet 19:220–234
- 40. A Map of Recent Positive Selection in the Human Genome. PLOS Biol 4:e72
- 41. hapbin: An Efficient Program for Performing Haplotype-Based Scans for Positive Selection in Large Genomic Datasets. Mol. Biol. Evol 32:3027–3029
- 42. Statistical significance for genomewide studies. Proc. Natl. Acad. Sci 100:9440–9445
- 43. Genome-wide detection and characterization of positive selection in human populations. Nature 449:913–918
- 44. Going global by adapting local: A review of recent human adaptation. Science 354:54–59
- 45. Localizing Recent Adaptive Evolution in the Human Genome. PLoS Genet 3:e90
- 46. Detecting recent positive selection in the human genome from haplotype structure. Nature 419:832–837
- 47. rehh: an R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics 28:1176–1177
- 48. Haplostrips: revealing population structure through haplotype visualization. Methods Ecol. Evol 8:1389–1392
- 49. Functional and Pharmacological Comparison of Human and Mouse Na+/Taurocholate Cotransporting Polypeptide (NTCP). SLAS Discov. Adv. Life Sci. R D 26:1055–1064
- 50. Sodium taurocholate cotransporting polypeptide (SLC10A1) deficiency: conjugated hypercholanemia without a clear clinical phenotype. Hepatol. Baltim. Md 61:260–267
- 51. Ethnicity-dependent polymorphism in Na+-taurocholate cotransporting polypeptide (SLC10A1) reveals a domain critical for bile acid substrate recognition. J. Biol. Chem 279:7213–7222
- 52. Molecular cloning, chromosomal localization, and functional characterization of a human liver Na+/bile acid cotransporter. J. Clin. Invest 93:1326–1331
- 53. Clinical and molecular study of a pediatric patient with sodium taurocholate cotransporting polypeptide deficiency. Exp. Ther. Med 12:3294–3300
- 54. Homozygous p.Ser267Phe in SLC10A1 is associated with a new type of hypercholanemia and implications for personalized medicine. Sci. Rep 7:9214
- 55. Homologous-recombination-deficient tumours are dependent on Polθ-mediated repair. Nature 518:258–262
- 56. A role for DNA polymerase θ in promoting replication through oxidative DNA lesion, thymine glycol, in human cells. J. Biol. Chem 289:13177–13185
- 57. Low-fidelity DNA synthesis by human DNA polymerase theta. Nucleic Acids Res 36:3847–3856
- 58. POLQ (Pol theta), a DNA polymerase and DNA-dependent ATPase in human cells. Nucleic Acids Res 31:6117–6126
- 59. DNA polymerase θ (POLQ), double-strand break repair, and cancer. DNA Repair 44:22–32
- 60. Melanoma-derived DNA polymerase theta variants exhibit altered DNA polymerase activity. bioRxiv 2023.11.14.566933. https://doi.org/10.1101/2023.11.14.566933
- 61. Error-Prone Replication through UV Lesions by DNA Polymerase θ Protects against Skin Cancers. Cell 176:1295–1309
- 62. Knockdown of POLQ interferes the development and progression of hepatocellular carcinoma through regulating cell proliferation, apoptosis and migration. Cancer Cell Int 21
- 63. Population genomics of post-glacial western Eurasia. Nature 625:301–311
- 64. Population genomics of the Viking world. Nature 585:390–396
- 65. Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes. Cell Genomics 2
- 66. Expression Atlas update: insights from sequencing data at both bulk and single cell level. Nucleic Acids Res 52:D107–D114
- 67. The Role of Neutrophils in Spondyloarthritis: A Journey across the Spectrum of Disease Manifestations. Int. J. Mol. Sci 24
- 68. Molecular genetic basis and prevalence of glycogen storage disease type IIIA in the Faroe Islands. Eur. J. Hum. Genet 9:388–391
- 69. Finnish Disease Heritage II: population prehistory and genetic roots of Finns. Hum. Genet 112:457–469
- 70. Fine-Scale Genetic Structure in Finland. G3 Genes Genomes Genetics 7:3459–3468
- 71. Molecular Genetics of the Finnish Disease Heritage. Hum. Mol. Genet 8:1913–1923
- 72. Genome-wide association analysis of metabolic traits in a birth cohort from a founder population. Nat. Genet 41:35–46
- 73. Long runs of homozygosity are enriched for deleterious variation. Am. J. Hum. Genet 93:90–102
- 74. Large-scale migration into Britain during the Middle to Late Bronze Age. Nature 601:588–594
- 75. Low Prevalence of Lactase Persistence in Bronze Age Europe Indicates Ongoing Strong Selection over the Last 3,000 Years. Curr. Biol 30:4307–4315
- 76. A Mathematical Theory of Natural and Artificial Selection, Part V: Selection and Mutation. Math. Proc. Camb. Philos. Soc 23:838–844
- 77. The Importance of Animal and Marine Fat in the Faroese Cuisine: The Past, Present, and Future of Local Food Knowledge in an Island Society. Front. Sustain. Food Syst 5
- 78. Selection in Europeans on Fatty Acid Desaturases Associated with Dietary Changes. Mol. Biol. Evol 34:1307–1318
- 79. Greenlandic Inuit show genetic signatures of diet and climate adaptation. Science 349:1343–1347
- 80. Genetics of metabolic traits in Greenlanders: lessons from an isolated population. J. Intern. Med 284:464–477
- 81. Environmental selection during the last ice age on the mother-to-infant transmission of vitamin D and fatty acids through breast milk. Proc. Natl. Acad. Sci 115:E4426–E4432
- 82. Tracing the peopling of the world through genomics. Nature 541:302–310
- 83. A common Greenlandic TBC1D4 variant confers muscle insulin resistance and type 2 diabetes. Nature 512:190–193
- 84. Vitamin D as a protective factor in multiple sclerosis. Neurology 79:2140–2145
- 85. Vitamin D and Multiple Sclerosis: A Comprehensive Review. Neurol. Ther 7:59–85
- 86. Genetic and environmental determinants of 25-hydroxyvitamin D levels in multiple sclerosis. Mult. Scler. J 21:1414–1422
- 87. Christianity, churches and medieval Kirkjubøur – contacts and influences in the Faroe Islands. In: Medieval Archaeology in Scandinavia and Beyond: History, Trends and Tomorrow. Aarhus University Press pp. 235–256
- 88. Faroese: An Overview and Reference Grammar. Føroya Fróðskaparfelag
- 89. The kinship2 R package for pedigree data. Hum. Hered 78:91–93
- 90. Approximating maximum independent sets by excluding subgraphs. BIT Numer. Math 32:180–196
- 91. Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects. Nat. Commun 9
- 92. PrecisionFDA Truth Challenge V2: Calling variants from short and long reads in difficult-to-map regions. Cell Genomics 2
- 93. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581:434–443
- 94. Detecting Signatures of Selection Through Haplotype Differentiation Among Hierarchically Structured Populations. Genetics 193:929–941
- 95. Modern Applied Statistics with S. New York: Springer
- 96. BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data. Bioinformatics 32:1749–1751
- 97. A Fast and Flexible Statistical Model for Large-Scale Population Genotype Data: Applications to Inferring Missing Genotypes and Haplotypic Phase. Am. J. Hum. Genet 78:629–644
Cite all versions
You can cite all versions using the DOI https://doi.org/10.7554/eLife.107428. This DOI represents all versions, and will always resolve to the latest one.
Copyright
© 2025, Hamid et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.