Native American genetic ancestry and pigmentation allele contributions to skin color in a Caribbean population

  1. Khai C Ang (corresponding author)
  2. Victor A Canfield
  3. Tiffany C Foster
  4. Thaddeus D Harbaugh
  5. Kathryn A Early
  6. Rachel L Harter
  7. Katherine P Reid
  8. Shou Ling Leong
  9. Yuka Kawasawa
  10. Dajiang Liu
  11. John W Hawley
  12. Keith C Cheng (corresponding author)
  1. Department of Pathology, Penn State College of Medicine, United States
  2. Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, United States
  3. Department of Family & Community Medicine, Penn State College of Medicine, United States
  4. Department of Biochemistry and Molecular Biology, Penn State College of Medicine, United States
  5. Department of Pharmacology, Penn State College of Medicine, United States
  6. Institute of Personalized Medicine, Penn State College of Medicine, United States
  7. Department of Public Health Sciences, Penn State College of Medicine, United States
  8. Salybia Mission Project, Dominica


Our interest in the genetic basis of skin color variation between populations led us to seek a Native American population with African genetic admixture but low frequency of European light skin alleles. Analysis of 458 genomes from individuals residing in the Kalinago Territory of the Commonwealth of Dominica showed approximately 55% Native American, 32% African, and 12% European genetic ancestry, the highest Native American genetic ancestry among Caribbean populations to date. Skin pigmentation ranged from 20 to 80 melanin units, averaging 46. Three albino individuals were determined to be homozygous for a causative multi-nucleotide polymorphism OCA2NW273KV contained within a haplotype of African origin; its allele frequency was 0.03 and single allele effect size was –8 melanin units. Derived allele frequencies of SLC24A5A111T and SLC45A2L374F were 0.14 and 0.06, with single allele effect sizes of –6 and –4, respectively. Native American genetic ancestry by itself reduced pigmentation by more than 20 melanin units (range 24–29). The responsible hypopigmenting genetic variants remain to be identified, since none of the published polymorphisms predicted in prior literature to affect skin color in Native Americans caused detectable hypopigmentation in the Kalinago.

Editor's evaluation

This pigmentation study focuses on a community from the Kalinago Territory in the Caribbean islands that on average possesses high percentages of Indigenous American ancestry, and broadens the effort of quantifying the genetic effects on skin pigmentation in humans. This paper describes an analysis of the genetic structure of the Kalinago population in the Commonwealth of Dominica, and the relationship between ancestry and skin pigmentation in that population. The authors provide valuable new insights into the skin-lightening effect of Native American alleles, which likely have been obscured by the effect of European alleles in previous studies of admixed Native American populations. Additionally, this paper provides an interesting analysis of previously reported albinism alleles, which paints a more complex picture of the genetic architecture of pigmentation.

eLife digest

The variation in skin colour of modern humans is a product of thousands of years of natural selection. All human ancestry can be traced back to African populations, which were dark-skinned to protect them from the intense UV rays of the sun.

Over time, humans spread to other parts of the world, and people in the northern latitudes with lower UV developed lighter skin through natural selection. This was likely driven by a need for vitamin D, which requires UV rays for production.

Separate genetic mechanisms were involved in the evolution of lighter skin in each of the two main branches of human migration: the European branch (which includes peoples on the Indian subcontinent and Europe) and the East Asian branch (which includes East Asia and the Americas).

A variant of the gene SLC24A5 is the primary contributor to lighter skin colour in the European branch, but a corresponding variant driving light skin colour evolution in the East Asian branch remains to be identified.

One obstacle to finding such variants is the high prevalence of European ancestry in most people groups, which makes it difficult to separate the influence of European genes from those of other populations. To overcome this issue, Ang et al. studied a population that had a high proportion of Native American and African ancestors, but a relatively small proportion of European ancestors, the Kalinago people. The Kalinago live on the island of Dominica, one of the last Caribbean islands to be colonised by Europeans.

Ang et al. were able to collect hundreds of skin pigmentation measurements and DNA samples of the Kalinago, to trace the effect of Native American ancestry on skin colour. Genetic analysis confirmed their oral history records of primarily Native American (55%) – one of the highest of any Caribbean population studied to date – compared with African (32%) and European (12%) ancestries.

Native American ancestry had the highest effect on pigmentation and reduced it by more than 20 melanin units, while the European mutations in the genes SLC24A5 and SLC45A2 and an African gene variant for albinism only contributed 5, 4 and 8 melanin units, respectively. However, none of the so far published gene candidates responsible for skin lightening in Native Americans caused a detectable effect. Therefore, the gene responsible for lighter skin in Native Americans/East Asians has yet to be identified.

The work of Ang et al. represents an important step in deciphering the genetic basis of lighter skin colour in Native Americans or East Asians. A better understanding of the genetics of skin pigmentation may help to identify why, for example, East Asians are less susceptible to melanoma than Europeans, despite both having a lighter skin colour. It may also further acceptance of how variations in human skin tones are the result of human migration, random genetic variation, and natural selection for pigmentation in different solar environments.


Human skin pigmentation is a polygenic trait that is influenced by health and environment (Barsh, 2003). Lighter skin is most common in populations adapted to northern latitudes characterized by lower UV incidence than equatorial latitudes (Jablonski and Chaplin, 2000). Selection for lighter skin, biochemically driven by a solar UV-dependent photoactivation step in the formation of vitamin D (Engelsen, 2010; Hanel and Carlberg, 2020; Holick, 1981; Loomis, 1967) is regarded as the most likely basis for a convergent evolution of lighter skin color in European and East Asian/Native American populations (Lamason et al., 2005; Norton et al., 2007). The hypopigmentation polymorphisms of greatest significance in Europeans have two key characteristics: large effect size and near fixation. For example, the A111T allele in SLC24A5 (Lamason et al., 2005) explains at least 25% of the difference in skin color between people of African vs. European genetic ancestry, and is nearly fixed in European populations. No equivalent polymorphism in Native Americans or East Asians has been found to date.

Native Americans share common genetic ancestry with East Asians (Derenko et al., 2010; Tamm et al., 2007), diverging before ~15 kya (Gravel et al., 2013; Moreno-Mayar et al., 2018; Reich et al., 2012), but the extent to which these populations share pigmentation variants remains to be determined. The derived alleles of rs2333857 and rs6917661 near OPRM1, and rs12668421 and rs11238349 in EGFR are near fixation in some Native American populations, but all also have a high frequency in Europeans (Quillen et al., 2012), and none reach genome-wide significance in Adhikari et al., 2019. However, the latter study found a significant association of the Y182H variant of MFSD12 with skin color, but its frequencies were only 0.27 and 0.17 in Native Americans and East Asians, respectively, suggesting that it can explain only a small portion of the difference between Native American and/or East Asian skin color and African skin color. Thus, the genetic basis for lighter skin pigmentation specific to Native American and East Asian populations, whose African alleles would be expected to be ancestral, remains to be found.

The shared genetic ancestry of East Asians and Native Americans suggests the likelihood that some light skin color alleles are shared between these populations. This is particularly the case for any variants that achieved fixation in their common ancestors. For Native American populations migrating from Beringia to the Tropics, selection for darker skin color also appears likely (Jablonski and Chaplin, 2000; Quillen et al., 2019). This would have increased the frequency of novel dark skin variants, if any, and would have decreased the frequency of light skin variants that had not achieved fixation. Hypopigmenting alleles are associated with the European admixture characteristic of many current Native American populations (Brown et al., 2017; Gravel et al., 2013; Keith et al., 2021; Klimentidis et al., 2009; Reich et al., 2012). Since the European hypopigmenting alleles may mask the effects of East Asian and Native American alleles, we searched for an admixed Native American population with high African, but low European admixture.

Prior to European contact, the Caribbean islands were inhabited by populations who migrated from the northern coast of South America (Benn-Torres et al., 2008; Harvey et al., 1969; Honychurch, 2012; Island Caribs, 2016; Benn Torres et al., 2015). During the Colonial period, large numbers of Africans were introduced into the Caribbean as slave labor (Honychurch, 2012; Benn Torres et al., 2013). As a consequence of African and European admixture and high mortality among the indigenous populations, Native American genetic ancestry now contributes only a minor portion (<15%) of the genetic ancestry of most Caribbean islanders (Auton et al., 2015; Benn Torres et al., 2015). The islands of Dominica and St. Vincent were the last colonized by Europeans in the late 1700s (Honychurch, 2012; Honychurch, 1998; Rogoziński, 2000). In 1903, the British granted 15 km² (3700 acres) on the eastern coast of Dominica as a reservation for the Kalinago, who were then called ‘Carib.’ When Dominica gained independence in 1978, legal rights and a degree of protection from assimilation were gained by the inhabitants of the Carib Reserve (Honychurch, 2012) (redesignated Kalinago Territory in 2015). Oral history and beliefs among the Kalinago, numbering about 3000 living within the Territory (2021; Figure 1—figure supplement 2), are consistent with the primarily Native American and African genetic ancestry, assessed and confirmed genetically here.

Early in our genetic and phenotypic survey of the Kalinago, we noted an albino individual, and upon further investigation, we learned of two others residing in the Territory. We set out to identify the mutant albinism allele to avoid single albino allele effects that would potentially mask Native American hypopigmentation alleles. Oculocutaneous albinism (OCA) is a recessive trait characterized by visual system abnormalities and hypopigmentation of skin, hair, and eyes (Gargiulo et al., 2011; Grønskov et al., 2007; Grønskov et al., 2014; Hong et al., 2006; Vogel et al., 2008) that is caused by mutations in any of a number of autosomal pigmentation genes (Carrasco et al., 2009; Edwards et al., 2010; Gao et al., 2017; Grønskov et al., 2013; Kausar et al., 2013; King et al., 2003; Spritz et al., 1995; Stevens et al., 1997; Stevens et al., 1995; Vogel et al., 2008; Woolf, 2005; Yi et al., 2003). The incidence of albinism is ~1:20,000 in populations of European descent, but much higher in some populations, including many in sub-Saharan Africa (1:5000) (Greaves, 2014). Here, we report on the genetic ancestry of a population sample representing 15% of the Kalinago population of Dominica, the identification of the new albinism allele in that population, and measurement of the hypopigmenting effects of the responsible albinism allele and of the European SLC24A5A111T and SLC45A2L374F alleles. Native American genetic ancestry alone caused a measurable effect on pigmentation. In contrast, alleles identified in past studies of Native American skin color caused no significant effect on skin color.

Results and discussion

Our search for a population admixed for Native American/African ancestries with minimal European admixture led us to the ‘Carib’ population in the Commonwealth of Dominica. Observations from an initial trip to Dominica suggested wide variation in Kalinago skin color. Pursuit of the genetic studies described here required learning about oral and written histories, detailed discussion with community leadership, IRB approval from Ross University (until Hurricane Maria in 2017, the largest medical school in Dominica) and the Department of Health of the Commonwealth of Dominica, and relationship-building with three administrations of the Kalinago Council over 15 years.

Population sample

Our DNA and skin color sampling program encompassed 458 individuals, representing 15% of the population of the Territory and including all three known albino individuals. Ages ranged from 6 to 93 (Appendix 1—table 1 and Figure 1—figure supplement 3). We were able to obtain genealogical information for about half of the parents (243 mothers and 194 fathers). Community-defined ancestry (described as ‘Black,’ ‘Kalinago,’ or ‘Mixed’) for both parents was obtained for 426 individuals (92% of sample), including 108 parents from whom DNA samples were obtained (72 Kalinago, 36 Mixed, and 0 Black). Participants described themselves as Black, Kalinago, or Mixed based on their perceived understanding of their parents’ or grandparents’ skin color.

Kalinago genetic ancestry

The earliest western mention of the Kalinago (originally as ‘Caribs’) was in Christopher Columbus’s journal dated November 26, 1492 (Honychurch, 2012). Little is known about the detailed cultural and genetic similarities and differences between them and other Caribbean pre-contact groups such as the Taino. African admixture in the present Kalinago population derived from the African slave trade; despite inquiry across community, governmental, and historical sources, we were unable to find documentation of specific regions of origin in Africa or well-defined contributions from other groups. The population’s linguistics are uninformative, as they speak, in addition to English, the same French-based Antillean Creole spoken on the neighboring islands of Guadeloupe and Martinique.

To study Kalinago population structure, we analyzed an aggregate of our Kalinago SNP genotype data and HGDP data (Li et al., 2008) using ADMIXTURE (Figure 1 and Figure 1—figure supplement 1) as described in Materials and methods. At K=3, the ADMIXTURE result confirmed the three major clusters, corresponding roughly to Africans (black cluster), European/Middle Easterners/Central and South Asians (yellow cluster), and East Asians/Native Americans (green cluster). At K=4 and higher, the red component that predominates in Native Americans separates the Kalinago from the East Asians (green cluster). Consistent with prior work (Li et al., 2008), a purple cluster (Oceanians) appears at K=5 and a brown cluster (Central and South Asians) appears at K=6; both are minor sources of genetic ancestry in our Kalinago sample (average <1%) (Appendix 1—table 2).

Figure 1 (with 3 supplements)
Admixture analysis of Kalinago compared with Human Genome Diversity Project populations.

Results are depicted using stacked bar plots, with one column per individual. At K=3, the Kalinago, Native Americans, Oceanians, and East Asians fall into the same green cluster. At K=4, the Native Americans (red cluster) are separated from the East Asians (green cluster). Figure 1—figure supplement 1 shows the expanded admixture plot for K=6 with each population labeled. Figure 1—figure supplement 2 shows the location of the Kalinago Territory, where fieldwork was performed.

Figure 1—source data 1

The source data contains results from Admixture analysis.

At K=4 to K=6, the Kalinago show on average 55% Native American, 32% African, and 11–12% European genetic ancestry. Estimates from a two-stage admixture analysis are similar, as are results from local genetic ancestry analysis (see Materials and methods) (Appendix 1—table 3), leading to estimates of 54–56% Native American, 31–33% African, and 11–13% European genetic ancestry. The individual with the least admixture has approximately 94% Native American and 6% African genetic ancestry. The results of the principal component (PC) analysis (PCA) (Figure 2—figure supplement 1) were consistent with ADMIXTURE analysis. The first two PCs suggest that most Kalinago individuals show admixture between Native American and African genetic ancestry, with a smaller but highly variable European contribution apparent in the displacement in PC2 (Figure 2—figure supplement 1). A smaller number of Kalinago individuals with substantial East Asian genetic ancestry exhibit displacement in PC3 (Figure 2—figure supplement 1).

Our analysis of Kalinago genetic ancestry revealed considerably more Native American and less European genetic ancestry than the Caribbean samples of Benn Torres et al., 2013, and the admixed populations from the 1000 Genomes Project (1KGP) (Auton et al., 2015; Figure 2). Some Western Hemisphere Native Americans reported in Reich et al., 2012, have varying proportions of European but very little African admixture (Figure 2B). Overall, the Kalinago have more Native American and less European genetic ancestry than any other Caribbean population.

Figure 2 (with 2 supplements)
Comparison of Kalinago genetic ancestry with that of other populations in the Western Hemisphere.

Ternary plots of genetic ancestry from our work and the literature show estimated proportions of African (AFR), European (EUR), and Native American (NAM) genetic ancestry. (A) Comparison of individuals (n=452, omitting 6 individuals with EAS >0.1) genotyped in this study to individuals (n=38) from southern Dominica sampled by Benn Torres et al., 2013. (B) Comparison of the Kalinago average genetic ancestry with other Native American populations. Kalinago, this study (n=458); Islands (BT) indicates Caribbean islanders reported in Benn Torres et al., 2013, with Dominica labeled; admixed (adm) AFR (1000 Genomes Project [1KGP]) and admixed NAM (1KGP) represent admixed populations from Auton et al., 2015, with Caribbean samples PUR (Puerto Rico) and ACB (Barbados) labeled; and AMR (Reich) indicates mainland Native American samples reported in Reich et al., 2012. Inset (top left) shows ancestries at vertices.

Figure 2—source data 1

The source data contains results from PCA for the Kalinago versus other Native American populations in the Western Hemisphere.

The 55% Native American genetic ancestry calculated from autosomal genotype in the Kalinago is greater than the reported 13% in Puerto Rico (Gravel et al., 2013), 10–15% for Tainos across the Caribbean (Schroeder et al., 2018), and 8% for Cubans (Marcheco-Teruel et al., 2014). This is also considerably higher than the reported 6% Native American genetic ancestry found in Bwa Mawego, a horticultural population that resides south of the Kalinago Territory (Keith et al., 2021). However, this result is lower than the 67% Native American genetic ancestry reported by Crawford et al., 2021, for independently collected Kalinago samples based on mtDNA haplotype analysis. This difference suggests a paternal bias in combined European and/or African admixture. Since our Illumina SNP-chip genotyping does not yield reliable identification of mtDNA haplotypes, we are currently unable to compare maternal to autosomal genetic ancestry proportions for our sample. Samples genotyped using 105 genetic ancestry informative markers from Jamaica and the Lesser Antilles (Benn Torres et al., 2015) yielded an average of 7.7% Native American genetic ancestry (range 5.6%–16.2%), with the highest value from a population in Dominica sampled outside the Kalinago reservation. Relevant to the potential mapping of Native American light skin color alleles, the Kalinago population has among the lowest European genetic ancestry (12%) compared to other reported Caribbean Native Americans in St. Kitts (8.2%), Barbados (11.5%), and Puerto Rico (71%) (Benn Torres et al., 2013). Contributing to the high percentage of Native American genetic ancestry in the Kalinago are their segregation within the 3700 acre Kalinago Territory in Dominica granted by the British in 1903 and the Kalinago tradition that women marrying non-Kalinago are required to leave the Territory; non-Kalinago spouses of Kalinago men are allowed to move to the Territory (KCA, KCC, Personal Communication with Kalinago Council, 2014). These factors help to explain why samples collected outside the Kalinago Territory (Benn Torres et al., 2013) show lower fractional Native American genetic ancestry.

During our fieldwork, it was noted that members of the Kalinago community characterized themselves and others in terms of perceived genealogical ancestry as ‘Black,’ ‘Kalinago,’ or ‘Mixed.’ Compared to individuals self-identified as ‘Mixed,’ those self-identified as ‘Kalinago’ have on average more Native American genetic ancestry (67% vs 51%), less European genetic ancestry (10% vs 14%), and less African genetic ancestry (23% vs 34%) (Figure 2—figure supplement 2). Thus, these folk categories based on phenotype are reflected in some underlying differences in genetic ancestry.

Kalinago skin color variation

Melanin index unit (MI) calculated from skin reflectance measured at the inner upper arm (see Materials and methods) was used as a quantitative measure of melanin pigmentation (Ang et al., 2012; Diffey et al., 1984). MI determined in this way is commonly used as a measure of constitutive skin pigmentation (Choe et al., 2006; Park and Lee, 2005). The MI in the Kalinago ranged from 20.7 to 79.7 (Figure 4—figure supplement 1), averaging 45.7. The three Kalinago albino individuals sampled had the lowest values (20.7, 22.4, and 23.8). Excluding these, the MI ranged between 28.7 and 79.7 and averaged 45.9. For comparison, the MI averaged 25 and 21 for people of East Asian and European genetic ancestry, respectively, as measured with the same equipment in our laboratory (Ang et al., 2012; Tsetskhladze et al., 2012). This range is similar to that of another indigenous population, the Senoi of Peninsular Malaysia (MI 24–78; mean = 45.7) (Ang et al., 2012). The Senoi are believed to include admixture from Malaysian Negritos whose pigmentation is darker (mean = 55) (Ang et al., 2012) than that of the average Kalinago. In comparison, the average MI was 53.4 for Africans in Cape Verde (Beleza et al., 2012) and 59 for African-Americans (Shriver et al., 2003). Individuals self-described as ‘Kalinago’ were slightly lighter and had a narrower MI distribution (42.5± 5.6, mean ± SD) compared to ‘Mixed’ (45.8± 9.6) (Figure 4—figure supplement 2).

An OCA2 albinism allele in the Kalinago

OCA is a genetically determined condition characterized by nystagmus, reduced visual acuity, foveal hypoplasia, and strabismus as well as hypopigmentation of the skin, hair, and eye (Dessinioti et al., 2009; van Geel et al., 2013). The three sampled albino individuals had pale skin (MI 20.7, 22.4, and 23.8 vs. 29–80 for non-albino individuals), showed nystagmus, and reported photophobia and high susceptibility to sunburn. In contrast to the brown irides and black hair of most Kalinago, including their parents, the albino individuals had blonde hair and gray irides with varying amounts of green and blue.

To identify the albinism variant in the Kalinago, we first determined that none of the albino individuals carried any of 28 mutations previously found in African or Native American albino individuals (Carrasco et al., 2009; King et al., 2003; Stevens et al., 1997; Yi et al., 2003), including a 2.7 kb exon 7 deletion in OCA2 found at high frequency in some African populations. We then used whole exome sequencing of one albino individual and one parent (obligate carrier) to search for polymorphisms homozygous in the albino individual and heterozygous in the parent, an initial approach that assumes that the albino individual was not a compound heterozygote. We identified 12 variant alleles in 7 OCA genes (or genomic regions) that met these criteria (summarized in Appendix 1—table 4). None were nonsense or splice site variants. Five of the twelve variants were intronic, one was synonymous, one was located in the 5’UTR, and three were in the 3’UTR (Appendix 1—table 4). Two missense variants were found in OCA2: SNP rs1800401 (c.913C>T or p.Arg305Trp in exon 9), R305W, and multi-nucleotide polymorphism rs797044784 in exon 8 (c.819_822delCTGGinsGGTC; p.Asn273_Trp274delinsLysVal), NW273KV.

Among 458 Kalinago OCA2 genotypes, 26 carried NW273KV and 60 carried R305W (Table 1). Only NW273KV homozygotes were albino. We know that the allele responsible for albinism was NW273KV because neither of the two individuals who were homozygous for R305W but not for NW273KV was albino. In further support of this conclusion, one individual who was homozygous for R305W and homozygous ancestral for NW273 had an MI of 72, among the darkest in the entire population. R305W is notably present with frequency >0.10 in some African, South Asian, and European populations (Auton et al., 2015), predicting a Hardy-Weinberg frequency of homozygotes above 1%. This is far greater than the observed frequency of individuals with albinism and is therefore inconsistent with the idea that R305W is a variant responsible for albinism. The fact that R305W scores incorrectly as pathogenic using SIFT, Polyphen 2.0, and PANTHER (Kamaraj and Purohit, 2014) suggests a need for refinement of these methods. The universal association of R305W with the NW273KV haplotype indicates that the founder haplotype of the NW273KV albinism mutation carried the silent R305W variant.

Table 1
Albinism among NW273KV and R305W genotypes.

R305W genotype          NW273KV genotype
                        Homozygous ancestral*   Heterozygous   Homozygous derived   Total
Homozygous ancestral    398                     0              0                    398
Homozygous derived      1                       1              3†                   5

* Ancestral = reference allele and derived = alternate allele for both variants.
† Albino phenotype. Notably, none of the other genotypic categories are albino individuals.
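The carrier counts given in the text can be converted into allele frequencies with a short calculation. The sketch below assumes, as the text and Table 1 indicate, that the 3 albino individuals are the only NW273KV homozygotes and that 5 individuals are homozygous for R305W; the function name is illustrative and is not part of the authors' analysis.

```python
def allele_frequency(n_carriers, n_homozygotes, n_individuals):
    """Derived-allele frequency in a diploid sample.

    Carriers include both heterozygotes and homozygotes, so heterozygotes
    contribute one derived allele each and homozygotes contribute two.
    """
    n_heterozygotes = n_carriers - n_homozygotes
    derived_alleles = n_heterozygotes + 2 * n_homozygotes
    return derived_alleles / (2 * n_individuals)


# Counts from the text: 458 genotyped individuals,
# 26 NW273KV carriers (3 homozygous), 60 R305W carriers (5 homozygous).
nw273kv = allele_frequency(26, 3, 458)  # 29 derived alleles out of 916
r305w = allele_frequency(60, 5, 458)    # 65 derived alleles out of 916

print(f"NW273KV: {nw273kv:.3f}, R305W: {r305w:.3f}")
```

Under these assumptions the NW273KV value rounds to the reported allele frequency of 0.03.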

To identify the origin of the albino allele, albino individuals and carriers were analyzed for regions exhibiting homozygosity and identity-by-descent, and local genetic ancestry was estimated (see Materials and methods). All three albino individuals share a homozygous segment of ~1.7 Mb that encompasses several genes in addition to OCA2 (Figure 3). The albino haplotype defined by homozygosity in individuals 2 and 3 extends ~11 Mb; comparison to local genetic ancestry shows that this haplotype is clearly of African origin.

Figure 3
Haplotype analysis for three albino individuals.

The inner two lines indicate NAM (red) or AFR (dark blue) genetic ancestry; no EUR genetic ancestry was found in this genomic region. For this local genetic ancestry analysis, the region shown here consisted of 110 non-overlapping segments with 7–346 SNPs each (mean 65). The deduced extent of shared albino haplotype (dotted light blue lines) is indicated on each chromosome. The common region of overlap indicated by the minimum homozygous region (determined by albino individual 1) shared by all three albino individuals is shown at expanded scale below. Genes in this region are labeled, and the position of the NW273KV polymorphism in OCA2 is indicated by the red arrowhead.

The Kalinago albino individuals are the only reported individuals in whom albinism was caused by homozygosity for the NW273KV allele of OCA2. Two reported albino individuals of African-American/Dutch descent were compound heterozygotes for OCA2 mutations, with one allele being the NW273KV variant chromosome (Garrison et al., 2004; Lee et al., 1994). Conservation of the NW sequence among vertebrates and its inclusion in a potential N-linked glycosylation site (Rinchik et al., 1993) that is eliminated by the mutation support the variant’s pathogenicity. The NW273KV frequency in our sample (0.03) translates into a Hardy-Weinberg albinism frequency (p² = 0.0009) of ~1 per 1000, as observed (3 in a population of about 3000). Examination of publicly available data reveals three OCA2NW273KV heterozygotes in the 1000 Genomes Project: a pair of siblings from Barbados (ACB) and one individual from Sierra Leone (MSL). The three 1KGP individuals share a haplotype of ~1.5 Mb, of which ~1.0 Mb matches the albino haplotype in the Kalinago. The phasing for the OCA2NW273KV variant in the public data is inconsistent, with the variant assigned to the wrong chromosome for the ACB siblings.
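The Hardy-Weinberg arithmetic in this paragraph can be checked with a minimal sketch. It assumes random mating and uses the approximate population size of 3000 quoted in the text; the function name is hypothetical.

```python
def hardy_weinberg_homozygotes(allele_freq, population_size):
    """Expected homozygote frequency (p squared) and expected homozygote count."""
    homozygote_freq = allele_freq ** 2
    return homozygote_freq, homozygote_freq * population_size


# NW273KV: allele frequency 0.03 in a population of about 3000.
freq, expected_count = hardy_weinberg_homozygotes(0.03, 3000)
print(f"homozygote frequency = {freq:.4f}, expected albino individuals = {expected_count:.1f}")

# R305W at an allele frequency of 0.10 would predict about 1% homozygotes.
r305w_freq, _ = hardy_weinberg_homozygotes(0.10, 1)
print(f"R305W homozygote frequency at p = 0.10: {r305w_freq:.2%}")
```

The expected count of about 2.7 is close to the 3 albino individuals observed, whereas a causal variant at a frequency of 0.10 would predict roughly ten times more affected individuals than a frequency of ~1 per 1000.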

Genetic contributions to Kalinago skin color variation

One motivation for undertaking this work was to characterize genetic contributions to skin pigmentation in a population with primarily Native American and African genetic ancestry, so that we could focus on the effect of Native American hypopigmenting alleles without interference from European alleles. The Kalinago population described here comprises the only population we are aware of that fits this genetic ancestry profile. To control for the effects of the major European pigmentation loci, all Kalinago samples were genotyped for SLC24A5A111T and SLC45A2L374F. The phenotypic effects of these variants and OCA2NW273KV are shown in Figure 4. Each variant decreases melanin pigmentation, with homozygotes being lighter than heterozygotes. The greatest effect is seen in the OCA2NW273KV homozygotes (the albino individuals), as previously noted. The frequencies of the derived alleles of SLC24A5A111T and SLC45A2L374F in the Kalinago sample are 0.14 and 0.06, respectively.

Figure 4 (with 2 supplements)
Skin color distribution of Kalinago samples according to genotype.

The ‘triple ancestral’ plot shows individuals ancestral for three pigmentation loci (SLC24A5111A, SLC45A2374L, and OCA2273NW). In the other plots, heterozygosity or homozygosity is indicated for the variants: OCA2NW273KV; SLC24A5A111T; and SLC45A2L374F. Individuals depicted in the second through fourth panels are repeated if they carry variants at more than one locus. The MI of the Kalinago ranged from 20.7 to 79.7 (Figure 4—figure supplement 1), and the histogram of skin color based on community-defined ancestry is shown in Figure 4—figure supplement 2.

Figure 4—source data 1

The source file contains the melanin index distribution as a function of community-described ancestry.
Figure 4—source data 2

The source data contains data of melanin indices according to genotype.

The markedly higher frequency of SLC24A5A111T compared to SLC45A2L374F is not explained solely by European admixture, given that most Europeans are nearly fixed for both alleles (Soejima and Koda, 2007). This deviation can be explained by the involvement of source populations that carry the SLC24A5A111T variant but not SLC45A2L374F. Although some sub-Saharan West African populations (the likeliest source of AFR genetic ancestry in the Kalinago) have negligible SLC24A5A111T frequencies, moderate frequencies are found in the Mende of Sierra Leone (MSL, allele frequency = 0.08) (Micheletti et al., 2020; Auton et al., 2015), while other West African populations such as the Hausa and Mandinka have allele frequencies of 0.11 and 0.15, respectively (Cheung et al., 2000; Rajeevan et al., 2012). Such African individuals carrying the SLC24A5A111T allele could potentially cause the observed frequencies by founder effect. In addition, the region of chromosome 5 containing SLC45A2 exhibits low European genetic ancestry (6.5%) that is consistent with the low observed SLC45A2L374F frequency.
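The reasoning here can be illustrated with a simple two-source expectation. This is a minimal sketch, assuming for illustration that the derived allele is fixed (frequency 1.0) in the European source and absent (0.0) from all other sources; these source frequencies and the function name are assumptions, not values or methods from the study.

```python
def expected_frequency(eur_fraction, eur_freq=1.0, non_eur_freq=0.0):
    """Expected derived-allele frequency in an admixed population.

    Weighted average of the allele frequency in the European source and in
    the combined non-European sources (illustrative defaults: 1.0 and 0.0).
    """
    return eur_fraction * eur_freq + (1.0 - eur_fraction) * non_eur_freq


genome_wide = expected_frequency(0.12)   # genome-wide European ancestry of ~12%
chr5_local = expected_frequency(0.065)   # local European ancestry around SLC45A2

print(f"Expected from genome-wide ancestry: {genome_wide:.3f}")
print(f"Expected at SLC45A2 from local ancestry: {chr5_local:.3f} (observed 0.06)")
print(f"Observed SLC24A5 A111T: 0.14, excess over expectation: {0.14 - genome_wide:.2f}")
```

Under these assumptions, local European ancestry of 6.5% alone accounts for the observed SLC45A2L374F frequency, while the observed SLC24A5A111T frequency slightly exceeds the genome-wide European expectation, which is consistent with a contribution from non-European sources carrying that allele.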

To investigate the potential effect of the SLC24A5A111T allele on the albinism phenotype, we also compared other pigmentation phenotypes, such as hair and eye color, for all albino individuals and carriers. One of the three Kalinago albino individuals was also heterozygous for SLC24A5A111T, but neither skin nor hair color for this individual was lighter than that of the other two albino individuals, who were homozygous for the ancestral allele at SLC24A5A111; this observation is consistent with epistasis of OCA2 hypopigmentation over that of SLC24A5A111T. Nine sampled non-albino individuals had combinations of hair that was reddish, yellowish, or blonde (n=6), skin with MI <30 (n=3), and gray, blue, green, or hazel irides (n=2); among these, six were heterozygous and one homozygous for SLC24A5A111T, and three were heterozygous for the albino variant. A precise understanding of the phenotypic effects of the combinations of these and other hypopigmenting alleles will require further study.

The strong dependence of pigmentation on Native American genetic ancestry is clarified by focusing on individuals lacking the hypopigmenting alleles SLC24A5A111T, SLC45A2L374F, and OCA2NW273KV (Figure 5). Although positive deviations from the best fit are apparent at both high and low Native American genetic ancestry, the trend toward lighter pigmentation as Native American genetic ancestry increases is clear. The net difference between African and Native American contributions to pigmentation appears likely to be bounded by the magnitudes of the slope vs NAM genetic ancestry (24 units) and the slope vs AFR genetic ancestry (29 units, not shown). The difference in melanin index value is expected to be explained by genetic variants that are highly differentiated between African and Native American populations.

Figure 5 (with 2 supplements)
Dependence of melanin unit on genetic ancestry for Kalinago.

Only individuals who are ancestral for SLC24A5111A, SLC45A2374L, and OCA2273NW alleles are shown (n=279). The dotted red line represents the best fit (linear regression). Slope is –24.3 (melanin index unit [MI] = –24.3*NAM+61.9); r2=0.2722.
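As a quick reading aid for the fitted line in this caption, the sketch below evaluates MI = –24.3*NAM + 61.9 at a few Native American ancestry fractions. The function name `predicted_mi` is ours and not from the study, and evaluating the line at NAM = 0 or NAM = 1 is an extrapolation for illustration only; with r2 of about 0.27, individual values scatter widely around the line.

```python
# Fitted line reported in the Figure 5 caption: MI = -24.3 * NAM + 61.9
SLOPE = -24.3      # melanin index units per unit of NAM ancestry fraction
INTERCEPT = 61.9   # predicted melanin index at NAM fraction 0


def predicted_mi(nam_fraction: float) -> float:
    """Return the melanin index predicted by the reported best-fit line.

    nam_fraction is the Native American ancestry proportion (0 to 1).
    """
    if not 0.0 <= nam_fraction <= 1.0:
        raise ValueError("ancestry fraction must lie between 0 and 1")
    return SLOPE * nam_fraction + INTERCEPT


if __name__ == "__main__":
    # 0.55 is the approximate sample-wide NAM proportion from the abstract.
    for nam in (0.0, 0.55, 1.0):
        print(f"NAM = {nam:.2f} -> predicted MI = {predicted_mi(nam):.1f}")
```

The difference between the two endpoints equals the magnitude of the slope, which is the 24-unit figure quoted in the text for the dependence on NAM genetic ancestry.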

To further investigate the contributions of genetic variation to skin color, we performed association analyses using an additive model for melanin index, conditioning on sex, genetic ancestry (using 10 PCs), and genotypes for SLC24A5A111T, SLC45A2L374F, and OCA2NW273KV. Assuming likely epistasis of albinism alleles over other hypopigmenting alleles, these analyses omitted the three albino individuals. Employing a linear regression model, we found that sex and all three genotyped polymorphisms were statistically significant (Table 2 and Figure 2—figure supplement 2). However, only SLC24A5A111T reaches genome-wide significance. PC1, which strongly correlated with Native American vs African genetic ancestry, exhibits the lowest p-value. Effect sizes were about –6 units (per allele) for SLC24A5A111T, –4 units for SLC45A2L374F, and –8 units for the first OCA2NW273KV allele.

Table 2
Effect sizes for covariates in linear regression model with 10 principal components.
| Covariate | Effect size (MI) [a] | p-Value |
|---|---|---|
| rs1426654 (SLC24A5A111T) | –5.8 | 1.5E-12 |
| rs16891982 (SLC45A2L374F) | –4.4 | 6.7E-05 |
| Albino allele (OCA2NW273KV) | –7.7 | 2.2E-05 |
| Sex (female vs male) | –2.4 | 5.0E-04 |

[a] Per allele effect size, in melanin units, for A111T and L374F; effect of first allele for albino variant.
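To make the additive model concrete, the following sketch combines the Table 2 effect sizes into an expected shift in melanin index relative to a male with ancestral genotypes at all three loci and the same ancestry covariates. The function `expected_mi_shift` and its parameter names are hypothetical, introduced here for illustration; the albino-variant term is applied only to carriers of a single allele, as in the table footnote, and interactions between loci are ignored.

```python
# Effect sizes in melanin index (MI) units, copied from Table 2.
EFFECT_A111T_PER_ALLELE = -5.8    # rs1426654 (SLC24A5 A111T)
EFFECT_L374F_PER_ALLELE = -4.4    # rs16891982 (SLC45A2 L374F)
EFFECT_OCA2_FIRST_ALLELE = -7.7   # albino variant, first allele only
EFFECT_FEMALE_VS_MALE = -2.4      # sex covariate


def expected_mi_shift(a111t_copies, l374f_copies,
                      oca2_carrier=False, female=False):
    """Sum the additive terms for one individual (illustrative only)."""
    for copies in (a111t_copies, l374f_copies):
        if copies not in (0, 1, 2):
            raise ValueError("allele count must be 0, 1, or 2")
    shift = (a111t_copies * EFFECT_A111T_PER_ALLELE
             + l374f_copies * EFFECT_L374F_PER_ALLELE)
    if oca2_carrier:   # heterozygous carrier; homozygotes were excluded
        shift += EFFECT_OCA2_FIRST_ALLELE
    if female:
        shift += EFFECT_FEMALE_VS_MALE
    return shift


if __name__ == "__main__":
    # A female heterozygous for A111T and ancestral elsewhere.
    print(round(expected_mi_shift(1, 0, female=True), 1))
```

Under these assumptions, a female heterozygous for SLC24A5A111T would be expected to measure about 8 melanin units lighter than the reference individual.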

Additional covariates were considered but not included in our standard model. Skin pigmentation exhibited a decreasing trend with age, but its contribution was not statistically significant (adjusted p-value = 0.08). Estimated effect sizes for significant covariates were little affected by the inclusion of age as a covariate (Appendix 1—table 5). Analysis of SNPs that were previously reported as relevant to pigmentation is shown in Appendix 2—table 1. The lowest (adjusted) p-value for this collection of variants is about 0.001, considerably larger than the p-values for the variants included as covariates in our standard model. Inclusion of the SNP of lowest p-value from each of the five regions containing BNC2, TYR, OCA2, MC1R, and OPRM1 only modestly altered effect sizes for the other covariates (Appendix 1—table 5).

The effect size for SLC24A5A111T measured here is consistent with previously reported results of –5 melanin units calculated from an African-American sample (Lamason et al., 2005; Norton et al., 2007) and –5.5 from admixed inhabitants of the Cape Verde islands (Beleza et al., 2013). Reported effect sizes for continental Africans are both higher and lower: –7.7 in Crawford et al., 2017, and –3.6 in Martin et al., 2017b. The CANDELA study (a GWAS of combined admixed populations from Mexico, Brazil, Colombia, Chile, and Peru) (Adhikari et al., 2019) estimated an effect size of about –3 melanin units.

A significant effect of SLC45A2L374F on skin pigmentation was reported for the African-American sample by Norton et al., 2007, and in the CANDELA study by Adhikari et al., 2019, but not for the African Caribbean sample by Norton et al., 2007. The 4 unit effect size of this allele in the Kalinago reported here is similar to the 5 unit effect reported by Norton et al., 2007. Beleza et al., 2013 reported significance for an SNP in strong linkage disequilibrium with SLC45A2L374F, which was itself not genotyped.

Our estimate that a single OCA2NW273KV allele causes about –8 melanin units of skin lightening is the first reported population-based effect size measurement for any albinism allele. Although albinism is generally considered recessive, our population sample offered an opportunity to compare the effect size for the first and second alleles quantitatively. We applied the estimated parameters to the three albino individuals and found that they were lighter by an average of 10 units […] OCA2R305W homozygotes; when controlling for OCA2NW273KV status, OCA2R305W had no detectable effect on skin color (Appendix 2—table 1).

To identify novel SNPs that may contribute toward skin pigmentation in the Kalinago samples, we performed GWAS using linear regression and linear mixed models (LMMs). Estimated power for these analyses is shown in Figure 5—figure supplement 1, and Q-Q plots are depicted in Figure 5—figure supplement 2. The LMM approaches exhibited less statistic inflation than linear regression, likely because they better accounted for closely related individuals. Although the lowest p-values from the LMM-based methods meet the conventional criterion of 5e-08 for genome-wide significance (Appendix 3—table 1), our interpretation is that none of these variants warrant further investigation. Low observed minor allele frequencies (<2%) are inconsistent with those expected for variants responsible for pigmentation differences between the African and Native American populations because the frequencies of alleles responsible for population differences are expected to be highly differentiated between these source populations.

Additional Native American hypopigmenting alleles of significant effect size remain to be identified. Previously characterized variants do not explain this difference. It is possible that multiple hypopigmenting variants of small effect sizes are together required to reach Native American and/or East Asian levels of hypopigmentation, individually having insufficient effect to detect in the Kalinago, given our power limitations. If this is the case, multiple variants are required to explain the observed net difference in pigmentation. Alternatively, if there are variants with large effect sizes, it appears likely that they were not genotyped and are poorly tagged by the genotyped SNPs. Additional work will be required to find hypopigmentation alleles of significant effect size that are responsible for the lighter color of Native Americans.

Materials and methods



Participants from among the Kalinago populations were recruited with the help of nurses from the Kalinago Territory in 2014. Recruitment took place throughout the territory’s eight hamlets. Place and date of birth, reported genealogical ancestry of parents and grandparents, number of siblings, and response to sun exposure (tanning ability, burning susceptibility) were obtained by interview. Hair color and texture and eye color (characterized as black, brown, gray, blue, green, hazel, no pigment) were noted visually but not measured quantitatively.

Skin reflectometry


Skin reflectance was measured using a Datacolor CHECKPLUS spectrophotometer and converted to melanin units as we have previously described (Ang et al., 2012; Diffey et al., 1984). To minimize the confounding effects of sun exposure and body hair, skin color measurements were taken on each participant’s inner arm, and triplicate measurements were averaged. Before measurements were taken, alcohol wipes were used to minimize the effect of dirt and/or oil. To minimize blanching due to occlusion of blood from the region being measured, care was taken to apply only sufficient pressure to the skin to prevent ambient light from entering the scanned area (Fullerton et al., 1996).

DNA collection


Saliva samples were collected using the Oragene Saliva kit, and DNA was extracted using the prepIT.L2P kit, both from DNA Genotek (Ottawa, Canada). DNA integrity was checked by agarose gel electrophoresis and quantitated using a NanoDrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). Further quantification was done using Qubit Fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) as needed, following the manufacturer’s instructions.



OCA variants previously identified in African and Native Americans (Carrasco et al., 2009; King et al., 2003; Stevens et al., 1997; Yi et al., 2003) were amplified by PCR in all albino individuals as well as control samples using published conditions. Selected alleles of SLC24A5, SLC45A2, OCA2, and MFSD12 were amplified in all sampled individuals as described in Appendix 1—table 6. Amplicons generated by 30 cycles of PCR using an Eppendorf thermocycler were sequenced (GeneWiz, South Plainfield, NJ, USA) and the chromatograms viewed using Geneious software.

Illumina SNP genotyping using the Infinium Omni2.5–8 BeadChip was performed for all the individuals sampled. This was performed in three cohorts, using slightly different versions of the array, and the results were combined. Due to ascertainment differences between the cohorts, analysis is presented here only for the combined sample. After quality control to eliminate duplicates and monomorphic variants, and to remove variants and individuals with genotype failure rates >0.05, 458 Kalinago individuals and 1,638,140 unique autosomal SNPs remained.

Whole exome sequencing of albino individual and obligate carrier


In order to identify the causative variant for albinism in the Kalinago, two samples (one albino individual and one parent) were selected for whole exome sequencing. Following shearing of input DNA (1 µg) using a Covaris E220 Focused-ultrasonicator (Woburn, MA, USA), exome enrichment and library preparation was done using the Agilent SureSelect V5+UTR kit (Santa Clara, CA, USA). The samples were sequenced at 50× coverage using a HiSeq 2500 sequencer (Illumina, San Diego, CA, USA).

The fastq files were aligned to the human reference genome GRCh37 (hg19) using BWA (Li and Durbin, 2009) and bowtie (Langmead et al., 2009). Candidate SNPs were identified using GATK’s UnifiedGenotyper (McKenna et al., 2010), while the IGV browser was used to examine the exons of interest for indels (Thorvaldsdóttir et al., 2013). Variants with low sequence depth (<10) in either sample were excluded from further consideration.

Computational analysis


Basic statistics, merges with other datasets, and association analysis by linear regression were performed using plink 1.9 (Chang et al., 2015; Purcell et al., 2007). Phasing and imputation, as well as analysis of regions of homozygosity by descent and identity by descent were performed with Beagle 4.1 (Browning and Browning, 2013; Browning and Browning, 2007), using 1KGP phased data (Auton et al., 2015) as reference.

The genotyped individuals were randomly partitioned into nine subsets of 50 or 51 individuals (n=50 subsets) in which no pair exhibited greater than second-order relationship (PI_HAT >0.25 using the --genome command in plink). Using the same criteria, a maximal subset of 184 individuals was also generated (n=184 subset).
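The text does not spell out how the partition was constructed, so the following is only one plausible sketch: a randomized greedy assignment that places each individual in the smallest subset containing none of their close relatives. The function `partition_unrelated` and the `related_pairs` input (pairs flagged by a PI_HAT threshold from plink's --genome output) are assumptions introduced for illustration, not the authors' procedure.

```python
import random
from collections import defaultdict


def partition_unrelated(ids, related_pairs, n_subsets, seed=0, max_tries=100):
    """Split ids into n_subsets so that no related pair shares a subset.

    related_pairs: iterable of (id_a, id_b) pairs exceeding the chosen
    relatedness threshold (hypothetical input derived from PI_HAT values).
    """
    relatives = defaultdict(set)
    for a, b in related_pairs:
        relatives[a].add(b)
        relatives[b].add(a)

    rng = random.Random(seed)
    for _ in range(max_tries):
        order = list(ids)
        rng.shuffle(order)                      # random processing order
        subsets = [set() for _ in range(n_subsets)]
        placed_all = True
        for person in order:
            # Subsets that contain none of this person's relatives.
            free = [s for s in subsets if not (s & relatives[person])]
            if not free:
                placed_all = False              # dead end, reshuffle
                break
            min(free, key=len).add(person)      # keep sizes balanced
        if placed_all:
            return subsets
    raise RuntimeError("no valid partition found; try more subsets")


if __name__ == "__main__":
    demo = partition_unrelated(range(12), [(0, 1), (1, 2), (3, 4)], 3)
    print([sorted(s) for s in demo])
```

Choosing the smallest eligible subset keeps subset sizes nearly equal, which matches the description of nine subsets of 50 or 51 individuals; a maximal unrelated subset such as the n=184 set would need a different, independent-set style search.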

PCA was performed using the smartpca program (version 13050) in the eigensoft package (Price et al., 2006). For comparison to HGDP populations, Kalinago samples were projected onto PCs calculated for the HGDP samples alone. For use as covariates in association analyses, the n=184 subset was used to generate the PCA, and the remaining individuals were projected onto the same axes.

Admixture analysis was performed using the ADMIXTURE program (Alexander et al., 2009; Zhou et al., 2011). Each of the nine n=50 Kalinago subsets was merged with the N=940 subset of HGDP data (Li et al., 2008; Rosenberg, 2006) for analysis (349,923 SNPs) and the outputs were combined, averaging genetic ancestry proportions for the common HGDP individuals across runs. These results were used in figures. Separately, two-stage admixture analysis started with the averaged estimated allele frequencies and then employed the projection (--P) matrix outputs to estimate individual genetic ancestry for the combined Kalinago sample. Individual ancestries estimated using both methods, as well as those estimated from a thinned subset of 50,074 SNPs, were in good agreement, consistent with standard errors estimated by bootstrap analysis, although sample-wide averages differed slightly. Cross-validation was enabled by adding the --cv flag to the ADMIXTURE command.

For association analyses, we removed the three albino individuals and excluded SNPs with minor allele frequency <0.01. For conventional association analysis by linear regression, the standard additive genetic model included sex, the first 10 PCs, and genotypes of rs1426654 (SLC24A5), rs16891982 (SLC45A2), and the albino variant rs797044784 (OCA2) as covariates (Supplementary file 4). LMM analysis was performed using the mlma module of GCTA (Yang et al., 2011) with the --mlma-no-preadj-covar flag to suppress calculation using residuals. Two genetic relatedness matrices (GRM) were used: a standard GRM calculated using GCTA’s --make-grm command and an ancestry-aware GRM calculated using relationships deduced by REAP (Thornton et al., 2012) that utilized the output of the two-stage admixture analysis. For linear regression only, p-values were adjusted for statistic inflation by genomic control using the lambda calculated from the median chi-square statistic.
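The genomic control adjustment mentioned in the last sentence can be sketched as follows, assuming 1-degree-of-freedom tests: each p-value is converted to a chi-square statistic, lambda is the median statistic divided by the median of the chi-square distribution with 1 df (about 0.455), and each statistic is divided by lambda before converting back to a p-value. The function name `genomic_control` is ours, and this is a standard-library illustration of the general technique rather than the authors' pipeline.

```python
import math
import statistics
from statistics import NormalDist

MEDIAN_CHI2_1DF = 0.4549364  # median of a chi-square distribution, 1 df


def genomic_control(p_values):
    """Return (lambda, adjusted p-values) for two-sided 1-df tests.

    p_values must lie in the interval (0, 1].
    """
    normal = NormalDist()
    # A 1-df chi-square statistic is the square of a standard normal
    # quantile; the lower tail is used so tiny p-values stay accurate.
    chi2 = [normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    lam = statistics.median(chi2) / MEDIAN_CHI2_1DF
    # Deflate each statistic by lambda and convert back to a p-value.
    adjusted = [math.erfc(math.sqrt(c / lam / 2.0)) for c in chi2]
    return lam, adjusted


if __name__ == "__main__":
    lam, adj = genomic_control([0.001, 0.02, 0.2, 0.4, 0.7])
    print(f"lambda = {lam:.3f}")
    print([f"{p:.4f}" for p in adj])
```

Some implementations cap lambda at a minimum of 1 so that p-values are never made smaller; that choice is not made here and is not specified in the text.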

Statistical power was estimated by simulation, using a subset of genotyped SNPs. Starting with the 349,923 SNPs used for genetic ancestry analysis, the averaged P matrix from ADMIXTURE analysis at K=4 provided an initial estimate of allele frequencies in AFR and NAM ancestral populations; 10,233 SNPs exhibited differentiation of 0.7 or greater between these populations, a value chosen as a reasonable minimum population differentiation for causative variants. After removal of SNPs for which predicted Kalinago sample frequencies deviated by more than 0.1 from observed values and those with adjusted p<0.1, 8766 SNPs remained. Phenotypes were simulated by randomly selecting one of these SNPs and adding a defined effect size to the observed phenotype. Simulated datasets were then analyzed with plink using the standard genetic model.
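The SNP selection and phenotype simulation steps described above can be sketched roughly as below. The thresholds (0.7, 0.1, and 0.1) come from the text, while the function names, argument names, and the assumption that the defined effect is added once per copy of the selected allele are ours.

```python
def passes_filters(f_afr, f_nam, predicted_freq, observed_freq, adjusted_p,
                   min_diff=0.7, max_dev=0.1, min_p=0.1):
    """Decide whether a SNP is kept for the power simulation.

    f_afr, f_nam: estimated allele frequencies in the AFR and NAM
    ancestral populations (from the ADMIXTURE P matrix in the text).
    """
    differentiated = abs(f_afr - f_nam) >= min_diff
    well_predicted = abs(predicted_freq - observed_freq) <= max_dev
    not_associated = adjusted_p >= min_p   # drop SNPs with adjusted p < 0.1
    return differentiated and well_predicted and not_associated


def simulate_phenotype(observed_mi, allele_counts, effect_per_allele):
    """Add a defined effect to observed melanin indices.

    Assumes an additive effect per allele copy (0, 1, or 2 copies).
    """
    if len(observed_mi) != len(allele_counts):
        raise ValueError("phenotype and genotype lists differ in length")
    return [mi + effect_per_allele * g
            for mi, g in zip(observed_mi, allele_counts)]


if __name__ == "__main__":
    print(passes_filters(0.9, 0.1, 0.45, 0.50, 0.6))
    print(simulate_phenotype([40, 50, 60], [0, 1, 2], -3))
```

Repeating the simulation over many randomly chosen SNPs and counting how often the association test reaches significance would then give a power estimate for each effect size.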

Statistical analysis of pigmentary effect of albinism involved fitting parameters to an additive model for the sample containing carriers but lacking albino individuals, applying the same model to the albino individuals, and comparing residuals for the albinos and the other individuals.

Local genetic ancestry analysis of the region containing the albinism allele was performed using the PopPhased version of rfmix (v1.5.4) with the default window size of 0.2 cM (Maples et al., 2013). A subset of 1KGP data served as reference haplotypes for European, African, and East Asian populations, and the Native American genetic ancestry segments of the admixed samples as determined by Martin et al., 2017a, were combined to generate synthetic Native American reference haplotypes. For estimates of individual genetic ancestry, Viterbi outputs for each window were averaged across all autosomes.
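The final averaging step can be illustrated with a minimal sketch. It assumes the Viterbi output has already been parsed into one ancestry label per window per haplotype, and it weights every window equally; the function name `ancestry_fractions`, the labels, and the equal weighting are illustrative assumptions, since the text does not say whether windows were weighted by length.

```python
from collections import Counter


def ancestry_fractions(viterbi_calls):
    """Average window-level ancestry calls into individual proportions.

    viterbi_calls: iterable of ancestry labels, one per window and
    haplotype, pooled across all autosomes for one individual.
    """
    counts = Counter(viterbi_calls)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no ancestry calls supplied")
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    calls = ["NAM"] * 6 + ["AFR"] * 3 + ["EUR"] * 1
    print(ancestry_fractions(calls))
```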

Appendix 1

Supplementary Tables

Appendix 1—table 1
Sample Demographics.
| Category | Entire sample (N=461) |
|---|---|
| mean (SD) | 39 (21.5) |
| Paternal ancestry | |
| Maternal ancestry | |

  1. * community-described ancestry collected.
  2. values from reported genealogy; 75 fathers and 146 mothers as determined by genotyping.

Appendix 1—table 2
Summary of Kalinago ancestry from admixture analysis (n=458).

NAM = Native American, AFR = African, EUR = European, CSA = Central & South Asian, EAS = East Asian, OCE = Oceanian. At K=3, NAM, EAS, and OCE are not distinguishable.

Appendix 1—table 3
Ancestry proportions estimated using different approaches.
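| Estimation approach | AMR | AFR | EUR | EAS |
|---|---|---|---|---|
| Admixture (subsets, K=4) | 0.549 | 0.318 | 0.122 | 0.011 |
| Admixture (two stage, K=4) | 0.541 | 0.316 | 0.126 | 0.016 |
| rfmix (4 clusters) | 0.553 | 0.313 | 0.125 | 0.009 |
| rfmix (3 clusters) | 0.557 | 0.326 | 0.117 | --- |
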
Appendix 1—table 4
Summary by locus of albinism candidates identified through exome sequencing.

Candidates are homozygous derived in one albino and heterozygous in one obligate carrier. No nonsense, frameshift, or splice variants were detected. Our initial attempt to identify the albinism variant in the Kalinago involved targeted genotyping of the albino individuals for 28 mutations previously observed (Honychurch, 2012; Honychurch, 1998; Li et al., 2008; Loomis, 1967) in African or Native American albinos; these included the 2.7 kb exon 7 deletion in OCA2 found at high frequency in some African populations. No mutation was detected using this approach.

OCA geneChromosomeVariantsMissense
OCA1 (TYR)110
OCA3 (TYRP1)90
OCA4 (SLC45A2)50
OCA6 (SLC24A5)150
OCA7 (LRMDA)1010
B. Characteristics of individual candidates identified through exome sequencing
ChrrsIDRefAltf(AFR)*GeneLocation/ Effect
  1. *

    Overall frequency for non-reference allele in seven 1KGP African populations.

  2. 1KGP describes this variant as four consecutive SNPs rs549973474, rs569395077, rs538385900 and rs558126113.

Appendix 1—table 5
Effect sizes for covariates in linear regression model with 10 Principal Components.

Effect sizes are per allele for genomic variants (first allele only for albino variant). PC1 variance for included individuals (n=452) is 0.0045. P values adjusted using genomic control (applied to GWAS on the full variant set) are omitted if raw P value is above 0.05.

variableGenestandardBETAP_rawQ (-log P)alt1BETAP_rawQBeta-ratio-std
  1. This table compares three versions of analysis (linear regression only).

  2. P values reported here (and Q = – log P) are not corrected for statistic inflation.

  3. The last column for each non-standard case shows ratio of the effect size to that for the standard model, omitting PCs other than.

  4. alt1 model adds age to standard analysis.

  5. alt2 model adds five additional SNPs to standard analysis.

Appendix 1—table 6
Amplification conditions used for genotyping Kalinago samples for the selected alleles.
Gene & VariantPrimer SequencePCR Annealing Temperature (°C)

Appendix 2

Supplementary Tables

Appendix 2—table 1
Effect sizes of previously reported variants in Kalinago samples.
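| CHR | pos (b37) | SNP | REF | ALT | gene | location | CADD_PHRED | Polyphen (main) | SIFT (main) | Freq | GT source | AR2 | BETA_a | P_a_raw | P_a_adj | BETA_b | P_b | BETA_c | P_c | BETA_d | P_d | reference(s) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 25329016 | rs12233134 | C | T | EFR3B near POMC | intronic | 0.348 | - | - | 0.473 | IMP | 1 | 0.61 | 0.197 | 0.2653 | –0.06 | 0.910365 | 0.15 | 0.771545 | 0.23 | 0.657335 | Quillen et al., 2012 |
| 6 | 457748 | rs4959270 | C | A | LOC105374875 near IRF4 | intronic | 1.041 | - | - | 0.329 | GT | 1 | 0.07 | 0.8796 | 0.8959 | –0.14 | 0.783747 | –0.08 | 0.880465 | –0.07 | 0.896826 | Sulem et al., 2007 |
| 6 | 466033 | rs1540771 | C | T | LOC105374875 near IRF4 | intronic | 0.95 | - | - | 0.305 | GT | 1 | –0.02 | 0.9703 | 0.9744 | –0.15 | 0.765123 | –0.08 | 0.875216 | –0.09 | 0.853243 | Sulem et al., 2007 |
| 6 | 154663568 | rs2333857 | A | G | IPCEF1 near OPRM1 | intronic or upstream | 3.27 | - | - | 0.813 | IMP | 1 | 1.33 | 0.02932 | 0.05976 | 0.78 | 0.225651 | 1.16 | 0.0719104 | 1.22 | 0.059139 | Quillen et al., 2012 |
| 6 | 154721557 | rs6917661 | C | T | CNKSR3 near OPRM1 | 3'UTR or downstream | 1.824 | - | - | 0.584 | GT | 1 | 1.08 | 0.0117 | 0.02938 | 0.61 | 0.20166 | 0.72 | 0.133223 | 0.76 | 0.111546 | Quillen et al., 2012 |
| 7 | 55109177 | rs12668421 | A | T | EGFR | intronic | 0.212 | - | - | 0.494 | IMP | 0.98 | –0.16 | 0.7473 | 0.7809 | –0.39 | 0.46139 | –0.14 | 0.787191 | –0.15 | 0.780554 | Quillen et al., 2012 |
| 7 | 55156071 | rs11238349 | G | A | EGFR | intronic | 0.431 | - | - | 0.393 | IMP | 1 | 0.34 | 0.48 | 0.542 | –0.65 | 0.22091 | –0.37 | 0.479127 | –0.34 | 0.523937 | Quillen et al., 2012 |
| 7 | 55454267 | rs4948023 | G | A | LANCL2 near EGFR | intronic | 4.667 | - | - | 0.684 | IMP | 1 | 0.16 | 0.7641 | 0.7956 | –0.22 | 0.696131 | 0.00 | 0.996604 | 0.00 | 0.994267 | Quillen et al., 2012 |
| 9 | 12682663 | rs10809826 | C | G | TYRP1 | upstream | 1.738 | - | - | 0.117 | IMP | 0.96 | –0.28 | 0.6803 | 0.7221 | –0.55 | 0.472775 | –0.65 | 0.392833 | –0.73 | 0.332409 | Adhikari et al., 2019 |
| 9 | 16864521 | rs2153271 | C | T | BNC2 | intronic | 20.5 | - | - | 0.152 | GT | 1 | –2.27 | 0.0002337 | 0.001459 | –2.29 | 0.00110657 | –2.25 | 0.00120438 | –2.37 | 0.00065015 | Ju and Mathieson, 2021 |
| 10 | 119564143 | rs11198112 | C | T | near EMX2 | intergenic | 17.79 | | | 0.187 | GT | 1 | 0.33 | 0.5664 | 0.6206 | 1.10 | 0.067385 | 0.99 | 0.100263 | 0.96 | 0.11598 | Adhikari et al., 2019 |
| 11 | 88511524 | rs7118677 | G | T | GRM5 near TYR | intronic | 2.034 | - | - | 0.144 | IMP | 1 | –2.42 | 0.0001392 | 0.000982 | –1.62 | 0.0152453 | –1.99 | 0.00286815 | –2.10 | 0.00151048 | Adhikari et al., 2019 |
| 11 | 88911696 | rs1042602 | C | A | TYR | S192Y | 23.8 | probably_damaging(0.974) | deleterious(0.01) | 0.072 | IMP | 0.74 | –2.90 | 0.0003701 | 0.002074 | –2.16 | 0.0119931 | –2.43 | 0.00469847 | –2.56 | 0.00273999 | Stokowski et al., 2007 |
| 11 | 89011046 | rs1393350 | G | A | TYR | intronic | 1.555 | - | - | 0.019 | GT | 1 | –3.51 | 0.01947 | 0.04354 | –2.90 | 0.0869741 | –3.38 | 0.042737 | –3.57 | 0.0261186 | Liu et al., 2015 |
| 11 | 89017961 | rs1126809 | G | A | TYR | R402Q | 27.2 | probably_damaging(0.994) | deleterious(0.03) | 0.019 | IMP | 0.97 | –3.51 | 0.01947 | 0.04354 | –2.90 | 0.0869741 | –3.38 | 0.042737 | –3.57 | 0.0261186 | Adhikari et al., 2019; Ju and Mathieson, 2021 |
| 12 | 89299746 | rs642742 | C | T | KITLG | upstream | 14.92 | - | - | 0.569 | GT | 1 | –0.34 | 0.4757 | 0.538 | –0.20 | 0.708049 | –0.30 | 0.568757 | –0.28 | 0.602548 | Sturm, 2009 |
| 12 | 89328335 | rs12821256 | T | C | KITLG | upstream | 15.74 | - | - | 0.015 | GT | 1 | 1.20 | 0.5327 | 0.5902 | –1.36 | 0.524421 | –1.08 | 0.608744 | –0.95 | 0.648883 | Ju and Mathieson, 2021 |
| 14 | 92773663 | rs12896399 | G | T | LOC105370627 near SLC24A4 | intronic | 0.043 | - | - | 0.054 | GT | 1 | –0.16 | 0.8692 | 0.887 | –0.96 | 0.352238 | –0.89 | 0.380027 | –0.98 | 0.337781 | Sulem et al., 2007 |
| 15 | 28197037 | rs1800414 | T | C | OCA2 | H615R | 23.3 | benign(0.133) | deleterious(0) | 0.070 | IMP | 0.26 | 0.48 | 0.6116 | 0.6612 | –0.01 | 0.989592 | 0.07 | 0.939843 | 0.08 | 0.938174 | Edwards et al., 2010 |
| 15 | 28213850 | rs4778219 | C | T | OCA2 | intronic | 1.527 | - | - | 0.316 | GT | 1 | –0.53 | 0.3074 | 0.3782 | –0.89 | 0.0952479 | –0.93 | 0.083945 | –0.90 | 0.0977717 | Adhikari et al., 2019 |
| 15 | 28235773 | rs1800404 | C | T | OCA2 | synonymous coding | 0.321 | - | - | 0.488 | GT | 1 | –1.50 | 0.001446 | 0.005889 | –1.47 | 0.00324612 | –1.30 | 0.00924527 | –1.37 | 0.00639755 | Crawford et al., 2017; Adhikari et al., 2019 |
| 15 | 28344238 | rs7495174 | A | G | OCA2 | intronic | 7.622 | - | - | 0.087 | GT | 1 | 1.64 | 0.0326 | 0.0649 | 1.57 | 0.0663265 | 1.60 | 0.0580815 | 1.60 | 0.0567887 | Han et al., 2008 |
| 15 | 28365618 | rs12913832 | A | G | HERC2 near OCA2 | intronic | 15.8 | - | - | 0.074 | GT | 1 | –1.86 | 0.03926 | 0.07497 | –1.48 | 0.112546 | –1.53 | 0.100514 | –1.57 | 0.0919142 | Liu et al., 2015; Adhikari et al., 2019 |
| 15 | 28380518 | rs4778249 | T | A | HERC2 near OCA2 | intronic | 0.649 | - | - | 0.790 | IMP | 1 | –2.23 | 0.0002214 | 0.0014 | –2.43 | 0.00041828 | –2.14 | 0.00162008 | –2.12 | 0.0016832 | Adhikari et al., 2019 |
| 15 | 28530182 | rs1667394 | C | T | HERC2 near OCA2 | intronic | 1.111 | - | - | 0.452 | GT | 1 | –1.42 | 0.002039 | 0.007666 | –1.56 | 0.00151414 | –1.41 | 0.0038335 | –1.43 | 0.00369895 | Sulem et al., 2007 |
| 16 | 89986117 | rs1805007 | C | T | MC1R | R151C | 25.2 | probably_damaging(0.996) | deleterious(0.02) | 0.016 | GT | 1 | 1.32 | 0.4775 | 0.5397 | 0.67 | 0.712981 | 0.93 | 0.613105 | 0.86 | 0.64501 | Ju and Mathieson, 2021 |
| 16 | 89986154 | rs885479 | G | A | MC1R | R163Q | 10.89 | benign(0.013) | tolerated(0.3) | 0.461 | IMP | 0.92 | –1.31 | 0.008565 | 0.0231 | –1.60 | 0.00241496 | –1.46 | 0.00572561 | –1.48 | 0.00546969 | Liu et al., 2015 |
| 19 | 3548231 | rs2240751 | A | G | MFSD12 | Y182H | 27.4 | probably_damaging(0.999) | deleterious(0) | 0.031 | GT | | –3.03 | 0.03735 | 0.0723 | –1.60 | 0.281792 | –1.46 | 0.33655 | –1.57 | 0.3027 | Adhikari et al., 2019 |
| 20 | 3625436 | rs562926 | C | T | ATRN | intronic or downstream | 4.601 | - | - | 0.402 | GT | 1 | 0.85 | 0.08705 | 0.1395 | –0.12 | 0.821567 | 0.18 | 0.730685 | 0.28 | 0.595369 | Quillen et al., 2012 |
| 20 | 32856998 | rs6058017 | A | G | ASIP/AHCY | 3'UTR/intron | 7.639 | - | - | 0.342 | IMP | 0.95 | –0.90 | 0.07274 | 0.1212 | –0.85 | 0.126901 | –1.04 | 0.0606301 | –1.04 | 0.0609305 | Stokowski et al., 2007 |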

Appendix 3

Supplementary Tables

Appendix 3—table 1
Top novel variants that may contribute towards skin pigmentation from our GWAS analysis.

While the lowest p-values from the LMM-based methods meet the conventional criterion of 5e-08 for genome-wide significance, the low observed minor allele frequencies (<2%) are inconsistent with what would be expected for variants responsible for pigmentation differences between the African and Native American populations.

| CHR | pos (b37) | SNP | REF | ALT | gene | location | CADD_PHRED | Freq | GT source | AR2 | BETA_a | P_a_raw | P_a_adj | BETA_b | P_b | BETA_c | P_c | BETA_d | P_d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 114560208 | rs113236485 | A | G | near SYT6 | intergenic | 1.323 | 0.014 | IMP | 0.91 | 11.96 | 1.01E-08 | 6.71E-07 | 12.24 | 1.57E-07 | 12.91 | 4.67E-08 | 12.96 | 2.17E-08 |
| 1 | 114576742 | rs145925324 | G | A | near SYT6 | intergenic | 0.648 | 0.013 | IMP | 1 | 12.50 | 1.09E-08 | 7.12E-07 | 12.56 | 1.55E-07 | 13.39 | 3.85E-08 | 13.49 | 1.81E-08 |
| 1 | 114581335 | rs141998140 | G | T | near SYT6 | intergenic | 2.099 | 0.013 | IMP | 1 | 12.50 | 1.09E-08 | 7.12E-07 | 12.56 | 1.55E-07 | 13.39 | 3.85E-08 | 13.49 | 1.81E-08 |
| 1 | 114582335 | rs187318390 | C | T | near SYT6 | intergenic | 1.165 | 0.013 | IMP | 1 | 12.50 | 1.09E-08 | 7.12E-07 | 12.56 | 1.55E-07 | 13.39 | 3.85E-08 | 13.49 | 1.81E-08 |
| 1 | 114586703 | rs149623066 | A | G | near SYT6 | intergenic | 0.052 | 0.013 | IMP | 1 | 12.50 | 1.09E-08 | 7.12E-07 | 12.56 | 1.55E-07 | 13.39 | 3.85E-08 | 13.49 | 1.81E-08 |
| 1 | 114595150 | rs78273840 | C | T | near SYT6 | intergenic | 1.805 | 0.017 | IMP | 0.78 | 8.87 | 3.13E-06 | 5.39E-05 | 10.59 | 5.74E-07 | 11.11 | 1.98E-07 | 11.25 | 9.38E-08 |
| 1 | 114611620 | rs116218201 | T | G | near SYT6 | intergenic | 2.199 | 0.012 | IMP | 0.83 | 14.09 | 1.12E-09 | 1.24E-07 | 14.18 | 3.11E-08 | 15.52 | 3.16E-09 | 15.69 | 1.23E-09 |
| 1 | 114612965 | rs116746819 | G | A | near SYT6 | intergenic | 0.351 | 0.012 | IMP | 0.84 | 14.09 | 1.12E-09 | 1.24E-07 | 14.18 | 3.11E-08 | 15.52 | 3.16E-09 | 15.69 | 1.23E-09 |
| 1 | 114614230 | rs549514340 | T | C | near SYT6 | intergenic | 3.308 | 0.012 | IMP | 0.84 | 14.09 | 1.12E-09 | 1.24E-07 | 14.18 | 3.11E-08 | 15.52 | 3.16E-09 | 15.69 | 1.23E-09 |

Key: a = linear regression, 10 PCs; b = LMM with 0 PCs, std GRM; c = LMM with 10 PCs, std GRM; d = LMM with 10 PCs, REAP GRM; adj = adjusted based on lambda, the inflation factor; BETA = effect size.

Data availability

The whole exome sequencing and whole genome SNP genotyping data underlying this article cannot be shared publicly due to the privacy of individuals and stipulation by the Kalinago community. Only de-identified filtered SNP data used in analyses will be shared. Additional data will be shared on request to the corresponding author, pending approval from the Kalinago Council. M-index and specific genotyping data (SLC24A5 A111T, SLC45A2 L374F, OCA2 NW273KV, and OCA2 R305W) and genotyping data for Admixture have been uploaded to Dryad. The data cannot be used for any commercial purposes. We did not create any new software or script for analysis.

The following data sets were generated
    Ang KC, Canfield VA, Foster TC, Harbaugh TD, Early KA, Harter R, Reid KP, Leong S, Imamura Kawasawa Y, Liu D, Hawley JW, Cheng KC (2023) Native American Genetic Ancestry and Pigmentation Allele Contributions to Skin Color in a Caribbean Population. Dryad Digital Repository.


References

  1. Crawford MH, Phillips-Krawczak C, Beaty KG, Boaz N (2021) Human migration. In: Beaty KG, editors. Migration of Garifuna: Evolutionary Success Story. Human Migration. New York: Oxford University Press. 153.
  2. Honychurch L. Review of the lesser Antilles in the age of European expansion. In: Honychurch L, editors. NWIG: New West Indian Guide / Nieuwe West-Indische Gids. Brill. pp. 305–307.
  3. Honychurch L. The Dominica Story: A History of the Island.
  4. Lee ST, Nicholls RD, Schnur RE, Guida LC, Lu-Kuo J, Spinner NB, Zackai EH, Spritz RA. Diverse mutations of the P gene among African-Americans with type II. Human Molecular Genetics 3:2047–2051.
  5. Rogoziński J. A brief history of the Caribbean: from the Arawak and Carib to the present: Rosenberg NA. In: Rogoziński J, editors. Annals of Human Genetics. Wiley. pp. 841–847.
  6. Spritz RA, Fukai K, Holmes SA, Luande J. Frequent intragenic deletion of the p gene in tanzanian patients with type ii oculocutaneous albinism (Oca2). American Journal of Human Genetics 56:1320–1323.
  7. Stevens G, van Beukering J, Jenkins T, Ramsay M. An intragenic deletion of the p gene is the common mutation causing tyrosinase-positive oculocutaneous albinism in southern african negroids. American Journal of Human Genetics 56:586–591.
  8. Kalinago Territory (2021) Wikipedia. Accessed July 8, 2021.
  9. Woolf CM (2005) Albinism (Oca2) in Amerindians. American Journal of Physical Anthropology Suppl 41:118–140.

Article and author information

Author details

  1. Khai C Ang

    1. Department of Pathology, Penn State College of Medicine, Hershey, United States
    2. Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
    Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-7695-9953
  2. Victor A Canfield

    1. Department of Pathology, Penn State College of Medicine, Hershey, United States
    2. Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
    Formal analysis, Validation, Investigation, Visualization, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
  3. Tiffany C Foster

    1. Department of Pathology, Penn State College of Medicine, Hershey, United States
    2. Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
    Data curation, Investigation, Methodology, Writing - original draft
    Competing interests
    No competing interests declared
  4. Thaddeus D Harbaugh

    1. Department of Pathology, Penn State College of Medicine, Hershey, United States
    2. Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
    Competing interests
    No competing interests declared
  5. Kathryn A Early

    1. Department of Pathology, Penn State College of Medicine, Hershey, United States
    2. Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
    Competing interests
    No competing interests declared
  6. Rachel L Harter

    Department of Pathology, Penn State College of Medicine, Hershey, United States
    Competing interests
    No competing interests declared
  7. Katherine P Reid

    1. Department of Pathology, Penn State College of Medicine, Hershey, United States
    2. Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
    Competing interests
    No competing interests declared
  8. Shou Ling Leong

    Department of Family & Community Medicine, Penn State College of Medicine, Hershey, United States
    Competing interests
    No competing interests declared
  9. Yuka Kawasawa

    1. Department of Biochemistry and Molecular Biology, Penn State College of Medicine, Hershey, United States
    2. Department of Pharmacology, Penn State College of Medicine, Hershey, United States
    3. Institute of Personalized Medicine, Penn State College of Medicine, Hershey, United States
    Formal analysis
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-8638-6738
  10. Dajiang Liu

    1. Department of Biochemistry and Molecular Biology, Penn State College of Medicine, Hershey, United States
    2. Department of Public Health Sciences, Penn State College of Medicine, Hershey, United States
    Formal analysis
    Competing interests
    No competing interests declared
  11. John W Hawley

    Salybia Mission Project, Saint David Parish, Dominica
    Competing interests
    No competing interests declared
  12. Keith C Cheng

    1. Department of Pathology, Penn State College of Medicine, Hershey, United States
    2. Jake Gittlen Laboratories for Cancer Research, Penn State College of Medicine, Hershey, United States
    3. Department of Biochemistry and Molecular Biology, Penn State College of Medicine, Hershey, United States
    4. Department of Pharmacology, Penn State College of Medicine, Hershey, United States
    Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Visualization, Methodology, Writing - review and editing
    For correspondence
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-5350-5825


Hershey Rotary Club

  • Khai C Ang

Jake Gittlen Laboratories for Cancer Research

  • Keith C Cheng

Department of Pathology, Penn State College of Medicine

  • Keith C Cheng

Microryza (now

  • Khai C Ang

National Institute of Arthritis and Musculoskeletal and Skin Diseases (5R01 AR052535)

  • Keith C Cheng

National Institute of Arthritis and Musculoskeletal and Skin Diseases (3R01 AR052535-03S1)

  • Keith C Cheng

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.


We would like to thank the Kalinago Council, Dominica Ministry of Health, nurses at the Kalinago Territory, Salybia Mission Project, and the Kalinago community for their assistance and participation in this study. We would also like to acknowledge faculty of Ross University, Portsmouth, Dominica (now Bridgetown, Barbados), especially Drs. Gerhard Meisenberg (retired) and Liris Benjamin, for helping us to obtain the necessary IRB approval for fieldwork. Portions of this project were funded by the Hershey Rotary Club, Microryza (now, Jake Gittlen Laboratories for Cancer Research, National Institutes of Health grants 5R01 AR052535 and 3R01 AR052535-03S1 from the National Institute of Arthritis and Musculoskeletal and Skin Diseases, and the Department of Pathology. We would also like to acknowledge members of the Cheng Lab for their constructive comments and input.


Human subjects: The study was reviewed and approved by the Kalinago Council and the institutional review boards of Penn State University (29269EP), Ross University, and the Dominica Ministry of Health (H125). Informed consent was obtained from each participant enrolled in the study, and in the case of minors, consent was also obtained from a parent or guardian.

Version history

  1. Preprint posted: November 29, 2021
  2. Received: February 2, 2022
  3. Accepted: June 8, 2023
  4. Accepted Manuscript published: June 9, 2023 (version 1)
  5. Version of Record published: July 26, 2023 (version 2)
  6. Version of Record updated: October 20, 2023 (version 3)


© 2023, Ang et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.




Cite this article

Khai C Ang, Victor A Canfield, Tiffany C Foster, Thaddeus D Harbaugh, Kathryn A Early, Rachel L Harter, Katherine P Reid, Shou Ling Leong, Yuka Kawasawa, Dajiang Liu, John W Hawley, Keith C Cheng (2023) Native American genetic ancestry and pigmentation allele contributions to skin color in a Caribbean population. eLife 12:e77514.

