Native American genetic ancestry and pigmentation allele contributions to skin color in a Caribbean population
Abstract
Our interest in the genetic basis of skin color variation between populations led us to seek a Native American population with African genetic admixture but a low frequency of European light skin alleles. Analysis of 458 genomes from individuals residing in the Kalinago Territory of the Commonwealth of Dominica showed approximately 55% Native American, 32% African, and 12% European genetic ancestry, the highest Native American genetic ancestry reported among Caribbean populations to date. Skin pigmentation ranged from 20 to 80 melanin units, averaging 46. Three albino individuals were determined to be homozygous for a causative multi-nucleotide polymorphism OCA2NW273KV contained within a haplotype of African origin; its allele frequency was 0.03 and its single allele effect size was –8 melanin units. Derived allele frequencies of SLC24A5A111T and SLC45A2L374F were 0.14 and 0.06, with single allele effect sizes of –6 and –4, respectively. Native American genetic ancestry by itself reduced pigmentation by more than 20 melanin units (range 24–29). The responsible hypopigmenting genetic variants remain to be identified, since none of the published polymorphisms predicted in prior literature to affect skin color in Native Americans caused detectable hypopigmentation in the Kalinago.
Editor's evaluation
This pigmentation study focuses on a community from Kalinago Territory from the Caribbean islands that on average possess high percentages of Indigenous American ancestry, and broadens the effort of quantifying the genetic effects on skin pigmentation in humans. This paper describes an analysis of the genetic structure of the Kalinago population in the Commonwealth of Dominica, and the relationship between ancestry and skin pigmentation in that population. They provide valuable new insights into the skin-lightening effect of Native American alleles, which likely have been obscured by the effect of European alleles in previous studies of admixed Native American populations. Additionally, this paper provides an interesting analysis of previously reported albinism alleles, which paints a more complex picture of the genetic architecture of pigmentation.
https://doi.org/10.7554/eLife.77514.sa0
eLife digest
The variation in skin colour of modern humans is a product of thousands of years of natural selection. All human ancestry can be traced back to African populations, which were dark-skinned to protect them from the intense UV rays of the sun.
Over time, humans spread to other parts of the world, and people in the northern latitudes with lower UV developed lighter skin through natural selection. This was likely driven by a need for vitamin D, which requires UV rays for production.
Separate genetic mechanisms were involved in the evolution of lighter skin in each of the two main branches of human migration: the European branch (which includes peoples on the Indian subcontinent and Europe) and the East Asian branch (which includes East Asia and the Americas).
A variant of the gene SLC24A5 is the primary contributor to lighter skin colour in the European branch, but a corresponding variant driving light skin colour evolution in the East Asian branch remains to be identified.
One obstacle to finding such variants is the high prevalence of European ancestry in most people groups, which makes it difficult to separate the influence of European genes from those of other populations. To overcome this issue, Ang et al. studied a population that had a high proportion of Native American and African ancestors, but a relatively small proportion of European ancestors, the Kalinago people. The Kalinago live on the island of Dominica, one of the last Caribbean islands to be colonised by Europeans.
Ang et al. were able to collect hundreds of skin pigmentation measurements and DNA samples of the Kalinago, to trace the effect of Native American ancestry on skin colour. Genetic analysis confirmed their oral history records of primarily Native American (55%) – one of the highest of any Caribbean population studied to date – compared with African (32%) and European (12%) ancestries.
Native American ancestry had the highest effect on pigmentation and reduced it by more than 20 melanin units, while the European mutations in the genes SLC24A5 and SLC45A2 and an African gene variant for albinism only contributed 5, 4 and 8 melanin units, respectively. However, none of the so far published gene candidates responsible for skin lightening in Native Americans caused a detectable effect. Therefore, the gene responsible for lighter skin in Native Americans/East Asians has yet to be identified.
The work of Ang et al. represents an important step in deciphering the genetic basis of lighter skin colour in Native Americans or East Asians. A better understanding of the genetics of skin pigmentation may help to identify why, for example, East Asians are less susceptible to melanoma than Europeans, despite both having a lighter skin colour. It may also further acceptance of how variations in human skin tones are the result of human migration, random genetic variation, and natural selection for pigmentation in different solar environments.
Introduction
Human skin pigmentation is a polygenic trait that is influenced by health and environment (Barsh, 2003). Lighter skin is most common in populations adapted to northern latitudes characterized by lower UV incidence than equatorial latitudes (Jablonski and Chaplin, 2000). Selection for lighter skin, biochemically driven by a solar UV-dependent photoactivation step in the formation of vitamin D (Engelsen, 2010; Hanel and Carlberg, 2020; Holick, 1981; Loomis, 1967) is regarded as the most likely basis for a convergent evolution of lighter skin color in European and East Asian/Native American populations (Lamason et al., 2005; Norton et al., 2007). The hypopigmentation polymorphisms of greatest significance in Europeans have two key characteristics: large effect size and near fixation. For example, the A111T allele in SLC24A5 (Lamason et al., 2005) explains at least 25% of the difference in skin color between people of African vs. European genetic ancestry, and is nearly fixed in European populations. No equivalent polymorphism in Native Americans or East Asians has been found to date.
Native Americans share common genetic ancestry with East Asians (Derenko et al., 2010; Tamm et al., 2007), diverging before ~15 kya (Gravel et al., 2013; Moreno-Mayar et al., 2018; Reich et al., 2012), but the extent to which these populations share pigmentation variants remains to be determined. The derived alleles of rs2333857 and rs6917661 near OPRM1, and rs12668421 and rs11238349 in EGFR are near fixation in some Native American populations, but all also have a high frequency in Europeans (Quillen et al., 2012), and none reach genome-wide significance in Adhikari et al., 2019. The latter study did find a significant association of the Y182H variant of MFSD12 with skin color, but its frequencies were only 0.27 and 0.17 in Native Americans and East Asians, respectively, suggesting that it can explain only a small portion of the difference in skin color between Native Americans and/or East Asians and Africans. Thus, the genetic basis for lighter skin pigmentation specific to Native American and East Asian populations, whose African alleles would be expected to be ancestral, remains to be found.
The shared genetic ancestry of East Asians and Native Americans suggests the likelihood that some light skin color alleles are shared between these populations. This is particularly the case for any variants that achieved fixation in their common ancestors. For Native American populations migrating from Beringia to the Tropics, selection for darker skin color also appears likely (Jablonski and Chaplin, 2000; Quillen et al., 2019). This would have increased the frequency of novel dark skin variants, if any, and would have decreased the frequency of light skin variants that had not achieved fixation. Hypopigmenting alleles are associated with the European admixture characteristic of many current Native American populations (Brown et al., 2017; Gravel et al., 2013; Keith et al., 2021; Klimentidis et al., 2009; Reich et al., 2012). Since the European hypopigmenting alleles may mask the effects of East Asian and Native American alleles, we searched for an admixed Native American population with high African, but low European admixture.
Prior to European contact, the Caribbean islands were inhabited by populations who migrated from the northern coast of South America (Benn-Torres et al., 2008; Harvey et al., 1969; Honychurch, 2012; Island Caribs, 2016; Benn Torres et al., 2015). During the Colonial period, large numbers of Africans were introduced into the Caribbean as slave labor (Honychurch, 2012; Benn Torres et al., 2013). As a consequence of African and European admixture and high mortality among the indigenous populations, Native American genetic ancestry now contributes only a minor portion (<15%) of the genetic ancestry of most Caribbean islanders (Auton et al., 2015; Benn Torres et al., 2015). The islands of Dominica and St. Vincent were the last colonized by Europeans in the late 1700s (Honychurch, 2012; Honychurch, 1998; Rogoziński, 2000). In 1903, the British granted 15 km² (3700 acres) on the eastern coast of Dominica as a reservation for the Kalinago, who were then called ‘Carib.’ When Dominica gained independence in 1978, legal rights and a degree of protection from assimilation were gained by the inhabitants of the Carib Reserve (Honychurch, 2012) (redesignated Kalinago Territory in 2015). Oral history and beliefs among the Kalinago, numbering about 3000 living within the Territory (2021; Figure 1—figure supplement 2), are consistent with the primarily Native American and African genetic ancestry, assessed and confirmed genetically here.
Early in our genetic and phenotypic survey of the Kalinago, we noted an albino individual, and upon further investigation, we learned of two others residing in the Territory. We set out to identify the mutant albinism allele to avoid single albino allele effects that would potentially mask Native American hypopigmentation alleles. Oculocutaneous albinism (OCA) is a recessive trait characterized by visual system abnormalities and hypopigmentation of skin, hair, and eyes (Gargiulo et al., 2011; Grønskov et al., 2007; Grønskov et al., 2014; Hong et al., 2006; Vogel et al., 2008) that is caused by mutations in any of a number of autosomal pigmentation genes (Carrasco et al., 2009; Edwards et al., 2010; Gao et al., 2017; Grønskov et al., 2013; Kausar et al., 2013; King et al., 2003; Spritz et al., 1995; Stevens et al., 1997; Stevens et al., 1995; Vogel et al., 2008; Woolf, 2005; Yi et al., 2003). The incidence of albinism is ~1:20,000 in populations of European descent, but much higher in some populations, including many in sub-Saharan Africa (1:5000) (Greaves, 2014). Here, we report on the genetic ancestry of a population sample representing 15% of the Kalinago population of Dominica, the identification of the new albinism allele in that population, and measurement of the hypopigmenting effects of the responsible albinism allele and of the European SLC24A5A111T and SLC45A2L374F alleles. Native American genetic ancestry alone caused a measurable effect on pigmentation. In contrast, alleles identified in past studies of Native American skin color caused no significant effect on skin color.
Results and discussion
Our search for a population admixed for Native American/African ancestries with minimal European admixture led us to the ‘Carib’ population in the Commonwealth of Dominica. Observations from an initial trip to Dominica suggested wide variation in Kalinago skin color. Pursuit of the genetic studies described here required learning about oral and written histories, detailed discussion with community leadership, IRB approval from Ross University (until Hurricane Maria in 2017, the largest medical school in Dominica) and the Department of Health of the Commonwealth of Dominica, and relationship-building with three administrations of the Kalinago Council over 15 years.
Population sample
Our DNA and skin color sampling program encompassed 458 individuals, representing 15% of the population of the territory and all three known albino individuals. Ages ranged from 6 to 93 (Appendix 1—table 1 and Figure 1—figure supplement 3). We were able to obtain genealogical information for about half of the parents (243 mothers and 194 fathers). Community-defined ancestry (described as ‘Black,’ ‘Kalinago,’ or ‘Mixed’) for both parents was obtained for 426 individuals (92% of sample), including 108 parents from whom DNA samples were obtained (72 Kalinago, 36 Mixed, and 0 Black). Participants described themselves as Black, Kalinago, or Mixed based on their perception of their parents’ or grandparents’ skin color.
Kalinago genetic ancestry
The earliest western mention of the Kalinago (originally as ‘Caribs’) was in Christopher Columbus’s journal dated November 26, 1492 (Honychurch, 2012). Little is known about the detailed cultural and genetic similarities and differences between them and other Caribbean pre-contact groups such as the Taino. African admixture in the present Kalinago population derived from the African slave trade; despite inquiry across community, governmental, and historical sources, we were unable to find documentation of specific regions of origin in Africa or well-defined contributions from other groups. The population’s linguistics are uninformative, as they speak, in addition to English, the same French-based Antillean Creole spoken on the neighboring islands of Guadeloupe and Martinique.
To study Kalinago population structure, we analyzed an aggregate of our Kalinago SNP genotype data and HGDP data (Li et al., 2008) using ADMIXTURE (Figure 1 and Figure 1—figure supplement 1) as described in Materials and methods. At K=3, the ADMIXTURE result confirmed the three major clusters, corresponding roughly to Africans (black cluster), European/Middle Easterners/Central and South Asians (yellow cluster), and East Asians/Native Americans (green cluster). At K=4 and higher, the red component that predominates in Native Americans separates the Kalinago from the East Asians (green cluster). Consistent with prior work (Li et al., 2008), a purple cluster (Oceanians) appears at K=5 and a brown cluster (Central and South Asians) appears at K=6; both are minor sources of genetic ancestry in our Kalinago sample (average <1%) (Appendix 1—table 2).
At K=4 to K=6, the Kalinago show on average 55% Native American, 32% African, and 11–12% European genetic ancestry. Estimates from a two-stage admixture analysis are similar, as are results from local genetic ancestry analysis (see Materials and methods) (Appendix 1—table 3), leading to estimates of 54–56% Native American, 31–33% African, and 11–13% European genetic ancestry. The individual with the least admixture has approximately 94% Native American and 6% African genetic ancestry. The results of the principal component (PC) analysis (PCA) (Figure 2—figure supplement 1) were consistent with ADMIXTURE analysis. The first two PCs suggest that most Kalinago individuals show admixture between Native American and African genetic ancestry, with a smaller but highly variable European contribution apparent in the displacement in PC2 (Figure 2—figure supplement 1). A smaller number of Kalinago individuals with substantial East Asian genetic ancestry exhibit displacement in PC3 (Figure 2—figure supplement 1).
Our analysis of Kalinago genetic ancestry revealed considerably more Native American and less European genetic ancestry than the Caribbean samples of Benn Torres et al., 2013, and the admixed populations from the 1000 Genomes Project (1KGP) (Auton et al., 2015; Figure 2). Some Western Hemisphere Native Americans reported in Reich et al., 2012, have varying proportions of European but very little African admixture (Figure 2B). Overall, the Kalinago have more Native American and less European genetic ancestry than any other Caribbean population.
The 55% Native American genetic ancestry calculated from autosomal genotype in the Kalinago is greater than the reported 13% in Puerto Rico (Gravel et al., 2013), 10–15% for Tainos across the Caribbean (Schroeder et al., 2018), and 8% for Cubans (Marcheco-Teruel et al., 2014). This is also considerably higher than the reported 6% Native American genetic ancestry found in Bwa Mawego, a horticultural population that resides south of the Kalinago Territory (Keith et al., 2021). However, this result is lower than the 67% Native American genetic ancestry reported by Crawford et al., 2021, for an independently collected Kalinago sample based on mtDNA haplotype analysis. This difference suggests a paternal bias in combined European and/or African admixture. Since our Illumina SNP-chip genotyping does not yield reliable identification of mtDNA haplotypes, we are currently unable to compare maternal to autosomal genetic ancestry proportions for our sample. Samples genotyped using 105 genetic ancestry informative markers from Jamaica and the Lesser Antilles (Benn Torres et al., 2015) yielded an average of 7.7% Native American genetic ancestry (range 5.6%–16.2%), with the highest value from a population in Dominica sampled outside the Kalinago reservation. Relevant to the potential mapping of Native American light skin color alleles, the Kalinago population has among the lowest European genetic ancestry (12%) compared to other reported Caribbean Native Americans in St. Kitts (8.2%), Barbados (11.5%), and Puerto Rico (71%) (Benn Torres et al., 2013). Contributing to the high percentage of Native American genetic ancestry in the Kalinago is their segregation within the 3700 acre Kalinago Territory in Dominica granted by the British in 1903, and the Kalinago tradition that women marrying non-Kalinago are required to leave the Territory; non-Kalinago spouses of Kalinago men are allowed to move to the Territory (KCA, KCC, Personal Communication with Kalinago Council, 2014). These factors help to explain why samples collected outside the Kalinago Territory (Benn Torres et al., 2013) show lower fractional Native American genetic ancestry.
During our fieldwork, it was noted that members of the Kalinago community characterized themselves and others in terms of perceived genealogical ancestry as ‘Black,’ ‘Kalinago,’ or ‘Mixed.’ Compared to individuals self-identified as ‘Mixed,’ those self-identified as ‘Kalinago’ have on average more Native American genetic ancestry (67% vs 51%), less European genetic ancestry (10% vs 14%), and less African genetic ancestry (23% vs 34%) (Figure 2—figure supplement 2). Thus, these folk categories based on phenotype are reflected in some underlying differences in genetic ancestry.
Kalinago skin color variation
Melanin index unit (MI) calculated from skin reflectance measured at the inner upper arm (see Materials and methods) was used as a quantitative measure of melanin pigmentation (Ang et al., 2012; Diffey et al., 1984). MI determined in this way is commonly used as a measure of constitutive skin pigmentation (Choe et al., 2006; Park and Lee, 2005). The MI in the Kalinago ranged from 20.7 to 79.7 (Figure 4—figure supplement 1), averaging 45.7. The three Kalinago albino individuals sampled had the lowest values (20.7, 22.4, and 23.8). Excluding these, the MI ranged between 28.7 and 79.7 and averaged 45.9. For comparison, the MI averaged 25 and 21 for people of East Asian and European genetic ancestry, respectively, as measured with the same equipment in our laboratory (Ang et al., 2012; Tsetskhladze et al., 2012). This range is similar to that of another indigenous population, the Senoi of Peninsular Malaysia (MI 24–78; mean = 45.7) (Ang et al., 2012). The Senoi are believed to include admixture from Malaysian Negritos whose pigmentation is darker (mean = 55) (Ang et al., 2012) than that of the average Kalinago. In comparison, the average MI was 53.4 for Africans in Cape Verde (Beleza et al., 2012) and 59 for African-Americans (Shriver et al., 2003). Individuals self-described as ‘Kalinago’ were slightly lighter and had a narrower MI distribution (42.5± 5.6, mean ± SD) compared to ‘Mixed’ (45.8± 9.6) (Figure 4—figure supplement 2).
An OCA2 albinism allele in the Kalinago
OCA is a genetically determined condition characterized by nystagmus, reduced visual acuity, foveal hypoplasia, and strabismus as well as hypopigmentation of the skin, hair, and eye (Dessinioti et al., 2009; van Geel et al., 2013). The three sampled albino individuals had pale skin (MI 20.7, 22.4, and 23.8 vs. 29–80 for non-albino individuals), showed nystagmus, and reported photophobia and high susceptibility to sunburn. In contrast to the brown irides and black hair of most Kalinago, including their parents, the albino individuals had blonde hair and gray irides with varying amounts of green and blue.
To identify the albinism variant in the Kalinago, we first determined that none of the albino individuals carried any of 28 mutations previously found in African or Native American albino individuals (Carrasco et al., 2009; King et al., 2003; Stevens et al., 1997; Yi et al., 2003), including a 2.7 kb exon 7 deletion in OCA2 found at high frequency in some African populations. Whole exome sequencing of one albino individual and one parent (obligate carrier) revealed polymorphisms homozygous in the albino individual and heterozygous in the parent, an initial approach that assumes that the albino individual was not a compound heterozygote. We identified 12 variant alleles in 7 OCA genes (or genomic regions) that met these criteria (summarized in Appendix 1—table 4). None were nonsense or splice site variants. Five of the twelve variants were intronic, one was synonymous, one was located in the 5’UTR, and three were in the 3’UTR (Appendix 1—table 4). Two missense variants were found in OCA2: SNP rs1800401 (c.913C>T or p.Arg305Trp in exon 9), R305W, and multi-nucleotide polymorphism rs797044784 in exon 8 (c.819_822delCTGGinsGGTC; p.Asn273_Trp274delinsLysVal), NW273KV.
Among 458 Kalinago OCA2 genotypes, 26 carried NW273KV and 60 carried R305W (Table 1). Only NW273KV homozygotes were albino. We know that the allele responsible for albinism was NW273KV because neither of the two individuals homozygous for R305W but not NW273KV was albino. In further support of this conclusion, one individual who was homozygous for R305W and homozygous ancestral for NW273 had an MI of 72, among the darkest in the entire population. R305W is notably present with frequency >0.10 in some African, South Asian, and European populations (Auton et al., 2015), predicting a Hardy-Weinberg frequency of homozygotes above 1%. This is far greater than the observed frequency of individuals with albinism and therefore inconsistent with the idea that R305W is a variant responsible for albinism. The fact that R305W scores incorrectly as pathogenic using SIFT, Polyphen 2.0, and PANTHER (Kamaraj and Purohit, 2014) suggests a need for refinement of these methods. The universal association of R305W with the NW273KV haplotype indicates that the founder haplotype of the NW273KV albinism mutation carried the silent R305W variant.
To identify the origin of the albino allele, albino individuals and carriers were analyzed for regions exhibiting homozygosity, and identity-by-descent and local genetic ancestry were estimated (see Materials and methods). All three albino individuals share a homozygous segment of ~1.7 Mb that encompasses several genes in addition to OCA2 (Figure 3). The albino haplotype defined by homozygosity in individuals 2 and 3 extends ~11 Mb; comparison to local genetic ancestry shows that this haplotype is clearly of African origin.
The Kalinago albino individuals are the only reported individuals in whom albinism is caused by homozygosity for the NW273KV allele of OCA2. Two reported albino individuals of African-American/Dutch descent were compound heterozygotes for OCA2 mutations, with one allele being the NW273KV variant chromosome (Garrison et al., 2004; Lee et al., 1994). Conservation of the NW sequence among vertebrates and its inclusion in a potential N-linked glycosylation site (Rinchik et al., 1993) that is eliminated by the mutation support the variant’s pathogenicity. The NW273KV frequency in our sample (0.03) translates into a Hardy-Weinberg albinism frequency (p² = 0.0009) of ~1 per 1000, as observed (3 in a population of about 3000). Examination of publicly available data reveals three OCA2NW273KV heterozygotes in the 1000 Genomes Project, a pair of siblings from Barbados (ACB) and one individual from Sierra Leone (MSL). The three 1KGP individuals share a haplotype of ~1.5 Mb, of which ~1.0 Mb matches the albino haplotype in the Kalinago. The phasing for the OCA2NW273KV variant in the public data is inconsistent, with the variant assigned to the wrong chromosome for the ACB siblings.
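The Hardy-Weinberg arithmetic used here and for R305W can be sketched in a few lines. This is an illustrative calculation only, not the authors' analysis code; the function name is hypothetical, and the inputs are the allele frequencies and approximate population size quoted in the text.

```python
def hw_homozygote_frequency(allele_freq: float) -> float:
    """Expected homozygote frequency under Hardy-Weinberg equilibrium (frequency squared)."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return allele_freq ** 2


# Values quoted in the text.
nw273kv_freq = 0.03      # OCA2 NW273KV allele frequency in the sample
population_size = 3000   # approximate Kalinago population (illustration only)
r305w_freq = 0.10        # lower bound quoted for R305W in some populations

albinism_freq = hw_homozygote_frequency(nw273kv_freq)
expected_albino_count = albinism_freq * population_size
r305w_homozygote_freq = hw_homozygote_frequency(r305w_freq)

print(f"NW273KV homozygotes: {albinism_freq:.4f} of the population")
print(f"Expected albino individuals among {population_size}: {expected_albino_count:.1f}")
print(f"R305W homozygotes at q = {r305w_freq}: {r305w_homozygote_freq:.2f}")
```

Under these inputs the expected number of albino individuals is close to the three observed, whereas an R305W frequency of 0.10 or more would predict homozygotes in at least 1% of a population.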
Genetic contributions to Kalinago skin color variation
One motivation for undertaking this work was to characterize genetic contributions to skin pigmentation in a population with primarily Native American and African genetic ancestry, so that we could focus on the effect of Native American hypopigmenting alleles without interference from European alleles. The Kalinago population described here comprises the only population we are aware of that fits this genetic ancestry profile. To control for the effects of the major European pigmentation loci, all Kalinago samples were genotyped for SLC24A5A111T and SLC45A2L374F. The phenotypic effects of these variants and OCA2NW273KV are shown in Figure 4. Each variant decreases melanin pigmentation, with homozygotes being lighter than heterozygotes. The greatest effect is seen in the OCA2NW273KV homozygotes (the albino individuals), as previously noted. The frequencies of the derived alleles of SLC24A5A111T and SLC45A2L374F in the Kalinago sample are 0.14 and 0.06, respectively.
The markedly higher frequency of SLC24A5A111T compared to SLC45A2L374F is not explained solely by European admixture, given that most Europeans are nearly fixed for both alleles (Soejima and Koda, 2007). This deviation can be explained by the involvement of source populations that carry the SLC24A5A111T variant but not SLC45A2L374F. Although some sub-Saharan West African populations (the likeliest source of AFR genetic ancestry in the Kalinago) have negligible SLC24A5A111T frequencies, moderate frequencies are found in the Mende of Sierra Leone (MSL, allele frequency = 0.08) (Micheletti et al., 2020; Auton et al., 2015), while some West African populations such as the Hausa and Mandinka have allele frequencies of 0.11 and 0.15, respectively (Cheung et al., 2000; Rajeevan et al., 2012). Such African individuals carrying the SLC24A5A111T allele could potentially cause the observed frequencies by founder effect. In addition, the region of chromosome 5 containing SLC45A2 exhibits low European genetic ancestry (6.5%) that is consistent with the low observed SLC45A2L374F frequency.
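The reasoning above can be illustrated with a simple admixture expectation, in which the expected allele frequency is the ancestry-weighted sum of source-population frequencies. The sketch below is not the authors' analysis and ignores founder effects and drift. Treating the European frequency as exactly 1.0 and the Native American frequency as 0.0 are simplifying assumptions, and using the MSL frequency as the African source value is only one hypothetical scenario.

```python
def expected_allele_frequency(ancestry: dict, source_freq: dict) -> float:
    """Ancestry-weighted sum of source-population allele frequencies."""
    return sum(ancestry[pop] * source_freq[pop] for pop in ancestry)


# Genome-wide ancestry proportions reported in the text.
ancestry = {"NAM": 0.55, "AFR": 0.32, "EUR": 0.12}

# SLC24A5 A111T: European admixture alone (assumed fixed in EUR, absent elsewhere).
slc24a5_eur_only = expected_allele_frequency(
    ancestry, {"NAM": 0.0, "AFR": 0.0, "EUR": 1.0}
)

# Same allele, assuming an African source at the MSL frequency of 0.08.
slc24a5_with_afr = expected_allele_frequency(
    ancestry, {"NAM": 0.0, "AFR": 0.08, "EUR": 1.0}
)

# SLC45A2 L374F: local European ancestry at this locus is reported as 6.5%.
slc45a2_local = 0.065 * 1.0

print(f"SLC24A5 A111T, EUR only:    {slc24a5_eur_only:.3f} (observed 0.14)")
print(f"SLC24A5 A111T, EUR + AFR:   {slc24a5_with_afr:.3f} (observed 0.14)")
print(f"SLC45A2 L374F, local EUR:   {slc45a2_local:.3f} (observed 0.06)")
```

Under these assumptions, European admixture alone gives an expectation somewhat below the observed SLC24A5A111T frequency, an African contribution narrows the gap, and the low local European ancestry on chromosome 5 is in line with the observed SLC45A2L374F frequency.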
To investigate the potential effect of the SLC24A5A111T allele on the albinism phenotype, we also compared other pigmentation phenotypes such as hair and eye color for all albino individuals and carriers. One of the three Kalinago albino individuals was also heterozygous for SLC24A5A111T, but neither skin nor hair color for this individual was lighter than that of the other two albino individuals, who were homozygous for the ancestral allele at SLC24A5A111; this observation is consistent with epistasis of OCA2 hypopigmentation over that of SLC24A5A111T. Nine sampled non-albino individuals had combinations of hair that was reddish, yellowish, or blonde (n=6), skin with MI <30 (n=3), and gray, blue, green, or hazel irides (n=2); among these, six were heterozygous and one homozygous for SLC24A5A111T, and three were heterozygous for the albino variant. A precise understanding of the phenotypic effects of the combinations of these and other hypopigmenting alleles will require further study.
The strong dependence of pigmentation on Native American genetic ancestry is clarified by focusing on individuals lacking the hypopigmenting alleles SLC24A5A111T, SLC45A2L374F, and OCA2NW273KV (Figure 5). Although positive deviations from the best fit are apparent at both high and low Native American genetic ancestry, the trend toward lighter pigmentation as Native American genetic ancestry increases is clear. The net difference between African and Native American contributions to pigmentation appears likely to be bounded by the magnitudes of the slope vs NAM genetic ancestry (24 units) and the slope vs AFR genetic ancestry (29 units, not shown). The difference in melanin index value is expected to be explained by genetic variants that are highly differentiated between African and Native American populations.
To further investigate the contributions of genetic variation to skin color, we performed association analyses using an additive model for melanin index, conditioning on sex, genetic ancestry (using 10 PCs), and genotypes for SLC24A5A111T, SLC45A2L374F, and OCA2NW273KV. Assuming likely epistasis of albinism alleles over other hypopigmenting alleles, these analyses omitted the three albino individuals. Employing a linear regression model, we found that sex and all three genotyped polymorphisms were statistically significant (Table 2 and Figure 2—figure supplement 2). However, only SLC24A5A111T reaches genome-wide significance. PC1, which strongly correlated with Native American vs African genetic ancestry, exhibits the lowest p-value. Effect sizes were about –6 units (per allele) for SLC24A5A111T, –4 units for SLC45A2L374F, and –8 units for the first OCA2NW273KV allele.
Additional covariates were considered but not included in our standard model. Skin pigmentation exhibited a decreasing trend with age, but its contribution was not statistically significant (adjusted p-value = 0.08). Estimated effect sizes for significant covariates were little affected by the inclusion of age as a covariate (Appendix 1—table 5). Analysis of SNPs that were previously reported as relevant to pigmentation is shown in Appendix 2—table 1. The lowest (adjusted) p-value for this collection of variants is about 0.001, considerably larger than the p-values for the variants included as covariates in our standard model. Inclusion of the SNP of lowest p-value from each of the five regions containing BNC2, TYR, OCA2, MC1R, and OPRM1 only modestly altered effect sizes for the other covariates (Appendix 1—table 5).
The effect size for SLC24A5A111T measured here is consistent with previously reported results of –5 melanin units calculated from an African-American sample (Lamason et al., 2005; Norton et al., 2007) and –5.5 from admixed inhabitants of the Cape Verde islands (Beleza et al., 2013). Reported effect sizes for continental Africans are both higher and lower: –7.7 in Crawford et al., 2017, and –3.6 in Martin et al., 2017b. The CANDELA study (GWAS of combined admixed populations from Mexico, Brazil, Colombia, Chile, and Peru) (Adhikari et al., 2019) yielded an effect size of about –3 melanin units.
A significant effect of SLC45A2L374F on skin pigmentation was reported for the African-American sample by Norton et al., 2007, and in the CANDELA study by Adhikari et al., 2019, but not for the African Caribbean sample by Norton et al., 2007. The 4 unit effect size of this allele in the Kalinago reported here is similar to the 5 unit effect reported by Norton et al., 2007. Beleza et al., 2013 reported significance for a SNP in strong linkage disequilibrium with SLC45A2L374F, which was itself not genotyped.
Our estimate that a single OCA2NW273KV allele lightens skin by about 8 melanin units is the first reported population-based effect size measurement for any albinism allele. Although albinism is generally considered recessive, our population sample offered an opportunity to compare the effect size for the first and second alleles quantitatively. We applied the estimated parameters to the three albino individuals and found that they were lighter by an average of 10 units [...] R305W homozygotes; when controlling for OCA2NW273KV status, OCA2R305W had no detectable effect on skin color (Appendix 2—table 1).
To identify novel SNPs that may contribute toward skin pigmentation in the Kalinago samples, we performed GWAS using linear regression and linear mixed models (LMMs). Estimated power for these analyses is shown in Figure 5—figure supplement 1, and Q-Q plots are depicted in Figure 5—figure supplement 2. The LMM approaches exhibited less test-statistic inflation than linear regression, likely because they better accounted for closely related individuals. Although the lowest p-values from the LMM-based methods meet the conventional criterion of 5e-08 for genome-wide significance (Appendix 3—table 1), our interpretation is that none of these variants warrants further investigation. Their low observed minor allele frequencies (<2%) are inconsistent with those expected for variants responsible for pigmentation differences between the African and Native American populations, because the frequencies of alleles responsible for population differences are expected to be highly differentiated between these source populations.
Additional Native American hypopigmenting alleles of significant effect size remain to be identified, because previously characterized variants do not explain the lighter skin color associated with Native American genetic ancestry. It is possible that multiple hypopigmenting variants of small effect size are together required to reach Native American and/or East Asian levels of hypopigmentation, each individually having too small an effect to detect in the Kalinago, given our power limitations. Alternatively, if there are variants with large effect sizes, it appears likely that they were not genotyped and are poorly tagged by the genotyped SNPs. Additional work will be required to find hypopigmentation alleles of significant effect size that are responsible for the lighter color of Native Americans.
Materials and methods
Recruitment
Participants from among the Kalinago populations were recruited with the help of nurses from the Kalinago Territory in 2014. Recruitment took place throughout the territory’s eight hamlets. Place and date of birth, reported genealogical ancestry of parents and grandparents, number of siblings, and response to sun exposure (tanning ability, burning susceptibility) were obtained by interview. Hair color and texture and eye color (characterized as black, brown, gray, blue, green, hazel, no pigment) were noted visually but not measured quantitatively.
Skin reflectometry
Skin reflectance was measured using a Datacolor CHECKPLUS spectrophotometer and converted to melanin units as we have previously described (Ang et al., 2012; Diffey et al., 1984). To minimize the confounding effects of sun exposure and body hair, skin color measurements were taken on each participant’s inner arm, and the average of triplicate measurements was calculated. Before skin color measurements were taken, alcohol wipes were used to minimize the effect of dirt and/or oil. To minimize blanching due to occlusion of blood from the region being measured, care was taken to apply only sufficient pressure to the skin to prevent ambient light from entering the scanned area (Fullerton et al., 1996).
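As a rough illustration of converting reflectance to a melanin index and averaging replicate readings, the sketch below assumes the commonly used form M = 100 × log10(1/R), where R is the fractional reflectance in the red band. The exact wavelength, the calibration, and whether averaging was applied before or after conversion are not stated here, so both the formula and the function names are assumptions rather than the published procedure.

```python
import math
from statistics import mean

def melanin_index(reflectance_fraction):
    """Assumed conversion: M = 100 * log10(1 / R), with R a fraction in (0, 1]."""
    if not 0.0 < reflectance_fraction <= 1.0:
        raise ValueError("reflectance must be a fraction in (0, 1]")
    return 100.0 * math.log10(1.0 / reflectance_fraction)

def participant_melanin_index(replicate_reflectances):
    """Average the melanin index over repeated readings from one participant."""
    return mean(melanin_index(r) for r in replicate_reflectances)
```

Under this assumed formula, lower reflectance gives a higher melanin index, so darker skin corresponds to larger values.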
DNA collection
Saliva samples were collected using the Oragene Saliva kit, and DNA was extracted using the prepIT.L2P kit, both from DNA Genotek (Ottawa, Canada). DNA integrity was checked by agarose gel electrophoresis and quantitated using a NanoDrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). Further quantification was done using a Qubit Fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) as needed, following the manufacturer’s instructions.
Genotyping
OCA variants previously identified in Africans and Native Americans (Carrasco et al., 2009; King et al., 2003; Stevens et al., 1997; Yi et al., 2003) were amplified by PCR in all albino individuals as well as control samples using published conditions. Selected alleles of SLC24A5, SLC45A2, OCA2, and MFSD12 were amplified in all sampled individuals as described in Appendix 1—table 6. Amplicons generated by 30 cycles of PCR using an Eppendorf thermocycler were sequenced (GeneWiz, South Plainfield, NJ, USA) and the chromatograms viewed using Geneious software.
Illumina SNP genotyping using the Infinium Omni2.5–8 BeadChip was performed for all the individuals sampled. Genotyping was done in three cohorts, using slightly different versions of the array, and the results were combined. Due to ascertainment differences between the cohorts, analysis is presented here only for the combined sample. After quality control to eliminate duplicates and monomorphic variants, and to remove variants and individuals with genotype failure rates >0.05, 458 Kalinago individuals and 1,638,140 unique autosomal SNPs remained.
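The quality-control filters described above can be sketched as follows. This is a simplified illustration that assumes genotypes are held in memory as allele counts with `None` for failed calls, and that variants are filtered before individuals; the order of operations, the data layout, and the function names are assumptions, and duplicate removal is not shown.

```python
def call_failure_rate(calls):
    """Fraction of missing genotype calls (None) in a sequence."""
    return sum(c is None for c in calls) / len(calls)

def quality_control(genotypes, max_failure=0.05):
    """genotypes[i][j] is the allele count (0/1/2) or None for individual i, variant j.

    Returns (kept_individual_indices, kept_variant_indices).
    """
    n_ind = len(genotypes)
    n_var = len(genotypes[0])
    # Step 1: drop variants that fail too often or carry only one allele.
    kept_variants = []
    for j in range(n_var):
        column = [genotypes[i][j] for i in range(n_ind)]
        if call_failure_rate(column) > max_failure:
            continue
        observed = {c for c in column if c is not None}
        # Monomorphic: every observed call is the same homozygote.
        if observed <= {0} or observed <= {2}:
            continue
        kept_variants.append(j)
    # Step 2: drop individuals with too many failed calls among kept variants.
    kept_individuals = [
        i for i in range(n_ind)
        if kept_variants
        and call_failure_rate([genotypes[i][j] for j in kept_variants]) <= max_failure
    ]
    return kept_individuals, kept_variants
```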
Whole exome sequencing of albino individual and obligate carrier
In order to identify the causative variant for albinism in the Kalinago, two samples (one albino individual and one parent) were selected for whole exome sequencing. Following shearing of input DNA (1 µg) using a Covaris E220 Focused-ultrasonicator (Woburn, MA, USA), exome enrichment and library preparation were done using the Agilent SureSelect V5+UTR kit (Santa Clara, CA, USA). The samples were sequenced at 50× coverage using a HiSeq 2500 sequencer (Illumina, San Diego, CA, USA).
The fastq files were aligned to Human Reference Genome GRCh37 (HG19) using BWA (Li and Durbin, 2009) and bowtie (Langmead et al., 2009). Candidate SNPs were identified using GATK’s UnifiedGenotyper (McKenna et al., 2010), while the IGV browser was used to examine the exons of interest for indels (Thorvaldsdóttir et al., 2013). Variants with low sequence depth (<10) in either sample were excluded from further consideration.
Computational analysis
Basic statistics, merges with other datasets, and association analysis by linear regression were performed using plink 1.9 (Chang et al., 2015; Purcell et al., 2007). Phasing and imputation, as well as analysis of regions of homozygosity by descent and identity by descent were performed with Beagle 4.1 (Browning and Browning, 2013; Browning and Browning, 2007), using 1KGP phased data (Auton et al., 2015) as reference.
The genotyped individuals were randomly partitioned into nine subsets of 50 or 51 individuals (n=50 subsets) in which no pair exhibited greater than second-order relationship (PI_HAT >0.25 using the --genome command in plink). Using the same criteria, a maximal subset of 184 individuals was also generated (n=184 subset).
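One way to build such subsets is a randomized greedy assignment, sketched below. It assumes pairwise PI_HAT values are available as a mapping keyed by unordered pairs; the retry strategy, the balancing heuristic, and all names are hypothetical, and the procedure actually used to generate the subsets may differ.

```python
import random

def partition_unrelated(ids, pi_hat, n_subsets, threshold=0.25, seed=0, max_tries=1000):
    """Randomly split ids into n_subsets groups with no pair above threshold.

    pi_hat maps frozenset({a, b}) to the estimated proportion of IBD sharing;
    pairs absent from the mapping are treated as unrelated.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        order = list(ids)
        rng.shuffle(order)
        subsets = [[] for _ in range(n_subsets)]
        success = True
        for person in order:
            # Try the currently smallest subsets first to keep sizes balanced.
            for group in sorted(subsets, key=len):
                if all(pi_hat.get(frozenset((person, other)), 0.0) <= threshold
                       for other in group):
                    group.append(person)
                    break
            else:
                # No subset can take this person; reshuffle and start over.
                success = False
                break
        if success:
            return subsets
    raise RuntimeError("no valid partition found; try more subsets or tries")
```

The smallest-first heuristic keeps subset sizes close to equal but does not guarantee sizes of exactly 50 or 51.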
PCA was performed using the smartpca program (version 13050) in the eigensoft package (Price et al., 2006). For comparison to HGDP populations, Kalinago samples were projected onto PCs calculated for the HGDP samples alone. For use as covariates in association analyses, the n=184 subset was used to generate the PCA, and the remaining individuals were projected onto the same axes.
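The projection step can be illustrated with a minimal sketch: axes are fitted on a reference set of individuals and the remaining individuals are placed on those axes using the reference centering and scaling. The sketch uses a plain SVD of column-standardized genotypes, which is an assumption and differs in detail from the normalization applied by smartpca; function names are hypothetical.

```python
import numpy as np

def fit_pca(reference_genotypes, n_components=10):
    """Fit PCs on reference individuals (rows) by SVD of standardized genotypes."""
    ref = np.asarray(reference_genotypes, dtype=float)
    mean = ref.mean(axis=0)
    scale = ref.std(axis=0)
    scale[scale == 0] = 1.0  # guard against monomorphic columns
    standardized = (ref - mean) / scale
    _, _, vt = np.linalg.svd(standardized, full_matrices=False)
    loadings = vt[:n_components].T  # shape (n_variants, n_components)
    return mean, scale, loadings

def project(genotypes, mean, scale, loadings):
    """Project individuals onto previously fitted axes using the reference scaling."""
    geno = np.asarray(genotypes, dtype=float)
    return ((geno - mean) / scale) @ loadings
```

Fitting the axes on unrelated individuals only avoids having family structure define the leading components, while projection still yields coordinates for every individual.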
Admixture analysis was performed using the ADMIXTURE program (Alexander et al., 2009; Zhou et al., 2011). Each of the nine n=50 Kalinago subsets was merged with the N=940 subset of HGDP data (Li et al., 2008; Rosenberg, 2006) for analysis (349,923 SNPs) and the outputs combined, averaging genetic ancestry proportions for the common HGDP individuals across runs. These results were used in figures. Separately, two-stage admixture analysis started with the averaged estimated allele frequencies and then employed the projection (--P) matrix outputs to estimate individual genetic ancestry for the combined Kalinago sample. Individual ancestries estimated using both methods, as well as those estimated from a thinned subset of 50,074 SNPs, were in good agreement, consistent with standard errors estimated by bootstrap analysis, although sample-wide averages differed slightly. Cross-validation can be enabled by adding the --cv flag to the ADMIXTURE command.
For association analyses we removed the three albino individuals and excluded SNPs with minor allele frequency <0.01. For conventional association analysis by linear regression, the standard additive genetic model included sex, the first 10 PCs, and genotypes of rs1426654 (SLC24A5), rs16891982 (SLC45A2), and the albino variant rs797044784 (OCA2) as covariates (Supplementary file 4). LMM analysis was performed using the mlma module of GCTA (Yang et al., 2011) with the --mlma-no-preadj-covar flag to suppress calculation using residuals. Two genetic relatedness matrices (GRM) were used: a standard GRM calculated using GCTA’s --make-grm command and an ancestry-aware GRM calculated using relationships deduced by REAP (Thornton et al., 2012) that utilized the output of the two-stage admixture analysis. For linear regression only, p-values were adjusted for statistic inflation by genomic control using the lambda calculated from the median chi-square statistic.
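Genomic control can be sketched as follows, assuming each association test has been expressed as a 1-df chi-square statistic (for example, a squared z-score). Lambda is the observed median statistic divided by the theoretical median, and each statistic is divided by lambda before the p-value is recomputed. Flooring lambda at 1 is a common convention adopted here as an assumption; function names are hypothetical.

```python
import math
from statistics import median

CHI2_1DF_MEDIAN = 0.4549364  # median of the chi-square distribution with 1 df

def chi2_1df_sf(x):
    """Upper-tail probability of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(x / 2.0))

def genomic_control(chi2_stats):
    """Return (lambda, adjusted p-values) for 1-df association statistics."""
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)  # assumed convention: never deflate the statistics
    adjusted = [chi2_1df_sf(x / lam) for x in chi2_stats]
    return lam, adjusted
```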
Statistical power was estimated by simulation, using a subset of genotyped SNPs. Starting with the 349,923 SNPs used for genetic ancestry analysis, the averaged P matrix from ADMIXTURE analysis at K=4 provided an initial estimate of allele frequencies in AFR and NAM ancestral populations; 10,233 SNPs exhibited differentiation of 0.7 or greater between these populations, a value chosen as a reasonable minimum population differentiation for causative variants. After removal of SNPs for which predicted Kalinago sample frequencies deviated by more than 0.1 from observed values and those with adjusted p<0.1, 8766 SNPs remained. Phenotypes were simulated by randomly selecting one of these SNPs and adding a defined effect size to the observed phenotype. Simulated datasets were then analyzed with plink using the standard genetic model.
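The simulation logic can be sketched as below. This simplified version assumes the effect is added as effect size times allele count, tests the spiked SNP with a simple regression without covariates, and uses a normal approximation for the p-value, whereas the analysis described above used plink with the standard genetic model. All function names and defaults are hypothetical.

```python
import math
import numpy as np

def select_differentiated(freq_afr, freq_nam, min_delta=0.7):
    """Indices of SNPs whose allele frequencies differ by at least min_delta."""
    delta = np.abs(np.asarray(freq_afr) - np.asarray(freq_nam))
    return np.flatnonzero(delta >= min_delta)

def association_p(phenotype, genotype):
    """Two-sided p-value for the slope of phenotype on allele count (normal approx.)."""
    x = genotype - genotype.mean()
    y = phenotype - phenotype.mean()
    sxx = x @ x
    slope = (x @ y) / sxx
    resid = y - slope * x
    se = math.sqrt((resid @ resid) / (len(y) - 2) / sxx)
    return math.erfc(abs(slope / se) / math.sqrt(2.0))

def estimate_power(phenotype, genotypes, candidates, effect, alpha=5e-8,
                   n_sim=200, seed=0):
    """Fraction of simulations in which the spiked SNP passes alpha."""
    phenotype = np.asarray(phenotype, dtype=float)
    genotypes = np.asarray(genotypes, dtype=float)
    candidates = np.asarray(candidates)
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sim):
        j = rng.choice(candidates)  # pick one candidate SNP at random
        simulated = phenotype + effect * genotypes[:, j]
        hits += association_p(simulated, genotypes[:, j]) < alpha
    return hits / n_sim
```

Adding the effect to the observed phenotype, rather than simulating the phenotype from scratch, preserves the real covariance of pigmentation with ancestry and relatedness in the sample.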
Statistical analysis of pigmentary effect of albinism involved fitting parameters to an additive model for the sample containing carriers but lacking albino individuals, applying the same model to the albino individuals, and comparing residuals for the albinos and the other individuals.
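The residual comparison can be sketched as follows, assuming an ordinary least squares fit on the individuals used for estimation and a held-out group scored with the same coefficients. The function name and argument layout are hypothetical.

```python
import numpy as np

def residual_shift(x_train, y_train, x_held_out, y_held_out):
    """Fit OLS on the training sample, then compare mean residuals.

    Returns mean(held-out residuals) minus mean(training residuals); a negative
    value means the held-out individuals are lighter than the model predicts.
    """
    def design(x):
        x = np.asarray(x, dtype=float)
        return np.column_stack([np.ones(len(x)), x])

    y_train = np.asarray(y_train, dtype=float)
    y_held_out = np.asarray(y_held_out, dtype=float)
    beta, *_ = np.linalg.lstsq(design(x_train), y_train, rcond=None)
    train_resid = y_train - design(x_train) @ beta
    held_resid = y_held_out - design(x_held_out) @ beta
    return held_resid.mean() - train_resid.mean()
```

Because homozygotes are scored with two copies of the allele, the additive prediction already includes twice the single-allele effect, so any remaining shift measures departure from additivity.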
Local genetic ancestry analysis of the region containing the albinism allele was performed using the PopPhased version of rfmix (v1.5.4) with the default window size of 0.2 cM (Maples et al., 2013). A subset of 1KGP data served as reference haplotypes for European, African, and East Asian populations, and the Native American genetic ancestry segments of the admixed samples as determined by Martin et al., 2017a, were combined to generate synthetic Native American reference haplotypes. For estimates of individual genetic ancestry, Viterbi outputs for each window were averaged across all autosomes.
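Averaging local-ancestry calls into an individual estimate can be sketched as below. The sketch assumes one ancestry label per window per haplotype, pooled across all autosomes for one individual, and weights every window equally; the labels and the function name are hypothetical and the input format differs from rfmix output files.

```python
from collections import Counter

def global_ancestry(window_calls):
    """Fraction of local-ancestry windows assigned to each population.

    window_calls: iterable of labels, one per window per haplotype, pooled
    across all autosomes for a single individual.
    """
    counts = Counter(window_calls)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}
```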
Appendix 1
Supplementary Tables
Appendix 2
Supplementary Tables
Appendix 3
Supplementary Tables
Data availability
The whole exome sequencing and whole genome SNP genotyping data underlying this article cannot be shared publicly due to the privacy of individuals and stipulation by the Kalinago community. Only de-identified filtered SNP data used in analyses will be shared. Additional data will be shared on request to the corresponding author, pending approval from the Kalinago Council. M-index and specific genotyping data (SLC24A5 A111T, SLC45A2 L374F, OCA2 NW273KV and OCA2 R305W) and genotyping data for Admixture have been uploaded to Dryad https://doi.org/10.5061/dryad.sf7m0cg7z. The data cannot be used for any commercial purposes. We did not create any new software or scripts for analysis.
- Dryad Digital Repository. Native American Genetic Ancestry and Pigmentation Allele Contributions to Skin Color in a Caribbean Population. https://doi.org/10.5061/dryad.sf7m0cg7z
References
- Fast model-based estimation of ancestry in unrelated individuals. Genome Research 19:1655–1664. https://doi.org/10.1101/gr.094052.109
- The timing of pigmentation lightening in Europeans. Molecular Biology and Evolution 30:24–35. https://doi.org/10.1093/molbev/mss207
- Admixture and population stratification in African Caribbean populations. Annals of Human Genetics 72:90–98. https://doi.org/10.1111/j.1469-1809.2007.00398.x
- An anthropological genetic perspective on creolization in the Anglophone Caribbean. American Journal of Physical Anthropology 151:135–143. https://doi.org/10.1002/ajpa.22261
- Admixture mapping identifies an Amerindian ancestry locus associated with albuminuria in Hispanics in the United States. Journal of the American Society of Nephrology 28:2211–2220. https://doi.org/10.1681/ASN.2016091010
- Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. The American Journal of Human Genetics 81:1084–1097. https://doi.org/10.1086/521987
- A splice site mutation is the cause of the high prevalence of oculocutaneous albinism type 2 in the Kuna population. Pigment Cell & Melanoma Research 22:645–647. https://doi.org/10.1111/j.1755-148X.2009.00575.x
- ALFRED: a web-accessible allele frequency database. Pacific Symposium on Biocomputing 2000:639–650. https://doi.org/10.1142/9789814447331_0062
- Book: Human migration. In: Beaty KG, editors. Migration of Garifuna: Evolutionary Success Story. Human Migration. New York: Oxford University Press. 153. https://doi.org/10.1093/oso/9780190945961.003.0013
- A portable instrument for quantifying erythema induced by ultraviolet radiation. The British Journal of Dermatology 111:663–672. https://doi.org/10.1111/j.1365-2133.1984.tb14149.x
- Molecular and clinical characterization of albinism in a large cohort of Italian patients. Investigative Ophthalmology & Visual Science 52:1281. https://doi.org/10.1167/iovs.10-6091
- Was skin cancer a selective force for black pigmentation in early hominin evolution. Proceedings Biological Sciences 281:20132955. https://doi.org/10.1098/rspb.2013.2955
- Oculocutaneous albinism. Orphanet Journal of Rare Diseases 2:43. https://doi.org/10.1186/1750-1172-2-43
- Mutations in C10orf11, a melanocyte-differentiation gene, cause autosomal-recessive albinism. American Journal of Human Genetics 92:415–421. https://doi.org/10.1016/j.ajhg.2013.01.006
- Clinical utility gene card for: oculocutaneous albinism. European Journal of Human Genetics 22:307. https://doi.org/10.1038/ejhg.2013.307
- Skin colour and vitamin D: an update. Experimental Dermatology 29:864–875. https://doi.org/10.1111/exd.14142
- Frequency of genetic traits in the Caribs of Dominica. Human Biology 41:342–364. https://doi.org/10.1007/BF00278729
- The cutaneous photosynthesis of previtamin D3: a unique photoendocrine system. The Journal of Investigative Dermatology 77:51–58. https://doi.org/10.1111/1523-1747.ep12479237
- Albinism in Africa as a public health issue. BMC Public Health 6:212. https://doi.org/10.1186/1471-2458-6-212
- Book: Review of the Lesser Antilles in the age of European expansion. In: Honychurch L, editors. NWIG: New West Indian Guide / Nieuwe West-Indische Gids. Brill. pp. 305–307.
- The evolution of human skin coloration. Journal of Human Evolution 39:57–106. https://doi.org/10.1006/jhev.2000.0403
- Computational screening of disease-associated mutations in OCA2 gene. Cell Biochemistry and Biophysics 68:97–109. https://doi.org/10.1007/s12013-013-9697-2
- Genetic admixture, self-reported ethnicity, self-estimated admixture, and skin pigmentation among Hispanics and Native Americans. American Journal of Physical Anthropology 138:375–383. https://doi.org/10.1002/ajpa.20945
- Diverse mutations of the P gene among African-Americans with type II. Human Molecular Genetics 3:2047–2051.
- RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. American Journal of Human Genetics 93:278–288. https://doi.org/10.1016/j.ajhg.2013.06.020
- Human demographic history impacts genetic risk prediction across diverse populations. American Journal of Human Genetics 100:635–649. https://doi.org/10.1016/j.ajhg.2017.03.004
- Genetic consequences of the transatlantic slave trade in the Americas. American Journal of Human Genetics 107:265–277. https://doi.org/10.1016/j.ajhg.2020.06.012
- Genetic evidence for the convergent evolution of light skin in Europeans and East Asians. Molecular Biology and Evolution 24:710–722. https://doi.org/10.1093/molbev/msl203
- A study of skin color by melanin index according to site, gestational age, birth weight and season of birth in Korean neonates. Journal of Korean Medical Science 20:105–108. https://doi.org/10.3346/jkms.2005.20.1.105
- PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics 81:559–575. https://doi.org/10.1086/519795
- Shades of complexity: new perspectives on the evolution and genetic architecture of human skin. American Journal of Physical Anthropology 168 Suppl 67:4–26. https://doi.org/10.1002/ajpa.23737
- ALFRED: an allele frequency resource for research and teaching. Nucleic Acids Research 40:D1010–D1015. https://doi.org/10.1093/nar/gkr924
- Book: A brief history of the Caribbean: from the Arawak and Carib to the present: Rosenberg NA. In: Rogoziński J, editors. Annals of Human Genetics. Wiley. pp. 841–847.
- Population differences of two coding SNPs in pigmentation-related genes SLC24A5 and SLC45A2. International Journal of Legal Medicine 121:36–39. https://doi.org/10.1007/s00414-006-0112-z
- Frequent intragenic deletion of the P gene in Tanzanian patients with type II oculocutaneous albinism (OCA2). American Journal of Human Genetics 56:1320–1323.
- An intragenic deletion of the P gene is the common mutation causing tyrosinase-positive oculocutaneous albinism in southern African negroids. American Journal of Human Genetics 56:586–591.
- A genomewide association study of skin pigmentation in a South Asian population. American Journal of Human Genetics 81:1119–1132. https://doi.org/10.1086/522235
- Molecular genetics of human pigmentation diversity. Human Molecular Genetics 18:9–17. https://doi.org/10.1093/hmg/ddp003
- Genetic determinants of hair, eye and skin pigmentation in Europeans. Nature Genetics 39:1443–1452. https://doi.org/10.1038/ng.2007.13
- Estimating kinship in admixed populations. American Journal of Human Genetics 91:122–138. https://doi.org/10.1016/j.ajhg.2012.05.024
- Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics 14:178–192. https://doi.org/10.1093/bib/bbs017
- Hypomelanoses in children. Journal of Cutaneous and Aesthetic Surgery 6:65–72. https://doi.org/10.4103/0974-2077.112665
- Ocular albinism and hypopigmentation defects in Slc24a5-/- mice. Veterinary Pathology 45:264–279. https://doi.org/10.1354/vp.45-2-264
- Albinism (OCA2) in Amerindians. American Journal of Physical Anthropology Suppl 41:118–140. https://doi.org/10.1002/ajpa.20357
- GCTA: a tool for genome-wide complex trait analysis. American Journal of Human Genetics 88:76–82. https://doi.org/10.1016/j.ajhg.2010.11.011
- A 122.5-kilobase deletion of the P gene underlies the high prevalence of oculocutaneous albinism type 2 in the Navajo population. American Journal of Human Genetics 72:62–72. https://doi.org/10.1086/345380
- A quasi-Newton acceleration for high-dimensional optimization algorithms. Statistics and Computing 21:261–273. https://doi.org/10.1007/s11222-009-9166-3
Article and author information
Funding
Hershey Rotary Club
- Khai C Ang
Jake Gittlen Laboratories for Cancer Research
- Keith C Cheng
Department of Pathology, Penn State College of Medicine
- Keith C Cheng
Microryza (now Experiment.com)
- Khai C Ang
National Institute of Arthritis and Musculoskeletal and Skin Diseases (5R01 AR052535)
- Keith C Cheng
National Institute of Arthritis and Musculoskeletal and Skin Diseases (3R01 AR052535-03S1)
- Keith C Cheng
The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.
Acknowledgements
We would like to thank the Kalinago Council, Dominica Ministry of Health, nurses at the Kalinago Territory, Salybia Mission Project, and the Kalinago community for their assistance and participation in this study. We would also like to acknowledge faculty of Ross University, Portsmouth, Dominica (now Bridgetown, Barbados), especially Drs. Gerhard Meisenberg (retired) and Liris Benjamin of Ross University in helping us to obtain the necessary IRB approval for fieldwork. This work was supported by the Hershey Rotary Club, Microryza (now Experiment.com), Jake Gittlen Laboratories for Cancer Research, National Institutes of Health grants 5R01 AR052535 and 3R01 AR052535-03S1 from the National Institute of Arthritis and Musculoskeletal and Skin Diseases, and Department of Pathology for funding portions of this project. We would also like to acknowledge members of the Cheng Lab for their constructive comments and input.
Ethics
Human subjects: The study was reviewed and approved by the Kalinago council and institutional review boards of Penn State University (29269EP), Ross University, and the Dominica Ministry of Health (H125). Informed consent was obtained from each participant enrolled in the study, and in the case of minors, consent was also obtained from a parent or guardian.
Copyright
© 2023, Ang et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.