Research Article

Genetic diversity of CHC22 clathrin impacts its function in glucose metabolism

Imperial College London, United Kingdom
University College London, United Kingdom
University of Colorado, United States
Birkbeck College and University College London, United Kingdom
CNRS, France
University of Ferrara, Italy
University of Bristol, United Kingdom
University of Pennsylvania, United States
Stanford University, United States

Jun 4, 2019

Open access
Copyright information

Abstract
eLife digest
Introduction
Results
Discussion
Materials and methods
Data availability
References
Article and author information
Metrics

Abstract

CHC22 clathrin plays a key role in intracellular membrane traffic of the insulin-responsive glucose transporter GLUT4 in humans. We performed population genetic and phylogenetic analyses of the CHC22-encoding CLTCL1 gene, revealing independent gene loss in at least two vertebrate lineages, after arising from gene duplication. All vertebrates retained the paralogous CLTC gene encoding CHC17 clathrin, which mediates endocytosis. For vertebrates retaining CLTCL1, strong evidence for purifying selection supports CHC22 functionality. All human populations maintained two high frequency CLTCL1 allelic variants, encoding either methionine or valine at position 1316. Functional studies indicated that CHC22-V1316, which is more frequent in farming populations than in hunter-gatherers, has different cellular dynamics than M1316-CHC22 and is less effective at controlling GLUT4 membrane traffic, altering its insulin-regulated response. These analyses suggest that ancestral human dietary change influenced selection of allotypes that affect CHC22’s role in metabolism and have potential to differentially influence the human insulin response.

https://doi.org/10.7554/eLife.41517.001

eLife digest

When we eat carbohydrates, they are digested into sugars that circulate in the blood to provide energy for the brain and other parts of the body. But too much blood sugar can be poisonous. The body regulates blood sugar balance using the hormone insulin, which triggers the removal of sugar from the blood into muscle and fat cells. This removal process involves a pore in membranes at the surface of muscle and fat tissue, called a glucose transporter, through which the sugar molecules can pass. During fasting, the glucose transporter remains inside muscle and fat. But after a meal, insulin acts to release the transporter from its storage area to the surface of the tissue. How efficiently this process happens reflects how efficiently sugar can be removed from the blood. When this pathway breaks down, it can lead to diabetes.

In humans, a protein called CHC22 is needed to deliver the glucose transporter to its storage area. In mice, CHC22 is absent. The question arises: do different animals' eating habits influence CHC22's role in controlling blood sugar? The evolutionary history of CHC22 in a number of different animals could reveal what is special about glucose transport after a meal in humans, and how it might fail in diabetes.

By analyzing the genomes of several different species, Fumagalli et al. found that the gene encoding CHC22 first evolved around the time animals began developing a backbone and complex nervous systems. Afterwards, it was lost by some animals – including mice, sheep and pigs. Fumagalli et al. also discovered that CHC22 varies between individual people. A new form of CHC22, which first appeared in ancient humans, is less effective at holding the glucose transporter inside muscle and fat – leading to a tendency to reduce blood sugar levels. This new form became more common in humans over a period witnessing the introduction of cooking, and later farming; both of these technologies are associated with increased sugar in the diet. But not everyone has this new variant of the gene – both the old and newer variants are present in people today.

The history of CHC22 suggests that it was useful for early humans to hold the glucose transporter inside muscle and fat, keeping blood sugar levels high, which contributed to the development of a large brain. But as humans became exposed to higher dietary levels of sugar the newer form of CHC22 allowed blood sugar to be lowered more readily. People with different forms of CHC22 are likely to differ in their ability to control blood sugar after a meal. In some cases, this could lead to heightened blood sugar levels, which in turn can lead to diabetes.

https://doi.org/10.7554/eLife.41517.002

Introduction

Clathrin-coated vesicles (CCVs) are key players in eukaryotic intracellular membrane traffic (Brodsky, 2012). Their characteristic lattice-like coat is self-assembled from cytoplasmic clathrin proteins, captures membrane-embedded protein cargo and deforms the membrane into a vesicle. This process enables CCVs to mediate protein transport to and from the plasma membrane and between organelles. The triskelion-shaped clathrin molecule is formed from three identical clathrin heavy chain (CHC) subunits. Humans have two genes (CLTC and CLTCL1) that respectively encode CHC17 and CHC22 clathrins (Wakeham et al., 2005). CHC17 clathrin, which has three bound clathrin light chain (CLC) subunits, is expressed uniformly in all tissues and forms CCVs that control receptor-mediated endocytosis, as well as lysosome biogenesis and maturation of regulated secretory granules. These pathways are conventionally associated with clathrin function and are mediated by clathrin in all eukaryotic cells (Brodsky, 2012). In humans CHC22 clathrin is most highly expressed in muscle and adipose tissue and forms separate CCVs that are not involved in endocytosis (Dannhauser et al., 2017). In these tissues, CHC22 CCVs regulate targeting of the glucose transporter 4 (GLUT4) to an intracellular compartment where it is sequestered until released to the cell surface in response to insulin (Vassilopoulos et al., 2009). This insulin-responsive GLUT4 pathway is the dominant mechanism in humans for clearing blood glucose into muscle and fat tissues after a meal (Shepherd and Kahn, 1999). In addition to its distinct tissue expression pattern and biological function, CHC22 does not bind the CLC subunits that associate with CHC17 clathrin, even though the CHC protein sequences are 85% identical (Dannhauser et al., 2017; Liu et al., 2001). This remarkable biochemical and functional divergence evolved since the gene duplication event that gave rise to the two different clathrins during the emergence of chordates (Wakeham et al., 2005). Notably, however, the CLTCL1 gene encoding CHC22 evolved into a pseudogene in the Mus genus, although mice maintain an insulin-responsive GLUT4 pathway for clearing blood glucose. This observation suggests that, despite the importance of the CLTCL1 gene product, backup pathways have evolved to compensate for loss of the CHC22 protein. To understand the evolution of the specialized function of CHC22, and the potential selective processes involved, we here explore the phylogenetic history of the CLTCL1 gene in vertebrates and its population genetics in humans, non-human primates and bears.

Ecological shifts create selective forces that filter variation in cellular genes. These include changes in nutritional conditions (Babbitt et al., 2011), as well as encounters with pathogens (Fumagalli et al., 2011); both documented as selective forces that affect membrane traffic genes (Elde and Malik, 2009; Liu et al., 2014). Recent studies of the evolution of genes involved in membrane traffic have focused on an overview of all eukaryotes with the goals of establishing the origins of membrane-traffic regulating proteins in the last common eukaryotic ancestor and defining the species distribution of various families of traffic-regulating proteins (Rout and Field, 2017; Dacks and Robinson, 2017). These studies have identified common features of proteins that regulate membrane traffic (Rout and Field, 2017) and revealed that extensive gene duplication has allowed lineage-specific diversification of coat proteins and other membrane traffic regulators, such as the Rab GTPases (Diekmann et al., 2011; Guerrier et al., 2017). Our earlier study of available annotated genomes in 2005 suggested that the gene duplication giving rise to the two CHC-encoding genes occurred as a result of one of the whole genome duplications contributing to vertebrate evolution (Wakeham et al., 2005). Here we focus on the more recent evolutionary history of these genes, as well as analyze the increased number of fully annotated vertebrate genomes. We establish that the loss of CLTCL1 in the Mus genus is not unique in vertebrates, identifying at least one additional independent gene loss event in the clade of Cetartiodactyla affecting pigs, cows, sheep, porpoise, and possibly additional related species. Nonetheless, there is strong evidence for CHC22 sequence conservation amongst those species that retain CLTCL1 (Wakeham et al., 2005). This evolutionarily recent gene loss in some lineages and retention of the functional form in others suggested that CLTCL1 may still be under purifying selection, so we examined CLTCL1 variation between individuals within vertebrate populations. Comparing populations, we found CLTCL1 to be considerably more polymorphic than CLTC, which encodes the clathrin found in all eukaryotes, with evidence for strong ancient purifying selection for CHC17 clathrin function and relaxed purifying selection on CHC22 function. Additionally, we identified two common allotypes of human CHC22, which have different functional properties. The derived allele arose in ancient humans and is more frequent in farming populations when compared to hunter-gatherers. We previously observed that CHC22 accumulates at sites of GLUT4 retention in the muscle of insulin-resistant patients with type two diabetes (Vassilopoulos et al., 2009) in addition to its active role in membrane traffic of GLUT4. Thus, CHC22 variation has potential to differentially affect membrane traffic pathways involved in insulin resistance, as well as alter normal glucose metabolism within human and other vertebrate populations. The analyses reported here lead us to propose that variation in the CHC22 clathrin coat may be a response to changing nutritional pressures both between and within vertebrate species.

Results

Phylogenetic analyses reveal selective loss or retention of a functional CLTCL1 gene in vertebrates

Identification of CHC-encoding genes in 62 vertebrate and non-vertebrate species ((Figure 1—figure supplement 1, Figure 1—figure supplement 2) indicates a dynamic history of gene duplications and losses (Figure 1). The CLTCL1 gene was detected only in jawed vertebrates (bony vertebrates and cartilaginous fish), while the two jawless vertebrate genomes available – lamprey (Petromyzon marinus) and hagfish (Eptatretus burgeri) – have only one CHC-encoding gene. This distribution refines the timing of the CHC-encoding gene duplication to the period after the Agnatha split off the vertebrate lineage, estimated at 493.8MYA (95% HPD: 459.3, 533.8), and before the evolution of jawed vertebrates 450.8MYA (95% HPD: 432.1, 468.1) (Hedges et al., 2015; dos Reis et al., 2015). Of the ten species of bony fish that split off the spotted gar (Lepisosteus oculatus) lineage (Amores et al., 2011), whose genomes are generally tetraploid, all had two versions of CLTC and at least one CLTCL1 gene, except for cave fish (Astyanax mexicanus) apparently lacking CLTCL1. Eight additional species of vertebrates with high genome coverage and reliable annotation had the CLTC gene but no identifiable CLTCL1 gene. CLTCL1 genes are present in the Caviomorpha and Sciuridae rodent suborders, and lost in the Muroidea suborder from the entire Mus genus (Wakeham et al., 2005) and from rat (Rattus norvegicus). The Cetartiodactyla clade also appears to have lost CLTCL1, as CLTCL1 is absent from the four representative genomes in our dataset (pig (Sus scrofa), sheep (Ovis aries), cow (Bos taurus), Yangtze finless porpoise (Neophocaena asiaeorientalis)). This suggests a loss event before the Cetartiodactyla lineage split, independent of the loss event preceding split of the Muroidea lineage. The absence of CLTCL1 in rat clarifies why CHC22 could not be biochemically identified in rat and indicates that antibodies against CHC22 that react with rat cells must cross-react with other proteins (Towler et al., 2004). CLTCL1 was also not detected in the genomes of the little brown bat (Myotis lucifugus) and the duck-billed platypus (Ornithorhynchus anatinus). Assuming the genome annotations for the species analyzed are reliable, these data indicate that there have been at least five independent losses of CLTCL1 that are clade- or species-specific. The intermittent loss of CLTCL1 and the retention of CLTC raises the question of whether their patterns of evolution are typical for genes with related functions that duplicated in the same time frame as CLTC/CLTCL1.

Figure 1 with 2 supplements see all

Download asset Open asset

Phylogenetic analysis of *CLTC/CLTCL1* reveals independent loss of the gene encoding CHC22 clathrin from vertebrate lineages and complete conservation of the gene encoding CHC17 clathrin.

Phylogenetic profiles *of CLTC/CLTCL1* are shown, with gene presence in the corresponding genome indicated by a filled black circle. All sequences used have less than 5% unspecified residues (‘X’s in the relevant database). Divergent gene sequences with low sequence similarity but that still fall within the *CLTC* clade are shown as empty circles (see Materials and methods for similarity threshold and sequence IDs). Based on the profile and species tree the most parsimonious phylogenetic tree for loss and duplication events is inferred and shown as red stars and blue squares, respectively.

https://doi.org/10.7554/eLife.41517.003

CLTC and CLTCL1 are located on paralogous regions of human chromosomes 17 and 22, respectively. For these two genes, the evolutionary rates (rate of non-synonymous substitutions to rate of synonymous substitutions; dN/dS) across vertebrates at each position were determined and plotted along the length of the protein sequences (Figure 2A–B). Several adjacent paralogs have been maintained in these chromosomal regions, some of which are involved in membrane traffic, including the gene pair of MTMR4 and MTMR3, encoding myotubularin lipid phosphatases. Also, CLC subunits of CHC17 clathrin are encoded by paralogous genes on different chromosomes (CLTA and CLTB) that arose from a local gene duplication, mapped to the same time frame as the CHC-encoding duplication (Wakeham et al., 2005). Comparison of the distribution of dN/dS ratios for the three pairs revealed stronger purifying selection on the CLTC/CLTCL1 genes than on MTMR4/MTMR3 and CLTA/CLTB (Figure 2C–E), suggesting the CHC-encoding clade is more evolutionarily constrained. This observation is consistent with our previous identification of conserved signature residues in CLTCL1 using DIVERGE analysis (Wakeham et al., 2005) and indicates conserved functions for both the CLTC and CLTCL1 gene products. Furthermore, there is a striking difference in the distribution and average of evolutionary rates, as measured by dN/dS, between CLTC and CLTCL1 (Kolmogorov-Smirnov test p-value<2.2e-16), with CLTC being significantly more constrained by purifying selection than CLTCL1. In contrast, there is minimal difference in the distribution and average of evolutionary rates between the two paralog pairs MTMR4/MTMR3 and CLTA/CLTB (Kolmogorov-Smirnov test yields p-values 0.003643 and 0.9959, respectively).

Figure 2

Download asset Open asset

Genes encoding clathrin heavy chains show evidence for purifying selection with *CLTCL1* (CHC22-encoding) being more variable than *CLTC* (CHC17-encoding) over evolutionary time.

Evolutionary rates expressed as dN/dS ratios are shown for each position in *CLTC* (A) and *CLTCL1* (B). Rates are averages over an entire phylogenetic tree and therefore not specific to the human proteins. However, to assist interpretation, only rates for residues present in the human proteins are shown. Kernel density estimates of the distributions of dN/dS ratios per paralogous pair of proteins (**C–E**). *CLTA* and *CLTB* encode clathrin light chains A and B, respectively. *MMTR3* and *MMTR4* encode myotubularin lipid phosphatases. Mean dN/dS ratios averaged over all sites are shown as hatched marks.

https://doi.org/10.7554/eLife.41517.006

Human population genetic analyses indicate purifying selection with ongoing diversification for CLTCL1

To follow up the indication that CLTC and CLTCL1 are subject to different degrees of purifying selection, we investigated their variation in human populations. We analyzed 2504 genomes from the 1000 Genomes Project database, phase 3 (1000 Auton et al., 2015) and identified alleles resulting from non-synonymous substitutions for CLTC and CLTCL1. This dataset included individuals from each of five human meta-populations: European (EUR, 503), East Asian (EAS, 504), Admixed American (AMR, 347), South Asian (SAS, 489) and African (AFR, 661). Individual populations with their abbreviations are listed in Supplementary file 1a. The reference sequences for chimpanzee (Pan troglodytes) and pseudo-references for two archaic humans, Altai Neanderthal and Denisovan, were also included to relate allelic variation to the ancestral state. A median-joining network for all the inferred CLTC human alleles showed a very common allele (sample frequency 0.997) with only five low-frequency variants generating a total of six alleles (Figure 3A). Each allele encodes a variant of a CHC (allotype), which includes one or more single nucleotide polymorphisms (SNPs).

Figure 3 with 1 supplement see all

Download asset Open asset

The *CLTCL1* gene encoding human CHC22 has two major variants, and is highly polymorphic relative to the human *CLTC* gene encoding CHC17, with a similar pattern in chimpanzees.

Median joining network of human alleles for *CLTC* (A) and *CLTCL1* (B) are shown. Each circle represents a unique allele whose global frequency is proportional to its circle’s size and the line length between circles is proportional to the number of non-synonymous changes between alleles. For *CLTC*, the least common alleles have a frequency ranging from 0.04% and 0.06% and the circles representing them were magnified by a factor of 10. For *CLTCL1*, only alleles with a frequency greater than 20% were plotted. The two major alleles show a combined frequency of 77% while the other alleles depicted in the figure have a frequency ranging from 0.44% to 5.67%. Segregation of the M1316V variation is depicted with a hashed line, with alleles carrying the M variant on the left-hand side, and alleles carrying the V variant on the right-hand side. The meta-populations in which the allele is found are indicated in color representing their percentage of the total frequency of the allele in humans. Meta-populations analyzed are African (AFR), American (AMR), East Asian (EAS), European (EUR), South Asian (SAS). (**C–D**) Median joining network of *CLTC* (C) and *CLTCL1* (D) alleles for chimpanzees (*Pan troglodytes*, four identified subspecies and one unidentified) and bonobos (*Pan paniscus*). The species and subspecies in which each variant is found are indicated in color representing their percentage of the of the total frequency of the variant in chimpanzees and bonobos.

https://doi.org/10.7554/eLife.41517.007

In contrast to CLTC, we identified 46 non-synonymous SNPs in CLTCL1, present in 52 distinct haplotypes (referred to here as alleles, following the definition given above, Supplementary file 2b-c). A median-joining network for the most common CLTCL1 alleles showed that they are widely distributed within the human meta-populations (Figure 3B). Each meta-population tends to have private, less frequent alleles. Nevertheless, all the meta-populations comprised two main allelic clades, together constituting a sample frequency of 77%. These two main alleles differ by a single methionine to valine substitution at position 1316 (M1316V) in the protein sequence (SNP ID rs1061325 with genomic location chr22:19184095 on hg19 assembly). The valine at position 1316 is predicted to have a functional effect on the protein since it was categorized as ‘probably damaging’ with a probability of 0.975 by PolyPhen (Adzhubei et al., 2010) and as ‘damaging’ by SIFT (Kumar et al., 2009). Sequences in chimpanzee and archaic humans have the M1316 allotype, suggesting that M1316 is likely to represent the ancestral state. To further investigate this, we inspected raw sequencing data from both Altai Neanderthal and Denisovan (Supplementary file 1d). We inferred the most likely genotype to be homozygous for the M1316 amino acid (minimum sequencing depths equal to 40 and 28, respectively). We then extracted sequencing data for an additional 13 archaic and ancient humans (Supplementary file 1d). We found that the V1316 amino acid is present in Pleistocene hunter-gatherers and Neolithic farmers but not in other Neanderthals or Holocene hunter-gatherers in this limited data set. The equivalent residue in human CHC17 (encoded by CLTC) is also methionine, suggesting that methionine at this position likely pre-dated the initial duplication generating CLTCL1. For the non-human species analyzed (Figure 1), all CHC-encoding genes present would produce clathrins with M1316, further indicating its ancient and conserved role in CHC structure or function.

To quantify the levels of nucleotide and allelic diversity for non-synonymous sites within human populations, several summary statistics of diversity were calculated. For populations within each meta-population, we separately calculated Watterson’s and Nei’s estimators of genetic diversity (TW and PI, respectively), Tajima’s D (TD), Fu and Li’s D* (FLDs) and F* (FLFs), the sum of squared allele frequencies including the most common allele (H1) and excluding it (H2), and the normalized ratio between H2 and H1 (H2H1) (Supplementary file 2a). To assess whether observed summary statistics are expected or not under neutral evolution in each population, we calculated the empirical null distribution from a set of 500 control genes with the same coding length as CLTCL1 (Supplementary file 2b). High or low percentile rank values for CLTCL1 in the empirical distribution indicate that the summary statistic for CLTCL1 is unlikely to occur by mutation and neutral genetic drift alone. Summary statistics and populations were then clustered according to their empirical ranks and plotted on a heat map (Figure 4).

Figure 4 with 3 supplements see all

Download asset Open asset

Summary statistics for genetic diversity of *CLTCL1* indicate selection over neutral variation.

For each human population (on the rows) we calculated several summary statistics to analyze diversity (on the columns, defined in Materials and methods) and reported their percentile rank against their corresponding empirical distribution based on 500 control genes. The resulting matrix was then sorted on both axes as a dendrogram (not reported) based on the pairwise distances between each pair of populations. The populations analyzed, with their abbreviations, are listed in Supplementary file 1a, and the inclusive meta-population is indicated in parentheses, defined as in the legend to Figure 3. As depicted in the color legend, red and yellow denote low and high percentile ranks, respectively. Percentiles lower than 0.10 or greater than 0.90 are given in the corresponding cell.

https://doi.org/10.7554/eLife.41517.009

All populations tend to display high genetic diversity for CLTCL1, as summarized by PI and TW, and an unusually high frequency for the second most common allele, as summarized by H2 and H2H1. Such configuration is likely to occur under balancing selection (Charlesworth, 2006) or because of a soft sweep (Messer and Petrov, 2013). That CLTCL1 was low ranking in all populations for H1, a statistic representing the frequency of the most common allele, also supported diversifying selection rather than hard sweeps. On the other hand, all populations display negative TD values with many populations exhibiting negative FLDs and FLFs values. These values are consistent with low diversity within common alleles and an excess of low-frequency variants. Finally, we calculated a measure of genetic differentiation (fixation index F_ST) between pairs of canonical reference populations, namely Yoruba from Nigeria (YRI), North Americans with European ancestry (CEU), and Han Chinese from Beijing (CHB). We did not find any evidence that F_ST values for CLTCL1 (YRI-CEU 0.15, YRI-CHB 0.077, CEU-CHB 0.065) are outliers in the empirical distribution of control genes.

Such inconsistent patterns could be partly explained by the fact that we considered only non-synonymous changes, and the limited number of SNPs considered per gene may create a larger variance in the empirical distributions, especially for allele-based statistics. We therefore further examined the high frequency of the second most common allele by investigating whole genomic variation, including silent SNPs. We observed a local increase of H2 statistics in CLTCL1 for European populations, which already shows a large value based on non-synonymous changes (Figure 4—figure supplement 1). This analysis also indicates that any selection signatures are restricted to a local genomic region encompassing CLTCL1.

Another reason for the summary statistics not being strong outliers in the empirical distribution is the high recombination rate (sex-average rate of 2.5 cM/Mb) inferred for the genomic region encompassing CLTCL1 (Kong et al., 2002). We therefore performed coalescent simulations under neutrality of a putative 100kbp genomic region surrounding the SNP encoding the M1316V variation, taking into account the local recombination rate and a previously proposed demographic model for Africans (YRI), Europeans (CEU) and East Asians (CHB) (Gutenkunst et al., 2009) with a mutation rate of 1.5 × 10⁻⁸ per base pair per generation. The observed values for TW and PI were significantly greater than expected under neutral evolution for all populations (p-values<0.001), while TD was greater than expected for CHB only, although with a marginally non-significant statistical support (p-value 0.056). All these results are suggestive of a genetic diversity higher than expected under neutrality for a region encompassing M1316V, although possible complex evolutionary scenarios may limit the power of summary statistics to detect such selective events.

One plausible explanation of the high genetic diversity and frequency of the two major alleles of CLTCL1 that occur in all modern human populations (Figure 3B, Supplementary file 1b) is balancing selection (Charlesworth, 2006). Such a distribution of allele frequency was confirmed using a different data set of more than 50 sampled human populations (Figure 4—figure supplement 2). In several populations, we also observed an apparent excess of heterozygosity at SNP rs1061325 (Supplementary file 2c), compatible with heterozygote advantage (overdominance) for the two encoded allotypes differing at residue 1316. Specifically, all European populations show a ratio of observed versus expected (assuming Hardy-Weinberg equilibrium) heterozygosity greater than 1, with the highest value of 1.24 (chi-squared test nominal p-value 0.047) for Iberic Spanish (IBS) (Figure 4—figure supplement 3). Selective pressures that might be acting on CLTCL1, irrespective of population distribution, could be changes in human diet, a number of which have been inferred over the last 2.6 million years (Hardy et al., 2015). Perhaps the best known of these dietary transitions are the introduction of cooking ~450 KYA, the development of farming ~12,500 YA, and more recently industrialized food processing, which gradually and then dramatically increased carbohydrate availability and consumption by humans. As CHC22 clathrin, the gene product of CLTCL1, is required for formation of the intracellular pathway critical for an insulin response, its genetic history could potentially be influenced by these changes. To address the hypothesis that nutritional habits conferred selective pressure on CLTCL1, we compared the frequency of SNP rs1061325 (M1316V) in farming versus hunter-gatherer population samples from ancient and modern humans. Although the appearance of SNP rs1061325 predates the advent of farming (Supplementary file 1d), the observed frequencies of this allele, which encodes the CHC22-V1316 allotype, are consistent with a tendency for it to increase once farming became common practice for a population (Figure 5), although the small sample size for modern humans limits the power to reach statistical significance. The highest difference in allele frequency was observed between early farmers and hunter-gatherers from West Eurasia. However, as these two populations are highly diverged, it remains possible that this significant difference in allele frequency is due to genetic drift shaped by population history, rather than natural selection. To test this model, using the same dataset, we extracted 2500 control SNPs with a global minor allele frequency similar to rs1061325 (M1316V) (up to an error of 5%) and minimum global sequencing depth of 100X. We obtained statistical significance (p-value 0.036) when testing the difference in derived allele frequency in farmers compared to hunter-gatherers (+26.58%), while we found no statistical support (p-value 0.080) when testing the absolute difference in allele frequency between these two populations.

Figure 5

Download asset Open asset

Frequencies of the V1316 variant of CHC22 trend higher in populations of farmers compared to hunter-gatherers.

Maximum a posteriori estimates and 95% highest posterior density credible intervals of the frequency of V1316 are compared for modern and ancient hunter-gatherer (HG) and farmer populations indigenous to three continents. Probability of the V allele being at a higher frequency in farmers, labeled as P(f > hg), is also reported.

https://doi.org/10.7554/eLife.41517.013

Genetic variation in non-human vertebrate species supports functional diversification of CLTCL1

We analyzed allelic variation for CLTC and CLTCL1 in the genomes of 79 individuals representing six species of great ape, two species each for chimpanzees, gorillas and orangutans (Pan troglodytes, Pan paniscus, Gorilla beringei, Gorilla gorilla, Pongo abellii, Pongo pygmaeus). After data filtering and haplotype phasing, we found no non-synonymous SNPs for CLTC and 64 putative non-synonymous SNPs for CLTCL1 (Supplementary file 3a). In three species of great apes analyzed, one of the non-synonymous changes in CLTCL1 leads to a premature stop-codon at amino position 41, with an overall frequency of 36%. However, sequences containing the stop-codon exhibited only a marginal increase of nucleotide diversity (+4.7% as measured by Watterson’s index; Watterson, 1975) compared to the full-length sequences, suggesting that these are relatively new variants. Notably, for all the non-human primates analyzed, CLTCL1 variants do not encode the V1316 allotype, which appears private to humans. However, in all three types of great ape we found a common but different substitution, threonine (T1316), at the same amino acid position.

To further investigate variation in non-human primates, we increased the sample size per species by analyzing CLTC and CLTCL1 variation in 70 chimpanzee and bonobo genomes, including four subspecies of chimpanzee. While no variation was observed for CLTC (Figure 3C), a median-joining network for the inferred 8 CLTCL1 alleles (Supplementary file 3b) showed a major allele common to different species and subspecies with less frequent alleles primarily restricted to individual ones (Figure 3D). In this chimpanzee data set, we observed considerable diversity, with a potential tendency towards multiple variants. However, amino acid 1316 was not covered in this data set, possibly due to poor data mapping quality associated with the high nucleotide diversity observed. In another data set of 20 individuals (Teixeira et al., 2015), we found a frequency of 10% for the T1316 allotype in chimpanzees but not in bonobos.

We further investigated CLTCL1 variation in polar bears (Ursus maritimus) and their closest related species, brown bears (Ursus arctos). These two species, which diverged 479–343 KYA (Liu et al., 2014), have very different diets (Liu et al., 2014; Bojarska and Selva, 2012) and are phylogenetically closer to each other than chimpanzees and humans. Polar bears subsist on a high fat, low carbohydrate diet, whereas brown bears consume a more varied diet of carbohydrate, protein and fat. Analysis of 21 bear genomes (seven polar bears and 14 brown bears) (Benazzo et al., 2017), revealed three positions (1267, 1389, and 1522) which are fixed in polar bears but are either polymorphic or have a different residue in brown bears (Supplementary file 3c). Genetic differentiation between polar and brown bears, as measured by F_ST, is markedly higher for CLTCL1 (0.56) than for CLTC (0.26) (Liu et al., 2014). Furthermore, a phylogenetic tree of both bear species in our sample exhibits more diversification for CLTCL1 compared to CLTC (Figure 3—figure supplement 1). This sample of bear populations may support the emergence of multiple CLTCL1 variants within a species and a potential role for diet-related selection.

Modeling CHC22 variation based on CLTCL1 polymorphism suggests an effect on clathrin lattice contacts

One expectation for selection of the human-specific CLTCL1 allele encoding the CHC22-V1316 allotype is that this amino acid change might confer a functional change in the clathrin lattice. This was predicted by the PolyPhen and SIFT analyses, highlighting the change as potentially structure-altering. As many humans are heterozygous for the M1316 and V1316 allotypes (44% based on all individuals from 1000 Genomes project), there may potentially be special properties for mixed lattices formed from the two protein allotypes. To address the possibility that the M1316V polymorphism affects protein function, we used MODELLER (Benjamin and Sali, 2014) to produce a homology model of the two CHC22 allotypes based on the crystal structure of CHC17 clathrin (PDB 1B89) (Ybe et al., 1999), taking advantage of the 85% protein sequence identity between human CHC17 and CHC22 (Figure 6). Modeling using UCSF Chimera (Pettersen et al., 2004) showed that residue 1316 is found at a key interface between triskelion legs in assembled clathrin (Figure 6A and top of panel B). If M1316 is substituted by V1316, the smaller side chain creates a void that would be energetically unfavorable (Figure 6, bottom of panel B), such that the triskelion leg might twist slightly to close the void. In the clathrin lattice, the legs have a torque that rotates the assembly interface along the protein sequence (Wilbur et al., 2005), so a further twist could slightly adjust the interface, altering assembly interactions. Changes in the assembly interface could affect integrity of the lattice and potentially influence kinetics of assembly and disassembly. Mixed lattices of the two CHC22 allotypes would therefore have different properties from CHC22 coats formed in homozygotes for the two major CLTCL1 alleles. CHC22 is needed for the traffic of GLUT4 to its intracellular storage compartment, where GLUT4 awaits release to the plasma membrane in response to insulin. However, CHC22 also accumulates at the GLUT4 storage compartment (GSC) when it expands due to impaired GLUT4 release in cases of insulin-resistant type two diabetes (T2D) (Vassilopoulos et al., 2009). Thus, genetic variation of CHC22 could alter rates of retention and release of GLUT4 in both healthy and disease states.

Figure 6

Download asset Open asset

Modeling of the structural changes in clathrin caused by the methionine-valine dimorphism at residue 1316 predicts conformational alteration.

Model of the CHC17 clathrin lattice (A) is reproduced with permission (Fotin et al., 2004) with the region comprising residue 1316 boxed. Panel B (top part) is the magnified boxed region in A with a CHC22-M1316 model (residues 1210 to 1516) docked into one of the four clathrin heavy chains (CHC-1) forming the edge of the lattice. The black arrow shows the location of the amino acid residue 1316 with the side chain highlighted in CHC-1. The density of the other three CHCs is indicated. Computational models of human CHC22 (residues 1210 to 1516) with either Met or Val at position 1316 (B, lower parts). The yellow circle encloses space opened by reducing the side chain size, which would require a shift in CHC torque to regain structurally favorable side chain contacts.

© 2004, Springer Nature. All Rights Reserved. Panel A reproduced, with permission, from Fotin et al. (2004).

https://doi.org/10.7554/eLife.41517.014

CHC22 variants display functional differences

To test whether the evolutionary change from M1316 to V1316 in CHC22 clathrin alters its properties, three aspects of CHC22 biochemistry and function were compared for the two allotypes. HeLa cells were transfected with constructs encoding each CHC22 variant or CHC17, tagged with green fluorescent protein (GFP). Atypically for their epithelial cell origin but not for transformed cells, HeLa cells express CHC22 clathrin (they are homozygous for the M1316 allotype) (Adey et al., 2013; Landry et al., 2013). We observed that the transfected fluorescently tagged CHC22 allotypes were both concentrated in the perinuclear region of the cell, similar to endogenous CHC22-M1316 detected by antibody, and did not overlap with endogenous CHC17 (Figure 7A). Conversely, transfected GFP-CHC17 did not overlap with endogenous CHC22, so expression of the transfected CHCs reflected their natural distribution (Dannhauser et al., 2017). Using these constructs, the dynamics of membrane association for the two allotypes of CHC22 and for CHC17 was assessed by Fluorescence Recovery After Photobleaching (FRAP). To assess clathrin turnover, as an indicator of clathrin coat stability, cells expressing fluorescent proteins were photobleached in the perinuclear area (Figure 7B) and their rate of fluorescence recovery was measured. Recovery of CHC17 fluorescence was the fastest, consistent with its more soluble properties compared to CHC22 (Dannhauser et al., 2017). CHC22-M1316 showed the slowest recovery and CHC22-V1316 was intermediate (Figure 7C–E), suggesting that it is more exchangeable in the CHC22 coat than the M1316 allotype.

Figure 7

Download asset Open asset

The CHC22-M1316 and CHC22-V1316 allotypes have different dynamics of membrane association, as measured by fluorescence recovery after photobleaching (FRAP).

HeLa cells were transfected with CHC22-M1316-GFP (CHC22M) or CHC22-V1316-GFP (CHC22V) or CHC17-GFP and the expressed constructs were localized relative to endogenous CHC22 and CHC17, which were also compared to each other (A). Endogenous CHC22, CHC17 and the transfected proteins were visualized by immunofluorescence with anti-CHC22 rabbit polyclonal antibody (pAb, red), anti-CHC17 mouse monoclonal antibody (mAb, green) and anti-GFP chicken polyclonal antibody (green for CHC17-GFP or red for CHC22-GFP), respectively. Bars represent 3 μm (untransfected and CHC22M-GFP) and 5 μm (CHC22V-GFP and CHC17-GFP). Transfectants were photobleached in the circular region indicated (B) and recovery of fluorescence (FRAP) was visualized over time (bars, 10 μm) and quantified within the bleached regions (C). For the data in (C), area under the curves (D) and mobile fractions Mf (E) were calculated (Lippincott-Schwartz et al., 2001). We performed a one-way analysis of variance (ANOVA) with Tukey’s multiple comparison post-hoc test: * p-value<0.05, ** p-value<0.01.

https://doi.org/10.7554/eLife.41517.015

Figure 7—source data 1 FRAP experiment data set 1.: https://doi.org/10.7554/eLife.41517.016
Download elife-41517-fig7-data1-v1.xlsx
Figure 7—source data 2 FRAP experiment data set 2.: https://doi.org/10.7554/eLife.41517.017
Download elife-41517-fig7-data2-v1.xlsx
Figure 7—source data 3 FRAP experiment data set 3.: https://doi.org/10.7554/eLife.41517.018
Download elife-41517-fig7-data3-v1.xlsx

The impact of CHC22 variation on GLUT4 retention was then assessed. Because HeLa cells express CHC22, they can form a GSC, when transfected to express GLUT4. These cells sequester GLUT4 intracellularly, and then release it to the plasma membrane in response to insulin, behaving like muscle and adipocytes, though with more modest insulin response (Camus et al., 2018; Trefely et al., 2015; Haga et al., 2011). To detect GLUT4 release to the cell surface, we used a construct expressing GLUT4 tagged with mCherry and a hemagglutinin (HA) epitope embedded in an exofacial loop of the transporter (HA-GLUT4-mCherry). Appearance of surface GLUT4 in response to insulin was detected by fluorescence-activated cell sorting (FACS) using an antibody to the HA epitope (Figure 8A). Transfection of HeLa cells with siRNA depleting CHC22 ablates this insulin-responsive pathway (Camus et al., 2018) (Figure 8A). We then assessed if siRNA inhibition of insulin-responsive GLUT4 release can be rescued by expression of CHC22-M1316-GFP or CHC22-V1316-GFP. These constructs, the same as characterized in Figure 7A, are siRNA-resistant, as well as being GFP-tagged. We observed that, when endogenous CHC22 was depleted, CHC22-M1316 was able to restore the insulin response but CHC22-V1316 was not, when the rescue constructs were expressed at the same levels in cells (measured by intensity of GFP fluorescence) (Figure 8A). CHC17 expression also did not rescue insulin-induced GLUT4 expression, as shown previously (Vassilopoulos et al., 2009). However, CHC22-V1316 is functional for trapping GLUT4 intracellularly because CHC22-transgenic mice that express CHC22-V1316 in muscle, using the natural human promoter, show excessive GLUT4 sequestration in muscle compared to wild-type mice without CHC22, leading to higher blood glucose in the transgenic animals (Vassilopoulos et al., 2009). To analyze GLUT4 sequestration in another way, cells depleted for CHC22 and then transfected with mCherry-GLUT4 plus either CHC22 allotype or CHC17 were each divided into three populations expressing equivalently low, medium and high levels of the transfected CHC-GFP. Then, the total GLUT4 content of the cells was measured by mCherry fluorescence. We observed higher levels of GLUT4 in CHC22-depleted cells expressing CHC22-M1316-GFP, compared to cells expressing either CHC22-V1316-GFP or CHC17-GFP at both medium and high levels of CHC expression (Figure 8B). This suggests that GLUT4 is sequestered more effectively from degradative membrane traffic pathways when trafficked by CHC22-M1316 than by CHC22-V1316, indicating that the M1316 variant is more efficient at targeting GLUT4 to the GSC. As indicated by their weak insulin response compared to muscle or fat cells, HeLa cells are only just able to form a functional GSC from which GLUT4 can be released. For these cells, the less effective CHC22-V1316 is inadequate to restore GSC formation when their endogenous CHC22-M1316 is depleted. Use of this HeLa model was necessitated by the lack of natural models for the CHC22-dependent GLUT4 pathway in myoblasts and adipocytes, as well as a lack of antibodies that detect surface GLUT4. Nonetheless, these experiments demonstrate a functional difference between CHC22-M1316 and CHC22-V1316 and suggest that CHC22-V1316 is less efficient at GLUT4 sequestration.

Figure 8

Download asset Open asset

Differences in intracellular GLUT4 sequestration and stability occur in cells expressing the CHC22-M1316 or CHC22-V1316 allotypes.

HeLa cells were treated with siRNA to deplete endogenous CHC22 or with control siRNA, then transfected to co-express HA-GLUT4-mCherry along with CHC17-GFP (CHC17), CHC22-M1316-GFP (CHC22M) or CHC22-V1316-GFP (CHC22V) (A and B). Total levels of expressed GLUT4 and CHC were measured by FACS (mean fluorescence intensity (MFI) for mCherry or GFP, respectively). Surface levels of GLUT4 were measured with anti-HA antibody at basal conditions (-) or after 30 min of exposure to insulin (+) and surface/total GLUT4 is reported as a measure of GLUT4 translocation to the cell surface (A) in cells expressing equivalent total levels of CHC-GFP. The extent of GLUT4 translocation was assessed in each experimental group before and after insulin stimulation; Student t-test, * p-value<0.05. Transfected cells treated with siRNA to deplete endogenous CHC22, but not treated with insulin, were gated into thirds expressing equivalently low (L), medium (M) and high (H) levels of CHC-GFP for each type of CHC, then total levels of HA-GLUT4-mCherry in each population were plotted (B). We performed a one-way analysis of variance (ANOVA) with Tukey’s multiple comparison post-hoc test: * p-value<0.05.

https://doi.org/10.7554/eLife.41517.019

Figure 8—source data 1 GLUT4 translocation experiment.: https://doi.org/10.7554/eLife.41517.020
Download elife-41517-fig8-data1-v1.xlsx
Figure 8—source data 2 Total GLUT4-mCherry levels.: https://doi.org/10.7554/eLife.41517.021
Download elife-41517-fig8-data2-v1.xlsx

Discussion

We studied the phylogenetics and population genetics of CHC22 clathrin to understand the functional variation of this protein in relation to its evolutionary history. CHC22 clathrin is a key player in post-prandial blood glucose clearance in humans through its role in intracellular packaging of the GLUT4 glucose transporter in muscle and fat, the tissues in which CHC22 and GLUT4 are expressed (Vassilopoulos et al., 2009). The CHC22 pathway positions GLUT4 for cell surface release in response to insulin and consequent uptake of glucose into these tissues (Bryant et al., 2002). The CLTCL1 gene encoding CHC22 resulted from gene duplication that we have now dated to 494–451 MYA, early in vertebrate evolution when jawed vertebrates emerged. We had previously shown that CLTCL1 is a pseudogene in mice (Wakeham et al., 2005). Expanding analysis to 56 jawed vertebrate genomes (>5X coverage) we could not detect CLTCL1 in nine of them. Six of these absences can be ascribed to two independent gene loss events in branches of the Rodentia and the Cetartidactylae. The three others may represent additional gene losses or incomplete genome annotation. All vertebrate and non-vertebrate eukaryotes considered here have retained the parent CLTC gene encoding CHC17 clathrin, which mediates endocytosis and other housekeeping membrane traffic pathways. The analysis described here establishes that CLTC is under strong purifying selection. Notable is our evidence for purifying selection on CLTCL1 in the species in which it has been retained, supporting its functional importance in those species. Compared to CLTC, extensive allelic diversity was observed for CLTCL1 in all species for which populations were analyzed, including humans, chimpanzees and bears. Variant alleles were species-specific in most cases. In all human populations, two allelic variants of CLTCL1 are present in high frequency, differing only at one nucleotide, resulting in CHC22 protein with either methionine or valine at position 1316. The V1316 allotype appears specific to humans, but some non-human primates have a different variation at the position 1316. Analysis of ancient humans dated the appearance of the V1316 variant to 500–50 KYA and indicated that M1316, which is fixed in CHC17 clathrin, is the ancestral state. Analyses of human population genetic data provided support for the maintenance of high genetic diversity and two allotypes of CHC22. We hypothesize that selective pressure on CHC22 clathrin comes from its role in nutrient metabolism. Consistent with this hypothesis, we observed functional differences between the two CHC22 allotypes in their capacity to control GLUT4 membrane traffic, as predicted by structural modeling and differences in cellular dynamics of the two allotypes.

Retention of CLTC in all vertebrate species is consistent with the encoded CHC17 mediating cellular housekeeping clathrin functions shared by all eukaryotes. On the other hand, CHC22, encoded by the paralogous gene CLTCL1, operates in the specialized insulin-responsive GLUT4 pathway to make the pathway more efficient in those species that retained CLTCL1. Data presented here (Figure 8) and our recent mapping of a novel intracellular location for CHC22 function (Camus et al., 2018) indicate that, in human cells, CHC22 clathrin promotes transport from the secretory pathway to the insulin-responsive GSC. This CHC22 pathway complements the endocytic pathway for GLUT4 targeting to the GSC, so species without CHC22 can rely primarily on endocytosis for GLUT4 trafficking to the GSC, while species with CHC22 use both pathways. Thus, we hypothesize that species with functional CHC22 clathrin are more efficient at intracellular GLUT4 sequestration, resulting in lower surface GLUT4 in the absence of insulin, and tighter regulation of GLUT4 release in response to insulin. The trade-off is that these species have an inherent increased tendency to insulin resistance as their GLUT4 is sequestered more effectively. The two main vertebrate branches that have lost CHC22 comprise the Muridae (mice and rats) who are incessant herbivores and the Cetartiodactyla (sheep, cattle, porpoise and pigs) which include the ruminants (sheep and cattle) whose muscle uptake of glucose is critical for muscle function, but is not a main pathway for glucose clearance (Hocquette et al., 1995). These two groups of species require greater availability of GLUT4 on their cell surfaces, so that more efficient GLUT4 sequestration by CHC22 would not be favorable to their nutritional needs. The fact that CHC22 alters the balance of membrane traffic to the GSC means that species losing CLTCL1 could evolve compensatory pathways more compatible with their diets. Thus, transgenic mice expressing CHC22 over-sequester GLUT4 in their muscle and develop hyperglycemia with aging (Vassilopoulos et al., 2009). The cave fish, which appears to lack CLTCL1, has independently evolved mutations in the insulin receptor, creating natural insulin resistance, such that the presence of CHC22 on top of this mechanism might be detrimental (Riddle et al., 2018). The loss of CLTCL1 from cave fish is consistent with the insulin responsive GLUT4 pathway being a target for natural selection driven by diet, which might also explain CLTCL1 variation or loss for additional vertebrate species during vertebrate evolution.

The allelic variation reported here for CLTCL1 in human and bear populations further supports the hypothesis that CLTCL1 has undergone continued selection during vertebrate evolution in relation to diet. While purifying selection appears to be operating on CLTCL1 in those species that retain it, CLTCL1 is far more variable than CLTC in these species. In humans, we find two major and functionally distinct alleles at remarkably similar frequencies in all populations studied. Statistical analysis comparing early farmer and hunter-gatherer populations shows an apparent increase of the V1316 variant, suggesting a correlation with regular consumption of digestible carbohydrate. Notably, the SNP distinguishing these alleles is human-specific and likely arose 550–50 KYA (i.e. post-Neanderthal split, pre-Neolithic). Other dramatic increases in digestible carbohydrate utilization have been inferred for humans in this timeframe; in particular the advent of cooking (which gelatinizes crystalized starch, making it much easier to digest), salivary amylase gene copy number increase (allowing increased starch digestion capacity) and accelerated brain size increase (which would increase demands for blood glucose) (Hardy et al., 2015). While the co-evolution of these cultural and genetic traits was originally proposed to have occurred some 800 KYA, recent studies indicate a time frame of 450–300 KYA years for cooking (Shahack-Gross et al., 2014), increased oral amylase activity (Inchley et al., 2016) and accelerated brain size increase (Dunbar, 2019). The fact that the two major human CLTCL1 alleles are functionally distinct is consistent with diversifying selection operating on CLTCL1, with a balancing selection possibly caused by heterozygote advantage. While population genetic signatures for balancing or overdominant selection were not entirely robust, some summary statistics were suggestive of an increased diversity that was unlikely to have occurred under neutrality. Other statistics, such as the ones based on allele frequencies, would not be expected to gain significance within the timeframe of the human-specific diversifying selection we detect. The allelic diversity of CLTCL1 in other primate species could have the potential effect of diluting its function. Whilst chimpanzees are omnivores and gorillas herbivores, both rely for nutrition on extensive foraging for carbohydrate. Also notable is that polar bears, who have a very low carbohydrate diet compared to their brown bear relatives, have distinct CHC22 variants with unknown functionality, again consistent with CLTCL1 undergoing selection driven by nutritional ecology.

Clathrins are self-assembling proteins and function as a latticed network in the protein coat that they form on transport vesicles. Our structural modeling predicts that the single amino acid difference between the two main human CHC22 allotypes could influence the strength of molecular interactions in the CHC22 clathrin lattice, as position 1316 occurs at a lattice assembly interface (Figure 6). When expressed in cells, both CHC22 variants gave the same overall intracellular distribution, but CHC22-V1316 shows faster turnover from membranes than CHC22-M1316 (Figure 7) and is less effective at GLUT4 sequestration (Figure 8B). These properties are consistent with the methionine to valine change attenuating GLUT4 retention. This interpretation is further supported by a GLUT4 translocation assay, which indicates that the V1316 variant is less effective in forming the insulin-responsive GSC than the ancestral M1316 form of CHC22 (Figure 8A). Thus, mixed lattices occurring in heterozygous individuals, potentially reflect balancing selection and overdominance, might reduce GLUT4 sequestration compared to M1316 homozygotes. This would have the effect of improving glucose clearance. It can be argued that human consumption of digestible carbohydrate on a regular basis (Hardy et al., 2015), requiring increased glucose clearance, might be a selective force driving this genetic adaptation. This view is consistent with the increased frequency of the V1316 variant in early farmers. It is also possible that some forms of polar bear CHC22 are super-active at GLUT4 sequestration, providing a route to maintain high blood glucose, as occurs through other mutations in the cave fish (Riddle et al., 2018).

Regulators of fundamental membrane traffic pathways have diversified through gene duplication in many species over the timespan of eukaryotic evolution. Retention and loss can, in some cases, be correlated with special requirements resulting from species differentiation, such as the extensive elaboration of genes in the secretory pathway of Tetrahymena (Dacks and Robinson, 2017; Bright et al., 2010). The evolutionary history of CLTCL1, following vertebrate-specific gene duplication, suggests that differentiation of nutritional habits has shaped selection for the presence and absence of CLTCL1 in some vertebrate species, and its diversification in humans and potentially other species. Though its highest expression is in muscle and adipose tissue, transient expression of CHC22 during human brain development has also been documented (Nahorski et al., 2015). This was noted in a study of a very rare null mutant of CLTCL1 that caused loss of pain sensing in homozygotes and no symptoms for heterozygotes (Nahorski et al., 2015). Attenuated CHC22 function of the V1316 variant might lead to a spectrum of pain sensing in humans but this is unlikely to be a strong selective force affecting reproductive success, whereas glucose homeostasis, as suggested by our analysis, is more likely. By exerting efficient control of blood glucose levels, the presence of CHC22 clathrin was likely beneficial in providing the nutrition required to develop the large human brain, as well as affecting reproduction by influencing glucose availability during pregnancy (Hardy et al., 2015). However, over the last 12,500 years in association with farming, or perhaps over the last 450,000 years in association with cooking, salivary amylase activity and starch digestion (Hardy et al., 2015; Shahack-Gross et al., 2014; Inchley et al., 2016), readily available carbohydrate has increased our need to clear glucose from the blood, such that selection continues to act on CLTCL1 in humans. Our cell biology studies have also demonstrated that CHC22 increases GLUT4 retention. While we would not expect the major CLTCL1 polymorphism to directly influence the development of T2D, CHC22 accumulates on the expanded GSC that forms in cases of insulin-resistant T2D (Vassilopoulos et al., 2009), so its variation could potentially exacerbate insulin resistance to different degrees. The genetic diversity that we report here may reflect evolution towards reversing a human tendency to insulin resistance and have relevance to coping with increased carbohydrate in modern diets.

Materials and methods

Key resources table

Reagent type (species) or resource	Designation	Source or reference	Identifiers	Additional information
Cell line (human)	HeLa	ATCC	Cat. #: CCL-2; RRID:CVCL_0030
Antibody	Mouse monoclonal anti-CHC17 (X22)	Frances Brodsky PMID: 2415533		IF (5 mg/mL)
Antibody	Mouse monoclonal anti-CHC17 (TD.1)	Frances Brodsky PMID: 1547490		WB (1.3 mg/mL)
Antibody	Rabbit polyclonal anti-CHC22 (SHL-KS)	Frances Brodsky PMID: 29097553		WB (0.4 mg/mL)
Antibody	Mouse monoclonal anti-β-actin (AC-15)	Sigma	Cat. #: A1978; RRID:AB_476692	WB (1:2000)
Antibody	Purified anti-HA.11 (16B12)	Covance	Cat. #: MMS-101P; RRID:AB_10064068
Antibody	Rabbit polyclonal anti-CHC22	Proteintech	Cat. #: 22283–1-AP; RRID:AB_11183764
Antibody	Goat anti-rabbit IgG coupled to HRP	ThermoFisher Scientific	Cat. #: 172–1019	WB (1:8000)
Antibody	Goat anti-mouse IgG coupled to HRP	ThermoFisher Scientific	Cat. #: 170–6516	WB (1:8000)
Antibody	Anti-mouse IgG1 coupled to Brilliant Violet 421 (RMG1-1)	Biolegend	Cat. #: 406616; RRID:AB_2562234	FC (1:200)
Recombinant DNA reagent	HA-GLUT4-mCherry	This paper		Generated from HA- GLUT4-GFP (gift from Dr Tim McGraw, PMID: 11058093)
Recombinant DNA reagent	CHC22V (pEGFP-C1-GFP-CHC22V)	Frances Brodsky PMID: 20065094
Recombinant DNA reagent	CHC22M (pEGFP-C1-GFP-CHC22M)	This paper		Generated by Quick change mutagenesis from CHC22V
Recombinant DNA reagent	CHC17 (pEGFP-C1-GFP-CHC17)	Frances Brodsky PMID: 29097553
Sequence-based reagent	AllStars Negative Control siRNA	Qiagen	Cat. #: SI03650318
Commercial assay or kit	Quick change mutagenesis	New England Biolabs, USA	Cat. #: E0554S
Commercial assay or kit	BCA	Pierce	Cat. #: 23225
Commercial assay or kit	Western Lightning Chemilumi-nescence Reagent	GE Healthcare	Cat. #: RPN2209
Chemical compound, drug	JetPrime transfection reagent	PolyPlus	Cat. #: 114–07
Chemical compound, drug	Insulin	Sigma	Cat. #: I9278
Chemical compound, drug	Bovine serum albumin (BSA)	Sigma	Cat. #: A7906
Software, algorithm	FlowJo	Treestar
Software, algorithm	ImageJ	NIH
Software, algorithm	Prism	Graphpad
Software, algorithm	R	R Project		Packages: pegas, Smisc, gplots
Other	CellView glass bottom culture dish	Greiner Bio-one	Cat. #: 627860
Other	Nitrocellulose membrane	Biorad	Cat. #: 1620112

Phylogenetics

Request a detailed protocol

Vertebrate genomes as well as genomes of Drosophila melanogaster, Caenorhabditis elegans, Ciona intestinalis and Ciona savignyi were downloaded from Ensembl (Yates et al., 2016), all accessed on 23/04/2016 except for pig (14/12/2017), marmoset and hagfish (both 09/08/2018), excluding vertebrate species sequenced below five-fold genome coverage, that is with less than five reads per site on average. In addition, we downloaded the genomes of the elephant shark (Venkatesh et al., 2014), whale shark (Read et al., 2017), marmot (The Alpine Marmot Genome, BioProject PRJEB8272 on NCBI) and porpoise (Yuan et al., 2018). All potential orthologs for the human isoforms of CLTC/CLTCL1, MTMR4/MTMR3, and CLTA/CLTB, in the above genomes were retrieved via BLAST (Boratyn et al., 2013). An e-value threshold of 0.001 with additional constraints applied by default in InParanoid version 7 (Ostlund et al., 2010) were used (at least 50% of the sequences are covered by the alignment, and at least 25% of the residues are aligned). The polar bear (Ursus maritimus) (Liu et al., 2014), brown bear (Ursus arctos) (Benazzo et al., 2017) and black bear (Ursus americanus) CHC17 and CHC22 protein sequences were manually added. For CLTCL1 only the elephant and horse sequences (XP_023397213.1 and XP_023502410.1 respectively) were manually added.

The sequences corresponding to the longest transcripts were aligned with MAFFT (Katoh and Standley, 2013) and phylogenetic trees generated with PhyML (Guindon et al., 2010). The last two steps were repeated after manually removing outlier sequences lying on long branches (CLTC/CLTCL1: ENSTNIP00000007811.1, ENSTGUP00000014952.1, XP_023397213.1, XP_023502410.1; CLTA/CLTB: ENSPSIP00000012669.1) and, in the case of genomes not retrieved from Ensembl (therefore lacking the gene-to-transcript mapping), sequences most likely corresponding to alternative transcripts (XP_015350877.1, XP_007899998.1, XP_007899997.1, XP_007904368.1, XP_007904367.1, XP_020375861.1, XP_020375865.1, XP_020375862.1, XP_020392037.1, XP_020375864.1, XP_020375859.1). Trees were manually reconciled based on the Ensembl species tree extended by elephant shark, whale shark, brown bear, black bear, porpoise and marmot with TreeGraph (Stöver and Müller, 2010). Branch lengths were estimated based on the multiple sequence alignment (MSA) with PhyML fixing the manually reconciled topology, with options ‘-u’ and ‘--constraint_file’. With this approach no support values for splits are calculated. The resulting trees were used as input to generate a new phylogeny-aware MSAs with PRANK (Löytynoja and Goldman, 2005). Branch lengths of the reconciled topologies were then re-estimated based on the MSA generated by PRANK.

To compute evolutionary rates, the sequences and subtrees corresponding to CLTC and CLTCL1 clades after duplication (i.e. excluding non-vertebrates) were extracted and sequences from species without either CLTC or CLTCL1 were removed. The same procedure was performed for MTMR4/MTMR3 and CLTA/CLTB. A phylogeny-aware MSA was computed with PRANK on the remaining sequences, and the amino acid alignment was converted to a codon alignment with PAL2NAL (Suyama et al., 2006). Finally, dN/dS ratios (i.e. the ratio of the rate of nonsynonymous substitutions to the rate of synonymous substitutions) were inferred based on the codon alignments with PAML (Yang, 2007) for the six proteins independently using the site model M7. Model M7 fits a Beta-distribution to the site rates by estimating the two Beta parameters shape and scale. Rates are estimated per site over the entire phylogeny, and therefore represent time averages. Phylogenetic trees of consensus amino acid sequences for bear samples only were computed using PhyML 3.1 (Guindon et al., 2010) with default values as implemented in Phylogeny.fr (Dereeper et al., 2008).

Population genetics

Request a detailed protocol

Phased genotypes were obtained by querying Variant Call Format (VCF) files (Danecek et al., 2011) from the 1000 Genomes Project database Phase 3 (1000 Auton et al., 2015) for all available 2504 samples. Only high-quality variants were retained using vcflib (https://github.com/vcflib/vcflib) with options ‘VT = SNP and QUAL > 1 and AC >1 and DP >5000’. Missing genotypes were assigned to homozygotes for the reference alleles. Finally, only sites with a recorded annotated function of being missense, nonsense, stop-loss or frame-shift for tested genes according to the UCSC Table Browser were retained (Speir et al., 2016) (tables snp150 and snp150CodingDbSnp). For each retained position, the reference sequence for chimpanzee from the UCSC Table Browser (Speir et al., 2016) (table snp150OrthoPt5Pa2Rm8) was initially used to infer the putative ancestral state. For ambiguous or multiallelic states in the chimpanzee sequence, the human reference base was used as an initial proxy for the ancestral state. The predicted functional impact of amino acid replacements was obtained by using Polyphen (Adzhubei et al., 2010) and SIFT (Kumar et al., 2009). Additional frequency information for a single mutation of interest in more than 50 human populations was retrieved from the HGDP CEPH Panel (Cann et al., 2002) from http://hgdp.uchicago.edu/cgi-bin/gbrowse/HGDP/. Genotype data for farmer and hunter-gatherer individuals were collected from the Simons Genome Diversity Project Dataset (Mallick et al., 2016). Populations were merged based on their assigned geographical region with the following classification for hunter-gatherers: Africa (Biaka, Ju|‘hoan North, Khomani San, Mbuti), Central Asia and Siberia (Aluet, Chukchi, Eskimo Chaplin, Eskimo Naukan, Eskimo Sireniki, Even, Itelman, Tlingit, Tubalar, Ulchi, Atayal), East and South Asia (Atayal, Kusunda). Farmer and hunter-gatherer allele frequencies were compared following a previously described approach (Raineri et al., 2014). Briefly, we analytically computed the probability that the V allele is more frequent in farmers than in hunter gatherers while fully accounting for the uncertainty in the individual frequency estimates. V allele frequencies were inferred from allele counts of M and V in a Bayesian framework with a conjugate Beta uniform prior. We recorded maximum a posteriori estimates with 95% highest posterior density credible intervals computed with the Smisc R library, version 0.3.9. We collected further published ancient DNA data from Western Eurasia and classified into three genetic grouping: hunter-gatherer (HG), early farmer (EF) and steppe, using supervised ADMIXTURE (Alexander et al., 2009) as previously described (Mathieson and Mathieson, 2018). These are genetic groups and not directly based on differences in material culture or subsistence, but importantly in the case of HG and EF, these genetic classifications correspond closely to hunter-gatherer and agricultural subsistence strategies (Haak et al., 2015; Skoglund et al., 2014; Skoglund et al., 2012). We then restricted analysis to samples dated between 10,000 and 5,000 years before present that were classified as either HG or EF, leading to a dataset of 119 HG and 316 EF of which 85 and 188 respectively had coverage at rs1061325. Frequencies for South-East Asians and ancient Eurasians were down-sampled to ensure numerical stability. The HeLa genomic data were accessed through the NIH database of Genotypes and Phenotypes (dbGaP at http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap) through dbGaP accession number phs000640.

High-coverage VCF files for 79 individuals from six species of great apes were retrieved (Prado-Martinez et al., 2013). Data was filtered using vcflib on the combined data set with the options ‘QUAL > 32 and DP >50 and DP <7000 and FS <27 and MQ > 25 and AC >1’, similarly to the original manuscript describing this data set (Prado-Martinez et al., 2013). To retrieve nonsynonymous changes, only variants where the translated proteins for each allele differ were retained. We finally phased the data and assigned individual haplotypes using shapeit v2.r837 with the options ‘-burn 50 -prune 20 -main 100 -window 0.5 -effective-size 20000’. Additional 110 genomes of chimpanzees and bonobos were analyzed (Teixeira et al., 2015; de Manuel et al., 2016). Data filtering, functional annotation and haplotype phasing were performed as described above.

Full genome VCF files for two high-coverage archaic humans, namely one Altai Neanderthal (Prüfer et al., 2014) and one Denisova were retrieved (Meyer et al., 2012). Low-quality sites were filtered out using vcflib with the options ‘QUAL > 1 and DP >10’. A pseudo-reference sequence for each archaic human was constructed by replacing the heterozygous sites with the previously inferred human ancestral state. Sequencing data information for additional ancient human samples were obtained from previously published high-quality whole genome sequences (Skoglund et al., 2014; Broushaki et al., 2016; Hofmanová et al., 2016; Lazaridis et al., 2014; Olalde et al., 2014; Raghavan et al., 2014; Seguin-Orlando et al., 2014; Fu et al., 2014). Genotype likelihoods were calculated using the standard GATK model (McKenna et al., 2010). Median-joining network plots were generated in R using pegas package (Paradis, 2010).

Several summary statistics were calculated on the inferred alleles to describe their levels of nucleotide diversity. Specifically, for each population separately, Watterson’s estimator of population mutation parameter (TW) (Watterson, 1975), Nei’s genetic diversity index (PI) (Nei, 1973), Tajima’s D (TD) (Tajima, 1989), Fu and Li’s D* (FLDs) and F* (FLFs) (Fu and Li, 1993), the sum of squared allele frequencies including the most common allele (H1) and excluding it (H2) and their normalized ratio (H2H1) (Garud and Rosenberg, 2015; Garud et al., 2015) were calculated. We also computed genetic differentiation (F_ST) (Reynolds et al., 1983) between pairs of canonical reference populations, namely Yoruban (YRI), Europeans (CEU), and Han Chinese (CHB).

To assess whether the observed summary statistics are expected under neutral evolution, genes with a coding length approximately equal (±5%) to the one observed for the tested gene, CLTCL1, were selected. For this analysis, the longest isoform for each gene, and its annotation was considered according to refGene table from the UCSC Genome Browser. We discarded genes on chromosome six and on sex chromosomes, as well as CLTA, CLTB and CLTC. This set was further reduced to the first 500 genes with the closest genomic length to CLTCL1. As summary statistics can be calculated only in case of genetic variability, genes showing no non-synonymous SNPs within each population were discarded. For each summary statistic, the empirical percentile rank for the value observed in CLTCL1 compared to the whole distribution of control genes was calculated. Low or high values are suggestive of CLTCL1 being an outlier in the empirical distribution. For plotting purposes, summary statistics and populations were clustered according to a dendrogram inferred from their respective distances based on the calculated matrix of empirical percentile ranks. That is, populations clustering together exhibit similar patterns of percentile ranks, and thus of summary statistics. The underlying dendrograms are not reported. The heatmap plot was generated using the function heatmap.2 in R with the package gplots. Cells with an empirical percentile rank lower than 0.10 or greater than 0.90 were filled with the exact rank value. We also obtained a null distribution of summary statistics by performing coalescent simulations using msms (Ewing and Hermisson, 2010) under a previously derived demographic model for human populations (Gutenkunst et al., 2009).

Structure prediction by modeling

Request a detailed protocol

MODELLER v9.13 (Benjamin and Sali, 2014) was used to model the structure of the proximal leg segment of CHC22, using the crystal structure of bovine CHC17 (PDB 1B89) (Ybe et al., 1999) as a template. The model of the M1316V mutant was derived in a similar way using a mutated sequence. Structure visualization and analysis of residue interactions at the mutation site M1316 were performed using UCSF Chimera (Pettersen et al., 2004). The wild type and mutant homology models were positioned in the cryo-electron microscopy map of the bovine clathrin lattice (EMD: 5119) (Fotin et al., 2004) by structural superposition on the atomic model originally fitted in the map (PDB 1XI4).

Functional experiments

Antibodies, plasmids and reagents

Request a detailed protocol

Mouse monoclonal anti-CHC17 antibodies X22 (Brodsky, 1985), TD.1 (Näthke et al., 1992) and affinity-purified rabbit polyclonal antibody specific for CHC22 and not CHC17 (Vassilopoulos et al., 2009) were produced in the Brodsky laboratory. Commercial sources of antibodies were as follows: mouse monoclonal anti-β-actin (clone AC-15, Sigma), mouse monoclonal anti-HA (clone 16B12, Covance), rabbit polyclonal anti-CHC22 (Proteintech). Secondary antibodies coupled to HRP were from ThermoFisher, the secondary antibody coupled to Brilliant Violet 421 was from BioLegend. The HA-GLUT4-mCherry was generated by replacing the GFP from the HA-GLUT4-GFP construct (gift from Dr Tim McGraw; Lampson et al., 2000) with mCherry using KpnI and EcoRI. The generation of the CHC22 variant expressing a valine at position 1316 (CHC22V) was previously described (Esk et al., 2010). The CHC22 variant expressing a methionine at position 1316 (CHC22M) was generated from CHC22V by quick-change mutagenesis (New England Biotechnologies, USA) following manufacturer’s instructions.

Small RNA interference

Request a detailed protocol

Targeting siRNA was produced to interact with DNA sequences AAGCAATGAGCTGTTTGAAGA for CHC17 (Esk et al., 2010) (Qiagen), TCGGGCAAATGTGCCAAGCAA and AACTGGGAGGATCTAGTTAAA for CHC22 (1:1 mixture of siRNAs were used) (Vassilopoulos et al., 2009) (Dharmacon). Non-targeting control siRNA was the Allstars Negative Control siRNA (Qiagen).

Cell culture

Request a detailed protocol

HeLa cells were grown in Dulbecco’s Modified Eagle Medium high glucose (Gibco) supplemented with 10% FBS (Gibco), 50 U/mL penicillin, 50 μg/mL streptomycin (Gibco), 10 mM Hepes (Gibco) and maintained at 37°C in a 5% CO₂ atmosphere. HeLa cells were free of mycoplasma infection.

siRNA and DNA transfection

Request a detailed protocol

Cells were transfected for 72 hr with 20 nM of siRNA. Silencing was assessed by immunoblotting. Transient DNA transfections for rescue experiments were performed during the third day of silencing. For FACS experiments, cells (per well of 6-well plate, 70% confluent) were transiently transfected with 1 μg DNA for CHC22M-GFP and CHC22V-GFP, 1.5 μg DNA for CHC17-GFP and HA-GLUT4-mCherry. For FRAP experiments, cells (per glass bottom dish, 60% confluent) were transfected with 0.75 μg DNA for CHC22-GFP (M or V) or 1.5 μg DNA for CHC17-GFP. FACS and FRAP experiments were carried out 24 hr later. All transfections were performed using JetPrime transfection reagent (PolyPlus) following manufacturer’s instructions.

GLUT4 translocation assay using flow cytometry

Request a detailed protocol

HeLa cells were grown in 6-well plates and transiently transfected with either HA-GLUT4-mCherry alone or in combination with GFP-tagged CHC22 (M or V) or CHC17-GFP the day before the experiment. The next day, cells were serum-starved (2 hr) before insulin stimulation (170 nM or vehicle (water) for 15 min, 37°C). Cells were then placed on ice and rapidly washed (2X, PBS, 4°C) and fixed (PFA 2%, 15 min). After fixation, cells were washed (2X, PBS, RT) then blocked for 1 hr (PBS 2% BSA, RT) before incubation with monoclonal anti-HA antibody (45 min, RT) to detect surface GLUT4. After incubation, cells were washed (3X, PBS, RT) and incubated with anti-mouse secondary Ig coupled to Brilliant Violet 421 (45 min, RT). Cells were then washed (5X, PBS, RT), gently lifted using a cell scraper (Corning), pelleted (800xg, 8 min) and re-suspended (PBS, 2% BSA, 4°C). Data was acquired with Diva acquisition software by LSRII flow cytometer (Becton Dickinson). Typically, 10,000 to 30,000 events were recorded and Mean Fluorescence Intensity (MFI) values for surface GLUT4 (Brilliant Violet 421) and total GLUT4 (mCherry) were recorded using 450/50 and 530/30 filters, respectively. The ratio of surface to total MFI was calculated to quantify the extent of GLUT4 translocation. MFI values for total GLUT4 (mCherry) were plotted for GLUT4 stability assays. Histograms and post-acquisition analysis were performed using FlowJo software (Treestar). Total GLUT4 and surface GLUT4 values are reported separately for cells expressing CHC22 variants with equalized GFP signals at the top third (high), middle third (medium) and bottom third (low) levels of expression.

Fluorescence Recovery After Photobleaching

Request a detailed protocol

The imaging of transiently transfected HeLa cells grown on Cellview glass bottom culture dishes (Greiner, Germany) was performed at 37°C in a 5% CO₂ atmosphere, using low 488 nm laser power to minimize photobleaching using a 63x (1.4 NA) lens on a Leica SP8 confocal microscope. A 2.0 μm² circular region of interest was positioned in the perinuclear region of the transfected cells, a region where both CHC22 and CHC17 naturally occupy. A 100% laser power (488 nm) coupled to 11 iterations was performed to achieve GFP photobleaching. Recovery of fluorescence was recorded from 4 to 10 independent cells per dish. The experiment was repeated at least three times.

Immunoblotting

Request a detailed protocol

HeLa cells protein extracts were quantified by BCA (Pierce), separated by SDS-PAGE (10% acrylamide), transferred to nitrocellulose membrane (0.2 μm, Biorad), labeled with primary antibodies (1–5 μg/mL), washed and labeled with species-specific horseradish peroxidase-conjugated secondary antibodies (ThermoFisher). Peroxidase activity was detected using Western Lightning Chemiluminescence Reagent (GE Healthcare). The molecular migration position of transferred proteins was compared to the PageRuler Prestain Protein Ladder 10 to 170 kDa (Thermo Fisher Scientific). Signals were detected using the Chemidoc XRS + imaging system (Biorad) and quantifications were performed using Image J software (NIH).

Statistical analyses

Request a detailed protocol

Graphs and statistical analyses were performed using Prism software (Graphpad). Detailed statistical information including statistical test used, number of independent experiments, p-values, definition of error bars is listed in individual figure legends. All experiments were performed at least three times.

Data availability

All data generated or analysed during this study are included in the manuscript and supporting files.

References

1. Adey A
2. Burton JN
3. Kitzman JO
4. Hiatt JB
5. Lewis AP
6. Martin BK
7. Qiu R
8. Lee C
9. Shendure J
(2013) The haplotype-resolved genome and epigenome of the aneuploid HeLa cancer cell line
Nature 500:207–211.

https://doi.org/10.1038/nature12064
- PubMed
- Google Scholar
(2010) A method and server for predicting damaging missense mutations
Nature Methods 7:248–249.

https://doi.org/10.1038/nmeth0410-248
- PubMed
- Google Scholar
(2009) Fast model-based estimation of ancestry in unrelated individuals
Genome Research 19:1655–1664.

https://doi.org/10.1101/gr.094052.109
- PubMed
- Google Scholar
(2011) Genome evolution and meiotic maps by massively parallel DNA sequencing: spotted gar, an outgroup for the teleost genome duplication
Genetics 188:799–808.

https://doi.org/10.1534/genetics.111.127324
- PubMed
- Google Scholar
(2015) A global reference for human genetic variation
Nature 526:68–74.

https://doi.org/10.1038/nature15393
- PubMed
- Google Scholar
1. Babbitt CC
2. Warner LR
3. Fedrigo O
4. Wall CE
5. Wray GA
(2011) Genomic signatures of diet-related shifts during human origins
Proceedings of the Royal Society B: Biological Sciences 278:961–969.

https://doi.org/10.1098/rspb.2010.2433
- Google Scholar
1. Benazzo A
2. Trucchi E
3. Cahill JA
4. Maisano Delser P
5. Mona S
6. Fumagalli M
7. Bunnefeld L
8. Cornetti L
9. Ghirotto S
10. Girardi M
11. Ometto L
12. Panziera A
13. Rota-Stabelli O
14. Zanetti E
15. Karamanlidis A
16. Groff C
17. Paule L
18. Gentile L
19. Vilà C
20. Vicario S
21. Boitani L
22. Orlando L
23. Fuselli S
24. Vernesi C
25. Shapiro B
26. Ciucci P
27. Bertorelle G
(2017) Survival and divergence in a small group: the extraordinary genomic history of the endangered apennine Brown bear stragglers
PNAS 114:E9589–E9597.

https://doi.org/10.1073/pnas.1707279114
- PubMed
- Google Scholar
1. Benjamin W
2. Sali A
(2014) Comparative protein structure modeling using MODELLER
Current Protocols In Bioinformatics 47:5–6.

https://doi.org/10.1002/cpbi.3
- Google Scholar
1. Bojarska K
2. Selva N
(2012) Spatial patterns in Brown bear ursus arctos diet: the role of geographical and environmental factors
Mammal Review 42:120–143.

https://doi.org/10.1111/j.1365-2907.2011.00192.x
- Google Scholar
1. Boratyn GM
2. Camacho C
3. Cooper PS
4. Coulouris G
5. Fong A
6. Ma N
7. Madden TL
8. Matten WT
9. McGinnis SD
10. Merezhuk Y
11. Raytselis Y
12. Sayers EW
13. Tao T
14. Ye J
15. Zaretskaya I
(2013) BLAST: a more efficient report with usability improvements
Nucleic Acids Research 41:W29–W33.

https://doi.org/10.1093/nar/gkt282
- PubMed
- Google Scholar
(2010) Comprehensive analysis reveals dynamic and evolutionary plasticity of rab GTPases and membrane traffic in tetrahymena thermophila
PLOS Genetics 6:e1001155.

https://doi.org/10.1371/journal.pgen.1001155
- PubMed
- Google Scholar
1. Brodsky FM
(1985) Clathrin structure characterized with monoclonal antibodies. I. analysis of multiple antigenic sites
The Journal of Cell Biology 101:2047–2054.

https://doi.org/10.1083/jcb.101.6.2047
- PubMed
- Google Scholar
1. Brodsky FM
(2012) Diversity of clathrin function: new tricks for an old protein
Annual Review of Cell and Developmental Biology 28:309–336.

https://doi.org/10.1146/annurev-cellbio-101011-155716
- PubMed
- Google Scholar
1. Broushaki F
2. Thomas MG
3. Link V
4. López S
5. van Dorp L
6. Kirsanow K
7. Hofmanová Z
8. Diekmann Y
9. Cassidy LM
10. Díez-Del-Molino D
11. Kousathanas A
12. Sell C
13. Robson HK
14. Martiniano R
15. Blöcher J
16. Scheu A
17. Kreutzer S
18. Bollongino R
19. Bobo D
20. Davudi H
21. Munoz O
22. Currat M
23. Abdi K
24. Biglari F
25. Craig OE
26. Bradley DG
27. Shennan S
28. Veeramah K
29. Mashkour M
30. Wegmann D
31. Hellenthal G
32. Burger J
(2016) Early neolithic genomes from the eastern fertile crescent
Science 353:499–503.

https://doi.org/10.1126/science.aaf7943
- PubMed
- Google Scholar
(2002) Regulated transport of the glucose transporter GLUT4
Nature Reviews Molecular Cell Biology 3:267–277.

https://doi.org/10.1038/nrm782
- PubMed
- Google Scholar
Preprint
1. Camus SM
2. Camus MD
3. Sadacca LA
4. Esk C
5. Gould GW
6. Kioumourtzoglou D
7. Bryant NJ
8. Mukherjee S
9. Brodsky FM
(2018) CHC22 clathrin diverts GLUT4 from the ER-to-Golgi intermediate compartment for intracellular sequestration
bioRxiv.

https://doi.org/10.1101/242941
- Google Scholar
1. Cann HM
2. de Toma C
3. Cazes L
4. Legrand MF
5. Morel V
6. Piouffre L
7. Bodmer J
8. Bodmer WF
9. Bonne-Tamir B
10. Cambon-Thomsen A
11. Chen Z
12. Chu J
13. Carcassi C
14. Contu L
15. Du R
16. Excoffier L
17. Ferrara GB
18. Friedlaender JS
19. Groot H
20. Gurwitz D
21. Jenkins T
22. Herrera RJ
23. Huang X
24. Kidd J
25. Kidd KK
26. Langaney A
27. Lin AA
28. Mehdi SQ
29. Parham P
30. Piazza A
31. Pistillo MP
32. Qian Y
33. Shu Q
34. Xu J
35. Zhu S
36. Weber JL
37. Greely HT
38. Feldman MW
39. Thomas G
40. Dausset J
41. Cavalli-Sforza LL
(2002) A human genome diversity cell line panel
Science 296:261–262.

https://doi.org/10.1126/science.296.5566.261b
- PubMed
- Google Scholar
1. Charlesworth D
(2006) Balancing selection and its effects on sequences in nearby genome regions
PLOS Genetics 2:e64.

https://doi.org/10.1371/journal.pgen.0020064
- PubMed
- Google Scholar
1. Dacks JB
2. Robinson MS
(2017) Outerwear through the ages: evolutionary cell biology of vesicle coats
Current Opinion in Cell Biology 47:108–116.

https://doi.org/10.1016/j.ceb.2017.04.001
- PubMed
- Google Scholar
(2011) The variant call format and VCFtools
Bioinformatics 27:2156–2158.

https://doi.org/10.1093/bioinformatics/btr330
- PubMed
- Google Scholar
1. Dannhauser PN
2. Camus SM
3. Sakamoto K
4. Sadacca LA
5. Torres JA
6. Camus MD
7. Briant K
8. Vassilopoulos S
9. Rothnie A
10. Smith CJ
11. Brodsky FM
(2017) CHC22 and CHC17 clathrins have distinct biochemical properties and display differential regulation and function
Journal of Biological Chemistry 292:20834–20844.

https://doi.org/10.1074/jbc.M117.816256
- PubMed
- Google Scholar
1. de Manuel M
2. Kuhlwilm M
3. Frandsen P
4. Sousa VC
5. Desai T
6. Prado-Martinez J
7. Hernandez-Rodriguez J
8. Dupanloup I
9. Lao O
10. Hallast P
11. Schmidt JM
12. Heredia-Genestar JM
13. Benazzo A
14. Barbujani G
15. Peter BM
16. Kuderna LF
17. Casals F
18. Angedakin S
19. Arandjelovic M
20. Boesch C
21. Kühl H
22. Vigilant L
23. Langergraber K
24. Novembre J
25. Gut M
26. Gut I
27. Navarro A
28. Carlsen F
29. Andrés AM
30. Siegismund HR
31. Scally A
32. Excoffier L
33. Tyler-Smith C
34. Castellano S
35. Xue Y
36. Hvilsom C
37. Marques-Bonet T
(2016) Chimpanzee genomic diversity reveals ancient admixture with bonobos
Science 354:477–481.

https://doi.org/10.1126/science.aag2602
- PubMed
- Google Scholar
1. Dereeper A
2. Guignon V
3. Blanc G
4. Audic S
5. Buffet S
6. Chevenet F
7. Dufayard JF
8. Guindon S
9. Lefort V
10. Lescot M
11. Claverie JM
12. Gascuel O
(2008) Phylogeny.fr: robust phylogenetic analysis for the non-specialist
Nucleic Acids Research 36:W465–W469.

https://doi.org/10.1093/nar/gkn180
- PubMed
- Google Scholar
(2011) Thousands of rab GTPases for the cell biologist
PLOS Computational Biology 7:e1002217.

https://doi.org/10.1371/journal.pcbi.1002217
- PubMed
- Google Scholar
(2015) Uncertainty in the timing of origin of animals and the limits of precision in molecular timescales
Current Biology 25:2939–2950.

https://doi.org/10.1016/j.cub.2015.09.066
- PubMed
- Google Scholar
1. Dunbar RIM
(2019)
Landscapes of Human Evolution: Contributions in Honour of John Gowlett

Fire and the biogeography of Palaeo hominins, Landscapes of Human Evolution: Contributions in Honour of John Gowlett, Oxford, ArcheoPress.
- Google Scholar
1. Elde NC
2. Malik HS
(2009) The evolutionary conundrum of pathogen mimicry
Nature Reviews Microbiology 7:787–797.

https://doi.org/10.1038/nrmicro2222
- PubMed
- Google Scholar
1. Esk C
2. Chen CY
3. Johannes L
4. Brodsky FM
(2010) The clathrin heavy chain isoform CHC22 functions in a novel endosomal sorting step
Journal of Cell Biology 188:131–144.

https://doi.org/10.1083/jcb.200908057
- PubMed
- Google Scholar
1. Ewing G
2. Hermisson J
(2010) MSMS: a coalescent simulation program including recombination, demographic structure and selection at a single locus
Bioinformatics 26:2064–2065.

https://doi.org/10.1093/bioinformatics/btq322
- PubMed
- Google Scholar
1. Fotin A
2. Cheng Y
3. Sliz P
4. Grigorieff N
5. Harrison SC
6. Kirchhausen T
7. Walz T
(2004) Molecular model for a complete clathrin lattice from electron cryomicroscopy
Nature 432:573–579.

https://doi.org/10.1038/nature03079
- PubMed
- Google Scholar
1. Fu Q
2. Li H
3. Moorjani P
4. Jay F
5. Slepchenko SM
6. Bondarev AA
7. Johnson PL
8. Aximu-Petri A
9. Prüfer K
10. de Filippo C
11. Meyer M
12. Zwyns N
13. Salazar-García DC
14. Kuzmin YV
15. Keates SG
16. Kosintsev PA
17. Razhev DI
18. Richards MP
19. Peristov NV
20. Lachmann M
21. Douka K
22. Higham TF
23. Slatkin M
24. Hublin JJ
25. Reich D
26. Kelso J
27. Viola TB
28. Pääbo S
(2014) Genome sequence of a 45,000-year-old modern human from western Siberia
Nature 514:445–449.

https://doi.org/10.1038/nature13810
- PubMed
- Google Scholar
1. Fu YX
2. Li WH
(1993)
Statistical tests of neutrality of mutations

Genetics 133:693–709.
- PubMed
- Google Scholar
(2011) Signatures of environmental genetic adaptation pinpoint pathogens as the main selective pressure through human evolution
PLOS Genetics 7:e1002355.

https://doi.org/10.1371/journal.pgen.1002355
- PubMed
- Google Scholar
(2015) Recent selective sweeps in North American Drosophila melanogaster show signatures of soft sweeps
PLOS Genetics 11:e1005004.

https://doi.org/10.1371/journal.pgen.1005004
- PubMed
- Google Scholar
1. Garud NR
2. Rosenberg NA
(2015) Enhancing the mathematical properties of new haplotype homozygosity statistics for the detection of selective sweeps
Theoretical Population Biology 102:94–101.

https://doi.org/10.1016/j.tpb.2015.04.001
- PubMed
- Google Scholar
(2017) An evolutionary balance: conservation vs innovation in ciliate membrane trafficking
Traffic 18:18–28.

https://doi.org/10.1111/tra.12450
- PubMed
- Google Scholar
(2010) New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0
Systematic Biology 59:307–321.

https://doi.org/10.1093/sysbio/syq010
- PubMed
- Google Scholar
(2009) Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data
PLOS Genetics 5:e1000695.

https://doi.org/10.1371/journal.pgen.1000695
- PubMed
- Google Scholar
1. Haak W
2. Lazaridis I
3. Patterson N
4. Rohland N
5. Mallick S
6. Llamas B
7. Brandt G
8. Nordenfelt S
9. Harney E
10. Stewardson K
11. Fu Q
12. Mittnik A
13. Bánffy E
14. Economou C
15. Francken M
16. Friederich S
17. Pena RG
18. Hallgren F
19. Khartanovich V
20. Khokhlov A
21. Kunst M
22. Kuznetsov P
23. Meller H
24. Mochalov O
25. Moiseyev V
26. Nicklisch N
27. Pichler SL
28. Risch R
29. Rojo Guerra MA
30. Roth C
31. Szécsényi-Nagy A
32. Wahl J
33. Meyer M
34. Krause J
35. Brown D
36. Anthony D
37. Cooper A
38. Alt KW
39. Reich D
(2015) Massive migration from the steppe was a source for Indo-European languages in Europe
Nature 522:207–211.

https://doi.org/10.1038/nature14317
- PubMed
- Google Scholar
1. Haga Y
2. Ishii K
3. Suzuki T
(2011) N-glycosylation is critical for the stability and intracellular trafficking of glucose transporter GLUT4
Journal of Biological Chemistry 286:31320–31327.

https://doi.org/10.1074/jbc.M111.253955
- PubMed
- Google Scholar
(2015) The importance of dietary carbohydrate in human evolution
The Quarterly Review of Biology 90:251–268.

https://doi.org/10.1086/682587
- PubMed
- Google Scholar
1. Hedges SB
2. Marin J
3. Suleski M
4. Paymer M
5. Kumar S
(2015) Tree of life reveals clock-like speciation and diversification
Molecular Biology and Evolution 32:835–845.

https://doi.org/10.1093/molbev/msv037
- PubMed
- Google Scholar
1. Hocquette JF
2. Bornes F
3. Balage M
4. Ferre P
5. Grizard J
6. Vermorel M
(1995) Glucose-transporter (GLUT4) protein content in oxidative and glycolytic skeletal muscles from calf and goat
Biochemical Journal 305 ( Pt 2:465–470.

https://doi.org/10.1042/bj3050465
- PubMed
- Google Scholar
1. Hofmanová Z
2. Kreutzer S
3. Hellenthal G
4. Sell C
5. Diekmann Y
6. Díez-Del-Molino D
7. van Dorp L
8. López S
9. Kousathanas A
10. Link V
11. Kirsanow K
12. Cassidy LM
13. Martiniano R
14. Strobel M
15. Scheu A
16. Kotsakis K
17. Halstead P
18. Triantaphyllou S
19. Kyparissi-Apostolika N
20. Urem-Kotsou D
21. Ziota C
22. Adaktylou F
23. Gopalan S
24. Bobo DM
25. Winkelbach L
26. Blöcher J
27. Unterländer M
28. Leuenberger C
29. Çilingiroğlu Ç
30. Horejs B
31. Gerritsen F
32. Shennan SJ
33. Bradley DG
34. Currat M
35. Veeramah KR
36. Wegmann D
37. Thomas MG
38. Papageorgopoulou C
39. Burger J
(2016) Early farmers from across Europe directly descended from neolithic Aegeans
PNAS 113:6886–6891.

https://doi.org/10.1073/pnas.1523951113
- PubMed
- Google Scholar
1. Inchley CE
2. Larbey CD
3. Shwan NA
4. Pagani L
5. Saag L
6. Antão T
7. Jacobs G
8. Hudjashov G
9. Metspalu E
10. Mitt M
11. Eichstaedt CA
12. Malyarchuk B
13. Derenko M
14. Wee J
15. Abdullah S
16. Ricaut FX
17. Mormina M
18. Mägi R
19. Villems R
20. Metspalu M
21. Jones MK
22. Armour JA
23. Kivisild T
(2016) Selective sweep on human amylase genes postdates the split with Neanderthals
Scientific Reports 6:37198.

https://doi.org/10.1038/srep37198
- PubMed
- Google Scholar
1. Katoh K
2. Standley DM
(2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability
Molecular Biology and Evolution 30:772–780.

https://doi.org/10.1093/molbev/mst010
- PubMed
- Google Scholar
(2002) A high-resolution recombination map of the human genome
Nature Genetics 31:241–247.

https://doi.org/10.1038/ng917
- PubMed
- Google Scholar
1. Kumar P
2. Henikoff S
3. Ng PC
(2009) Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm
Nature Protocols 4:1073–1081.

https://doi.org/10.1038/nprot.2009.86
- PubMed
- Google Scholar
(2000)
Demonstration of insulin-responsive trafficking of GLUT4 and vpTR in fibroblasts

Journal of Cell Science 113:4065–4076.
- PubMed
- Google Scholar
1. Landry JJ
2. Pyl PT
3. Rausch T
4. Zichner T
5. Tekkedil MM
6. Stütz AM
7. Jauch A
8. Aiyar RS
9. Pau G
10. Delhomme N
11. Gagneur J
12. Korbel JO
13. Huber W
14. Steinmetz LM
(2013) The genomic and transcriptomic landscape of a HeLa cell line
G3: Genes|Genomes|Genetics 3:1213–1224.

https://doi.org/10.1534/g3.113.005777
- Google Scholar
1. Lazaridis I
2. Patterson N
3. Mittnik A
4. Renaud G
5. Mallick S
6. Kirsanow K
7. Sudmant PH
8. Schraiber JG
9. Castellano S
10. Lipson M
11. Berger B
12. Economou C
13. Bollongino R
14. Fu Q
15. Bos KI
16. Nordenfelt S
17. Li H
18. de Filippo C
19. Prüfer K
20. Sawyer S
21. Posth C
22. Haak W
23. Hallgren F
24. Fornander E
25. Rohland N
26. Delsate D
27. Francken M
28. Guinet JM
29. Wahl J
30. Ayodo G
31. Babiker HA
32. Bailliet G
33. Balanovska E
34. Balanovsky O
35. Barrantes R
36. Bedoya G
37. Ben-Ami H
38. Bene J
39. Berrada F
40. Bravi CM
41. Brisighelli F
42. Busby GB
43. Cali F
44. Churnosov M
45. Cole DE
46. Corach D
47. Damba L
48. van Driem G
49. Dryomov S
50. Dugoujon JM
51. Fedorova SA
52. Gallego Romero I
53. Gubina M
54. Hammer M
55. Henn BM
56. Hervig T
57. Hodoglugil U
58. Jha AR
59. Karachanak-Yankova S
60. Khusainova R
61. Khusnutdinova E
62. Kittles R
63. Kivisild T
64. Klitz W
65. Kučinskas V
66. Kushniarevich A
67. Laredj L
68. Litvinov S
69. Loukidis T
70. Mahley RW
71. Melegh B
72. Metspalu E
73. Molina J
74. Mountain J
75. Näkkäläjärvi K
76. Nesheva D
77. Nyambo T
78. Osipova L
79. Parik J
80. Platonov F
81. Posukh O
82. Romano V
83. Rothhammer F
84. Rudan I
85. Ruizbakiev R
86. Sahakyan H
87. Sajantila A
88. Salas A
89. Starikovskaya EB
90. Tarekegn A
91. Toncheva D
92. Turdikulova S
93. Uktveryte I
94. Utevska O
95. Vasquez R
96. Villena M
97. Voevoda M
98. Winkler CA
99. Yepiskoposyan L
100. Zalloua P
101. Zemunik T
102. Cooper A
103. Capelli C
104. Thomas MG
105. Ruiz-Linares A
106. Tishkoff SA
107. Singh L
108. Thangaraj K
109. Villems R
110. Comas D
111. Sukernik R
112. Metspalu M
113. Meyer M
114. Eichler EE
115. Burger J
116. Slatkin M
117. Pääbo S
118. Kelso J
119. Reich D
120. Krause J
(2014) Ancient human genomes suggest three ancestral populations for present-day europeans
Nature 513:409–413.

https://doi.org/10.1038/nature13673
- PubMed
- Google Scholar
(2001) Studying protein dynamics in living cells
Nature Reviews Molecular Cell Biology 2:444–456.

https://doi.org/10.1038/35073068
- PubMed
- Google Scholar
1. Liu SH
2. Towler MC
3. Chen E
4. Chen CY
5. Song W
6. Apodaca G
7. Brodsky FM
(2001) A novel clathrin homolog that co-distributes with cytoskeletal components functions in the trans-Golgi network
The EMBO Journal 20:272–284.

https://doi.org/10.1093/emboj/20.1.272
- PubMed
- Google Scholar
1. Liu S
2. Lorenzen ED
3. Fumagalli M
4. Li B
5. Harris K
6. Xiong Z
7. Zhou L
8. Korneliussen TS
9. Somel M
10. Babbitt C
11. Wray G
12. Li J
13. He W
14. Wang Z
15. Fu W
16. Xiang X
17. Morgan CC
18. Doherty A
19. O'Connell MJ
20. McInerney JO
21. Born EW
22. Dalén L
23. Dietz R
24. Orlando L
25. Sonne C
26. Zhang G
27. Nielsen R
28. Willerslev E
29. Wang J
(2014) Population genomics reveal recent speciation and rapid evolutionary adaptation in polar bears
Cell 157:785–794.

https://doi.org/10.1016/j.cell.2014.03.054
- PubMed
- Google Scholar
1. Löytynoja A
2. Goldman N
(2005) An algorithm for progressive multiple alignment of sequences with insertions
PNAS 102:10557–10562.

https://doi.org/10.1073/pnas.0409137102
- PubMed
- Google Scholar
1. Mallick S
2. Li H
3. Lipson M
4. Mathieson I
5. Gymrek M
6. Racimo F
7. Zhao M
8. Chennagiri N
9. Nordenfelt S
10. Tandon A
11. Skoglund P
12. Lazaridis I
13. Sankararaman S
14. Fu Q
15. Rohland N
16. Renaud G
17. Erlich Y
18. Willems T
19. Gallo C
20. Spence JP
21. Song YS
22. Poletti G
23. Balloux F
24. van Driem G
25. de Knijff P
26. Romero IG
27. Jha AR
28. Behar DM
29. Bravi CM
30. Capelli C
31. Hervig T
32. Moreno-Estrada A
33. Posukh OL
34. Balanovska E
35. Balanovsky O
36. Karachanak-Yankova S
37. Sahakyan H
38. Toncheva D
39. Yepiskoposyan L
40. Tyler-Smith C
41. Xue Y
42. Abdullah MS
43. Ruiz-Linares A
44. Beall CM
45. Di Rienzo A
46. Jeong C
47. Starikovskaya EB
48. Metspalu E
49. Parik J
50. Villems R
51. Henn BM
52. Hodoglugil U
53. Mahley R
54. Sajantila A
55. Stamatoyannopoulos G
56. Wee JT
57. Khusainova R
58. Khusnutdinova E
59. Litvinov S
60. Ayodo G
61. Comas D
62. Hammer MF
63. Kivisild T
64. Klitz W
65. Winkler CA
66. Labuda D
67. Bamshad M
68. Jorde LB
69. Tishkoff SA
70. Watkins WS
71. Metspalu M
72. Dryomov S
73. Sukernik R
74. Singh L
75. Thangaraj K
76. Pääbo S
77. Kelso J
78. Patterson N
79. Reich D
(2016) The simons genome diversity project: 300 genomes from 142 diverse populations
Nature 538:201–206.

https://doi.org/10.1038/nature18964
- PubMed
- Google Scholar
1. Mathieson S
2. Mathieson I
(2018) FADS1 and the timing of human adaptation to agriculture
Molecular Biology and Evolution 35:2957–2970.

https://doi.org/10.1093/molbev/msy180
- PubMed
- Google Scholar
1. McKenna A
2. Hanna M
3. Banks E
4. Sivachenko A
5. Cibulskis K
6. Kernytsky A
7. Garimella K
8. Altshuler D
9. Gabriel S
10. Daly M
11. DePristo MA
(2010) The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data
Genome Research 20:1297–1303.

https://doi.org/10.1101/gr.107524.110
- PubMed
- Google Scholar
1. Messer PW
2. Petrov DA
(2013) Population genomics of rapid adaptation by soft selective sweeps
Trends in Ecology & Evolution 28:659–669.

https://doi.org/10.1016/j.tree.2013.08.003
- PubMed
- Google Scholar
1. Meyer M
2. Kircher M
3. Gansauge MT
4. Li H
5. Racimo F
6. Mallick S
7. Schraiber JG
8. Jay F
9. Prüfer K
10. de Filippo C
11. Sudmant PH
12. Alkan C
13. Fu Q
14. Do R
15. Rohland N
16. Tandon A
17. Siebauer M
18. Green RE
19. Bryc K
20. Briggs AW
21. Stenzel U
22. Dabney J
23. Shendure J
24. Kitzman J
25. Hammer MF
26. Shunkov MV
27. Derevianko AP
28. Patterson N
29. Andrés AM
30. Eichler EE
31. Slatkin M
32. Reich D
33. Kelso J
34. Pääbo S
(2012) A high-coverage genome sequence from an archaic denisovan individual
Science 338:222–226.

https://doi.org/10.1126/science.1224344
- PubMed
- Google Scholar
1. Nahorski MS
2. Al-Gazali L
3. Hertecant J
4. Owen DJ
5. Borner GH
6. Chen YC
7. Benn CL
8. Carvalho OP
9. Shaikh SS
10. Phelan A
11. Robinson MS
12. Royle SJ
13. Woods CG
(2015) A novel disorder reveals clathrin heavy chain-22 is essential for human pain and touch development
Brain 138:2147–2160.

https://doi.org/10.1093/brain/awv149
- PubMed
- Google Scholar
1. Näthke IS
2. Heuser J
3. Lupas A
4. Stock J
5. Turck CW
6. Brodsky FM
(1992) Folding and trimerization of clathrin subunits at the triskelion hub
Cell 68:899–910.

https://doi.org/10.1016/0092-8674(92)90033-9
- PubMed
- Google Scholar
1. Nei M
(1973) Analysis of gene diversity in subdivided populations
PNAS 70:3321–3323.

https://doi.org/10.1073/pnas.70.12.3321
- PubMed
- Google Scholar
(2014) Derived immune and ancestral pigmentation alleles in a 7,000-year-old mesolithic European
Nature 507:225–228.

https://doi.org/10.1038/nature12960
- PubMed
- Google Scholar
1. Ostlund G
2. Schmitt T
3. Forslund K
4. Köstler T
5. Messina DN
6. Roopra S
7. Frings O
8. Sonnhammer EL
(2010) InParanoid 7: new algorithms and tools for eukaryotic orthology analysis
Nucleic Acids Research 38:D196–D203.

https://doi.org/10.1093/nar/gkp931
- PubMed
- Google Scholar
1. Paradis E
(2010) Pegas: an R package for population genetics with an integrated-modular approach
Bioinformatics 26:419–420.

https://doi.org/10.1093/bioinformatics/btp696
- PubMed
- Google Scholar
1. Pettersen EF
2. Goddard TD
3. Huang CC
4. Couch GS
5. Greenblatt DM
6. Meng EC
7. Ferrin TE
(2004) UCSF chimera--a visualization system for exploratory research and analysis
Journal of Computational Chemistry 25:1605–1612.

https://doi.org/10.1002/jcc.20084
- PubMed
- Google Scholar
1. Prado-Martinez J
2. Sudmant PH
3. Kidd JM
4. Li H
5. Kelley JL
6. Lorente-Galdos B
7. Veeramah KR
8. Woerner AE
9. O'Connor TD
10. Santpere G
11. Cagan A
12. Theunert C
13. Casals F
14. Laayouni H
15. Munch K
16. Hobolth A
17. Halager AE
18. Malig M
19. Hernandez-Rodriguez J
20. Hernando-Herraez I
21. Prüfer K
22. Pybus M
23. Johnstone L
24. Lachmann M
25. Alkan C
26. Twigg D
27. Petit N
28. Baker C
29. Hormozdiari F
30. Fernandez-Callejo M
31. Dabad M
32. Wilson ML
33. Stevison L
34. Camprubí C
35. Carvalho T
36. Ruiz-Herrera A
37. Vives L
38. Mele M
39. Abello T
40. Kondova I
41. Bontrop RE
42. Pusey A
43. Lankester F
44. Kiyang JA
45. Bergl RA
46. Lonsdorf E
47. Myers S
48. Ventura M
49. Gagneux P
50. Comas D
51. Siegismund H
52. Blanc J
53. Agueda-Calpena L
54. Gut M
55. Fulton L
56. Tishkoff SA
57. Mullikin JC
58. Wilson RK
59. Gut IG
60. Gonder MK
61. Ryder OA
62. Hahn BH
63. Navarro A
64. Akey JM
65. Bertranpetit J
66. Reich D
67. Mailund T
68. Schierup MH
69. Hvilsom C
70. Andrés AM
71. Wall JD
72. Bustamante CD
73. Hammer MF
74. Eichler EE
75. Marques-Bonet T
(2013) Great ape genetic diversity and population history
Nature 499:471–475.

https://doi.org/10.1038/nature12228
- PubMed
- Google Scholar
1. Prüfer K
2. Racimo F
3. Patterson N
4. Jay F
5. Sankararaman S
6. Sawyer S
7. Heinze A
8. Renaud G
9. Sudmant PH
10. de Filippo C
11. Li H
12. Mallick S
13. Dannemann M
14. Fu Q
15. Kircher M
16. Kuhlwilm M
17. Lachmann M
18. Meyer M
19. Ongyerth M
20. Siebauer M
21. Theunert C
22. Tandon A
23. Moorjani P
24. Pickrell J
25. Mullikin JC
26. Vohr SH
27. Green RE
28. Hellmann I
29. Johnson PL
30. Blanche H
31. Cann H
32. Kitzman JO
33. Shendure J
34. Eichler EE
35. Lein ES
36. Bakken TE
37. Golovanova LV
38. Doronichev VB
39. Shunkov MV
40. Derevianko AP
41. Viola B
42. Slatkin M
43. Reich D
44. Kelso J
45. Pääbo S
(2014) The complete genome sequence of a Neanderthal from the Altai mountains
Nature 505:43–49.

https://doi.org/10.1038/nature12886
- PubMed
- Google Scholar
1. Raghavan M
2. Skoglund P
3. Graf KE
4. Metspalu M
5. Albrechtsen A
6. Moltke I
7. Rasmussen S
8. Stafford TW
9. Orlando L
10. Metspalu E
11. Karmin M
12. Tambets K
13. Rootsi S
14. Mägi R
15. Campos PF
16. Balanovska E
17. Balanovsky O
18. Khusnutdinova E
19. Litvinov S
20. Osipova LP
21. Fedorova SA
22. Voevoda MI
23. DeGiorgio M
24. Sicheritz-Ponten T
25. Brunak S
26. Demeshchenko S
27. Kivisild T
28. Villems R
29. Nielsen R
30. Jakobsson M
31. Willerslev E
(2014) Upper palaeolithic Siberian genome reveals dual ancestry of native Americans
Nature 505:87–91.

https://doi.org/10.1038/nature12736
- PubMed
- Google Scholar
(2014) A note on exact differences between beta distributions in genomic (Methylation) studies
PLOS ONE 9:e97349.

https://doi.org/10.1371/journal.pone.0097349
- PubMed
- Google Scholar
1. Read TD
2. Petit RA
3. Joseph SJ
4. Alam MT
5. Weil MR
6. Ahmad M
7. Bhimani R
8. Vuong JS
9. Haase CP
10. Webb DH
11. Tan M
12. Dove ADM
(2017) Draft sequencing and assembly of the genome of the world's largest fish, the whale shark: Rhincodon typus Smith 1828
BMC Genomics 18:532.

https://doi.org/10.1186/s12864-017-3926-9
- PubMed
- Google Scholar
(1983)
Estimation of the coancestry coefficient: basis for a short-term genetic distance

Genetics 105:767–779.
- PubMed
- Google Scholar
1. Riddle MR
2. Aspiras AC
3. Gaudenz K
4. Peuß R
5. Sung JY
6. Martineau B
7. Peavey M
8. Box AC
9. Tabin JA
10. McGaugh S
11. Borowsky R
12. Tabin CJ
13. Rohner N
(2018) Insulin resistance in cavefish as an adaptation to a nutrient-limited environment
Nature 555:647–651.

https://doi.org/10.1038/nature26136
- PubMed
- Google Scholar
1. Rout MP
2. Field MC
(2017) The evolution of organellar coat complexes and organization of the eukaryotic cell
Annual Review of Biochemistry 86:637–657.

https://doi.org/10.1146/annurev-biochem-061516-044643
- PubMed
- Google Scholar
1. Seguin-Orlando A
2. Korneliussen TS
3. Sikora M
4. Malaspinas AS
5. Manica A
6. Moltke I
7. Albrechtsen A
8. Ko A
9. Margaryan A
10. Moiseyev V
11. Goebel T
12. Westaway M
13. Lambert D
14. Khartanovich V
15. Wall JD
16. Nigst PR
17. Foley RA
18. Lahr MM
19. Nielsen R
20. Orlando L
21. Willerslev E
(2014) Paleogenomics. genomic structure in Europeans dating back at least 36,200 years
Science 346:1113–1118.

https://doi.org/10.1126/science.aaa0114
- PubMed
- Google Scholar
(2014) Evidence for the repeated use of a central hearth at middle pleistocene (300 ky ago) Qesem Cave, Israel
Journal of Archaeological Science 44:12–21.

https://doi.org/10.1016/j.jas.2013.11.015
- Google Scholar
1. Shepherd PR
2. Kahn BB
(1999) Glucose transporters and insulin action--implications for insulin resistance and diabetes mellitus
New England Journal of Medicine 341:248–257.

https://doi.org/10.1056/NEJM199907223410406
- PubMed
- Google Scholar
(2012) Origins and genetic legacy of neolithic farmers and hunter-gatherers in Europe
Science 336:466–469.

https://doi.org/10.1126/science.1216304
- PubMed
- Google Scholar
1. Skoglund P
2. Malmström H
3. Omrak A
4. Raghavan M
5. Valdiosera C
6. Günther T
7. Hall P
8. Tambets K
9. Parik J
10. Sjögren KG
11. Apel J
12. Willerslev E
13. Storå J
14. Götherström A
15. Jakobsson M
(2014) Genomic diversity and admixture differs for stone-age Scandinavian foragers and farmers
Science 344:747–750.

https://doi.org/10.1126/science.1253448
- PubMed
- Google Scholar
1. Speir ML
2. Zweig AS
3. Rosenbloom KR
4. Raney BJ
5. Paten B
6. Nejad P
7. Lee BT
8. Learned K
9. Karolchik D
10. Hinrichs AS
11. Heitner S
12. Harte RA
13. Haeussler M
14. Guruvadoo L
15. Fujita PA
16. Eisenhart C
17. Diekhans M
18. Clawson H
19. Casper J
20. Barber GP
21. Haussler D
22. Kuhn RM
23. Kent WJ
(2016) The UCSC genome browser database: 2016 update
Nucleic Acids Research 44:D717–D725.

https://doi.org/10.1093/nar/gkv1275
- PubMed
- Google Scholar
1. Stöver BC
2. Müller KF
(2010) TreeGraph 2: combining and visualizing evidence from different phylogenetic analyses
BMC Bioinformatics 11:7.

https://doi.org/10.1186/1471-2105-11-7
- PubMed
- Google Scholar
(2006) PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments
Nucleic Acids Research 34:W609–W612.

https://doi.org/10.1093/nar/gkl315
- PubMed
- Google Scholar
1. Tajima F
(1989)
Statistical method for testing the neutral mutation hypothesis by DNA polymorphism

Genetics 123:585–595.
- PubMed
- Google Scholar
1. Teixeira JC
2. de Filippo C
3. Weihmann A
4. Meneu JR
5. Racimo F
6. Dannemann M
7. Nickel B
8. Fischer A
9. Halbwax M
10. Andre C
11. Atencia R
12. Meyer M
13. Parra G
14. Pääbo S
15. Andrés AM
(2015) Long-Term balancing selection in LAD1 maintains a missense Trans-Species polymorphism in humans, chimpanzees, and bonobos
Molecular Biology and Evolution 32:1186–1196.

https://doi.org/10.1093/molbev/msv007
- PubMed
- Google Scholar
1. Towler MC
2. Gleeson PA
3. Hoshino S
4. Rahkila P
5. Manalo V
6. Ohkoshi N
7. Ordahl C
8. Parton RG
9. Brodsky FM
(2004) Clathrin isoform CHC22, a component of neuromuscular and myotendinous junctions, binds sorting nexin 5 and has increased expression during myogenesis and muscle regeneration
Molecular Biology of the Cell 15:3181–3195.

https://doi.org/10.1091/mbc.e04-03-0249
- PubMed
- Google Scholar
1. Trefely S
2. Khoo PS
3. Krycer JR
4. Chaudhuri R
5. Fazakerley DJ
6. Parker BL
7. Sultani G
8. Lee J
9. Stephan JP
10. Torres E
11. Jung K
12. Kuijl C
13. James DE
14. Junutula JR
15. Stöckli J
(2015) Kinome screen identifies PFKFB3 and glucose metabolism as important regulators of the Insulin/Insulin-like growth factor (IGF)-1 signaling pathway
Journal of Biological Chemistry 290:25834–25846.

https://doi.org/10.1074/jbc.M115.658815
- PubMed
- Google Scholar
1. Vassilopoulos S
2. Esk C
3. Hoshino S
4. Funke BH
5. Chen CY
6. Plocik AM
7. Wright WE
8. Kucherlapati R
9. Brodsky FM
(2009) A role for the CHC22 clathrin heavy-chain isoform in human glucose metabolism
Science 324:1192–1196.

https://doi.org/10.1126/science.1171529
- PubMed
- Google Scholar
1. Venkatesh B
2. Lee AP
3. Ravi V
4. Maurya AK
5. Lian MM
6. Swann JB
7. Ohta Y
8. Flajnik MF
9. Sutoh Y
10. Kasahara M
11. Hoon S
12. Gangu V
13. Roy SW
14. Irimia M
15. Korzh V
16. Kondrychyn I
17. Lim ZW
18. Tay BH
19. Tohari S
20. Kong KW
21. Ho S
22. Lorente-Galdos B
23. Quilez J
24. Marques-Bonet T
25. Raney BJ
26. Ingham PW
27. Tay A
28. Hillier LW
29. Minx P
30. Boehm T
31. Wilson RK
32. Brenner S
33. Warren WC
(2014) Elephant shark genome provides unique insights into gnathostome evolution
Nature 505:174–179.

https://doi.org/10.1038/nature12826
- PubMed
- Google Scholar
(2005) Clathrin heavy and light chain isoforms originated by independent mechanisms of gene duplication during chordate evolution
PNAS 102:7209–7214.

https://doi.org/10.1073/pnas.0502058102
- PubMed
- Google Scholar
1. Watterson GA
(1975) On the number of segregating sites in genetical models without recombination
Theoretical Population Biology 7:256–276.

https://doi.org/10.1016/0040-5809(75)90020-9
- PubMed
- Google Scholar
(2005) New faces of the familiar clathrin lattice
Traffic 6:346–350.

https://doi.org/10.1111/j.1600-0854.2005.00277.x
- PubMed
- Google Scholar
1. Yang Z
(2007) PAML 4: phylogenetic analysis by maximum likelihood
Molecular Biology and Evolution 24:1586–1591.

https://doi.org/10.1093/molbev/msm088
- PubMed
- Google Scholar
1. Yates A
2. Akanni W
3. Amode MR
4. Barrell D
5. Billis K
6. Carvalho-Silva D
7. Cummins C
8. Clapham P
9. Fitzgerald S
10. Gil L
11. Girón CG
12. Gordon L
13. Hourlier T
14. Hunt SE
15. Janacek SH
16. Johnson N
17. Juettemann T
18. Keenan S
19. Lavidas I
20. Martin FJ
21. Maurel T
22. McLaren W
23. Murphy DN
24. Nag R
25. Nuhn M
26. Parker A
27. Patricio M
28. Pignatelli M
29. Rahtz M
30. Riat HS
31. Sheppard D
32. Taylor K
33. Thormann A
34. Vullo A
35. Wilder SP
36. Zadissa A
37. Birney E
38. Harrow J
39. Muffato M
40. Perry E
41. Ruffier M
42. Spudich G
43. Trevanion SJ
44. Cunningham F
45. Aken BL
46. Zerbino DR
47. Flicek P
(2016) Ensembl 2016
Nucleic Acids Research 44:D710–D716.

https://doi.org/10.1093/nar/gkv1157
- PubMed
- Google Scholar
1. Ybe JA
2. Brodsky FM
3. Hofmann K
4. Lin K
5. Liu SH
6. Chen L
7. Earnest TN
8. Fletterick RJ
9. Hwang PK
(1999) Clathrin self-assembly is mediated by a tandemly repeated superhelix
Nature 399:371–375.

https://doi.org/10.1038/20708
- PubMed
- Google Scholar
1. Yuan Y
2. Zhang P
3. Wang K
4. Liu M
5. Li J
6. Zheng J
7. Wang D
8. Xu W
9. Lin M
10. Dong L
11. Zhu C
12. Qiu Q
13. Li S
(2018) Genome sequence of the freshwater Yangtze finless porpoise
Genes 9:213.

https://doi.org/10.3390/genes9040213
- Google Scholar

Article and author information

Author details

Matteo Fumagalli
1. Department of Life Sciences, Imperial College London, Ascot, United Kingdom
2. Research Department of Structural and Molecular Biology, Division of Biosciences, University College London, London, United Kingdom
3. Research Department of Genetics, Evolution and Environment, Division of Biosciences, University College London, London, United Kingdom
4. UCL Genetics Institute, University College London, London, United Kingdom
Contribution
Formal analysis, Investigation, Visualization, Methodology, Writing—review and editing

Contributed equally with
Stephane M Camus and Yoan Diekmann

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0002-4084-2953
Stephane M Camus

Research Department of Structural and Molecular Biology, Division of Biosciences, University College London, London, United Kingdom

Contribution
Formal analysis, Investigation, Visualization, Methodology, Writing—review and editing

Contributed equally with
Matteo Fumagalli and Yoan Diekmann

Competing interests
No competing interests declared
Yoan Diekmann
1. Research Department of Genetics, Evolution and Environment, Division of Biosciences, University College London, London, United Kingdom
2. UCL Genetics Institute, University College London, London, United Kingdom
Contribution
Formal analysis, Investigation, Visualization, Methodology, Writing—review and editing

Contributed equally with
Matteo Fumagalli and Stephane M Camus

Competing interests
No competing interests declared
Alice Burke

Research Department of Structural and Molecular Biology, Division of Biosciences, University College London, London, United Kingdom

Contribution
Investigation, Writing—review and editing

Competing interests
No competing interests declared
Marine D Camus

Research Department of Structural and Molecular Biology, Division of Biosciences, University College London, London, United Kingdom

Contribution
Investigation, Writing—review and editing

Competing interests
No competing interests declared
Paul J Norman
1. Division of Bioinformatics and Personalized Medicine, University of Colorado, Aurora, United States
2. Department of Microbiology and Immunology, University of Colorado, Aurora, United States
Contribution
Resources, Formal analysis, Writing—review and editing

Competing interests
No competing interests declared
Agnel Joseph

Institute of Structural and Molecular Biology, Birkbeck College and University College London, London, United Kingdom

Contribution
Investigation, Visualization, Writing—review and editing

Competing interests
No competing interests declared
Laurent Abi-Rached

Aix-Marseille Univ, IRD, MEPHI, IHU Méditerranée Infection, CNRS, Marseille, France

Contribution
Resources, Writing—review and editing

Competing interests
No competing interests declared
Andrea Benazzo

Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy

Contribution
Resources, Data curation, Writing—review and editing

Competing interests
No competing interests declared
Rita Rasteiro

School of Biological Sciences, University of Bristol, Bristol, United Kingdom

Contribution
Resources, Data curation, Writing—review and editing

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0002-4217-3060
Iain Mathieson

Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States

Contribution
Formal analysis, Investigation, Writing—review and editing

Competing interests
No competing interests declared
Maya Topf

Institute of Structural and Molecular Biology, Birkbeck College and University College London, London, United Kingdom

Contribution
Supervision, Writing—review and editing

Competing interests
No competing interests declared
Peter Parham
1. Department of Structural Biology, Stanford University, Stanford, CA, United States
2. Department of Microbiology and Immunology, Stanford University, Stanford, CA, United States
Contribution
Supervision, Writing—review and editing

Competing interests
No competing interests declared
Mark G Thomas
1. Research Department of Genetics, Evolution and Environment, Division of Biosciences, University College London, London, United Kingdom
2. UCL Genetics Institute, University College London, London, United Kingdom
Contribution
Conceptualization, Supervision, Writing—review and editing

Competing interests
No competing interests declared
Frances M Brodsky
1. Research Department of Structural and Molecular Biology, Division of Biosciences, University College London, London, United Kingdom
2. Institute of Structural and Molecular Biology, Birkbeck College and University College London, London, United Kingdom
Contribution
Conceptualization, Supervision, Funding acquisition, Writing—original draft, Project administration

For correspondence
f.brodsky@ucl.ac.uk

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0002-1334-9258

Funding

National Institutes of Health (DK095663)

Matteo Fumagalli
Frances M Brodsky

National Institutes of Health (AI090905)

Peter Parham

Medical Research Council (MR/S008144/1)

Frances M Brodsky

Wellcome Trust (107858/Z/15/Z)

Frances M Brodsky

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was supported by grants from the National Institutes of Health (USA) DK095663 (FMB and MF) and AI090905 (PP) and Medical Research Council grant MR/S008144/1 and Wellcome Trust Investigator Award 107858/Z/15/Z to FMB. We are grateful to Libby Guethlein for helpful discussions on genomic data retrieval and processing. Genomic sequence data and cell culture studies reported here utilize the HeLa cell line. Henrietta Lacks, and the HeLa cell line that was established from her tumor cells, have made significant contributions to scientific progress and advances in human health.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.