The evolutionary history of human spindle genes includes back-and-forth gene flow with Neandertals

  1. Stéphane Peyrégne (corresponding author)
  2. Janet Kelso
  3. Benjamin M Peter
  4. Svante Pääbo (corresponding author)
  1. Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Germany
5 figures, 8 tables and 1 additional file

Figures

Figure 1 with 1 supplement
Genomic regions around spindle genes where archaic humans fall outside the modern human variation.

Each panel corresponds to the region around the missense change(s) (red stars) in a spindle gene. The grey boxes correspond to exons. The curves give the posterior probability (computed as in Peyrégn…

Figure 1—figure supplement 1
Genomic regions where archaic humans fall outside the modern human variation, identified using the most recent deCODE recombination map (Halldorsson et al., 2019).

Figure 2
Evidence for selection in the spindle genes with age estimates of these substitutions.

(A) The genetic length of segments around the missense substitutions where the Altai Neandertal and Denisova 3 fall outside the human variation (Figure 1) using the African-American map, AAmap, or …

Figure 3
A modern human-like haplotype in some Neandertals.

Genotypes from 13 archaic individuals (y-axis) are shown in a region around the two missense changes (dots) in KNL1. We only show positions (x-axis) that are derived in all Luhya and Yoruba …

Figure 4 with 1 supplement
The modern human-like KNL1 haplotype in Neandertals.

(A) Pairwise differences between two high coverage Neandertal genomes (Chagyrskaya 8 and Altai Neandertal (Denisova 5)) in non-overlapping sliding windows of 276 kb (histogram) and in the KNL1

Figure 4—figure supplement 1
Genotypes of the 12 non-African individuals that inherited one copy of KNL1 from archaic humans.

We show positions within 40 kb downstream of the modern-like KNL1 haplotype identified in Chagyrskaya 8 to highlight 7 positions (red marks) where only those 12 individuals (out of 929 individuals …

Figure 5
Schematic illustration of the history of KNL1.

The tree delineated in black corresponds to the average relationship between the modern and archaic human populations. The inner colored trees correspond to the relationship of different KNL1

Tables

Appendix 1—table 1
Location and predicted effects of the studied amino acid changes in spindle proteins, as reported in dbNSFP version 4.2 (48).

The predictions are for the ancestral variants. We put “damaging” in between quotation marks as the ancestral versions of ATRX and KATNA1 are unlikely to be damaging (as the ancestral amino acid …

| protein | position | amino acid change | protein domain (Uniprot) | effect prediction for the ancestral variant (MutPred) | potentially “damaging” ancestral variant according to: |
| --- | --- | --- | --- | --- | --- |
| ATRX | 475 | D -> H | - | - | FATHMM; M-CAP |
| KATNA1 | 343 | A -> T | - | - | FATHMM; PrimateAI |
| KIF18A | 67 | R -> K | kinesin motor (11-355) | Loss of methylation (P=0.0087) | - |
| KNL1 | 159 | H -> R | interaction domain with BUB1 and BUB1B (1-728) | - | - |
| KNL1 | 1,086 | G -> S | 2 × 104 AA approximate repeats (855–1201) | Loss of phosphorylation (P=0.0382) | - |
| NEK6 | 291 | D -> H | protein kinase domain (45-310) | - | - |
| RSPH1 | 213 | K -> Q | - | - | - |
| SPAG5 | 43 | P -> S | - | Loss of phosphorylation (P=0.0244); Gain of catalytic residue (P=0.0179) | - |
| SPAG5 | 162 | E -> G | - | - | - |
| SPAG5 | 410 | D -> H | - | - | - |
| STARD9 | 3,925 | A -> T | - | - | - |
Appendix 1—table 2
Deleteriousness and conservation scores at the studied positions with missense changes in spindle genes, as reported in dbNSFP version 4.2 (48).

A high CADD score indicates that the ancestral variant is likely to be deleterious (Kircher et al., 2014; Rentzsch et al., 2019; Pollard et al., 2010) and a high conservation score means that the …

| gene | position (hg19) | rs ID | corresponding amino acid change | Deleteriousness: CADD score (hg19) | Conservation: GERP++ RS score | Conservation: phyloP 100way vertebrate score | Conservation: phastCons 100way vertebrate score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| KATNA1 | 6–149,918,766 | rs73781249 | A343T | 2.051033 | 5.48 | 4.834 | 1.000 |
| SPAG5 | 17–26,925,570 | NA | P43S | –0.425670 | –3.57 | –1.404 | 0.000 |
| SPAG5 | 17–26,919,777 | NA | E162G | 0.296317 | 3.66 | 0.280 | 0.001 |
| SPAG5 | 17–26,919,034 | NA | D410H | 1.062743 | 5.4 | 2.032 | 1.000 |
| KNL1 | 15–40,912,860 | rs755472529 | H159R | 0.475454 | 3.7 | –0.016 | 0.001 |
| KNL1 | 15–40,915,640 | NA | G1086S | 0.801787 | 4.12 | 1.026 | 0.054 |
| KIF18A | 11–28,119,295 | rs775297730 | R67K | 1.134589 | 2.62 | 0.525 | 0.845 |
| NEK6 | 9–127,113,155 | rs146443565 | D291H | –0.293112 | –1.56 | 3.284 | 1.000 |
| ATRX | X-76,939,325 | rs146863015 | D475H | –0.141606 | 3.64 | 0.791 | 0.840 |
| RSPH1 | 21–43,897,491 | rs146298259 | K213Q | –0.061053 | 1.81 | 0.672 | 0.079 |
| STARD9 | 15–42,985,549 | rs573215252 | A3925T | –0.351117 | –2.51 | 0.047 | 0.000 |
Appendix 2—table 1
Age estimates of the missense substitutions in spindle genes.

The ages were estimated in the regions where the Altai Neandertal and Denisova 3 genomes fall outside the human variation (intersection of the regions identified with the African-American and deCODE …

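| Gene | chromosome | Region (hg19) | Lower age (kya) | Upper age (kya) |
| --- | --- | --- | --- | --- |
| ATRX | X | 76,703,773–77,246,471 | NA | NA |
| KATNA1 | 6 | 149,840,973–149,930,425 | 863 | 1,329 |
| KIF18A | 11 | 28,018,167–28,304,293 | 843 | 1,006 |
| KNL1 | 15 | 40,898,141–40,948,306 | 1,027 | 1,690 |
| NEK6 | 9 | 127,109,510–127,113,614 | NA | NA |
| RSPH1 | 21 | 43,897,417–43,897,549 | NA | NA |
| SPAG5 | 17 | 26,875,942–27,045,524 | 677 | 796 |
| STARD9 | 15 | 42,941,540–42,989,160 | 947 | 1,401 |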
Appendix 3—table 1
Coverage depth in archaic human genomes at positions with modern human-specific missense substitutions in spindle genes.

The numbers of DNA fragments carrying a particular base are reported in parentheses after the corresponding base. Bases in uppercase were sequenced in the forward orientation, whereas those in …

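| Gene | ATRX | KATNA1 | KIF18A | KNL1 | KNL1 | NEK6 | RSPH1 | SPAG5 | SPAG5 | SPAG5 | STARD9 |
| Chr-Position | X-76,939,325 | 6–149,918,766 | 11–28,119,295 | 15–40,912,860 | 15–40,915,640 | 9–127,113,155 | 21–43,897,491 | 17–26,919,034 | 17–26,919,777 | 17–26,925,578 | 15–42,985,549 |
| Ancestral | C | C | C | A | G | G | T | C | T | G | G |
| Derived | G | T | T | G | A | C | G | G | C | A | A |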
Altai NeandertalC (21)
c (31)
C (22)
c (15)
T* (2)
C (19)
c (41)
T* (1)
A (24)
a (28)
G (33)
g (16)
G (23)
g (23)
T (18)
t (26)
C (17)
c (17)
T (18)
t (20)
a (1)
G (28)
g (17)
G (13)
g (16)
A* (1)
Chagyrskaya 8C (10)
c (9)
T (1)
t (1)
C (13)
c (5)
C (15)
c (19)
T* (3)
A (1)
G* (14)
g* (11)
A* (19)
a* (8)
G (5)
g (7)
a (1)
T (17)
t (13)
C (11)
c (7)
T (1)
T (16)
t (11)
G (7)
g (6)
a* (1)
G (8)
g (13)
Denisova 3C (19)
c (17)
C (15)
c (13)
T* (2)
C (17)
c (24)
A (20)
a (30)
G (25)
g (17)
a* (1)
G (15)
g (14)
T (19)
t (27)
C (20)
c (14)
T (16)
t (11)
G (12)
g (6)
G (17)
g (12)
Denisova 11C (1)
c (2)
NAC (1)a (2)g (1)G (1)
g (1)
T (4)
t (2)
c (2)
t (1)
NANAG (1)
Goyet Q56-1C (1)NAC (1)
c (5)
G* (1)A* (1)
a* (2)
G (1)
g (2)
NAC (1)T (1)G (4)
g (2)
G (2)
Hohlenstein-StadelNANANANANANANANANANANA
Les Cottés Z4-1514C (1)
c (1)
c (2)
T* (1)
C (5)
c (4)
A (2)
a (4)
G (3)
g (2)
NAT (2)
t (1)
NAT (1)G (1)G (1)
Mezmaiskaya 1c (3)NAC (1)
c (1)
NAG (2)NAt (2)C (2)
c (1)
g* (2)
NANANA
Mezmaiskaya 2C (1)C (1)
c (1)
T* (1)
C (1)g* (1)A* (1)G (2)
g (1)
NAC (1)T (1)G (2)g (3)
Scladina I-4ANANANANANANANANANANANA
Spy 94 ANAc (1)C (1)g* (1)A* (1)
a* (3)
NAT (1)C (1)T (1)g (1)NA
Vindija 33.19C (16)
c (18)
T (1)
C (8)
c (11)
C (17)
c (20)
T* (1)
A (13)
a (19)
G (20)
g (16)
a* (2)
G (8)
g (8)
T (22)
t (15)
C (12)
c (14)
T (15)
t (7)
G (15)
g (13)
a* (1)
G (14)
g (14)
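
The cell notation used in the table above (a base, an optional asterisk, and a fragment count in parentheses, with letter case encoding the sequencing orientation) can be read programmatically. The following is a minimal Python sketch, not part of the original analysis; the names `BaseCount`, `parse_cell` and `total_depth` are hypothetical, and the asterisk is only recorded as a flag, not interpreted.

```python
import re
from typing import List, NamedTuple


class BaseCount(NamedTuple):
    base: str      # observed base, normalised to uppercase
    strand: str    # "forward" (uppercase in the table) or "reverse" (lowercase)
    starred: bool  # True if the base carries an asterisk in the table
    count: int     # number of DNA fragments carrying this base


# One entry looks like "C (22)", "c (15)" or "T* (2)".
_ENTRY = re.compile(r"([ACGTacgt])(\*?)\s*\((\d+)\)")


def parse_cell(cell: str) -> List[BaseCount]:
    """Split one table cell into its per-base, per-strand counts."""
    if cell.strip() == "NA":
        return []  # position not covered in this individual
    return [
        BaseCount(b.upper(), "forward" if b.isupper() else "reverse",
                  star == "*", int(n))
        for b, star, n in _ENTRY.findall(cell)
    ]


def total_depth(cell: str) -> int:
    """Total number of fragments reported in one cell."""
    return sum(entry.count for entry in parse_cell(cell))


if __name__ == "__main__":
    # Example cell: Altai Neandertal at the KATNA1 position.
    for entry in parse_cell("C (22) c (15) T* (2)"):
        print(entry)
    print(total_depth("C (22) c (15) T* (2)"))
```

Keeping the asterisk as a plain boolean leaves its interpretation to the table caption.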
Appendix 3—table 2
Coverage depth of the Mezmaiskaya 1 genome at positions with modern human-specific substitutions in SPAG5.

Only positions covered by at least one DNA sequence are reported. Bases in uppercase were sequenced in the forward orientation, whereas those in lowercase were sequenced in the reverse orientation. …

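| Neandertal | Chr-position (rs ID) | Ancestral | Derived | Allele counts |
| --- | --- | --- | --- | --- |
| Mezmaiskaya 1 | 17–26,864,608 (rs188710272) | A | G | A (1) |
| Mezmaiskaya 1 | 17–26,891,162 (NA) | T | G | T (1) |
| Mezmaiskaya 1 | 17–26,892,376 (NA) | A | T | A (2) |
| Mezmaiskaya 1 | 17–26,913,024 (NA) | A | G | a (a) |
| Mezmaiskaya 1 | 17–26,919,034 (NA) | C | G | C (2), c (1), g (2) |
| Mezmaiskaya 1 | 17–26,948,236 (NA) | G | A | g (1) |
| Mezmaiskaya 1 | 17–26,967,723 (rs558276956) | A | G | A (3) |
| Mezmaiskaya 1 | 17–27,005,275 (NA) | G | A | G (1) |
| Mezmaiskaya 1 | 17–27,010,483 (NA) | G | A | g (1) |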
Appendix 4—table 1
Positions defining the closely related haplotype between some modern humans and Neandertals.

At these positions, the Chagyrskaya 8 genome differs from other high-quality archaic genomes without the modern human-like haplotype but some African genomes from the HGDP dataset carry the same …

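| Chromosome | Position (hg19) | rs ID | Reference | Alternative (Chagyrskaya 8-like) | Chagyrskaya 8-like allele frequency in HGDP genomes: Africans | Chagyrskaya 8-like allele frequency in HGDP genomes: Non-Africans |
| --- | --- | --- | --- | --- | --- | --- |
| 15 | 40,885,107 | rs16970851 | A | G | 0.32 | 0.41 |
| 15 | 40,886,017 | rs8034043 | C | T | 0.32 | 0.41 |
| 15 | 40,886,020 | rs8034048 | C | G | 0.32 | 0.40 |
| 15 | 40,892,601 | rs11855923 | G | A | 0.35 | 0.40 |
| 15 | 40,893,573 | rs12905162 | C | A | 0.38 | 0.40 |
| 15 | 40,905,450 | rs11856438 | C | T | 0.37 | 0.41 |
| 15 | 40,908,904 | rs11852670 | A | G | 0.39 | 0.41 |
| 15 | 40,910,707 | rs12914743 | T | C | 0.38 | 0.41 |
| 15 | 40,915,045 | rs8041534 | T | G | 0.38 | 0.41 |
| 15 | 40,915,894 | rs11070285 | T | C | 0.39 | 0.41 |
| 15 | 40,925,214 | rs11856802 | T | A | 0.39 | 0.41 |
| 15 | 40,926,654 | rs11854986 | C | G | 0.35 | 0.40 |
| 15 | 40,929,814 | rs11070286 | T | C | 0.37 | 0.41 |
| 15 | 40,937,647 | rs3092979 | A | G | 0.38 | 0.41 |
| 15 | 40,959,413 | rs73396515 | G | A | 0.36 | 0.10 |
| 15 | 40,959,624 | rs35047458 | G | A | 0.36 | 0.40 |
| 15 | 40,960,432 | rs12902568 | G | A | 0.36 | 0.40 |
| 15 | 40,963,160 | rs7182530 | A | G | 0.37 | 0.41 |
| 15 | 40,987,528 | rs1801320 | G | C | 0.38 | 0.11 |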
Appendix 5—table 1
Origin of the modern human genomes from the HGDP dataset (Bergström et al., 2020) with a KNL1 copy inherited from Neandertals.
| sample | population | region |
| --- | --- | --- |
| HGDP00125 | Hazara | Central South Asia |
| HGDP00547 | Papuan Sepik | Oceania |
| HGDP00639 | Bedouin | Middle East |
| HGDP00696 | Palestinian | Middle East |
| HGDP00714 | Cambodian | East Asia |
| HGDP00774 | Han | East Asia |
| HGDP00822 | Han | East Asia |
| HGDP00954 | Yakut | East Asia |
| HGDP00960 | Yakut | East Asia |
| HGDP00966 | Yakut | East Asia |
| HGDP01023 | Han | East Asia |
| HGDP01181 | Yi | East Asia |
Appendix 6—table 1
Allele counts at positions with nearly fixed missense variants in the spindle genes of modern humans from the gnomAD database (v2.1.1), (Karczewski et al., 2020).

Columns 7–8 and 9–10 correspond to the allele counts among the 125,748 whole-exome sequences (WES) and the 15,708 whole-genome sequences (WGS), respectively. Anc = Ancestral

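| Gene | Chr-Position (rs ID) | Anc | (nearly) fixed | Alleles | VEP Annot. | # Anc (WES) | Total (WES) | # Anc (WGS) | Total (WGS) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ATRX | X-76,939,325 (rs146863015) | C | G | G-C | missense | 66 | 182,745 | 11 | 22,042 |
| KATNA1 | 6–149,918,766 (rs73781249) | C | T | T-C | missense | 259 | 251,190 | 131 | 31,400 |
| KIF18A | 11–28,119,295 (rs775297730) | C | T | T-C | missense | 26 | 249,508 | 1 | 31,396 |
| KNL1 | 15–40,912,860 (rs755472529) | A | G | G-A | missense | NA | NA | 1 | 31,368 |
| KNL1 | 15–40,912,860 (rs755472529) | A | G | G-T | missense | 1 | 227,420 | NA | NA |
| KNL1 | 15–40,915,640 (NA) | G | A | A-G | missense | NA | NA | NA | NA |
| NEK6 | 9–127,113,155 (rs146443565) | G | C | C-G | missense | 164 | 250,140 | 26 | 31,404 |
| RSPH1 | 21–43,897,491 (rs146298259) | T | G | G-T | missense | 236 | 251,414 | 30 | 31,386 |
| RSPH1 | 21–43,897,491 (rs146298259) | T | G | G-A | stop gained | 10 | 251,414 | 1 | 31,386 |
| SPAG5 | 17–26,919,034 (NA) | C | G | G-C | missense | NA | NA | NA | NA |
| SPAG5 | 17–26,919,777 (NA) | T | C | C-A | missense | 3 | 251,430 | NA | NA |
| SPAG5 | 17–26,925,578 (NA) | G | A | A-G | missense | NA | NA | NA | NA |
| SPAG5 | 17–26,925,578 (NA) | G | A | A-T | missense | 1 | 251,066 | NA | NA |
| STARD9 | 15–42,985,549 (rs573215252) | G | A | A-G | missense | 5 | 139,342 | 3 | 31,284 |
| STARD9 | 15–42,985,549 (rs573215252) | G | A | A-C | missense | 5 | 139,342 | NA | NA |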
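
The allele counts in the table above translate into frequencies by dividing the number of ancestral alleles by the total number of alleles observed. The following Python sketch illustrates that arithmetic for one row (RSPH1, G-T). It is an illustration only: the function names are hypothetical, and pooling WES and WGS assumes that the two sample sets do not overlap.

```python
from typing import Iterable, Optional, Tuple

# (number of ancestral alleles, total alleles); None stands for "NA" in the table.
Counts = Tuple[Optional[int], Optional[int]]


def allele_frequency(n_anc: int, total: int) -> float:
    """Frequency of the ancestral allele among all observed alleles."""
    if total <= 0:
        raise ValueError("total allele count must be positive")
    if not 0 <= n_anc <= total:
        raise ValueError("ancestral count must lie between 0 and total")
    return n_anc / total


def pooled_frequency(datasets: Iterable[Counts]) -> Optional[float]:
    """Pool counts across data sets, skipping those reported as NA.

    Assumption (not stated in the table): the data sets contain
    different individuals, so their counts can simply be summed.
    """
    usable = [(a, t) for a, t in datasets if a is not None and t is not None]
    if not usable:
        return None  # no data set covers this position
    n_anc = sum(a for a, _ in usable)
    total = sum(t for _, t in usable)
    return allele_frequency(n_anc, total)


if __name__ == "__main__":
    wes = (236, 251_414)  # RSPH1 G-T, whole-exome sequences
    wgs = (30, 31_386)    # RSPH1 G-T, whole-genome sequences
    print(f"WES: {allele_frequency(*wes):.6f}")
    print(f"WGS: {allele_frequency(*wgs):.6f}")
    print(f"pooled: {pooled_frequency([wes, wgs]):.6f}")
```

Returning `None` when every data set is NA keeps missing data distinct from a frequency of zero.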
