Human-specific lncRNAs contributed critically to human evolution by distinctly regulating gene expression
Figures
Study overview.
(A) The relationships between chimpanzees, the three archaic humans, and the three modern human populations, with dashed lines indicating the phylogenetic distances from modern humans based on related studies. As the icons at the top left illustrate, the DBS in B2M lacks a counterpart in chimpanzees; the DBS in ABL2 differs greatly between archaic and modern humans; and the DBS in IFNAR1 is polymorphic in modern humans (red letters indicate tissue-specific expression quantitative trait loci (eQTLs) or population-specific mutations). (B) The mean length and affinity of strong DBSs. (C) Numbers of target genes and target transcripts of HS lncRNAs. (D) The illustrative figure shows the targeting relationships between HS lncRNAs (Appendix 2—figure 1). (E) Sequence distances of the top 40% of DBSs from modern humans to chimpanzees and archaic humans. (F) The illustrative figure shows the impacts of HS lncRNA–target transcript on gene expression in GTEx tissues (Figure 2).
The impact of HS lncRNA–DBS interaction on gene expression in GTEx tissues and organs.
(A) The distribution of the percentage of HS lncRNA–target transcript pairs with correlated expression across GTEx tissues and organs. Higher percentages of correlated pairs are in brain regions than in other tissues and organs. (B) The distribution of significantly changed DBSs (in terms of sequence distance) in HS lncRNA–target transcript pairs across GTEx tissues and organs between archaic and modern humans. Orange, red, and dark red indicate significant changes from Denisovans (D), Altai Neanderthals and Denisovans (AD), and all three archaic humans (ADV). DBSs in HS lncRNA–target transcript pairs with correlated expression in seven brain regions (in dark red) have changed significantly and consistently since the Altai Neanderthals, Denisovans, and Vindija Neanderthals (one-sided two-sample Kolmogorov–Smirnov test, significant changes determined by FDR <0.001).
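The test-and-correct procedure named in panel (B) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the p-value uses the large-sample approximation for the one-sided Kolmogorov–Smirnov statistic, and the Benjamini–Hochberg procedure is an assumed choice of FDR method (the caption does not state which one was used).

```python
import math
from bisect import bisect_right


def ks_one_sided_greater(x, y):
    """One-sided two-sample KS test.

    Returns (D_plus, p), where D_plus = max(F_x - F_y) over the pooled
    sample and p uses the asymptotic approximation
    exp(-2 * D_plus^2 * n*m / (n + m)). A large D_plus means the values
    in x tend to be smaller than those in y.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d_plus = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / n  # empirical CDF of x at v
        fy = bisect_right(ys, v) / m  # empirical CDF of y at v
        d_plus = max(d_plus, fx - fy)
    p = math.exp(-2.0 * d_plus ** 2 * n * m / (n + m))
    return d_plus, min(1.0, p)


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, one p-value would be computed per tissue and per archaic genome, the whole set adjusted together, and a tissue flagged when its adjusted value falls below 0.001.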
Human-specifically reshaped gene expression by HS lncRNAs in the frontal cortex (BA9).
(A) Genes expressed in the human frontal cortex are enriched for HS lncRNAs’ target genes and neurodevelopment-related pathways. Squares, dots, and colors indicate HS lncRNAs, gene modules (Module_1 and Module_2 are illustrated), and enriched KEGG pathways, respectively. (B) Comparison of modules and genes in humans (indicated by H) and macaques (indicated by M). In each pair of modules, green and blue dots denote human genes and their orthologs, and lines between dots indicate correlated expression. Many orthologous genes in macaques (displayed at the corresponding positions) are not in the modules, and correlated expression is more prominent in humans than in macaques.
DBSs of the HS lncRNA RP11-423H2.3 in two genomic regions.
In the upper and lower panels that display two genomic regions, the tracks from top to bottom are DBSs (orange peaks), gene annotation, histone modification signals in cell lines, DNA methylation signals in cell lines, H3K4me3 RawSignal, and MRE CpG signals. DBSs overlap very well with DNA methylation and histone modification signals in multiple cell lines.
Predicted DBSs and experimentally identified (by CHART-seq) DNA-binding sites of NEAT1 and MALAT1 in two cell lines (West et al., 2014).
DBSs were predicted using the DNA sequences of CHART-seq peaks. 99% and 87% of the experimentally identified DNA-binding sites of NEAT1 and MALAT1, respectively, overlap with predicted DBSs. (A) Predicted DBSs and experimentally identified DNA-binding sites of NEAT1 in three genomic regions. (B) Predicted DBSs and experimentally identified DNA-binding sites of MALAT1 in three genomic regions.
Examples of co-localization of DBSs, TEs, and cCREs in the promoter regions of genes.
(A) The DBSs of AC106795.2 in the promoter region of ADARB1. (B) The DBSs of AC106795.2 in the promoter region of CDC42EP1. (C) The DBS of AL008727.1 in the promoter region of CD81. (D) The DBS of AC007876.1 in the promoter region of DIDO1 and GID8.
The expression change of target genes was significantly larger than that of non-target genes after DBD knockout.
The fold change of gene expression was computed using the edgeR package. The |fold change| distribution of target genes was compared with the |fold change| distribution of non-target genes (one-sided Mann–Whitney test). (A) The knockout of a 157-bp sequence (chr17:80252565–80252721) which contains the DBD of RP13-516M14.1, in the HeLa cell line. (B) The knockout of a 202-bp sequence (chr1:113392603–113392804) which contains the DBD of RP11-426L16.8, in the RKO cell line. (C) The knockout of a 198-bp sequence (chr17:19460524–19460721) which contains the DBD of SNORA59B, in the SK-MES-1 cell line. (D, E) The knockout of the DBD of a wrongly transcribed long noncoding RNA (chr1:156641670–156661464) in the A549 cell line and the HCT116 cell line. (F, G) The knockout of the DBD of a wrongly transcribed long noncoding RNA (chr10:52443915–52455313) in the A549 cell line and the HCT116 cell line. These wrongly transcribed long noncoding RNAs are labeled as ‘MSTRG’ transcripts by the Stringtie package. (H) The knockout of the DBD of a third wrongly transcribed long noncoding RNA in the HCT116 cell line.
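The comparison of |fold change| distributions described above can be sketched as a one-sided Mann–Whitney test. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the p-value comes from the normal approximation with tie and continuity corrections (an exact test may be preferable for small samples), and the fold changes themselves are taken as given rather than computed with edgeR.

```python
import math


def mann_whitney_greater(x, y):
    """One-sided Mann-Whitney U test; H1: values in x tend to exceed y.

    Returns (U_x, p). Ties receive midranks, and the p-value is a
    normal approximation with tie and continuity corrections.
    """
    n, m = len(x), len(y)
    total = n + m
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < total:
        j = i
        while j < total and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                    # size of this tie group
        midrank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        n_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += midrank * n_x
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n * (n + 1) / 2.0
    mean = n * m / 2.0
    var = n * m / 12.0 * ((total + 1) - tie_term / (total * (total - 1)))
    if var == 0:
        return u, 1.0
    z = (u - mean - 0.5) / math.sqrt(var)
    p = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability
    return u, p
```

Here `x` would hold the |fold change| values of predicted target genes and `y` those of non-target genes from the same knockout experiment.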
Significant up- and downregulation (|log2(fold change)| >1, FDR <0.1) of target genes after DBD knockout.
(A) RP13-516M14.1. (B) RP11-426L16.8. (C) SNORA59B.
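The thresholds in this caption (|log2(fold change)| >1, FDR <0.1) amount to a simple three-way classification. A minimal sketch, with a hypothetical function name and the cutoffs exposed as parameters:

```python
def classify_regulation(log2_fc, fdr, lfc_cutoff=1.0, fdr_cutoff=0.1):
    """Label a gene 'up', 'down', or 'ns' (not significant).

    A gene is called only when its FDR is below fdr_cutoff and its
    absolute log2 fold change exceeds lfc_cutoff.
    """
    if fdr < fdr_cutoff:
        if log2_fc > lfc_cutoff:
            return "up"
        if log2_fc < -lfc_cutoff:
            return "down"
    return "ns"
```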
Potential targeting regulation between HS lncRNAs.
The circle’s brown and green regions indicate promoter and gene body regions. Arrows indicate the direction from the gene body to the promoter regions. The width of the arrows indicates the binding affinity of DBSs, and the sizes of blue dots indicate the number of DBSs of the lncRNA in the genome.
Some DBSs (indicated by blue bars) are in human-specific genome sequences.
(A–D) The DBSs of RP11-848P1.4 in the genes ADCY2, CTD-3179P9.2, IPO11, and PRKAA1. (E) The DBS of RP11-598D14.1 in the gene NCAPG2.
Many genes and transcripts contain DBSs for multiple HS lncRNAs.
(A) Left to right: the DBSs of RP11-65G9.1, LA16c-306A4.2, RP13-516M14.1, SNORA59B, RP11-423H2.3, and TTTY8/8B in A1BG. (B) Left to right: the DBSs of TTTY8/8B, RP4-669L17.10, and RP11-423H2.3 in TLR1. (C) Left to right: the DBSs of LA16c-306A4.2, RP11-423H2.3, RP11-423H2.3, RP1-118J21.25, RP11-706O15.5, and SNORA59B in TMEM210B.
In the GNAS region, RP11-423H2.3 has a DBS (indicated by the blue bar) wherein a selection signal was detected in CEU and CHB (Tajima’s D = −0.99/–1.13/1.86 in CEU/CHB/YRI, integrated Fst = 0.22), and has another DBS (indicated by the orange bar) wherein a selection signal was detected in YRI (Tajima’s D = 0.25/1.09/–1.17 in CEU/CHB/YRI, integrated Fst = 0.33).
HS lncRNAs on the Y chromosome often have longer DBSs than HS lncRNAs on the autosomes.
The top panel shows that the DBS of TTTY2/2B in HLA-C (indicated by the blue bar) is longer than the two DBSs of RP11-423H2.3 (indicated by the green bars). The bottom panel shows that the DBS of TTTY8/8B in IFNAR1 (indicated by the blue bar) is longer than the DBS of LINC00279 (indicated by the green bar).
The linkage disequilibrium (LD) of the key SNP in DBSs of HS lncRNAs in genes on some chromosomes.
(A) The LD of the key SNP in the DBSs of LA16c-306A4.2 in some genes on chromosome 16. (B) The LD of the key SNP in the DBSs of RP11-423H2.3 in some genes on chromosome 1. (C) The LD of the key SNP in the DBSs of SNORA59B in some genes on chromosome 1. (D) The LD of the key SNP in the DBSs of TTTY8B in some genes on chromosome 16.
Numbers of DBSs with large distances from modern humans to archaic humans and chimpanzees, and from the human ancestor to chimpanzees, archaic humans, and modern humans.
Left: DBSs in 4248, 1256, 2513, and 134 genes have distances >0.034 from modern humans to chimpanzees, Altai Neanderthals, Denisovans, and Vindija Neanderthals, respectively. Right: DBSs in 5033, 6908, 9707, 5189, and 5521 genes have distances >0.015 from the ancestor to modern humans, Altai Neanderthals, Denisovans, Vindija Neanderthals, and chimpanzees, respectively.
The most changed DBSs also have large sequence distances between humans and gorillas.
(A) Scatter plot showing the sequence distances between humans and chimpanzees and between humans and gorillas. (B) The scatter plot shows the average sequence distances between humans and chimpanzees, the three archaic humans, and between humans and gorillas. The rho and p values were estimated using the Spearman correlation test.
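The rho reported here is a rank correlation, which can be sketched as the Pearson correlation of midranks. The function names are hypothetical, and the p-value calculation is omitted from this illustration.

```python
import math


def midranks(values):
    """Return 1-based ranks, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        for k in range(i, j):
            ranks[order[k]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation of two equal-length sequences."""
    rx, ry = midranks(x), midranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / math.sqrt(vx * vy)
```

Here `x` and `y` would be the per-DBS distances from humans to chimpanzees and from humans to gorillas.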
Positive selection signals detected by the XP-CLR program in (A) RP11-848P1.4, (B) RP11-598D14.1, and (C) CTD-2291D10.1 in CEU and CHB.
Favored mutations detected by iSAFE.
Left and right vertical axes indicate iSAFE scores and recombination rate. The purple diamond marks the top-scoring mutation. Colors mark linkage disequilibrium (LD) (r²) between the top-scoring mutation and others. The yellow line indicates that mutations above it have a probability of p = 1e−6 of being neutral. The blue curve indicates the position-specific recombination rates. (A) SNPs in RP11-598D14.1. The top-scoring SNP has DAFs of 0.125/0.960/0.922 in YRI/CEU/CHB. (B) SNPs in AC006129.1. The top-scoring SNP has DAFs of 0.134/0.717/0.587 in YRI/CEU/CHB.
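Two quantities in this caption, the derived allele frequency (DAF) and the LD measure r², have simple definitions that can be sketched as follows. The sketch assumes phased haplotypes coded 0 (ancestral) and 1 (derived) at each SNP; the function names are hypothetical, and this is not the iSAFE computation itself.

```python
def derived_allele_frequency(alleles):
    """Fraction of haplotypes carrying the derived allele (coded 1)."""
    return sum(alleles) / len(alleles)


def ld_r2(site_a, site_b):
    """Squared correlation (r^2) between two biallelic sites.

    site_a and site_b list the allele (0 or 1) carried by the same
    haplotypes, in the same order.
    """
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")  # undefined when a site is monomorphic
    return d * d / denom
```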
HS lncRNA genes with significantly changed Tajima’s D in CEU, CHB, and YRI.
Negative and positive Tajima’s D scores, which are significantly smaller or larger than the genome-wide background in a population, indicate the signature of positive selection or balancing selection, respectively, in the population.
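For orientation, Tajima's D contrasts two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by a sample-size constant. A minimal sketch using the standard formula, assuming haplotypes coded 0/1 per site; the function name is hypothetical, and the comparison against a genome-wide background is not shown.

```python
import math


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length 0/1 haplotypes.

    Returns NaN when there are fewer than 4 haplotypes or no
    segregating sites, where the statistic is undefined.
    """
    n = len(haplotypes)
    if n < 4:
        return float("nan")
    counts = [sum(col) for col in zip(*haplotypes)]
    seg = [k for k in counts if 0 < k < n]  # segregating sites only
    s = len(seg)
    if s == 0:
        return float("nan")
    # Mean number of pairwise differences (pi).
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in seg)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    var = e1 * s + e2 * s * (s - 1)
    if var <= 0:
        return float("nan")
    return (pi - s / a1) / math.sqrt(var)
```

An excess of rare variants, as expected after a selective sweep, pulls D below zero, whereas an excess of intermediate-frequency variants pushes it above zero.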
The linkage disequilibrium (LD) of SNPs in HS lncRNA genes in CEU, CHB, and YRI.
The red color indicates high LD values. These panels show that LD between SNPs in CEU and CHB in these lncRNA genes is stronger than LD between SNPs in YRI. (A) AC024592.9, (B) AC129929.5, (C) RP11-157B13.7, (D) RP11-277P12.10, (E) CTD-2291D10.1, and (F) CTD-2142D14.1.
The distributions of DBS sequence distances and promoter sequence distances from modern to archaic humans (right-hand panels illustrating distances >0.005).
A fraction of DBSs has larger distances than promoters.
DBSs have significantly higher eQTL density than promoters.
DBSs and promoters harboring at least one eQTL were used to compute eQTL density and make the comparison. A one-sided Mann–Whitney test was used to compute the p-value.
The distribution of the percentage of HS TF–target transcript pairs with correlated expression across GTEx tissues and organs (see Figure 3A).
The distribution of significantly changed DBSs (in terms of sequence distance) in HS TF–target transcript pairs across GTEx tissues and organs between archaic and modern humans.
As in Figure 3B, orange, red, and dark red indicate significant changes from Denisovans (D), Altai Neanderthals and Denisovans (AD), and all three archaic humans (ADV).
Human-specifically rewired gene expression by HS lncRNAs in the anterior cingulate cortex (BA24).
(A) Genes expressed in the anterior cingulate cortex are enriched for HS lncRNAs’ target genes and neurodevelopment-related pathways. Squares, dots, and colors indicate HS lncRNAs, gene modules, and enriched KEGG pathways, respectively. (B) Comparison of modules and genes in humans (indicated by H) and macaques (indicated by M).
Tables
Genes with DBSs that have the largest affinity values and the most changed sequence distances (from modern humans to archaic humans and chimpanzees).
| Target gene | Annotation | Binding affinity | Mostly changed |
|---|---|---|---|
| IFNAR1 | Interferon alpha and beta receptor subunit 1. | 794 | C, D |
| NFATC1 | A TF that induces gene transcription during immune responses. | 736 | C |
| NFATC1 | A TF that induces gene transcription during immune responses. | 491 | C, A, D |
| ANKLE2 | Diseases associated with ANKLE2 include microcephaly. | 527 | C, D |
| SEMA4D | Regulating phosphatidylinositol 3-kinase signaling, neuron projection development, and phosphate metabolic process. | 495 | C, A, D |
| KIF21B | Essential for neuronal morphology, synapse function, and learning and memory. | 471 | C |
| ALDH3B2 | An aldehyde dehydrogenase for alcohol metabolism. | 444 | C |
| NTSR1 | A brain and gastrointestinal peptide that mediates functions of neurotensin (e.g., hypotension, hyperglycemia, hypothermia, and antinociception). | 402 | C, A, D |
| MC5R | A receptor for melanocyte-stimulating hormone and adrenocorticotropic hormone. | 397 | C |
| THEG | Specifically expressed in the germ cells and involved in spermatogenesis. | 395 | C, D |
| HERC6 | In pathways including class I MHC-mediated antigen processing and presentation, and the innate immune system. | 369 | C, A, D |
| SLC2A11 | Facilitating glucose transporter. | 356 | C |
| NGEF | Playing a role in axon guidance regulating ephrin-induced growth cone collapse and dendritic spine morphogenesis. | 354 | C |
| SHC2 | Involved in the signal transduction pathways of neurotrophin-activated Trk receptors in cortical neurons. | 345 | C, D |
| BAIAP3 | Regulating behavior and food intake by controlling calcium-stimulated exocytosis of neurotransmitters, serotonin, and hormones like insulin. | 321 | C |
| SLURP1 | A marker of late differentiation of the skin. | 319 | C |
| MLPH | Involved in melanosome transport. | 307 | C |
| TAS1R3 | Responding to the umami taste stimulus and recognizing diverse natural and synthetic sweeteners. | 304 | C, D |
| SLC2A1 | A major glucose transporter in the mammalian blood–brain barrier. | 356 | C |
| CTD-3224I3.3 | An lncRNA highly expressed in the cerebellum, lung, and testis. | 312 | C, A, D |
‘C’, ‘A’, ‘D’, and ‘V’ indicate that the DBS has mostly changed sequence distances from modern humans to chimpanzees, Altai Neanderthals, Denisovans, and Vindija Neanderthals, respectively. NFATC1 is displayed in two rows because the DBSs of SNORA59B and TTTY8/TTTY8B have different affinity values.
GO terms generated by different gene sets with large and small DBS distances from humans to chimpanzees and Altai Neanderthals.
| Top 25% genes (sorted by DBS distance from humans to chimpanzees) in Supplementary file 1F, column A | term_id | adj_p | Bottom 25% genes (sorted by DBS distance from humans to chimpanzees) in Supplementary file 1F, column A | term_id | adj_p |
|---|---|---|---|---|---|
| Behavior | GO:0007610 | 8.26E−07 | Head development | GO:0060322 | 1.96E−03 |
| Head development | GO:0060322 | 4.87E−05 | Forebrain development | GO:0030900 | 2.26E−03 |
| Brain development | GO:0007420 | 2.69E−04 | Brain development | GO:0007420 | 2.80E−03 |
| Forebrain development | GO:0030900 | 8.21E−03 | Behavior | GO:0007610 | 3.78E−03 |
| Sensory organ development | GO:0007423 | 1.07E−02 | Locomotory behavior | GO:0007626 | 1.75E−02 |
| Learning or memory | GO:0007611 | 1.36E−02 | |||
| Locomotory behavior | GO:0007626 | 1.63E−02 | |||
| Sensory system development | GO:0048880 | 1.93E−02 | |||
| Sensory perception of sound | GO:0007605 | 2.05E−02 | |||
| Adaptive thermogenesis | GO:1990845 | 3.49E−02 | |||
| Top 25% genes (sorted by DBS distance from humans to Altai Neanderthals) in Supplementary file 1F, column C | term_id | adj_p | Bottom 25% genes (sorted by DBS distance from humans to Altai Neanderthals) in Supplementary file 1F, column C | term_id | adj_p |
| Behavior | GO:0007610 | 1.28E−09 | Brain development | GO:0007420 | 1.34E−04 |
| Head development | GO:0060322 | 2.16E−05 | Sensory organ development | GO:0007423 | 1.97E−04 |
| Learning or memory | GO:0007611 | 2.66E−05 | Head development | GO:0060322 | 3.98E−04 |
| Brain development | GO:0007420 | 4.15E−05 | Sensory organ morphogenesis | GO:0090596 | 2.03E−03 |
| Locomotory behavior | GO:0007626 | 7.74E−05 | Behavior | GO:0007610 | 9.69E−03 |
| Learning | GO:0007612 | 3.07E−04 | Locomotory behavior | GO:0007626 | 1.66E−02 |
| Forebrain development | GO:0030900 | 3.23E−04 | Sensory system development | GO:0048880 | 4.72E−02 |
| Sensory organ development | GO:0007423 | 3.48E−04 | |||
| Sensory system development | GO:0048880 | 4.16E−04 | |||
| Sensory organ morphogenesis | GO:0090596 | 6.43E−03 | |||
| Associative learning | GO:0008306 | 6.43E−03 | |||
| Memory | GO:0007613 | 1.18E−02 | |||
| Social behavior | GO:0035176 | 1.37E−02 | |||
| Sensory perception of sound | GO:0007605 | 2.91E−02 | |||
| Intersection of top 50% genes (sorted by DBS distance from humans to chimpanzees) and ASE genes in Supplementary file 1F, columns A and F | term_id | adj_p | Intersection of bottom 50% genes (sorted by DBS distance from humans to chimpanzees) and ASE genes in Supplementary file 1F, columns A and F | term_id | adj_p |
| Cellular pigmentation | GO:0033059 | 3.27E−05 | |||
| Pigmentation | GO:0043473 | 3.94E−04 | |||
| Behavior | GO:0007610 | 1.08E−03 | |||
| Sensory system development | GO:0048880 | 2.61E−03 | |||
| Learning | GO:0007612 | 3.69E−03 | |||
| Learning or memory | GO:0007611 | 1.60E−02 | |||
| Associative learning | GO:0008306 | 1.62E−02 | |||
| Cognition | GO:0050890 | 2.06E−02 | |||
| Sensory organ development | GO:0007423 | 2.11E−02 | |||
| Adaptive thermogenesis | GO:1990845 | 3.16E−02 | |||
| Memory | GO:0007613 | 4.85E−02 | |||
| Intersection of top 50% genes (sorted by DBS distance from humans to Altai Neanderthals) and ASE genes in Supplementary file 1F, columns C and F | term_id | adj_p | Intersection of bottom 50% genes (sorted by DBS distance from humans to Altai Neanderthals) and ASE genes in Supplementary file 1F, columns C and F | term_id | adj_p |
| Behavior | GO:0007610 | 3.88E−05 | Pigmentation | GO:0043473 | 7.11E−03 |
| Sensory system development | GO:0048880 | 4.09E−03 | Cellular pigmentation | GO:0033059 | 1.10E−02 |
| Sensory organ development | GO:0007423 | 1.51E−02 | |||
| Sensory perception of sound | GO:0007605 | 3.95E−02 | |||
| Learning | GO:0007612 | 4.74E−02 | |||
| Learning or memory | GO:0007611 | 4.77E−02 |
The presence and absence of human evolution-related GO terms in the ORA results (Supplementary file 1G, H). Left: The top genes. Right: The bottom genes. Upper (black): Target genes. Bottom (blue): The intersections of target genes and genes with significant ASE (p-adj <0.01 and |LFC| >0.5). HS lncRNAs’ target genes are sorted by DBS distance from humans to chimpanzees and Altai Neanderthals.
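The enrichment p-values in an over-representation analysis (ORA) are commonly obtained from a hypergeometric tail probability. The sketch below illustrates that calculation under this assumption; the section does not state which tool or test produced the values in the table, the function and parameter names are hypothetical, and multiple-testing adjustment is not included.

```python
from math import comb


def ora_pvalue(n_background, n_annotated, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the background set
    n_annotated:  background genes annotated with the GO term
    n_selected:   genes in the tested gene set
    n_overlap:    tested genes annotated with the GO term
    """
    upper = min(n_annotated, n_selected)
    numerator = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return numerator / comb(n_background, n_selected)
```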
Genes with DBSs that are the most polymorphic and have the most changed sequence distances from humans to archaic humans and chimpanzees.
| Target gene | Annotation | SNP number | Mostly changed |
|---|---|---|---|
| IFNAR1 | Interferon alpha and beta receptor subunit 1. | 31 | C, D |
| DECR2 | The related pathways include metabolism and regulation of lipid metabolism. | 17 | C, A, D |
| DOK7 | Essential for neuromuscular synaptogenesis. | 17 | C, D |
| TAS1R3 | Responding to the umami taste stimulus and recognizing diverse natural and synthetic sweeteners. | 17 | C, D |
| NFATC1 | A TF that induces gene transcription during immune responses. | 16 | C, D |
| ST3GAL4 | Involved in protein glycosylation. | 15 | C, D |
| CAMK2B | Calcium/calmodulin-dependent protein kinase important for dendritic spine and synapse formation and maintaining synaptic plasticity. | 13 | C, D |
| HLA-DQB1-AS1 | Highly expressed in EBV-transformed lymphocytes, lung, and spleen. | 13 | C, A, D, V |
| ANKLE2 | Diseases associated with ANKLE2 include microcephaly. | 12 | C, D |
| KRTAP1-3 | The KAP proteins form a matrix of keratin intermediate filaments that contribute to the structure of hair fibers. | 12 | C, D |
| INS, INS-IGF2 | Insulin decreases blood glucose concentration. | 11 | C, A, D |
| SHC2 | Involved in the signal transduction pathways of neurotrophin-activated Trk receptors in cortical neurons. | 11 | C, D |
| FN3KRP | Deglycates proteins to restore their function; important for the adaptation of modern humans to high glucose intake, and functions in all tissues. | 10 | C, D |
| TFB1M | The encoded protein is part of the basal mitochondrial transcription complex and is necessary for mitochondrial gene expression. | 10 | C, A, D |
Some protein-coding genes that have (1) large DBS distances from humans to chimpanzees, (2) large DBS distances to Altai Neanderthals, Denisovans, or Vindija Neanderthals, and (3) dense SNPs. Letters C, A, D, and V indicate that DBS distance from humans to chimpanzees, Altai Neanderthals, Denisovans, and Vindija Neanderthals ≥0.037. Note that different HS lncRNAs’ DBSs in a gene may have somewhat different sequences, weighted Fst, and Tajima’s D.
The enriched GO terms for the top 2000 and bottom 2000 genes with the largest and smallest binding affinity, respectively.
Upper (black): Top 30 GO terms of the top 2000 genes (left) and the bottom 2000 genes (right). Lower (blue): Bottom 13 GO terms of the top 2000 genes (left) and bottom 30 GO terms of the bottom 2000 genes (right).
| GO terms (genes with strongest DBS) | Term_id | Adjusted_p | GO terms (genes with weakest DBS) | Term_id | Adjusted_p |
|---|---|---|---|---|---|
| Small GTPase mediated signal transduction | GO:0007264 | 8.55E−17 | Neuron projection development | GO:0031175 | 3.79E−11 |
| Neuron projection development | GO:0031175 | 2.36E−16 | Cell morphogenesis involved in differentiation | GO:0000904 | 5.36E−11 |
| Cell projection morphogenesis | GO:0048858 | 5.53E−16 | Cellular component morphogenesis | GO:0032989 | 6.81E−11 |
| Neuron projection morphogenesis | GO:0048812 | 8.56E−16 | Regulation of plasma membrane bounded cell projection organization | GO:0120035 | 2.58E−10 |
| Plasma membrane bounded cell projection morphogenesis | GO:0120039 | 8.56E−16 | Plasma membrane bounded cell projection morphogenesis | GO:0120039 | 3.44E−10 |
| Cell junction organization | GO:0034330 | 1.96E−15 | Regulation of anatomical structure morphogenesis | GO:0022603 | 4.34E−10 |
| Cell part morphogenesis | GO:0032990 | 5.24E−15 | Cell projection morphogenesis | GO:0048858 | 4.88E−10 |
| Synaptic signaling | GO:0099536 | 1.38E−14 | Cell part morphogenesis | GO:0032990 | 6.25E−10 |
| Cellular component morphogenesis | GO:0032989 | 3.76E−14 | Regulation of cell projection organization | GO:0031344 | 1.13E−09 |
| Trans-synaptic signaling | GO:0099537 | 1.88E−13 | Neuron projection morphogenesis | GO:0048812 | 1.64E−09 |
| Cell morphogenesis involved in differentiation | GO:0000904 | 3.16E−13 | Actin filament-based process | GO:0030029 | 1.65E−09 |
| Regulation of small GTPase mediated signal transduction | GO:0051056 | 4.03E−13 | Actin cytoskeleton organization | GO:0030036 | 1.99E−09 |
| Chemical synaptic transmission | GO:0007268 | 4.18E−13 | Organophosphate metabolic process | GO:0019637 | 2.05E−09 |
| Anterograde trans-synaptic signaling | GO:0098916 | 4.18E−13 | Cell morphogenesis involved in neuron differentiation | GO:0048667 | 2.47E−09 |
| Regulation of plasma membrane bounded cell projection organization | GO:0120035 | 1.72E−12 | Regulation of cellular component biogenesis | GO:0044087 | 8.54E−09 |
| Regulation of cell projection organization | GO:0031344 | 1.74E−12 | Regulation of neuron projection development | GO:0010975 | 1.22E−08 |
| Cell morphogenesis involved in neuron differentiation | GO:0048667 | 3.83E−12 | Cell junction organization | GO:0034330 | 3.32E−08 |
| Dendrite development | GO:0016358 | 4.71E−11 | Organophosphate biosynthetic process | GO:0090407 | 3.82E−08 |
| Enzyme-linked receptor protein signaling pathway | GO:0007167 | 4.71E−11 | Growth | GO:0040007 | 3.85E−08 |
| Cell surface receptor signaling pathway involved in cell–cell signaling | GO:1905114 | 4.86E−11 | Developmental growth | GO:0048589 | 5.54E−08 |
| Actin filament-based process | GO:0030029 | 7.42E−11 | Positive regulation of protein modification process | GO:0031401 | 9.39E−08 |
| Regulation of transmembrane transport | GO:0034762 | 8.88E−11 | Regulation of cell morphogenesis | GO:0022604 | 1.08E−07 |
| Synapse organization | GO:0050808 | 1.07E−10 | Negative regulation of cellular component organization | GO:0051129 | 1.64E−07 |
| Regulation of cellular component biogenesis | GO:0044087 | 1.17E−10 | Lipid biosynthetic process | GO:0008610 | 2.44E−07 |
| Metal ion transport | GO:0030001 | 4.76E−10 | Positive regulation of transport | GO:0051050 | 3.52E−07 |
| Ras protein signal transduction | GO:0007265 | 5.80E−10 | Regulation of locomotion | GO:0040012 | 4.21E−07 |
| Regulation of ion transport | GO:0043269 | 7.96E−10 | Organelle assembly | GO:0070925 | 4.42E−07 |
| Modulation of chemical synaptic transmission | GO:0050804 | 8.08E−10 | Regulation of cell migration | GO:0030334 | 4.90E−07 |
| Regulation of trans-synaptic signaling | GO:0099177 | 8.80E−10 | Mitotic cell cycle | GO:0000278 | 6.42E−07 |
| Cation transmembrane transport | GO:0098655 | 1.65E−09 | Synapse organization | GO:0050808 | 6.74E−07 |
| Absent speech | HP:0001344 | 1.54E−02 | Glycolipid metabolic process | GO:0006664 | 3.80E−02 |
| Abnormal aggressive, impulsive, or violent behavior | HP:0006919 | 1.54E−02 | Carboxylic acid catabolic process | GO:0046395 | 3.80E−02 |
| Autistic behavior | HP:0000729 | 1.54E−02 | Regulation of epithelial cell proliferation | GO:0050678 | 3.83E−02 |
| Absent toe | HP:0010760 | 1.89E−02 | Response to radiation | GO:0009314 | 3.85E−02 |
| Abnormality of calvarial morphology | HP:0002648 | 1.89E−02 | Protein methylation | GO:0006479 | 3.86E−02 |
| Aplasia/hypoplasia of toe | HP:0001991 | 1.89E−02 | Protein alkylation | GO:0008213 | 3.86E−02 |
| Tall stature | HP:0000098 | 1.89E−02 | Golgi organization | GO:0007030 | 3.88E−02 |
| Short philtrum | HP:0000322 | 1.89E−02 | Membrane depolarization | GO:0051899 | 3.97E−02 |
| Motor stereotypy | HP:0000733 | 3.37E−02 | Skeletal system morphogenesis | GO:0048705 | 3.98E−02 |
| Slender finger | HP:0001238 | 3.37E−02 | Positive chemotaxis | GO:0050918 | 3.98E−02 |
| Asymmetric growth | HP:0100555 | 3.37E−02 | Development of primary sexual characteristics | GO:0045137 | 3.98E−02 |
| Abnormal upper limb bone morphology | HP:0040070 | 3.37E−02 | Metaphase/anaphase transition of cell cycle | GO:0044784 | 3.98E−02 |
| Long fingers | HP:0100807 | 4.09E−02 | Non-motile cilium assembly | GO:1905515 | 3.98E−02 |
| Muscle cell differentiation | GO:0042692 | 4.55E−02 | |||
| Cell activation involved in immune response | GO:0002263 | 4.73E−02 | |||
| Regulation of exocytosis | GO:0017157 | 4.74E−02 | |||
| Negative regulation of chromosome organization | GO:2001251 | 4.76E−02 | |||
| ADP metabolic process | GO:0046031 | 4.76E−02 | |||
| Cytoskeleton-dependent cytokinesis | GO:0061640 | 4.76E−02 | |||
| Regulation of canonical Wnt signaling pathway | GO:0060828 | 4.77E−02 | |||
| Olefinic compound metabolic process | GO:0120254 | 4.77E−02 | |||
| DNA geometric change | GO:0032392 | 4.77E−02 | |||
| Gonad development | GO:0008406 | 4.77E−02 | |||
| Reproductive system development | GO:0061458 | 4.77E−02 | |||
| Vasculature development | GO:0001944 | 4.77E−02 | |||
| Response to insulin | GO:0032868 | 4.79E−02 | |||
| Ribonucleotide biosynthetic process | GO:0009260 | 4.79E−02 | |||
| Organic acid biosynthetic process | GO:0016053 | 4.82E−02 | |||
| Vacuole organization | GO:0007033 | 4.84E−02 | |||
| Import across plasma membrane | GO:0098739 | 4.95E−02 |
Enriched GO terms of different sets of genes with large and small DBS distances from humans to chimpanzees and Altai Neanderthals.
Shown are the presence and absence of GO terms highly related to human evolution. The intersections of genes sorted by DBS distance from humans to chimpanzees and to Altai Neanderthals, respectively, and genes showing significant ASE (adj-p <0.01 and |LFC| >0.5).
| Intersection of top 50% of genes (sorted by DBS distance from humans to chimpanzees) and ASE genes (adj-p <0.01) | term_ID | adj_p | Intersection of bottom 50% of genes (sorted by DBS distance from humans to chimpanzees) and ASE genes (adj-p <0.01) | term_ID | adj_p |
|---|---|---|---|---|---|
| Cellular pigmentation | GO:0033059 | 2.41E−06 | Brain development | GO:0007420 | 7.80E−03 |
| Behavior | GO:0007610 | 5.78E−05 | Forebrain development | GO:0030900 | 4.34E−02 |
| Pigmentation | GO:0043473 | 7.68E−05 | |||
| Learning | GO:0007612 | 3.60E−04 | |||
| Associative learning | GO:0008306 | 2.08E−03 | |||
| Adaptive thermogenesis | GO:1990845 | 2.28E−03 | |||
| Sensory system development | GO:0048880 | 3.19E−03 | |||
| Cold-induced thermogenesis | GO:0106106 | 3.78E−03 | |||
| Digestive system development | GO:0055123 | 3.94E−03 | |||
| Glucose metabolic process | GO:0006006 | 3.97E−03 | |||
| Learning or memory | GO:0007611 | 4.69E−03 | |||
| Cognition | GO:0050890 | 6.32E−03 | |||
| Regulation of cold-induced thermogenesis | GO:0120161 | 6.33E−03 | |||
| Memory | GO:0007613 | 4.42E−02 | |||
| Alcohol metabolic process | GO:0006066 | 4.82E−02 | |||
| Intersection of top 50% of genes (sorted by DBS distance from humans to Altai Neanderthals) and ASE genes (adj-p <0.01) | term_ID | adj_p | Intersection of bottom 50% of genes (sorted by DBS distance from humans to Altai Neanderthals) and ASE genes (adj-p <0.01) | term_ID | adj_p |
| Behavior | GO:0007610 | 2.52E−07 | Pigmentation | GO:0043473 | 4.39E−04 |
| Glucose metabolic process | GO:0006006 | 9.29E−04 | Cellular pigmentation | GO:0033059 | 2.74E−03 |
| Sensory system development | GO:0048880 | 1.05E−03 | Brain development | GO:0007420 | 4.96E−02 |
| Learning | GO:0007612 | 1.77E−03 | |||
| Learning or memory | GO:0007611 | 3.62E−03 | |||
| Cognition | GO:0050890 | 4.42E−03 | |||
| Associative learning | GO:0008306 | 6.95E−03 | |||
| Digestive system development | GO:0055123 | 8.54E−03 | |||
| Cold-induced thermogenesis | GO:0106106 | 1.43E−02 | |||
| Adaptive thermogenesis | GO:1990845 | 1.50E−02 | |||
| Brain development | GO:0007420 | 1.85E−02 | |||
| Forebrain development | GO:0030900 | 2.04E−02 | |||
| Regulation of cold-induced thermogenesis | GO:0120161 | 2.28E−02 | |||
| Alcohol metabolic process | GO:0006066 | 2.38E−02 | |||
| Memory | GO:0007613 | 2.72E−02 | |||
| Visual behavior | GO:0007632 | 4.57E−02 |
Numbers of favored and hitchhiking mutations in different classes of DBSs.
| | Strong old | Strong young | Strong others | Weak old | Weak young | Weak others |
|---|---|---|---|---|---|---|
| Hitchhiking SNPs | 3/15,685 | 11/163,007 | 78/170,389 | 10/180,505 | 44/47,251 | 57/168,692 |
| Favored SNPs | 0/10,216 | 1/16,040 | 4/92,153 | 0/26,532 | 5/31,242 | 2/108,014 |
The 14 SNPs have high DAF in YRI and are eQTLs exclusively in the GTEx tissue Thyroid.
| SNP ID | CEU-frequency | CHB-frequency | YRI-frequency |
|---|---|---|---|
| rs75508216 | 0.01 | 0.05 | 0.1 |
| rs114086993 | 0.01 | 0.05 | 0.1 |
| rs201187971 | 0.01 | 0 | 0.1 |
| rs73677017 | 0 | 0 | 0.14 |
| rs11944829 | 0 | 0 | 0.14 |
| rs114884549 | 0 | 0 | 0.15 |
| rs77133472 | 0 | 0 | 0.15 |
| rs115688283 | 0 | 0 | 0.17 |
| rs113131895 | 0 | 0 | 0.17 |
| rs142522981 | 0 | 0.02 | 0.19 |
| rs112731299 | 0 | 0 | 0.2 |
| rs4565803 | 0.01 | 0 | 0.24 |
| rs4604779 | 0.01 | 0 | 0.24 |
| rs76612433 | 0.02 | 0.19 | 0.24 |
Sensitivity analysis of GO-term enrichment across different DBS sequence distance cutoffs.
The table shows the numbers of target genes identified and the false discovery rates (FDR) for the enrichment of three selected GO terms at four different distance cutoffs. Note that, unlike in the old Figure 2, the results for chimpanzees and Altai Neanderthals are not directly comparable here, as the numbers of target genes used for the enrichment analysis differ between them at each cutoff.
| | Chimp (cutoff = 0.03) | Altai (cutoff = 0.03) | Chimp (cutoff = 0.034) | Altai (cutoff = 0.034) | Chimp (cutoff = 0.04) | Altai (cutoff = 0.04) | Chimp (cutoff = 0.05) | Altai (cutoff = 0.05) |
|---|---|---|---|---|---|---|---|---|
| Target genes with distance > cutoff | 7087 | 1817 | 4248 | 1256 | 3789 | 1036 | 3223 | 745 |
| Behavior (FDR) | 7.06E-05 | 8.56E-03 | 7.10E-07 | 1.32E-06 | 2.31E-05 | 4.65E-05 | 5.17E-05 | 0.00741 |
| Neuron projection development (FDR) | 2.91E-05 | 1.77E-02 | 1.41E-08 | 9.91E-05 | 4.52E-07 | 0.01887 | 4.22E-05 | NS |
| Synaptic signaling (FDR) | 1.86E-08 | 4.99E-03 | 4.60E-07 | 2.34E-05 | 4.32E-07 | 6.31E-05 | 9.11E-07 | 0.001891 |
Additional files
- MDAR checklist: https://cdn.elifesciences.org/articles/89001/elife-89001-mdarchecklist1-v1.pdf
- Supplementary file 1: All supplementary results from this study (including 18 tables). https://cdn.elifesciences.org/articles/89001/elife-89001-supp1-v1.xlsx