Immune genes are hotspots of shared positive selection across birds and mammals

  1. Allison J Shultz (corresponding author)
  2. Timothy B Sackton (corresponding author)
  1. Harvard University, United States
8 figures, 4 tables and 2 additional files

Figures

Species tree of 39 species used in study.

Photo credit from Wikimedia Commons (top to bottom): Elegant Crested Tinamou: DickDaniels, Greater Rhea: Quartl, Southern Cassowary: Summerdrought, Mallard: Dcoetzee, Red Junglefowl: Francesco Veronesi, Common Cuckoo: Mike McKenzie, Anna’s Hummingbird: Becky Matsubara, Little Egret: GDW.45, Adelie Penguin: Stan Shebs, Downy Woodpecker: Wolfgang Wander, American Crow: DickDaniels, Zebra Finch: Jim Bendon.

https://doi.org/10.7554/eLife.41815.002
Distributions of gene-wide statistics using gene trees or species trees as the input phylogeny.

(A) Comparison of M0 model ω values between genes under selection for all site tests, and those not under selection, for both gene trees and species trees. The mean ω values were significantly higher for genes under selection for both gene trees and species trees (Mann-Whitney U-test: gene trees: W = 1201387, p < 0.0001; species trees: W = 938934, p < 0.0001). (B) Violin plot of ω values from the PAML M0 model using either gene trees or species trees as the input phylogeny. (C) Violin plot of alignment lengths for genes significant in all tests of selection, or not significant in all tests of selection, using either gene trees or species trees as input phylogenies. (D) Distribution of the proportion of significant lineages for HOGs identified as not significant (FDR-corrected p-value ≥ 0.05) or significant (FDR-corrected p-value < 0.05) with BUSTED. The means of the two distributions are significantly different (Mann-Whitney U-test: W = 6205530, p < 10^−16).

https://doi.org/10.7554/eLife.41815.005
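
The W values reported for the Mann-Whitney U-tests above are rank-based statistics. As a minimal sketch of how such a statistic can be computed, assuming W counts the pairs in which a value from the first group exceeds a value from the second group (ties counting one half), the function below uses hypothetical ω values that are not taken from the study. It computes the statistic only, not the p-value.

```python
def mann_whitney_u(x, y):
    """Count pairs (xi, yj) with xi > yj; ties contribute one half.

    This pairwise definition is quadratic in the sample sizes, which is
    fine for illustration but slow for thousands of genes.
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

# Hypothetical omega values (illustrative only).
omega_selected = [0.3, 0.5, 0.8]
omega_not_selected = [0.1, 0.2, 0.4]
u_stat = mann_whitney_u(omega_selected, omega_not_selected)
print(u_stat)
```

A statistic close to the product of the two sample sizes indicates that nearly every value in the first group exceeds every value in the second.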
Figure 3 with 1 supplement
Pathway enrichment results to determine whether positively selected genes are functionally similar with chicken as the reference organism.

(A) The 18 pathways significant at q-value < 0.1 ordered by enrichment values, calculated as the proportion of genes under selection in the pathway over the proportion of genes significant in all KEGG pathways. Points are filled by q-value, and each pathway is colored by the broader KEGG functional category. Edge lengths do not contain meaningful values, but are chosen to maximize viewability. Pathways with more genes under selection than expected based on median gene length are highlighted with an asterisk. (B) Map depicting the relationships of all significant genes among pathways. Each gene (small, light circle) is connected by a line to each pathway it belongs to (large, dark circle). Each point is shaded according to the broader KEGG functional category. See Figure 3—figure supplement 1 for pathways identified using species trees.

https://doi.org/10.7554/eLife.41815.006
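
The enrichment value described for panel A is a ratio of two proportions. A minimal sketch of that calculation follows; the function name and all counts are hypothetical and chosen only for illustration.

```python
def enrichment(sel_in_pathway, genes_in_pathway, sel_in_all, genes_in_all):
    """Proportion of genes under selection in one pathway divided by the
    proportion of genes under selection across all KEGG pathways."""
    if genes_in_pathway <= 0 or genes_in_all <= 0 or sel_in_all <= 0:
        raise ValueError("gene counts must be positive")
    pathway_prop = sel_in_pathway / genes_in_pathway
    overall_prop = sel_in_all / genes_in_all
    return pathway_prop / overall_prop

# Hypothetical counts: 12 of 40 pathway genes are under selection,
# compared with 500 of 4000 genes across all pathways.
score = enrichment(12, 40, 500, 4000)
print(round(score, 2))
```

Values above 1 indicate that the pathway contains proportionally more genes under selection than the set of all pathways.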
Figure 3—figure supplement 1
Pathways enriched using results from species trees as the input phylogeny.
https://doi.org/10.7554/eLife.41815.007
Figure 4 with 4 supplements
A visualization of PC1 scores estimated by the proportion of sites under selection using gene trees on the phylogeny and the maximum likelihood reconstruction of the PC1 values for internal branches.

The PC1 scores indicate species in different clades that have similar parameter estimates from aBS-REL tests for positive selection and explain 7.2% of the variance across all genes tested. See Figure 4—figure supplement 1 for a visualization of PC1 estimated using ω values and gene trees, Figure 4—figure supplement 2 for a visualization of PC1 estimated using the proportion of sites under selection using species trees, Figure 4—figure supplement 3 for a visualization of PC1 estimated using ω values and species trees, and Figure 4—figure supplement 4 for a visualization of eigenvalues for all PCAs.

https://doi.org/10.7554/eLife.41815.008
Figure 4—figure supplement 1
A visualization of PC1 scores estimated by the ω estimates using gene trees on the phylogeny and the maximum likelihood reconstruction of the PC1 values for internal branches.

The PC1 scores explain 7.6% of the variance across all genes tested.

https://doi.org/10.7554/eLife.41815.009
Figure 4—figure supplement 2
A visualization of PC1 scores estimated by the proportion of sites under selection using species trees on the phylogeny and the maximum likelihood reconstruction of the PC1 values for internal branches.

The PC1 scores explain 7.6% of the variance across all genes tested.

https://doi.org/10.7554/eLife.41815.010
Figure 4—figure supplement 3
A visualization of PC1 scores estimated by the ω estimates using species trees on the phylogeny and the maximum likelihood reconstruction of the PC1 values for internal branches.

The PC1 scores explain 7.6% of the variance across all genes tested.

https://doi.org/10.7554/eLife.41815.011
Figure 4—figure supplement 4
Visualization of the variance explained by the first 10 PC axes (scree plot).

PCA of the (A) proportion of sites under selection using gene trees, (B) log-transformed ω values using gene trees, (C) proportion of sites under selection using species trees, and (D) ω values using species trees across genes for each species.

https://doi.org/10.7554/eLife.41815.012
Figure 5 with 1 supplement
A phylomorphospace plot showing the association between log-transformed body mass on the x-axis and PC1 estimated from the proportion of sites under selection on the y-axis.

Species values and reconstructed node values are connected by phylogeny. A PGLS analysis of these two traits showed a significant correlation (p=0.0001). See Figure 5—figure supplement 1 for a phylomorphospace plot depicting PC1 estimated using ω estimates.

https://doi.org/10.7554/eLife.41815.013
Figure 5—figure supplement 1
A phylomorphospace plot showing the association between log-transformed body mass on the x-axis and PC1 estimated from the ω estimates on the y-axis.

Species values and reconstructed node values are connected by phylogeny. A PGLS analysis of these two traits showed a significant correlation (p=0.0016).

https://doi.org/10.7554/eLife.41815.014
Figure 6 with 1 supplement
Signatures of shared positive selection in birds and mammals.

For all analyses, we considered four different FDR-corrected p-value cutoffs for significance (to identify genes under positive selection). (A) Odds ratio of overlap in genes under selection in both bird and mammal datasets. We indicate the number of observed genes under selection in both clades (n obs) and the number of expected genes under selection in both clades (n exp). (B) Pathway enrichment scores from KEGG pathway enrichment tests with genes under selection in both birds and mammals as the test set, and genes under selection in birds as the background set. Ten pathways significantly enriched in birds with at least one gene under selection in both birds and mammals are color-coded. All other pathways are shown in grey. Significant enrichment values are outlined in black (q-value < 0.1) or grey (q-value < 0.2). (C) Null distribution of enrichment scores generated from 1,000 randomization tests compared to empirical enrichment scores (vertical bars). Null distributions were generated by randomly selecting gene sets from the background set of genes (bird significant genes) for use as the test set. The randomized test set contained the same number of genes as the empirical test set for each FDR-corrected p-value cutoff for significance. Empirical enrichment scores are depicted by a vertical bar, with significant q-value scores outlined in black (q-value < 0.1) or grey (q-value < 0.2). See Figure 6—figure supplement 1 for the odds ratio of overlap in genes under selection in both bird and mammal datasets with the 20% most constrained genes removed.

https://doi.org/10.7554/eLife.41815.015
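
The quantities in panels A and C can be sketched in a few lines of Python. The sketch makes two assumptions that the caption does not state: the expected overlap is computed under independence of selection status in the two clades, and the odds ratio is the simple sample odds ratio of a 2×2 table (a published analysis may use a different estimator). The function names, gene identifiers and counts are all hypothetical.

```python
import random

def overlap_stats(n_both, n_birds_only, n_mammals_only, n_neither):
    """Expected overlap under independence and the sample odds ratio."""
    total = n_both + n_birds_only + n_mammals_only + n_neither
    n_birds = n_both + n_birds_only
    n_mammals = n_both + n_mammals_only
    expected = n_birds * n_mammals / total
    odds_ratio = (n_both * n_neither) / (n_birds_only * n_mammals_only)
    return expected, odds_ratio

def null_enrichment(background, pathway, test_size, n_iter=1000, seed=0):
    """Enrichment scores for random test sets drawn from the background.

    Each random test set has the same size as the empirical test set.
    The pathway must contain at least one background gene.
    """
    rng = random.Random(seed)
    genes = sorted(background)  # fixed order keeps sampling reproducible
    bg_prop = sum(g in pathway for g in genes) / len(genes)
    scores = []
    for _ in range(n_iter):
        sample = rng.sample(genes, test_size)
        prop = sum(g in pathway for g in sample) / test_size
        scores.append(prop / bg_prop)
    return scores

# Hypothetical counts and gene sets (illustrative only).
expected, odds_ratio = overlap_stats(100, 300, 200, 1400)
background = {f"gene{i}" for i in range(200)}
pathway = {f"gene{i}" for i in range(40)}
null_scores = null_enrichment(background, pathway, test_size=50)
```

An empirical enrichment score can then be compared with `null_scores`, for example by counting how many randomized scores are at least as large.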
Figure 6—figure supplement 1
Odds ratio of overlap of bird and mammal genes under selection with the 20% most constrained genes (as estimated by the M0 model) removed.
https://doi.org/10.7554/eLife.41815.016
A comparison of genes under positive selection in birds and genes differentially expressed following pathogen challenge to test for patterns of pathogen-mediated selection.

For different pathogens, we show the proportion of genes under positive selection in birds (defined as significant with FDR-corrected p-value < 0.05 for all PAML and BUSTED model comparisons) for genes significantly down-regulated, significantly up-regulated, or not significantly differentially regulated. The number above each bar indicates the number of genes in a given transcriptional response class. The significance of enrichment for positively selected genes in up- or down-regulated expression classes, as calculated by logistic regression, is indicated by asterisks above the “down” and “up” bars.

https://doi.org/10.7554/eLife.41815.017
Comparison of differential expression effect values for groups of genes, across pathogens with transcriptome data available for both birds and mammals.

Groups of genes are defined as being under positive selection in both birds and mammals, in birds only, or in neither birds nor mammals (not significant). Differential expression effect values for each gene are calculated as the harmonic mean of the absolute β values of birds and mammals. We compared the mean of each category to that of the other two categories within each pathogen with Mann-Whitney U-tests, and the significance level for each test is indicated by asterisks. Comparisons with p > 0.10 are left blank. Note that boxplot outliers are not depicted.

https://doi.org/10.7554/eLife.41815.020
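
The differential expression effect value described above combines the bird and mammal β values for a gene. A minimal sketch using the Python standard library follows; the function name and the β values are hypothetical.

```python
from statistics import harmonic_mean

def de_effect(beta_bird, beta_mammal):
    """Harmonic mean of the absolute beta values from the two lineages."""
    return harmonic_mean([abs(beta_bird), abs(beta_mammal)])

# Hypothetical beta values: a strong response in birds, a weak one in mammals.
effect = de_effect(-2.0, 0.5)
print(round(effect, 2))
```

Because the harmonic mean is pulled toward the smaller of the two values, a gene receives a high effect value only when it responds strongly in both lineages.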

Tables

Table 1
PAML Model descriptions.
https://doi.org/10.7554/eLife.41815.003
| Model | Model description | Parameters |
|---|---|---|
| M0 | one ratio | ω |
| M1a | neutral | p0 (p1 = 1 − p0); ω0 < 1, ω1 = 1 |
| M2a_fixed | neutral | p0, p1 (p2 = 1 − p0 − p1); ω0 < 1, ω1 = 1, ω2 = 1 |
| M2a | selection | p0, p1 (p2 = 1 − p0 − p1); ω0 < 1, ω1 = 1, ω2 > 1 |
| M7 | neutral (beta distribution) | p, q |
| M8a | neutral (beta distribution) | p0 (p1 = 1 − p0); p, q, ωs = 1 |
| M8 | selection (beta distribution) | p0 (p1 = 1 − p0); p, q, ωs > 1 |
Table 2
Counts (above) and proportions (below) for individual and combined tests of selection, using gene trees and species trees.
https://doi.org/10.7554/eLife.41815.004
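| Dataset | N genes | M1a vs M2a | M2a vs M2a_fixed | M7 vs M8 | M8 vs M8a | All PAML | BUSTED | All PAML + BUSTED |
|---|---|---|---|---|---|---|---|---|
| Gene trees | 11231 | 1925 | 2197 | 7504 | 3679 | 1901 | 6244 | 1562 |
| | | 0.17 | 0.20 | 0.67 | 0.33 | 0.17 | 0.56 | 0.14 |
| Species trees | 8669 | 1783 | 2026 | 6293 | 3395 | 1752 | 3870 | 1203 |
| | | 0.21 | 0.23 | 0.73 | 0.39 | 0.20 | 0.45 | 0.14 |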
Table 3
Fisher’s exact test results from bird and mammal transcriptome studies.
https://doi.org/10.7554/eLife.41815.018
| Pathogen | Transcriptional response | N genes diff. expressed in both lineages | N genes expected by chance | p-value | Odds ratio (95% conf. intervals) |
|---|---|---|---|---|---|
| Influenza | up | 30 | 14.8 | <0.0001 | 2.48 (1.56, 3.84) |
| Influenza | down | 7 | 7.2 | 1.000 | 0.96 (0.35, 2.29) |
| West Nile Virus | up | 6 | 0.9 | <0.0001 | 25.06 (4.46, 253.47) |
| West Nile Virus | down | 0 | 0 | 1.000 | 0 (0, 884.62) |
| E. coli | up | 84 | 52.4 | <0.0001 | 1.9 (1.45, 2.47) |
| E. coli | down | 24 | 20.7 | 0.397 | 1.2 (0.73, 1.90) |
| Mycoplasma | up | 16 | 8.4 | 0.010 | 2.1 (1.14, 3.62) |
| Mycoplasma | down | 0 | 0.1 | 1.000 | 0 (0, 47.56) |
| Plasmodium | up | 14 | 9.8 | 0.166 | 1.53 (0.79, 2.77) |
| Plasmodium | down | 9 | 3.2 | 0.004 | 3.44 (1.42, 7.57) |
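
For readers who want to see how a Fisher’s exact test p-value of the kind reported in Table 3 arises from a 2×2 table, the following is a minimal sketch. It assumes a two-sided test, which the table does not state, and uses the common convention of summing the probabilities of all tables that are no more likely than the observed one. The function name and the example counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Rows could be, for example, differentially expressed in birds (yes/no)
    and columns differentially expressed in mammals (yes/no).
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def pmf(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = pmf(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(pmf, range(lowest, highest + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)

# Hypothetical 2x2 table (illustrative only).
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```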
Table 4
Logistic regression results testing whether genes under selection in birds could be predicted by selection status in mammals (sig_mammals), transcriptional regulation in birds, or their interaction.
https://doi.org/10.7554/eLife.41815.019
| Pathogen | Transcriptional response | N genes | Predictor variable | Estimate | Standard error | Z score | p-value |
|---|---|---|---|---|---|---|---|
| Influenza | down | 4488 | sig_mammals | 0.76 | 0.09 | 8.45 | <0.0001 |
| Influenza | down | 4488 | down_reg_birds | −0.12 | 0.42 | −0.29 | 0.771 |
| Influenza | down | 4488 | sig_mammals: down_reg_birds | −0.76 | 0.85 | −0.89 | 0.372 |
| Influenza | up | 4488 | sig_mammals | 0.77 | 0.09 | 8.52 | <0.0001 |
| Influenza | up | 4488 | up_reg_birds | 0.79 | 0.22 | 3.69 | 0.0002 |
| Influenza | up | 4488 | sig_mammals: up_reg_birds | −0.88 | 0.5 | −1.77 | 0.077 |
| West Nile virus | down | 3774 | sig_mammals | 0.77 | 0.1 | 7.76 | <0.0001 |
| West Nile virus | down | 3774 | down_reg_birds | 11.98 | 196.97 | 0.06 | 0.952 |
| West Nile virus | down | 3774 | sig_mammals: down_reg_birds | - | - | - | - |
| West Nile virus | up | 3774 | sig_mammals | 0.77 | 0.1 | 7.76 | <0.0001 |
| West Nile virus | up | 3774 | up_reg_birds | 1.1 | 0.87 | 1.27 | 0.203 |
| West Nile virus | up | 3774 | sig_mammals: up_reg_birds | −1.46 | 1.66 | −0.88 | 0.378 |
| E. coli | down | 4225 | sig_mammals | 0.74 | 0.09 | 7.95 | <0.0001 |
| E. coli | down | 4225 | down_reg_birds | −0.16 | 0.19 | −0.83 | 0.409 |
| E. coli | down | 4225 | sig_mammals: down_reg_birds | 0.18 | 0.5 | 0.36 | 0.717 |
| E. coli | up | 4225 | sig_mammals | 0.72 | 0.1 | 7.2 | <0.0001 |
| E. coli | up | 4225 | up_reg_birds | 0.24 | 0.1 | 2.39 | 0.017 |
| E. coli | up | 4225 | sig_mammals: up_reg_birds | −0.03 | 0.26 | −0.13 | 0.895 |
| Mycoplasma | down | 4059 | sig_mammals | 0.73 | 0.09 | 7.74 | <0.0001 |
| Mycoplasma | down | 4059 | down_reg_birds | −0.59 | 0.53 | −1.11 | 0.266 |
| Mycoplasma | down | 4059 | sig_mammals: down_reg_birds | −0.06 | 0.93 | −0.06 | 0.948 |
| Mycoplasma | up | 4059 | sig_mammals | 0.75 | 0.1 | 7.77 | <0.0001 |
| Mycoplasma | up | 4059 | up_reg_birds | 0.51 | 0.21 | 2.42 | 0.016 |
| Mycoplasma | up | 4059 | sig_mammals: up_reg_birds | −0.71 | 0.41 | −1.73 | 0.084 |
| Plasmodium | down | 3222 | sig_mammals | 0.74 | 0.11 | 6.86 | <0.0001 |
| Plasmodium | down | 3222 | down_reg_birds | −0.02 | 0.37 | −0.05 | 0.961 |
| Plasmodium | down | 3222 | sig_mammals: down_reg_birds | 0.01 | 0.85 | 0.01 | 0.992 |
| Plasmodium | up | 3222 | sig_mammals | 0.73 | 0.11 | 6.71 | <0.0001 |
| Plasmodium | up | 3222 | up_reg_birds | 0.19 | 0.23 | 0.85 | 0.396 |
| Plasmodium | up | 3222 | sig_mammals: up_reg_birds | 0.44 | 0.87 | 0.5 | 0.614 |
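
The Z score and p-value columns in Table 4 are related to the estimate and standard error in the usual way for logistic regression coefficients. As a minimal sketch, assuming a standard two-sided Wald test (the table does not name the test), the function below recomputes both from one row. Because the published estimates and standard errors are rounded to two decimals, the recomputed values only approximate the table.

```python
from math import erfc, sqrt

def wald_test(estimate, std_error):
    """Wald z score and two-sided p-value under a standard normal."""
    z = estimate / std_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    p = erfc(abs(z) / sqrt(2))
    return z, p

# Mycoplasma, up_reg_birds row of Table 4: estimate 0.51, standard error 0.21.
z, p = wald_test(0.51, 0.21)
print(round(z, 2), round(p, 3))
```

Rows with a very large standard error relative to the estimate, such as the West Nile virus down_reg_birds row, give a z score near zero and a p-value near one.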

Additional files

Supplementary file 1

Supplemental tables, see README for descriptions of each table.

https://doi.org/10.7554/eLife.41815.021
Transparent reporting form
https://doi.org/10.7554/eLife.41815.022

Cite this article

Allison J Shultz, Timothy B Sackton (2019) Immune genes are hotspots of shared positive selection across birds and mammals. eLife 8:e41815. https://doi.org/10.7554/eLife.41815