Advances in genome sequencing have dramatically improved our understanding of the genetic basis of human diseases, and thousands of human genes have been associated with different diseases. Despite our expanding knowledge of gene-disease associations, and despite the medical importance of disease genes, their recent evolution has not been thoroughly studied across diverse human populations. In particular, recent genomic adaptation at disease genes has not been characterized as well as purifying selection and long-term adaptation. Understanding the relationship between disease and adaptation at the gene level in the human genome is hampered by the fact that we do not know whether disease genes have experienced more, less, or as much adaptation as non-disease genes during the last ~50,000 years of human evolution. Here, we compare the rate of strong recent adaptation, in the form of selective sweeps, between Mendelian, non-infectious disease genes and non-disease genes across 26 distinct human populations from the 1000 Genomes Project. We find that Mendelian disease genes have experienced far fewer selective sweeps than non-disease genes, especially in Africa. This sweep deficit at Mendelian disease genes is less visible in East Asia or Europe. Investigating the possible causes of this sweep deficit further, we find that it is very strong at disease genes with both low recombination rates and high numbers of associated disease variants, but almost non-existent at disease genes with higher recombination rates or lower numbers of associated disease variants. Because segregating recessive deleterious variants can interfere with adaptive ones, these observations strongly suggest that adaptation has been slowed down by the presence of interfering recessive deleterious variants at disease genes. This is further supported by population simulations showing that interference at disease genes is expected to be lower in East Asia and Europe. These results clarify the evolutionary relationship between disease genes and recent genomic adaptation, and suggest that disease genes suffer not only from a higher load of segregating deleterious mutations, but also from a transient inability to adapt as much and/or as fast as the rest of the genome.
The entire article is based on publicly available disease genes and genomic data. The disease genes used, the sweep data, and the sweep enrichment analysis pipeline (bootstrap test and false positive risk estimation), together with the required input files including the confounding factors, are available at https://github.com/DavidPierreEnard/Gene_Set_Enrichment_Pipeline
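To illustrate the general idea of a bootstrap test for a sweep deficit that accounts for a confounding factor, a minimal sketch follows. It is not the authors' pipeline (which is available at the repository above). The function name `bootstrap_sweep_deficit`, the `(confounder_bin, in_sweep)` gene representation, and the use of a single binned confounder are all assumptions made for illustration only.

```python
import random

def bootstrap_sweep_deficit(disease, controls, n_iter=1000, seed=0):
    """Hypothetical sketch of a matched bootstrap test (not the published pipeline).

    disease, controls: lists of (confounder_bin, in_sweep) tuples, one per gene,
    where in_sweep is 1 if the gene overlaps a sweep signal and 0 otherwise.
    Returns (observed sweep count, mean control count, one-sided deficit p-value).
    """
    rng = random.Random(seed)

    # Group control (non-disease) genes by confounder bin so that each
    # disease gene is compared with controls that share its bin.
    by_bin = {}
    for bin_label, in_sweep in controls:
        by_bin.setdefault(bin_label, []).append(in_sweep)

    observed = sum(in_sweep for _, in_sweep in disease)

    # Build the null distribution: for every disease gene, draw one control
    # gene (with replacement) from the same bin and count sweeps.
    # Assumes every disease-gene bin also contains at least one control gene.
    null_counts = []
    for _ in range(n_iter):
        total = sum(rng.choice(by_bin[bin_label]) for bin_label, _ in disease)
        null_counts.append(total)

    mean_null = sum(null_counts) / n_iter
    # Fraction of matched control sets with as few or fewer sweeps than observed.
    p_deficit = sum(count <= observed for count in null_counts) / n_iter
    return observed, mean_null, p_deficit
```

In this sketch, a small `p_deficit` indicates that the disease genes carry fewer sweeps than matched control sets drawn from non-disease genes. A real analysis would match on several confounding factors at once and would also estimate the false positive risk, which this example does not attempt.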
- David Enard
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
- Luis Barreiro
- Received: April 1, 2021
- Accepted: October 2, 2021
- Accepted Manuscript published: October 12, 2021 (version 1)
© 2021, Di et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.