The whale shark genome reveals patterns of vertebrate gene family evolution
Abstract
Chondrichthyes (cartilaginous fishes) are fundamental for understanding vertebrate evolution, yet their genomes are understudied. We report long-read sequencing of the whale shark genome to generate the best gapless chondrichthyan genome assembly yet with higher contig contiguity than all other cartilaginous fish genomes, and studied vertebrate genomic evolution of ancestral gene families, immunity, and gigantism. We found a major increase in gene families at the origin of gnathostomes (jawed vertebrates) independent of their genome duplication. We studied vertebrate pathogen recognition receptors (PRRs), which are key in initiating innate immune defense, and found diverse patterns of gene family evolution, demonstrating that adaptive immunity in gnathostomes did not fully displace germline-encoded PRR innovation. We also discovered a new Toll-like receptor (TLR29) and three NOD1 copies in the whale shark. We found chondrichthyan and giant vertebrate genomes had decreased substitution rates compared to other vertebrates, but gene family expansion rates varied among vertebrate giants, suggesting substitution and expansion rates of gene families are decoupled in vertebrate genomes. Finally, we found gene families that shifted in expansion rate in vertebrate giants were enriched for human cancer-related genes, consistent with gigantism requiring adaptations to suppress cancer.
Data availability
Raw genome sequencing data have been deposited to SRA under SRX3471980. Raw transcriptome sequence sequence data are available at NCBI BioProject ID PRJDB8472 and DDBJ DRA ID DRA008572. The assembly has been deposited to GenBank and is accessioned as GCA_001642345.2.
-
Whale shark blood cell transcriptomeBioProject, PRJDB8472.
Article and author information
Author details
Funding
George Aquarium
- Alistair DM Dove
Emory School of Medicine Development
- Timothy Read
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Reviewing Editor
- Dario Riccardo Valenzano, Max Planck Institute for Biology of Ageing, Germany
Publication history
- Received: December 2, 2020
- Accepted: August 18, 2021
- Accepted Manuscript published: August 19, 2021 (version 1)
- Version of Record published: September 21, 2021 (version 2)
Copyright
This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Metrics
-
- 4,283
- Page views
-
- 456
- Downloads
-
- 10
- Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading
-
- Evolutionary Biology
- Genetics and Genomics
Functionally indispensable genes are likely to be retained and otherwise to be lost during evolution. This evolutionary fate of a gene can also be affected by factors independent of gene dispensability, including the mutability of genomic positions, but such features have not been examined well. To uncover the genomic features associated with gene loss, we investigated the characteristics of genomic regions where genes have been independently lost in multiple lineages. With a comprehensive scan of gene phylogenies of vertebrates with a careful inspection of evolutionary gene losses, we identified 813 human genes whose orthologs were lost in multiple mammalian lineages: designated ‘elusive genes.’ These elusive genes were located in genomic regions with rapid nucleotide substitution, high GC content, and high gene density. A comparison of the orthologous regions of such elusive genes across vertebrates revealed that these features had been established before the radiation of the extant vertebrates approximately 500 million years ago. The association of human elusive genes with transcriptomic and epigenomic characteristics illuminated that the genomic regions containing such genes were subject to repressive transcriptional regulation. Thus, the heterogeneous genomic features driving gene fates toward loss have been in place and may sometimes have relaxed the functional indispensability of such genes. This study sheds light on the complex interplay between gene function and local genomic properties in shaping gene evolution that has persisted since the vertebrate ancestor.
-
- Evolutionary Biology
- Plant Biology
While the domestication process has been investigated in many crops, the detailed route of cultivation range expansion and factors governing this process received relatively little attention. Here using mungbean (Vigna radiata var. radiata) as a test case, we investigated the genomes of more than one thousand accessions to illustrate climatic adaptation’s role in dictating the unique routes of cultivation range expansion. Despite the geographical proximity between South and Central Asia, genetic evidence suggests mungbean cultivation first spread from South Asia to Southeast, East, and finally reached Central Asia. Combining evidence from demographic inference, climatic niche modeling, plant morphology, and records from ancient Chinese sources, we showed that the specific route was shaped by the unique combinations of climatic constraints and farmer practices across Asia, which imposed divergent selection favoring higher yield in the south but short-season and more drought-tolerant accessions in the north. Our results suggest that mungbean did not radiate from the domestication center as expected purely under human activity, but instead the spread of mungbean cultivation is highly constrained by climatic adaptation, echoing the idea that human commensals are more difficult to spread through the south-north axis of continents.