The whale shark genome reveals patterns of vertebrate gene family evolution

  1. Milton Tan (corresponding author)
  2. Anthony K Redmond
  3. Helen Dooley
  4. Ryo Nozu
  5. Keiichi Sato
  6. Shigehiro Kuraku
  7. Sergey Koren
  8. Adam M Phillippy
  9. Alistair DM Dove
  10. Timothy Read
  1. University of Illinois Urbana-Champaign, United States
  2. Trinity College Dublin, Ireland
  3. University of Maryland School of Medicine, United States
  4. Okinawa Churashima Research Center, Japan
  5. National Institute of Genetics, Japan
  6. National Center for Biotechnology Information, United States
  7. Georgia Aquarium, United States
  8. Emory University School of Medicine, United States

Abstract

Chondrichthyes (cartilaginous fishes) are fundamental for understanding vertebrate evolution, yet their genomes are understudied. We used long-read sequencing of the whale shark genome to generate the best gapless chondrichthyan genome assembly yet, with higher contig contiguity than all other cartilaginous fish genomes, and studied vertebrate genomic evolution of ancestral gene families, immunity, and gigantism. We found a major increase in gene families at the origin of gnathostomes (jawed vertebrates) independent of their genome duplication. We studied vertebrate pathogen recognition receptors (PRRs), which are key in initiating innate immune defense, and found diverse patterns of gene family evolution, demonstrating that adaptive immunity in gnathostomes did not fully displace germline-encoded PRR innovation. We also discovered a new Toll-like receptor (TLR29) and three NOD1 copies in the whale shark. We found that chondrichthyan and giant vertebrate genomes had decreased substitution rates compared to other vertebrates, but gene family expansion rates varied among vertebrate giants, suggesting that substitution and expansion rates of gene families are decoupled in vertebrate genomes. Finally, we found that gene families that shifted in expansion rate in vertebrate giants were enriched for human cancer-related genes, consistent with gigantism requiring adaptations to suppress cancer.

Data availability

Raw genome sequencing data have been deposited to SRA under SRX3471980. Raw transcriptome sequence data are available at NCBI BioProject ID PRJDB8472 and DDBJ DRA ID DRA008572. The assembly has been deposited to GenBank and is accessioned as GCA_001642345.2.

Article and author information

Author details

  1. Milton Tan

    Illinois Natural History Survey, University of Illinois Urbana-Champaign, Champaign, United States
    For correspondence
    miltont@illinois.edu
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-9803-0827
  2. Anthony K Redmond

    Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
    Competing interests
    No competing interests declared.
  3. Helen Dooley

    Microbiology & Immunology, University of Maryland School of Medicine, Baltimore, United States
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-2570-574X
  4. Ryo Nozu

    Okinawa Churashima Research Center, Okinawa, Japan
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-1099-3152
  5. Keiichi Sato

    Okinawa Churashima Research Center, Okinawa, Japan
    Competing interests
    No competing interests declared.
  6. Shigehiro Kuraku

    Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Japan
    Competing interests
    Shigehiro Kuraku, Reviewing editor, eLife.
    ORCID: 0000-0003-1464-8388
  7. Sergey Koren

    National Center for Biotechnology Information, Bethesda, United States
    Competing interests
    No competing interests declared.
  8. Adam M Phillippy

    National Center for Biotechnology Information, Bethesda, United States
    Competing interests
    No competing interests declared.
  9. Alistair DM Dove

    Georgia Aquarium, Atlanta, United States
    Competing interests
    No competing interests declared.
    ORCID: 0000-0003-3239-4772
  10. Timothy Read

    Emory University School of Medicine, Atlanta, United States
    Competing interests
    No competing interests declared.

Funding

Georgia Aquarium

  • Alistair DM Dove

Emory School of Medicine Development

  • Timothy Read

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Cite this article

Milton Tan, Anthony K Redmond, Helen Dooley, Ryo Nozu, Keiichi Sato, Shigehiro Kuraku, Sergey Koren, Adam M Phillippy, Alistair DM Dove, Timothy Read (2021)
The whale shark genome reveals patterns of vertebrate gene family evolution
eLife 10:e65394.
https://doi.org/10.7554/eLife.65394
