The whale shark genome reveals patterns of vertebrate gene family evolution

  1. Milton Tan (corresponding author)
  2. Anthony K Redmond
  3. Helen Dooley
  4. Ryo Nozu
  5. Keiichi Sato
  6. Shigehiro Kuraku
  7. Sergey Koren
  8. Adam M Phillippy
  9. Alistair DM Dove
  10. Timothy Read
  1. University of Illinois Urbana-Champaign, United States
  2. Trinity College Dublin, Ireland
  3. University of Maryland School of Medicine, United States
  4. Okinawa Churashima Research Center, Japan
  5. National Institute of Genetics, Japan
  6. National Center for Biotechnology Information, United States
  7. Georgia Aquarium, United States
  8. Emory University School of Medicine, United States

Abstract

Chondrichthyes (cartilaginous fishes) are fundamental for understanding vertebrate evolution, yet their genomes are understudied. We report long-read sequencing of the whale shark genome, which generated the best gapless chondrichthyan genome assembly yet, with higher contig contiguity than all other cartilaginous fish genomes, and we used it to study vertebrate genomic evolution of ancestral gene families, immunity, and gigantism. We found a major increase in gene families at the origin of gnathostomes (jawed vertebrates) independent of their genome duplication. We studied vertebrate pathogen recognition receptors (PRRs), which are key in initiating innate immune defense, and found diverse patterns of gene family evolution, demonstrating that adaptive immunity in gnathostomes did not fully displace germline-encoded PRR innovation. We also discovered a new Toll-like receptor (TLR29) and three NOD1 copies in the whale shark. We found that chondrichthyan and giant vertebrate genomes had decreased substitution rates compared to other vertebrates, but that gene family expansion rates varied among vertebrate giants, suggesting that substitution and expansion rates of gene families are decoupled in vertebrate genomes. Finally, we found that gene families that shifted in expansion rate in vertebrate giants were enriched for human cancer-related genes, consistent with gigantism requiring adaptations to suppress cancer.

Data availability

Raw genome sequencing data have been deposited to SRA under SRX3471980. Raw transcriptome sequence data are available at NCBI BioProject ID PRJDB8472 and DDBJ DRA ID DRA008572. The assembly has been deposited to GenBank under accession GCA_001642345.2.

Article and author information

Author details

  1. Milton Tan

    Illinois Natural History Survey, University of Illinois Urbana-Champaign, Champaign, United States
    For correspondence
    miltont@illinois.edu
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-9803-0827
  2. Anthony K Redmond

    Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
    Competing interests
    No competing interests declared.
  3. Helen Dooley

    Microbiology & Immunology, University of Maryland School of Medicine, Baltimore, United States
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-2570-574X
  4. Ryo Nozu

    Okinawa Churashima Research Center, Okinawa, Japan
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-1099-3152
  5. Keiichi Sato

    Okinawa Churashima Research Center, Okinawa, Japan
    Competing interests
    No competing interests declared.
  6. Shigehiro Kuraku

    Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Japan
    Competing interests
    Shigehiro Kuraku, Reviewing editor, eLife.
    ORCID: 0000-0003-1464-8388
  7. Sergey Koren

    National Center for Biotechnology Information, Bethesda, United States
    Competing interests
    No competing interests declared.
  8. Adam M Phillippy

    National Center for Biotechnology Information, Bethesda, United States
    Competing interests
    No competing interests declared.
  9. Alistair DM Dove

    Georgia Aquarium, Atlanta, United States
    Competing interests
    No competing interests declared.
    ORCID: 0000-0003-3239-4772
  10. Timothy Read

    Emory University School of Medicine, Atlanta, United States
    Competing interests
    No competing interests declared.

Funding

Georgia Aquarium

  • Alistair DM Dove

Emory School of Medicine Development

  • Timothy Read

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Cite this article

Milton Tan, Anthony K Redmond, Helen Dooley, Ryo Nozu, Keiichi Sato, Shigehiro Kuraku, Sergey Koren, Adam M Phillippy, Alistair DM Dove, Timothy Read (2021)
The whale shark genome reveals patterns of vertebrate gene family evolution
eLife 10:e65394.
https://doi.org/10.7554/eLife.65394
