The whale shark genome reveals patterns of vertebrate gene family evolution

  1. Milton Tan (corresponding author)
  2. Anthony K Redmond
  3. Helen Dooley
  4. Ryo Nozu
  5. Keiichi Sato
  6. Shigehiro Kuraku
  7. Sergey Koren
  8. Adam M Phillippy
  9. Alistair DM Dove
  10. Timothy Read
  1. University of Illinois Urbana-Champaign, United States
  2. Trinity College Dublin, Ireland
  3. University of Maryland School of Medicine, United States
  4. Okinawa Churashima Research Center, Japan
  5. National Institute of Genetics, Japan
  6. National Center for Biotechnology Information, United States
  7. Georgia Aquarium, United States
  8. Emory University School of Medicine, United States

Abstract

Chondrichthyes (cartilaginous fishes) are fundamental for understanding vertebrate evolution, yet their genomes are understudied. We report long-read sequencing of the whale shark genome, which generated the best gapless chondrichthyan genome assembly yet, with higher contig contiguity than all other cartilaginous fish genomes, and we used it to study vertebrate genomic evolution of ancestral gene families, immunity, and gigantism. We found a major increase in gene families at the origin of gnathostomes (jawed vertebrates) independent of their genome duplication. We studied vertebrate pathogen recognition receptors (PRRs), which are key in initiating innate immune defense, and found diverse patterns of gene family evolution, demonstrating that adaptive immunity in gnathostomes did not fully displace germline-encoded PRR innovation. We also discovered a new Toll-like receptor (TLR29) and three NOD1 copies in the whale shark. We found that chondrichthyan and giant vertebrate genomes had decreased substitution rates compared to other vertebrates, but gene family expansion rates varied among vertebrate giants, suggesting that substitution and expansion rates of gene families are decoupled in vertebrate genomes. Finally, we found that gene families that shifted in expansion rate in vertebrate giants were enriched for human cancer-related genes, consistent with gigantism requiring adaptations to suppress cancer.

Data availability

Raw genome sequencing data have been deposited in SRA under SRX3471980. Raw transcriptome sequence data are available at NCBI BioProject ID PRJDB8472 and DDBJ DRA ID DRA008572. The assembly has been deposited in GenBank under accession GCA_001642345.2.

Article and author information

Author details

  1. Milton Tan

    Illinois Natural History Survey, University of Illinois Urbana-Champaign, Champaign, United States
    For correspondence
    miltont@illinois.edu
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-9803-0827
  2. Anthony K Redmond

    Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
    Competing interests
    No competing interests declared.
  3. Helen Dooley

    Microbiology & Immunology, University of Maryland School of Medicine, Baltimore, United States
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-2570-574X
  4. Ryo Nozu

    Okinawa Churashima Research Center, Okinawa, Japan
    Competing interests
    No competing interests declared.
    ORCID: 0000-0002-1099-3152
  5. Keiichi Sato

    Okinawa Churashima Research Center, Okinawa, Japan
    Competing interests
    No competing interests declared.
  6. Shigehiro Kuraku

    Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Japan
    Competing interests
    Shigehiro Kuraku, Reviewing editor, eLife.
    ORCID: 0000-0003-1464-8388
  7. Sergey Koren

    National Center for Biotechnology Information, Bethesda, United States
    Competing interests
    No competing interests declared.
  8. Adam M Phillippy

    National Center for Biotechnology Information, Bethesda, United States
    Competing interests
    No competing interests declared.
  9. Alistair DM Dove

    Georgia Aquarium, Atlanta, United States
    Competing interests
    No competing interests declared.
    ORCID: 0000-0003-3239-4772
  10. Timothy Read

    Emory University School of Medicine, Atlanta, United States
    Competing interests
    No competing interests declared.

Funding

Georgia Aquarium

  • Alistair DM Dove

Emory School of Medicine Development

  • Timothy Read

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Dario Riccardo Valenzano, Max Planck Institute for Biology of Ageing, Germany

Version history

  1. Received: December 2, 2020
  2. Accepted: August 18, 2021
  3. Accepted Manuscript published: August 19, 2021 (version 1)
  4. Version of Record published: September 21, 2021 (version 2)

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Cite this article

Milton Tan, Anthony K Redmond, Helen Dooley, Ryo Nozu, Keiichi Sato, Shigehiro Kuraku, Sergey Koren, Adam M Phillippy, Alistair DM Dove, Timothy Read (2021)
The whale shark genome reveals patterns of vertebrate gene family evolution
eLife 10:e65394.
https://doi.org/10.7554/eLife.65394
