An improved zebrafish transcriptome annotation for sensitive and comprehensive detection of cell type-specific genes

  1. Nathan D Lawson (corresponding author)
  2. Rui Li
  3. Masahiro Shin
  4. Ann Grosse
  5. Onur Yukselen
  6. Oliver A Stone
  7. Alper Kucukural
  8. Lihua Zhu
  1. University of Massachusetts Medical School, United States
  2. University of Oxford, United Kingdom

Abstract

The zebrafish is ideal for studying embryogenesis and is increasingly applied to model human disease. In these contexts, RNA-sequencing (RNA-seq) provides mechanistic insights by identifying transcriptome changes between experimental conditions. Application of RNA-seq relies on accurate transcript annotation for a genome of interest. Here, we find discrepancies in analysis from RNA-seq datasets quantified using Ensembl and RefSeq zebrafish annotations. These issues were due, in part, to variably annotated 3' untranslated regions and thousands of gene models missing from each annotation. Since these discrepancies could compromise downstream analyses and biological reproducibility, we built a more comprehensive zebrafish transcriptome annotation that addresses these deficiencies. Our annotation improves detection of cell type-specific genes in both bulk and single cell RNA-seq datasets, where it also improves resolution of cell clustering. Thus, we demonstrate that our new transcriptome annotation can outperform existing annotations, providing an important resource for zebrafish researchers.

Data availability

All data generated in this study are available in accompanying source data files. Transcriptome annotation files described in this study are available for download at zf-transcriptome.umassmed.edu. Raw and processed RNA-seq data generated in this study are available at GEO (GSE152759).

Article and author information

Author details

  1. Nathan D Lawson

    Department of Molecular, Cell, and Cancer Biology, University of Massachusetts Medical School, Worcester, United States
    For correspondence
    nathan.lawson@umassmed.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0001-7788-9619
  2. Rui Li

    Department of Molecular, Cell, and Cancer Biology, University of Massachusetts Medical School, Worcester, United States
    Competing interests
    The authors declare that no competing interests exist.
  3. Masahiro Shin

    Department of Molecular, Cell, and Cancer Biology, University of Massachusetts Medical School, Worcester, United States
    Competing interests
    The authors declare that no competing interests exist.
  4. Ann Grosse

    Department of Molecular, Cell, and Cancer Biology, University of Massachusetts Medical School, Worcester, United States
    Competing interests
    The authors declare that no competing interests exist.
  5. Onur Yukselen

    Department of Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, United States
    Competing interests
    The authors declare that no competing interests exist.
  6. Oliver A Stone

    Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
    Competing interests
    The authors declare that no competing interests exist.
  7. Alper Kucukural

    Department of Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0001-9983-394X
  8. Lihua Zhu

    Department of Molecular, Cell, and Cancer Biology, University of Massachusetts Medical School, Worcester, United States
    Competing interests
    The authors declare that no competing interests exist.

Funding

National Heart, Lung, and Blood Institute (R35HL140017)

  • Nathan D Lawson

National Human Genome Research Institute (U01HG007910)

  • Onur Yukselen
  • Alper Kucukural

National Center for Advancing Translational Sciences (UL1TR001453)

  • Onur Yukselen
  • Alper Kucukural

National Institute of Neurological Disorders and Stroke (R21NS105654)

  • Nathan D Lawson

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Elisabeth Busch-Nentwich, University of Cambridge

Ethics

Animal experimentation: Zebrafish studies were performed in accordance with protocols #A2613 and #A2632, approved by the University of Massachusetts Institutional Animal Care and Use Committee (IACUC).

Version history

  1. Received: February 6, 2020
  2. Accepted: August 21, 2020
  3. Accepted Manuscript published: August 24, 2020 (version 1)
  4. Version of Record published: September 11, 2020 (version 2)

Copyright

© 2020, Lawson et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 9,516 views
  • 825 downloads
  • 73 citations

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Cite this article

Nathan D Lawson, Rui Li, Masahiro Shin, Ann Grosse, Onur Yukselen, Oliver A Stone, Alper Kucukural, Lihua Zhu (2020) An improved zebrafish transcriptome annotation for sensitive and comprehensive detection of cell type-specific genes. eLife 9:e55792. https://doi.org/10.7554/eLife.55792
