An improved zebrafish transcriptome annotation for sensitive and comprehensive detection of cell type-specific genes
Abstract
The zebrafish is ideal for studying embryogenesis and is increasingly applied to model human disease. In these contexts, RNA-sequencing (RNA-seq) provides mechanistic insights by identifying transcriptome changes between experimental conditions. Application of RNA-seq relies on accurate transcript annotation for a genome of interest. Here, we find discrepancies in analysis from RNA-seq datasets quantified using Ensembl and RefSeq zebrafish annotations. These issues were due, in part, to variably annotated 3' untranslated regions and thousands of gene models missing from each annotation. Since these discrepancies could compromise downstream analyses and biological reproducibility, we built a more comprehensive zebrafish transcriptome annotation that addresses these deficiencies. Our annotation improves detection of cell type-specific genes in both bulk and single cell RNA-seq datasets, where it also improves resolution of cell clustering. Thus, we demonstrate that our new transcriptome annotation can outperform existing annotations, providing an important resource for zebrafish researchers.
Data availability
All data generated in this study are available in accompanying source data files. Transcriptome annotation files described in this study are available for download at zf-transcriptome.umassmed.edu. Raw and processed RNA-seq data generated in this study are available at GEO (GSE152759).
- Bulk RNA-seq data to assess an improved zebrafish transcriptome annotation. NCBI Gene Expression Omnibus, GSE152759.
- Morphogenesis and differentiation of embryonic vascular smooth muscle cells in zebrafish. NCBI Gene Expression Omnibus, GSE119718.
- Comprehensive identification of long non-coding RNAs expressed during zebrafish embryogenesis. NCBI Gene Expression Omnibus, GSE32900.
- Extensive alternative polyadenylation during zebrafish development. NCBI Gene Expression Omnibus, GSE37453.
Article and author information
Funding
National Heart, Lung, and Blood Institute (R35HL140017)
- Nathan D Lawson
National Human Genome Research Institute (U01HG007910)
- Onur Yukselen
- Alper Kucukural
National Center for Advancing Translational Sciences (UL1TR001453)
- Onur Yukselen
- Alper Kucukural
National Institute of Neurological Disorders and Stroke (R21NS105654)
- Nathan D Lawson
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Ethics
Animal experimentation: Zebrafish studies were performed in accordance with protocols #A2613 and #A2632, approved by the University of Massachusetts Institutional Animal Care and Use Committee (IACUC).
Copyright
© 2020, Lawson et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- 10,676 views
- 902 downloads
- 119 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.
Citations by DOI
- 119 citations for umbrella DOI https://doi.org/10.7554/eLife.55792