Taxonium, a web-based tool for exploring large phylogenetic trees
Abstract
The COVID-19 pandemic has resulted in a step change in the scale of sequencing data, with more genomes of SARS-CoV-2 having been sequenced than any other organism on earth. These sequences reveal key insights when represented as a phylogenetic tree, which captures the evolutionary history of the virus, and allows the identification of transmission events and the emergence of new variants. However, existing web-based tools for exploring phylogenies do not scale to the size of datasets now available for SARS-CoV-2. We have developed Taxonium, a new tool that uses WebGL to allow the exploration of trees with tens of millions of nodes in the browser for the first time. Taxonium links each node to associated metadata and supports mutation-annotated trees, which are able to capture all known genetic variation in a dataset. It can be run entirely locally in the browser, from a server-based backend, or as a desktop application. We describe insights that analysing a tree of five million sequences can provide into SARS-CoV-2 evolution, and provide a tool at cov2tree.org for exploring a public tree of more than five million SARS-CoV-2 sequences. Taxonium can be applied to any tree, and is available at taxonium.org, with source code at github.com/theosanderson/taxonium.
Data availability
All code is available on GitHub. Data was not generated as part of this study. Data sources are indicated in the manuscript and raw data is available in all cases, without the need for requests.
Article and author information
Funding
Wellcome Trust (210918/Z/18/Z)
- Theo Sanderson
Wellcome Trust (FC001043)
- Theo Sanderson
Cancer Research UK (FC001043)
- Theo Sanderson
Medical Research Council (FC001043)
- Theo Sanderson
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Copyright
© 2022, Sanderson
This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- 2,963 views
- 283 downloads
- 46 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.
Citations by DOI
- 46 citations for umbrella DOI https://doi.org/10.7554/eLife.82392