Single-cell RNA-seq reveals hidden transcriptional variation in malaria parasites

8 figures, 1 table and 7 additional files

Figures

Establishment of a robust protocol for single-cell transcriptomic analysis of Plasmodium parasites.

(A) Overview of the single-cell RNAseq protocol. Steps in the original Smart-seq2 protocol (Picelli et al., 2013) that resulted in significant gains are highlighted in orange. (B) Relative numbers …

https://doi.org/10.7554/eLife.33105.003
Figure 2 with 2 supplements
Assessment of single-cell transcriptome sequence purity, diversity and accuracy.

(A) Individually sorted P. falciparum and P. berghei cells from a mixed pool revealed no doublets and little contamination. (B) Distributions of numbers of genes identified as expressed in our three …

https://doi.org/10.7554/eLife.33105.005
Figure 2—figure supplement 1
Dual sorting of P. berghei and P. falciparum cells shows that contamination from ambient RNA is low.

(A) Purified asexual late blood stage of GFP P. falciparum and mCherry P. berghei were mixed at a 1:1 ratio, inactivated in RNAlater, and sorted individually by flow cytometry, gated on respective …

https://doi.org/10.7554/eLife.33105.006
Figure 2—figure supplement 2
The GC content of transcript fragments agreed well with the GC content of genes.

There was no apparent over- or under-representation of GC-rich regions.

https://doi.org/10.7554/eLife.33105.007
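
The comparison in this supplement rests on computing GC content for each transcript fragment and each gene. A minimal sketch of that calculation follows, assuming GC content is the fraction of G and C among unambiguous bases. The function name and the toy sequences are hypothetical and are not taken from the study.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G or C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / counted


# Hypothetical toy sequences, for illustration only.
gene = "ATGAAATTTGCATAA"
fragment = "AAATTTGCAT"

# A fragment drawn without GC bias should have a GC content close to its gene's.
print(gc_content(gene), gc_content(fragment))
```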
Figure 3 with 2 supplements
Different cell types were successfully resolved using single-cell transcriptomes.

(A) A combination of Principal Components Analysis (PCA), k-means clustering and comparison to bulk RNA-seq datasets was used to classify 144 high-quality P. berghei single cells, and revealed three …

https://doi.org/10.7554/eLife.33105.008
Figure 3—figure supplement 1
We detect stage-specific transcripts at a variety of expression levels.

Stage-specific genes at different expression levels were identified from RNA-seq data from (Otto et al., 2014) for (A) asexual stages, (B) male gametocytes and (C) female gametocytes. Mean FPKM …

https://doi.org/10.7554/eLife.33105.009
Figure 3—figure supplement 2
Principal Components Analysis and classification of P. falciparum gametocyte cells.

(A) A combination of Principal Components Analysis (PCA), k-means clustering and comparison to bulk RNA-seq datasets was used to classify 191 high-quality P. falciparum gametocytes. A consensus of …

https://doi.org/10.7554/eLife.33105.010
Figure 4 with 4 supplements
Single-cell RNA-seq reveals hidden transcriptional variation in the asexual cell cycle.

(A) Pseudotime ordering (using [Trapnell et al., 2014]) of the asexual cells was in close agreement with bulk RNA-seq datasets (predicted stage = consensus; see Materials and methods). (B) …

https://doi.org/10.7554/eLife.33105.011
Figure 4—figure supplement 1
Pseudotime reconstruction of the late asexual trajectory of P. falciparum.

PCA of 155 P. falciparum cells colored by pseudotime (A) or Monocle state (B); identified trajectory branches are displayed as circled numbers 1 and 2. (C) Differentially expressed genes were …

https://doi.org/10.7554/eLife.33105.012
Figure 4—figure supplement 2
The same subsets of transcripts show different patterns of expression around the end of the asexual cell cycle in conventional bulk RNA-seq data and pseudotime reconstructions of single cell RNAseq data.

A shared set of 651 genes identified as following a sigmoidal expression pattern through the intraerythrocytic developmental cycle (see Materials and methods) are shown in both bulk transcriptome …

https://doi.org/10.7554/eLife.33105.013
Figure 4—figure supplement 3
Recently published low-coverage, high-throughput single-cell RNA-seq data supports our finding of step changes in gene expression in the P. falciparum asexual cycle.

A heatmap showing logged, mean-normalised expression values for late asexual parasites from (Poran et al., 2017) ordered by pseudotime. Genes were ordered as for Figure 4—figure supplement 2A.

https://doi.org/10.7554/eLife.33105.014
Figure 4—figure supplement 4
Analysis of the co-expression pattern of the ApiAP2 family of transcription factors (TFs) in asexual parasites.

(A) Expression of Plasmodium ApiAP2 genes in asexual parasites. Orthologous genes are presented on the same rows. (B) A co-expression network for P. berghei was built using significant positive and …

https://doi.org/10.7554/eLife.33105.015
Figure 5 with 1 supplement
After removing the signal of cell cycle progression, we identify a new class of cell-cycle independent variable genes.

(A) P. falciparum genes with >= 50% of their variance attributed to cell-cycle associated latent variable one vary in pseudotime. After removing variation associated with the cell cycle, 56 genes …

https://doi.org/10.7554/eLife.33105.016
Figure 5—figure supplement 1
Latent factor analysis of expression variation in cell cycle genes.

We found that only the first two latent variables explained at least 5% of variation in cell cycle genes (red line).

https://doi.org/10.7554/eLife.33105.017
Figure 6 with 2 supplements
Multigene families show variable expression within and between sexual stages of both P. berghei and P. falciparum.

(a) The heatmap shows gene expression levels for multigene family members differentially expressed between male and female P. berghei gametocytes. * gene variably …

https://doi.org/10.7554/eLife.33105.018
Figure 6—figure supplement 1
Multigene families show variable expression in sexual stages of both P. berghei and P. falciparum, respectively.

(A) Pir gene expression was highly variable across male gametocytes. In addition, more pir genes were expressed in males than females. These are distinct …

https://doi.org/10.7554/eLife.33105.019
Figure 6—figure supplement 2
Analysis of the co-expression pattern of the ApiAP2 family of transcription factors (TFs) in sexual parasites.

(A) Expression of Plasmodium ApiAP2 genes in sexual parasites. Orthologous genes are presented on the same rows. (B) A co-expression network for P. berghei was built using significant positive and …

https://doi.org/10.7554/eLife.33105.020
Author response image 1
Many reads map uniquely to similar pir genes.

Despite being very similar in identity (88% at the nucleotide level), most reads deriving from these transcripts map uniquely. It is notable here that there appears to be variable splicing of coding …

https://doi.org/10.7554/eLife.33105.029
Author response image 2
Plots of expression level against dropout rate for each cluster.

These data show that dropout rates within each cluster are generally very low and expression levels are high but cover a range of values. This makes it unlikely that all the genes in a cluster would …

https://doi.org/10.7554/eLife.33105.030

Tables

Table 1
Reagents permuted during optimisation of the single cell RNAseq protocol and summary statistics for each treatment condition after sequencing.

Different combinations of the protocol were tested by sequencing. Initial trials were performed with 2 µl of lysis buffer; this was increased to 4 µl to augment capture efficiency. Permutations of …

https://doi.org/10.7554/eLife.33105.004
Conditions tested

Protocol | SSII, V30, 30 cycles | SSII, T30, 30 cycles | SmSc, T30, 30 cycles | SSII, T30, 25 cycles | SmSc, T30, 25 cycles | SmSc, T30, 25 cycles | SmSc, T30, 25 cycles | SmSc, T30, 25 cycles
Cells | Sexual | Asexual | Asexual | Asexual | Asexual | Sexual | Mixed blood | Asexual
Species | Pf | Pf | Pf | Pf | Pf | Pf | Pb | Pf

Lysis buffer volume: 2 µl; 4 µl
Oligo dT (IDT): Anchored 30 bp; Non-Anchored 30 bp
Reverse transcriptase: Superscript II (Life Technologies) 10U; Smartscribe (Clontech) 5U
Cycle number: 25; 30
Sequencing machine: HiSeq; MiSeq

Sequencing results summary

% rRNA | 5.7 | 33.5 | 36.2 | 6.4 | 18.4 | 17.8 | 16.7 | 34.8
% coding genes | 4.4 | 11.3 | 39.3 | 10.5 | 33 | 51.7 | 49 | 40.5
% other | 90 | 55.2 | 24.4 | 83.1 | 48.6 | 30.5 | 34.2 | 24.6
Median genes detected for 50k reads | 25 | 84 | 145 | 174 | 181 | 502.5 | NA | NA
Total cells | 5 | 6 | 6 | 6 | 6 | 237 | 182 | 174
Cells passing filters | NA | NA | NA | NA | NA | 191 | 144 | 161
Median gene count | NA | NA | NA | NA | NA | 2011 | 1922.5 | 1793
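
The fraction of cells retained after filtering can be derived from the "Total cells" and "Cells passing filters" rows for the three large datasets (the last three conditions: Pf sexual, Pb mixed blood, Pf asexual). The sketch below shows that arithmetic. The dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Values read from the "Total cells" and "Cells passing filters" rows of Table 1
# for the last three conditions. The keys are hypothetical labels.
total_cells = {"Pf sexual": 237, "Pb mixed blood": 182, "Pf asexual": 174}
cells_passing = {"Pf sexual": 191, "Pb mixed blood": 144, "Pf asexual": 161}


def pass_rate(passed: int, total: int) -> float:
    """Return the fraction of sequenced cells that passed the filters."""
    if total <= 0 or not 0 <= passed <= total:
        raise ValueError("need 0 <= passed <= total and total > 0")
    return passed / total


for dataset, total in total_cells.items():
    rate = pass_rate(cells_passing[dataset], total)
    print(f"{dataset}: {rate:.1%} of cells passed filters")
```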

Additional files

Supplementary file 1

Marker genes identifying P. berghei mixed stage k-means clusters.

https://doi.org/10.7554/eLife.33105.021
Supplementary file 2

Genes identified as variable in asexual stage parasites

(a) Clusters of P. berghei genes in pseudotime. (b) GO term enrichment for clusters of P. berghei genes in pseudotime. GO class: bp = biological process, mf = molecular function, cc = cellular component. (c) Clusters of P. falciparum genes in pseudotime. (d) GO term enrichment for clusters of P. falciparum genes in pseudotime. (e) P. falciparum genes identified as variant independently of the cell cycle. Cell cycle variance is the proportion of the variance for that gene associated with the first two latent variables and therefore the cell cycle. Technical variance is the proportion of variance for that gene attributed to technical noise. Biological variance is the variance left over and attributable to cell-cycle-independent variation. (f) GO term enrichment for P. falciparum cell-cycle-independent genes. (g) P. berghei genes identified as variant independently of the cell cycle.

https://doi.org/10.7554/eLife.33105.022
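
The three variance components described in (e) partition each gene's variance, so the biological component is what remains once the cell-cycle and technical proportions are subtracted. A minimal sketch of that bookkeeping follows, assuming all three are expressed as proportions that sum to one. The function name and the example values are hypothetical.

```python
def biological_variance(cell_cycle: float, technical: float) -> float:
    """Return the proportion of variance left after removing the
    cell-cycle and technical components (all given as proportions)."""
    for name, value in (("cell_cycle", cell_cycle), ("technical", technical)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a proportion between 0 and 1")
    remainder = 1.0 - cell_cycle - technical
    if remainder < -1e-9:
        raise ValueError("cell-cycle and technical proportions exceed 1")
    # Clamp tiny negative values that arise from floating-point rounding.
    return max(remainder, 0.0)


# Hypothetical gene: 30% cell-cycle variance, 20% technical variance.
print(biological_variance(0.30, 0.20))
```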
Supplementary file 3

Highly variable genes and enriched functions in P. berghei and P. falciparum gametocytes.

(a) Genes identified as variable in P. berghei female gametocytes. The p and q values were calculated using M3Drop. (b) GO term enrichment amongst genes from (a). (c) Genes identified as variable in P. berghei male gametocytes. (d) GO term enrichment amongst genes from (c). (e) Genes identified as variable in P. falciparum female gametocytes. (f) GO term enrichment amongst genes from (e).

https://doi.org/10.7554/eLife.33105.023
Supplementary file 4

Gene expression data for multigene families.

(a) Gene expression data for pirs in P. berghei cells underlying Figure 6—figure supplement 1A. (b) Gene expression data for vars in P. falciparum cells underlying Figure 3b. (c) Multigene family members differentially expressed between P. berghei male and female gametocytes. (d) Multigene family members differentially expressed between P. falciparum male and female gametocytes, based on bulk RNA-seq data from Lasonder et al. (2016).

https://doi.org/10.7554/eLife.33105.024
Supplementary file 5

Samples sequenced in this study

(a) Description of samples generated with the initial, unmodified Smart-seq2 protocol. (b) Description of samples generated with variants of the Smart-seq2 protocol, e.g. differing numbers of PCR cycles and different reverse transcriptases. (c) Samples used to assess contamination of single cells due to lysis. (d) Description of samples for P. berghei mixed blood stages. Sc3_k4 = clustering results for SC3 clustering of all cells with k = 4, sc3_k3 = SC3 clustering of all cells with k = 3, sc3_sex_k3 = SC3 clustering of only male and female gametocytes with k = 3 (used to identify outliers). Hoo is the best correlated timepoint from the Hoo et al. (2016) microarray data for each cell. Otto is the best correlated timepoint from the Otto et al. (2014) RNA-seq data for each cell. Consensus is our consensus call between the clustering and the correlations against these bulk datasets. Pass_filter is TRUE if that cell passed our filtering criteria. (e) Description of samples for P. falciparum asexual parasites. Lopez is the best correlated timepoint from the López-Barragán et al. (2011) bulk RNA-seq data. Otto is the best correlated timepoint from the Otto et al. (2010) bulk RNA-seq data. Pseudotime state is the path within pseudotime identified by Monocle. This was used to filter out minor paths. Pass_filter is TRUE if that cell passed our filtering criteria. (f) Description of samples for P. falciparum gametocytes. Lasonder is the best correlated sample from the Lasonder et al. (2016) bulk RNA-seq data.

https://doi.org/10.7554/eLife.33105.025
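
The "best correlated timepoint" columns described above assign each cell the bulk timepoint whose expression profile it matches most closely. The sketch below illustrates the idea using Pearson correlation, which is an assumption here because this description does not name the correlation measure. The function names, timepoint labels and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def best_correlated_timepoint(cell, bulk):
    """Return the label of the bulk timepoint most correlated with the cell.

    `cell` is a list of expression values; `bulk` maps a timepoint label to a
    profile over the same genes in the same order.
    """
    return max(bulk, key=lambda timepoint: pearson(cell, bulk[timepoint]))


# Hypothetical toy data: four genes, three bulk timepoints.
bulk_profiles = {
    "8h": [10, 2, 1, 0],
    "24h": [1, 9, 3, 1],
    "40h": [0, 2, 8, 9],
}
single_cell = [0, 1, 5, 7]
print(best_correlated_timepoint(single_cell, bulk_profiles))
```

A consensus call would then combine labels of this kind with the cluster assignments. That step is not sketched here because its rule is given in the Materials and methods rather than in this description.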
Supplementary file 6

Gene count tables for the three large datasets included in the study.

(a) Read counts for P. berghei mixed blood stages. (b) Read counts for P. falciparum asexual parasites. (c) Read counts for P. falciparum gametocytes.

https://doi.org/10.7554/eLife.33105.026
Transparent reporting form
https://doi.org/10.7554/eLife.33105.027
