Single-cell RNA-seq reveals hidden transcriptional variation in malaria parasites

Version of Record: March 27, 2018

1. Part of Collection
Malaria: A Collection of Articles

Edited by Olivier Silvie et al.

Abstract

Single-cell RNA-sequencing is revolutionising our understanding of seemingly homogeneous cell populations but has not yet been widely applied to single-celled organisms. Transcriptional variation in unicellular malaria parasites from the Plasmodium genus is associated with critical phenotypes including red blood cell invasion and immune evasion, yet transcriptional variation at an individual parasite level has not been examined in depth. Here, we describe the adaptation of a single-cell RNA-sequencing (scRNA-seq) protocol to deconvolute transcriptional variation for more than 500 individual parasites of both rodent and human malaria comprising asexual and sexual life-cycle stages. We uncover previously hidden discrete transcriptional signatures during the pathogenic part of the life cycle, suggesting that expression over development is not as continuous as commonly thought. In transmission stages, we find novel, sex-specific roles for differential expression of contingency gene families that are usually associated with immune evasion and pathogenesis.

https://doi.org/10.7554/eLife.33105.001

eLife digest

Malaria is a life-threatening disease that affects hundreds of millions of people every year and causes around 500,000 deaths, mostly among young children. The disease is caused by Plasmodium parasites, which have a complex life cycle that involves different stages in different hosts. During mosquito bites, the parasites can be transmitted to people where they spend part of their life cycle inside red blood cells. Inside these cells, they can multiply rapidly and eventually burst the blood cells, which causes some of the symptoms of the disease. The parasite also produces sexual stages, which can be passed on to the next mosquito that feeds on the host.

Scientists have been studying these different stages to better understand how the parasites manage to evade the human immune system so successfully. Most of the research has looked at how genes differ between large pools of parasites, but this approach hides important differences between individual parasites. Understanding variation and how individual parasites behave could help to develop new and effective drugs and vaccines for malaria.

Now, Reid et al. used a technique called single-cell RNA sequencing, which allowed them to home in on individual genes within a single parasite. This revealed hidden patterns in the way the parasites use their genes across the life cycle. When the parasite is developing inside a red blood cell, distinct groups of genes turn on simultaneously and are later switched off together. Reid et al. found clues about the genes that might be controlling these groups. The experiments also showed that a set of genes previously thought to be involved solely in evading the immune system is also important for the transition from human to mosquito.

A next step will be to see if single-cell RNA sequencing technology could be used to reveal more about the basic biology of the parasite and how it resists drugs or evades the immune system. In the future, this may help to develop drugs that interfere with the synchronisation of these groups of genes to disrupt the parasite’s development and stop it from causing the disease. The genes involved in transmission between hosts could be another promising drug target, and one day, may help to eliminate the disease.

https://doi.org/10.7554/eLife.33105.002

Introduction

Malaria is caused by unicellular eukaryotic parasites from the Plasmodium genus. These organisms have a complex life cycle comprising many different developmental stages. In the blood of infected patients, asexual intra-erythrocytic replication of parasites is solely responsible for pathogenesis, whilst sexual stages, termed gametocytes, are the only stage capable of transmission to the next host via the mosquito vector. These distinct life stages have been extensively investigated using transcriptomic approaches (Otto et al., 2010; Bozdech et al., 2003a; López-Barragán et al., 2011; Llinás et al., 2006; Hall et al., 2005; Lasonder et al., 2016; Otto et al., 2014), but this has been largely at a population level. Little is known about how individual cells vary within stages.

Single-cell RNA-seq (scRNA-seq) produces transcriptomic profiles for multiple individual cells. This has allowed the decomposition of cell populations (Haber et al., 2017), uncovered previously unknown cell types (Grün et al., 2015) and enhanced our understanding of developmental pathways (Mohammed et al., 2017). Several scRNA-seq methods with different attributes have now been described (Ziegenhain et al., 2017), with some providing depth – a good representation of full-length transcripts (Picelli et al., 2013) from tens or hundreds of cells – and others providing breadth, with poorer representation of transcriptomes but from a much greater number of cells (Macosko et al., 2015). scRNA-seq promises powerful new examinations of unicellular organisms, especially those that are difficult to obtain in large numbers or are not amenable to in vitro cultivation. A number of important questions in malaria biology will benefit from single-cell technology. For instance, what are the transcriptional switches in individual parasites that drive phenotypes such as commitment to the sexual development pathway (Sinha et al., 2014; Kafsack et al., 2014), parasite sequestration (Tembo et al., 2014) and immune evasion (Scherf et al., 2008)?

A recent study (Poran et al., 2017) demonstrated the use of a high-throughput, low-coverage scRNA-seq technique (Drop-seq [Macosko et al., 2015]) to identify a signature of sexual commitment in Plasmodium. Here, we use a lower throughput (fewer cells), but higher coverage (both more genes detected and more of each gene’s length detected via full-length transcript sequencing) approach to examine transcriptional dynamics of the parasite during the blood stages in both the most popular rodent model parasite (P. berghei) and the most deadly human malaria parasite (P. falciparum). We show that this method is highly effective at capturing transcriptional variation associated with different parasite stages and cell cycle states, and we also uncover previously unknown aspects of the parasite's progression through its asexual cycle and in its sexual stages.

Results

Optimisation of a single-cell RNA-seq protocol for Plasmodium parasites

The greatest coverage of genes in mammalian cells using scRNA-seq has been achieved with the Smart-seq2 protocol (Picelli et al., 2013). In this method, cells are sorted by FACS into individual wells, followed by full-length cDNA generation using a viral reverse transcriptase. This mediates the addition of a triple cytosine overhang to the 3′ end of the first-strand cDNA, which allows the annealing of a strand-switching oligonucleotide for second-strand synthesis and direct cDNA amplification by PCR. This plate-based approach tends to result in detection of more transcripts from more genes than other approaches (Svensson et al., 2017). Furthermore, it is a full-length transcript method, providing information about transcript structure, allowing deconvolution of splicing variants and inference about the strand of origin (Wu et al., 2015).

Initially, we trialled the standard version of the Smart-seq2 protocol (Picelli et al., 2013) on sorted, Plasmodium falciparum-infected single red blood cells (Figure 1A), adjusting only the number of PCR cycles (30 rather than 18) to account for the relatively low RNA content of protozoan cells. However, on average, only 10% of reads mapped to genes in the parasite genome and more than half of these mapped to rRNA genes (Figure 1B).

Figure 1

Establishment of a robust protocol for single-cell transcriptomic analysis of *Plasmodium* parasites.

(A) Overview of the single-cell RNA-seq protocol. Steps in the original Smart-seq2 protocol (Picelli et al., 2013) that resulted in significant gains are highlighted in orange. (B) Relative numbers of reads mapping to coding RNA and rDNA for our initial sequencing trial, averaged over all cells in that trial (n = 5). (C) The protocol was evaluated using qPCR of the *msp-1* transcript (PF3D7_09303000) on sorted pools of 10 asexual parasites (n = 8) (significance from Mann-Whitney test: p≤0.05 (\*), p≤0.01 (\*\*), p≤0.001 (\*\*\*)). The following reagents were tested: oligo(dT)s containing a terminal anchoring base (A,G,C; V) or not (T) and of varying lengths (20 Ts vs. 30 Ts); four reverse transcriptase enzymes; 25 or 30 cycles of preamplification. (D) Relative numbers of reads mapping to coding RNA and rDNA for optimisation trials (6, 5, 6, 6 cells, respectively) and the main *P. falciparum* gametocyte (n = 237), *P. berghei* mixed blood (n = 182) and *P. falciparum* asexual (n = 189) datasets (final three bars). Asterisks indicate selected significant differences between proportions of reads mapping to coding genes, calculated using Mann-Whitney U (p≤0.05 (\*), p≤0.01 (\*\*)).

https://doi.org/10.7554/eLife.33105.003

To improve yield, we tested the impact of: removing the anchoring base from the oligo(dT) primer and varying its length (20 vs 30); changing the reverse transcription enzyme (SuperScript II, SuperScript IV, SMART MMLV and SMARTScribe); and varying the number of amplification cycles (25 or 30). We generated libraries for pools of 10 sorted late-stage P. falciparum cells and tested the abundance of transcripts from the msp-1 gene by quantitative RT-PCR. A longer, unanchored oligo(dT) primer (T30) significantly improved yield, and SuperScript II and SMARTScribe were the highest yielding reverse transcriptases (Figure 1C). Amplification for 25 and 30 cycles appeared to give equivalent results (Figure 1C). To understand the impact of these permutations on transcriptome sequence complexity, we sorted individual P. falciparum cells and generated single-cell transcriptome libraries using the dT₃₀ oligo, either the SuperScript II or SMARTScribe enzyme, and either 25 or 30 cycles of PCR (Figure 1D). Significantly more genes were detected, with dramatically reduced rRNA contamination, using the SMARTScribe enzyme (Figure 1D; Table 1). Given equivalent results for 25 or 30 cycles, we opted to use the lower number of cycles for all subsequent experiments.

Table 1

Reagents permuted during optimisation of the single-cell RNA-seq protocol and statistics for each treatment condition after sequencing.

Different combinations of the protocol were tested by sequencing. Initial trials were performed with 2 µl of lysis buffer; this was increased to 4 µl to augment capture efficiency. The permutations tested were an oligo(dT) with a terminal anchoring base (A,G,C; V) or without one (T), two reverse transcriptase enzymes (SMARTScribe (SmSc); SuperScript II (SSII)) and 25 or 30 cycles of preamplification. Both sexual and asexual cells of P. berghei and P. falciparum were tested. For each sequenced dataset, we calculated the mean percentages of rRNA, mRNA and other reads across the cells. For some samples we also downsampled the data to 50,000 reads per cell so that the number of genes detected, and hence the complexity of each library, could be compared. For the three larger datasets produced (P. falciparum gametocytes, P. berghei mixed blood stages, and P. falciparum asexual stages), we provide the numbers of pre- and post-filtered cells and the median number of genes in those filtered cells.

https://doi.org/10.7554/eLife.33105.004

Conditions tested	Protocol	SSII, V30, 30 cycles	SSII, T30, 30 cycles	SmSc, T30, 30 cycles	SSII, T30, 25 cycles	SmSc, T30, 25 cycles	SmSc, T30, 25 cycles	SmSc, T30, 25 cycles	SmSc, T30, 25 cycles
	Cells	Sexual	Asexual	Asexual	Asexual	Asexual	Sexual	Mixed blood	Asexual
	Species	Pf	Pf	Pf	Pf	Pf	Pf	Pb	Pf
Lysis buffer volume	2 µl	✓
Lysis buffer volume	4 µl		✓	✓	✓	✓	✓	✓	✓
Oligo(dT) (IDT)	Anchored 30 bp	✓
Oligo(dT) (IDT)	Non-Anchored 30 bp		✓	✓	✓	✓	✓	✓	✓
Reverse transcriptase	Superscript II (Life Technologies) 10U	✓	✓		✓
Reverse transcriptase	Smartscribe (Clontech) 5U			✓		✓	✓	✓	✓
Cycle number	25				✓	✓	✓	✓	✓
Cycle number	30	✓	✓	✓
Sequencing machine	HiSeq						✓	✓	✓
Sequencing machine	MiSeq	✓	✓	✓	✓	✓
Sequencing results summary	% rRNA	5.7	33.5	36.2	6.4	18.4	17.8	16.7	34.8
	% coding genes	4.4	11.3	39.3	10.5	33	51.7	49	40.5
	% other	90	55.2	24.4	83.1	48.6	30.5	34.2	24.6
	Median genes detected for 50k reads	25	84	145	174	181	502.5	NA	NA
	Total cells	5	6	6	6	6	237	182	174
	Cells passing filters	NA	NA	NA	NA	NA	191	144	161
	Median gene count	NA	NA	NA	NA	NA	2011	1922.5	1793
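
The downsampling used for the "Median genes detected for 50k reads" row can be illustrated with a minimal sketch. It assumes each cell is represented as a dictionary of per-gene read counts and that a gene counts as detected with at least one read; the function names and the `min_reads` default are illustrative assumptions, not the authors' pipeline.

```python
import random


def downsample_counts(gene_counts, depth, seed=0):
    """Randomly keep `depth` reads, without replacement, from one cell.

    gene_counts: {gene_id: read_count} for a single cell (hypothetical layout).
    Returns a new {gene_id: read_count} dict, or None if the cell has
    fewer than `depth` reads and so cannot be compared at that depth.
    """
    total = sum(gene_counts.values())
    if total < depth:
        return None
    # Expand to one entry per read so every read is equally likely to be kept.
    reads = [gene for gene, n in gene_counts.items() for _ in range(n)]
    rng = random.Random(seed)
    sampled = {}
    for gene in rng.sample(reads, depth):
        sampled[gene] = sampled.get(gene, 0) + 1
    return sampled


def genes_detected(gene_counts, min_reads=1):
    """Number of genes with at least `min_reads` reads (threshold is an assumption)."""
    return sum(1 for n in gene_counts.values() if n >= min_reads)
```

Calling `downsample_counts(cell, 50_000)` for every cell and taking the median of `genes_detected` over the results would give a depth-matched complexity measure of the kind reported in the table.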

Two potential sources of contamination are important to consider in scRNA-seq experiments. First, single-sorted cells could actually comprise multiple cells, resulting in a hybrid signal that adds noise to downstream analyses. Second, ambient RNA from lysed cells in the cell suspension could be transferred along with intact cells into each well. To evaluate these potential sources of contamination, we flow-sorted individual parasites from a mixture of fluorescently-labelled GFP P. falciparum (Pf) and mCherry P. berghei (Pb) late-stage parasites into a 96-well plate (Figure 2—figure supplement 1). We then prepared and sequenced transcriptome libraries for each cell. The reads were mapped to a combined reference of both genome sequences. No evidence for doublet events was found (Figure 2A) and, for each cell, the vast majority of reads (98.1% for P. berghei, 99.4% for P. falciparum) mapped uniquely to the genome of the expected species (Figure 2A). The few transcripts that mapped to the wrong genome were those most highly expressed in the other species and most likely to be picked up from the solution (Figure 2—figure supplement 1B,C). A very low number of individual ambient transcripts were detected (Figure 2—figure supplement 1D,E). Only 15 of 3566 transcripts detected in P. berghei cells were from P. falciparum, and none of these were differentially expressed, suggesting they will not affect our downstream analysis.
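
The logic of this species-mixing check can be sketched as follows: given the number of reads from one well that map uniquely to each genome in the combined reference, a well dominated by one species is a clean single cell, whereas a well with substantial signal from both would be a candidate doublet. The 0.9 purity threshold and the function name are assumptions made for illustration; the text does not state a cutoff.

```python
def classify_cell(reads_pf, reads_pb, purity_threshold=0.9):
    """Label one sorted well from its uniquely mapped read counts.

    reads_pf / reads_pb: reads mapping to the P. falciparum / P. berghei
    genome in a combined reference. `purity_threshold` is an assumed cutoff.
    """
    total = reads_pf + reads_pb
    if total == 0:
        return "empty"
    if reads_pf / total >= purity_threshold:
        return "P. falciparum"
    if reads_pb / total >= purity_threshold:
        return "P. berghei"
    # Neither species dominates: consistent with two cells in one well.
    return "possible doublet"
```

For example, a well with 994 P. falciparum reads and 6 P. berghei reads is labelled "P. falciparum", with the minority reads attributable to ambient RNA rather than a second cell.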

Figure 2 with 2 supplements

Assessment of single-cell transcriptome sequence purity, diversity and accuracy.

(A) Individually sorted *P. falciparum* and *P. berghei* cells from a mixed pool revealed no doublets and little contamination. (B) Distributions of numbers of genes identified as expressed in our three main datasets. (C) Expressed genes (those with at least 10 reads in at least five cells) were representative of average gene length, suggesting that although the reverse transcriptase might not copy the whole of long transcripts, fragments of long genes are still detected. (D) Sequencing library preparation often introduces end bias, where either the 5’ or 3’ end of transcripts tends to be better covered. Our protocol introduced a small 5’-bias, which could be attributable to reverse transcription sometimes initiating at internal poly(A) regions within transcripts, rather than at the 3’ poly(A) tail.

https://doi.org/10.7554/eLife.33105.005

Having established the reliability of the protocol, we generated 188 single-cell transcriptomes of mixed asexual and sexual (gametocyte) blood-stage parasites of the rodent malaria model P. berghei. After filtering to remove transcriptomes with fewer than 25,000 total reads and fewer than 1000 detected genes (with at least one read), 144 high-quality transcriptomes remained. We then removed genes unless they had at least ten reads in each of five or more cells. In total, we detected expression of 4579 genes: over 90% of genes in the P. berghei genome. From each cell, we identified expression from, on average, 1981 genes (~33%), similar to the proportion of transcriptomes detected in mammalian single-cell experiments (Treutlein et al., 2014) (Figure 2B).

We also generated single-cell transcriptomes for the human malaria parasite P. falciparum and processed them using the same filtering procedure as for P. berghei. This resulted in 191 high-quality single-cell transcriptomes (of 237 total) for sexual stages, with an average 2090 genes detected, and 161 high-quality single-cell transcriptomes (of 174 total) for asexual stages, with an average 1712 genes detected (Figure 2B).
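
The two filtering steps described above can be written down compactly. The sketch below uses the thresholds quoted in the text (25,000 reads, 1000 detected genes, ten reads in five or more cells) as defaults; the nested-dictionary layout and function names are hypothetical, and it assumes a cell must pass both the read and the gene threshold to be kept.

```python
def filter_cells(counts, min_reads=25_000, min_genes=1_000):
    """Keep cells with enough total reads and enough detected genes.

    counts: {cell_id: {gene_id: read_count}} (hypothetical layout).
    A gene is 'detected' in a cell if it has at least one read.
    """
    kept = {}
    for cell, genes in counts.items():
        total_reads = sum(genes.values())
        detected = sum(1 for n in genes.values() if n >= 1)
        if total_reads >= min_reads and detected >= min_genes:
            kept[cell] = genes
    return kept


def filter_genes(counts, min_reads=10, min_cells=5):
    """Keep genes with at least `min_reads` reads in at least `min_cells` cells."""
    support = {}
    for genes in counts.values():
        for gene, n in genes.items():
            if n >= min_reads:
                support[gene] = support.get(gene, 0) + 1
    keep = {gene for gene, n_cells in support.items() if n_cells >= min_cells}
    return {
        cell: {gene: n for gene, n in genes.items() if gene in keep}
        for cell, genes in counts.items()
    }
```

Applying `filter_cells` first and `filter_genes` second mirrors the order given in the text, so that low-quality cells do not contribute to the per-gene support counts.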

We used the P. berghei dataset to explore biases in the representation of transcripts sequenced with our protocol. First, we checked if some regions were overrepresented amongst our transcript sequences due to preferential amplification of less AT-rich sequences by PCR. Second, because the reverse transcriptase ought to process a complete mRNA in order to produce cDNA, we determined whether there was a bias against long genes. In fact, neither GC content (Figure 2—figure supplement 2) nor gene length (Figure 2C) had an impact on transcript detection. In the case of many long genes, the lack of a length bias could be due to the sequencing of mRNA fragments, rather than full-length sequences. This suggests that the Smart-seq2 protocol is susceptible to internal priming by oligo(dT) (as described in [Nam et al., 2002]) and template-switching at the exposed 5’ ends of mRNA fragments. The benefit of this is that we are able to assay transcription levels of long and short genes with similar accuracy. Many RNA-seq approaches display a signal bias towards the 5’ or 3’ end of transcripts and, in our data, a slight 5’ bias was detected that might also reflect binding of oligo(dT) to internal poly(A)-rich regions of transcripts (Figure 2D).

Using single-cell RNA-seq to resolve parasite populations

Having developed and assessed our protocol for sequencing single-cell transcriptomes, we next determined whether different parasite stages could be resolved among the 144 P. berghei mixed blood stage transcriptomes. Using a combination of Principal Components Analysis (PCA), k-means clustering using SC3 (Kiselev, 2016), and comparison to bulk transcriptome datasets (Otto et al., 2014; Hoo et al., 2016), we classified each cell as male, female, or asexual (Figure 3A). Classification of cells is an important step in the analysis of single-cell transcriptome data but classifying all cells in a particular dataset can be a challenge. For Plasmodium, the availability of a variety of published bulk RNA-seq and microarray datasets enabled us to determine the approximate life stage of each cell. For P. berghei, we used a microarray dataset (Hoo et al., 2016) that examined the 24 hr asexual cycle at 2-hr intervals and an RNA-seq dataset (Otto et al., 2014) that included samples at three asexual timepoints (rings, trophozoites and schizonts) as well as mixed sex gametocytes. For each cell, we compared the list of genes ranked by expression level to those of each sample from the above data sets, picking the best-correlated time point. Male and female gametocytes were differentiated by examining marker genes from cell clusters made using SC3 (Kiselev et al., 2017). We established a manually annotated consensus classification for each cell based on the above analyses. Some cells appeared to have intermediate transcriptomes between asexuals and gametocytes and these were labelled as outliers. These may result from co-infected individual red blood cells.
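
The comparison of each cell to bulk samples by ranked expression can be illustrated with a rank correlation. The sketch below uses Spearman's coefficient over the genes shared between a cell and each bulk sample and returns the best-correlated sample; the choice of Spearman's coefficient, the dictionary layout and the function names are assumptions for illustration, as the text specifies only that ranked gene lists were compared.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def best_matching_sample(cell_expr, bulk_samples):
    """Return (sample_name, rho) for the bulk sample best correlated with a cell.

    cell_expr: {gene: expression}; bulk_samples: {sample_name: {gene: expression}}.
    """
    best_name, best_rho = None, float("-inf")
    for name, bulk_expr in bulk_samples.items():
        shared = sorted(set(cell_expr) & set(bulk_expr))
        if len(shared) < 2:
            continue  # too few shared genes to correlate
        rho = spearman([cell_expr[g] for g in shared],
                       [bulk_expr[g] for g in shared])
        if rho > best_rho:
            best_name, best_rho = name, rho
    return best_name, best_rho
```

Working on ranks rather than raw values makes the comparison tolerant of the different scales of microarray, bulk RNA-seq and single-cell measurements, which is presumably why a ranked comparison was chosen.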

Figure 3 with 2 supplements

Different cell types were successfully resolved using single-cell transcriptomes.

(A) A combination of Principal Components Analysis (PCA), k-means clustering and comparison to bulk RNA-seq datasets was used to classify 144 high-quality *P. berghei* single cells, and revealed three distinct subpopulations. Outliers may represent erythrocytes infected with both sexual and asexual stages or early stages in gametocyte development. (B) Three well-established markers of the male, female and asexual lineages (Mair et al., 2006; Liu et al., 2008; Moss et al., 2012) are concordant with our classification.

https://doi.org/10.7554/eLife.33105.008

The accuracy of our classification was strongly supported by established stage-specific markers (Figure 3B; Figure 3—figure supplement 1). Moreover, the confirmed absence of contaminating parasites of other life-cycle stages enabled us to determine a new, longer list of stage-specific markers (Supplementary file 1). We conducted similar analyses for two P. falciparum samples composed of asexual and sexual stages. Because they originated from two distinct pure samples, their classification was more straightforward and both sets of cells (asexual and sexual) correlated as expected with previously published bulk datasets (Otto et al., 2010; López-Barragán et al., 2011; Lasonder et al., 2016) (Figure 3—figure supplement 2).

Pseudotime analysis reveals abrupt transcriptional dynamics across the asexual stages

Plasmodium asexual development is replicative, yet it does not follow canonical eukaryotic cell cycle progression and although checkpoints are believed to exist, they have not been characterized (Gerald et al., 2011). Bulk RNA-seq studies monitoring transcriptional patterns along the complete asexual cycle of both human and rodent malaria parasite species have consistently revealed a continuous cascade of transcription initiation (Hoo et al., 2016; Bozdech et al., 2003b) similar to that seen in other eukaryotes (Spellman et al., 1998). Although these analyses have used synchronised parasite populations that allow reasonably tight windows of expression to be assayed, their resolution has been limited by surveying pools of cells within each expression window that can differ in developmental progression by several hours. Single-cell RNA-seq allows unsynchronised populations to be sampled, from across large parts of the cycle, and the order of cells in the cycle to be identified using pseudotime analysis (Trapnell et al., 2014). Pseudotime analysis orders cells into developmental trajectories by identifying cells with transcriptomes that are most similar to each other and placing those closest to each other in order. To reconstruct the latter part of the asexual development cycle, we first used M3Drop (Andrews and Hemberg, 2016) to identify genes that varied between the asexual cells. This tool takes account of the large number of zero values (drop outs) in the data that are due to the low capture rate inherent in single-cell approaches. We then used these genes to compare each transcriptome and carry out a pseudotime analysis with Monocle 2 (Trapnell et al., 2014). This enabled us to place each P. berghei and P. falciparum asexual cell along a developmental trajectory. 
The cell orderings determined by pseudotime analysis were highly concordant with previously published transcriptional studies of the developmental time course (Otto et al., 2010; López-Barragán et al., 2011; Otto et al., 2014; Hoo et al., 2016)(Figure 4A,B, Figure 4—figure supplement 1A,B). This demonstrates that single Plasmodium cells from an unsynchronised pool can be ordered by their transcriptional signatures to accurately derive a transcriptional map of development in the late asexual cycle (Figure 4C, Figure 4—figure supplement 1C).
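
The intuition behind pseudotime ordering, placing the most similar transcriptomes next to each other, can be shown with a deliberately simple stand-in. The sketch below is not Monocle 2, which uses a more sophisticated graph-based embedding; it is a greedy nearest-neighbour chain over Euclidean distances, and the function names, the choice of distance and the need for a known starting cell are all assumptions made for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_order(profiles, start):
    """Order cells by repeatedly stepping to the nearest unvisited cell.

    profiles: {cell_id: [expression values over the same genes]}.
    start: cell_id assumed to lie at the beginning of the trajectory.
    """
    order = [start]
    remaining = set(profiles) - {start}
    while remaining:
        current = profiles[order[-1]]
        # sorted() makes tie-breaking deterministic.
        nearest = min(sorted(remaining),
                      key=lambda cell: distance(current, profiles[cell]))
        order.append(nearest)
        remaining.remove(nearest)
    return order
```

A greedy chain like this is sensitive to noise and cannot represent branching, which is part of why dedicated tools are used in practice; it serves here only to make the ordering principle concrete.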

Figure 4 with 4 supplements

Single-cell RNA-seq reveals hidden transcriptional variation in the asexual cell cycle.

(A) Pseudotime ordering (using [Trapnell et al., 2014]) of the *P. berghei* asexual cells was in close agreement with bulk RNA-seq datasets (predicted stage = consensus; see Materials and methods). (B) Pseudotime ordering (using [Trapnell et al., 2014]) of the 125 *P. falciparum* late asexual cells was in close agreement with bulk RNA-seq datasets (predicted timepoint from [Otto et al., 2010], predicted stage = consensus; see Materials and methods). (C) Differentially expressed genes (identified using M3Drop [Andrews and Hemberg, 2016]) were clustered along pseudotime, revealing groups of genes with abrupt expression profile changes during the late asexual cycle. Functional enrichment in the clusters was in agreement with the expected shift from the growing trophozoite to the budding schizont (IMC = Inner Membrane Complex; micronemes and rhoptries are secretory organelles). ‘Hoo’ is the most similar timepoint in development in the Hoo et al. (2016) dataset.

https://doi.org/10.7554/eLife.33105.011

In stark contrast to the smooth transitions observed previously in bulk time course experiments (Bozdech et al., 2003a; Hoo et al., 2016), we observed abrupt changes in gene expression during the cell cycle of both P. berghei and P. falciparum (Figure 4C, Figure 4—figure supplement 1). Whereas a continuous cascade of transcription initiation along the asexual cycle can be seen in bulk RNA-seq data, single-cell data clearly revealed an abrupt transition in expression for the same genes (Figure 4—figure supplement 2). We also analysed recently published P. falciparum Drop-seq data (Poran et al., 2017) and observed a similar pattern (Figure 4—figure supplement 3). Step-wise progression in the cycle represents a departure from the common view and suggests a previously hidden transcriptional pattern, conserved across Plasmodium parasites. Nascent strand bulk RNA-seq had already called into question the cascading nature of transcription initiation in the asexual cycle (Lu et al., 2017).

We suspect that averaging across slightly asynchronous life cycle stages in bulk RNA-seq studies has previously masked the true nature of transitions along the asexual cell cycle. Individual parasites do not proceed along an incremental path of transcriptional change, but instead generally appear to undergo transcriptional shifts, turning on or shutting down expression of a whole repertoire of genes simultaneously. While these transcriptional modules appear to be rapidly turned on and off during development, they can overlap and cells may express two modules at once. A k-means analysis in pseudotime identified three clusters of genes (Trapnell et al., 2014) for each species (Figure 4C, Figure 4—figure supplement 1, Supplementary file 2). Cluster 1 in P. berghei (equivalent to cluster 2 in P. falciparum; Figure 4—figure supplement 1) was enriched for protein dynamics and energy metabolism including many ribosomal subunits, proteasome subunits and ATPases (Figure 4C). Cluster 2 in P. berghei (equivalent to cluster 3 in P. falciparum) was associated with the rhoptry secretory organelle, including ron2, ron4, ron5, ron12, rop14, rap1 and rap2/3. Cluster 3 in P. berghei was enriched for the microneme secretory organelle and the inner membrane complex, including sub2, ama1, ripr, imc1c, imc1e, imc1f, imc1g, imc1m and isp3. This latter cluster was not captured in P. falciparum. These clusters may represent discrete transcriptional modules that underlie parasitic cell cycle checkpoints during the transition from a metabolically active, fast growing trophozoite to a budding multinucleated schizont. We note that two essential ApiAP2 transcription factors (Figure 4—figure supplement 4) were associated with equivalent gene expression clusters in both species: PBANKA_1453700 (PF3D7_1239200) with the early cluster (1) and PBANKA_0939100 (PF3D7_1107800) with the late cluster (2), implicating them as potential regulators of these modules.

Some types of transcripts vary independently of the asexual cell cycle and these are conserved between stages and between species

Like many other cell types (Spellman et al., 1998; Kowalczyk et al., 2015), the point at which Plasmodium parasites are within their cell cycle dominates the transcriptional variation observed within a genetically clonal population. However, there are also genes that vary independently of the cell cycle including clonally variant gene families, which are found largely in the subtelomeric regions of the genome (Rovira-Graells et al., 2012). A unique chromatin environment is thought to allow switching between expression of different members of gene families and this mechanism allows parasite populations to adapt to the host immune system (var genes) (Scherf et al., 2008), establish chronic infection (pir genes) (Scherf et al., 2008) and vary red blood cell invasion pathways (p235) (Preiser et al., 1999). Because they enable the parasite to adapt to unexpected environments, members of these multigene families have been termed contingency genes (Reid, 2015). There is also evidence for variation in expression in response to nutrient sensing (Mancio-Silva et al., 2017) and to a variety of chemical interventions (Hu et al., 2010). We used a regression approach to identify genes that vary independently of the cell cycle (scLVM) (Buettner et al., 2015) by removing cell-cycle-dependent variation from P. falciparum asexual cells. To train this method, we used genes that varied in pseudotime (i.e. the cell cycle). We found that the first two latent factors of the expression data were driven by the cell cycle, each explaining at least 5% of variation in cell cycle genes (Figure 5—figure supplement 1). After adjusting for these, we identified 56 genes in P. falciparum asexual cells that showed residual variation (Figure 5A; Supplementary file 2). Unlike clonally variant genes identified in previous work (Rovira-Graells et al., 2012), these 56 genes were not located in subtelomeric regions. 
The products of these genes were involved in nucleosome assembly, the proteasome and vacuolar acidification, suggesting a role in controlling gene expression through transcription initiation, protein stability and protein localisation. The expression patterns of the 56 genes were not correlated, as might have been expected if they were part of a coordinated transcriptional response, such as a stress response. We therefore investigated whether the observed expression pattern resulted from variations in steady-state mRNA levels due to intermittent expression of these genes, followed by rapid mRNA decay. From a published dataset of mRNA half-lives in the asexual cycle, we found that these genes actually have moderately longer than average half-lives (Figure 5B). This suggests that the variability of these genes was more likely to be driven by variable transcription initiation than by rapid decay. We found that these genes are more conserved in evolution than expected by chance (p=2.2e-16), and that this is not simply because they tend to be highly expressed (Figure 5C). Intriguingly, 22 of these 56 genes are also variably expressed in the sexual stages, suggesting an intrinsic variability across the life cycle (Supplementary file 3E). Furthermore, similar types of genes were variable in P. berghei sexual stages (Supplementary file 1A,C), but we were unable to identify many cell-cycle-independent variable genes in P. berghei asexual cells, perhaps because too few cells were examined. It remains to be seen whether the volatile expression of these genes is also reflected in protein abundance.
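
The idea of removing cell-cycle-dependent variation and asking what remains can be made concrete with a much simpler model than scLVM. The sketch below fits an ordinary least-squares line of a gene's expression on a single cell-cycle covariate (for instance pseudotime), reports the fraction of variance explained, and returns the residuals. It is a linear stand-in under stated assumptions, not the latent-variable model used in the study, and the function names are hypothetical.

```python
def variance_explained(expr, covariate):
    """R^2 of a straight-line fit of `expr` on `covariate` across cells."""
    n = len(expr)
    mean_x = sum(covariate) / n
    mean_y = sum(expr) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    syy = sum((y - mean_y) ** 2 for y in expr)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation to explain
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, expr))
    return (sxy * sxy) / (sxx * syy)


def residuals(expr, covariate):
    """Expression left over after subtracting the fitted linear trend."""
    n = len(expr)
    mean_x = sum(covariate) / n
    mean_y = sum(expr) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        return [y - mean_y for y in expr]
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, expr))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, expr)]
```

Under this simplification, a gene whose `variance_explained` is at least 0.5 would fall in the cell-cycle-associated set, while a gene that remains highly variable in its residuals would be a candidate cell-cycle-independent variable gene.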

Figure 5 with 1 supplement

After removing the signal of cell cycle progression, we identify a new class of cell-cycle independent variable genes.

(A) *P. falciparum* genes with ≥50% of their variance attributed to cell-cycle-associated latent variable one vary in pseudotime. After removing variation associated with the cell cycle, 56 genes with ≥50% of their variance remained. Highly enriched functional terms associated with the two sets of genes are shown. (B) Cell-cycle-independent variable transcripts have similar half-lives to genes in general during the ring and trophozoite stages. However, during the schizont stage and later, their half-lives are significantly longer. The data were derived from Shock et al. (2007). (C) A conservation score, calculated based on mean amino acid substitution between *P. berghei* and *P. falciparum* proteins, was plotted against expression level (scran-l) for each cell-cycle-dependent and each cell-cycle-independent gene in *P. falciparum*. Density plots show the distributions of each of these parameters, highlighting that cell-cycle-independent genes tend to have higher conservation scores, but similar expression levels.

https://doi.org/10.7554/eLife.33105.016
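The ≥50% variance criterion in the Figure 5A legend can be made concrete with a toy calculation. The sketch below is not the latent variable model used in the study: as a simplified stand-in, it takes the R² of a simple linear regression of each gene's expression on a single latent variable as the fraction of variance attributed to that variable. The gene names, expression values and function names are hypothetical.

```python
from statistics import mean


def variance_fraction_explained(expression, latent):
    """Fraction of a gene's variance explained by one latent variable.

    Computed as R^2 of a simple linear regression (squared Pearson correlation).
    """
    mx, my = mean(latent), mean(expression)
    sxx = sum((x - mx) ** 2 for x in latent)
    syy = sum((y - my) ** 2 for y in expression)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector has no variance to explain
    sxy = sum((x - mx) * (y - my) for x, y in zip(latent, expression))
    return (sxy ** 2) / (sxx * syy)


def split_by_latent(expr_by_gene, latent, threshold=0.5):
    """Partition genes by whether >= threshold of their variance is explained."""
    dependent, remaining = [], []
    for gene, values in expr_by_gene.items():
        fraction = variance_fraction_explained(values, latent)
        (dependent if fraction >= threshold else remaining).append(gene)
    return dependent, remaining


# Toy data: six cells ordered along a hypothetical cell-cycle latent variable.
latent_one = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
expr_by_gene = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],  # tracks the latent variable
    "gene_b": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],  # varies independently of it
}

cycle_dependent, cycle_independent = split_by_latent(expr_by_gene, latent_one)
print(cycle_dependent, cycle_independent)
```

In this toy case `gene_a` is assigned to the cell-cycle-dependent set and `gene_b` is left as a candidate for cell-cycle-independent variation; a real analysis would additionally need to establish that the remaining variation exceeds technical noise.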

Gametocytes exhibit sex-specific variable expression of contingency genes

Surprisingly, the most variably expressed genes in sexual stages were those from contingency gene families: var in P. falciparum and pir in P. berghei (Figure 6; Supplementary file 4). Contingency gene families are extremely evolutionarily labile and different species have different repertoires (Reid, 2015). Between P. falciparum and P. berghei, there is no evidence of homology between these families and, while many are known or assumed to play a role in host–parasite interactions, the extent to which they might perform overlapping functions in the two species is unclear. Little is known about the role of these families in sexual stages; although transcriptional variation has not previously been observed, expression has (Florens et al., 2002), suggesting a role for contingency genes in transmission. Several important parts of transmission might require contingency genes encoding cell surface proteins. First, mature gametocytes are found in the blood and are thus susceptible to attack by the host adaptive immune system in much the same way as P. falciparum rings or P. berghei rings and trophozoites. Second, it has been suggested that gametocytes may cluster in order to make transmission more reliable and this might require antigenically variable cell surface proteins (Pichon et al., 2000). Finally, after transmission, gametes face a complex and hostile environment in the mosquito midgut where male gametes must rapidly find females, which they do at rates that are difficult to explain without invoking non-random movement such as chemotaxis (Lawniczak and Eckhoff, 2016). Our data revealed that males and females are very different in their expression of contingency gene families. In P. berghei male gametocytes, we observed significant variability of a set of pir genes (Otto et al., 2014) (p=0.014; Figure 6—figure supplement 1A; Supplementary file 4), whose protein products have previously been identified in male gametes (Talman et al., 2014), indicating a potential role in fertilization. This raises the intriguing possibility that variation in expression of these genes could impact male/female interactions during fertilization. We found no female-specific pir genes; instead, females showed transcriptional variation in members of the subtelomeric multigene families fam-a and fam-b (Figure 6A; Supplementary file 4).

Figure 6 (with 2 supplements)

Multigene families show variable expression within and between sexual stages of both *P. berghei* and *P. falciparum*.

(A) The heatmap shows gene expression levels for multigene family members differentially expressed between male and female *P. berghei* gametocytes. * = gene variably expressed within male (orange) or female (green); *Lpl* = lysophospholipase; *ema1* = erythrocyte membrane antigen 1. (B) Read counts for *var* mRNAs in *P. falciparum* female gametocyte single cells and female and male gametocyte populations from bulk RNA-seq data. Only reads that spanned the *var* introns and only genes with at least two such reads were included. There were insufficient male single cells for analysis.

https://doi.org/10.7554/eLife.33105.018

In P. falciparum, the var genes are critical for establishing chronic infections through cytoadherence and antigenic variation (Scherf et al., 2008). Rather than finding significant variation in males, as expected from our findings in P. berghei, it was females that showed transcriptional variation within the var genes (p=0.0006; Figure 6B). In asexual parasites, expression of two different non-coding var transcripts is common and is involved in maintaining the mutually exclusive var gene expression that is essential for their immune evasion role (Amit-Avraham et al., 2015; Guizetti and Scherf, 2013). They are both transcribed from a bidirectional promoter within the single var intron. This means that the presence of coding var transcripts in gametocyte transcriptomes can be assessed by identifying intron-spanning reads. We found that within any single female cell, only a single var gene had reads supporting correct splicing, suggesting that mutually exclusive expression of var genes occurs in sexual stages, as it does in asexual parasites (Guizetti and Scherf, 2013). The coding var transcripts were always from internal var gene clusters, often with the upsC class of promoters, distinct from the subtelomeric var genes seen in asexual stages, with upsB and upsA promoters (Figure 6B; Figure 6—figure supplement 1B). Single male gametocytes were not represented well in this study, so instead we examined previously published bulk male and female gametocyte RNAseq data (Lasonder et al., 2016) for male var gene expression. Male gametocytes only ever showed mRNA from a single var gene, var2csa, known for its importance in pregnancy related malaria (Figure 6B; Supplementary file 4). This gene has also been proposed as an important regulator of var gene expression switching (Mok et al., 2008). 
Our novel observation that gametocytes show significant sex-specific variation in expression of large multigene families, hitherto known for their importance in asexual stages, suggests that their evolution and function may also be driven by sexual stage biology.
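The intron-spanning read criterion used for the var analysis can be sketched as follows. This is an illustrative outline rather than the pipeline used in the study: each read is assumed to be already aligned and reduced to a list of half-open genomic blocks, the gene names and coordinates are hypothetical, and the threshold of two reads is taken from the Figure 6 legend.

```python
from collections import defaultdict

MIN_SPLICED_READS = 2  # "at least two such reads", as stated in the Figure 6 legend


def is_spliced_across(blocks, intron):
    """True if two consecutive aligned blocks flank the intron exactly."""
    intron_start, intron_end = intron
    return any(
        left_end == intron_start and right_start == intron_end
        for (_, left_end), (right_start, _) in zip(blocks, blocks[1:])
    )


def coding_var_genes_per_cell(reads, introns, min_reads=MIN_SPLICED_READS):
    """Map each cell to the var genes supported by enough intron-spanning reads."""
    counts = defaultdict(lambda: defaultdict(int))
    for cell, gene, blocks in reads:
        if gene in introns and is_spliced_across(blocks, introns[gene]):
            counts[cell][gene] += 1
    return {
        cell: sorted(g for g, n in per_gene.items() if n >= min_reads)
        for cell, per_gene in counts.items()
    }


# Hypothetical intron coordinates (half-open) for two made-up var genes.
introns = {"varA": (100, 200), "varB": (500, 650)}

# Each read: (cell, gene, aligned blocks in genomic order).
reads = [
    ("cell1", "varA", [(50, 100), (200, 250)]),   # spliced across the intron
    ("cell1", "varA", [(80, 100), (200, 230)]),   # spliced across the intron
    ("cell1", "varA", [(120, 180)]),              # lies within the intron, unspliced
    ("cell1", "varB", [(450, 500), (650, 700)]),  # spliced, but only one read
    ("cell2", "varB", [(460, 500), (650, 690)]),
    ("cell2", "varB", [(470, 500), (650, 720)]),
]

expressed = coding_var_genes_per_cell(reads, introns)
mutually_exclusive = all(len(genes) <= 1 for genes in expressed.values())
print(expressed, mutually_exclusive)
```

Reads that fall entirely within the intron, such as the non-coding transcripts driven by the intronic promoter, do not count towards coding expression, which is why the splice-junction check is applied before thresholding.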

Plasmodium does not have sex chromosomes and the genetic underpinning of sexual dimorphism is very poorly understood. To explore the regulation of sexual dimorphism, we examined sex-specific expression of transcription factors in both species and conducted a co-expression analysis in males and females. We observed a marked, conserved sex-specific pattern of TF expression (Figure 6B, Figure 6—figure supplement 2). Interestingly, one female-specific TF in particular (ap2-o) has been previously shown to have a female function and is likely to have a role in differentiating male and female forms (Modrzynska et al., 2017).

Discussion

We have established an optimised protocol for generating single-cell transcriptome sequences of Plasmodium parasites with power to identify not only different cell types but also to explore potential functional variation from one cell to another. This protocol enables evaluation of full length transcripts, something required for evaluating the complex transcriptional patterns we observed for var genes but which is not currently possible with 3’ tag-based approaches (Poran et al., 2017). Furthermore, this method also has the advantage of providing information on nearly three times as many genes per cell compared to Drop-seq evaluations of the same species (~1900 on average here vs ~650 on average for Drop-seq) (Poran et al., 2017).

Future malaria studies will greatly benefit from the availability of both (i) low-coverage droplet-based methods allowing for a large number of cells to be analysed and (ii) high-coverage full-length transcript methods, allowing high-definition, focused analysis of flow cytometry sorted cells. During the optimisation of our protocol for Plasmodium parasites, we identified several decisive steps and permutable reagents that when modified were key determinants of transcriptome quality. We hope that this optimisation framework may assist in extending full-length transcript scRNA-seq to a much wider range of diverse eukaryotic cell types.

As well as establishing a new tool, our study has made several new observations about Plasmodium biology. First, we used single-cell data to produce high-resolution surveys of schizogony and observed sharp transcriptional transitions over the asexual life cycle, which was previously thought to be a continuous process. The intracellular cycle of Plasmodium is complex, consisting of several rounds of endomitotic DNA replication followed by a final synchronised cytokinesis. Although checkpoints are most likely required to ensure timeliness of complex cellular events, such as assembly of the red cell invasion machinery, they have not yet been identified (Gerald et al., 2011). We speculate that the sharp transitions we have observed correspond to such checkpoints. Although we found clues as to the possible underlying regulatory architecture, the true regulators remain to be confirmed.

A second major finding of our study was unexpected cell-to-cell variation in gene expression. Most genes are known to vary during the asexual, blood stage cell cycle with a single peak of expression (Bozdech et al., 2003a). Some genes in subtelomeric regions are known to vary independently of the cell cycle, by switching on and off in individual parasites. These include the multigene families of contingency genes known to be involved in sequestration and chronic infection (var and pir). But unexpectedly we found another class of genes varying independently of the cell cycle both in cycling and arrested cells. We found that, unlike contingency genes, they were highly conserved between species and the same types of genes were variable in parasite species infecting both humans and rodents. One could speculate that this is due to noisier signals associated specifically with some cellular function for which it is beneficial to relax transcriptional control. Generating variation in a population of many millions of closely related parasites occupying an ever varying host environment may be a bet-hedging strategy favouring success of at least some of the members of this population.

Finally, because our approach was able to dissect both male and female gametocyte transcriptomes, and assess expression of multigene families, we were able to discover an unexpected sex-specificity in expression of several multigene families. Especially intriguing is that these families are known to encode extracellular proteins involved in host-parasite interactions in asexual blood stages. They could have similar host interactive functions not yet described for sexual stages or have uncharted roles in sexual behaviour of the parasite. In the mammalian host, they might be involved in sequestration of mature gametocytes in the peripheral vasculature, as an immune evasion strategy or to aid in transmission through a mosquito bite. The sex-specific nature of the expression of var and pir genes could also indicate a possible role in fertilisation in the mosquito midgut.

Single-cell RNA-seq will have many applications for malaria parasites. Surveying parasites directly from patient samples in natural infections will undoubtedly lead to new understandings of the genes underlying important phenotypes. In addition to the aspects addressed here, it may be particularly powerful for addressing the following problems: (i) analysis of small samples from nonculturable life-cycle stages or Plasmodium species that cannot yet be cultured, such as the prevalent human parasite P. vivax, (ii) discovery of rare or undescribed cell states, (iii) characterisation of the effect of genetic alterations to generate high-dimensional phenotypes for many mutants in parallel (Bushell et al., 2017), and (iv) examination of cell-to-cell variability in the face of drugs and vaccines.

Author details

Adam J Reid (contributed equally; for correspondence)

Arthur M Talman (contributed equally; for correspondence)

Hayley M Bennett (contributed equally)

Ana R Gomes

Mandy J Sanders

Christopher J R Illingworth

Oliver Billker

Matthew Berriman

Mara KN Lawniczak (for correspondence)
