Abstract
Most malaria rapid diagnostic tests (RDTs) detect Plasmodium falciparum histidine-rich protein 2 (PfHRP2) and PfHRP3, but deletions of pfhrp2 and pfhrp3 genes make parasites undetectable by RDTs. To better understand these deletions, we analyzed 19,289 public whole-genome-sequenced P. falciparum field samples. Pfhrp2 deletion only occurred by chromosomal breakage with subsequent telomere healing. Pfhrp3 deletions involved loss from pfhrp3 to the telomere and showed 3 patterns: no other associated rearrangement with evidence of telomere healing at breakpoint (Asia; Pattern 13-); associated with duplication of a chromosome 5 segment containing multidrug-resistant-1 gene (Asia; Pattern 13-5++); and most commonly, associated with duplication of a chromosome 11 segment (Americas/Africa; Pattern 13-11++). We confirmed a 13-11 hybrid chromosome with long-read sequencing, consistent with a translocation product arising from recombination between large interchromosomal ribosome-containing segmental duplications. Within most 13-11++ parasites, the duplicated chromosome 11 segments were identical to each other. Across parasites, multiple distinct haplotype groupings were consistent with emergence due to clonal expansion of progeny from intrastrain meiotic recombination. Together, these observations suggest negative selection normally removes 13-11++ pfhrp3 deletions, and specific conditions are needed for their emergence and spread including low transmission, findings that can help refine surveillance strategies.
Introduction
P. falciparum malaria remains a leading cause of childhood mortality in Africa
Plasmodium falciparum remains one of the most common causes of malaria and childhood mortality in Africa despite significant efforts to eradicate the disease1. The latest report by the World Health Organization estimated 247 million cases of malaria and 619,000 fatalities in 2021 alone with the vast majority of deaths occurring in Africa1.
The mainstay of malaria diagnosis across Africa is no longer microscopy but rapid diagnostic tests (RDTs) due to their simplicity and speed. Their swift adoption, now totaling hundreds of millions a year, coupled with effective artemisinin-based combination therapies (ACTs) has led to significant progress in malaria control2,3. The predominant and most sensitive falciparum malaria RDTs detect P. falciparum histidine-rich protein 2 (PfHRP2) and, to a lesser extent, its paralog PfHRP3 due to cross-reactivity.
Increasing numbers of pfhrp2 and pfhrp3-deleted parasites escaping diagnosis by RDTs
Unfortunately, a growing number of studies have reported laboratory and field isolates with deletions of pfhrp2 (PF3D7_0831800) and pfhrp3 (PF3D7_1372200) in the subtelomeric regions of chromosomes 8 and 13, respectively. The resulting lack of these proteins allows the parasite to fully escape diagnosis by PfHRP2-based RDTs3–7. Deleted parasites appear to be spreading rapidly in some regions and have compromised existing test-and-treat programs, especially in the Horn of Africa8–11. The prevalence of parasites with pfhrp2 and pfhrp3 deletion varies markedly across continents and regions in a manner not explained by RDT use alone. Parasites with these deletions are well-established in areas where PfHRP2-based RDTs have never been used routinely such as parts of South America7. Studies in Ethiopia, where false-negative RDTs owing to pfhrp2 and pfhrp3 deletions are common, suggest that the pfhrp3 deletion arose first given it is more prevalent and shows a shorter shared haplotype9. The reason why pfhrp3 deletion occurred prior to pfhrp2 remains unclear. A 1994 study of the HB3 laboratory strain reported frequent meiotic translocation of a pfhrp3 deletion from chromosomes 13 to 1112. Explanation of this mechanism, whether it might occur in natural populations, and how it relates to initial loss of pfhrp3 has not been fully explored.
Precise pfhrp2 and pfhrp3 deletion mechanisms remain unknown
Studies of P. falciparum structural rearrangements are challenging and pfhrp2 and pfhrp3 deletions particularly so due to their position in complex subtelomeric regions. Subtelomeric regions represent roughly 5% of the genome, are unstable, and contain rapidly diversifying gene families (e.g., var, rifin, stevor) that undergo frequent conversion between chromosomes mediated by non-allelic homologous recombination (NAHR) and double-stranded breakage (DSB) and telomere healing13–19. Subtelomeric exchange importantly allows for unbalanced progeny without the usual deleterious ramifications of altering a larger proportion of a chromosome. Of note, newly formed duplications predispose to further duplications or other rearrangements through NAHR between highly identical paralogous regions. Together, this potentiates the rapid expansion of gene families and their spread across subtelomeric regions20–25. The duplicative transposition of a subtelomeric region of one chromosome onto another chromosome frequently occurs in P. falciparum. Specifically, prior studies have found duplicative transposition events involving several genes including var2csa and cytochrome b17,26–29. Notably, pfhrp2 and pfhrp3 are adjacent to but not considered part of the subtelomeric regions, and recombination of var genes does not result in the deletion of pfhrp2 and pfhrp39,30.
Telomere healing, de novo telomere addition via telomerase activity, is associated with subtelomeric deletion events in P. falciparum that involve chromosomal breakage and loss of all downstream genes. Healing serves to stabilize the end of the chromosome. Deletion of the P. falciparum knob-associated histidine-rich protein (KAHRP or pfhrp1) and pfhrp2 genes via this mechanism was first reported to occur in laboratory isolates31. Since then, studies have defined the critical role of telomerase in P. falciparum and additional occurrences affecting a number of genes including pfhrp1, Pf332, and Pf87 in laboratory isolates15,32,33. For pfhrp1 and pfhrp2, this mechanism of deletion only occurred in laboratory isolates but not in clinical samples, suggesting the genes have important functions in normal infections and their loss is selected against32.
An improved understanding of the patterns and mechanisms of pfhrp2 and pfhrp3 deletions can provide important insights into how frequently they occur and the evolutionary pressures driving their emergence and help inform control strategies. Here, using available whole-genome sequences and additional long-read sequencing, we examined the pattern and nature of pfhrp2 and pfhrp3 deletions. Our findings shed light on geographical differences in pfhrp3 deletion patterns, their mechanisms, and how they likely emerged, providing key information for improved surveillance.
Results
Pfhrp2 and pfhrp3 deletions in the global P. falciparum genomic dataset
We examined all publicly available Illumina whole-genome sequencing (WGS) data from global P. falciparum isolates as of January 2023, comprising 19,289 field samples and lab isolates (Table S1). We analyzed the genomic regions containing pfhrp2 on chromosome 8 and pfhrp3 on chromosome 13 to detect nucleotide and copy number variation (e.g., deletions and duplications) using local haplotype assembly and sequencing depth. Regions on chromosomes 5 and 11 associated with these duplicates were also analyzed (Table S2, Table S4, Methods). We identified 27 parasites with pfhrp2 deletion, 172 with pfhrp3 deletion, and 21 with both pfhrp2 and pfhrp3 deletions. Across all regions, pfhrp3 deletions were more common than pfhrp2 deletions; specifically, pfhrp3 deletions and pfhrp2 deletions were present in Africa in 43 and 12, Asia in 53 and 4, and South America in 76 and 11 parasites. It should be noted that these numbers are not accurate measures of prevalence given that most WGS specimens have been collected based on RDT positivity.
Pfhrp2 deletion associated with variable breakpoints and telomeric healing
We further examined the breakpoints of the 27 pfhrp2-deleted parasites (25 patient parasites and 2 lab isolates) (Supplemental Figure 1). Twelve parasites showed evidence of breakage and telomeric healing, as suggested by telomere-associated tandem repeat 1 (TARE-1)14 sequence contiguous with the genomic sequence at locations where coverage drops to zero (Supplemental Figure 2). The majority of breakpoints occurred within pfhrp2, found in 9 South American parasites and lab isolate D10 (Supplemental Figure 2). The other pfhrp2-deleted parasites did not have detectable TARE1 or evidence of genomic rearrangement but had been amplified with selective whole-genome amplification (sWGA), which limits the ability to detect the TARE1 sequence. Thus, pfhrp2 deletion likely occurs solely through breakage events with subsequent telomeric healing.
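The breakpoint evidence described above (coverage dropping to zero next to telomere repeat sequence) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, the repeat pattern GGGTT(T/C)A is the commonly cited P. falciparum telomere repeat and is assumed here, and the requirement of three consecutive copies is an arbitrary illustrative threshold.

```python
import re

# Assumption: telomere repeat written as GGGTT(T/C)A; at least three
# consecutive copies are required to call a read telomeric (arbitrary choice).
TELOMERE_REPEAT = re.compile(r"(?:GGGTT[TC]A){3,}")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_breakpoint(depth):
    """Index of the first position after which depth stays at zero.

    Returns None when the last position is still covered, i.e. there is
    no terminal deletion in the supplied per-base depth vector.
    """
    last_covered = -1
    for i, d in enumerate(depth):
        if d > 0:
            last_covered = i
    if last_covered == len(depth) - 1:
        return None
    return last_covered + 1


def has_telomere_repeat(seq):
    """True if the read carries the repeat on either strand."""
    return (TELOMERE_REPEAT.search(seq) is not None
            or TELOMERE_REPEAT.search(reverse_complement(seq)) is not None)
```

A read that overlaps the position returned by `find_breakpoint` and also passes `has_telomere_repeat` would be the kind of evidence the text describes as telomere sequence contiguous with genomic sequence.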
Three distinct pfhrp3 deletion patterns with geographical associations
Exploration of read depth revealed three distinct deletion copy number patterns associated with pfhrp3 deletion (chromosome 13: 2,840,236-2,842,840): first, sole deletion of chromosome 13 starting at various locations centromeric to pfhrp3 to the end of the chromosome with detectable TARE1 telomere healing and unassociated with other rearrangements (pattern 13-); second, deletion of chromosome 13 from position 2,835,587 to the end of the chromosome and associated with duplication of a chromosome 5 segment from position 952,668 to 979,203, which includes pfmdr1 (pattern 13-5++); and third, deletion of chromosome 13 commencing just centromeric to pfhrp3 and extending to the end of the chromosome and associated with duplication of the chromosome 11 subtelomeric region (pattern 13-11++) (Figure 1). Among the 172 parasites with pfhrp3 deletion, 21 (12.2%) were pattern 13-, 29 (16.9%) were pattern 13-5++, and the majority with 122 (70.9%) demonstrated pattern 13-11++. Pattern 13-11++ was almost exclusively found in parasites from Africa and the Americas, while 13-5++ was only observed in Asia (Figure 1).
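As a rough illustration of how the three copy-number patterns could be told apart from normalized coverage, the Python sketch below classifies a sample and recomputes the pattern shares from the counts reported above. The function name and both thresholds are hypothetical and chosen only for illustration; the actual calls in this study also relied on local assembly and TARE1 detection.

```python
def classify_pfhrp3_pattern(chr13_cov, chr5_cov, chr11_cov,
                            deleted_max=0.1, duplicated_min=1.5):
    """Classify one sample from normalized coverage (1.0 = single copy).

    chr13_cov: coverage over the pfhrp3 region on chromosome 13
    chr5_cov:  coverage over the pfmdr1-containing chromosome 5 segment
    chr11_cov: coverage over the chromosome 11 segment
    Thresholds are illustrative assumptions, not values from the study.
    """
    if chr13_cov > deleted_max:
        return "pfhrp3 present"
    if chr11_cov >= duplicated_min:
        return "13-11++"
    if chr5_cov >= duplicated_min:
        return "13-5++"
    return "13-"


# Counts reported in the text for the 172 pfhrp3-deleted parasites.
counts = {"13-": 21, "13-5++": 29, "13-11++": 122}
total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
```

With these counts, `shares` reproduces the percentages quoted in the text (12.2%, 16.9%, and 70.9%).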
Pattern 13- associated with telomere healing
The 21 parasites with pattern 13- had deletions of the core genome averaging 19kb (range: 11-31kb). Of these 13- deletions, 20 out of 21 had detectable TARE1 adjacent to the breakpoint, consistent with telomere healing (Supplemental Figure 3).
Pattern 13-5++ associated with NAHR-mediated pfmdr1 duplication and subsequent telomere healing
The 29 parasites with deletion pattern 13-5++ had a consistent loss of 17.9kb of chromosome 13 and a gain of 25kb from chromosome 5. These isolates have evidence of a genomic rearrangement that involves a 26bp AT di-nucleotide repeat at 2,835,587 on chromosome 13 and a 20bp AT di-nucleotide repeat at 979,203 on chromosome 5. Analysis revealed paired-end reads with discordant mapping with one read mapping to chromosome 13 and the other mapping to chromosome 5. Reads assembled from these regions form a contig of normally unique sequence that connects chromosome 13 (position 2,835,587) to chromosome 5 (position 979,203) in reverse orientation. Read depth coverage analysis revealed more than a two-fold increase on chromosome 5 from 979,203 to 952,668 with TARE1 sequence contiguously extending from 952,668, consistent with telomere healing. This 25kb duplication contained several genes including intact PF3D7_0523000 (pfmdr1) (Supplemental Figure 4 and Supplemental Figure 5), and TARE1 transition occurred within the gene PF3D7_0522900 (a zinc finger gene). Further read depth, discordant read, and assembly analysis revealed four 13-5++ parasites that, in addition to the chromosome 5 segment duplication on chromosome 13, had the described pfmdr1 tandem duplication on chromosome 5 associated with drug resistance, resulting in overall 3-fold read depth across pfmdr1 gene (Supplemental Figure 4)34.
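The discordant read-pair evidence described above can be sketched as a simple filter: keep pairs in which one mate maps near the chromosome 13 breakpoint and the other near the chromosome 5 breakpoint. This is a simplified illustration, not the method used in the study. The chromosome labels, the tuple representation of a read pair, and the 500bp window are assumptions; only the two breakpoint coordinates come from the text.

```python
# Breakpoints reported in the text (zero-based, 3D7 v3 coordinates).
# The labels "chr13" and "chr5" are placeholder chromosome names.
BREAK_13 = ("chr13", 2_835_587)
BREAK_5 = ("chr5", 979_203)


def supports_junction(pair, bp_a=BREAK_13, bp_b=BREAK_5, window=500):
    """True if one mate lies near bp_a and the other near bp_b.

    pair is ((chrom, pos), (chrom, pos)) for the two mates; the window
    size is an illustrative assumption.
    """
    first, second = pair
    for mate_a, mate_b in ((first, second), (second, first)):
        near_a = mate_a[0] == bp_a[0] and abs(mate_a[1] - bp_a[1]) <= window
        near_b = mate_b[0] == bp_b[0] and abs(mate_b[1] - bp_b[1]) <= window
        if near_a and near_b:
            return True
    return False


def count_junction_pairs(pairs, **kwargs):
    """Number of read pairs supporting the chromosome 13 to 5 junction."""
    return sum(supports_junction(p, **kwargs) for p in pairs)
```

In practice the mate positions would come from an aligner's output; here they are plain tuples so the sketch stays self-contained.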
Pattern 13-11++ predominated in the Americas and Africa
Pattern 13-11++ was observed in 74 American parasites, 39 African parasites, and 6 Asian parasites (Figure 1). Of the 122 parasites with this pfhrp3 deletion pattern, 98 parasites (73 of the 74 American, 20 of 40 African parasites, and 5 of 6 Asian) had near-identical copies of the chromosome 11 duplicated region. Near-identical copies were defined as having ≥99% identity (the same variant microhaplotype in both copies) across 382 variant microhaplotypes within the duplicated region, a level of difference far lower than the allelic differences normally seen between parasites (Table S4). These 98 parasites containing identical copies did not all share the same overall haplotypes but rather showed 11 major haplotype groups (Figure 5). The remaining 24 parasites had variation within this region; on average, 10.2% of variant sites differed between the copies (min 83.8% identity). The 11 overall haplotype groups showed geographical separation with distinct haplotypes observed in American and African strains. The overall haplotypes for the segment of chromosome 11 found within 13-11++ parasites could also be found within the parasites lacking the 13-11++ translocation (Supplemental Figure 6, Supplemental Figure 7, Supplemental Figure 8, Supplemental Figure 9, Supplemental Figure 10, and Supplemental Figure 11).
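The ≥99% identity criterion above can be expressed as a short Python sketch. It assumes each copy is represented as an ordered list of microhaplotype calls, one per variant region; that representation and the function names are illustrative assumptions, and only the 0.99 threshold and the 382 regions come from the text.

```python
def copy_identity(copy_a, copy_b):
    """Fraction of variant microhaplotype regions with the same call."""
    if len(copy_a) != len(copy_b) or not copy_a:
        raise ValueError("copies must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(copy_a, copy_b))
    return same / len(copy_a)


def is_near_identical(copy_a, copy_b, threshold=0.99):
    """Apply the >=99% identity definition used in the text."""
    return copy_identity(copy_a, copy_b) >= threshold
```

With 382 regions, this threshold tolerates at most 3 discordant regions (379/382 is about 99.2%), whereas 4 discordant regions (378/382, about 98.95%) would fall below it.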
Pattern 13-11++ breakpoint occurs in a segmental duplication of ribosomal genes on chromosomes 11 and 13
Pattern 13-11++ has a centromeric breakpoint consistently occurring within a 15kb interchromosomal segmental duplication on chromosomes 11 and 13. It was the largest duplication in the core genome based on an all-by-all unique k-mer comparison of the genome using nucmer35 (Supplemental Figure 12). The two copies on chromosome 11 and chromosome 13 in the reference genome were 99.0% identical (Figure 2) and oriented similarly. Each copy contained a centromeric 7kb region encoding 2-3 undefined protein products (98.9% identity) and a telomeric 8kb nearly identical region (99.7% identity) containing one of the two S-type13 ribosomal genes (S=sporozoite), which are primarily expressed during life cycle stages in the mosquito vector (Figure 2, Supplemental Figure 13 and Supplemental Figure 14). Pairwise alignment of the chromosome 11 and 13 paralogs showed similar levels of allelic and paralogous identity, and no consistent nucleotide differences were found between the paralogs, leading to no distinct separation between copies when clustered (Figure 2). This suggests ongoing interchromosomal exchanges or conversion events maintaining paralog homogeneity.
Ribosomal gene segmental duplication exists in closely related P. praefalciparum
To look at the conservation of the segmental duplication containing the ribosomal genes, we examined genomes of closely related Plasmodium parasites in the Laverania subgenus, which comprised P. falciparum and other Plasmodium spp found in African apes. The Plasmodium praefalciparum genome, which is P. falciparum’s closest relative having diverged about 50,000 years ago36, also contained similar S-type rRNA loci on chromosomes 11 and 13 and had a similar gene synteny to P. falciparum in these regions and the region on chromosome 8 neighboring pfhrp2 (Supplemental Figure 15, Supplemental Figure 16, and Supplemental Figure 17). P. praefalciparum contained the 15.2kb duplicated region on both chromosomes 11 and 13 and was 96.7% similar to the 3D7 duplicated region. Other Laverania genomes36 were not fully assembled within their subtelomeric regions.
Previous PacBio assemblies did not fully resolve chromosome 11 and 13 subtelomeres
Given pattern 13-11++ was suggestive of duplication-mediated recombination leading to translocation, we examined high-quality PacBio genome assemblies of SD01 from Sudan and HB3 from Honduras, both containing the pfhrp3 deletion. However, the Companion37 gene annotations of chromosome 1130 showed that these strains were not fully assembled in the relevant regions (Supplemental Figure 18 and Supplemental Figure 19).
Combined analysis of additional Nanopore and PacBio reads confirmed a segmental duplicated region of the normal chromosome 11 and hybrid chromosome 13-11
To better examine the genome structure of pattern 13-11++, we whole-genome sequenced the 13-11++ isolates HB3 and SD01 with long-read Nanopore technology. We generated 7,350 Mb and 645 Mb of data representing an average coverage of 319x and 29.3x for HB3 and SD01, respectively. We combined our Nanopore data with the publicly available PacBio sequencing data and tested for the presence of hybrid chromosomes using a two-pronged approach: 1) mapping the long reads directly to normal and hybrid chromosome 11/13 constructs and 2) optimized de-novo assembly of the higher quality Nanopore long reads.
To directly map reads, we constructed 3D7-based representations of hybrid chromosomes 13-11 and 11-13 by joining the 3D7 chromosomal sequences at breakpoints in the middle of the segmental duplication (Methods). We then aligned all PacBio and Nanopore reads for each isolate to the normal and hybrid constructs to detect reads completely spanning the duplicated region extending at least 50bp into flanking unique regions (Figure 4). HB3 had 77 spanning reads across normal chromosome 11 and 91 spanning reads across hybrid chromosome 13-11. SD01 had two chromosome 11 spanning reads and one 13-11 chromosome spanning read. SD01 had a small number of spanning reads due to lower overall Nanopore reads secondary to insufficient input sample. Further analysis on SD01 revealed 4 regions within this duplicated region that had chromosome 11 and 13-specific nucleotide variation, which was leveraged to further bridge across this region for additional confirmation given SD01’s low coverage (Supplemental Figure 20 and Supplemental Figure 21). Neither isolate had long-reads spanning normal chromosome 13 or hybrid 11-13, which represented the reciprocal translocation product (Figure 4). Importantly, the other 12 isolates with intact pfhrp3 from the PacBio dataset30 all had reads consistent with normal chromosomes -- reads spanning chromosome 11 and chromosome 13 and no reads spanning the hybrid 13-11 or 11-13 chromosomes (Figure 4). Thus, long reads for HB3 and SD01 confirmed the presence of a hybrid 13-11 chromosome.
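The spanning-read criterion above (a read must cover the whole duplicated region and extend at least 50bp into unique flanking sequence on both sides) can be sketched in Python as follows. The `Alignment` type, the construct names, and the coordinates in the usage note are hypothetical; only the 50bp flank requirement comes from the text.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    target: str  # construct name, e.g. "chr11" or "hybrid_13_11" (placeholders)
    start: int   # zero-based, inclusive
    end: int     # zero-based, exclusive


def spans_region(aln, region_start, region_end, min_flank=50):
    """True if the alignment covers the region plus min_flank on each side."""
    return (aln.start <= region_start - min_flank
            and aln.end >= region_end + min_flank)


def count_spanning(alignments, target, region_start, region_end, min_flank=50):
    """Count alignments to one construct that fully span the region."""
    return sum(
        1 for aln in alignments
        if aln.target == target
        and spans_region(aln, region_start, region_end, min_flank)
    )
```

For example, with a duplicated region placed at 1,000 to 16,000 on a construct, a read aligned from 900 to 16,100 counts as spanning, while one aligned from 960 to 16,100 does not, because it reaches only 40bp into the left flank.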
De novo long-read assemblies of pfhrp3-deleted strains further confirmed hybrid 13-11 chromosome
To further examine the parasites with hybrid 13-11 chromosomes and exclude potentially more complicated structural alterations involving other regions of the genome, de novo whole-genome assemblies were created for the HB3 and SD01 lab strains from Nanopore long reads. The HB3 assembly yielded 16 contigs representing complete chromosomes (N50 1,5985,898 and L50 5). TARE-114 was detected on the ends of all chromosomes except for the 3’ end of chromosome 7 and the 5’ end of chromosome 5, indicating that telomere-to-telomere coverage had been achieved. SD01, however, with lower sequencing coverage, had a more disjointed assembly with 200 final contigs (N50 263,459, and L50 30). The HB3 and SD01 assemblies both had a chromosome 11 that closely matched normal 3D7 chromosome 11 and a separate hybrid 13-11 that closely matched 3D7 chromosome 13 until the ribosomal duplication region, where it then best matched chromosome 11 (Figure 3, Supplemental Figure 22). HB3’s 11 and hybrid 13-11 chromosomes had TARE-1 at their ends14, indicating that these chromosomes were complete. These new assemblies were further annotated for genes by Companion37. The contig matching the hybrid 13-11 for both strains essentially contained a duplicated portion of chromosome 11 telomeric to the ribosomal duplication. The duplicated genes within this segment included pf332 (PF3D7_1149000), two ring erythrocyte surface antigen genes (PF3D7_1149200, PF3D7_1149500), three PHIST genes, a FIKK family gene, and two hypothetical proteins and ended with a DnaJ gene (PF3D7_1149600), corresponding to 3D7 genes PF3D7_1148700 through PF3D7_1149600 (Supplemental Figure 23 and Supplemental Figure 24). Homology between HB3 chromosomes 11 and 13-11 continued up through a rifin, then a stevor gene, and then the sequence completely diverged in the most telomeric region with a different gene family organization structure but both consisting of stevor, rifin, and var gene families along with other paralogous gene families (Supplemental Figure 23). The chromosome 13-11 SD01 contig reached the DnaJ protein (PF3D7_1149600) and terminated (Supplemental Figure 24), while normal 11 continued through 2 var genes and 4 rifin genes, likely because the assembly was unable to contend with the nearly identical sequence between the two chromosomes. Examination of the longer normal 11 portion revealed two-fold coverage and no variation. Therefore, it is likely that SD01 has identical chromosome 11 segments intact to the telomere on each chromosome.
Analysis of the 11 other PacBio assemblies30 with normal chromosome 11 showed that homology between strains also ended at this DnaJ gene (PF3D7_1149600) with the genes immediately following being within the stevor, rifin, and var gene families among other paralogous gene families. The genes on chromosome 13 deleted in the hybrid chromosome 13-11 corresponded to 3D7 genes PF3D7_1371500 through PF3D7_1373500 and include notably pfhrp3 and EBL-1 (PF3D7_1371600). The de-novo long-read assemblies of HB3 and SD01 further confirmed the presence of a normal chromosome 11 and hybrid chromosome 13-11 without other structural alterations.
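For readers unfamiliar with the N50 and L50 statistics quoted for the HB3 and SD01 assemblies, the sketch below shows the standard definitions: N50 is the length of the contig at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly size, and L50 is the number of contigs needed to get there. The function name is hypothetical, and the contig lengths in the example are made up for illustration.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths."""
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("contig_lengths must not be empty")


# Toy example: total length 30, so half is 15; the two longest contigs
# (10 + 8 = 18) are enough to reach it.
example = n50_l50([4, 10, 2, 8, 6])
```

Here `example` is `(8, 2)`: an N50 of 8 and an L50 of 2.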
Genomic refinement of breakpoint location for 13-11++
To better define the breakpoint, we examined microhaplotypes within the 15.2kb ribosomal duplication for the 98 13-11++ strains containing near-perfect chromosome 11 segments (Supplemental Figure 8). Within each strain, the microhaplotypes in the telomeric region are identical, consistent with a continuation of the adjacent chromosome 11 duplication. However, for nearly all strains, as the region traverses towards the centromere within the ribosomal duplication, there is an abrupt transition where the haplotypes begin to differ. These transition points vary but are shared within specific groupings correlating with the chromosome 11 microhaplotypes (Supplemental Figure 8). These transition points likely represent NAHR exchange breakpoints, and their varied locations further support that multiple intrastrain translocation events have given rise to 13-11++ parasites in the population.
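The transition point described above can be located with a simple scan: starting from the telomeric end of the ribosomal duplication, walk toward the centromere while the two copies carry the same microhaplotype, and stop at the first mismatch. The Python sketch below assumes each copy is given as a list of microhaplotype calls ordered from centromere to telomere; this representation and the function name are illustrative assumptions rather than the procedure used in the study.

```python
def identical_telomeric_run_start(copy_a, copy_b):
    """Index where the identical telomeric run of microhaplotypes begins.

    Both inputs are ordered centromere -> telomere. The return value is
    the most centromeric index of the contiguous run of identical calls
    that ends at the telomeric end. It equals 0 when the copies agree
    everywhere and len(copy_a) when even the last region differs.
    """
    if len(copy_a) != len(copy_b):
        raise ValueError("copies must have the same number of regions")
    start = len(copy_a)
    for i in range(len(copy_a) - 1, -1, -1):
        if copy_a[i] != copy_b[i]:
            break
        start = i
    return start
```

Strains sharing the same returned index would fall into the same breakpoint grouping in this simplified view; differing indices between groups correspond to the varied transition points noted in the text.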
Discussion
Here, we used publicly available short-read and long-read data from parasites across the world and newly generated long-read sequencing data to identify pfhrp2 and pfhrp3 deletions and their mechanisms in field P. falciparum parasites. The limited number of pfhrp2-deleted strains showed chromosome 8 breakpoints predominantly in the gene with evidence of telomere healing, a common repair mechanism in P. falciparum16,18. We found that pfhrp3 deletions occurred through three different mechanisms. The least common mechanism was also the simplest, involving breakage and loss of chromosome 13 from pfhrp3 to the telomere, followed by telomere healing (13- pattern). The second most common pattern, 13-5++, was likely the result of NAHR within 20-28bp di-nucleotide AT repeats, translocating a 26,535bp region of chromosome 5 containing pfmdr1 onto chromosome 13, thereby duplicating pfmdr1 and deleting pfhrp3. There appeared to be one origin of 13-5++, which was only observed in the Asia population, and its continued presence was potentially driven by the added benefit of pfmdr1 duplication in the presence of mefloquine. The most common pattern, 13-11++, predominated in the Americas and Africa and was the result of NAHR between chromosomes 11 and 13 within the large 15.2kb highly identical ribosomal duplication, translocating and thereby duplicating 70,175bp of core chromosome 11 plus 15-87kb of paralogous sub-telomeric region, replacing the chromosomal region on chromosome 13 that contained pfhrp3. Importantly, NAHR-mediated translocations resulting in deletion have repeatedly occurred based on evidence of multiple breakpoints and chromosome 11 haplotypes with identical copies in parasites. These findings combined with identical copies of the shared chromosome 11 segment suggest that these parasites represent multiple instances of intrastrain (self) NAHR-mediated translocation followed by clonal propagation of 13-11++ progeny. While Hinterberg et al. proposed that a general mechanism of non-homologous recombination of the subtelomeric regions may be responsible for translocating the already existing deletion of pfhrp312, our analysis would suggest ribosomal duplication-mediated NAHR is the likely cause of the pfhrp3 deletion itself. The high frequency of the meiotic translocation in the laboratory cross further supports the hypothesis that these NAHR-mediated translocations are occurring at a high frequency in meiosis in natural populations. Consequently, this suggests that progeny must be strongly selected against in natural populations apart from where specific conditions exist, allowing for pfhrp3 deletion to emerge and expand (e.g., South America and the Horn of Africa).
Positive selection due to drug resistance may underlie the pattern 13-5++ translocation that duplicates pfmdr1 onto chromosome 13. In Southeast Asia, the only place containing pattern 13-5++, tandem duplications of pfmdr1 that provide mefloquine resistance already exist, and mefloquine has been used extensively as an artemisinin partner drug, unlike in Africa38. Discordant reads, local assembly, and TARE1 identification support NAHR-mediated translocation of pfmdr1 followed by telomeric healing to create a functional chromosome. All strains showed exactly the same NAHR breakpoint and TARE1 localization, consistent with a single origin event giving rise to all 13-5++ parasites. Interestingly, pfmdr1 duplications have been shown to be unstable with both increases and decreases in copy numbers frequently occurring39. During de-amplification, a free fragment of DNA containing a pfmdr1 copy may have been the substrate that integrated into chromosome 13 by NAHR, followed by telomerase healing to stabilize the 13-5 hybrid chromosome, analogous to var gene recombination events where double-stranded DNA is displaced, becoming highly recombinogenic17. A clonal expansion of 13-5++ parasites could be due to the benefit of the extra pfmdr1 copy on chromosome 13, the loss of pfhrp3, or both. Its expansion in Southeast Asia would be consistent with selection due to copy-number-associated mefloquine resistance given mefloquine’s extensive use both on its own and as an artemisinin partner drug in the region. Furthermore, given all isolates with evidence of this duplication either had only wild-type pfmdr1 or were mixed, the 13-5 chromosome copy most likely had wild-type pfmdr1. In 20 out of 27 pfmdr1 duplication cases with a mixed genotype of pfmdr1 (Supplemental Figure 5), the core genome pfmdr1 had the Y184F mutation with no other mutations detected within the pfmdr1 gene. Isolates containing only Y184F in pfmdr1 were shown to be outgrown by wild-type pfmdr140, which would mean having the wild-type pfmdr1 duplication on chromosome 13 might confer a stable (non-tandem) “heterozygous” survival advantage beyond just increased copy number-mediated resistance to mefloquine.
To confirm the NAHR event between 11 and 13 leading to loss of pfhrp3 observed in our analysis of short-read data, we long-read sequenced pfhrp3 deleted lab isolates, HB3 and SD01, to generate reads spanning the 15kb duplicated region, showing support for a normal chromosome 11 and a hybrid 13-11 in both isolates. These findings supported an NAHR event between the two 15kb duplicated regions causing this interchromosomal exchange and leading to progeny with a hybrid 13-11 chromosome lacking pfhrp3 and its surrounding genes from the 15kb duplicated region and onwards (Figure 3 and Figure 4). This was consistent with the genomic coverage pattern we observed in publicly available data from 122 pfhrp3-deleted field samples (Figure 1). Such translocation patterns have been described and also confirmed by long-read sequencing but have generally involved multigene families such as var genes within the subtelomeres16,17. The event described here represented a much larger section of loss/duplication of 70kb of the core genome in addition to the subtelomere.
We propose a mechanistic model in which homology misalignment and recombination between chromosomes 11 and 13 initially occurs in an oocyst from identical parasites predominately in low transmission settings, resulting in four potential progeny including one with normal chromosomes 11 and 13 and three with translocations (Figure 6). This could account for the identical haplotypes observed in the two copies of the chromosome 11 segment. Based on the identical haplotypes observed in the majority of parasites, the most direct and likely mechanism involves progeny with two copies of chromosome 11 recombining with an unrelated strain to yield unrelated chromosome 11 haplotypes. This duplication-mediated NAHR event occurs frequently during meiosis and can explain the frequent rearrangements seen between chromosomes 11 and 13 in the previous experimental cross of HB3 x DD212. Meiotic misalignment and subsequent NAHR is a common cause of high-frequency chromosomal rearrangements including in human disease (eg. in humans, 22q11 deletion syndrome due to misalignment of duplicated blocks on chromosome 22 occurs in 1 in 4000 births)41. This high frequency could explain why pfhrp3-deleted isolates are more common in many populations relative to pfhrp29,42,43, which likely requires infrequent random breaks along with rescue by telomere healing. In the future, more extensive sequencing of RDT-negative P. falciparum parasites is needed to confirm that there are no other deletion mechanisms responsible for pfhrp2 loss.
The lack of hybrid chromosome 13-11 worldwide suggests such events are normally quickly removed from the population due to fitness costs, an idea supported by recent in vitro competition studies in culture showing decreased fitness of pfhrp2/3-deleted parasites44. This decreased fitness of parasites with pfhrp2/3 deletions also argues against a mitotic origin as deletions arising after meiosis would have to compete against more numerous and more fit intact parasites. Additionally, pfhrp3 deletions arising in culture have not been observed. The fact that abundant pfhrp3 deletions have only been observed in low-transmission areas where within-infection competition is rare is consistent with this hypothesis of within-infection competition suppressing emergence. In the setting of RDT use, existing pfhrp3 deletions in such a low-transmission environment may provide a genetic background on which less frequent pfhrp2 deletion events can occur, leading to a fully RDT-negative parasite. This is supported by evidence that pfhrp3 deletion appears to predate pfhrp2 deletions in the Horn of Africa9.
The biological effects of pfhrp2 and pfhrp3 loss and potential selective forces are complicated due to other genes lost and gained and the extent of the rearrangements. Increased copies of genes on chromosome 11 could be beneficial, as pf332 on the chromosome 11 duplicated segment was found to be essential for the binding of the Maurer cleft to the erythrocyte skeleton and is highly expressed in patients with cerebral malaria45. Conversely, lack of this protein is likely detrimental to survival and may be the reason the reciprocal hybrid 11-13 was not observed in field isolates. Only lab isolate FCR3 had any indication from coverage data that it had a duplicated chromosome 13 and a deleted chromosome 11. Given that the majority of the publicly available field samples were collected from studies using RDT-positive samples and that RDT would have likely detected the increased PfHRP3 encoded by duplicated pfhrp3, sampling should not be biased against detecting parasites with this reciprocal hybrid 11-13. Thus, the lack of 11-13 rearrangement in field isolates suggests that the selective disadvantage of the lost and gained genes was strong enough to prevent its emergence in the natural parasite population.
While further studies are needed to determine the reasons for these geographical patterns of pfhrp3 deletions, our results provide an improved understanding of the mechanism of structural variation underlying pfhrp3 deletion. They also suggest general constraints against emergence in high-transmission regions due to within-host competition and that there are likely further specific requirements for emergence in low-transmission settings. If the selective constraints on pfhrp2 and pfhrp3 deletions are similar, the high frequency of the NAHR-mediated loss and the additional drug pressure from duplication of pfmdr1 may explain why pfhrp3 loss precedes pfhrp2 loss, even though RDT pressure presumably confers a stronger survival advantage on loss of pfhrp2 than on loss of pfhrp3. However, given that we still have a limited understanding of their biological roles, there may be situations where selective forces favor loss of pfhrp2 relative to pfhrp3. Overall, our findings are clinically important because continued loss of these genes without timely intervention may result in a rapid decrease in the sensitivity of HRP2-based RDTs. Future studies focused on these deletions, including representative sampling, are needed to determine the prevalence, interactions, and impacts of pfhrp2 and pfhrp3 deletions and to examine the selective pressures and complex biology underlying them.
Materials and methods
Genomic locations and read-depth analysis
Conserved non-paralogous genomic regions surrounding pfhrp2 and pfhrp3 were determined to study the genomic deletions encompassing these genes. This was accomplished by first marking the 3D7 genome with the program Tandem Repeats Finder46, then taking 200 bp windows, stepping every 100 bp between these tandem repeats, and using LASTZ47 (version 1.04.22) to align these regions against the reference genome 3D7 (version 3, 2015-06-18) and 10 currently available chromosomal-level PacBio assembled genomes30 that lacked pfhrp2 and pfhrp3 deletions. Regions that aligned at >70% identity only once in each genome were kept, and overlapping regions were then merged. Regions from within the duplicated region on chromosome 11 and chromosome 13 were kept if they aligned to either chromosome 11 or 13 but not to other chromosomes. Local haplotype reconstruction was performed on each region for 19,289 publicly available whole-genome sequences (WGS) of P. falciparum field samples, including 24 samples from a recent study in Ethiopia9 where pfhrp2-/3- parasites are common. The called haplotypes were compared to determine which subregions contained variation in order to genotype each chromosome (Supplemental Figure 25). Coverage was determined for each genomic region by dividing the read depth of the region by the median read depth across the whole genome within each sample. Windows were examined on chromosomes 8, 11, and 13 starting from chromosome 8 position 1,290,239, chromosome 11 position 1,897,151, and chromosome 13 position 2,769,916 and extending to their respective telomeres. The analyzed region on chromosome 5 that included pfmdr1 spanned from 929,384 to 988,747. All coordinates in the manuscript are zero-based and relative to the P. falciparum 3D7 genome version 3 (version=2015-06-18) (Supplemental Figure 25, Supplemental Figure 26).
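As an illustration only, the windowing and coverage normalization described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the pipeline used in this study: the function names, the half-open interval convention, and the toy inputs are hypothetical, and tandem repeat intervals are assumed to have been computed beforehand.

```python
from statistics import median


def windows_between_repeats(chrom_len, repeats, size=200, step=100):
    """Return zero-based, half-open (start, end) windows that avoid tandem repeats.

    `repeats` is a list of zero-based, half-open (start, end) repeat intervals
    (assumed input format; not taken from the manuscript).
    """
    # Collect the repeat-free gaps along the chromosome.
    gaps = []
    pos = 0
    for r_start, r_end in sorted(repeats):
        if r_start > pos:
            gaps.append((pos, r_start))
        pos = max(pos, r_end)
    if pos < chrom_len:
        gaps.append((pos, chrom_len))

    # Tile each gap with fixed-size windows at a fixed step.
    windows = []
    for g_start, g_end in gaps:
        for start in range(g_start, g_end - size + 1, step):
            windows.append((start, start + size))
    return windows


def normalized_coverage(region_depths, genome_depths):
    """Divide each region's read depth by the sample's genome-wide median depth."""
    med = median(genome_depths)
    if med == 0:
        raise ValueError("genome-wide median depth is zero")
    return {region: depth / med for region, depth in region_depths.items()}
```

Under this normalization, a value near 1 corresponds to single-copy coverage, a value near 2 is consistent with a duplicated segment, and a value near 0 is consistent with a deletion.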
Tandem repeat associated element 1 (TARE1) analysis and telomere healing determination
The 7 bp pattern TT[CT]AGGG14 was used to determine the presence of TARE1; the pattern was required to occur at least twice in tandem. To search for TARE1 within the short-read Illumina WGS datasets, reads from the entire regions of interest were pulled down across chromosomes 5, 8, 11, and 13, and each read was searched for the above TARE1 pattern. Regions that had TARE1 detected in their reads were then assembled to ensure the TARE1 sequence was contiguous with the genomic region from which the reads were pulled. The regions in which TARE1 was contiguous with genomic sequence were then compared to the coverage pattern within the area, and a parasite was marked as having evidence of telomere healing if TARE1 was detected at the point where coverage then dropped to 0, or, in the case of the chromosome 5 duplication, dropped to the coverage of the rest of the genome.
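The motif search described above can be expressed as a regular expression. The following Python sketch is illustrative only; the function name is hypothetical, and searching the reverse-complement motif is an added assumption (reads may derive from either strand) rather than a detail stated in the methods.

```python
import re

# TT[CT]AGGG repeated at least twice in tandem, as described in the text.
TARE1_FWD = re.compile(r"(?:TT[CT]AGGG){2,}")
# Reverse complement of the motif (assumption: reads may come from either strand).
TARE1_REV = re.compile(r"(?:CCCT[AG]AA){2,}")


def has_tare1(read):
    """Return True if the read contains at least two tandem copies of the TARE1 motif."""
    seq = read.upper()
    return bool(TARE1_FWD.search(seq) or TARE1_REV.search(seq))
```

A read containing a single copy of the motif is not counted, which reduces chance matches to a 7 bp pattern in an AT-rich genome.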
Chromosome 5 pfmdr1 duplication breakpoint determination
The breakpoints of recombination on chromosome 5 were determined by looking for discordant read pairs (pairs whose mates map to different chromosomal positions) around areas of interest, assembling the discordant pairs into contigs, and mapping the reads back to the assembled contigs. Breakpoints were then taken as the coordinates where a contig switches from one chromosomal region to the next.
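The two ideas in this procedure, flagging discordant pairs and reading breakpoints off a contig's alignment segments, can be sketched as follows. This is a simplified illustration under stated assumptions: the record types, the `max_insert` threshold, and the function names are hypothetical, strand is ignored, and in practice these values would come from an aligner's output rather than hand-built tuples.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    """Mapping positions of the two mates of a read pair (hypothetical record)."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


class Segment(NamedTuple):
    """One aligned piece of an assembled contig (hypothetical record)."""
    contig_start: int
    contig_end: int
    chrom: str
    ref_start: int
    ref_end: int


def is_discordant(pair, max_insert=1000):
    """Flag pairs whose mates map to different chromosomes or too far apart.

    `max_insert` is an assumed threshold, not a value from the manuscript.
    """
    return pair.chrom1 != pair.chrom2 or abs(pair.pos2 - pair.pos1) > max_insert


def breakpoints(segments):
    """Return reference coordinates where a contig switches between regions.

    Each breakpoint is ((chrom, ref_end of left piece), (chrom, ref_start of right piece)).
    """
    ordered = sorted(segments, key=lambda s: s.contig_start)
    return [
        ((left.chrom, left.ref_end), (right.chrom, right.ref_start))
        for left, right in zip(ordered, ordered[1:])
    ]
```

A contig that aligns as a single segment yields no breakpoints; a contig split into two segments yields one junction linking the end of the first aligned region to the start of the second.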
Homologous genomic structure
To investigate the genomic landscape of recent segmental duplications across the genome and around pfhrp2 and pfhrp3, an all-by-all comparison of the 3D7 reference genome was performed by first finding k-mers of size 31 that were unique within each chromosome and then determining the locations of these k-mers in the other chromosomes. K-mers found at adjacent positions in both chromosomes were then merged into larger regions.
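A minimal Python sketch of this k-mer comparison for one pair of sequences is given below. It is an illustration under simplifying assumptions rather than the code used here: it considers the forward strand only, requires each k-mer to be unique in both sequences, and holds all k-mers in memory; the function names are hypothetical.

```python
from collections import Counter


def unique_kmers(seq, k=31):
    """Map each k-mer that occurs exactly once in `seq` to its zero-based start."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    return {kmer: i for i, kmer in enumerate(kmers) if counts[kmer] == 1}


def shared_regions(seq_a, seq_b, k=31):
    """Return merged (start_a, start_b, length) regions shared by two sequences."""
    pos_a = unique_kmers(seq_a, k)
    pos_b = unique_kmers(seq_b, k)
    hits = sorted((pa, pos_b[kmer]) for kmer, pa in pos_a.items() if kmer in pos_b)

    regions = []
    for pa, pb in hits:
        if regions:
            start_a, start_b, length = regions[-1]
            # The next adjacent k-mer starts one base after the last k-mer
            # already covered by the region, in both sequences.
            if pa == start_a + length - k + 1 and pb == start_b + length - k + 1:
                regions[-1] = (start_a, start_b, length + 1)
                continue
        regions.append((pa, pb, k))
    return regions
```

Running this over every pair of chromosomes would give an all-by-all map of shared segments; handling of reverse-complement matches is omitted from the sketch.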
Comparisons within Laverania
To investigate the origins of this region shared between chromosomes 11 and 13, the six closest relatives of Plasmodium falciparum within the Laverania subgenus with available assembled genomes were examined36. The genomes of all Laverania have recently been sequenced and assembled using PacBio and Illumina data36. The assemblies were analyzed using their annotations and by using LASTZ47 with 80% identity and 90% coverage of the genes in the surrounding regions on chromosomes 5, 8, 11, and 13.
Long-read sequences
All PacBio reads for strains with known or suspected pfhrp3 deletions were obtained by SRA accession numbers from the National Center for Biotechnology Information (NCBI): HB3/Honduras (ERS712858) and SD01/Sudan (ERS746009)30. To supplement these reads and to improve upon previous assemblies that were unable to fully assemble chromosomes 11 and 13, we further sequenced these strains using Oxford Nanopore Technologies’ MinION device48–50. The P. falciparum lab isolate HB3/Honduras (MRA-155) was obtained from the National Institute of Allergy and Infectious Diseases’ BEI Resources Repository, while the field strain SD01/Sudan was obtained from the Department of Cellular and Applied Infection Biology at Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen University in Germany. Nanopore base-calling was done with Guppy version 5.0.7. Genome assemblies were performed with Canu51, an assembly pipeline for high-noise single-molecule sequencing, and Flye52 using default settings. In order to assemble the low-coverage and highly similar chromosome 11 and 13 segments of SD01, two assemblies were performed with Flye, one using chromosome 13-specific reads and one using chromosome 11-specific reads, to obtain contigs that represented the chromosome 11 and 13 segments. HB3 was assembled using the Canu assembler with default settings. Note that SD01 had a more disjointed assembly, likely because the sample came from the last remaining cryopreserved vial, which had low parasitemia and was nonviable, resulting in a lower amount of input DNA. The PacBio/Nanopore reads were mapped to reference genomes using Minimap2, a sequence alignment program53. Mappings were visualized using custom R scripts54–56.
Data and resource availability
Nanopore data are available from the SRA (Project # pending). The datasets generated and/or analyzed during the current study are available at https://seekdeep.brown.edu/Analysis_Surrounding_HRP2_3_deletions/, while the code for analyzing Nanopore reads can be found in the GitHub repository https://github.com/bailey-lab/hrp3.
Acknowledgements and funding sources
Acknowledgements
We thank Drs. Ngwa Julius Che, Matthias Frank, and Gabriele Pradel from Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen University for generously providing a residual SD01 sample. The following reagent was obtained through BEI Resources, NIAID, NIH: Plasmodium falciparum, Strain HB3, MRA-155, contributed by Thomas E. Wellems.
Funding Sources
We thank the National Institute of Allergy and Infectious Diseases (NIAID) for their support via the grants R01AI132547 (JJJ, JBP, and JAB) and K24AI134990 (JJJ).
Competing Interests
JBP reports past research support from the World Health Organization focused on pfhrp2 and pfhrp3 deletions, as well as research support from Gilead Sciences, non-financial support from Abbott Laboratories, and consulting for Zymeron Corporation outside the scope of this manuscript. All other authors declare that they have no competing interests.
References
- 1. World malaria report 2022
- 2. Malaria diagnostics: now and the future. Parasitology 141:1873–1879
- 3. HRP2: Transforming Malaria Diagnosis, but with Caveats. Trends Parasitol 36:112–126
- 4. Genetic diversity of Plasmodium falciparum histidine-rich protein 2 (PfHRP2) and its effect on the performance of PfHRP2-based rapid diagnostic tests. J. Infect. Dis 192:870–877
- 5. Good practices for selecting and procuring rapid diagnostic tests for malaria
- 6. Plasmodium falciparum parasites lacking histidine-rich protein 2 and 3: a review and recommendations for accurate reporting. Malar. J 13
- 7. Prevalence of Plasmodium falciparum lacking histidine-rich proteins 2 and 3: a systematic review. Bull. World Health Organ 98:558–568
- 8. Statement by the Malaria Policy Advisory Group on the urgent need to address the high prevalence of pfhrp2/3 gene deletions in the Horn of Africa and beyond
- 9. Plasmodium falciparum is evolving to escape malaria rapid diagnostic tests in Ethiopia. Nat Microbiol 6:1289–1299
- 10. HRP-2 deletion: a hole in the ship of malaria elimination. Lancet Infect. Dis 18:826–827
- 11. Major Threat to Malaria Control Programs by Plasmodium falciparum Lacking Histidine-Rich Protein 2, Eritrea. Emerg. Infect. Dis 24:462–470
- 12. Interchromosomal exchange of a large subtelomeric segment in a Plasmodium falciparum cross. EMBO J 13:4174–4180
- 13. Genome sequence of the human malaria parasite Plasmodium falciparum. Nature 419:498–511
- 14. Sequence and structure of a Plasmodium falciparum telomere. Mol. Biochem. Parasitol 28:85–94
- 15. Subtelomeric chromosome instability in Plasmodium falciparum: short telomere-like sequence motifs found frequently at healed chromosome breakpoints. Mutat. Res 324:115–120
- 16. Chromosome End Repair and Genome Stability in Plasmodium falciparum. MBio 8
- 17. Rapid antigen diversification through mitotic recombination in the human malaria parasite Plasmodium falciparum. PLoS Biol 17
- 18. DNA repair mechanisms and their biological roles in the malaria parasite Plasmodium falciparum. Microbiol. Mol. Biol. Rev 78:469–486
- 19. Telomere length dynamics in response to DNA damage in malaria parasites. iScience 24
- 20. Origins and functional impact of copy number variation in the human genome. Nature 464:704–712
- 21. Paired-end mapping reveals extensive structural variation in the human genome. Science 318:420–426
- 22. A human genome structural variation sequencing resource reveals insights into mutational mechanisms. Cell 143:837–847
- 23. Mapping copy number variation by population-scale genome sequencing. Nature 470:59–65
- 24. Detecting non-allelic homologous recombination from high-throughput sequencing data. Genome Biol 16
- 25. Impact of homologous and non-homologous recombination in the genomic evolution of Escherichia coli. BMC Genomics 13
- 26. Multiple var2csa-type PfEMP1 genes located at different chromosomal loci occur in many Plasmodium falciparum isolates. PLoS One 4
- 27. DNA secondary structures are associated with recombination in major Plasmodium falciparum variable surface antigen gene families. Nucleic Acids Res 42:2270–2281
- 28. Mitotic evolution of Plasmodium falciparum shows a stable core genome but recombination in antigen families. PLoS Genet 9
- 29. Generation of antigenic diversity in Plasmodium falciparum by structured rearrangement of Var genes during mitosis. PLoS Genet 10
- 30. Long read assemblies of geographically dispersed Plasmodium falciparum isolates reveal highly structured subtelomeres. Wellcome Open Res 3
- 31. Large deletions result from breakage and healing of P. falciparum chromosomes. Cell 55:869–874
- 32. Cloning and characterization of chromosome breakpoints of Plasmodium falciparum: breakage and new telomere formation occurs frequently and randomly in subtelomeric genes. Nucleic Acids Res 20:1491–1496
- 33. Plasmodium falciparum telomerase: de novo telomere addition to telomeric and nontelomeric sequences and role in chromosome healing. Mol. Cell. Biol 18:919–925
- 34. Recurrent gene amplification and soft selective sweeps during evolution of multidrug resistance in malaria parasites. Mol. Biol. Evol 24:562–573
- 35. Versatile and open software for comparing large genomes. Genome Biol 5
- 36. Genomes of all known members of a Plasmodium subgenus reveal paths to virulent human malaria. Nat Microbiol 3:687–697
- 37. Companion: a web server for annotation and analysis of parasite genomes. Nucleic Acids Res 44:W29–34
- 38. Globally prevalent PfMDR1 mutations modulate Plasmodium falciparum susceptibility to artemisinin-based combination therapies. Nat. Commun 7
- 39. The landscape of inherited and de novo copy number variants in a Plasmodium falciparum genetic cross. BMC Genomics 12
- 40. Balanced impacts of fitness and drug pressure on the evolution of PfMDR1 polymorphisms in Plasmodium falciparum. Malar. J 20
- 41. Deletions and duplications of the 22q11.2 region in spermatozoa from DiGeorge/velocardiofacial fathers. Molecular Cytogenetics 7. https://doi.org/10.1186/s13039-014-0086-3
- 42. High-throughput Plasmodium falciparum hrp2 and hrp3 gene deletion typing by digital PCR to monitor malaria rapid diagnostic test efficacy. Elife 11
- 43. Evaluation of Histidine-Rich Proteins 2 and 3 Gene Deletions in Plasmodium falciparum in Endemic Areas of the Brazilian Amazon. Int. J. Environ. Res. Public Health 18
- 44. Fitness Costs of pfhrp2 and pfhrp3 Deletions Underlying Diagnostic Evasion in Malaria Parasites. J. Infect. Dis 226:1637–1645
- 45. Proteomic analysis of Plasmodium falciparum parasites from patients with cerebral and uncomplicated malaria. Sci. Rep 6
- 46. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res 27:573–580
- 47. Improved pairwise alignment of genomic DNA. The Pennsylvania State University
- 48. Characterization of individual polynucleotide molecules using a membrane channel. Proc. Natl. Acad. Sci. U. S. A 93:13770–13773
- 49. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol 17
- 50. The potential and challenges of nanopore sequencing. Nat. Biotechnol 26:1146–1153
- 51. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 27:722–736
- 52. Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol 37:540–546
- 53. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34:3094–3100
- 54. R: A Language and Environment for Statistical Computing
- 55. ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics
- 56. ComplexHeatmap: Make Complex Heatmaps
Copyright
© 2024, Hathaway et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.