Blood samples from Zanzibar and coastal Tanzania.

Parasites between Zanzibar and coastal mainland Tanzania are highly related but microstructure within Zanzibar is apparent.

A) Principal Component Analysis (PCA) comparing parasites from symptomatic vs. asymptomatic patients from coastal Tanzania and Zanzibar. Clusters with an identity by descent (IBD) value greater than 0.90 were limited to a single representative infection to prevent local structure of highly related isolates within shehias from driving clustering. B) A Discriminant Analysis of Principal Components (DAPC) was performed using isolates with unique pseudohaplotypes, pruning highly related isolates to a single representative infection. Only districts with at least 5 isolates remaining were included, so that each had sufficient samples for the DAPC. For the inset map, district coordinates (e.g. Mainland, Kati, etc.) were calculated as the average of the shehia centroids within each district.
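The pruning step described in this legend can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that a "cluster" is a connected component of samples linked by pairwise IBD above 0.90, and that the representative is simply the first sample encountered. The function name `prune_related` and the sample identifiers are hypothetical.

```python
def prune_related(samples, pairwise_ibd, threshold=0.90):
    """Keep one representative per cluster of highly related samples.

    samples:      list of sample IDs (order decides which one is kept)
    pairwise_ibd: dict mapping (sample_a, sample_b) -> IBD estimate
    threshold:    pairs with IBD strictly above this value are linked
    """
    # Union-find: every sample starts as its own cluster root.
    parent = {s: s for s in samples}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the clusters of every pair that exceeds the threshold.
    for (a, b), ibd in pairwise_ibd.items():
        if ibd > threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    # Assumption: the first sample seen in each cluster is the representative.
    seen_roots = set()
    representatives = []
    for s in samples:
        root = find(s)
        if root not in seen_roots:
            seen_roots.add(root)
            representatives.append(s)
    return representatives


# Hypothetical example: s1, s2 and s3 are chained by IBD > 0.90; s4 is unrelated.
example_ibd = {("s1", "s2"): 0.95, ("s2", "s3"): 0.92, ("s3", "s4"): 0.10}
kept = prune_related(["s1", "s2", "s3", "s4"], example_ibd)
```

Linking by connected components means two samples can be collapsed together even if their own pairwise IBD is below 0.90, as long as a chain of highly related samples joins them. A stricter definition (for example, complete linkage) would keep more isolates.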

Coastal Tanzania and Zanzibari parasites have more highly related pairs within their given region than between regions.

K-means clustering was performed on the geographic coordinates of all shehias present in the sample population to generate 5 clusters (colored boxes). All shehias were included to assay differences in pairwise IBD throughout Zanzibar. K-means cluster assignments were converted into interpretable geographic names: Pemba, Unguja North (Unguja_N), Unguja Central (Unguja_C), Unguja South (Unguja_S) and mainland Tanzania (Mainland). Pairwise comparisons of within-cluster IBD (column 1 of IBD distribution plots) and between-cluster IBD (columns 2-5 of IBD distribution plots) were made for all clusters. All IBD values greater than 0 were plotted for each comparison. In general, within-cluster comparisons contained more pairs with high IBD.
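As a rough illustration of the procedure in this legend, the sketch below clusters coordinates with a plain Lloyd-style k-means and then groups pairwise IBD values by cluster pair. Several choices are assumptions rather than details from the source: the deterministic farthest-point initialisation, the use of Euclidean distance on raw coordinates, and the names `kmeans` and `split_pairs_by_cluster`.

```python
from collections import defaultdict


def sq_dist(p, q):
    """Squared Euclidean distance between two (x, y) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, n_iter=100):
    """Return a cluster label (0..k-1) for each point."""
    # Assumed initialisation: start from the first point, then repeatedly
    # add the point farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))

    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return labels


def split_pairs_by_cluster(pairwise_ibd, cluster_of):
    """Group IBD values > 0 by the (sorted) pair of clusters they connect."""
    groups = defaultdict(list)
    for (a, b), ibd in pairwise_ibd.items():
        if ibd > 0:
            key = tuple(sorted((cluster_of[a], cluster_of[b])))
            groups[key].append(ibd)
    return dict(groups)


# Hypothetical coordinates forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = kmeans(points, k=2)

# Hypothetical samples and IBD values; cluster names follow the legend.
cluster_of = {"a": "Pemba", "b": "Pemba", "c": "Mainland"}
groups = split_pairs_by_cluster(
    {("a", "b"): 0.5, ("a", "c"): 0.0, ("b", "c"): 0.1}, cluster_of
)
```

A key whose two cluster names are equal holds the within-cluster comparisons; any other key holds between-cluster comparisons.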

Isolation by distance is shown between all Zanzibari parasites (A), only Unguja parasites (B) and only Pemba parasites (C).

Samples were analyzed by geographic location: Zanzibar (N=136) (A), Unguja (N=105) (B) or Pemba (N=31) (C). Great circle (GC) distances between pairs of parasite isolates were calculated from shehia centroid coordinates. These distances were binned in 4 km increments out to 12 km. IBD beyond 12 km is shown in Supplemental Figure 8. The maximum GC distance was 135 km across all of Zanzibar, 58 km on Unguja and 12 km on Pemba. The mean IBD and 95% CI are plotted for each bin.
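The distance calculation and binning can be sketched as below. This is an illustrative version under stated assumptions: the haversine formula with a mean Earth radius of 6371 km is used for the great circle distance, and bins are treated as half-open intervals [0, 4), [4, 8), [8, 12). The source does not specify either detail, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_ibd_by_distance_bin(pairs, bin_km=4.0, max_km=12.0):
    """Average IBD per distance bin.

    pairs: iterable of (distance_km, ibd) tuples, one per sample pair.
    Returns {bin_start_km: mean_ibd} for bins below max_km.
    """
    totals = {}  # bin start -> [sum of IBD, number of pairs]
    for dist, ibd in pairs:
        if dist >= max_km:
            continue  # pairs beyond the plotted range are skipped here
        start = (dist // bin_km) * bin_km
        entry = totals.setdefault(start, [0.0, 0])
        entry[0] += ibd
        entry[1] += 1
    return {start: total / count for start, (total, count) in sorted(totals.items())}


# One degree of longitude along the equator, roughly 111.19 km.
one_degree = great_circle_km(0.0, 0.0, 0.0, 1.0)

# Hypothetical (distance, IBD) pairs.
binned = mean_ibd_by_distance_bin(
    [(0.0, 0.5), (3.9, 0.3), (4.0, 0.2), (11.0, 0.1), (12.5, 0.9)]
)
```

Pairs within the same shehia share a centroid and therefore have a distance of 0 km, so they fall in the first bin.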

Highly related pairs span long distances across Zanzibar.

Sample pairs were filtered to have IBD estimates of 0.25 or greater. Within-shehia pairwise IBD estimates are shown within Unguja (Panel A) and Pemba (Panel B) as single points, with dark orange representing the greatest degree of IBD. Shehias labeled with black dots do not have within-shehia IBD estimates of 0.25 or greater. Between-shehia IBD reflects pairs of parasites with IBD greater than or equal to 0.25, with the color of the connecting arc representing the degree of IBD and yellow representing maximal connectivity. Panel C shows the network of highly related pairs (IBD >0.25) within and between the 6 northern Pemba shehias (note: Micheweni is a shehia in Micheweni district). Samples (nodes) are colored by shehia and IBD estimates (edges) are represented on a continuous scale with increasing width and yellow shading indicating higher IBD.
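The filtering behind these panels can be sketched as a small function that keeps pairs at or above the IBD cutoff and separates within-shehia from between-shehia pairs. This is a minimal illustration only; the function name `related_pairs`, the sample IDs and the shehia labels are hypothetical.

```python
def related_pairs(pairwise_ibd, shehia_of, min_ibd=0.25):
    """Split highly related pairs into within-shehia and between-shehia edges.

    pairwise_ibd: dict mapping (sample_a, sample_b) -> IBD estimate
    shehia_of:    dict mapping sample ID -> shehia name
    Returns two lists of (sample_a, sample_b, ibd) tuples.
    """
    within, between = [], []
    for (a, b), ibd in pairwise_ibd.items():
        if ibd < min_ibd:
            continue  # below the cutoff: not a highly related pair
        edge = (a, b, ibd)
        if shehia_of[a] == shehia_of[b]:
            within.append(edge)
        else:
            between.append(edge)
    return within, between


# Hypothetical samples from two placeholder shehias.
shehia_of = {"s1": "shehia_1", "s2": "shehia_1", "s3": "shehia_2"}
within, between = related_pairs(
    {("s1", "s2"): 0.80, ("s1", "s3"): 0.25, ("s2", "s3"): 0.05}, shehia_of
)
```

The resulting edge lists map directly onto a network view: samples are nodes, each tuple is an edge, and the IBD value can drive edge width or color.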

Complexity of infection (COI) and the Fws metric show a higher COI and lower Fws in asymptomatic than symptomatic infections in both mainland Tanzania and Zanzibar isolates.

COI (A) was estimated by the REAL McCOIL’s categorical method (H.-H. Chang et al., 2017). Mean COI was greater in asymptomatic than in symptomatic infections for all regions (MAIN-A: 2.5 (2.1-2.9), MAIN-S: 1.7 (1.6-1.9), p < 0.05, Wilcoxon-Mann-Whitney test and ZAN-A: 2.2 (1.7-2.8), ZAN-S: 1.7 (1.5-1.9), p = 0.05, Wilcoxon-Mann-Whitney test). Fws (B) was estimated using the formula 1 - (Hw/Hp), where Hw is the within-sample heterozygosity and Hp is the heterozygosity across the population. Mean Fws was lower in asymptomatic than in symptomatic samples (MAIN-A: 0.67 (0.6-0.7), MAIN-S: 0.85 (0.8-0.9), p < 0.05, Wilcoxon-Mann-Whitney test and ZAN-A: 0.73 (0.6-0.8), ZAN-S: 0.84 (0.8-0.9), p = 0.05, Wilcoxon-Mann-Whitney test). A nonparametric bootstrap was applied to calculate the mean and 95% confidence interval (CI) from the COI and Fws values.
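A nonparametric bootstrap of the kind mentioned in this legend can be sketched as follows. The percentile method, the number of resamples and the fixed seed are assumptions made for illustration; the source does not state which bootstrap variant or settings were used, and the input values are invented.

```python
import random


def bootstrap_mean_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Return (mean, ci_low, ci_high) using a percentile bootstrap.

    values: per-sample estimates (e.g. COI or Fws values)
    n_boot: number of resamples drawn with replacement (assumed setting)
    alpha:  1 - confidence level, so 0.05 gives a 95% CI
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample n values with replacement and record each resample's mean.
    boot_means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    # Percentile method: read the CI bounds off the sorted bootstrap means.
    low = boot_means[int((alpha / 2) * n_boot)]
    high = boot_means[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return sum(values) / n, low, high


# Hypothetical COI-like values for one group of infections.
mean, ci_low, ci_high = bootstrap_mean_ci([1, 2, 2, 3, 1, 2, 4, 1, 2, 3])
```

The same function applies unchanged to Fws values, since it only assumes a list of numbers per group.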

Drug resistance polymorphism prevalence in Zanzibar and coastal mainland Tanzania.