Falciparome phage library displays the proteome of Plasmodium falciparum in 62-aa peptides with 25-aa step size on T7 phage and also includes variant sequences of many antigens, including major …
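The tiling described in this legend (62-aa peptides, 25-aa step) can be illustrated with a minimal sketch. The function name `tile_protein` is hypothetical. How proteins shorter than one tile and the C-terminal remainder are handled is an assumption made here for illustration, not a detail stated in the legend.

```python
def tile_protein(seq: str, length: int = 62, step: int = 25) -> list[tuple[int, str]]:
    """Return (1-based start position, peptide) tiles covering `seq`."""
    # Assumption: a protein shorter than one tile is kept as a single peptide.
    if len(seq) <= length:
        return [(1, seq)]
    starts = list(range(0, len(seq) - length + 1, step))
    # Assumption: if the regular tiles stop short of the C-terminus,
    # add one final tile that ends exactly at the last residue.
    if starts[-1] + length < len(seq):
        starts.append(len(seq) - length)
    return [(s + 1, seq[s:s + length]) for s in starts]


# A toy 150-residue sequence; neighboring regular tiles overlap by 62 - 25 = 37 residues.
tiles = tile_protein("A" * 150)
tile_starts = [start for start, _ in tiles]
```

With these settings every residue of the toy sequence is covered by at least one peptide, and most residues are covered by two or three.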
Input sequences of different groups were filtered with CD-HIT to remove similar sequences with more than the indicated % identity in Table 2. The filtered sequences were then processed into peptides …
(a) Heatmap of Z-score enrichment over US controls for seroreactive peptides (rows) with >10% seropositivity across different age groups in the moderate and high exposure cohorts. Peptides are …
GO analysis of top seroreactive proteins.
Read counts corresponding to the 5th and 95th percentile in the distribution (indicated in blue) are within a 16-fold difference. Cumulative density plot of the distribution is shown in red.
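As a rough illustration of the uniformity check described here, the sketch below computes the 5th and 95th percentiles of per-peptide read counts and their fold difference. The helper name `percentile_fold_range` and the toy counts are hypothetical, and the standard-library quantile method used is an assumption; the original analysis may have computed percentiles differently.

```python
import statistics


def percentile_fold_range(read_counts):
    """Return (5th percentile, 95th percentile, fold difference) of read counts."""
    # quantiles(n=20) returns 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(read_counts, n=20, method="inclusive")
    p5, p95 = cuts[0], cuts[-1]
    if p5 <= 0:
        raise ValueError("5th percentile must be positive to compute a fold difference")
    return p5, p95, p95 / p5


# Toy data only: counts 1..101 stand in for per-peptide read counts.
toy_counts = list(range(1, 102))
p5, p95, fold = percentile_fold_range(toy_counts)
within_16_fold = fold <= 16
```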
Top - Pearson correlation matrix of depth-adjusted read counts across all samples. Technical replicates are placed symmetrically on rows and columns. Bottom three - Representative scatter plots of …
Top panel - PhIP-seq with polyclonal anti-GFAP enriches for GFAP peptides and enrichment is specific to IP with anti-GFAP, but is observed rarely in the Ugandan cohort and US controls. Left - …
Box plots of resultant number of seroreactive peptides for corresponding thresholds are shown for Ugandan samples and US controls. The final thresholds for calling seroreactivity were selected based …
All seroreactive peptides in each person were collapsed based on sequence similarity (sharing of 7mer identical motifs). The resulting number of non-redundant groups was used as a measure of …
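One way to picture this collapsing step is as grouping peptides into connected components, where two peptides are linked if they share an identical 7-mer. The sketch below uses a small union-find for that. The function names and toy peptides are hypothetical, and treating the grouping as transitive (A links to B, B links to C, so A, B and C form one group) is an assumption, not a confirmed detail of the original method.

```python
from collections import defaultdict


def kmers(seq, k=7):
    """Return the set of all k-mers in a peptide sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def collapse_by_shared_kmers(peptides, k=7):
    """Group peptides that are linked, directly or indirectly, by an identical k-mer."""
    parent = list(range(len(peptides)))

    def find(i):
        # Follow parent links to the group root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # k-mer -> index of the first peptide containing it
    for idx, pep in enumerate(peptides):
        for km in kmers(pep, k):
            if km in first_seen:
                root_a, root_b = find(idx), find(first_seen[km])
                if root_a != root_b:
                    parent[root_a] = root_b
            else:
                first_seen[km] = idx

    groups = defaultdict(list)
    for idx, pep in enumerate(peptides):
        groups[find(idx)].append(pep)
    return list(groups.values())


# Toy peptides: the first two share the 7-mer "AYIAKQR"; the third shares nothing.
toy_peptides = ["MKTAYIAKQRQ", "AYIAKQRQISF", "GGGGSSSSPPPP"]
groups = collapse_by_shared_kmers(toy_peptides)
breadth = len(groups)  # number of non-redundant groups for this toy "person"
```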
Top - Box plot of the number of seroreactive domain variants in the variable region V2 of RIFINs. Significantly different groups (KS test p-value <0.05) are marked with an *. Bottom - Heatmap of proportion of …
(a) Examples of previously well-characterized antigens and (b) novel/previously under-characterized antigens identified in this dataset. Average percentage of people seropositive at each residue …
Location of seroreactive peptides identified in this dataset (red bar) and seroreactive 15-mer peptides identified using a high-density peptide array (black bar) in Jaenisch et al. (peptides with …
(a) Distribution of cumulative frequency of repeat elements per protein is significantly higher (KS test p-value <0.05) in the seroreactive protein set than a randomly sampled subset of …
Left three: Conservative substitutions ([GA],[ST],[DE],[NQ],[RHK],[LVI],[YFW]) are allowed at all positions in the motif. Right three: Identical residues at all positions in the motif. For all six …
(a) Breadth of seroreactive non-repeat peptides per person is not significantly different (KS-test p-value >0.05) between the two exposure settings within each age group. (b) Breadth of seroreactive …
Age groups showing significant difference between the two transmission settings are marked by * based on a KS-test p-value <0.05.
Each dot represents a seroreactive repeat element and seropositivity for the repeat element in a given group was calculated as the percent of people in that group enriching for any seroreactive …
Groups showing significant difference between the two transmission settings are marked by * based on a KS-test p-value <0.05.
Groups showing significant differences are marked by * based on a KS-test p-value <0.05.
(a) Pipeline to identify inter-protein motifs (6-9aa) significantly enriched (FDR <0.001) in seroreactive peptides from different seroreactive proteins (different colors) over background. Background …
Top - Histogram of net charge and hydrophobicity index of the 911 inter-protein motifs (7-aa motifs with at least five identical residues and up to two conservative substitutions) in comparison to a …
(a) Design of the tiled peptide library showing segments in Peptide 4 overlapping with neighboring peptides. Start and end amino acid positions of each peptide are marked at either end. (b) …
Each plot in orange depicts the Cumulative Distribution Function (CDF) for the proportion of people showing reactivity in >y proteins for the set of inter-protein motifs shared among n proteins. The …
(a) All seroreactive proteins except PfEMP1 (b) Proteins with >30% seropositivity.
Region | Age group (yrs) | No. of people | Proportion positive for infection at the time of sample collection | Time since last infection (days) - median (IQR) | Incidence of symptomatic malaria per year - median (IQR) | Household annual EIR* (infective bites / person) - median (IQR) |
---|---|---|---|---|---|---|
Tororo | 2–3 | 10 | 0.5 | 18.5 (0, 85) | 5.8 (2.9, 7.7) | 56 (33, 148) |
Tororo | 4–6 | 30 | 0.66 | 0 (0, 45) | 3.6 (2.6, 4.8) | 59 (38, 84) |
Tororo | 7–11 | 30 | 0.63 | 0 (0, 45) | 2.3 (2, 4.3) | 46 (30, 110) |
Tororo | >18 | 30 | 0.7 | 0 (0, 45) | 1.2 (0.9, 1.6) | 49 (35, 94) |
Kanungu | 2–3 | 10 | 0.1 | 155 (61, 190) | 1.7 (0.9, 2) | 4.3 (4, 14) |
Kanungu | 4–6 | 30 | 0.2 | 114 (43, 289) | 1.5 (0.7, 2.3) | 7.3 (4.5, 15) |
Kanungu | 7–11 | 30 | 0.13 | 121 (41, 263) | 1.5 (0.6, 2) | 5.2 (4, 7) |
Kanungu | >18 | 30 | 0.2 | 109 (38, 223) | 1.1 (0.8, 1.3) | 6.8 (4.8, 15.4) |
*EIR – Entomological Inoculation Rate.
Sequence group | Input sequences before collapsing on similarity | Identity threshold for collapsing by CD-HIT | # Final collapsed protein sequences |
---|---|---|---|
P. falciparum reference proteome | 3D7, IT (10,771 total) | 99% | 6372 |
P. falciparum variant sequences | | 100% (90% for CSP) | 1205 |
Other variants | P. reichenowi PfEMP1 (PFREICH); Anopheles - CE5 (5), SG6 (5) | | |
Anopheles salivary proteins | 53 proteins from 19 Anopheles species as described in Figure 1 of Arcà et al., 2017 | 98% | 708 |
Vaccine/Viral/Toxin sequences | | 98% (90% for RotoAB) | 684 |
Laboratory positive controls | | 98% | 11 |
TOTAL PROTEINS | | | 8,980 |
TOTAL PEPTIDES | | | 238,068 |
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
---|---|---|---|---|
Strain, strain background (E. coli) | BLT5403 | Novagen/EMD Millipore, T7 Select Kit | Cat# 70550–3 | |
Strain, strain background (T7 Bacteriophage) | T7 vector arms, Packaging extract | Novagen/EMD Millipore, T7 Select Kit | Cat# 70550–3 | |
Genetic reagent (T7 Bacteriophage library) | Falciparome | Made in this study | See Materials and Methods | |
Biological sample (Humans) | Ugandan cohort plasma | Kamya et al., 2015; Rek et al., 2016; Yeka et al., 2015 | | |
Biological sample (Humans) | US control plasma | New York Blood Center | ||
Antibody | Anti-Glial Fibrillary Acidic Protein (rabbit, polyclonal) | Agilent | Cat# Z033429-2 | 1 µg used |
Peptide, recombinant protein | Protein A conjugated magnetic beads | Invitrogen/Thermo Fisher Sci | Cat# 10008D | |
Peptide, recombinant protein | Protein G conjugated magnetic beads | Invitrogen/Thermo Fisher Sci | Cat# 10009D | |
Peptide, recombinant protein | BSA Fraction V | Sigma-Aldrich | Cat# 10735094001 | |
Peptide, recombinant protein | T4 ligase | New England Biolabs | Cat# M0202S | |
Peptide, recombinant protein | Phusion DNA Polymerase | New England Biolabs | Cat# M0530L | |
Commercial assay or kit | T7 Select 10-3b Cloning kit | EMD Millipore | Cat# 70550–3 | |
Commercial assay or kit | Ampure XP Beads | Beckman Coulter | Cat# A63881 | |
Software, algorithm | CD-HIT | Fu et al., 2012; Li and Godzik, 2006 | http://weizhongli-lab.org/cd-hit/ | |
Software, algorithm | numpy | Open Source | https://doi.org/10.1109/MCSE.2011.37 | |
Software, algorithm | scipy | Open Source | https://www.nature.com/articles/s41592-019-0686-2 | |
Software, algorithm | Matplotlib | Open Source | https://ieeexplore.ieee.org/document/4160265 | |
Software, algorithm | Cutadapt | Martin, 2011 | https://cutadapt.readthedocs.io/en/stable/ | |
Software, algorithm | Cytoscape | Shannon et al., 2003 | https://cytoscape.org | |
List of 9927 seroreactive peptides identified in this dataset with their sequences.
Top 40 proteins with highest seropositivity and associated literature.
List of top 100 proteins with highest seropositivity used for GO analysis.
Seropositivity rate (proportion of people seropositive) for all 9927 seroreactive peptides across different groups in the two exposure settings.
Seropositivity rate (proportion of people seropositive) for top repeat elements across different groups in the two exposure settings.
List of inter-protein motifs and the proteins sharing them.
Motifs reported here are 7-mers with at least 5 identical amino acids and up to two conservative substitutions (and no wildcards).
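The matching rule stated here can be sketched as a pairwise comparison of two 7-mers. The conservative substitution groups are the ones listed in the motif-sharing figure legend ([GA], [ST], [DE], [NQ], [RHK], [LVI], [YFW]). The function name `is_motif_match` and the toy sequences are hypothetical, and applying the rule pairwise is an assumption; the original pipeline may define motifs across sets of peptides in a different way.

```python
CONSERVATIVE_GROUPS = ["GA", "ST", "DE", "NQ", "RHK", "LVI", "YFW"]
# Map each residue to the index of its substitution group.
GROUP_OF = {aa: i for i, group in enumerate(CONSERVATIVE_GROUPS) for aa in group}


def is_motif_match(a, b, min_identical=5, max_conservative=2):
    """True if two equal-length k-mers differ only by allowed conservative substitutions."""
    if len(a) != len(b):
        return False
    identical = conservative = 0
    for x, y in zip(a, b):
        if x == y:
            identical += 1
        elif x in GROUP_OF and GROUP_OF.get(y) == GROUP_OF[x]:
            conservative += 1
        else:
            # No wildcards: any other mismatch rules the pair out.
            return False
    return identical >= min_identical and conservative <= max_conservative


# Toy 7-mers, not taken from the dataset.
same = is_motif_match("EENVEEN", "EENVEEN")          # 7 identical
two_subs = is_motif_match("EENVEEN", "DDNVEEN")      # 5 identical, 2 conservative (E/D)
three_subs = is_motif_match("EENVEEN", "DDQVEEN")    # only 4 identical
wildcard = is_motif_match("EENVEEN", "EENPEEN")      # V/P is not a conservative pair
```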
Table describing the number of inter-protein motifs obtained with varied parameters for calling the motifs.
Gene network file for inter-protein motifs (7-mers with at least 5 identical amino acids, up to two conservative substitutions, and no wildcards).
The file can be visualized in Cytoscape.