Table legends.
(a) Protein annotation. List of the 436 proteins measured. GeneID = gene symbol of the gene encoding the protein (used as the main identifier in the manuscript); UniProt = UniProt ID; Olink Assay Name = protein ID used by Olink; Protein Name = full protein name; Panel name = the name of the 92-protein multiplex Olink panel on which the protein was measured.
(b) Enrichment of Reactome terms for the entire set of proteins measured. The results of enrichment testing for genes corresponding to all 436 measured proteins against the background of the genome. The analysis was performed against the Reactome pathways using string-db. The list of Reactome terms is ordered by the number of proteins associated with the term.
(c) Differential abundance analysis for COVID-19-positive vs -negative ESKD patients in subcohorts A and B. Summary statistics for all 436 proteins are shown. Pvalue = nominal p-value from linear mixed model. Adjusted Pvalue = p-values after Benjamini–Hochberg correction. Fold change = estimated fold change from regression coefficient. Proteins are ordered based on results in subcohort A: first by whether they are significant or not (at 5% FDR), then by fold change (from positive to negative). Note that the associations are not ordered by p-value, so strong associations do not necessarily appear at the top of the table. Significant adjusted p-values are coloured in green and non-significant in grey. Estimated fold changes are coloured in a gradient from red to blue for up- or downregulated in COVID-19 +ve versus –ve, respectively. Sample size for subcohort A: n = 256 plasma samples from 55 COVID-19 positive ESKD patients, versus n = 51 ESKD controls (one sample per control patient). Sample size for subcohort B: 52 samples from 55 COVID-19 patients and 11 non-infected patient samples (single time-point).
(d) Associations of proteins and COVID-19 severity (subcohort A). Summary statistics for all 436 proteins are shown. Pvalue = nominal p-value from linear mixed model. Adjusted Pvalue = p-values after Benjamini–Hochberg correction. Fold change = estimated fold change from regression coefficient. Proteins are ordered first by whether they are significant or not (at 5% FDR), then by linear gradient (effect size) from positive to negative. Note that the associations are not ordered by p-value, so strong associations do not necessarily appear at the top of the table.
(e) Predictors of clinical course from Random Forests. Importance metrics for each protein from a random forest model trained to predict current or future severe/critical disease using the first sample from each patient. Proteins are ordered by mean minimal depth across all trees, which was used as the primary importance metric.
(f) Proteomic predictors of fatal COVID-19. Summary statistics from joint models for fatal disease. Results for all 436 proteins are shown. ‘Is significant’ indicates significance (green) or not (grey) at 5% FDR. The association coefficient for each protein indicates the direction and magnitude of the estimated log relative risk for death (red indicates that higher protein levels increase risk of death, blue the opposite). 95% confidence intervals are plotted.
(g) Associations of proteins and clinical laboratory measurements. Clinical variable = clinical lab tests: white cell count, lymphocyte count, neutrophil count, monocyte count, C-reactive protein, ferritin, d-dimer, troponin. Results are shown for all 436 proteins against all eight lab measurements. Adjusted p-value = p-value from linear mixed model after Benjamini–Hochberg correction. Gradient indicates effect size and direction. A positive gradient (red) indicates that higher concentrations of proteins are associated with higher clinical laboratory measurements. ‘Is significant’ indicates significance (green) or not (grey) at 5% FDR. Contemporaneous clinical laboratory tests were not available for all plasma samples. The proportions of samples for which contemporaneous lab tests were available were: white cell count 66%, neutrophils 66%, monocytes 66%, lymphocytes 66%, CRP 64%, ferritin 36%, troponin 35%, d-dimer 30%.
(h) Longitudinal proteomic profiling with linear mixed models. Summary statistics from the linear mixed models used to identify proteins with differential temporal trajectories between mild/moderate (n = 28) and severe/critical COVID-19 patients (n = 27). Summary statistics for all 436 proteins are shown. Pvalue = nominal p-value from linear mixed model for the interaction term between time from symptom onset (days) and overall WHO severity (as a binary variable: mild–moderate or severe–critical). Adjusted Pvalue = p-values after Benjamini–Hochberg correction. ‘Is significant’ indicates significance (green) or not (grey) at 5% FDR.
(i) Comparison to other proteomic studies of COVID-19 positive vs negative patients. Proteins that were differentially abundant in COVID-19 +ve vs -ve patients in our data are listed (5% FDR). TRUE indicates that the protein was reported as differentially abundant in the relevant previous proteomic study. The final column summarises whether the association was previously reported in any of the four studies. We have not harmonised significance thresholds between studies: we simply report whether the authors declared the protein significant by the threshold of their study.
(j) Comparison to other proteomic studies of COVID-19 severity. Proteins that were associated with severity in our data are listed (5% FDR). TRUE indicates that the protein was reported as associated with severity in the relevant previous proteomic study. The final column summarises whether the association was previously reported in one or more of the four studies. We have not harmonised significance thresholds between studies: we simply report whether the authors declared the protein significant by the threshold of their study.
(k) Per-protein correlations between plasma and serum levels derived from the same blood sample in 11 COVID-19 negative ESKD patients. Plasma and serum were taken from 11 non-infected ESKD patients who were measured in both subcohort A (plasma) and subcohort B (serum). Pearson’s r was calculated for the 11 paired measurements for each protein. Proteins are ordered by r value; this column is coloured from red to blue for positive and negative r values, respectively. 95% confidence intervals are reported. We also report the variance of the NPX levels for each protein in plasma and in serum.
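
For readers who wish to reproduce a table of the kind described in (k), the following is a minimal sketch in Python and not the authors' code. It computes Pearson's r for paired plasma and serum values per protein, an approximate 95% confidence interval, and the variance in each sample type. Several choices here are assumptions that the legend does not specify: the confidence interval uses the Fisher z-transformation, the variance is the sample variance, and proteins are sorted from highest to lowest r. The function names, the protein labels (PROT_A, PROT_B) and the NPX values are hypothetical and for illustration only.

```python
import math
from statistics import NormalDist, mean, variance


def pearson_r(x, y):
    """Pearson's r for two equal-length sequences of paired measurements."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable has zero variance")
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, level=0.95):
    """Approximate CI for r via the Fisher z-transformation (assumed method)."""
    if n <= 3:
        raise ValueError("the Fisher interval needs more than 3 pairs")
    r = max(-1.0, min(1.0, r))  # guard against floating-point overshoot
    if abs(r) == 1.0:
        return r, r  # atanh is infinite at +/-1; the interval collapses
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


def per_protein_correlations(plasma, serum):
    """plasma, serum: dict of protein -> NPX values, in the same patient order."""
    rows = []
    for protein, x in plasma.items():
        y = serum[protein]
        r = pearson_r(x, y)
        ci_low, ci_high = fisher_ci(r, len(x))
        rows.append({
            "protein": protein,
            "r": r,
            "ci_low": ci_low,
            "ci_high": ci_high,
            "var_plasma": variance(x),  # sample variance (assumption)
            "var_serum": variance(y),
        })
    # Sort from highest to lowest r (direction is an assumption).
    rows.sort(key=lambda row: row["r"], reverse=True)
    return rows


# Hypothetical NPX values for 11 paired samples; not data from the study.
example_plasma = {
    "PROT_A": [1.2, 2.3, 0.8, 3.1, 2.7, 1.9, 2.2, 3.5, 0.5, 1.4, 2.9],
    "PROT_B": [4.1, 3.9, 4.5, 4.0, 4.2, 3.8, 4.4, 4.1, 4.3, 3.7, 4.0],
}
example_serum = {
    "PROT_A": [1.0, 2.5, 1.1, 2.8, 2.9, 1.7, 2.0, 3.3, 0.9, 1.2, 3.1],
    "PROT_B": [2.0, 2.6, 1.9, 2.4, 2.1, 2.2, 2.5, 1.8, 2.3, 2.7, 2.0],
}

if __name__ == "__main__":
    for row in per_protein_correlations(example_plasma, example_serum):
        print(row)
```

With only 11 pairs per protein, intervals computed this way are wide, so a moderate r can be compatible with no correlation; this is one reason to read the r column alongside its confidence interval.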