Early prediction of in-hospital death of COVID-19 patients: a machine-learning model based on age, blood analyses, and chest x-ray score

  1. Emirena Garrafa (corresponding author)
  2. Marika Vezzoli
  3. Marco Ravanelli
  4. Davide Farina
  5. Andrea Borghesi
  6. Stefano Calza
  7. Roberto Maroldi
  1. Department of Molecular and Translational Medicine, University of Brescia, Italy
  2. ASST Spedali Civili di Brescia, Department of Laboratory, Italy
  3. Department of Medical and Surgical Specialties, Radiological Sciences and Public Health, University of Brescia, Italy
  4. ASST Spedali Civili di Brescia, Department of Radiology, Italy
5 figures, 2 tables and 1 additional file

Figures

Flowchart of the data used in the empirical analyses.

The early-warning model (BS-EWM) was trained with a random forest on 70% of first-wave patients (rebalanced with the synthetic minority oversampling technique [SMOTE]), then (i) validated on the remaining 30% of first-wave patients and (ii) tested on 676 second-wave patients. In detail, the 2106 first-wave patients were randomly split into training and validation sets, each maintaining the death prevalence of the first wave.
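The split and rebalancing described in this caption can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function names, the random seed, the toy feature vectors and the neighbour count `k` are all assumptions, and the `smote` function implements only the basic interpolation idea behind SMOTE.

```python
import random

def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, valid_idx), keeping the class prevalence in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, valid_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train_idx += idx[:cut]
        valid_idx += idx[cut:]
    return sorted(train_idx), sorted(valid_idx)

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

# First-wave outcome counts from Table 1: 1683 alive (0) and 423 dead (1).
labels = [0] * 1683 + [1] * 423
train_idx, valid_idx = stratified_split(labels)
train_prev = sum(labels[i] for i in train_idx) / len(train_idx)
valid_prev = sum(labels[i] for i in valid_idx) / len(valid_idx)
print(len(train_idx), len(valid_idx))                # 1474 632
print(round(train_prev, 3), round(valid_prev, 3))    # 0.201 0.201

# Hypothetical two-feature minority samples, used only to show the call.
toy_minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [4.0, 4.0]]
toy_synthetic = smote(toy_minority, n_new=5, k=2)
```

Splitting within each outcome class is what keeps the death prevalence (about 20%) equal in the training and validation parts. In this sketch the oversampling would then be applied to the training part only.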

Correlation plot on biomarkers and Brescia chest X-ray score.

The relationships between the 17 analytes and the Brescia chest X-ray score are inspected with Spearman correlation coefficients, ρs, which are represented in this correlation plot by blue and red circles (positive and negative correlation, respectively). The diameter of each circle is proportional to the magnitude of ρs, and black crosses mark correlations not significantly different from zero (p-value > 0.05). The correlation matrix is reordered according to a hierarchical cluster analysis of the quantitative variables.
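For readers unfamiliar with ρs, the coefficient is the Pearson correlation computed on ranks. The sketch below shows this in plain Python. The helper names and the six toy value pairs are assumptions made for illustration, not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical X-ray scores and LDH values for six patients.
toy_score = [0, 3, 5, 9, 12, 18]
toy_ldh = [210, 250, 240, 400, 380, 900]
rho = spearman(toy_score, toy_ldh)
```

Because only ranks enter the calculation, a single extreme value such as the 900 above has no more influence than any other largest observation. That makes the coefficient a natural choice for skewed laboratory values.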

Relative variable importance measure (rel VIM).

Figure 3, A1 shows the rel VIM based on the Gini index. It was extracted from a random forest in which the outcome is dead/alive and the covariates are the 17 biomarkers, the Brescia X-ray score, and age. The algorithm grows 10,000 trees, and the number of candidate splitting variables at each tree node is √(# covariates in the model). Missing values are imputed with the ‘on-the-fly-imputation’ algorithm. A model with the same characteristics was run excluding the covariate ‘age’ (Figure 3, A2), since age was strongly associated with the risk of death and masked the role of the remaining covariates.
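As a small worked example of the two quantities in this caption, the sketch below computes the number of splitting variables per node for the 19 covariates and rescales raw importances to a relative scale. Rounding √19 down and setting the top variable to 100 are assumptions (the caption states neither), and the raw importance values are invented for illustration.

```python
import math

# 17 biomarkers + Brescia X-ray score + age.
n_covariates = 17 + 1 + 1

# Candidate variables tried at each node; rounding down is an assumption.
mtry = math.floor(math.sqrt(n_covariates))
print(mtry)  # 4

def relative_vim(raw_importance):
    """Rescale raw importances so that the most important variable scores 100."""
    top = max(raw_importance.values())
    return {name: 100.0 * value / top for name, value in raw_importance.items()}

# Hypothetical raw Gini importances, not values from the paper.
toy_raw = {"age": 52.0, "LDH": 39.0, "D-dimer": 26.0, "basophils": 6.5}
toy_rel = relative_vim(toy_raw)
print(toy_rel)  # {'age': 100.0, 'LDH': 75.0, 'D-dimer': 50.0, 'basophils': 12.5}
```

On a relative scale like this, one dominant covariate compresses the scores of all the others. That is consistent with the caption's reason for refitting the forest without age.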

Partial dependence plot (PDP) of random forest grown on the 17 biomarkers and Brescia X-ray score.

Considering the random forest that excludes the ‘age’ variable, the PDPs were computed for covariates with a relative variable importance measure (rel VIM) > 60 in Appendix 1—figure 2 (cut-off identified by the red dashed line) and a p-value < 0.05 in Table 1. Of the 10 most important variables in Appendix 1—figure 2, nine satisfy these two conditions (only fibrinogen was excluded, since it did not differ significantly between the deceased and alive subpopulations). PDPs measure the effect on the risk of death of changing covariate values one at a time. They are displayed from the most to the least important variable.
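The "one at a time" computation behind a PDP can be sketched in a few lines. The code below is a generic illustration: `toy_risk` is a made-up stand-in for a fitted model, and the patient rows and grid values are hypothetical.

```python
def partial_dependence(predict, rows, feature, grid):
    """For each grid value, overwrite `feature` in every row with that value
    and average the predicted risk over the whole dataset."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original row is untouched
            modified[feature] = value
            total += predict(modified)
        curve.append((value, total / len(rows)))
    return curve

def toy_risk(row):
    """Hypothetical risk model, used only as a stand-in for a fitted forest."""
    risk = 0.1
    if row["LDH"] > 350:
        risk += 0.3
    if row["score"] > 8:
        risk += 0.2
    return risk

# Hypothetical patients.
toy_rows = [
    {"LDH": 200, "score": 3},
    {"LDH": 500, "score": 10},
    {"LDH": 300, "score": 9},
    {"LDH": 420, "score": 2},
]
pdp_ldh = partial_dependence(toy_risk, toy_rows, "LDH", [250, 400])
# Average risk is about 0.2 at LDH = 250 and about 0.5 at LDH = 400.
```

Each point on the curve averages over the observed values of all other covariates, so the plot shows the marginal effect of the chosen covariate on predicted risk.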

Receiver operating characteristic (ROC) curves of random forest, gradient boosting machine (GBM), and logistic regression.

ROC curves of the three methods: (i) random forest, (ii) GBM, and (iii) logistic regression. Each graph reports the ROC curve computed on the training set (blue line, 70% of March–April patients), the validation set (dashed red line, 30% of March–April patients), and the test set (dashed green line, May–December patients).
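For reference, an ROC curve and its area can be computed from predicted risks as sketched below. This is a plain-Python illustration with invented labels and scores. It gives the AUC point estimate only, not the DeLong confidence interval reported in Table 2.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by lowering the decision threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Process all observations sharing the same score together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical outcomes (1 = dead) and predicted risks.
toy_y = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = auc(roc_points(toy_y, toy_scores))  # 8/9, about 0.889
```

The value 8/9 equals the fraction of (dead, alive) pairs in which the deceased patient has the higher predicted risk. This is the usual interpretation of the AUC.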

Tables

Table 1
Descriptive statistics on all variables in the dataset, stratified by outcome (alive vs. dead). Comparison between the first (March–April) and second (May–December) waves.
Variables | First wave, March–April (MA) 2020: Alive (N = 1683) | First wave, March–April (MA) 2020: Dead (N = 423) | p-Value | Second wave, May–December (MD) 2020: Alive (N = 594) | Second wave, May–December (MD) 2020: Dead (N = 82) | p-Value
Age | | | <0.001* | | | <0.001*
Mean (SD) | 64.55 (14.27) | 76.21 (9.12) | | 65.30 (15.20) | 76.72 (10.79) |
Median (Q1, Q3) | 65.00 (55.00, 75.00) | 77.00 (72.00, 82.00) | | 67.00 (55.00, 77.00) | 80.00 (72.25, 84.75) |
Range | 19.00–97.00 | 44.00–98.00 | | 18.00–97.00 | 44.00–98.00 |
Sex | | | 0.036 | | | 0.131
F | 613 (36.4%) | 131 (31.0%) | | 240 (40.4%) | 26 (31.7%) |
M | 1,070 (63.6%) | 292 (69.0%) | | 354 (59.6%) | 56 (68.3%) |
Days in hospital | | | <0.001* | | | 0.008*
N-Miss10950
Mean (SD) | 14.15 (11.66) | 11.33 (10.98) | | 14.95 (11.67) | 17.77 (10.75) |
Median (Q1, Q3) | 11.00 (7.00, 18.00) | 8.00 (4.00, 15.00) | | 12.00 (7.00, 20.00) | 17.50 (9.00, 25.00) |
Range | 0.00–140.00 | 0.00–88.00 | | 0.00–79.00 | 2.00–46.00 |
Score | | | <0.001* | | | <0.001*
Mean (SD) | 6.92 (4.40) | 8.77 (4.39) | | 5.65 (4.48) | 8.23 (4.63) |
Median (Q1, Q3) | 7.00 (3.00, 10.00) | 9.00 (6.00, 12.00) | | 5.00 (2.00, 9.00) | 9.00 (5.25, 11.00) |
Range | 0.00–18.00 | 0.00–18.00 | | 0.00–18.00 | 0.00–17.00 |
D-dimer | | | <0.001* | | | <0.001*
N-Miss40611312816
Mean (SD) | 1155.03 (2218.51) | 3124.25 (8070.21) | | 1538.17 (3123.38) | 4712.44 (8897.82) |
Median (Q1, Q3) | 443.00 (262.00, 985.00) | 944.50 (476.50, 2970.75) | | 739.50 (427.50, 1341.25) | 1112.00 (725.50, 3619.25) |
Range | 200.00–47228.00 | 200.00–60,342.00 | | 190.00–33,501.00 | 190.00–35,000.00 |
Fibrinogen | | | 0.951* | | | 0.778*
N-Miss339117548
Mean (SD) | 530.53 (194.13) | 530.55 (213.69) | | 523.94 (169.43) | 519.77 (213.05) |
Median (Q1, Q3) | 520.00 (381.00, 650.00) | 515.00 (381.00, 654.00) | | 512.00 (405.00, 612.00) | 510.00 (330.50, 649.00) |
Range | 119.00–1339.00 | 68.00–1333.00 | | 147.00–1371.00 | 153.00–1287.00 |
LDH | | | <0.001* | | | <0.001*
N-Miss18892617
Mean (SD) | 321.25 (227.50) | 433.71 (205.10) | | 308.30 (196.23) | 443.49 (707.95) |
Median (Q1, Q3) | 283.00 (222.00, 373.00) | 406.00 (269.50, 545.50) | | 273.00 (218.00, 354.00) | 332.00 (257.00, 442.50) |
Range | 90.00–6689.00 | 123.00–1365.00 | | 108.00–2565.00 | 122.00–6310.00 |
Neutrophils | | | <0.001* | | | <0.001*
N-Miss231941
Mean (SD) | 5.67 (3.61) | 7.17 (4.39) | | 5.80 (3.97) | 7.21 (4.13) |
Median (Q1, Q3) | 4.83 (3.29, 7.03) | 6.20 (4.12, 9.02) | | 4.78 (3.42, 7.11) | 6.72 (4.00, 9.77) |
Range | 0.00–53.99 | 0.17–30.45 | | 0.10–47.03 | 0.19–23.02 |
Lymphocytes | | | <0.001* | | | <0.001*
N-Miss231941
Mean (SD) | 1.43 (5.48) | 1.19 (4.29) | | 1.22 (0.81) | 1.38 (4.63) |
Median (Q1, Q3) | 1.04 (0.75, 1.42) | 0.81 (0.55, 1.18) | | 1.06 (0.72, 1.52) | 0.74 (0.47, 1.06) |
Range | 0.10–177.63 | 0.04–85.51 | | 0.08–10.28 | 0.08–42.20 |
Neutrophils on lymphocytes | | | <0.001* | | | <0.001*
N-Miss231941
Mean (SD) | 6.18 (5.87) | 10.72 (11.71) | | 7.19 (9.92) | 12.84 (13.09) |
Median (Q1, Q3) | 4.52 (2.84, 7.50) | 7.13 (4.47, 13.06) | | 4.32 (2.63, 8.40) | 8.50 (4.05, 15.19) |
Range | 0.00–101.90 | 0.01–129.67 | | 0.12–143.25 | 0.11–70.56 |
Neutrophils % | | | <0.001* | | | <0.001*
N-Miss221941
Mean (SD) | 0.73 (0.13) | 0.80 (0.12) | | 0.73 (0.13) | 0.79 (0.16) |
Median (Q1, Q3) | 0.74 (0.66, 0.82) | 0.82 (0.75, 0.88) | | 0.73 (0.64, 0.83) | 0.83 (0.69, 0.89) |
Range | 0.00–0.97 | 0.01–0.97 | | 0.10–0.99 | 0.10–0.96 |
Lymphocytes % | | | <0.001* | | | <0.001*
N-Miss221941
Mean (SD) | 0.18 (0.11) | 0.13 (0.09) | | 0.18 (0.11) | 0.13 (0.13) |
Median (Q1, Q3) | 0.16 (0.11, 0.23) | 0.11 (0.07, 0.17) | | 0.17 (0.10, 0.25) | 0.10 (0.06, 0.18) |
Range | 0.01–0.97 | 0.01–0.99 | | 0.01–0.88 | 0.01–0.88 |
PCR | | | <0.001* | | | 0.004*
N-Miss4712210
Mean (SD) | 77.25 (75.76) | 117.68 (95.97) | | 64.28 (73.38) | 98.59 (102.49) |
Median (Q1, Q3) | 55.65 (17.30, 111.60) | 99.20 (42.80, 170.45) | | 39.10 (12.30, 91.10) | 74.80 (20.12, 140.73) |
Range | 0.30–479.00 | 0.70–471.10 | | 0.30–483.20 | 0.30–593.80 |
WBC | | | <0.001* | | | 0.011*
N-Miss211941
Mean (SD) | 7.73 (7.13) | 9.13 (7.46) | | 7.65 (4.17) | 9.23 (6.25) |
Median (Q1, Q3) | 6.62 (4.87, 9.11) | 7.62 (5.60, 10.74) | | 6.67 (5.02, 8.90) | 8.34 (5.55, 12.04) |
Range | 0.72–191.02 | 0.32–92.23 | | 0.97–48.19 | 0.97–47.79 |
Basophils | | | 0.073* | | | 0.419*
N-Miss231941
Mean (SD) | 0.02 (0.02) | 0.02 (0.02) | | 0.02 (0.04) | 0.02 (0.02) |
Median (Q1, Q3) | 0.01 (0.01, 0.02) | 0.01 (0.01, 0.02) | | 0.02 (0.01, 0.03) | 0.01 (0.01, 0.03) |
Range | 0.00–0.31 | 0.00–0.15 | | 0.00–0.84 | 0.00–0.11 |
Basophils % | | | <0.001* | | | 0.024*
N-Miss221941
Mean (SD) | 0.00 (0.00) | 0.00 (0.00) | | 0.00 (0.00) | 0.00 (0.00) |
Median (Q1, Q3) | 0.00 (0.00, 0.00) | 0.00 (0.00, 0.00) | | 0.00 (0.00, 0.00) | 0.00 (0.00, 0.00) |
Range | 0.00–0.02 | 0.00–0.06 | | 0.00–0.05 | 0.00–0.01 |
Eosinophils | | | <0.001* | | | 0.015*
N-Miss231941
Mean (SD) | 0.06 (0.12) | 0.04 (0.10) | | 0.06 (0.14) | 0.05 (0.13) |
Median (Q1, Q3) | 0.01 (0.00, 0.07) | 0.00 (0.00, 0.02) | | 0.01 (0.00, 0.06) | 0.00 (0.00, 0.03) |
Range | 0.00–2.19 | 0.00–0.79 | | 0.00–1.95 | 0.00–0.97 |
Eosinophils % | | | <0.001* | | | 0.013*
N-Miss221941
Mean (SD) | 0.01 (0.02) | 0.00 (0.01) | | 0.01 (0.02) | 0.01 (0.01) |
Median (Q1, Q3) | 0.00 (0.00, 0.01) | 0.00 (0.00, 0.00) | | 0.00 (0.00, 0.01) | 0.00 (0.00, 0.00) |
Range | 0.00–0.27 | 0.00–0.12 | | 0.00–0.25 | 0.00–0.07 |
Monocytes | | | <0.001* | | | 0.683*
N-Miss231941
Mean (SD) | 0.56 (0.68) | 0.69 (3.32) | | 0.55 (0.32) | 0.58 (0.41) |
Median (Q1, Q3) | 0.47 (0.32, 0.68) | 0.41 (0.25, 0.63) | | 0.49 (0.33, 0.68) | 0.48 (0.27, 0.77) |
Range | 0.01–23.31 | 0.02–66.34 | | 0.02–2.45 | 0.07–2.01 |
Monocytes % | | | <0.001* | | | 0.034*
N-Miss221941
Mean (SD) | 0.08 (0.04) | 0.07 (0.05) | | 0.08 (0.04) | 0.07 (0.05) |
Median (Q1, Q3) | 0.07 (0.05, 0.10) | 0.06 (0.04, 0.08) | | 0.07 (0.05, 0.10) | 0.06 (0.04, 0.09) |
Range | 0.00–0.70 | 0.01–0.72 | | 0.01–0.31 | 0.01–0.27 |
Ferritin F | 613 patients (82.39%) | 131 patients (17.61%) | <0.001* | 240 patients (90.23%) | 26 patients (9.77%) | 0.372*
N-Miss15834435
Mean (SD) | 674.53 (817.61) | 1237.07 (2308.64) | | 564.63 (526.39) | 2006.00 (4680.23) |
Median (Q1, Q3) | 459.00 (212.00, 820.50) | 700.00 (353.00, 1347.00) | | 433.00 (216.00, 750.00) | 510.00 (269.00, 722.00) |
Range | 4.00–7687.00 | 19.00–20,572.00 | | 11.00–3397.00 | 81.00–20,941.00 |
Ferritin M | 1070 patients (78.56%) | 292 patients (21.44%) | <0.001* | 354 patients (86.34%) | 56 patients (13.66%) | 0.007*
N-Miss25796505
Mean (SD) | 1353.00 (1359.86) | 1825.25 (1945.47) | | 1181.95 (3295.92) | 1372.04 (1258.14) |
Median (Q1, Q3) | 939.00 (461.00, 1705.00) | 1262.50 (572.25, 2323.25) | | 737.50 (405.25, 1283.00) | 1159.00 (598.00, 1500.00) |
Range | 23.00–11,513.00 | 55.00–13,289.00 | | 25.00–56,039.00 | 112.00–7058.00 |
  1. p-values < 0.05 are shown in bold and italics.

  2. * Wilcoxon rank-sum test.

  3. Fisher’s exact test.

Table 2
Performance metrics of methods: random forest, gradient boosting machine (GBM), and logistic regression.
Method | Dataset | AUC (DeLong) (95% CI) | Sensitivity (95% CI) | Specificity (95% CI)
Random forest | Training, March–April (MA) | 0.97 (0.97–0.98) | 0.93 (0.91–0.97) | 0.92 (0.88–0.94)
Random forest | Validating, March–April (MA) | 0.83 (0.80–0.87) | 0.82 (0.72–0.92) | 0.75 (0.63–0.83)
Random forest | Testing, May–Dec (MD) | 0.78 (0.73–0.84) | 0.73 (0.54–1.00) | 0.73 (0.41–0.89)
GBM | Training, March–April (MA) | 0.88 (0.86–0.89) | 0.85 (0.80–0.88) | 0.77 (0.73–0.81)
GBM | Validating, March–April (MA) | 0.84 (0.80–0.88) | 0.80 (0.66–0.90) | 0.75 (0.65–0.87)
GBM | Testing, May–Dec (MD) | 0.78 (0.73–0.83) | 0.77 (0.65–0.94) | 0.71 (0.50–0.79)
Logistic regression | Training, March–April (MA) | 0.84 (0.82–0.86) | 0.80 (0.77–0.84) | 0.74 (0.70–0.77)
Logistic regression | Validating, March–April (MA) | 0.83 (0.79–0.87) | 0.84 (0.76–0.91) | 0.73 (0.65–0.79)
Logistic regression | Testing, May–Dec (MD) | 0.52 (0.44–0.60) | 0.87 (0.18–1.00) | 0.26 (0.11–0.94)
  1. Comparison of the performance of the three methods (random forest, GBM, and logistic regression) applied to the dataset rebalanced with the SMOTE methodology. Logistic regression predictions are computed with 10-fold cross-validation so that they are comparable with the random forest and GBM predictions (which use out-of-bag and 10-fold cross-validation, respectively).
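The footnote's point is that every prediction being compared comes from a model that did not see that patient during fitting. A minimal sketch of out-of-fold prediction with 10 folds is given below. The function names, the seed, and the trivial "mean of the training labels" model are assumptions for illustration, not the models used in the study.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def out_of_fold_predictions(X, y, fit, predict, k=10, seed=0):
    """Predict each observation with a model fitted on the other k-1 folds."""
    preds = [None] * len(y)
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            preds[i] = predict(model, X[i])
    return preds

# Hypothetical stand-in model: always predicts the training death rate.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)

def predict_mean(model, x):
    return model

toy_X = [[float(i)] for i in range(25)]
toy_y = [1 if i % 5 == 0 else 0 for i in range(25)]
toy_preds = out_of_fold_predictions(toy_X, toy_y, fit_mean, predict_mean)
```

Out-of-bag predictions from a random forest play the same role: each patient is scored only by trees whose bootstrap sample did not include that patient.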

Additional files

Supplementary file 1

Descriptive statistics on all variables.

(a) Descriptive statistics on all variables of the entire sample.

(b) Descriptive statistics on all variables in the dataset, stratified by first (March–April 2020) and second (May–December 2020) wave. Comparison between alive and dead.

(c) Performance metrics of the random forest (RF) with and without rebalancing the dataset with the synthetic minority oversampling technique (SMOTE) methodology. This table compares the performance of two RFs applied to (i) a dataset rebalanced with the SMOTE methodology and (ii) the original dataset. The analysis supports applying SMOTE before the RF, since performance in the training and validation groups (especially sensitivity) is better than that obtained from the RF grown on the original dataset.

(d) Performance metrics of the random forests (RFs) estimated on single biomarkers.

(e) Optimal threshold for each biomarker to predict the outcome.

https://cdn.elifesciences.org/articles/70640/elife-70640-supp1-v3.docx

Cite this article

  1. Emirena Garrafa
  2. Marika Vezzoli
  3. Marco Ravanelli
  4. Davide Farina
  5. Andrea Borghesi
  6. Stefano Calza
  7. Roberto Maroldi
(2021)
Early prediction of in-hospital death of COVID-19 patients: a machine-learning model based on age, blood analyses, and chest x-ray score
eLife 10:e70640.
https://doi.org/10.7554/eLife.70640