Spectral clustering of risk score trajectories stratifies sepsis patients by clinical outcome and interventions received

  1. Ran Liu (corresponding author)
  2. Joseph L Greenstein
  3. James C Fackler
  4. Melania M Bembea
  5. Raimond L Winslow (corresponding author)
  1. Institute for Computational Medicine, The Johns Hopkins University, United States
  2. Department of Biomedical Engineering, The Johns Hopkins University School of Medicine & Whiting School of Engineering, United States
  3. Department of Anesthesiology and Critical Care Medicine, The Johns Hopkins University School of Medicine, United States
  4. Department of Pediatrics, The Johns Hopkins University School of Medicine, United States
5 figures, 4 tables and 2 additional files

Figures

Figure 1 with 3 supplements
Risk score clusters obtained using spectral clustering on the 12 hr following time of early prediction.

Time 0 represents td, time of early prediction. Bold solid and dashed lines indicate mean risk within each cluster. Each solid line becomes a dashed line at the cluster median EWT (indicated on figure). Shaded areas indicate one standard deviation from the mean. Black horizontal line indicates risk score threshold for early prediction.

Figure 1—figure supplement 1
Eigenvalues of Graph Laplacian of post-early prediction risk trajectories.
Figure 1—figure supplement 2
Receiver operating characteristic curves for early prediction in eICU.

95% confidence intervals, estimated using 100 iterations of bootstrap, are indicated by the shaded area. Using XGBoost, we obtain an average performance of 0.912 AUC, 82.5% sensitivity, 84.1% specificity, a median early warning time of 10.3 hr, and 34.0% positive predictive value. Using GLM, we obtain an average performance of 0.894 AUC, 82.7% sensitivity, 80.8% specificity, a median early warning time of 10.5 hr, and 29.9% positive predictive value.

Figure 1—figure supplement 3
Risk score clusters obtained using spectral clustering on the 12 hr following time of early prediction in the MIMIC-III database.

Time 0 represents td, time of early prediction. Shaded areas indicate one standard deviation from the mean. The red horizontal line indicates risk score threshold for early prediction. Clusters are numbered in descending order of septic shock prevalence.

Figure 2 with 3 supplements
Physiological trajectories in (A) lactate, (B) systolic blood pressure, and (C) heart rate for the four clusters of patients illustrated in Figure 1.

Solid lines indicate the mean value of each feature within each cluster. Shaded areas indicate an interval of 1 standard deviation from the mean.

Figure 2—figure supplement 1
Kullback-Leibler Divergence of Risk Score and physiological variables between the highest-risk and lowest-risk clusters in the window surrounding early prediction.
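
For readers unfamiliar with the measure, the following is a minimal sketch of a Kullback-Leibler divergence between two clusters' distributions of a single variable. This page does not state how the divergence was estimated, so the Gaussian approximation, the function name `gaussian_kl`, and the input values are all illustrative assumptions rather than the authors' method or data.

```python
import math

def gaussian_kl(mean_p, std_p, mean_q, std_q):
    """KL(P || Q) in nats for two univariate Gaussians P and Q.

    Assumption: each cluster's variable is summarized by a normal
    distribution; the study may have used a different estimator.
    """
    return (math.log(std_q / std_p)
            + (std_p ** 2 + (mean_p - mean_q) ** 2) / (2 * std_q ** 2)
            - 0.5)

# Synthetic inputs, not taken from the study.
kl_shifted = gaussian_kl(0.0, 1.0, 1.0, 1.0)  # equal spread, means one SD apart
kl_same = gaussian_kl(2.0, 3.0, 2.0, 3.0)     # identical distributions
```

The divergence is zero for identical distributions and grows as the means separate or the spreads differ. It is not symmetric, so the order of the two clusters matters.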
Figure 2—figure supplement 2
Physiological trajectories in (A) lactate, (B) systolic blood pressure, and (C) heart rate for the three clusters of patients illustrated in Figure 1—figure supplement 3 in the MIMIC-III database.

Solid lines indicate the mean value of each feature within each cluster. Shaded areas indicate an interval of 1 standard deviation from the mean.

Figure 2—figure supplement 3
Kullback-Leibler Divergence of Risk Score and physiological variables between the highest-risk and lowest-risk clusters in the window surrounding time of early prediction in the MIMIC-III database.

Figure 3
Risk trajectory classification accuracy.

The x-axis specifies the duration of data used following early prediction. Shaded areas indicate 90% confidence intervals, estimated empirically using bootstrap.
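
As a rough illustration of an empirical bootstrap interval for classification accuracy, the sketch below resamples per-patient correctness indicators with replacement and takes percentiles of the resampled accuracies. The percentile method, the function name `bootstrap_accuracy_ci`, the number of resamples, and the synthetic data are assumptions; the caption does not specify these details.

```python
import numpy as np

def bootstrap_accuracy_ci(correct, n_boot=1000, level=0.90, seed=0):
    """Percentile-bootstrap confidence interval for mean accuracy.

    `correct` holds 1 for a correctly classified trajectory and 0 otherwise.
    """
    rng = np.random.default_rng(seed)
    correct = np.asarray(correct, dtype=float)
    n = len(correct)
    # Each row of `idx` is one resample of patient indices, drawn with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    resampled_acc = correct[idx].mean(axis=1)
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(resampled_acc, [alpha, 1.0 - alpha])
    return correct.mean(), lo, hi

# Synthetic example (not study data): 80 of 100 trajectories classified correctly.
correct = np.array([1] * 80 + [0] * 20)
acc, lo, hi = bootstrap_accuracy_ci(correct)
```

Repeating this at each data duration on the x-axis would give one interval per point, which could then be drawn as a shaded band.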

Figure 4 with 1 supplement
Risk score trajectories following the first instance of intervention.

Threshold for early prediction is indicated by the horizontal line. Bold lines indicate mean risk within each cluster. Shaded areas indicate one standard deviation from the mean. The mean time within each cluster between entry into the pre-shock state and the time of first intervention is indicated on the right-hand side of the figure. A positive number indicates that the time of first intervention is after the time of threshold crossing, whereas a negative number indicates that the first intervention precedes entry into pre-shock.

Figure 4—figure supplement 1
Eigenvalues of Graph Laplacian of post-intervention risk trajectories.

Figure 5
Visualization of spectral clustering of risk score trajectories, for a simple example with two clusters.

Risk score trajectories (A) are clustered by using distances between trajectories (B) to project the data into a space in which they are easily separable (C). K-means clustering of the data in this new space yields clusters of risk scores (D).
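
The four panels described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the Euclidean distance, the Gaussian affinity with bandwidth `sigma`, the symmetric normalized Laplacian, the simple k-means helper, and the synthetic trajectories are all choices made here for the example.

```python
import numpy as np

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave a center unchanged if its cluster is empty
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def spectral_cluster(trajectories, n_clusters, sigma=1.0):
    """Cluster rows of `trajectories` (n_patients x n_timepoints)."""
    X = np.asarray(trajectories, dtype=float)
    # (B) pairwise Euclidean distances between trajectories
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Gaussian affinity; `sigma` is an assumed bandwidth
    W = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized graph Laplacian: L = I - D^{-1/2} W D^{-1/2}
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # (C) project onto the eigenvectors with the smallest eigenvalues
    eigvals, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :n_clusters]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    # (D) k-means in the projected space
    return kmeans(U, n_clusters), eigvals

# (A) synthetic risk trajectories over 12 hr: 20 rising, then 20 falling
rng = np.random.default_rng(0)
t = np.linspace(0.0, 12.0, 25)
rising = 0.5 + 0.03 * t + rng.normal(0.0, 0.02, size=(20, t.size))
falling = 0.5 - 0.03 * t + rng.normal(0.0, 0.02, size=(20, t.size))
X = np.vstack([rising, falling])
labels, eigvals = spectral_cluster(X, n_clusters=2)
```

The returned `eigvals` are the sorted Laplacian eigenvalues; inspecting where they jump is one common way to choose the number of clusters.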

Tables

Table 1
Clusters in Figure 1 stratify by septic shock prevalence, mortality, and time to septic shock onset (EWT).

Clusters are numbered in descending order of septic shock prevalence.

| Post-prediction cluster | Size | % Septic shock | % Mortality | Median time to shock onset (EWT) |
| --- | --- | --- | --- | --- |
| 1 (High-risk) | 1558 (17.2%) | 76.5% | 43% | 9.8 hr |
| 2 | 2672 (29.5%) | 46.3% | 34% | 12.6 hr |
| 3 | 3538 (39.0%) | 26.0% | 22% | 15.3 hr |
| 4 (Low-risk) | 1305 (14.4%) | 10.4% | 18% | 29.9 hr |

Table 2
Proportion of patients in each of the four clusters who have received adequate fluid resuscitation or treatment with vasopressors by time of early prediction.

| Cluster | % Shock patients adequately fluid resuscitated | % Shock patients treated with vasopressors |
| --- | --- | --- |
| 1 (High-risk) | 7.8% | 14.3% |
| 2 | 9.8% | 22.2% |
| 3 | 13.8% | 29.3% |
| 4 (Low-risk) | 21.3% | 50.7% |

Table 3
Mean values of features which are significantly different at a 99% confidence level between the post-prediction high-risk and low-risk clusters at the time point 1 hr preceding (A) first instance of adequate fluid resuscitation or (B) first instance of vasopressor administration.

A

| Feature | High-risk | Low-risk | Feature | High-risk | Low-risk |
| --- | --- | --- | --- | --- | --- |
| HR (bpm) | 99.4 | 96.0 | BUN (mg/dL) | 39.8 | 33.2 |
| SBP (mmHg) | 107.2 | 112.4 | pH | 7.30 | 7.33 |
| DBP (mmHg) | 59.3 | 62.4 | PaCO2 (mmHg) | 40.5 | 41.9 |
| MBP (mmHg) | 72.5 | 75.1 | Urine (mL/hr) | 5.6 | 3.2 |
| Resp (bpm) | 22.7 | 21.6 | Resp SOFA | 1.1 | 0.6 |
| FiO2 | 66.2% | 61.2% | Nervous SOFA | 0.5 | 0.3 |
| GCS | 12.1 | 12.8 | Cardio SOFA | 0.4 | 0.1 |
| Platelets (k/μL) | 211.0 | 232.1 | Liver SOFA | 0.1 | 0.0 |
| Creatinine (mg/dL) | 2.2 | 1.9 | Coag SOFA | 0.5 | 0.3 |
| Lactate (mmol/L) | 4.6 | 2.9 | Kidney SOFA | 1.2 | 0.8 |

B

| Feature | High-risk | Low-risk | Feature | High-risk | Low-risk |
| --- | --- | --- | --- | --- | --- |
| HR (bpm) | 100.4 | 93.3 | pH | 7.28 | 7.30 |
| Resp (bpm) | 23.0 | 21.4 | PaCO2 | 40.8 | 43.2 |
| CVP (mmHg) | 18.2 | 16.4 | Hemoglobin (g/dL) | 11.1 | 10.6 |
| FiO2 | 71.2% | 64.2% | Hematocrit | 33.9% | 32.7% |

Table 4
Clusters in Figure 4 stratify by septic shock prevalence and mortality.

A similar number of patients are in each cluster.

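| Post-intervention cluster | Size | % Septic shock | % Mortality |
| --- | --- | --- | --- |
| 1 (High-risk) | 1506 | 75.9 | 42.2 |
| 2 | 1654 | 52.2 | 33.7 |
| 3 | 1758 | 38.5 | 26.9 |
| 4 | 2257 | 21.1 | 18.2 |
| 5 (Low-risk) | 1711 | 19.2 | 26.4 |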

Cite this article

Ran Liu, Joseph L Greenstein, James C Fackler, Melania M Bembea, Raimond L Winslow (2020)
Spectral clustering of risk score trajectories stratifies sepsis patients by clinical outcome and interventions received
eLife 9:e58142.
https://doi.org/10.7554/eLife.58142