Figures and data

Precision psychiatry: from individual differences to clinical prediction.
The path to precision psychiatry begins with research on individual differences, which exist across many dimensions. The first challenge is to develop tools to measure and characterize these differences (phenotyping). Next, clinically relevant differences must be identified (biomarker research). Finally, collections of relevant markers can be used to build prediction models. However, systemic problems in research practices (highlighted in red) at multiple steps in this pathway undermine progress.

Different levels of research questions with associated methods and metrics.
Much of research planning and clinical translation strategy is currently formulated based on the mere existence of effects. This can be misleading, as translational potential is proportional not to statistical significance or model evidence, but to effect size. These observed effects can be further expressed as discriminative ability and, by accounting for real-world base rates, as predictive value and clinical utility. Each subsequent step becomes increasingly informative for gauging translational potential, as conveyed by the emojis.
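
To make the step from effect size to discriminative ability concrete, the sketch below converts Cohen's d to ROC-AUC. It assumes two normally distributed groups with equal variance, which is a simplifying assumption of this example and not necessarily the only model E2P Simulator supports. The function name `d_to_roc_auc` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def d_to_roc_auc(d: float) -> float:
    """ROC-AUC implied by Cohen's d, assuming two equal-variance normal groups."""
    # Under this assumption, AUC = Phi(d / sqrt(2)), where Phi is the
    # standard normal cumulative distribution function.
    return NormalDist().cdf(d / sqrt(2))


# Illustrative effect sizes only; these are not values taken from the figures.
for d in (0.2, 0.5, 0.8, 1.2):
    print(f"d = {d:.1f} -> ROC-AUC = {d_to_roc_auc(d):.3f}")
```

Under this assumption, an effect conventionally labelled "large" (d = 0.8) corresponds to a ROC-AUC of only about 0.71, which is why the existence of an effect alone says little about translational potential.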

Effect size vs. prediction metrics.
Standardized mean difference measures, such as Cohen’s d, capture how far the two distributions are from each other. If we place a decision threshold (red) to classify cases into two groups, there are four possible outcomes: True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN). These four outcomes are used to derive all classification metrics, as presented on the right. In predictive utility analysis, we place special emphasis on PR-AUC and net benefit (NB), both highlighted in blue, as they can convey both overall and context-specific model performance while accounting for the real-world base rate, making them highly informative for understanding the translational potential of research findings.
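
The relationship between a threshold, the four outcomes, and the derived metrics can be sketched as follows. This is a minimal illustration and not E2P Simulator's implementation: it assumes controls follow N(0, 1) and cases follow N(d, 1), and the function name `classification_metrics` and the parameter `p_t` (the threshold probability used in the standard net benefit formula) are hypothetical names chosen for this example.

```python
from statistics import NormalDist


def classification_metrics(d, threshold, base_rate, p_t):
    """Outcome proportions and derived metrics for a single decision threshold.

    Assumes controls ~ N(0, 1) and cases ~ N(d, 1); scores above `threshold`
    are classified as positive. `p_t` is the threshold probability that
    weights false positives in the net benefit formula.
    """
    phi = NormalDist().cdf
    sensitivity = 1.0 - phi(threshold - d)  # share of cases above threshold
    specificity = phi(threshold)            # share of controls below threshold

    # The four outcomes as proportions of the whole population.
    tp = base_rate * sensitivity
    fn = base_rate * (1.0 - sensitivity)
    fp = (1.0 - base_rate) * (1.0 - specificity)
    tn = (1.0 - base_rate) * specificity

    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        # Net benefit: true positives minus false positives weighted by
        # the odds of the threshold probability.
        "NB": tp - fp * p_t / (1.0 - p_t),
    }


# Illustrative inputs only; they are not taken from any figure.
metrics = classification_metrics(d=0.8, threshold=0.4, base_rate=0.08, p_t=0.1)
for name, value in metrics.items():
    print(f"{name:12s} {value:.3f}")
```

Sensitivity and specificity do not depend on the base rate, whereas PPV, NPV, and NB do, which is why the same effect size can look very different once prevalence is taken into account.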

Diagnostic prediction examples.
(a) A cognitive marker for depression, (b) the FDA-cleared diagnostic biomarker p-tau217/Aβ42 for Alzheimer’s disease. Both panels are screenshots of E2P Simulator’s interface showing inputs on the left (effect size metrics, reliability values, and the base rate), the resulting data distributions in the center, and all predictive and utility metrics at the bottom and on the right. The red line is the classification threshold with the corresponding red markers on ROC-AUC, PR-AUC, and DCA plots.

Treatment response prediction example: task-based fMRI for predicting response to antidepressants.
A screenshot of E2P Simulator’s interface showing inputs on the left (effect size metrics, reliability values, and the base rate), the resulting data distributions in the center, and all predictive and utility metrics at the bottom and on the right. The red line is the classification threshold with the corresponding red markers on ROC-AUC, PR-AUC, and DCA plots.

Risk prediction examples.
(a) Double-deviant MMN for predicting transition to psychosis, (b) electronic health records for predicting suicide attempts. Both panels are screenshots of E2P Simulator’s interface showing inputs on the left (effect size metrics, reliability values, and the base rate), the resulting data distributions in the center, and all predictive and utility metrics at the bottom and on the right. The red line is the classification threshold with the corresponding red markers on ROC-AUC, PR-AUC, and DCA plots.

Estimating the strength and number of predictors needed for diagnostic prediction in depression.
A screenshot of the binary multivariable calculator with inputs on the left and results on the right. Here we show three example settings for achieving PR-AUC = 0.8, which at an 8% base rate would correspond to ROC-AUC = 0.96. We would need either 20 predictors of d = 0.8 each with 0.1 collinearity (green line), 10 predictors of d = 0.8 each with no (0.0) collinearity (red line), or 5 predictors of d = 1.35 each with 0.1 collinearity (yellow line).
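
One way to reason about such settings is sketched below. The caption does not describe the calculator's internals, so this example rests on assumptions of its own: equally strong predictors with a common pairwise correlation, combined into a Mahalanobis distance D under an equal-variance multivariate normal model, with PR-AUC obtained by numerically sweeping a threshold. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf


def mahalanobis_d(n_predictors, d, collinearity):
    """Combined effect size of equally strong, equally correlated predictors."""
    return sqrt(n_predictors * d**2 / (1 + (n_predictors - 1) * collinearity))


def roc_auc(big_d):
    """ROC-AUC for two equal-variance normal groups separated by big_d."""
    return PHI(big_d / sqrt(2))


def pr_auc(big_d, base_rate, n_grid=4000):
    """Area under the precision-recall curve via a threshold sweep."""
    lo, hi = -8.0, big_d + 8.0
    points = []
    for i in range(n_grid + 1):
        t = hi - (hi - lo) * i / n_grid           # strict to lenient threshold
        tp = base_rate * (1 - PHI(t - big_d))     # recall scaled by base rate
        fp = (1 - base_rate) * (1 - PHI(t))
        precision = tp / (tp + fp) if tp + fp > 0 else 1.0
        points.append((tp / base_rate, precision))
    # Trapezoidal integration of precision over recall.
    return sum(0.5 * (p0 + p1) * (r1 - r0)
               for (r0, p0), (r1, p1) in zip(points, points[1:]))


# The uncorrelated setting from the caption: 10 predictors of d = 0.8.
D = mahalanobis_d(n_predictors=10, d=0.8, collinearity=0.0)
print(f"D = {D:.2f}, ROC-AUC = {roc_auc(D):.3f}, "
      f"PR-AUC at 8% base rate = {pr_auc(D, 0.08):.3f}")
```

For the uncorrelated setting, this simplified model gives a ROC-AUC of about 0.96, in line with the caption. The calculator itself may use a different formulation, so the other settings need not reproduce exactly.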

Estimating the strength and number of predictors needed for predicting treatment response to antidepressants.
A screenshot of the continuous multivariable calculator with inputs on the left and results on the right: the resulting R² and PR-AUC as a function of the number of predictors. Here we show two example curves for a target of R² = 0.8, which at a 15% base rate corresponds to PR-AUC = 0.8. We would need 17 predictors of r = 0.4 to achieve R² = 0.8 (green line). Interestingly, if we reduce the effect size of each predictor to r = 0.3, we would never reach R² = 0.8, no matter how many predictors we have (red line).
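
The plateau for weaker predictors can be illustrated with the standard formula for the variance explained by equally valid, equally intercorrelated predictors. The caption does not state the collinearity used, so the value of 0.15 below is an assumption made purely for illustration, and the function names are hypothetical.

```python
def r_squared(n_predictors, r, collinearity):
    """R^2 from n predictors, each correlating r with the outcome and
    `collinearity` with every other predictor.

    Values above 1 indicate a combination of inputs that no real
    correlation matrix could produce.
    """
    return n_predictors * r**2 / (1 + (n_predictors - 1) * collinearity)


def r_squared_ceiling(r, collinearity):
    """Limit of R^2 as the number of predictors grows (collinearity > 0)."""
    return r**2 / collinearity


ASSUMED_COLLINEARITY = 0.15  # hypothetical; not stated in the caption

for r in (0.4, 0.3):
    values = ", ".join(
        f"n={n}: {r_squared(n, r, ASSUMED_COLLINEARITY):.2f}"
        for n in (5, 10, 17, 50)
    )
    ceiling = r_squared_ceiling(r, ASSUMED_COLLINEARITY)
    print(f"r = {r}: {values}; ceiling = {ceiling:.2f}")
```

Under the assumed collinearity of 0.15, 17 predictors of r = 0.4 give R² = 0.8, while predictors of r = 0.3 plateau at 0.09 / 0.15 = 0.6. Because correlated predictors carry overlapping information, adding more of them yields diminishing returns, and a weak enough predictor set can never reach the target.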