Distributions of prediction performance across subject traits and CV iterations for the different prediction methods on HCP data.
The best-performing methods are highlighted by black arrows in each plot.

a) Pearson's correlation coefficients (r) between predicted and actual variable values in deconfounded space as a measure of prediction accuracy (x-axis) for each method (y-axis). Larger values indicate that the model predicts more accurately. The linear Fisher kernel has the highest average accuracy among the time-varying methods, while the Ridge regression model in Riemannian space has the highest average accuracy among the time-averaged methods. Note that, for visualisation purposes, we show the distribution across target variables and CV iterations averaged over folds, whereas the fold-wise accuracies were used for significance testing. Asterisks indicate Benjamini-Hochberg-corrected p-values below 0.05 (*) from repeated k-fold cross-validation corrected t-tests.

b) Coefficients of determination (R2) in deconfounded space (x-axis) for each method (y-axis). The x-axis is cropped at -0.1 for visualisation purposes, since individual runs can produce large negative outliers; see panel c.

c) Normalised maximum absolute errors (NMAXAE) in original (non-deconfounded) space as a measure of excessive errors (x-axis) by method (y-axis). Large maximum errors indicate that the model predicts very poorly in single cases. Differences between the methods lie mainly in the tails of the distributions: the naïve normalised Gaussian kernel produces extreme maximum errors in some runs (NMAXAE > 10,000), whereas the linear naïve normalised kernel and the linear Fisher kernel, along with several time-averaged methods, have the smallest risk of excessive errors (NMAXAE < 1). The x-axis is plotted on a log scale.

d) Robustness of prediction accuracies. The plot shows, for each method (y-axis), the distribution across variables of the standard deviation of correlation coefficients over folds and CV iterations (x-axis). Smaller values indicate greater robustness.
The linear Fisher kernel and the time-averaged Ridge regression model in Riemannian space are the most robust. Asterisks indicate Benjamini-Hochberg-corrected p-values below 0.01 (**) and 0.001 (***) from repeated measures t-tests. a), b), c) Each violin plot shows the distribution over 3,500 runs per method (100 iterations of 10-fold CV for each of the 35 variables). d) Each violin plot shows the distribution over the 35 predicted variables for each method.
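The per-run metrics in panels a)–d) can be sketched as follows. This is a minimal illustration, not the paper's implementation: in particular, the choice of normaliser for NMAXAE (here the standard deviation of the target) is an assumption.

```python
import numpy as np

def prediction_metrics(y_true, y_pred):
    """Per-run metrics as in panels a-c: Pearson's r, R^2, and a
    normalised maximum absolute error (NMAXAE)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    r = np.corrcoef(y_true, y_pred)[0, 1]             # panel a: accuracy
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    r2 = 1.0 - ss_res / ss_tot                        # panel b: R^2
    # panel c: worst-case absolute error, scaled by the target's spread
    # (the exact normaliser is an assumption, not taken from the figure)
    nmaxae = np.max(np.abs(y_true - y_pred)) / np.std(y_true)
    return r, r2, nmaxae

def robustness(r_values):
    """Panel d: standard deviation of correlation coefficients
    over folds and CV iterations; smaller means more robust."""
    return np.std(np.asarray(r_values, dtype=float), ddof=1)
```

A perfectly predicted variable yields r = 1, R² = 1, and NMAXAE = 0; excessive single-case errors inflate NMAXAE while leaving r largely unchanged.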
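The significance testing described in the caption (repeated k-fold cross-validation corrected t-tests, with Benjamini-Hochberg correction across comparisons) can be sketched as below. The Nadeau-Bengio variance correction is assumed as the corrected-t-test variant, and the example inputs are hypothetical.

```python
import numpy as np
from scipy import stats

def corrected_resampled_ttest(diffs, n_train, n_test):
    """Corrected t-test on fold-wise accuracy differences between two
    methods under repeated k-fold CV. The variance is inflated by
    n_test/n_train (Nadeau-Bengio correction, assumed here) to account
    for the overlap of training sets across folds."""
    diffs = np.asarray(diffs, dtype=float)
    n = diffs.size
    t = diffs.mean() / np.sqrt(diffs.var(ddof=1) * (1.0 / n + n_test / n_train))
    p = 2.0 * stats.t.sf(abs(t), df=n - 1)  # two-sided p-value
    return t, p

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (step-up FDR procedure)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    adj = p[order] * m / np.arange(1, m + 1)
    # enforce monotonicity from the largest p-value downwards
    adj = np.minimum.accumulate(adj[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.clip(adj, 0.0, 1.0)
    return out
```

One such test per method pair yields a vector of p-values, which is then adjusted jointly; adjusted values below 0.05, 0.01, and 0.001 would correspond to the *, **, and *** markers.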