Predictive models of cognitive abilities from mental-health features via Partial Least Squares (PLS).

a) Predictive performance of the models, shown as scatter plots of observed vs predicted cognitive abilities based on mental health. Cognitive abilities are represented by the second-order latent variable, the g-factor, derived from a confirmatory factor analysis of six cognitive tasks. All data points are from test sets. r is the average Pearson’s r across 21 test sites. The parentheses following the r indicate bootstrapped 95% CIs, calculated from observed vs predicted cognitive abilities across all test sites combined. The UPPS-P Impulsive Behaviour Scale and the Behavioural Inhibition System/Behavioural Activation System (BIS/BAS) scales were used for child temperaments, conceptualised as risk factors for mental-health issues. Mental health includes features from the CBCL and child temperaments. b) Feature importance of mental health in predicting cognitive abilities via PLS. The features are ordered by their loading on the first PLS component. Univariate correlations are Pearson’s r between each mental-health feature and cognitive abilities. Error bars reflect 95% CIs of the correlations. CBCL = Child Behavior Checklist (in green), reflecting children’s emotional and behavioural problems; UPPS-P = Urgency, Premeditation, Perseverance, Sensation seeking and Positive urgency Impulsive Behaviour Scale; BAS = Behavioural Activation System (in orange).
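The two summary statistics described in panel a) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy data, the number of bootstrap resamples and the use of a percentile bootstrap are all assumptions, since the caption does not specify them.

```python
import numpy as np

def site_averaged_r(observed, predicted, site):
    """Mean of the Pearson correlations computed separately within each test site."""
    observed, predicted, site = map(np.asarray, (observed, predicted, site))
    rs = [np.corrcoef(observed[site == s], predicted[site == s])[0, 1]
          for s in np.unique(site)]
    return float(np.mean(rs))

def bootstrap_ci_r(observed, predicted, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumed method) for Pearson's r on pooled test data."""
    observed, predicted = np.asarray(observed), np.asarray(predicted)
    rng = np.random.default_rng(seed)
    n = len(observed)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample participants with replacement
        boot[b] = np.corrcoef(observed[idx], predicted[idx])[0, 1]
    lo, hi = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)

# Simulated toy data (hypothetical): 21 sites with 50 participants each
rng = np.random.default_rng(42)
site = np.repeat(np.arange(21), 50)
observed = rng.normal(size=site.size)
predicted = 0.5 * observed + rng.normal(size=site.size)

r_mean = site_averaged_r(observed, predicted, site)
ci_low, ci_high = bootstrap_ci_r(observed, predicted)
print(f"r = {r_mean:.2f} ({ci_low:.2f}, {ci_high:.2f})")
```

Note that the point estimate averages within-site correlations while the interval resamples the pooled test-set data, mirroring the distinction the caption draws.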

Predictive models of cognitive abilities from neuroimaging via opportunistic stacking and from polygenic scores via Elastic Net.

a) Scatter plots of observed vs predicted cognitive abilities based on neuroimaging and polygenic scores. Cognitive abilities are represented by the second-order latent variable, the g-factor, derived from a confirmatory factor analysis of six cognitive tasks. All data points are from test sets. r is the average Pearson’s r across 21 test sites. The parentheses following the r indicate bootstrapped 95% CIs, calculated from observed vs predicted cognitive abilities across all test sites combined. b) Feature importance of the stacking layer of neuroimaging, predicting cognitive abilities via Random Forest. For the stacking layer of neuroimaging, feature importance is based on the absolute value of SHAP, averaged across test sites. A higher absolute SHAP value indicates a larger contribution to the prediction. Error bars reflect standard deviations across sites. Different sets of neuroimaging features are shown in different colours: pink for dMRI, orange for fMRI, purple for rsMRI and green for sMRI. c) Feature importance of polygenic scores, predicting cognitive abilities via Elastic Net. For polygenic scores, feature importance is based on the Elastic Net coefficients, averaged across test sites. We also plotted Pearson’s correlations between each polygenic score and cognitive abilities, computed from the full data. Error bars reflect 95% CIs of these correlations.
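As an illustration of the summary in panel b), the sketch below turns per-site SHAP matrices into a mean absolute SHAP value per feature with a standard deviation across sites. It assumes the SHAP values have already been computed elsewhere; the function name, the toy numbers, the choice to average within each site first, and the sample standard deviation (`ddof=1`) are assumptions rather than details given in the caption.

```python
import numpy as np

def summarise_abs_shap(shap_by_site):
    """Summarise SHAP values across test sites.

    shap_by_site: list of arrays, one per test site, each of shape
    (n_participants_in_site, n_features).
    Returns the mean and standard deviation across sites of the
    per-site mean absolute SHAP value for each feature.
    """
    per_site = np.array([np.abs(s).mean(axis=0) for s in shap_by_site])
    return per_site.mean(axis=0), per_site.std(axis=0, ddof=1)

# Made-up SHAP matrices for 3 hypothetical test sites and 2 features
shap_by_site = [
    np.array([[0.2, -0.1], [-0.4, 0.1]]),
    np.array([[0.3, 0.0], [-0.5, -0.2]]),
    np.array([[-0.5, 0.1], [0.1, 0.3]]),
]
importance, spread = summarise_abs_shap(shap_by_site)
print(importance, spread)
```

Taking the absolute value before averaging keeps positive and negative contributions from cancelling, which is why a larger value reads as a larger contribution regardless of direction.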

Feature importance of each set of neuroimaging features, predicting cognitive abilities in the baseline data.

The feature importance was based on the Elastic Net coefficients, averaged across test sites. We did not order these sets of neuroimaging features according to their contribution to the stacking layer (see Figure 2). Larger versions of the feature importance for each set of neuroimaging features can be found in Supplementary Figures 4–13. MID = Monetary Incentive Delay task; SST = Stop Signal Task; DTI = Diffusion Tensor Imaging; FC = functional connectivity.

Predictive models of cognitive abilities from socio-demographics, lifestyles and developmental adverse events via Partial Least Squares (PLS).

a) Scatter plots of observed vs predicted cognitive abilities based on socio-demographics, lifestyles and developmental adverse events. Cognitive abilities are represented by the second-order latent variable, the g-factor, derived from a confirmatory factor analysis of six cognitive tasks. All data points are from test sets. r is the average Pearson’s r across 21 test sites. The parentheses following the r indicate bootstrapped 95% CIs, calculated from observed vs predicted cognitive abilities across all test sites combined. b) Feature importance of socio-demographics, lifestyles and developmental adverse events, predicting cognitive abilities via Partial Least Squares. The features are ordered by their loading on the first PLS component. Univariate correlations are Pearson’s r between each feature and cognitive abilities. Error bars reflect 95% CIs of the correlations. Different types of environmental factors are shown in different colours: orange for socio-demographics, purple for developmental adverse events and green for lifestyle. A dashed horizontal line in the follow-up feature importance figure separates variables collected at baseline from those collected at follow-up.
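The ordering in panel b) rests on first-component PLS loadings plotted alongside univariate correlations. A minimal sketch of both quantities for a single response is given below, using the standard NIPALS definitions for the first component. The function names, the simulated data, the standardisation of features and the ordering by signed (rather than absolute) loading are assumptions made for illustration.

```python
import numpy as np

def first_pls_loadings(X, y):
    """Loadings of the first PLS component for a single response variable."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardise features (assumed)
    yc = y - y.mean()
    w = Xz.T @ yc
    w /= np.linalg.norm(w)       # weight vector, proportional to cov(feature, y)
    t = Xz @ w                   # component scores
    return Xz.T @ t / (t @ t)    # loadings: regression of each feature on the scores

def univariate_r(X, y):
    """Pearson's r between each feature and the response."""
    return np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])

# Simulated toy data (hypothetical): 500 participants, 4 features
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 1.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=500)

loadings = first_pls_loadings(X, y)
order = np.argsort(loadings)[::-1]   # features from highest to lowest loading
r = univariate_r(X, y)
print(order, np.round(loadings, 2), np.round(r, 2))
```

In this toy setting the features are nearly uncorrelated, so loadings and univariate correlations rank the features similarly; with correlated features the two can diverge, which is one reason to plot both.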

Venn diagrams showing common and unique effects of proxy measures of cognitive abilities based on mental health, neuroimaging, polygenic scores and/or socio-demographics, lifestyles and developmental adverse events in explaining cognitive abilities across test sites.

We computed the common and unique effects as percentages based on the marginal R² of four sets of linear mixed models.
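The decomposition into unique and common effects can be sketched with the standard commonality-analysis formula, which only needs the R² of a model for every combination of predictor sets. The code below is an illustration under assumptions: it takes the R² values as given rather than fitting linear mixed models, the function name is hypothetical, and the two-set example uses made-up numbers, not results from the study.

```python
from itertools import combinations

def commonality(r2, sets):
    """Unique and common effects from the R^2 of all sub-models.

    r2 maps a frozenset of predictor-set names to the R^2 of the model
    containing exactly those sets. Returns a dict keyed by tuples of names:
    length-1 keys are unique effects, longer keys are common effects.
    """
    full = frozenset(sets)

    def get(s):
        return 0.0 if not s else r2[frozenset(s)]

    out = {}
    for k in range(1, len(sets) + 1):
        for S in combinations(sets, k):
            rest = full - set(S)
            total = 0.0
            # Inclusion-exclusion over subsets T of S
            for m in range(len(S) + 1):
                for T in combinations(S, m):
                    total += (-1) ** (m + 1) * get(rest | set(T))
            out[S] = total
    return out

# Made-up R^2 values for two hypothetical predictor sets, A and B
r2 = {
    frozenset({"A"}): 0.20,
    frozenset({"B"}): 0.15,
    frozenset({"A", "B"}): 0.28,
}
parts = commonality(r2, ["A", "B"])
for key, value in parts.items():
    print(key, f"{100 * value:.0f}%")
```

The components always sum to the R² of the full model, so with four predictor sets the same function yields the 15 regions of a four-way Venn diagram from the 15 sub-model R² values.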