1. Neuroscience

A prediction model of working memory across health and psychiatric disease using whole-brain functional connectivity

  1. Masahiro Yamashita
  2. Yujiro Yoshihara
  3. Ryuichiro Hashimoto
  4. Noriaki Yahata
  5. Naho Ichikawa
  6. Yuki Sakai
  7. Takashi Yamada
  8. Noriko Matsukawa
  9. Go Okada
  10. Saori C Tanaka
  11. Kiyoto Kasai
  12. Nobumasa Kato
  13. Yasumasa Okamoto
  14. Ben Seymour (corresponding author)
  15. Hidehiko Takahashi (corresponding author)
  16. Mitsuo Kawato (corresponding author)
  17. Hiroshi Imamizu (corresponding author)
  1. Advanced Telecommunications Research Institute International, Japan
  2. Kyoto University, Japan
  3. Showa University, Japan
  4. The University of Tokyo, Japan
  5. National Institute of Radiological Sciences, Japan
  6. Hiroshima University, Japan
  7. Kyoto Prefectural University of Medicine, Japan
  8. University of Cambridge, United Kingdom
  9. National Institute of Information and Communications Technology, Japan
Research Article
Cite this article as: eLife 2018;7:e38844 doi: 10.7554/eLife.38844

Abstract

Working memory deficits are present in many neuropsychiatric diseases with diagnosis-related severity. However, it is unknown whether this common behavioral abnormality is a continuum explained by a neural mechanism shared across diseases or a set of discrete dysfunctions. Here, we performed predictive modeling to examine working memory ability (WMA) as a function of normative whole-brain connectivity across psychiatric diseases. We built a quantitative model for letter three-back task performance in healthy participants, using resting state functional magnetic resonance imaging (rs-fMRI). This normative model was applied to independent participants (N = 965) including four psychiatric diagnoses. Individuals’ predicted WMA correlated significantly with measured WMA in both a healthy population and patients with schizophrenia. Our predicted effect size estimates of WMA impairment were comparable to previous meta-analysis results. These results suggest a general association between brain connectivity and working memory ability that applies to both health and psychiatric disease.

https://doi.org/10.7554/eLife.38844.001

Introduction

Working memory is the goal-directed, active maintenance and manipulation of information in mind, and it forms a foundation for diverse complex cognitive functions, learning, and emotion regulation (Baddeley, 2003; Cowan, 2014; Etkin et al., 2015; Otto et al., 2013). A range of psychiatric disorders commonly shows working memory deficits, although the severity of the deficits depends on the psychiatric diagnosis (Forbes et al., 2009; Lever et al., 2015; Millan et al., 2012; Snyder, 2014; Snyder et al., 2015). Working memory emerges by coordinating multiple related processes, from sensory perception and cognitive control (e.g. updating, focused attention) to motor action, and thus requires close communication among widespread brain regions (D'Esposito and Postle, 2015; Eriksson et al., 2015; Nee et al., 2013; Owen et al., 2005; Postle, 2006; Rottschy et al., 2012).

Functional connectivity (FC) quantifies how brain regions are temporally coordinated and is increasingly used to examine brain network architecture. Resting state (i.e. task-free) FC has been associated with a wide range of individual traits (Baldassarre et al., 2012; Dosenbach et al., 2010; Lewis et al., 2009; Seeley et al., 2007). For example, whole-brain FC models have recently demonstrated that sets of functional connections across widespread brain regions can predict performance on cognitive tasks (Finn et al., 2015; Rosenberg et al., 2016; Smith et al., 2015; Yamashita et al., 2015). These findings suggest that specific cognitive processes (e.g. memory, attention) may be represented by the corresponding interaction patterns among distributed brain networks, at least among healthy populations.

Functional connectivity has also provided insight into the biological basis of psychiatric disorders and shown that different diagnoses are related to unique patterns of FC (Baker et al., 2014; Harrison et al., 2009; Kaiser et al., 2015; Yahata et al., 2016). For example, a whole-brain FC-based model has been shown to reliably predict autism spectrum disorder (ASD), as well as individual clinical scores (Emerson et al., 2017; Lake et al., 2018; Yahata et al., 2016), suggesting that FC disruption is quantitatively relevant to behavioral abnormality. More broadly, this suggests that a specific relationship between FC and behavior might exist across many disparate diagnoses that have common symptoms, such as impairments in working memory.

With the above issues in mind, we set out to examine competing hypotheses about the relationship between FC and working memory ability (WMA) across healthy populations and a range of psychiatric diagnoses. In this study, we define working memory ability simply as a summary index of general working memory performance, without specifying the sensory modality or underlying sub-functions. The first hypothesis proposes a distinct FC-WMA relationship for each diagnosis, rationalized by the fact that each psychiatric diagnosis is characterized by differential alterations in FC (Baker et al., 2014; Harrison et al., 2009; Kaiser et al., 2015; Yahata et al., 2016). This hypothesis predicts that the FC-WMA relationship among healthy populations will fail to generalize in predicting impairments across different diagnoses. The alternative hypothesis proposes a common FC-WMA relationship across health and multiple diagnoses. The rationale for this hypothesis is that previous studies have suggested that models of several cognitive functions, such as attention and memory, generalize to predict behavior in patients as well as in healthy populations (e.g. Kessler et al., 2016; Lin et al., 2018; O'Halloran et al., 2018; Rosenberg et al., 2016). This hypothesis predicts that an FC-WMA relationship estimated from whole-brain functional connections in healthy populations will generalize to predict working memory impairment across diagnoses.

To test these hypotheses, we built a prediction model of working memory ability in a letter 3-back task using whole-brain FC among a healthy population. Then, we examined whether the model was predictive of individual differences in behaviorally measured working memory ability not only in healthy individuals but also in individuals with schizophrenia. Moreover, we examined whether the model was predictive of group differences in working memory ability among four different psychiatric diagnoses by comparing the predicted effect sizes across diagnoses.

Results

Design

We constructed a prediction model of working memory ability among healthy individuals recruited at ATR (Advanced Telecommunications Research Institute International), Japan (ATR dataset; Figure 1A). To test its generalizability, independently collected resting state fMRI (rs-fMRI) data were entered into this model to predict individual working memory ability. Specifically, we applied the model to independent test datasets of healthy individuals in the USA (the Human Connectome Project dataset, HCP dataset) and of schizophrenia patients and their controls (Figure 1B). The predicted working memory ability was compared with the actually measured working memory score. We emphasize that these individual-differences analyses were performed to examine the relative accuracy of the prediction model (i.e. the ability to differentiate between good and bad performers), not its absolute accuracy (i.e. the ability to predict a specific level of performance). Moreover, the model was applied to patients with psychiatric diagnoses including schizophrenia (SCZ), major depressive disorder (MDD), obsessive-compulsive disorder (OCD), and ASD, and also to their age- and gender-matched healthy/typically developed controls (multiple psychiatric diagnoses dataset; Figure 1C). The effect size estimates of the predicted working memory impairments were compared with those of behaviorally observed impairments reported in previous meta-analyses. We note that individual behavioral scores of working memory ability were available for the schizophrenia cohort but not for the other psychiatric diagnoses, and that the same SCZ dataset was analyzed from an individual-difference perspective as well as a group-level difference perspective (see below).

Schematic diagram of model construction and generalization tests using independent datasets.

(A) The model was developed using whole-brain resting state FC and the learning plateau of a letter 3-back task in healthy individuals from the ATR dataset. (B) We applied the model to resting state FC patterns and predicted each participant’s working memory ability. We first examined the external validity using an independent USA healthy dataset (HCP dataset: the upper flow chart in (B)). The predicted working memory ability was compared to actual working memory performance (visual-object N-back task and the NIH Toolbox list-sorting test). Then we examined the generalizability to a clinical population using a schizophrenia dataset (the lower flow chart in (B)). The predicted working memory ability was compared to the actual working memory score measured by the digit sequencing test. (C) Using the multiple psychiatric diagnoses dataset, the degree of working memory impairment for each diagnosis was predicted as the difference from the corresponding controls. The predicted impairments were validated against previous meta-analyses of digit-span performance across multiple diagnoses. Note that the task stimulus images shown for the HCP dataset are for illustration purposes only and differ from the original stimuli.

https://doi.org/10.7554/eLife.38844.002

Building prediction model

We used the ATR dataset (N = 17, age 19–24 years old) to develop a prediction model of working memory ability. The participants performed a letter 3-back task for about 25 sessions (80–90 min). Working memory ability was quantified by estimating a learning plateau in this task as follows. An individual learning curve was obtained by calculating d-prime for each session and smoothing the data points with a five-session moving average (Figure 2—figure supplement 1). The individual learning curve was fitted by an inverse curve (y = a − b/x), where y is the d-prime in the x-th session, and a and b are the parameters for the learning plateau and learning speed, respectively. We collected resting state fMRI data from each participant and estimated whole-brain functional connectivity. We used network-level rather than node-level connectivity features to avoid overfitting to the training samples (the curse of dimensionality). Specifically, we calculated FC values, based on 18 whole-brain intrinsic networks of BrainMap ICA (Laird et al., 2011), for pair-wise between-network connections (18 × 17/2 = 153) and within-network connections (18), giving 171 connectivity features in total.
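The plateau estimation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the text does not state the fitting procedure or how the moving average treats the first and last sessions, so ordinary least squares and a centered window that shrinks at the edges are assumptions, and the function names are hypothetical. Because y = a − b/x is linear in 1/x, the fit reduces to a simple linear regression of y on u = 1/x.

```python
from statistics import mean


def moving_average(values, window=5):
    """Centered moving average; the window shrinks at the edges (assumption)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(mean(values[lo:hi]))
    return smoothed


def fit_inverse_curve(dprimes):
    """Least-squares fit of y = a - b / x, where x = 1, 2, ... is the session."""
    u = [1.0 / x for x in range(1, len(dprimes) + 1)]
    u_bar, y_bar = mean(u), mean(dprimes)
    sxy = sum((ui - u_bar) * (yi - y_bar) for ui, yi in zip(u, dprimes))
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    slope = sxy / sxx          # slope of y on 1/x equals -b
    a = y_bar - slope * u_bar  # intercept is the learning plateau
    return a, -slope


# Synthetic, noise-free learning curve over 25 sessions (illustrative values).
curve = [3.0 - 2.0 / x for x in range(1, 26)]
a, b = fit_inverse_curve(curve)

# Feature count for 18 networks: between-network pairs plus within-network.
n_networks = 18
n_between = n_networks * (n_networks - 1) // 2
n_features = n_between + n_networks
```

On real data the smoothed d-prime values from `moving_average` would be passed to `fit_inverse_curve`, and the fitted `a` would serve as the working memory score for that participant.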

To evaluate the test-retest reliability of this functional connectivity estimation method, we calculated the intra-class correlation (ICC) using three external datasets: Beijing Normal University (BNU 1), Institute of Automation, Chinese Academy of Sciences (IACAS 1), and University of Utah (Utah 1). We obtained ICC values of 0.34 ± 0.12 (range 0 to 0.65) for BNU 1, 0.26 ± 0.18 (range 0 to 0.66) for IACAS 1, and 0.21 ± 0.17 (range 0 to 0.59) for Utah 1. The ICC values for the left FPN/right FPN were 0.28/0.23, 0.49/0.47, and 0.11/0.03 for BNU 1, IACAS 1, and Utah 1, respectively. According to the interpretation criteria for ICC (Landis and Koch, 1977), our connectivity estimation method yielded ‘fair’ reliability (0.2 < ICC ≤ 0.4) for the three datasets. These results suggest that the test-retest reliability of our method is comparable to that of other common connectivity estimation methods (Birn et al., 2013; Noble et al., 2017).

Using sparse linear regression, individual letter 3-back learning plateaus were modeled as a linear weighted summation of 16 automatically selected functional connectivity values among 15 intrinsic networks (Figure 2). The letter 3-back learning plateaus were positively correlated with three functional connectivity values (P1-P3) and negatively correlated with the remaining 13 connectivity values (N1-N13). The contribution ratio of each connection to working memory ability, which is determined by the product (weight × FC value) at the connection, is represented by the thickness of the connection lines in Figure 2. Table 1 describes the networks connected by the 16 connections and the contribution ratio of each connection. The anatomical regions in each network are summarized in Supplementary file 2. We did not find a significant correlation between the predicted letter 3-back learning plateau and age (r = 0.21, p = 0.42), gender (r = 0.28, p = 0.28), or head motion (r = −0.37, p = 0.14). This provided a normative prediction model based on healthy young Japanese participants.
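The form of the resulting model can be sketched as a weighted sum. The snippet below is illustrative only: the weights, FC values, and intercept are made-up toy numbers, `predict_wma` is a hypothetical helper, and the sparse regression step that selects the 16 connections is not reproduced. The last lines simply check that the contribution ratios listed in Table 1 add up to approximately 100%.

```python
def predict_wma(weights, fc_values, intercept=0.0):
    """Return the predicted score and the per-connection terms (weight x FC)."""
    terms = [w * fc for w, fc in zip(weights, fc_values)]
    return intercept + sum(terms), terms


# Toy example with two connections (values are illustrative, not from the paper).
prediction, terms = predict_wma([0.5, -0.25], [0.4, 0.8], intercept=1.0)

# Contribution ratios [%] as listed in Table 1 (P1-P3, then N1-N13).
table1_ratios = [33.9, 15.4, 1.9,
                 13.9, 11.4, 9.2, 5.0, 3.0, 2.6, 1.8, 1.3, 0.3, 0.3, 0.1,
                 -0.0, -0.2]
total_ratio = round(sum(table1_ratios), 1)
```

Keeping the per-connection terms alongside the prediction makes it straightforward to inspect which connections drive a given participant's predicted score.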

Figure 2 (with 1 supplement)
Normative model of working memory ability (WMA).

Circle plot of networks and their connections in the model. Individual letter 3-back learning plateaus are predicted by a linear weighted summation of 16 FC values at 16 connections selected by a sparse linear regression algorithm. Connection thicknesses indicate contribution ratios (weight × FC at each connection). Connections are labeled ‘Positive/Negative (P/N)’ based on the sign of their correlation with letter 3-back learning performance, whereas numbers indicate descending order of contribution ratio. Each network’s color indicates its relevance to working memory function based on BrainMap ICA (Laird et al., 2011); warmer colors indicate closer relevance to working memory function. See Table 1 for the networks connected by the selected 16 connections and the precise contribution ratio of each connection. Each network’s label and the regions included in it are summarized in Supplementary file 2.

https://doi.org/10.7554/eLife.38844.003
Table 1
Selected connections and their contribution to working memory ability.
https://doi.org/10.7554/eLife.38844.012
| Label | Connection: network 1 (rank) | Connection: network 2 (rank) | Contribution ratio [%] |
|---|---|---|---|
| Positive features | | | |
| P1 | Left fronto-parietal network (1) | (within-network) | 33.9% |
| P2 | Supplemental motor network (3) | Primary sensorimotor network (hand) (11) | 15.4% |
| P3 | Middle frontal and parietal network (2) | Lateral temporal network (6) | 1.9% |
| Negative features | | | |
| N1 | Cingulo-opercular network (5) | Midbrain (10) | 13.9% |
| N2 | Right fronto-parietal network (4) | Midbrain (10) | 11.4% |
| N3 | Right fronto-parietal network (4) | Superior parietal network (18) | 9.2% |
| N4 | Supplemental motor network (3) | Orbitofrontal network (14) | 5.0% |
| N5 | Middle frontal and parietal network (2) | Primary sensorimotor network (mouth) (9) | 3.0% |
| N6 | Lateral occipital network (7) | Auditory (15) | 2.6% |
| N7 | Left fronto-parietal network (1) | Midbrain (10) | 1.8% |
| N8 | Cerebellum (12) | Auditory (15) | 1.3% |
| N9 | Left fronto-parietal network (1) | Lateral occipital network (7) | 0.3% |
| N10 | Lateral occipital network (7) | Primary sensorimotor network (hand) (11) | 0.3% |
| N11 | Lateral occipital network (7) | Superior parietal network (18) | 0.1% |
| N12 | Primary sensorimotor network (mouth) (9) | Cerebellum (12) | −0.0% |
| N13 | Left fronto-parietal network (1) | Basal ganglia (8) | −0.2% |

  1. Rank indicates relevance to working memory function according to the BrainMap ICA.

Prediction in independent test set of healthy individuals

We next tested the model’s generalizability to an entirely independent healthy cohort using the HCP dataset 500 Subjects Release (Van Essen et al., 2013). We restricted our analysis to participants for whom all of rs-fMRI, the visual-object N-back, the NIH Toolbox list-sorting test (Tulsky et al., 2014), and Raven’s progressive matrices with 24 items (Bilker et al., 2012) were available (N = 474; 194 males; 5-year age ranges in the Open Access Data: 22–25, 26–30, 31–35 and 36+ years old). Individual working memory performance was briefly measured by the visual-object N-back with 0-back and 2-back conditions (visual-object N-back score) and the list-sorting test (Figure 1B). The N-back scores were evaluated by the accuracy percentage of the 2-back and 0-back conditions (86.0 ± 9.5% (SD), range 45.8% to 100%). The other working memory measure, the list-sorting test, is a sequencing task of visual or auditory stimuli (mean score: 110.5 ± 11.6 (SD), range 80.8 to 144.5). Additionally, general fluid intelligence was assessed by Raven’s progressive matrices. The scores are integers that indicate the number of correct items (16.5 ± 4.8 (SD), range 4 to 24).

Before the generalization test, we found that the visual-object N-back task and the list-sorting test scores were positively correlated with general fluid intelligence (Spearman’s rank correlation ρ = 0.46, p = 3.3 × 10⁻²⁶; ρ = 0.32, p = 5.7 × 10⁻¹³, respectively) and negatively correlated with average in-scanner head motion (Spearman’s rank correlation ρ = −0.24, p = 1.5 × 10⁻⁷; ρ = −0.12, p = 0.009) as shown in Figure 3—figure supplement 1. To exclude these confounds, we performed a partial correlation analysis while factoring out these two variables. This revealed a significant partial correlation of the predicted working memory ability with the measured visual-object N-back scores (Spearman’s rank partial correlation ρ = 0.11; p = 0.0072, Figure 3A) and with the measured list-sorting scores (Spearman’s rank partial correlation ρ = 0.084; p = 0.034). These results suggest that the model captures FC variations specific to working memory ability independently of general fluid intelligence and head motion.
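A rank-based partial correlation of this kind can be sketched as follows. This is a generic illustration rather than the authors' implementation, which the text does not describe: values are converted to average ranks, and covariates are removed with the standard recursive partial correlation formula. All function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Recursive partial correlation of x and y given a list of covariates."""
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_spearman(x, y, covariates):
    """Spearman partial correlation: partial correlation computed on ranks."""
    return partial_corr(average_ranks(x), average_ranks(y),
                        [average_ranks(c) for c in covariates])


# Toy data: the covariate is uncorrelated with both variables in rank space,
# so the partial correlation equals the plain Spearman correlation.
rho = partial_spearman([1, 2, 3, 4], [2, 1, 4, 3], [[1, -1, -1, 1]])
```

In the analysis above, `x` would be the predicted working memory ability, `y` the measured N-back or list-sorting score, and the covariates fluid intelligence and head motion.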

Figure 3 (with 3 supplements)
Generalizability to HCP dataset and schizophrenia dataset.

(A) Significant Spearman’s rank partial correlation between predicted letter 3-back learning performance and measured visual-object N-back accuracy while factoring out general fluid intelligence and head motion (ρ = 0.110, p = 0.0072). (B) Significant Pearson partial correlation between predicted letter 3-back performances and measured digit-sequencing scores while factoring out the composite BACS score and age (ρ = 0.248, p = 0.033).

https://doi.org/10.7554/eLife.38844.005

Furthermore, we examined whether the model prediction was more similar to the 2-back score than the 0-back score. Spearman’s rho partial correlation between the model prediction and task performance was 0.078 for 2-back task and 0.086 for 0-back task, while factoring out two confounding variables (fluid intelligence and head motion). There was no significant difference between the two correlation coefficients. Therefore, we could not conclude that the model prediction was more similar to 2-back score than 0-back score.

Prediction in individual schizophrenia patients and controls

We examined whether the prediction model also predicted individual differences in working memory ability using independently collected resting state fMRI scans from the schizophrenia (SCZ) dataset. The schizophrenia patients (N = 58) and their age- and gender-matched controls (N = 60) completed a cognitive test battery, the Japanese version of the Brief Assessment of Cognition in Schizophrenia (BACS-J) (Kaneda et al., 2007). This test battery is composed of six subtests, including a digit sequencing test as a working memory measure. In this test, auditory sequences of numbers were presented, with length increasing from three to nine digits (Figure 1B). Participants repeated each sequence aloud after sorting the digits in ascending order. The digit-sequencing score was the number of correct trials among 28 trials (18.4 ± 4.1 (SD), range 10 to 27 in patients; 22.9 ± 4.3 (SD), range 12 to 28 in controls). The composite BACS score was calculated as the average score of the five BACS subtests other than the digit sequencing test.

First, we applied the model to patients with schizophrenia. Before the model application, we found that the digit-sequencing scores correlated positively with the composite BACS score excluding working memory (r = 0.61, p = 3.0 × 10⁻⁷) and negatively with age (r = −0.36, p = 0.005), but not with head motion (r = −0.03, p = 0.83), as shown in Figure 3—figure supplement 2. While controlling for age and the composite BACS score using a partial correlation analysis, the model predictions showed a significant correlation with digit-sequencing scores (ρ = 0.25, p = 0.033, Figure 3B). Second, we applied the model to the full sample of SCZ patients (N = 58) and controls (N = 60). We found that the digit-sequencing score correlated positively with the composite BACS score excluding working memory (r = 0.68, p = 2.0 × 10⁻¹⁷) and negatively with age (r = −0.36, p = 5.7 × 10⁻⁵), but not with head motion (r = −0.04, p = 0.68). While controlling for age and the composite BACS score, a partial correlation analysis showed that the model prediction was significantly correlated with the digit-sequencing score (ρ = 0.15, p = 0.048). Therefore, the model captures FC variations that are specific to working memory ability independently of age and the composite BACS score.

Furthermore, we examined whether the model predictions were correlated with the digit-sequencing score in controls alone. Before the model application, we found that the digit-sequencing scores were non-normally distributed (Lilliefors test, p = 0.001) and correlated positively with the composite BACS score excluding working memory (Spearman's rho = 0.52, p = 2.1 × 10⁻⁵), but not with age (Spearman's rho = −0.19, p = 0.15) or head motion (Spearman's rho = 0.02, p = 0.88). While controlling for the composite BACS score using a partial correlation analysis, the model predictions showed no significant correlation with digit-sequencing scores (Spearman's rho = −0.07, p = 0.60). This result is likely attributable to a ceiling effect in the BACS digit-sequencing score for controls (see Figure 3—figure supplement 3).

Prediction in four distinct psychiatric disorders

We addressed whether our model could quantitatively reproduce the degrees of working memory deficits across four psychiatric diagnoses: schizophrenia (SCZ), major depressive disorder (MDD), obsessive-compulsive disorder (OCD), and autism spectrum disorder (ASD). Their demographic data are summarized in Table 2. This dataset was collected by a Japanese neuropsychiatry consortium (Takagi et al., 2017; Yahata et al., 2016) (https://bicr.atr.jp/decnefpro/). Previous studies generally observed working memory impairment, in descending order of severity, in SCZ, MDD, OCD and ASD (Forbes et al., 2009; Lever et al., 2015; Snyder, 2014; Snyder et al., 2015). We predicted individual working memory ability by applying the prediction model to each participant’s resting state functional connectivity. Then, we compared the model predictions between patients and age- and gender-matched controls scanned at the same site to remove differences in scanner and imaging protocols between sites. Consequently, we identified a significant difference in the predicted working memory ability between patients and controls only for SCZ (two-tailed t-test for the SCZ group: t(116) = −3.68, p = (3.5 × 10⁻⁴) × 4 = 0.0014, Bonferroni corrected; Figure 4A). Next, we calculated individual patients’ Z-scores (normalized difference between a patient and the average of controls at the same site) of the predicted working memory ability for each diagnosis (Figure 4B). A one-way ANOVA revealed a significant main effect of diagnosis on the Z-score (F(3, 245) = 7.63, p = 6.8 × 10⁻⁵). The predicted impairment in SCZ patients was more severe than in all other diagnoses (post-hoc t-tests with Holm’s correction, adjusted p < 0.05).
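The two normalization steps in this paragraph can be illustrated with a short sketch. It assumes that the Z-score divides by the sample standard deviation of the same-site controls, which the text does not specify at this point; the function names and toy values are hypothetical.

```python
from statistics import mean, stdev


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


def z_scores_vs_controls(patient_scores, control_scores):
    """Normalize patient scores by the mean and SD of same-site controls.

    The sample SD (n - 1 denominator) is an assumption of this sketch.
    """
    mu, sd = mean(control_scores), stdev(control_scores)
    return [(score - mu) / sd for score in patient_scores]


# Correction reported for the SCZ comparison: four diagnoses were tested.
p_corrected = bonferroni(3.5e-4, 4)

# Toy predicted scores (illustrative values only).
z = z_scores_vs_controls([0.0, 2.0], [1.0, 2.0, 3.0])
```

Normalizing within site means that each patient is compared only with controls scanned on the same scanner and protocol, so site effects cancel before diagnoses are compared.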

Table 2
Demographic data of the multiple psychiatric diagnoses dataset.
https://doi.org/10.7554/eLife.38844.011
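| Diagnosis | Site | Measure | Patients | Controls | Test | P-value |
|---|---|---|---|---|---|---|
| SCZ | KYU | N | 58 | 60 | - | - |
| | | Age | 37.9 (9.3) | 35.2 (8.4) | t(116) = 1.7 | 0.1 |
| | | Male % | 52% | 67% | Fisher’s exact test | 0.13 |
| MDD | HRU | N | 77 | 63 | - | - |
| | | Age | 41.6 (11.2) | 39.3 (12.0) | t(138) = 1.3 | 0.21 |
| | | Male % | 56% | 46% | Fisher’s exact test | 0.31 |
| OCD | KPM | N | 46 | 47 | - | - |
| | | Age | 32.2 (9.9) | 30.3 (8.7) | t(91) = 1.1 | 0.28 |
| | | Male % | 37% | 45% | Fisher’s exact test | 0.53 |
| ASD | UTK (site1) | N | 33 | 33 | - | - |
| | | Age | 32.8 (8.4) | 34.7 (7.0) | t(64) = −1.0 | 0.3 |
| | | Male % | 64% | 55% | Fisher’s exact test | 0.62 |
| | SHU (site2) | N | 36 | 38 | - | - |
| | | Age | 29.9 (7.2) | 32.5 (7.4) | t(72) = −1.5 | 0.14 |
| | | Male % | 100% | 100% | Fisher’s exact test | 1 |
| | Pooled | N | 69 | 71 | - | - |
| | | Age | 31.3 (7.9) | 33.5 (7.2) | t(138) = −1.8 | 0.08 |
| | | Male % | 83% | 79% | Fisher’s exact test | 0.67 |

  1. Site: KYU, Kyoto University; HRU, Hiroshima University; KPM, Kyoto Prefectural University of Medicine; UTK, University of Tokyo; SHU, Showa University. Measure: ‘N’ indicates the number of subjects; ‘Age’ is shown as mean (SD); ‘Male %’ is the percentage of male participants. The tests and p-values compare the patient and control groups within each site.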

Figure 4 (with 1 supplement)
Prediction of diagnosis-specific alterations of working memory ability.

(A) Predicted letter 3-back working memory ability for patients (N = 58, 77, 45, and 69 for SCZ, MDD, OCD, and ASD, respectively) and their age- and gender-matched healthy/typically developed controls (HC, N = 60, 62, 47, and 71) shown as kernel density. For illustration purposes, distribution of each control group was standardized to that of the ATR dataset, and the same linear transformation was applied to patients’ distributions. μ indicates mean value for each group. (B) Violin plots of Z-scores for predicted working memory ability alterations. White circles indicate medians. Box limits indicate 25th and 75th percentiles. Whiskers extend 1.5 times interquartile range from 25th and 75th percentiles. (C) Comparison of estimated effect sizes for working memory deficits. k indicates number of studies included in the meta-analyses (Forbes et al., 2009; Snyder, 2014; Snyder et al., 2015). Error bars indicate 95% confidence intervals.

https://doi.org/10.7554/eLife.38844.009

The predicted working memory ability alteration was more negative in the order of SCZ, MDD, OCD, and ASD, with effect sizes (Hedges’ g) of −0.68, −0.29, −0.16, and 0.09, respectively. Meta-analyses of working memory ability measured by digit span tasks (digit-span score) (Forbes et al., 2009; Snyder, 2014; Snyder et al., 2015) provide a quantitative measure of working memory impairment for each diagnosis in terms of effect size. Red horizontal lines in Figure 4C indicate the confidence intervals of the effect sizes according to the meta-analyses. The effect sizes of the predicted working memory impairment fell within the confidence intervals for forward digit-span in SCZ, MDD, and OCD and for backward digit-span in OCD (Figure 4C). Therefore, the model capturing a normal range of variation in working memory ability reproduced not only the order but also the quantitative aspects of working memory deterioration across the diagnoses. Note that no meta-analysis was available for ASD. However, previous studies generally showed little difference in verbal working memory ability from typically developed controls (Koshino et al., 2005; Lever et al., 2015; Williams et al., 2005), consistent with the predictions of our model. Moreover, we examined whether these predicted effect sizes were more similar to those of working memory ability or those of general cognitive ability (IQ) reported in meta-analyses (Abramovitch et al., 2018; Ahern and Semkovska, 2017; Heinrichs and Zakzanis, 1998). As illustrated in Figure 4—figure supplement 1, the predicted effect size falls within the confidence interval of the IQ effect size only for first-episode MDD, while it falls within the confidence interval of the working memory (forward digit span) effect size for every diagnosis. Regarding the relative order of effect sizes, the effect sizes of IQ deficits can be ordered as SCZ, OCD, and MDD (first episode). In contrast, the effect sizes of working memory deficits (as measured by digit-span tasks) can be ordered as SCZ, MDD and OCD, which is consistent with the order predicted by our model. Therefore, the predicted working memory deficits were more similar to observed deficits in working memory than to those in fluid intelligence. We identified no significant differences in head motion between patients and their healthy controls in any diagnosis (two-tailed t-tests; the largest difference was observed in the ASD group, t(138) = 1.77, p > 0.080).
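For readers unfamiliar with the effect size used here, Hedges' g can be computed as sketched below. The snippet uses the common small-sample correction factor J = 1 − 3/(4(n₁ + n₂) − 9); whether the authors used exactly this approximation is not stated, so treat it as an assumption. The function name and the toy values are illustrative.

```python
from math import sqrt
from statistics import mean, variance


def hedges_g(patients, controls):
    """Standardized mean difference (patients - controls) with bias correction."""
    n1, n2 = len(patients), len(controls)
    pooled_sd = sqrt(((n1 - 1) * variance(patients) +
                      (n2 - 1) * variance(controls)) / (n1 + n2 - 2))
    d = (mean(patients) - mean(controls)) / pooled_sd  # Cohen's d
    correction = 1 - 3 / (4 * (n1 + n2) - 9)           # small-sample factor J
    return correction * d


# Toy predicted scores (illustrative values only).
g = hedges_g([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

A negative g indicates lower predicted working memory ability in patients than in controls, matching the sign convention of the values reported above.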

Functional connectivity patterns in psychiatric diagnoses

Given a common FC-WMA relationship across these diagnoses, we examined how diagnosis-dependent working memory impairment resulted from FC alteration patterns. Since the model is the weighted summation of 16 FC values, increased/decreased working memory ability is determined by the sum of the increased/decreased weighted-FC values of the connections. To investigate the effect of FC alteration at each connection on working memory impairment, for each individual patient, we examined the difference in weighted-FC value at each connection from the average of the corresponding controls. We call this difference the D-score (see Materials and methods). By averaging the D-scores within each diagnosis, Figure 5A shows how the accumulation of diagnosis-dependent D-scores resulted in difference in working memory impairment across the diagnoses. Some D-scores are relatively constant, while others are variable across diagnoses. For example, the D-score for P1 was commonly negative regardless of the diagnosis, while the D-score for N6 was negative or positive, dependent on the diagnosis (inset, Figure 5A). Therefore, Figure 5A qualitatively suggests that diagnosis-dependent working memory impairment is derived from complex FC alterations patterns.

Figure 5 (with 3 supplements)
Accumulation of functional connectivity differences exhibits diagnosis-specific working memory ability.

(A) Accumulation of averaged D-scores for all 16 connections. The bold black line indicates the summation of contributions by all connections, corresponding to the predicted working memory ability alteration. This figure shows how diagnosis-specific working memory impairment results from complex disturbances of multiple connections. The upper panel depicts two representative alteration patterns across diagnoses. While connection P1 commonly decreased working memory ability across diagnoses, connection N6 affected working memory ability differently depending on the diagnosis (decrease in SCZ and MDD and increase in OCD and ASD). (B) Z-scores (normalized D-scores) for each diagnosis. Left asterisks and lines indicate significant differences in mean Z-scores between two diagnoses (p < 0.05, Bonferroni corrected). Vertical lines across horizontal bars indicate Z-scores averaged across connections. (C) Z-scores for connections that showed a significant effect of diagnosis. Connections are sorted in ascending order of the p value of the diagnosis effect (Kruskal-Wallis test, Q < 0.05, FDR corrected).

https://doi.org/10.7554/eLife.38844.013

To compare the weighted FC values between the diagnoses, we calculated a standardized score (Z-score) by dividing the D-score by the standard deviation of the corresponding control group (see Materials and methods). We entered the Z-scores into a two-way ANOVA with diagnosis as a between-participant factor and connection as a within-participant factor. We found a significant main effect of diagnosis (p < 1.0 × 10⁻⁴; Figure 5B). SCZ patients showed significantly more negative mean Z-scores across connections than the other diagnoses (post-hoc diagnosis-pair-wise comparisons, p < 0.05). This suggests that the global pattern of the 16 working memory ability-related connections was more severely disrupted in SCZ than in the other three diagnoses.
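The D-score and Z-score definitions can be made concrete with a small sketch. It follows the description in the text (difference in weighted FC from the control average, divided by the control standard deviation) but assumes the sample standard deviation of the controls' weighted FC values; the exact procedure is given in Materials and methods, which is not reproduced here. Names and toy values are hypothetical.

```python
from statistics import mean, stdev


def d_and_z_scores(weights, patient_fc, controls_fc):
    """Per-connection D-scores and Z-scores for one patient.

    patient_fc: FC values of one patient, one per connection.
    controls_fc: list of FC vectors, one per same-site control.
    """
    d_scores, z_scores = [], []
    for i, w in enumerate(weights):
        control_terms = [w * fc[i] for fc in controls_fc]
        d = w * patient_fc[i] - mean(control_terms)  # weighted-FC difference
        d_scores.append(d)
        z_scores.append(d / stdev(control_terms))    # sample SD is an assumption
    return d_scores, z_scores


# Toy model with two connections (illustrative values only).
weights = [1.0, -2.0]
controls_fc = [[0.2, 0.1], [0.4, 0.2], [0.6, 0.3]]
patient_fc = [0.1, 0.5]
d_scores, z_scores = d_and_z_scores(weights, patient_fc, controls_fc)

# Because the model is linear, the D-scores sum to the difference between the
# patient's predicted score and the mean predicted score of the controls.
predicted_patient = sum(w * fc for w, fc in zip(weights, patient_fc))
predicted_controls = mean(sum(w * fc for w, fc in zip(weights, c))
                          for c in controls_fc)
```

This additivity is what allows Figure 5A to show the predicted alteration as an accumulation of per-connection D-scores.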

We found a significant interaction effect between diagnosis and connection in the Z-scores (p < 1.0 × 10⁻⁵), suggesting that FC alterations at particular connections are diagnosis-dependent. In seven connections (Figure 5C), the Z-scores were significantly different among the four diagnoses (Q < 0.05, false discovery rate (FDR) corrected), suggesting that these connections are differentially altered across the diagnoses. Conversely, no significant differences in the Z-scores were observed across diagnoses in the remaining nine connections (Figure 5—figure supplement 1).

Furthermore, we examined whether the working memory ability-related 16 connections were more consistently altered in patients relative to controls than connections excluded from the model. Specifically, we first quantified the effect of diagnostic labels on connectivity alterations for every connection by using chi-square values of a Kruskal-Wallis test (a non-parametric version of one-way ANOVA). Then, we tested if the distribution of the chi-square values was different between the model’s 16 connections and other 155 connections (Kolmogorov–Smirnov test). Consequently, we found no significant difference between the distributions (p = 0.30). This means that connectivity is altered at the working memory ability-related connections as well as the other connections (Figure 5—figure supplement 2).

To understand these results in terms of global brain networks, we grouped the 18 networks into seven clusters based on the hierarchical clustering of the networks performed in the BrainMap ICA study (Laird et al., 2011). They were named fronto-parietal, motor/visuospatial, emotion/interoception, audition/speech, visual, cerebellum and default-mode clusters (Supplementary file 2). We selected connections bridging different clusters. We then fixed a cluster and summed the D-scores (averaged across participants of each diagnosis) of the connections that have nodes (networks) in the cluster. This summation was repeated for every cluster. Figure 5—figure supplement 3 shows the summed D-score for each cluster and diagnosis. The four diagnoses commonly showed altered connectivity related to the fronto-parietal cluster. This confirmed the importance of the fronto-parietal networks across diagnoses. The motor/visuospatial, audition/speech, and visual clusters are associated with lower working memory in schizophrenia and MDD. This suggests that dysfunctions in motor and sensory systems are related to lower working memory in specific diagnoses. Note that we could not find any connections that have a node in the default-mode cluster, which therefore does not appear in the figure.

Discussion

We built a prediction model of working memory ability using data-driven analysis of whole-brain connectivity among healthy Japanese individuals. Our model predicted individual differences of working memory ability in SCZ patients. It also reproduced the order of working memory impairment for four distinct diagnoses (i.e., SCZ > MDD > OCD > ASD). Moreover, the magnitudes of reproduced impairment were consistent with previous meta-analyses. Our results provide the first evidence for a common whole-brain FC-WMA relationship across healthy populations and a range of psychiatric disorders. That is, our results support the idea that working memory impairment in psychiatric disorders is a continuous deviation from a normal pattern while preserving the common relationship between brain-wide connectivity and working memory ability. Our detailed examination suggested that the difference in degrees of the impairment across the diagnoses results from both common and diagnosis-specific connectivity changes within the common FC-WMA relationship.

Our model’s generalizability to completely independent datasets is supported by rigorous methods. Our simple model, which combines a small number (=18) of network nodes with sparse estimation of relevant FC, compensated for the weakness of a relatively small training sample size (=17). Also, our careful evaluation of working memory ability over 1500 trials, taking about 80–90 min, enhanced measurement precision by reducing trial-by-trial variability at the individual participant level (Smith and Little, 2018). In this way, our training dataset ensured that our model captured a low-complexity FC pattern essential for individual working memory ability. We note that larger sample sizes do not always improve prediction accuracy. Using methods identical to those of our model construction, we developed a new prediction model of visual-object N-back score using the HCP dataset as a training dataset (N = 474). However, this model failed to provide significant prediction even within the training samples (R² = 0.005). This seemingly unexpected result may partly result from differences in the way individuals’ working memory ability was evaluated. The HCP conducted the N-back task in a limited time (160 trials for each individual), which may be too noisy to precisely characterize individual ability. On the other hand, Rosenberg et al. built a model with careful examination of cognitive performance (more than 40 min of attention task), using a modest number (N = 25) of training samples, and demonstrated robust generalization to independent test sets (Rosenberg et al., 2016). Our model’s accuracy was comparable with that of their model of attention ability for an external test set (r ~ 0.3).

We carefully excluded the spurious correlations (Siegel et al., 2016; Whelan and Garavan, 2014). We examined general intellectual/cognitive ability, age, and head motion and confirmed that these disturbance variables had a minimal effect on prediction. Moreover, we analyzed age- and gender-matched controls from the same sites and compared the alterations from the controls (Z-scores), thereby minimizing the false positives that could be derived from age, gender, or imaging sites/parameters.

Our proposed two-stage approach, which builds a normative model and applies it to multiple diagnoses, is an effective technique to systematically compare neural substrates across multiple diagnoses. Clinical measures of attention deficit hyperactivity disorder were previously predicted by FC patterns that determine attention ability in healthy populations (Rosenberg et al., 2016), suggesting common connectivity-cognition relationships across healthy and clinical populations. By extending this approach, we directly examined working memory ability of the patients in cognitive tasks rather than assessments of clinical symptoms based on subjective report or behavioral observation. We also tested the model across not only a single diagnosis but also multiple diagnoses.

By coherently establishing FC-cognition relationships from normal to abnormal, our two-stage approach could potentially cluster multiple psychiatric disorders based on neurobiological measures and behaviors (Insel et al., 2010). Such neurobiological insights into behavioral abnormality are consistent with recent transdiagnostic studies of genomics (Cross-Disorder Group of the Psychiatric Genomics Consortium, 2013; O'Donovan and Owen, 2016; Plomin et al., 2009) and neuroimaging (Clementz et al., 2016; Goodkind et al., 2015; Sheffield et al., 2017), which indicate that some neurobiological changes are shared across psychiatric diagnoses. Consistent with our results, recent studies provide evidence that models of attention and memory generalize to predict behavior in patient populations as well as in healthy populations (Kessler et al., 2016; Lin et al., 2018; O'Halloran et al., 2018; Rosenberg et al., 2016).

Our results identified alterations in large-scale network clusters that correlated with working memory impairment (Figure 5—figure supplement 3). First, the four diagnoses commonly showed altered connections related to the fronto-parietal networks. This finding supports the hypothesis that the executive control network regulates symptoms, and that its dysregulation is a shared neural substrate across diagnostic categories (Cole et al., 2014; Smucny et al., 2018). Second, visual and auditory networks were associated with lower working memory ability in schizophrenia and MDD. These two results are consistent with the hypothesis that cognitive function is disrupted regarding not only top-down executive control but also bottom-up sensory processes (Javitt, 2009). Recently, a neurophysiological study has suggested contributions of motor and premotor neurons to encoding serial order of working memory (Carpenter et al., 2018). This is consistent with our result that alterations of connections related to the motor/visuospatial networks were associated with lower working memory ability in schizophrenia, MDD, and ASD.

The test-retest reliability of our functional connectivity estimation methods was at a fair level (0.2 < ICC ≤ 0.4) according to the interpretation criteria for ICC (Landis and Koch, 1977). A previous study on test-retest reliability of functional connectivity between 18 different brain regions (Birn et al., 2013) reported similar ICC values (ICC ~0.2) when scan length was 6 to 15 min. Another study that examined functional connectivity reliability using 268 regions from the whole brain (Noble et al., 2017) also reported that 6 min of scan length yielded similar reliability (dependability coefficient ~0.2 to 0.4). Although our connectivity estimation methods do not reach the clinically required standard (ICC > 0.8), these studies suggest that the test-retest reliability of our methods is comparable to that of other common connectivity estimation methods.

Regarding the large contribution of the left fronto-parietal network (FPN) to our prediction model in comparison to the right FPN, the BrainMap ICA on which our network definition is based gives us useful information. The BrainMap ICA paper (Laird et al., 2011) reported that IC18 (left FPN) has greater functional relevance to working memory than IC15 (right FPN) based on meta-analyses of thousands of publications. Moreover, a meta-analysis of N-back tasks with different stimulus modalities (Owen et al., 2005) found that monitoring of verbal stimuli was strongly associated with the left ventrolateral prefrontal cortex (a part of the left FPN), while monitoring of spatial locations activated right-lateralized frontal and parietal regions. In the current study, we used a letter 3-back task that requires encoding alphabet letters, which is more related to word monitoring than to location monitoring. Therefore, the left FPN would be expected to contribute more to our prediction model than the right FPN. Although a large portion of the model relies on the within-network connectivity of the left FPN (~34% contribution), the right FPN also showed a substantial contribution to working memory via negative connectivity N2 (connection with the midbrain network, see Figure 2) and N3 (connection with the superior parietal network) (~20% contribution).

The primary limitation of this study is the assumption that our model captures a general capability of working memory that is not restricted to letter 3-back performance. Working memory is an umbrella term that involves multiple distinct sensory modalities and executive functions, and the empirical findings and theoretical conceptualization are still rapidly expanding (Chatham and Badre, 2015; Cogan et al., 2017; D'Esposito and Postle, 2015; Ma et al., 2014; Myers et al., 2017; Serences, 2016). Rather than focusing on a single specific domain, we utilized several domains of working memory performance (letter N-back, visual object N-back, digit-sequencing task, and digit span). Future work may reveal more elaborate findings for FC-WMA relationships based on a more nuanced definition of WMA, since distinct types of working memory tasks engage specific neural processes (Nee et al., 2013; Owen et al., 2005). Second, working memory performance was measured only in schizophrenia patients and not in the other diagnostic groups. Therefore, it was impossible to compare their predicted working memory ability with measured scores. This presents challenges for the between-group comparisons in the patient samples. Third, although the participants were matched on age and sex within each site, the groups may differ along a number of dimensions beyond working memory (e.g. medication status, scanning protocol, and potentially IQ and other cognitive abilities). It is difficult to fully control every dimension, and little is known about how such dimensions affect the estimation of functional connectivity. Fourth, the results in the HCP dataset showed that only a little variance can be explained by our model. This may be attributed to considerable differences between the HCP dataset and the ATR dataset. The major differences include population location (American vs. Japanese) and working memory task properties that contrast in sensory modality (visual object vs. verbal), number of observations made for each individual (160 vs. 1500 trials), difficulty level (0-back and 2-back vs. 3-back), and measurement environment (in vs. out of MRI scanner). Finally, in the HCP dataset, the model predictions were not more closely related to 2-back than to 0-back performance. This result suggests that the model may capture abilities beyond working memory.

In conclusion, our data provide a unified working memory ability framework across healthy populations and multiple psychiatric disorders. Our whole-brain functional connectivity model quantitatively predicted individual working memory ability in independently collected cohorts of healthy populations and patients with any of four psychiatric diagnoses (N = 965). Our results suggest that the FC-WMA relationship identified in healthy populations is commonly preserved in these psychiatric diagnoses and that working memory impairment in a range of psychiatric disorders can be explained by the cumulative effect of multiple disturbances in connectivity among distributed brain networks. Our findings lay the groundwork for future research to develop a quantitative, brain-wide-connectivity-based prediction model of human cognition that spans health and psychiatric disease.

Materials and methods

ATR dataset

This dataset is the final set of participants from our previous study (Yamashita et al., 2015) after excluding individuals who exhibited noisy data. Here, we used this dataset to construct a normative prediction model of working memory ability, defined as the learning plateau of a letter 3-back task (N = 17, age 19–24 years old, 11 males). Recent studies on predictive modeling of performance in a single specific task using fMRI connectivity have reported comparable sample sizes (Baldassarre et al., 2012; Rosenberg et al., 2016).

Working memory assessment

The participants performed a letter 3-back task (Figure 1A) over 25 sessions of training, with 60 trials for each session (1500 trials in total, taking about 80–90 min). We obtained an individual learning curve by calculating the d-prime for each session and smoothing the data points with a five-session moving average (Figure 2—figure supplement 1). The individual learning curve was fitted by an inverse curve (y = a – b/x), where y is the d-prime in the x-th session, and a and b are the parameters for the learning plateau and learning speed, respectively. We used the estimated learning plateau (a) as a measure of individual working memory ability (letter 3-back WMA). More detailed information is provided in our previous paper (Yamashita et al., 2015).
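The learning-plateau estimate described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the log-linear correction in `d_prime`, and the edge handling of the moving average are assumptions made for illustration. Because y = a – b/x is linear in a and b once 1/x is used as the regressor, an ordinary least-squares fit is enough.

```python
from statistics import NormalDist, mean


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Adding 0.5 to each count keeps the rates away from 0 and 1.
    This correction is an assumption, not taken from the paper.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


def moving_average(values, window=5):
    """Centered moving average; the window shrinks at the edges (assumed)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(mean(values[lo:hi]))
    return smoothed


def fit_inverse_curve(d_primes):
    """Least-squares fit of y = a - b/x for sessions x = 1..n.

    Returns (a, b): a is the learning plateau, b the learning speed.
    """
    u = [1.0 / (i + 1) for i in range(len(d_primes))]  # regressor 1/x
    u_bar, y_bar = mean(u), mean(d_primes)
    sxy = sum((ui - u_bar) * (yi - y_bar) for ui, yi in zip(u, d_primes))
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    slope = sxy / sxx  # slope of y on 1/x equals -b
    a = y_bar - slope * u_bar
    return a, -slope
```

In this sketch, one would compute `d_prime` per session, pass the 25 values through `moving_average`, and take the first value returned by `fit_inverse_curve` as the plateau.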

Functional connectivity estimation

We recorded an rs-fMRI scan with 3 × 3 × 3.5 mm spatial resolution and a temporal resolution of 2.0 s for each participant (5 min 4 s). After removing the first two volumes, the data were preprocessed with slice timing correction, motion correction, and spatial smoothing with an isotropic Gaussian kernel (full width at half maximum = 8 mm). To remove several sources of spurious variance, we regressed out six motion parameters and the averaged signals over gray matter, white matter, and cerebrospinal fluid (Fox et al., 2005). The gray matter signal regression improves FC estimation by effectively removing motion-related artifacts (Burgess et al., 2016; Ciric et al., 2017; Power et al., 2014). Finally, we performed ‘scrubbing’ (Power et al., 2012), in which we removed scans where framewise displacement was > 0.5 mm.
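The scrubbing step can be sketched as follows, using the common definition of framewise displacement as the sum of absolute volume-to-volume changes in the six realignment parameters, with rotations converted to millimeters on a sphere. The 50 mm radius, the parameter ordering, and the function names are assumptions for illustration; the paper only specifies the 0.5 mm threshold.

```python
def framewise_displacement(motion, radius_mm=50.0):
    """Framewise displacement (mm) for each volume.

    motion: list of [tx, ty, tz, rx, ry, rz] per volume, with translations
    in mm and rotations in radians (assumed ordering and units).
    The first volume has no predecessor and is assigned 0.
    """
    fd = [0.0]
    for prev, cur in zip(motion, motion[1:]):
        translation = sum(abs(c - p) for c, p in zip(cur[:3], prev[:3]))
        # Rotational change converted to arc length on a sphere of radius_mm.
        rotation = sum(abs(c - p) * radius_mm for c, p in zip(cur[3:], prev[3:]))
        fd.append(translation + rotation)
    return fd


def scrub(volumes, fd, threshold_mm=0.5):
    """Keep only volumes whose framewise displacement is at most the threshold."""
    return [v for v, d in zip(volumes, fd) if d <= threshold_mm]
```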

We used network-level rather than node-level connectivity features to avoid overfitting to training samples (the curse of dimensionality). Specifically, we calculated FC values, based on the 18 whole-brain intrinsic networks of BrainMap ICA (Laird et al., 2011), for pair-wise between-network (18 × 17/2 = 153) connections and within-network connections. Between-network FC was calculated as Pearson’s correlation between blood-oxygen-level dependent signal time courses averaged across voxels within each network. Within-network FC was calculated as mean voxel-wise correlations within each of 18 networks.
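A minimal sketch of the two FC definitions above, written in plain Python for clarity rather than speed. The data layout (each network as a list of voxel time courses) and the function names are hypothetical. With 18 networks, the between-network part yields 18 × 17/2 = 153 values and the within-network part 18 values, giving 171 features in total.

```python
import math
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def network_mean_series(voxel_series):
    """Average the voxel time courses of one network into a single time course."""
    return [mean(samples) for samples in zip(*voxel_series)]


def between_network_fc(networks):
    """Correlation of network-averaged time courses for every network pair."""
    series = [network_mean_series(voxels) for voxels in networks]
    return {
        (i, j): pearson(series[i], series[j])
        for i, j in combinations(range(len(series)), 2)
    }


def within_network_fc(voxel_series):
    """Mean of all pairwise voxel-wise correlations within one network."""
    return mean(pearson(a, b) for a, b in combinations(voxel_series, 2))
```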

Test-retest reliability of functional connectivity estimation

To examine the test-retest reliability of the functional connectivity estimation method, we calculated the intra-class correlation (ICC) using three different datasets from the Consortium for Reliability and Reproducibility (Zuo et al., 2014). We selected the following three datasets: Beijing Normal University (BNU 1), Institute of Automation, Chinese Academy of Sciences (IACAS 1), and University of Utah (Utah 1). We selected these datasets because 1) they have test-retest data across fMRI sessions, 2) the ages of participants are comparable with those in our discovery dataset that was used for the construction of our model (ATR dataset), and 3) two datasets include Asian participants (participants in the ATR dataset are Japanese). BNU 1 includes data from 57 healthy young volunteers (age 19–30 years, 30 males) who completed two MRI scan sessions within an interval of approximately 6 weeks (33–50 days, mean 40.94 days). All were right-handed and had no history of neurological or psychiatric disorders. The resting state fMRI data were collected for 6 min 46 s. Detailed information is available for BNU 1 at http://fcon_1000.projects.nitrc.org/indi/CoRR/html/bnu_1.html. Seven participants (‘BNU25914’ to ‘BNU25920’) had incomplete data, and thus their data were not used for the analysis. IACAS 1 includes data from 28 healthy young volunteers (age 19–43 years, 13 males) who completed two MRI scan sessions within an interval of approximately 6 weeks (20–343 days, mean 75.2 days). The resting state fMRI data were collected for 8 min. Detailed information is available for IACAS 1 at http://fcon_1000.projects.nitrc.org/indi/CoRR/html/iacas_1.html. Utah 1 includes 26 healthy young volunteers (age 8–39 years; 26 males) who completed two MRI scan sessions at least two years apart (733–1,187 days, mean 928.4 days). The resting state fMRI data were collected for 8 min 4 s. Detailed information is available for Utah 1 at http://fcon_1000.projects.nitrc.org/indi/CoRR/html/utah_1.html.
To estimate functional connectivity, we used the same analysis pipeline. We obtained 171 (153 between-network and 18 within-network) functional connectivity values for each participant/session (test or retest). To estimate test-retest reliability, intra-class correlation (ICC) was calculated for each of the 171 functional connectivity values (univariate test-retest reliability). ICC was calculated by the following equation:

$$\mathrm{ICC} = \frac{MS_b - MS_w}{MS_b + (k - 1)\,MS_w}$$

where MS_b is the between-subjects mean squared error, MS_w is the within-subjects mean squared error, and k is the number of independent fMRI measures (i.e. k = 2 for test and retest). We set negative ICC values to zero, as done in previous studies (e.g. Zhang et al., 2011).
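As a rough illustration of this calculation for a single connection, the sketch below computes the between- and within-subject mean squares from a one-way ANOVA decomposition and clips negative values to zero. The one-way decomposition and the function name `icc_oneway` are assumptions; the text specifies only the ratio itself.

```python
from statistics import mean


def icc_oneway(scores):
    """ICC = (MSb - MSw) / (MSb + (k - 1) * MSw), clipped at zero.

    scores: one list per subject, each holding k repeated FC values
    for the same connection (k = 2 for test and retest).
    """
    n, k = len(scores), len(scores[0])
    grand_mean = mean(v for row in scores for v in row)
    subject_means = [mean(row) for row in scores]
    # Between-subjects mean square: spread of subject means around the grand mean.
    ms_b = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subjects mean square: spread of repeated measures around each subject mean.
    ms_w = sum(
        (v - m) ** 2 for row, m in zip(scores, subject_means) for v in row
    ) / (n * (k - 1))
    icc = (ms_b - ms_w) / (ms_b + (k - 1) * ms_w)
    return max(icc, 0.0)  # negative estimates are set to zero
```

Applying this function to each of the 171 connections in turn would give the univariate reliability values described above.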

Developing prediction model

To predict individual learning plateaus in the letter 3-back task, we performed a sparse linear regression analysis (Sato, 2001) on the whole-brain FC values (http://www.cns.atr.jp/cbi/sparse_estimation/sato/VBSR.html). Individual working memory ability was modeled as a linear weighted summation of FC values at a small number of connections among the intrinsic networks. The connections were automatically selected by the sparse linear regression algorithm. In our previous study (Yamashita et al., 2015), we employed a leave-one-out cross-validation to estimate the prediction accuracy, and the analysis achieved high prediction accuracy within this dataset (R² = 0.73). To build a single prediction model, we utilized all the data (N = 17) as the training set.

Human connectome project (HCP) dataset

The dataset was collected in the HCP and shared as 500 Subjects Release (Van Essen et al., 2013). We restricted our analysis to participants for whom all rs-fMRI, visual-object N-back, the NIH Toolbox list-sorting test (Tulsky et al., 2014), and Raven’s progressive matrices with 24 items (Bilker et al., 2012) were available (N = 474; 194 males, 5 year age ranges in the Open Access Data: 22–25, 26–30, 31–35 and 36 + years old).

Working memory assessment

Individual working memory performance was briefly measured by the visual-object N-back with 0-back and 2-back conditions (visual-object N-back score) and the list-sorting test (Figure 1B). The N-back task was performed in two fMRI runs, and each run contains eight task blocks of 10 trials (80 trials for each 0-back and 2-back condition). The scores were evaluated by the accuracy percentage of 2-back and 0-back conditions (86.0 ± 9.5% (SD), range 45.8% to 100%). The other working memory measure, the list-sorting test, is a sequencing task of visual or auditory stimuli (mean scores: 110.5 ± 11.6 (SD), range 80.8 to 144.5). Additionally, general fluid intelligence was assessed by Raven’s progressive matrices. The scores are integers that indicate the number of correct items (16.5 ± 4.8 (SD) from 4 to 24).

Examination of model prediction

We used rs-fMRI data (2 mm isotropic spatial resolution and a temporal resolution of 0.72 s) that were pre-processed and denoised by a machine learning tool that removes structured noise and moment-to-moment motion parameters (Salimi-Khorshidi et al., 2014). Additionally, spatial smoothing (full width at half maximum = 4 mm) and nuisance regression were performed, the latter using average signals of gray matter, white matter, and cerebrospinal fluid. To extract slow oscillations and remove high-frequency noise (e.g. cardiac pulsation around 0.3 Hz), a band-pass filter (0.009–0.08 Hz) was applied, and volumes with framewise displacement > 0.5 mm were removed. After preprocessing of fMRI data, FC was estimated following the procedures described for the ATR dataset. We entered FC values into the prediction model developed from the ATR dataset and predicted individual working memory ability. Because the scores of the visual-object N-back task and list-sorting test showed non-normal distributions, we used a nonparametric statistical test (Spearman’s rank correlation) to examine the model prediction accuracy. We compared the rank correlation coefficient between the predicted working memory ability and actual working memory scores with the null distribution obtained by shuffling participant labels (10,000 permutations). Links between behavioral scores and motion measures were preserved. Specifically, the subject labels were shuffled for predicted working memory ability while the subject labels were preserved for confounding factors (e.g. age, fluid intelligence, and head motion).
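The permutation test can be sketched in a few lines of Python. This simplified version handles only the plain rank correlation (no confounding factors), uses a one-sided p value, and adds one to the numerator and denominator; these choices and all function names are assumptions for illustration rather than the authors' exact procedure.

```python
import math
import random
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for idx in order[i:j + 1]:
            result[idx] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_p_value(predicted, observed, n_perm=10000, seed=0):
    """Observed rho and one-sided p value from shuffling the predicted scores."""
    rng = random.Random(seed)
    rho = spearman(predicted, observed)
    shuffled = list(predicted)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the link between prediction and participant
        if spearman(shuffled, observed) >= rho:
            count += 1
    return rho, (count + 1) / (n_perm + 1)
```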

Multiple psychiatric diagnoses dataset

These data were collected at a Japanese neuropsychiatry consortium (Takagi et al., 2017; Yahata et al., 2016; Yahata et al., 2017; Yamada et al., 2017; Yamashita et al., 2018) (https://bicr-resource.atr.jp/decnefpro/). Resting state fMRI analysis was performed in the same way as described for the ATR dataset. We performed slice timing correction and then motion estimation. The estimated motion parameters were used to identify volumes with excessive motion (frame-wise displacement > 0.5 mm). We did not remove the frames before or after a volume with excessive motion. We conducted quality control for the rs-fMRI data and excluded participants if more than 40% of their volumes were removed by the scrubbing method. We calculated the ratio of excluded volumes to the total number of volumes for each subject, and averaged it within patients or controls for each diagnosis. The ratios were 2.3 ± 5.5% / 1.4 ± 2.9% (patients/controls) for SCZ, 2.7 ± 6.6% / 2.4 ± 6.3% for MDD, 0.4 ± 0.8% / 0.7 ± 1.7% for OCD, and 1.4 ± 3.8% / 4.6 ± 8.5% for ASD. We found a significant difference in the ratio between patients and controls only for ASD (t(97.3) = 2.91, p = 4.4 × 10⁻³). We detected outliers within each group (defined as values > 3 SD from the mean) for a control participant of MDD and a patient with OCD (N = 1, 1, respectively). These two participants were excluded from further analysis. After their data quality was assured, age- and gender-matched healthy control subjects were included in the analysis. Consequently, we used the rs-fMRI data of patients with SCZ, MDD, OCD, and ASD (N = 58, 77, 46, and 69, respectively) as well as their age- and gender-matched healthy/typically developed controls (N = 60, 63, 47, and 71). These sample sizes were comparable to or even larger than those of a recent generalization test analysis (Rosenberg et al., 2016). Demographic data are summarized in Table 2. Scanning parameters are reported in Supplementary file 1.

Participants with SCZ

Patients with SCZ diagnosed with the patient edition of the Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) Axis I Disorders (SCID) (First et al., 1995) were recruited from in- and out-patients facilities in the Kansai region, Japan. The controls had no history of psychiatric illness, as screened with the non-patient edition of the SCID (First et al., 2002), and it was confirmed that their first-degree relatives had no history of psychotic disorders. Exclusion criteria for all individuals included a history of head trauma, neurological illness, serious medical or surgical illness, and substance abuse. All participants were physically healthy when they undertook the scanning. All the patients with SCZ had received antipsychotic medication. The mean ± SD values of medications based on chlorpromazine equivalents were 608 ± 459 mg/day. Other medications that the patients received were as follows; antiparkinsonism drugs (N = 23), anxiolytics and sleep inducing drugs (N = 39). The study design was approved by the Committee on Medical Ethics of Kyoto University and was conducted in accordance with the Code of Ethics of the World Medical Association. After being given a complete description of the study, all participants gave written informed consent.

Participants with MDD

Patients with MDD were recruited from local clinics in Hiroshima, Japan, and all the patients were screened with the DSM-IV criteria for a unipolar MDD diagnosis, using the mini-international neuropsychiatric interview (M.I.N.I.) (Otsubo et al., 2005; Sheehan et al., 1998). No patient had current or past SCZ episodes. Healthy participants were recruited from the local community. They were interviewed with the M.I.N.I. and none showed a history of psychiatric disorders according to DSM-IV criteria. At the time of scanning, six MDD individuals were medication free, and the rest of MDD individuals had been administered the following psychotropic drugs: antidepressants (N = 70), antipsychotics (N = 7), antiepileptics (N = 7), anxiolytics (N = 15), and sleep inducing drugs (N = 26), before the scanning. Around half of participants had been administered multiple drugs (N = 36). The current study protocol was approved by the Ethics Committee of Hiroshima University. Prior to the administration of any experimental procedure, written informed consent was obtained from all the participants.

Participants with OCD

Patients were recruited at the Kyoto Prefectural University of Medicine Hospital, Kyoto, Japan. All patients were primarily diagnosed with OCD using the Structured Clinical Interview for DSM-IV Axis I Disorders-Patient Edition (SCID) (First et al., 1995). Exclusion criteria were 1) cardiac pacemaker or other metallic implants or artifacts; 2) significant disease, including neurological diseases, disorders of the pulmonary, cardiac, renal, hepatic, or endocrine systems, or metabolic disorders; 3) prior psychosurgery; 4) DSM-IV diagnosis of mental retardation and pervasive developmental disorders based on a clinical interview and psychosocial history; and 5) pregnancy. We excluded, as far as possible, patients with a current DSM-IV Axis I diagnosis of any significant psychiatric illness other than OCD; only four patients with trichotillomania, one patient with tic disorder, and one patient with tic disorder and specific phobia were included as patients with comorbidity. Thirty-five OCD individuals and 34 healthy controls were also included in a published paper (Abe et al., 2015).

There was no history of psychiatric illness in the healthy controls as determined by the Structured Clinical Interview for DSM-IV Axis I Disorders, Non-patient Edition (SCID-NP) (First et al., 2002). Additionally, we confirmed that there was no psychiatric treatment history in any of their first-degree relatives. At the time of scanning, 40 OCD individuals were medication free, whereas the remaining five OCD individuals had been administered the following psychotropic drugs: anxiolytics (N = 2), antidepressants (N = 5), before the scanning. Some participants had been administered multiple drugs (N = 2). The Medical Committee on Human Studies at the Kyoto Prefectural University of Medicine approved all the procedures in this study. All participants gave written, informed consent after receiving a complete description of the study.

Participants with ASD (site 1)

Patients with ASD were recruited through the Department of Child Psychiatry and Neuropsychiatry at the University of Tokyo Hospital and via an advertisement on the website of the University of Tokyo Hospital. All ASD participants (N = 33) were diagnosed with pervasive developmental disorder based on the DSM-IV-TR criteria (Association, 2000). DSM-IV-TR diagnoses of autistic disorder, Asperger’s disorder, or pervasive developmental disorder not otherwise specified (N = 22, N = 3, and N = 8, respectively) were supported by Autism Diagnostic Observation Schedule (Lord et al., 1994) (N = 33) and Autism Diagnostic Interview-Revised (Catherine Lord, Rutter, & Le Couteur, 1994) (N = 25). The Japanese version of M.I.N.I. (Otsubo et al., 2005; Sheehan et al., 1998) was used to evaluate psychiatric comorbidity. No participant satisfied the diagnostic criteria for substance use disorder, bipolar disorder, or SCZ. The intelligence quotient (IQ) scores of participants with ASD were obtained using the Wechsler adult intelligence scale-revised (WAIS-R) or third edition (WAIS-III). The full-scale IQs of all of the individuals with ASD were measured and found to be greater than 85. Typically-developed individuals were recruited from the local community. M.I.N.I. was used to confirm that none of the typically developed individuals met the diagnostic criteria for any psychiatric disorder. The IQs of the typically developed individuals were estimated using the Japanese version of the national adult reading test (Matsuoka et al., 2006). All participants were right-handers according to the Edinburgh Handedness Inventory (Oldfield, 1971). They completed the Japanese version of the autism-spectrum quotient (Wakabayashi et al., 2007). 
At the time of scanning, 10 ASD individuals were medication free, whereas the remaining 23 ASD individuals had been administered the following psychotropic drugs: anxiolytics (N = 17), antidepressants (N = 19), antipsychotics (N = 15), antiepileptics (N = 5), and sleep inducing drugs (N = 17), before the scanning. Some participants had been administered multiple drugs (N = 19). All participants provided written informed consent as approved by The Ethics Committee of the Graduate School of Medicine and Faculty of Medicine at the University of Tokyo.

Participants with ASD (site 2)

Patients with ASD were recruited from outpatient units of the Karasuyama Hospital, Tokyo, Japan. A team of three experienced psychiatrists and a clinical psychologist assessed all patients. All patients were diagnosed with ASD based on the criteria of the DSM-IV (American Psychiatric Association, 2000) and a medical chart review. The assessment consisted of participant interviews about developmental history, present illness, life history, and family history and was performed independently by a psychiatrist and a clinical psychologist in the team. Patients were also asked to bring suitable informants who had known them in early childhood. At the end of the interview, the patients were formally diagnosed with a pervasive developmental disorder by the psychiatrist if there was a consensus between the psychiatrist and clinical psychologist; this process required approximately three hours. The group of typically developed individuals was recruited through advertisements and acquaintances. None of the typically developed individuals reported any severe medical problem or any neurological or psychiatric history. None of them satisfied the diagnostic criteria for any psychiatric disorder. The IQ scores of all participants with ASD were evaluated using either the WAIS-III or the WAIS-R, while those of typically developed individuals were estimated using the Japanese version of the National Adult Reading Test (Matsuoka et al., 2006). Every participant with ASD was considered to be high functioning, because his or her full-scale IQ score was higher than 80. Participants completed the Japanese version of the Autism-Spectrum Quotient (Wakabayashi et al., 2007). At the time of scanning, 25 ASD individuals were medication free, whereas the remaining 11 ASD individuals were administered the following psychotropic drugs: anxiolytics (N = 4), antidepressants (N = 6), antipsychotics (N = 6), antiepileptics (N = 2), and sleep-inducing drugs (N = 8). Some participants were administered multiple drugs (N = 7).

The Ethics Committee of the Faculty of Medicine of Showa University approved all the procedures used in this study, including the method of obtaining consent, in accordance with the Declaration of Helsinki. Written informed consent was obtained from all the participants after fully explaining the purpose of this study. Neither the ethics committee nor the patients’ primary doctors voiced any concern regarding a possible reduced capacity of participants to consent on their own.

Working memory assessment

The SCZ patients and their healthy controls underwent the Japanese version of the Brief Assessment of Cognition in Schizophrenia (BACS-J) (Kaneda et al., 2007). This cognitive battery is composed of six subtests, including a digit sequencing test as a working memory measure. In this test, auditory sequences of numbers were presented, with length increasing from three to nine digits (Figure 1B). Participants repeated each sequence aloud, sorted in ascending order. The digit-sequencing score was the number of correct trials out of 28 (patients: 18.4 ± 4.1 (mean ± SD), range 10 to 27; controls: 22.9 ± 4.3, range 12 to 28). The composite BACS score was calculated as the average score of the five BACS subtests other than the digit sequencing test.

Examination of model prediction (SCZ patients and controls)

After preprocessing of the fMRI data, FC was estimated following the procedures described for the ATR dataset. We entered the FC values into the normative model and predicted each individual’s working memory ability. Across individual SCZ patients and controls, we performed a partial correlation analysis between the predicted and the actual working memory performance while factoring out age and the composite BACS score excluding working memory (see above). We examined the statistical significance of the prediction accuracy by permutation tests, as described for the HCP dataset.
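The partial correlation and permutation test described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the covariates are removed by least-squares residualization, and the permutation scheme (shuffling the residualized predicted scores) is an assumption, since the exact procedure is described with the HCP dataset rather than here. All data are synthetic, and the function names are hypothetical.

```python
import random
from statistics import mean

def center(v):
    m = mean(v)
    return [a - m for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residualize(y, covariates):
    """Remove the mean and the linear effect of the covariates from y
    by projecting onto an orthogonalized (Gram-Schmidt) covariate basis."""
    basis = []
    for c in covariates:
        c = center(c)
        for b in basis:
            k = dot(c, b) / dot(b, b)
            c = [ci - k * bi for ci, bi in zip(c, b)]
        basis.append(c)
    r = center(y)
    for b in basis:
        k = dot(r, b) / dot(b, b)
        r = [ri - k * bi for ri, bi in zip(r, b)]
    return r

def corr(u, v):
    """Pearson correlation coefficient."""
    u, v = center(u), center(v)
    return dot(u, v) / (dot(u, u) * dot(v, v)) ** 0.5

def partial_corr(x, y, covariates):
    """Correlation between x and y after factoring out the covariates."""
    return corr(residualize(x, covariates), residualize(y, covariates))

def permutation_p(x, y, covariates, n_perm=2000, seed=0):
    """One-sided permutation p value for the partial correlation
    (assumed scheme: shuffle the residualized x against the residualized y)."""
    rng = random.Random(seed)
    rx, ry = residualize(x, covariates), residualize(y, covariates)
    observed = corr(rx, ry)
    hits = 0
    for _ in range(n_perm):
        shuffled = rx[:]
        rng.shuffle(shuffled)
        if corr(shuffled, ry) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Synthetic data standing in for age, composite score, measured and predicted ability.
rng = random.Random(1)
n = 58
age = [rng.uniform(20, 60) for _ in range(n)]
composite = [rng.gauss(0, 1) for _ in range(n)]
measured = [0.5 * c + rng.gauss(0, 1) for c in composite]
predicted = [0.8 * m + rng.gauss(0, 0.3) for m in measured]

r, p = permutation_p(predicted, measured, [age, composite])
print(f"partial r = {r:.3f}, permutation p = {p:.4f}")
```

Adding one to both the numerator and the denominator of the p value keeps it strictly positive, which is a common convention for permutation tests.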

Examination of model prediction (multiple diagnoses)

After predicting working memory ability for each patient, we investigated working memory impairments in each of the four diagnoses. Specifically, each patient’s predicted letter 3-back working memory ability was expressed as a Z-score standardized to the age- and gender-matched controls collected at the same site. After confirming homoscedasticity (Bartlett’s test, p = 0.17), the standardized differences in predicted working memory ability were entered into a one-way ANOVA with diagnosis as a between-participant factor. Post-hoc pair-wise comparisons were corrected using Holm’s method.
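Two steps of this procedure are simple enough to sketch directly: standardizing patients' predicted scores against same-site controls, and Holm's step-down correction of the pair-wise p values. The sketch below uses only the standard library; the use of the sample standard deviation (n - 1 denominator), the function names, and the example numbers are assumptions made for illustration.

```python
from statistics import mean, stdev

def z_vs_controls(patient_scores, control_scores):
    """Standardize patients' predicted scores to the mean and (sample) SD
    of the matched controls from the same site."""
    mu, sd = mean(control_scores), stdev(control_scores)
    return [(s - mu) / sd for s in patient_scores]

def holm_adjust(p_values):
    """Holm step-down adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on;
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

# Hypothetical predicted scores for one site.
controls = [10.0, 12.0, 14.0]
patients = [8.0, 11.0]
print(z_vs_controls(patients, controls))

# Hypothetical uncorrected p values from three pair-wise comparisons.
print(holm_adjust([0.01, 0.04, 0.03]))
```

Bartlett's test and the one-way ANOVA themselves are available in SciPy as `scipy.stats.bartlett` and `scipy.stats.f_oneway`; they are left out here so that the sketch has no third-party dependencies.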

Comparison of functional connectivity differences

We are interested in how working memory ability is determined by functional connectivity, and whether the relationship between working memory ability and connectivity is altered by psychiatric disorders (e.g., whether our model constructed from healthy controls can predict the working memory of patients). The predicted working memory ability in our model is a weighted summation of connectivity values, meaning that an alteration in working memory is determined by the product of connectivity values and model weights. For example, when the weight for a specific connection is large, the working memory deficit caused by alteration of that connection is large even if the difference in connectivity between patients and controls is small. Conversely, when the weight is small, the working memory deficit is small even if the difference in connectivity is large. Therefore, we mainly analyzed the product of connectivity values and model weights.

To illustrate how each connection contributed to the differences in predicted letter 3-back working memory ability, we defined D-scores (difference scores) as follows. First, to align the FC value distributions of the control groups across the diagnostic groups, for every connection with Gaussian distribution $N(\mu, \sigma)$, each control group’s FC values were transformed to the ATR dataset’s distribution $N(\mu_{ATR}, \sigma_{ATR})$ using a linear transformation. The same transformation was applied to the corresponding patient FC values. Since the predicted letter 3-back working memory ability is the weighted summation of the FC values, we can calculate each connection’s difference in weighted-FC value from the control average. Specifically, if $w_i$ is a regression weight and $x_{i,p}$ and $x_{i,c}$ are the FC values at connection $i$ of patient $p$ and control $c$, the weighted-FC difference $D_{i,p}$ (D-score) becomes

$$D_{i,p} = w_i x_{i,p} - \mathrm{mean}(w_i x_{i,c}) = w_i \{ x_{i,p} - \mathrm{mean}(x_{i,c}) \}.$$

To statistically compare the magnitude of the weighted-FC differences across diagnoses, the D-score was standardized for each patient and each connection:

$$Z_{i,p} = \{ w_i x_{i,p} - \mathrm{mean}(w_i x_{i,c}) \} / \mathrm{SD}(w_i x_{i,c}) = D_{i,p} / \mathrm{SD}(w_i x_{i,c}).$$
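The alignment step and the two scores defined above can be illustrated for a single connection with a short sketch. It assumes the sample standard deviation and uses made-up FC values and weights; the function names are hypothetical and this is not the authors' code.

```python
from statistics import mean, stdev

def align_to_reference(values, site_controls, ref_mean, ref_sd):
    """Linearly map FC values so that the site's control distribution
    N(mu, sigma) is matched to the reference distribution N(ref_mean, ref_sd)."""
    mu, sigma = mean(site_controls), stdev(site_controls)
    return [(v - mu) / sigma * ref_sd + ref_mean for v in values]

def d_score(w, x_patient, x_controls):
    """Weighted-FC difference of one patient from the control average."""
    return w * (x_patient - mean(x_controls))

def z_score(w, x_patient, x_controls):
    """D-score divided by the SD of the controls' weighted-FC values."""
    weighted_controls = [w * x for x in x_controls]
    return d_score(w, x_patient, x_controls) / stdev(weighted_controls)

# Made-up values for one connection with a negative model weight.
controls = [0.1, 0.2, 0.3]
patient = 0.5
weight = -2.0

print(d_score(weight, patient, controls))   # weighted difference from controls
print(z_score(weight, patient, controls))   # standardized difference
print(align_to_reference(controls, controls, ref_mean=0.5, ref_sd=0.1))
```

Because SD(w x) equals |w| SD(x), the Z-score keeps the sign of the weight but its magnitude does not depend on the size of the weight, whereas the D-score scales with the weight.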

Next, we examined the effects of diagnosis and connection on the Z-score. We tested the null hypotheses of (i) no main effect of diagnosis and (ii) no interaction between diagnosis and connection. Since these Z-scores showed heterogeneous variances across diagnoses and connections, we calculated data-specific p values based on permutation tests as follows. First, to examine the main effect of diagnosis, we shuffled the diagnosis labels (i.e., SCZ, MDD, OCD, and ASD) and performed a two-way ANOVA with diagnosis as a between-participant factor and connection as a within-participant factor, obtaining the F value for the main effect of diagnosis. We also performed post-hoc permutation tests to compare pairs of disorders: we shuffled the diagnosis labels within a pair of diagnoses (e.g., SCZ and MDD), performed a two-way ANOVA, and obtained the F value for the main effect of diagnosis. Furthermore, we examined the interaction of diagnosis and connection by shuffling both the diagnosis and connection labels and performing a two-way ANOVA, obtaining the F value for the interaction. These permutations were repeated 10,000 times for the main effects and 100,000 times for the interaction effect. The reported p values indicate how often F values at least as large as the observed F value were obtained in these repetitions. A post-hoc Kruskal-Wallis test was performed to examine the simple main effects of diagnosis on all the connectivity alteration Z-scores. We performed false discovery rate (FDR) correction to account for multiple comparisons (Benjamini-Hochberg method, Q < 0.05).
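The logic of the label-shuffling test and the FDR correction can be sketched as follows. To stay short, the sketch replaces the two-way mixed ANOVA of the text with a one-way F statistic on a single value per patient, which is a deliberate simplification and not the analysis reported here. The data, function names, and the number of permutations are illustrative assumptions.

```python
import random
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (lists of values)."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = len(groups) - 1, len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def permutation_p(groups, n_perm=500, seed=0):
    """Shuffle group (diagnosis) labels and count how often the shuffled
    F value is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        start, shuffled = 0, []
        for s in sizes:
            shuffled.append(pooled[start:start + s])
            start += s
        if f_statistic(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

def benjamini_hochberg(p_values, q=0.05):
    """Rejection decisions (in input order) under the Benjamini-Hochberg procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff = rank
    reject = [False] * m
    for i in order[:cutoff]:
        reject[i] = True
    return reject

# Made-up Z-scores for three hypothetical diagnostic groups.
group_a = [-1.2, -0.8, -1.5, -0.9, -1.1]
group_b = [-0.3, -0.5, -0.1, -0.4, -0.2]
group_c = [0.1, -0.1, 0.2, 0.0, 0.3]

f_obs, p_perm = permutation_p([group_a, group_b, group_c])
print(f"F = {f_obs:.2f}, permutation p = {p_perm:.4f}")
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))
```

A permutation p value depends only on the ranking of the observed statistic among the shuffled ones, so it does not require the homogeneity of variance assumed by the parametric F distribution.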

References

  4. American Psychiatric Association (2000) Diagnostic and Statistical Manual of Mental Disorders (Fourth Edition). DSM Library.
  25. MB First, RL Spitzer, M Gibbon, JBW Williams (1995) Structured Clinical Interview for DSM-IV Axis I Disorders. Washington: American Psychiatric Association Publishing.
  26. MB First, RL Spitzer, M Gibbon, JBW Williams (2002) Structured Clinical Interview for DSM-IV-TR Axis I Disorders, Research Version, Non-Patient Edition. New York: New York State Psychiatric Institute.
  39. EMR Lake, ES Finn, SM Noble, T Vanderwal, X Shen, MD Rosenberg, RT Constable (2018) The functional brain organization of an individual predicts measures of social abilities in autism spectrum disorder. bioRxiv, 290320, 10.1101/290320.
  43. Q Lin, MD Rosenberg, K Yoo, TW Hsu, TP O'Connell, MM Chun (2018) Resting-state functional connectivity predicts cognitive impairment related to Alzheimer's disease. Frontiers in Aging Neuroscience, 10, 10.3389/fnagi.2018.00094.
  68. DV Sheehan, Y Lecrubier, KH Sheehan, P Amorim, J Janavs, E Weiller, T Hergueta, R Baker, GC Dunbar (1998) The Mini-International Neuropsychiatric Interview (M.I.N.I.): The development and validation of a structured diagnostic psychiatric interview for DSM-IV and ICD-10. The Journal of Clinical Psychiatry 59:22–33.
  76. Y Takagi, Y Sakai, G Lisi, N Yahata, Y Abe, S Nishida, T Nakamae, J Morimoto, M Kawato, J Narumoto, SC Tanaka (2017) A neural marker of obsessive-compulsive disorder from whole-brain functional connectivity. Scientific Reports, 7, 10.1038/s41598-017-07792-7.
  82. N Yahata, J Morimoto, R Hashimoto, G Lisi, K Shibata, Y Kawakubo, H Kuwabara, M Kuroda, T Yamada, F Megumi, H Imamizu, JE Náñez, H Takahashi, Y Okamoto, K Kasai, N Kato, Y Sasaki, T Watanabe, M Kawato (2016) A small number of abnormal brain connections predicts adult autism spectrum disorder. Nature Communications, 7, 10.1038/ncomms11254, 27075704.
  86. A Yamashita, N Yahata, T Itahashi, G Lisi, T Yamada, N Ichikawa, N Takamura, Y Yoshihara, A Kunimatsu, N Okada, H Yamagata, K Matsuo, R Hashimoto, G Okada, Y Sakai, J Morimoto, J Narumoto, Y Shimada, K Kasai, N Kato, H Takahashi, Y Okamoto, SC Tanaka, O Yamashita, M Kawato, H Imamizu (2018) Harmonization of resting-state functional MRI data across multiple imaging sites via the separation of site differences into sampling bias and measurement bias. bioRxiv, 10.1101/440875.

Decision letter

  1. Michael Breakspear
    Reviewing Editor; QIMR Berghofer Medical Research Institute, Australia
  2. Michael J Frank
    Senior Editor; Brown University, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "A prediction model of working memory across health and psychiatric disease using whole-brain functional connectivity" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Michael Frank as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Monica Rosenberg (Reviewer #2); Xi-Nian Zuo (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

The central finding – transdiagnostic WM prediction – builds nicely on recent work on fingerprinting, prediction and transdiagnostic analyses. It is actually quite surprising that training a model on such a small set of healthy subjects generalizes across diagnoses and demographics that lie well outside the training data, but therein lies the value of the paper. While the relatively "dirty" acquisition and cohort details may not be ideal from a pure research perspective, they probably add to the ecological validity and clinical translatability.

The paper has had two very thorough, excellent technical reviews. All major concerns are reasonable and should be addressed. Four concerns warrant specific commentary:

1) Reviewer 2's first point regarding the greater specificity of 2-back over 0-back to working memory and whether there are other cognitive processes at play.

2) Reviewer 3's first point regarding test-retest reliability. Ideally, you could pursue the reviewer's request here, although I note he has offered alternatives if this is not possible.

3) Much of the model (~34%) relies on left FP self-correlation – some sort of proxy for the internal coherence of that ICA map. None comes from the right FP – a slightly odd dependence on one feature and an asymmetry. It would be reassuring if this stood up to the test-retest reliability analyses.

4) Given that you use parametric test statistics, it is not obvious why you also employed resampling to ascertain significance.

Reviewer #2:

In a training sample of 17 healthy adults, the authors built a model to predict d' on a 3-back task from between- and within-network resting-state functional connectivity. They applied the model to resting-state data from an independent sample of 474 healthy adults from the Human Connectome Project dataset and found that model predictions were significantly correlated with n-back task performance when controlling for fluid intelligence and motion, significantly (inversely) correlated with fluid intelligence when controlling for n-back performance and motion, and not significantly correlated with motion when controlling for n-back performance and fluid intelligence. They applied the model to a second independent sample of resting-state data from 58 individuals with schizophrenia and found that predictions were correlated with a working memory measure when controlling for general cognitive ability and age. Based on these external validation results they argue that the model is generalizable and specific to working memory abilities.

They next applied the model to three additional datasets with patient and control populations. They found that predicted degree of working memory impairment relative to matched controls was greatest for patients with schizophrenia followed by patients with major depressive disorder, obsessive compulsive disorder, and autism spectrum disorder. This ordering replicates the degree of working memory impairments reported by previous meta-analyses.

Overall this paper is a rigorous example of neuroimaging-based predictive modeling based on the generalization to two external validation datasets and between-group comparisons in three additional independent samples. My enthusiasm for the work is only slightly dampened by questions about the patient-vs.-control analyses, the specificity of the working memory model, and the feature importance analysis.

1) The authors take steps to show that model predictions are related to working memory ability specifically rather than cognitive ability more generally, but additional analyses would strengthen this claim. First, the measure of working memory in the HCP dataset includes 0-back task performance, which indexes sustained attention and attentional control rather than working memory. Are model predictions more closely related to 2-back than to 0-back task accuracy? Furthermore, because even 2-back and 3-back tasks measure a number of processes beyond working memory (Kane et al. 2007, J Exp Psychol Learn Mem Cogn), it would be informative to test whether model predictions are related to another measure of working memory such as performance on the NIH toolbox list-sorting task. Second, in the group-level analyses, are predicted working memory deficits more similar to working memory deficits observed in meta-analyses than they are to deficits observed in fluid intelligence or other cognitive domains (in terms of effect size or relative ordering across disorders)?

2) The lack of working memory measures presents challenges for the between-group comparisons in the patient samples. Although within each site the samples are matched on age and sex, the groups differ along a number of dimensions beyond working memory (e.g., medication status and potentially IQ and other cognitive abilities), and it's not clear whether patients and controls were scanned under the same protocols at each site, or whether protocols differed between sites. Limitations due to these potentially confounding factors should be clearly outlined in the manuscript. Related to this, are there working memory scores for the controls in the schizophrenia sample? If so, did predicted impairment reflect observed impairment in that sample, and does the model hold when applied to this full sample of patients and controls together?

3) It appears that the measure of general cognitive ability in the schizophrenia dataset includes a verbal memory measure. How correlated are working and verbal memory scores in this sample, and what is the justification for including it in the general cognitive ability score rather than treating it as a variable of interest?

4) Why was the HCP 500-subject release (2014) used rather than the 900-subject release (2015) or the 1200-subject release (2017)? Although the external generalization results are strong I would find it even more convincing if the model generalized to the full sample of HCP individuals.

5) The analysis of model weights and functional connectivity alterations between patient and control groups is somewhat confusing. Why does Figure 2 visualize the product of mean FC values and model weights (which will change depending on an individual's unique FC values) rather than just the raw model coefficients? Why do the patient/control difference scores incorporate model weights, rather than simply reflect changes in FC networks predicting working memory? It would be helpful to explain these choices in greater detail.

6) More details about the scrubbing procedure applied would be useful. Were frames before and after high-motion volumes excluded? What was the distribution of number of excluded volumes in each dataset? Did this differ by dataset or group?

7) The manuscript is lacking a discussion of predictive network anatomy, anatomy of networks that change vs. stay consistent between disorders, and implications for cognitive psychology or cognitive neuroscience. What do the current findings tell us about working memory and the functional networks that support it from a basic science perspective?

Reviewer #3:

The authors performed predictive modeling to examine FC-WMA relationships across psychiatric diseases using a verbal 3-back task. This work was done using a set of data cohorts. The predicted effect size estimates on verbal WMA impairment were comparable to previous meta-analysis results. I personally enjoyed reading this manuscript. This is a very nice example of a reproducible brain-behavior association study. I would be happy to support its acceptance for publication in eLife. However, I still have several concerns, which need to be fully addressed before publication.

Summary of concerns:

1) It is highly important for studies using clinical patients to choose a measurement tool with high test-retest reliability. The authors employed ICA-derived networks as spatial profiles for whole-brain FC modeling; however, no references were given to support that its reliability reaches the clinically recognized requirement (ICC > 0.8). Is there any possibility of performing a test-retest analysis using public test-retest datasets (e.g., Consortium for Reliability and Reproducibility) to demonstrate that the reliability matches the level required in the clinic? At the least, the literature on test-retest reliability of rfMRI metrics should be carefully documented if you cannot do this in a reasonable time frame (e.g., < 2 months); see a review on this topic from my lab (PMID: 24875392). Meanwhile, dual regression with group ICA has been a highly reliable method; have the authors compared it with the FC method for the predictive modeling?

2) Head motion: Power et al. recently (PMID: 28880888) demonstrated that the order of the preprocessing steps for rfMRI data affects the performance of head motion removal. Of particular relevance here is that the data should not be corrected for slice timing differences before head motion is estimated and reduced. The authors should check whether their findings are influenced by such a change. Regarding the preprocessing, it is worth noting that the ways of dealing with motion differ across data cohorts. How will this affect the reproducibility of the findings?

3) Demographic factors: It is widely known that age and sex have effects on FC. How do these two factors affect the observations reported here?

4) Figures: In Figure 1C, it is quite confusing that all the drawings of the graphical brain are the same across different clinical diagnoses (SCZ, MDD, OCD and ASD).

5) The authors have done good work in dealing with head motion. However, as a point of curiosity, several studies have also demonstrated potentially meaningful factors embedded in head motion as a human trait. In this regard, interesting points related to the current work are: 1) Is there any relationship between motion and WMA? 2) Is there any correlation between global signal and WMA? 3) If so, what is the causal relationship among the four (motion, global signal, WMA and FC)?

6) Is there any plan in place to share the data publicly?

https://doi.org/10.7554/eLife.38844.023

Author response

[…] The paper has had two very thorough, excellent technical reviews. All major concerns are reasonable and should be addressed. Four concerns warrant specific commentary:

1) Reviewer 2's first point regarding the greater specificity of 2-back over 0-back to working memory and whether there are other cognitive processes at play.

To address this, we have performed additional analyses, the details of which are in our response to reviewer #2. Briefly, the results can be summarized as follows: (i) we did not identify a significant difference in the prediction accuracy of our model between the 2-back and 0-back scores; (ii) our model prediction was significantly correlated with another HCP working memory (list-sorting) score; (iii) the effect size of working memory deficits predicted by our model was more similar to that of digit span than to that of general cognitive ability (IQ). Therefore, we demonstrated the specificity of our model to working memory on two of the three issues raised by reviewer #2.

2) Reviewer 3's first point regarding test-retest reliability. Ideally, you could pursue the reviewer's request here, although I note he has offered alternatives if this is not possible.

We elected to examine test-retest reliability of our methods for functional connectivity estimation that uses ICA-based intrinsic network definition. As suggested by reviewer #3, we used open data from the Consortium for Reliability and Reproducibility (CoRR) and calculated intra-class correlation (ICC) of functional connectivity values. These results are described in our response to reviewer #3’s comments, but briefly, we found that our methods are broadly comparable to other common connectivity estimation methods.

3) Much of the model (~34%) relies on left FP self-correlation – some sort of proxy for the internal coherence of that ICA map. None comes from the right FP – a slightly odd dependence on one feature and an asymmetry. It would be reassuring if this stood up to the test-retest reliability analyses.

Our network definition is based on the BrainMap ICA which gives us useful information regarding the functional relevance of the ICA-derived networks based on meta-analyses of thousands of publications. In the BrainMap ICA paper (Laird et al., 2011; http://www.brainmap.org/icns/), IC18 (left fronto-parietal network; left FPN) showed greater functional relevance to working memory than IC15 (right FPN). Moreover, a meta-analysis on N-back task with different stimulus modality (Owen et al., 2005) found that monitoring of verbal stimuli was strongly associated with the left ventrolateral prefrontal cortex (a part of left FPN), while monitoring of spatial locations activated right lateralized frontal and parietal regions. In the current study, we used a letter 3-back task that requires encoding alphabet letters, which are more related to word monitoring than location monitoring. Therefore, the left FPN would be expected to contribute more to our prediction model than the right FPN.

Although much of the model relies on the within-network connectivity of the left FPN (~34% contribution), the right FPN also showed a substantial contribution to working memory via negative connectivity N2 (connection with the midbrain network, please see Figure 2) and N3 (connection with the superior parietal network) (~20% contribution).

We examined test-retest reliability by calculating the intra-class correlation (ICC) for the within-network connectivity of these networks using three different datasets from the Consortium for Reliability and Reproducibility (CoRR), as fully described in our response to reviewer #3. Briefly, we found ICC values for the left FPN/right FPN of 0.28/0.23, 0.49/0.47, and 0.11/0.03 for the BNU 1, IACAS 1, and Utah 1 datasets, respectively. According to the interpretation criteria for ICC (Landis and Koch, 1977), these results suggest that the two networks show comparable reliability (0.2 < ICC < 0.5) for mid-term test-retest inter-session intervals (2-3 months: BNU 1 and IACAS datasets). Therefore, it is unlikely that a difference in test-retest reliability between the two networks induced the model’s dependence on the left FPN.
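For readers unfamiliar with the ICC, a minimal sketch of one common variant is given below. The response does not state which ICC formulation was used, so the one-way random-effects form, ICC(1,1), is an assumption made purely for illustration, as are the function name and the example values.

```python
from statistics import mean

def icc_oneway(scores):
    """ICC(1,1) for a list of per-subject lists, each holding k repeated
    measurements (e.g. one connectivity value per scanning session)."""
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    # Between-subject and within-subject mean squares of a one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((v - m) ** 2 for row, m in zip(scores, row_means) for v in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Made-up test and retest connectivity values for four subjects.
sessions = [[0.30, 0.35], [0.10, 0.05], [0.50, 0.40], [0.20, 0.30]]
print(round(icc_oneway(sessions), 3))
```

An ICC near 1 means that differences between subjects dominate session-to-session variation, while values near or below 0 mean that repeated measurements of the same subject vary as much as measurements of different subjects.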

We added the above issues to the Discussion section (eighth paragraph).

4) Given that you use parametric test statistics, it is not obvious why you also employed resampling to ascertain significance.

We applied the resampling tests to data shown in Figures 3A and 3C to ascertain significance of the correlation analysis between the predicted and actual working memory abilities in HCP datasets. However, we agree with the editor’s point that they are redundant. We have therefore removed the resampling results (Figures 3B and 3D) from our revised manuscript.

We used another resampling test to examine the effect of diagnosis and/or connection on the degree of functional connectivity change (“Z-score”). Although a parametric two-way analysis of variance (ANOVA) requires homogeneous variance across diagnoses and connections, some connections showed inhomogeneous variance across diagnoses. Thus, we used permutation tests to examine whether diagnosis and/or connection significantly affects the degree of functional connectivity change. This is mentioned in the last paragraph of the subsection “Comparison of functional connectivity differences”.

Below, we provide an in-depth response to the individual issues raised by the reviewers.

Reviewer #2:

[…] 1) The authors take steps to show that model predictions are related to working memory ability specifically rather than cognitive ability more generally, but additional analyses would strengthen this claim. First, the measure of working memory in the HCP dataset includes 0-back task performance, which indexes sustained attention and attentional control rather than working memory. Are model predictions more closely related to 2-back than to 0-back task accuracy? Furthermore, because even 2-back and 3-back tasks measure a number of processes beyond working memory (Kane et al. 2007, J Exp Psychol Learn Mem Cogn), it would be informative to test whether model predictions are related to another measure of working memory such as performance on the NIH toolbox list-sorting task. Second, in the group-level analyses, are predicted working memory deficits more similar to working memory deficits observed in meta-analyses than they are to deficits observed in fluid intelligence or other cognitive domains (in terms of effect size or relative ordering across disorders)?

First, we have examined whether the model prediction is more similar to the 2-back score than the 0-back score. Spearman’s rho partial correlation between the model prediction and task performance was 0.078 for 2-back task and 0.086 for 0-back task, while factoring out two confounding variables (fluid intelligence and head motion). There was no significant difference between the two correlation coefficients. Therefore, we could not conclude that the model prediction is more similar to 2-back score than 0-back score.

Following the reviewer’s comment, we investigated the correlation between the predicted working memory ability and performance on a list-sorting task, which is a working memory task included in the NIH toolbox. The list-sorting score correlated positively with fluid intelligence (Spearman’s rho = 0.32, P = 5.7 × 10^-13) and negatively with head motion (Spearman’s rho = -0.12, P = 0.009). Therefore, we performed a partial correlation analysis while factoring out these two variables. We found a significant positive correlation between the predicted working memory ability and the list-sorting score (Spearman’s rho = 0.084, P = 0.034). These results provide evidence that the model predicts working memory ability as measured by a working memory task other than the 3-back task. These results and methods are described in the Results section (subsection “Prediction in Independent Test Set of Healthy Individuals”) and Materials and methods section (subsection “Human Connectome Project (HCP) Dataset”).

Second, we examined meta-analyses on fluid intelligence performed in the psychiatric diagnoses to examine if predicted working memory deficits are more similar to working memory deficits observed in meta-analyses than deficits in fluid intelligence. We found no systematic meta-analysis or review from a transdiagnostic viewpoint. Therefore, we searched for a meta-analysis on fluid intelligence for each diagnosis and found a small number of reports on this issue, as follows:

Schizophrenia: Heinrichs and Zakzanis reported that the mean effect size is larger for full-scale IQ measured by the WAIS-Revised (d = -1.24, number of studies k = 35) than for digit span (d = -0.61, k = 18). Rajji et al. reported that the effect size is larger for full-scale IQ (first-episode: d = 0.89, k = 29; youth-onset: d = -1.77, k = 15; late-onset: d = -1.61, k = 4) than for digit span (first-episode: d = 0.64, k = 24; youth-onset: d = -0.85, k = 7; late-onset: d = -0.87, k = 5). Both meta-analyses suggest that schizophrenia patients show a larger effect size for IQ reduction than for working memory deficits.

MDD: We found a review paper on fluid intelligence in first-episode depression. Ahern and Semkovska reported that the mean effect size is larger for digit span (forward: d = -0.35, k = 3; backward: d = -0.33, k = 4) than for the IQ composite (d = -0.26, k = 10). The IQ composite effect size is similar to the predicted working memory effect size (d = -0.29).

OCD: We found a meta-analysis on fluid intelligence in OCD patients. Abramovitch et al. reported that the mean effect size for full-scale IQ in OCD was -0.35 (k = 40). Snyder et al. reported smaller effect sizes for digit span forward (d = -0.08, k = 19) and backward (d = -0.21, k = 11).

ASD: We found no meta-analysis regarding general cognitive ability or intelligence.

Figure 4—figure supplement 1 shows effect sizes of deficit for fluid intelligence (IQ) as reported in the above meta-analyses (with the largest k for each diagnosis) in comparison to those for working memory. Note that the predicted working memory ability falls within the confidence interval of the IQ effect size only for first-episode MDD while it falls within confidence interval of effect size of working memory (forward digit span) for every diagnosis. Regarding the relative order of effect sizes, the effect size of fluid intelligence deficits can be ordered as schizophrenia, OCD, and MDD (first episode). In contrast, the effect sizes of working-memory deficits (as measured by digit-span task) can be ordered as schizophrenia, MDD and OCD, which is consistent with the order predicted by our model. Therefore, predicted working-memory deficits were more similar to observed deficits in working memory than those in fluid intelligence. These results are described in the Results section (subsection “Prediction in Four Distinct Psychiatric Disorders”, last paragraph) of our revised manuscript.

Collectively, although we cannot conclude that the model prediction is more similar to 2-back score than 0-back score, the model was able to predict another measure of working memory (i.e. List-sorting). Moreover, predicted working memory deficits are more similar to working-memory deficits observed in meta-analyses than deficits in fluid intelligence.

2) The lack of working memory measures presents challenges for the between-group comparisons in the patient samples. Although within each site the samples are matched on age and sex, the groups differ along a number of dimensions beyond working memory (e.g., medication status and potentially IQ and other cognitive abilities), and it's not clear whether patients and controls were scanned under the same protocols at each site, or whether protocols differed between sites. Limitations due to these potentially confounding factors should be clearly outlined in the manuscript. Related to this, are there working memory scores for the controls in the schizophrenia sample? If so, did predicted impairment reflect observed impairment in that sample, and does the model hold when applied to this full sample of patients and controls together?

We agree with the reviewer and added limitations to the Discussion section as follows:

“Second, working memory performance was measured only in schizophrenia patients but not in other diagnostic groups. Therefore, it was impossible to compare their predicted working memory ability with measured scores. This presents challenges for the between-group comparisons in the patient samples.”

“Third, although the participants are matched on age and sex within each site, the groups may differ along a number of dimensions beyond working memory (e.g., medication status, scanning protocol, and potentially IQ and other cognitive abilities). It is difficult to fully control every dimension, and little is known about how such dimensions affect the estimation of functional connectivity.”

We added lines to the table in Supplementary file 1 to make it clear how many patients and controls were scanned with each scanner and protocol.

We additionally analyzed the full sample of schizophrenia patients and controls and found results consistent with our earlier patient-only analysis. Specifically, the digit-sequencing score correlated positively with the composite BACS score excluding working memory (see our reply to the next comment) (r = 0.68, P = 2.0 × 10^-17) and negatively with age (r = -0.36, P = 5.7 × 10^-5), but not with head motion (r = -0.04, P = 0.68). A partial correlation analysis controlling for age and the composite BACS score showed that the model prediction was significantly correlated with the digit-sequencing score (ρ = 0.15, P = 0.048).
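The partial correlation used here can be illustrated with a minimal sketch. It is not the authors' code: it assumes Pearson correlations combined through the standard recursion formula, and the function names `pearson` and `partial_corr` are hypothetical. Ranking the inputs first would give a rank-based variant, which the reported ρ may refer to.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, covariates):
    """Correlation of x and y after factoring out each covariate.

    Uses the recursion r_xy.Zz = (r_xy.Z - r_xz.Z * r_yz.Z)
    / sqrt((1 - r_xz.Z^2) * (1 - r_yz.Z^2)), removing one covariate per level.
    """
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[-1], covariates[:-1]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

In this setting, `x` would hold the model predictions, `y` the digit-sequencing scores, and `covariates` the age and composite BACS vectors.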

These results are included in the Results section of the revised manuscript (subsection “Prediction in Individual Schizophrenia Patients and Controls”, last paragraph).

3) It appears that the measure of general cognitive ability in the schizophrenia dataset includes a verbal memory measure. How correlated are working and verbal memory scores in this sample, and what is the justification for including it in the general cognitive ability score rather than treating it as a variable of interest?

First, we found a positive correlation between the observed working memory and verbal memory scores of the BACS in the schizophrenia patients (r = 0.49, P = 1.0 × 10^-3). Moreover, the working memory scores correlated positively with the other four sub-scores (‘verbal fluency’, ‘motor speed’, ‘executive function’, and ‘attention and processing speed’) (see Author response image 1). Therefore, we needed to factor out the effects of these correlated scores to examine whether the prediction model was specific to working memory.

Author response image 1
Pearson’s correlation matrix of BACS sub-scores.

We treated verbal memory as a variable of no interest because the verbal memory measured by the BACS seemed very different from the working memory measured by the letter 3-back task. We constructed the prediction model based on the letter 3-back task. This task is a recognition test in which alphabet letters (e.g. B, J) are presented one after another, and it requires participants to update their memory and to group the letters into a unit (chunking). In contrast, the BACS verbal memory task is a free recall test of 15 words, so participants are required to recall lexical/semantic representations. Therefore, we assumed that the cognitive functions necessary for performing the 3-back task differ from those necessary for performing the verbal memory task.

Thanks to reviewer #2’s comments, we realized that “verbal 3-back task” was a confusing term and changed it to “letter 3-back task”. We also found that the term “general cognitive ability” was not appropriate. We now call it the “composite BACS score excluding working memory”.

4) Why was the HCP 500-subject release (2014) used rather than the 900-subject release (2015) or the 1200-subject release (2017)? Although the external generalization results are strong I would find it even more convincing if the model generalized to the full sample of HCP individuals.

When we analyzed the data, only the HCP 500-subject dataset had been released. We are eager to test the generalization of our model to the full HCP sample. However, computing functional connectivity within a network (e.g., the left fronto-parietal network) requires voxel-wise correlations, a time-consuming calculation that takes 8 hours per subject. Therefore, it was impossible for us to finish the analysis of the full HCP sample within a reasonable (i.e., two-month) time frame.

5) The analysis of model weights and functional connectivity alterations between patient and control groups is somewhat confusing. Why does Figure 2 visualize the product of mean FC values and model weights (which will change depending on an individual's unique FC values) rather than just the raw model coefficients? Why do the patient/control difference scores incorporate model weights, rather than simply reflect changes in FC networks predicting working memory? It would be helpful to explain these choices in greater detail.

We are interested in how working memory ability is determined by functional connectivity, and in whether the relationship between working memory ability and connectivity is altered by psychiatric disorders (e.g., whether our model constructed from healthy controls can predict the working memory of patients). The predicted working memory ability in our model is a weighted summation of connectivity values, meaning that an alteration in working memory is determined by the product of connectivity values and model weights. For example, the working memory deficit caused by the alteration of a specific connection is large when the weight for the connection is large, even if the difference in the connectivity value between patients and controls is small. Conversely, the working memory deficit is small when the weight is small, even if the difference in the connectivity value is large. Therefore, we mainly analyzed the product of connectivity values and model weights.
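The argument can be made concrete with a small sketch of a linear model, given here only as an illustration under assumed names (`predict`, `contribution_differences`) and made-up numbers; it is not the authors' implementation.

```python
def predict(weights, fc, intercept=0.0):
    """Predicted score as a weighted sum of connectivity values."""
    return intercept + sum(w * c for w, c in zip(weights, fc))

def contribution_differences(weights, fc_patients_mean, fc_controls_mean):
    """Per-connection contribution to the patient-control difference:
    weight * (mean patient connectivity - mean control connectivity)."""
    return [w * (p - c)
            for w, p, c in zip(weights, fc_patients_mean, fc_controls_mean)]

# Hypothetical values: connection 0 has a large weight and a small
# connectivity difference; connection 1 has a small weight and a large one.
weights = [2.0, 0.1]
patients = [0.30, 0.10]
controls = [0.35, 0.60]
diffs = contribution_differences(weights, patients, controls)
```

With these numbers, the first connection contributes more to the predicted deficit than the second despite its much smaller connectivity difference, and the contributions sum to the difference between the two group-level predictions.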

We added the above explanations to our revised manuscript’s Methods and Materials section (subsection “Comparison of functional connectivity differences”, first paragraph).

6) More details about the scrubbing procedure applied would be useful. Were frames before and after high-motion volumes excluded? What was the distribution of number of excluded volumes in each dataset? Did this differ by dataset or group?

Regarding the first question, frames were excluded when motion at that time point was excessive (frame-wise displacement > 0.5 mm). We did not remove the frames before or after a high-motion frame. We added these details about the scrubbing procedure to the Materials and methods section:

“We performed slice timing correction and then motion estimation. The estimated motion parameters were used to identify frames with excessive motion (frame-wise displacement > 0.5 mm). We did not remove the frames before or after a high-motion frame.”

We calculated the ratio of excluded volumes to the total number of volumes for each subject, and averaged it within patients or controls for each diagnosis. The ratios were 2.3 ± 5.5% / 1.4 ± 2.9% (patients/controls) for SCZ, 2.7 ± 6.6% / 2.4 ± 6.3% for MDD, 0.4 ± 0.8% / 0.7 ± 1.7% for OCD, and 1.4 ± 3.8% / 4.6 ± 8.5% for ASD. We found a significant difference in the ratio between patients and controls only for ASD (t(97.3) = 2.91, P = 4.4 × 10^-3). However, it is unlikely that this difference caused a problem in our results because there was no significant difference in the predicted working memory ability between ASD patients and their controls. We reported the ratios for each diagnosis (subsection “Multiple Psychiatric Diagnoses Dataset”, first paragraph).
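The scrubbing rule and the excluded-volume ratio can be sketched as follows. The response does not state which frame-wise displacement formula was used, so this sketch assumes the common definition (sum of absolute frame-to-frame changes in the six motion parameters, with rotations converted to millimetres on a 50 mm sphere); the function names are hypothetical.

```python
def framewise_displacement(motion, radius_mm=50.0):
    """Frame-wise displacement per volume.

    motion: list of [tx, ty, tz, rx, ry, rz] per volume, with translations
    in mm and rotations in radians (assumed units). The first volume gets 0.
    """
    fd = [0.0]
    for prev, cur in zip(motion, motion[1:]):
        trans = sum(abs(c - p) for p, c in zip(prev[:3], cur[:3]))
        rot = sum(abs(c - p) * radius_mm for p, c in zip(prev[3:], cur[3:]))
        fd.append(trans + rot)
    return fd

def keep_mask(fd, threshold_mm=0.5):
    """True for volumes that are kept; only the flagged volume itself is
    dropped, not its neighbours."""
    return [d <= threshold_mm for d in fd]

def excluded_ratio(mask):
    """Fraction of volumes removed for one subject."""
    return 1.0 - sum(mask) / len(mask)
```

Averaging `excluded_ratio` over the subjects in a group would give ratios of the kind reported above.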

7) The manuscript is lacking a discussion of predictive network anatomy, anatomy of networks that change vs. stay consistent between disorders, and implications for cognitive psychology or cognitive neuroscience. What do the current findings tell us about working memory and the functional networks that support it from a basic science perspective?

To increase interpretability of our results from the perspective of cognitive neuroscience, we grouped the 18 networks into seven clusters, each of which is more applicable to a functional understanding. We investigated how alterations of connections between these clusters affected working memory ability. We added the results (Figure 5—figure supplement 2) to the Results section and our interpretations to the Discussion section as follows:

In the Results section:

“To understand these results from global brain networks, we grouped the 18 networks into seven clusters based on the hierarchical clustering of the networks performed in the BrainMap ICA study (Laird et al., 2011). […] Note that we could not find any connections that have a node in the default-mode cluster, which did not appear in the figure.”

In the Discussion section:

“Our results identified alterations in large-scale network clusters that correlated with working memory impairment (Figure 5—figure supplement 2). […] This is consistent with our result that alterations of connections related to the motor/visuospatial networks were associated with lower working memory ability in schizophrenia, MDD, and ASD.”

Reviewer #3:

[…] 1) It is highly important for studies using clinical patients to choose a measurement tool with high test-retest reliability. The authors employed ICA-derived networks as spatial profiles for whole brain FC modeling, however none of any references was given to support its reliability reaching to the clinically recognized request (ICC > 0.8). Is there any possibility of performing a test-retest analysis using public test-retest datasets (e.g., Consortium for Reliability and Reproducibility) to demonstrate the reliability matched to the level in clinic. At least, the literature on test-retest reliability of rfMRI metrics should be carefully documented if you cannot do it in the reasonable time frame (e.g., < 2months), see a review on this topic from my lab (PMID: 24875392). Meanwhile, dual regression with group ICA has been a highly reliable method, and the authors compared it with the FC method for the predictive modeling?

We thank the reviewer for pointing out the importance of test-retest reliability. As the reviewer suggested, we examined the ICC of our resting-state functional connectivity method using the Consortium for Reliability and Reproducibility (CoRR) database. We added the following text to the Materials and methods, Results, and Discussion sections.

Materials and methods:

“1. Datasets

We analyzed data from the Consortium for Reliability and Reproducibility (CoRR), which facilitates the assessment of test-retest reliability and reproducibility for resting state functional connectivity (Zuo et al., 2014). We selected the following three datasets: Beijing Normal University (BNU 1), Institute of Automation, Chinese Academy of Sciences (IACAS 1), and University of Utah (Utah 1). We chose these datasets because 1) they have test-retest data across fMRI sessions, 2) the ages of the participants are comparable with those in the discovery dataset used to construct our model (ATR dataset), 3) two datasets include Asian participants (participants in the ATR dataset are Japanese), and 4) the datasets are relatively small (we had to finish our analysis within 2 months).

[…]

3. Test–retest reliability

Intra-class correlation (ICC) was calculated for each of the 171 functional connectivity values (univariate test-retest reliability). ICC was calculated by following equation:

ICC = (MSb – MSw)/{MSb + (k – 1)MSw}

where MSb is the between-subjects mean squared error, MSw is the within-subjects mean squared error, and k is the number of independent fMRI measures (i.e., k = 2 for test and retest). We set negative ICC values to zero, as in previous studies (e.g., Zhang et al., 2011).”
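The ICC formula above can be sketched for a single connectivity value as follows. This is an illustration rather than the authors' code: it assumes that MSb and MSw come from a one-way ANOVA across subjects, and the function name `icc_oneway` is hypothetical.

```python
def icc_oneway(scores):
    """ICC = (MSb - MSw) / (MSb + (k - 1) * MSw), clipped at zero.

    scores: one row per subject, each row holding k repeated measurements
    (k = 2 for test and retest).
    """
    n = len(scores)
    k = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    # Between-subjects mean square (n - 1 degrees of freedom).
    ms_b = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subjects mean square (n * (k - 1) degrees of freedom).
    ms_w = sum((x - m) ** 2
               for row, m in zip(scores, subject_means)
               for x in row) / (n * (k - 1))
    icc = (ms_b - ms_w) / (ms_b + (k - 1) * ms_w)
    return max(icc, 0.0)  # negative ICC values are set to zero
```

Applying the function to each of the 171 connectivity values in turn would give the univariate reliability values summarized in the Results text.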

Results:

“We obtained ICC values 0.34 ± 0.12 (range 0 to 0.65) for BNU 1, 0.26 ± 0.18 (range 0 to 0.66) for IACAS 1, and 0.21 ± 0.17 (range 0 to 0.59) for Utah 1 datasets. We found ICC values for the left FPN/right FPN: 0.28/0.23, 0.49/0.47, and 0.11/0.03 for BNU 1, IACAS 1, and Utah 1 dataset, respectively.”

Discussion

“According to the interpretation criteria for ICC (Landis and Koch, 1977), our connectivity estimation method yielded “fair” reliability (0.2 < ICC ≤ 0.4) for the three datasets. A previous study on the test-retest reliability of functional connectivity between 18 different brain regions (Birn et al., 2013) reported similar ICC values (ICC ~ 0.2) when the scan length was 6 to 15 minutes. Another study (Noble et al., 2017), which examined the reliability of functional connectivity among 268 regions covering the whole brain, also reported that a scan length of 6 min yielded similar reliability (dependability coefficient ~ 0.2 to 0.4). Although our connectivity estimation method does not reach the clinically recognized requirement (ICC > 0.8), these studies suggest that its test-retest reliability is comparable to that of other common connectivity estimation methods.”

2) Head motion: Power et al. recently (PMID: 28880888) demonstrated that the order of performing preprocessing rfMRI data has effects on the performance of head motion removal. Of important relevance here is that the data should not be corrected for slice timing differences before the head motion estimated and reduced. The authors should check if their findings are influenced by such a change. Regarding the preprocessing, it is worth noting that ways of dealing with motion are different across data cohorts. How will this have an impact on reproducibility of the findings?

We re-analyzed the data by conducting the motion estimation before slice timing correction. We summarize the results below:

Regarding the individual-level prediction on the SCZ dataset, we found a marginally significant positive correlation between the predicted working memory and the digit-sequencing score (r = 0.21, P = 0.059), but a partial correlation analysis factoring out the BACS composite score and age did not show a significant correlation (rho = 0.10, P = 0.26). The model prediction correlated not only with the digit-sequencing score but also with the BACS composite score (r = 0.21, P = 0.056). Given the strong correlation between the digit-sequencing score and the BACS composite score (r = 0.61), obtaining a prediction specific to the subtest was a challenging goal. These results may reduce the model’s specificity to working memory, at least for the SCZ dataset, but the model prediction was still related to cognitive ability rather than to confounding factors (age and motion).

Regarding the multiple psychiatric diagnoses dataset, we found almost the same results as our previous results even after changing this procedure, as follows.

We found two additional MDD patients with excessive motion (40% of data showed frame-wise displacement > 0.5 mm) and removed them from further analysis. We detected outliers in the model prediction within each group (defined as values > 3 SD from the mean): one SCZ patient, two control participants in the MDD dataset, and one OCD patient.

We identified significant differences in the predicted working memory between patients and controls only for SCZ (two-tailed t-test for SCZ group: t(115) = -3.11, P = (2.4 × 10^-3) × 4 = 0.0096, Bonferroni corrected). This result is similar to the original result:

Original manuscript: “We identified significant differences in the predicted working memory between the patient and controls only for SCZ patients (two-tailed t-test for SCZ group: t(116) = -3.68, P = (3.5 × 10^-4) × 4 = 0.0014, Bonferroni corrected; Figure 4A)”

Next, we calculated each patient’s Z-score (normalized difference between a patient and the average of controls at the same site) of the predicted working memory for each diagnosis. A one-way ANOVA revealed a significant main effect of diagnosis on the Z-score (F(3,242) = 6.09, P = 5.2 × 10^-4). The severity of the predicted impairment in SCZ patients was larger than in all other diagnoses (post-hoc Holm’s controlled t-test, adjusted P < 0.05). These results are similar to the original results:

Original manuscript: “A one-way ANOVA revealed a significant main effect of diagnosis on the Z-score (F(3,245) = 7.63, P = 6.8 × 10^-5). The severity of the predicted impairment in SCZ patients was larger than all other diagnoses (post-hoc Holm’s controlled t-test, adjusted P < 0.05).”
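The site-wise Z-score can be sketched as below. The response describes it only as a normalized difference between a patient and the average of controls at the same site; dividing by the standard deviation of those controls is an assumption of this sketch, and the function name `patient_z_scores` is hypothetical.

```python
from statistics import mean, stdev

def patient_z_scores(patients, controls):
    """Z-score of each patient relative to controls scanned at the same site.

    patients, controls: lists of (site, predicted_score) pairs.
    Assumes every patient's site has at least two controls.
    """
    controls_by_site = {}
    for site, score in controls:
        controls_by_site.setdefault(site, []).append(score)
    z_scores = []
    for site, score in patients:
        reference = controls_by_site[site]
        z_scores.append((score - mean(reference)) / stdev(reference))
    return z_scores
```

The resulting Z-scores, grouped by diagnosis, would then be the input to the one-way ANOVA.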

The predicted working memory alteration was more negative in the order of SCZ, MDD, OCD, and ASD, with effect sizes (Hedges’ g) of -0.57, -0.29, -0.18, and 0.15, respectively. These results are similar to the original results:

Original manuscript: “The predicted working memory alteration was more negative in the order of SCZ, MDD, OCD, and ASD with effect sizes (Hedges’ g) of -0.68, -0.29, -0.16, and 0.09, respectively.”
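For reference, Hedges' g is a standardized mean difference with a small-sample correction. The sketch below uses the common approximation of the correction factor, 1 - 3 / (4(n1 + n2) - 9); the response does not say which variant was used, and the function name `hedges_g` is hypothetical.

```python
import math
from statistics import mean, variance

def hedges_g(group1, group2):
    """Standardized mean difference (group1 minus group2) with the
    small-sample correction of Hedges."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    d = (mean(group1) - mean(group2)) / math.sqrt(pooled_var)
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction
```

Passing the patients' predicted scores as `group1` and the controls' as `group2` gives a negative value when patients are predicted to perform worse.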

These results of the group-level analyses were almost the same as our previous results after changing the order of slice timing correction and motion estimation.

3) Demographical factors: It is widely known that age and sex have effects on FC, and how these two affect the observations reported here?

At the model building stage, we found no significant effect of age and sex on predicted working memory as described in our revised manuscript:

“We did not find a significant correlation between the predicted letter 3-back learning performance and age (r = 0.21, P = 0.42) or gender (r = 0.28, P = 0.28)”.

Next, we examined whether age affected predicted working memory in the multiple psychiatric diagnoses dataset. We found a significant negative correlation between age and predicted working memory in ASD at site 1 (r = -0.38, P = 1.8 × 10^-3), suggesting that younger participants were predicted to have greater working memory. We found no significant effect of age on predicted working memory in the other groups.

We also examined how predicted working memory differed between males and females within each diagnosis. We found no significant effect of sex on predicted working memory (t-tests: SCZ t(116) = 0.37, P = 0.72; MDD t(137) = 1.13, P = 0.26; OCD t(90) = 0.17, P = 0.86; ASD t(138) = 1.28, P = 0.20).

Because age and sex were controlled between patients and controls in each diagnosis (Table 1), it is unlikely that age and sex affect the observations in this study.

4) Figures: In Figure 1C, it is quite confusing that all the drawings of the graphical brain are the same across different clinical diagnoses (SCZ, MDD, OCD and ASD).

We thank the reviewer for pointing this out. We have changed the thickness of the lines for each diagnosis so that they illustrate the connectivity differences among the diagnoses.

5) The authors have done a good work on dealing with head motion. However, just a curious point, several work also demonstrated potentially meaningful factors embedded in head motion as trait of human beings. At this point, interesting points related to the current work are: 1) Is there any relationship between motion and WMA? 2) Is there any correlation between global signal and WMA? 3) If so, what is the causal relationship among the four (motion, global signal, WMA and FC)?

1) We examined correlation between head motion and working memory.

ATR dataset: There was no significant correlation between head motion and observed 3-back task performance (r = 0.23, P = 0.37).

SCZ dataset:

Patients only (N = 58), there was no significant correlation between head motion and observed digit-sequencing performance (r = -0.03, P = 0.83).

Patients and controls (N = 118), there was no significant correlation between head motion and observed digit-sequencing performance (r = -0.04, P = 0.68).

HCP dataset: There was a significant correlation between head motion and observed N-back accuracy (Spearman’s rank correlation rho = -0.24, P = 1.5 × 10^-7) and between head motion and the list-sorting score (Spearman’s rank correlation rho = -0.12, P = 0.009). Note that we factored out head motion using a partial correlation analysis when we investigated the correlation between predicted and actual working memory performance.

2) and 3) To our understanding, calculating the global signal requires subtracting the fMRI signal in a baseline period from that in a task period for every voxel in the brain and then averaging the subtracted values across voxels. Such subtraction is needed because the fMRI signal value is arbitrary and changes from run to run. However, in the current study, we measured only resting-state fMRI, in which there was no baseline or task period. Therefore, we were unable to address the questions about the global signal.

6) Is there any plan in place to share the data publicly?

We shared the data for which participants gave informed consent for data sharing at https://bicr.atr.jp/dcn/en/download/database-wmp/. In Supplementary file 1, we clarified which data cannot be shared because such consent was not obtained.

ATR dataset:

https://bicr.atr.jp/dcn/en/download/database-wmp/

Multiple psychiatric diagnoses dataset:

https://bicr.atr.jp/dcn/en/download/database-wmp/

HCP dataset:

https://www.humanconnectome.org

These are specified in the “Data availability” subsection of our revised manuscript.

https://doi.org/10.7554/eLife.38844.024

Article and author information

Author details

  1. Masahiro Yamashita

    Brain Information Communication Research Laboratory Group, Advanced Telecommunications Research Institute International, Kyoto, Japan
    Contribution
    Conceptualization, Resources, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-1520-2548
  2. Yujiro Yoshihara

    Department of Psychiatry, Graduate School of Medicine, Kyoto University, Kyoto, Japan
    Contribution
    Resources, Data curation
    Competing interests
    No competing interests declared
  3. Ryuichiro Hashimoto

    Medical Institute of Developmental Disabilities Research, Showa University, Tokyo, Japan
    Contribution
    Resources, Data curation
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-9661-3412
  4. Noriaki Yahata

    1. Department of Youth Mental Health, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
    2. Molecular Imaging Center, National Institute of Radiological Sciences, Chiba, Japan
    Contribution
    Resources, Data curation
    Competing interests
    No competing interests declared
  5. Naho Ichikawa

    Department of Psychiatry and Neurosciences, Graduate School of Biomedical and Health Sciences, Hiroshima University, Hiroshima, Japan
    Contribution
    Resources, Data curation
    Competing interests
    No competing interests declared
  6. Yuki Sakai

    1. Brain Information Communication Research Laboratory Group, Advanced Telecommunications Research Institute International, Kyoto, Japan
    2. Department of Psychiatry, Graduate School of Medical Science, Kyoto Prefectural University of Medicine, Kyoto, Japan
    Contribution
    Resources, Data curation
    Competing interests
    No competing interests declared
  7. Takashi Yamada

    1. Brain Information Communication Research Laboratory Group, Advanced Telecommunications Research Institute International, Kyoto, Japan
    2. Medical Institute of Developmental Disabilities Research, Showa University, Tokyo, Japan
    Contribution
    Resources, Data curation
    Competing interests
    No competing interests declared
  8. Noriko Matsukawa

    Department of Psychiatry, Graduate School of Medicine, Kyoto University, Kyoto, Japan
    Contribution
    Resources, Data curation
    Competing interests
    No competing interests declared
  9. Go Okada

    Department of Psychiatry and Neurosciences, Graduate School of Biomedical and Health Sciences, Hiroshima University, Hiroshima, Japan
    Contribution
    Resources, Data curation
    Competing interests
    No competing interests declared
  10. Saori C Tanaka

    Brain Information Communication Research Laboratory Group, Advanced Telecommunications Research Institute International, Kyoto, Japan
    Contribution
    Funding acquisition
    Competing interests
    No competing interests declared
  11. Kiyoto Kasai

    1. Department of Youth Mental Health, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
    2. Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
    Contribution
    Funding acquisition
    Competing interests
    No competing interests declared
  12. Nobumasa Kato

    Medical Institute of Developmental Disabilities Research, Showa University, Tokyo, Japan
    Contribution
    Funding acquisition
    Competing interests
    No competing interests declared
  13. Yasumasa Okamoto

    Department of Psychiatry and Neurosciences, Graduate School of Biomedical and Health Sciences, Hiroshima University, Hiroshima, Japan
    Contribution
    Funding acquisition
    Competing interests
    No competing interests declared
  14. Ben Seymour

    1. Brain Information Communication Research Laboratory Group, Advanced Telecommunications Research Institute International, Kyoto, Japan
    2. Computational and Biological Learning Laboratory, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
    3. Center for Information and Neural Networks, National Institute of Information and Communications Technology, Osaka, Japan
    Contribution
    Conceptualization, Supervision, Funding acquisition, Project administration, Writing—review and editing
    For correspondence
    bjs49@cam.ac.uk
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-1724-5832
  15. Hidehiko Takahashi

    Department of Psychiatry, Graduate School of Medicine, Kyoto University, Kyoto, Japan
    Contribution
    Conceptualization, Supervision, Funding acquisition, Project administration, Writing—review and editing
    For correspondence
    hidehiko@kuhp.kyoto-u.ac.jp
    Competing interests
    No competing interests declared
  16. Mitsuo Kawato

    Brain Information Communication Research Laboratory Group, Advanced Telecommunications Research Institute International, Kyoto, Japan
    Contribution
    Conceptualization, Formal analysis, Supervision, Funding acquisition, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing
    For correspondence
    kawato@atr.jp
    Competing interests
    No competing interests declared
  17. Hiroshi Imamizu

    1. Brain Information Communication Research Laboratory Group, Advanced Telecommunications Research Institute International, Kyoto, Japan
    2. Department of Psychology, The University of Tokyo, Tokyo, Japan
    Contribution
    Conceptualization, Supervision, Funding acquisition, Investigation, Methodology, Writing—original draft, Writing—review and editing
    For correspondence
    imamizu@gmail.com
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-1024-0051

Funding

Council for Science, Technology and Innovation (ImPACT Program)

  • Masahiro Yamashita
  • Mitsuo Kawato
  • Hiroshi Imamizu

Japan Agency for Medical Research and Development (Brain/MINDS)

  • Kiyoto Kasai

Wellcome Trust

  • Ben Seymour

Arthritis Research UK (21357)

  • Ben Seymour

Ministry of Education, Culture, Sports, Science, and Technology (‘Development of BMI Technologies for Clinical Application’ of the Strategic Research Program for Brain Sciences and JP18dm0307008)

  • Mitsuo Kawato

Japan Society for the Promotion of Science (KAKENHI 26120002)

  • Hiroshi Imamizu

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This research was conducted as the ‘Development of BMI Technologies for Clinical Application’ of the Strategic Research Program for Brain Sciences supported by Japan Agency for Medical Research and Development (AMED). This research was supported by AMED under Grant Number JP18dm0307008. Drs. Yamashita, Kawato, and Imamizu were also supported by the ImPACT Program of Council for Science, Technology and Innovation (Cabinet Office, Government of Japan). Dr. Imamizu was also partially supported by JSPS KAKENHI Grant Number 26120002. Dr. Kasai was partially supported by Brain/MINDS, AMED. Data were provided in part by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University.

Ethics

Human subjects: The ATR dataset was acquired using a protocol (#12-101) conducted according to the Declaration of Helsinki and approved by the Ethics Committee at Advanced Telecommunications Research Institute International. All participants gave written informed consent. Data from the SCZ group were acquired under a study design that was approved by the Committee on Medical Ethics (#R0027) of Kyoto University and was conducted in accordance with the Code of Ethics of the World Medical Association. All participants gave written informed consent. Data from the MDD group were acquired under a study protocol (#E-38) that was approved by the Ethics Committee of Hiroshima University. All participants gave written informed consent. Data from the OCD group were acquired under a study protocol (#RBMR-C-1098-5) that was approved by the Medical Committee on Human Studies at the Kyoto Prefectural University of Medicine. All participants gave written informed consent. Data from the ASD group at the University of Tokyo were acquired under study protocols (#3048 and #3150) approved by the Ethics Committee of the Graduate School of Medicine and Faculty of Medicine at the University of Tokyo. All participants gave written informed consent. Data from the ASD group at Showa University were acquired under a study protocol (#893) that was approved by the Ethics Committee of the Faculty of Medicine of Showa University. All participants gave written informed consent.

Senior Editor

  1. Michael J Frank, Brown University, United States

Reviewing Editor

  1. Michael Breakspear, QIMR Berghofer Medical Research Institute, Australia

Publication history

  1. Received: June 1, 2018
  2. Accepted: December 8, 2018
  3. Accepted Manuscript published: December 10, 2018 (version 1)
  4. Version of Record published: January 8, 2019 (version 2)

Copyright

© 2018, Yamashita et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 944
    Page views
  • 195
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
