Cortical network architecture for context processing in primate brain

  1. Zenas C Chao (corresponding author)
  2. Yasuo Nagasaka
  3. Naotaka Fujii (corresponding author)
  Affiliation: RIKEN Brain Science Institute, Japan
Research Article
Cite as: eLife 2015;4:e06121 doi: 10.7554/eLife.06121

Abstract

Context is information linked to a situation that can guide behavior. In the brain, context is encoded by sensory processing and can later be retrieved from memory. How context is communicated within the cortical network in sensory and mnemonic forms is unknown due to the lack of methods for high-resolution, brain-wide neuronal recording and analysis. Here, we report the comprehensive architecture of a cortical network for context processing. Using hemisphere-wide, high-density electrocorticography, we measured large-scale neuronal activity from monkeys observing videos of agents interacting in situations with different contexts. We extracted five context-related network structures including a bottom-up network during encoding and, seconds later, cue-dependent retrieval of the same network with the opposite top-down connectivity. These findings show that context is represented in the cortical network as distributed communication structures with dynamic information flows. This study provides a general methodology for recording and analyzing cortical network neuronal communication during cognition.

https://doi.org/10.7554/eLife.06121.001

eLife digest

If we see someone looking frightened, the way we respond to the situation is influenced by other information, referred to as the ‘context’. For example, if the person is frightened because another individual is shouting at them, we might try to intervene. However, if the person is watching a horror video we may decide that they don't need our help and leave them to it. Nevertheless, it is not clear how the brain processes the context of a situation to inform our response.

Here, Chao et al. developed a new method to study electrical activity across the whole of the brain and used it to study how monkeys process context in response to several different social situations. In the experiments, monkeys were shown video clips in which one monkey (known as the ‘video monkey’) was threatened by a human or another monkey, or faced an empty wall (i.e., three different contexts). Afterwards, the video monkey displayed either a frightened expression or a neutral one. Chao et al. found that if the video monkey looked frightened by the context, the monkeys watching the video clip shifted their gaze to observe the apparent threat. How these monkeys shifted their gaze depended on the context, but this behavior was absent when the video monkey gave a neutral expression.

The experiments used an array of electrodes that covered a wide area of the monkeys' brains to record electrical activity of nerve cells as the monkeys watched the videos. Chao et al. investigated how brain regions communicated with each other in response to different contexts, and found that information about the context was represented in the interactions between distant brain regions. The monkeys' brains sent information from a region called the temporal cortex (which is involved in processing sensory and social information) to another region called the prefrontal cortex (which is involved in functions such as reasoning, attention, and memory). Seconds later, the flow of information was reversed as the monkeys used information about the context to guide their behavior.

Chao et al.'s findings reveal how information about the context of a situation is transmitted around the brain to inform a response. The next challenge is to experimentally manipulate the identified brain circuits to investigate if problems in context processing could lead to the inappropriate responses that contribute to schizophrenia, post-traumatic stress disorder and other psychiatric disorders.

https://doi.org/10.7554/eLife.06121.002

Introduction

Context is the contingent sensory or cognitive background for a given situation. Different contexts can dramatically alter perception, cognition, or emotional reactions and decision-making, and in the brain network context can be represented during sensory encoding or mnemonic retrieval. The study of context is important for understanding the link between perception and cognition, in terms of both behavioral and neural processing, and the neural mechanisms underlying contextual information processing have been studied in a variety of domains including visual perception (Bar, 2004; Schwartz et al., 2007), emotion (De Gelder, 2006; Barrett et al., 2007; Barrett and Kensinger, 2010), language (Hagoort, 2005; Aravena et al., 2010), and social cognition (Ibañez and Manes, 2012). In the brain, context is proposed to require an interplay between bottom-up and top-down information processing in distributed neural networks (Tononi and Edelman, 1997; Friston, 2005). However, a comprehensive functional view of the brain circuits that mediate contextual processing remains unknown because bottom-up and top-down processes are often concurrent and interdependent, making the temporal and spatial resolution of their neural network organization difficult to separate.

To understand contextual information processing, we developed a fundamentally new approach to study high-resolution brain network architecture. The approach combines broadband neural recording of brain activity at high spatial and temporal resolution with big data analytical techniques to enable the computational extraction of latent structure in functional network dynamics. We employed this novel methodological pipeline to identify functional network structures underlying fast, internal, concurrent, and interdependent cognitive processes during context processing in monkeys watching video clips with sequentially staged contextual scenarios. Each scenario contained a conspecific showing emotional responses preceded by different situational contexts. With specific combinations of context and response stimuli, this paradigm allowed an examination of context-dependent brain activity and behavior by isolating context processing as a single variable in the task.

To measure large-scale brain network dynamics with sufficient resolution, we used a 128-channel hemisphere-wide high-density electrocorticography (HD-ECoG) array to quantify neuronal interactions with high spatial, spectral, and temporal resolution. This ECoG system has wider spatial coverage than conventional ECoG and LFP (Buschman and Miller, 2007; Pesaran et al., 2008; Haegens et al., 2011) and single–unit activity (Gregoriou et al., 2009), higher spatial resolution than MEG (Gross et al., 2004; Siegel et al., 2008), broader bandwidth than EEG (Hipp et al., 2011), and superior temporal resolution to fMRI (Rees and House, 2005; Freeman et al., 2011). After recording, we interrogated large-scale functional network dynamics using a multivariate effective connectivity analysis to quantify information content and directional flow within the brain network (Blinowska, 2011; Chao and Fujii, 2013) followed by big data analytical approaches to search the database of broadband neuronal connectivity for a latent organization of network communication structures.

Results

Large-scale recording of brain activity during video presentations

Monkeys watched video clips of another monkey (video monkey, or vM) engaging with a second agent (Figure 1) while cortical activity was recorded with a 128-channel ECoG array covering nearly an entire cerebral hemisphere. Three monkeys participated, one with a right hemisphere array (Subject 1), and two with a left hemisphere array (Subjects 2 and 3) (Figure 1—figure supplement 1, and Table 1). The data are fully accessible online and can be downloaded from the website Neurotycho.org.

Figure 1 (with 1 supplement)
Subjects observe situational contexts with high-density electrocorticography (HD-ECoG) recording.

We recorded 128-channel HD-ECoG signals from monkeys viewing video clips of a conspecific under three different situational contexts and two responses. The subject (lower-left, green circles represent ECoG electrodes) was seated in front of a TV monitor showing video clips consisting of a Waiting period of 2.5 s followed by a Context period of 1.5 s with one of three interactions between a video monkey (vM) on the left and a second agent on the right: vM threatened by a human (Ch), threatened by another monkey (Cm), or an empty wall (Cw). Next, a curtain closed to conceal the second agent followed by a Response period of 3 s with the vM showing either a frightened (Rf), or neutral expression (Rn). Pairwise combination of the contexts and responses produced six different video clips.

https://doi.org/10.7554/eLife.06121.003

The video clips started with a context between the two agents (Context period) followed by a response to the context (Response period). Six different video clips were created from three contexts, vM threatened by a human (Ch), threatened by another monkey (Cm), or facing an empty wall (Cw), combined with two responses, vM showing a frightened expression (Rf), or neutral expression (Rn), which were termed ChRf, CmRf, CwRf, ChRn, CmRn, and CwRn (see Videos 1–6). Each video contained audio associated with the event, for example, sounds of a threatening human (Ch) and a frightened monkey (Rf). Each video represents a unique social context-response scenario. For example, ChRf shows a human threatening a monkey (vM) followed by the monkey's frightened response. These staged presentations were designed to examine whether different contexts (Ch, Cm, or Cw) would give rise to context-dependent brain activity even with the same responses (Rf or Rn).

Eye movements demonstrate context- and response-dependent behaviors

During the task, subjects freely moved their eyes to observe the video interactions. We monitored eye movements to examine these spontaneous behavioral reactions and the associated zones in the video. We divided the trials into two conditions based on whether the context stimulus was visually perceived: C+ where the subject was looking at the screen during the Context period, and C− where the subject was either closing its eyes or looking outside of the screen. Example eye movements are shown in Figure 2—figure supplement 1.

We first investigated which side of the video monitor the monkey attended. When the context was perceived (C+) and the response stimulus was Rf, subjects focused more on the right section during the Response period, indicating more interest in the curtain, or the threat behind the curtain, than in the frightened vM (Figure 2A). This preference was absent when the response stimulus was Rn or the context was not visually perceived (C−). This is behavioral evidence that gaze direction preference required not only the vM response, but also perception of the preceding context, which demonstrated a cognitive association between the perception of the context and response stimuli.

Figure 2 (with 1 supplement)
Context- and response-dependent eye movements.

(A) Measurements of gaze shifting revealed a behavioral association between the context and response phases of the task. The gaze positions averaged from three subjects are shown for each trial type (ChRf, CmRf, CwRf, ChRn, CmRn, and CwRn) and condition (C+ and C−). Gaze shifting was quantified by gaze positions significantly different from baseline values (αBonf = 0.05, baseline: gray bar), and was found only in Rf trials under the C+ condition (upper-left panel, the timing of gaze shifts is indicated on top, where the color represents the trial type indicated on the right). Black vertical lines represent the following events (see labels on the x-axis): (a) onset of the Context period, (b) the curtain starts closing, (c) the curtain is fully closed, (d) onset of the Response period, and (e) end of the Response period (onset of the next trial). (B) Context and response dependence in gazing behavior. Gaze positions between different trial types were compared, separately in C+ and C−. For each comparison (y-axis), the timing of significant differences is shown as circles (αBonf = 0.05), where blue, green, and red circles represent context dependence in Rf, context dependence in Rn, and response dependence, respectively. Gazing behavior showed both response dependence and context dependence, but only in C+.

https://doi.org/10.7554/eLife.06121.011

We then compared gazing behaviors from different trial types to identify behaviors selective to different scenarios (ChRf, CmRf, CwRf, ChRn, CmRn, or CwRn) and conditions (C+ or C−). We performed nine pairwise comparisons on gaze positions from different scenarios, separating C+ and C− conditions, to examine their context and response dependence. For context dependence, we compared behaviors from trials with different context stimuli but the same response stimulus (6 comparisons: CmRf vs CwRf, CwRf vs ChRf, and ChRf vs CmRf for context dependence in Rf; CmRn vs CwRn, CwRn vs ChRn, and ChRn vs CmRn for context dependence in Rn). For response dependence, we compared behaviors from trials with the same context stimulus but with different response stimuli (3 comparisons: CmRf vs CmRn, CwRf vs CwRn, and ChRf vs ChRn).
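The nine comparisons described above can be enumerated mechanically. The sketch below is only an illustration of that bookkeeping; the function name and the tuple layout are assumptions, and the order of the two trial types within each pair is arbitrary here.

```python
from itertools import combinations

CONTEXTS = ["Ch", "Cm", "Cw"]   # human threat, monkey threat, empty wall
RESPONSES = ["Rf", "Rn"]        # frightened, neutral


def pairwise_comparisons():
    """Return (kind, trial_a, trial_b) tuples for the nine comparisons."""
    comparisons = []
    # Context dependence: different contexts, same response
    # (3 context pairs for each of the 2 responses = 6 comparisons).
    for response in RESPONSES:
        for ctx_a, ctx_b in combinations(CONTEXTS, 2):
            kind = "context in " + response
            comparisons.append((kind, ctx_a + response, ctx_b + response))
    # Response dependence: same context, different responses (3 comparisons).
    for ctx in CONTEXTS:
        comparisons.append(("response", ctx + "Rf", ctx + "Rn"))
    return comparisons


comparisons = pairwise_comparisons()
for kind, trial_a, trial_b in comparisons:
    print(kind, ":", trial_a, "vs", trial_b)
```

Running each comparison once per condition gives the 18 comparison-condition pairs used later for the network analysis.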

Both context and response dependence were found in gazing behavior (Figure 2B). In C+, significant differences in gaze position were found during the Response period between CmRf and CwRf, and CwRf and ChRf, but not between ChRf and CmRf (blue circles in left panel). This indicated that gaze shifting in CmRf and ChRf was comparable and stronger than in CwRf. This context dependence was absent when the response stimuli were Rn (green circles in left panel). Furthermore, a significant response dependence was found during the Response period for all contexts (CmRf vs CmRn, CwRf vs CwRn, and ChRf vs ChRn) (red circles in left panel) consistent with the results described in Figure 2A. In C−, the context and response dependence found in C+ was absent (right panel). These results indicated that the subjects' gaze shift during the Response period showed both response dependence (Rf > Rn) and context dependence (Cm ≈ Ch > Cw), but only when context was perceived (C+ > C−).

Mining of large-scale ECoG data for cortical network interactions

To analyze the large-scale ECoG dataset, we identified cortical areas over the 128 electrodes in the array by independent component analysis (ICA). Each independent component (IC) represented a cortical area with statistically independent source signals (Figure 3—figure supplement 1, and experimental parameters in Table 1).

Table 1

Experimental parameters

https://doi.org/10.7554/eLife.06121.013
| Group | Parameter | Subject 1 | Subject 2 | Subject 3 |
|---|---|---|---|---|
| Experiment | Hemisphere implanted | Right | Left | Left |
| Experiment | # of electrodes | 128 | 128 | 128 |
| Experiment | # of trials per class | 150 | 150 | 150 |
| Experiment | # of trials preserved per class (mean ± std) (see trial screening in ‘Materials and methods’) | 117.7 ± 3.5 | 122.2 ± 3.1 | 109.5 ± 3.6 |
| Experiment | # of C+ trials per class (mean ± std) | 64.8 ± 5.2 | 60.3 ± 6.7 | 57.3 ± 5.6 |
| Experiment | # of C− trials per class (mean ± std) | 52.8 ± 1.9 | 61.8 ± 8.1 | 52.2 ± 3.5 |
| ICA (see Figure 3—figure supplement 1) | # of ICs for 90% variance explained | 58 | 38 | 39 |
| ICA (see Figure 3—figure supplement 1) | # of ICs preserved (see IC screening in ‘Materials and methods’) | 49 (removed ICs 1, 2, 3, 4, 5, 11, 44, 46, and 47) | 33 (removed ICs 1, 2, 7, 8, and 29) | 36 (removed ICs 2, 10, and 27) |

We then measured the causality of a connection from one cortical area (source area) to another (sink area) with a multivariate effective connectivity measure based on Granger causality: direct directed transfer function (dDTF) (Korzeniewska et al., 2003), which can represent phase differences between the two source signals to provide a time-frequency representation of their asymmetric causal dependence. We acquired dDTFs from all connections for each trial type (12 types: six scenarios and two conditions), and measured event-related causality (ERC), by normalizing the dDTF of each time point and each frequency bin to the median of the corresponding baseline control values. Thus, ERCs represent the spectro-temporal dynamics of network interactions evoked by different scenarios and conditions. Examples of ERCs are shown in Figure 3A.
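As a rough illustration of the baseline normalization described above, the sketch below converts a time-frequency dDTF map into ERC values. It assumes the baseline median is taken per frequency bin and that decibels are computed as 10·log10 of the ratio; neither detail is stated in this section, and the function and argument names are hypothetical.

```python
import math
from statistics import median


def event_related_causality(ddtf, baseline_idx):
    """Normalize a dDTF map to its baseline and express it in dB.

    ddtf: list of rows, one per frequency bin, each a list over time points.
    baseline_idx: time indices that belong to the baseline period.
    Assumption: baseline median per frequency bin, dB = 10 * log10(ratio).
    """
    erc = []
    for row in ddtf:
        base = median(row[i] for i in baseline_idx)
        erc.append([10.0 * math.log10(value / base) for value in row])
    return erc


# Toy map: 2 frequency bins x 4 time points, first two points are baseline.
ddtf = [
    [0.01, 0.01, 0.02, 0.10],
    [0.50, 0.50, 0.50, 0.05],
]
erc = event_related_causality(ddtf, baseline_idx=[0, 1])
print(erc)
```

Positive values mark time-frequency bins where causality rose above baseline, negative values where it fell below.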

Figure 3 (with 3 supplements)
Identification of latent structures in context- and response-dependent cortical network interactions.

(A) Event-related causalities (ERCs) between cortical areas. Example ERCs for a connection (IC 8 to IC 14, the corresponding cortical areas shown on the top) in two scenarios (CmRf and CwRf in C+) from Subject 1 are shown. Each ERC represents the spectro-temporal dynamics of causality evoked by a scenario, calculated as the logarithmic ratio between the direct directed transfer function (dDTF) and corresponding baseline values (baseline: gray bar), and measured in decibel (dB). Black vertical lines represent task events explained in Figure 2. (B) ∆ERCs, or the significant differences in ERCs between the two trial types (CmRf − CwRf) (αFDR = 0.05, false discovery rate correction) are shown. The results were either 0 (no significant difference), +1 (significantly greater), or −1 (significantly weaker). (C) 3D tensor of ∆ERCs. The data for the entire study were organized in three dimensions: dynamics (top), function (middle), and anatomy (bottom). Top: ∆ERCs shown in B describe the dynamics of difference in causality of a connection between two trial types, presented as a vector in 3D space (illustrated as a bar, where each segment represents a ∆ERC value). Middle: For the same connection, ∆ERCs from other comparisons were pooled to describe the functional dynamics of the connection (illustrated as a plate). Bottom: Functional dynamics from all connections were pooled to summarize the functional network dynamics in a subject (illustrated as a block). The data from all subjects were further combined to assess common functional network dynamics across subjects. (D) Parallel factor analysis (PARAFAC) extracted five dominant structures from the 3D tensor with consistency (>80%, also see Figure 3—figure supplement 2). Each structure represented a unique pattern of network function, dynamics, and anatomy (e.g., Func. 1, Dyn. 1, and Anat. 1 for Structure 1).

https://doi.org/10.7554/eLife.06121.014

We compared ERCs from different trial types to identify networks selectively activated in different scenarios (ChRf, CmRf, CwRf, ChRn, CmRn, or CwRn) and conditions (C+ or C−). We performed nine pairwise comparisons on ERCs from different scenarios, separating C+ and C− trials, to examine their context and response dependence. To examine context dependence, we compared ERCs from trials with different contexts but the same response (6 comparisons). In contrast, to examine response dependence, we compared ERCs from trials with the same context but with different responses (3 comparisons). This approach is similar to the eye movement analysis (Figure 2B). The comparisons were performed with a subtractive approach to derive a significant difference in ERCs (∆ERCs, Figure 3B). Hence, ∆ERCs revealed network connections, with corresponding time and frequency, where ERCs were significantly stronger or weaker in one scenario compared to another.
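The ternary coding of ∆ERCs (0, +1, or −1 after false discovery rate correction) can be sketched as follows. This is a minimal illustration under stated assumptions: the Benjamini-Hochberg procedure stands in for the FDR correction, the per-bin p-values are taken as given because the underlying statistical test is not described in this section, and all function names are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which tests are rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def delta_erc(erc_a, erc_b, p_values, alpha=0.05):
    """Code each time-frequency bin as +1 (a > b), -1 (a < b), or 0."""
    significant = benjamini_hochberg(p_values, alpha)
    coded = []
    for a, b, sig in zip(erc_a, erc_b, significant):
        if not sig or a == b:
            coded.append(0)
        else:
            coded.append(1 if a > b else -1)
    return coded


# Toy example: four time-frequency bins for two trial types.
erc_a = [3.0, 0.5, -2.0, 1.0]
erc_b = [1.0, 0.4, 1.0, 1.2]
p_values = [0.001, 0.6, 0.01, 0.04]
print(delta_erc(erc_a, erc_b, p_values))  # [1, 0, -1, 0]
```

In the toy example the fourth bin has p = 0.04, which is below 0.05 but does not survive the correction, so it is coded as 0.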

We pooled ∆ERCs from all comparisons, conditions, connections, and subjects, to create a comprehensive broadband library of network dynamics for the entire study. To organize and visualize the dataset, we created a tensor with three dimensions: Comparison-Condition, Time-Frequency, and Connection-Subject, for the functional, dynamic, and anatomical aspects of the data, respectively (Figure 3C). The dimensionality of the tensor was 18 (nine comparisons under two conditions) by 3040 (160 time windows and 19 frequency bins) by 4668 (49 × 48 connections for Subject 1, 33 × 32 for Subject 2, and 36 × 35 for Subject 3).
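The stated tensor dimensions follow directly from the counts in the text and in Table 1. The short check below reproduces the arithmetic; the variable names are illustrative only.

```python
n_comparisons, n_conditions = 9, 2
n_time_windows, n_freq_bins = 160, 19
# Number of ICs preserved per subject (Table 1).
ics_preserved = {"Subject 1": 49, "Subject 2": 33, "Subject 3": 36}

dim_function = n_comparisons * n_conditions
dim_dynamics = n_time_windows * n_freq_bins
# Directed connections between distinct ICs: n * (n - 1) per subject.
dim_anatomy = sum(n * (n - 1) for n in ics_preserved.values())

tensor_shape = (dim_function, dim_dynamics, dim_anatomy)
print(tensor_shape)  # (18, 3040, 4668)
```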

To extract structured information from this high-volume dataset, we deconvolved the 3D tensor into multiple components by performing parallel factor analysis (PARAFAC), a generalization of principal component analysis (PCA) to higher order arrays (Harshman and Lundy, 1994) and measured the consistency of deconvolution under different iterations of PARAFAC (Bro and Kiers, 2003). Remarkably, we observed five dominant structures from the pooled ∆ERCs that represented functional network dynamics, where each structure contained a comprehensive fingerprint of network function, dynamics, and anatomy (Figure 3D, and Figure 3—figure supplement 2). These five structures were robust against model order selection for ICA (Figure 3—figure supplement 3).
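For readers unfamiliar with PARAFAC, the standard trilinear model can be written as below. The symbols are assumptions introduced here for illustration and do not appear in the original text: x_{ijk} is the ∆ERC tensor entry for comparison-condition i, time-frequency bin j, and connection-subject k, and the columns of A, B, and C hold the function, dynamics, and anatomy loadings of each structure, with R = 5 structures in this study.

```latex
\hat{x}_{ijk} = \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr},
\qquad
\min_{A,\,B,\,C} \; \sum_{i=1}^{18} \sum_{j=1}^{3040} \sum_{k=1}^{4668}
\left( x_{ijk} - \hat{x}_{ijk} \right)^{2}
```

The least-squares objective shown is the usual fitting criterion for PARAFAC; the specific fitting algorithm used in the study is not described in this section.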

Discrete structured representations of functional network dynamics

The five structures are shown in Figure 4 (Structures 1 and 2) and Figure 5 (Structures 3, 4, and 5). Each structure represented a unique functional network dynamics, described by its compositions in the three tensor dimensions. The first tensor dimension (panel A) represented the differences across comparisons for each structure. We identified the significant differences and reconstructed the activation levels to show how each structure was activated under different scenarios and conditions (see the ‘Materials and methods’). The second tensor dimension (panel B) represented spectro-temporal dynamics for each structure. The third tensor dimension (panel C) represented the anatomical connectivity pattern for each structure. We measured three connectivity statistics: (1) causal density is the sum of all outgoing and incoming causality for each area, showing areas with busy interactions; (2) causal outflow is the net outgoing causality of each area, indicating the sources and sinks of interactions; and (3) maximum flow between areas is the maximal causality of all connections between cortical areas (7 areas found with busy interactions were chosen) (see results for individual subjects in Figure 4—figure supplements 1–3). The extracted statistics were robust across all subjects with different electrode placements suggesting that the structures were bilaterally symmetric across hemispheres (Figure 4—figure supplement 4).
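The three connectivity statistics defined above can be illustrated on a small causality matrix. This is a sketch under assumptions: c[i][j] is taken to be the causality from source area i to sink area j, "net outgoing" is read as outgoing minus incoming, and the function names are hypothetical.

```python
def causal_density(c):
    """Sum of all outgoing and incoming causality for each area."""
    n = len(c)
    return [sum(c[i][j] + c[j][i] for j in range(n) if j != i)
            for i in range(n)]


def causal_outflow(c):
    """Net outgoing causality (outgoing minus incoming) for each area."""
    n = len(c)
    return [sum(c[i][j] - c[j][i] for j in range(n) if j != i)
            for i in range(n)]


def maximum_flow(c, sources, sinks):
    """Largest causality over all connections from one group of areas to another."""
    return max(c[i][j] for i in sources for j in sinks if i != j)


# Toy matrix for three areas; c[i][j] is causality from area i to area j.
c = [
    [0, 3, 1],
    [1, 0, 2],
    [0, 0, 0],
]
print(causal_density(c))           # [5, 6, 3]
print(causal_outflow(c))           # [3, 0, -3]
print(maximum_flow(c, [0, 1], [2]))  # 2
```

In the toy matrix, area 0 is a net source (positive outflow) and area 2 is a net sink (negative outflow), while area 1 has the busiest interactions (highest density) despite a net outflow of zero.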

Figure 4 (with 4 supplements)
Network structures for perception of context and response.

Each structure was defined by three dimensions: function, dynamics, and anatomy. (A) Function: The function dimension showed each structure's context and response dependence. Top: For each structure, the first tensor dimension contained 18 differences for nine pairwise comparisons in the C+ or C− condition. Significant differences are highlighted (*, α = 0.05, see the ‘Materials and methods’). Bottom: The comparisons with significant differences were used to reconstruct how each structure was selectively activated. Each oval and its vertical position represent the trial type and its activation level, respectively. Blue, green, or red arrows indicate significant context dependence under Rf, significant context dependence under Rn, and significant response dependence, respectively (each corresponds to a significance highlighted in the top panel). (B) Dynamics: The dynamics dimension indexed each structure's activation in different times and frequencies. Black vertical lines represent events, as explained in Figure 2. (C) Anatomy: The anatomy dimension showed each structure's activation in different connections. Three connectivity statistics, averaged across subjects after brain map registration, are shown on the lateral and medial cortices. Top: Cortical areas with greater causal density represent areas with busier interactions. Middle: Cortical areas with positive (red) and negative (blue) causal outflows represent the sources and sinks of interactions, respectively. Bottom: The direction and strength of each maximum flow between areas are indicated by the direction and size (and color) of an arrow, respectively. Seven cortical areas were determined for visualization: the visual (V), parietal (P), prefrontal (PF), medial prefrontal (mPF), motor (M), anterior temporal (aT), and posterior temporal (pT) cortices.

https://doi.org/10.7554/eLife.06121.018
Figure 5 (with 1 supplement)
Network structures for context representation and modulation.

The function (A), dynamics (B), and anatomy (C) dimensions of Structures 3, 4, and 5. Structures 3 and 4 represent the initial formation/encoding and later reactivation/retrieval of abstract context information, respectively, and Structure 5 represents context-dependent top-down feedback that modulates eye gaze or visual attention. Same presentation details as in Figure 4.

https://doi.org/10.7554/eLife.06121.023

Structure 1 was activated first, with context dependence only in the Context period (Cm > Ch > Cw) suggesting sensory processing that can discriminate between contextual stimuli, that is, context perception. The context dependence was weaker but remained in C− (C+ > C−), suggesting that auditory information in context stimuli was processed. The spectral dynamics of Structure 1 emerged primarily in the high-γ band (>70 Hz), and contained mostly bottom-up connections from the posterior to anterior parts of temporal cortex.

Structure 2 was the earliest activated in the Response period, with only response dependence (Rf > Rn), and was independent of whether context stimuli were visually perceived (C+ ≈ C−). Thus, Structure 2 corresponds to sensory processing that can discriminate between response stimuli, that is, response perception. Spectrally, Structure 2 emerged in both high-γ and β bands, and contained connections similar to those in Structure 1, with an additional communication channel from anterior temporal cortex to the prefrontal cortex (PFC). Therefore, Structures 1 and 2 represent the multisensory processing of audiovisual stimuli, and Structure 2 could underlie the additional evaluation of emotional valence associated with the stimuli.

Structure 3 was activated the second earliest in the Context period showing a generalized context dependence (Cm ≈ Ch > Cw) representing the abstract categorization of the context (‘an indeterminate agent is threatening vM’). Similar to Structure 1, the dependence in Structure 3 was weaker in C− (C+ > C−), which suggested that the creation of abstract contextual information depended on the initial perception of context stimuli. Structure 3 appeared mainly in the β band (10–30 Hz), and contained primarily bottom-up connections from the posterior temporal cortex (mainly the area TEO) to the anterior temporal cortex (mainly the temporal pole) and the lateral and medial PFC.

Structure 4 showed the same generalized context dependence as Structure 3, but during the Response period when context stimuli were absent and only in C+Rf (not in C+Rn and C−). The absence of context dependence in C+Rn and C− suggested that Structure 4 required both vM responses with high emotional valence and its context. Moreover, Structure 4 exhibited spatial and spectral characteristics similar to Structure 3 (Figure 5—figure supplement 1). We conclude that Structures 3 and 4 represent the same or very similar neural substrate, differing only in when and how they were activated. Structure 3 corresponds to the initial formation/encoding of the contextual information, while Structure 4 represents the Rf-triggered reactivation/retrieval of the contextual information. Therefore, Structures 3 and 4 represent the generalized, abstract perceptual and cognitive content of the context.

Structure 5 showed context dependence (Cm ≈ Ch > Cw) in C+Rf (not in C+Rn and C−), and response dependence (Rf > Rn) in C+ (not in C−) during the Response period, and appeared mainly in α and low-β bands (5–20 Hz). Anatomically, the structure showed primarily top-down connections between posterior temporal cortex, the anterior temporal cortex, and the lateral and medial PFC. Remarkably, Structure 5 is the only one demonstrating clear top-down connections, with the same context and response dependence as the gaze behavior (see Figure 2B). These results suggest that Structure 5 corresponds to a network for the context-dependent feedback modulation of eye gaze or visual attention during the task, and the other four structures index internal processes that lead to this behavioral modulation.

Functional, dynamical, and anatomical correlations between network structures

We investigated how the structures coordinated with each other during the task by examining how they correlated with each other in the functional, dynamical, and anatomical domains.

To study function, we evaluated how each structure's context and response dependence correlated with others', by measuring correlation coefficients of structures' differences in comparisons across contexts in Rf, across contexts in Rn, and across responses (Figure 6A) (detailed in the ‘Materials and methods’). Significant correlations between two structures indicated that one structure's activation affected another's, and vice versa, demonstrating a causal interdependence or a common external driver. Across contexts in Rf (Figure 6A, left), Structure 1 significantly correlated to Structures 3 and 4, which were themselves significantly correlated to Structure 5. However, across contexts in Rn (Figure 6A, middle), a significant correlation was found only between Structures 1 and 3. These results confirmed that sensory perception of the context stimuli could be significantly correlated to the formation of an abstract context, and, in turn, this abstract context could be significantly correlated to its reactivation and top-down modulation when a response had high emotional valence. Across responses (Figure 6A, right), Structure 2 significantly correlated to Structure 4, which was itself significantly correlated to Structure 5. This indicated that top-down modulation is the integration of response information and abstract context information.

Figure 6
Coordination and co-activation of network structures.

(A) Functional coordination: The coordination between structures was evaluated by the correlation coefficients between structures' context and response dependence (the differences shown in Figures 4A, 5A). Each panel illustrates how Structure i (y-axis) correlated with Structure j (x-axis) in context dependence in Rf (left), context dependence in Rn (middle), and response dependence (right). Significant correlations are indicated as asterisks (α = 0.05) (see ‘Materials and methods’). (B) Dynamic co-activation: The dynamics correlation was shown by correlation coefficients between structures' temporal and spectral activation. Each panel shows how Structure i correlated with Structure j in temporal dynamics (left) and frequency profile (right). Significant correlations are indicated as asterisks (α = 0.05). (C) Anatomical overlap: The anatomical similarity was indexed by the ratio of shared anatomical connections between structures. Each panel illustrates the ratio of the number of shared connections between Structures i and j and the total number of connections in Structure i. Results obtained from three subjects are shown separately. (D) Undirected pathways of connections shared by all structures for each subject (top), and those appearing in at least one structure for each subject (bottom). The lateral cortical surface is shown on the left for Subject 1, and on the right for Subjects 2 and 3. Shared pathways (lines) between two cortical areas (circles) of the top 1, 5, 10, and 25% connections are shown. Pathways with greater strengths are overlaid on those with weaker strengths.

https://doi.org/10.7554/eLife.06121.025

To examine dynamics, we tested whether network structures had mutually exclusive or overlapping spectro-temporal dynamics. We measured the temporal dynamics of each structure by summing up the activation in the second tensor dimension across frequencies. Significant correlations in temporal dynamics were found between structures activated in the Context period (Structures 1 and 3) and the Response period (Structures 2, 4, and 5) (Figure 6B, left). We then measured the spectral profile of each structure by summing up the activation in the second tensor dimension over time. Significant correlations in spectral profiles were found among structures with β band activation (Structures 2, 3, 4, and 5) (Figure 6B, right).
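The marginal profiles described above can be sketched as follows: a structure's time-frequency loading is summed across frequencies to obtain its temporal dynamics and summed over time to obtain its spectral profile, and the profiles of two structures are then correlated. The sketch assumes a time-major flattening of the time-frequency dimension and a Pearson correlation coefficient; neither is specified in this section, and the function names are hypothetical.

```python
import math


def temporal_and_spectral_profiles(loading, n_time, n_freq):
    """Split a flattened time-frequency loading into two marginal profiles.

    Assumption: time-major order, i.e. index = t * n_freq + f.
    """
    grid = [loading[t * n_freq:(t + 1) * n_freq] for t in range(n_time)]
    temporal = [sum(row) for row in grid]  # sum across frequencies
    spectral = [sum(grid[t][f] for t in range(n_time))
                for f in range(n_freq)]    # sum over time
    return temporal, spectral


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy loadings for two structures: 3 time windows x 2 frequency bins.
structure_a = [1, 0, 2, 0, 3, 0]   # active only in the first frequency bin
structure_b = [0, 2, 0, 4, 0, 6]   # active only in the second frequency bin
temp_a, spec_a = temporal_and_spectral_profiles(structure_a, 3, 2)
temp_b, spec_b = temporal_and_spectral_profiles(structure_b, 3, 2)
print(pearson(temp_a, temp_b), pearson(spec_a, spec_b))
```

The toy structures rise together over time but occupy different frequency bins, so their temporal dynamics are positively correlated while their spectral profiles are negatively correlated. For the real data the call would use 160 time windows and 19 frequency bins.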

To investigate anatomy, we identified the directed connections with the top 10% of strengths in the third tensor dimension, and examined which of these top 10% connections were shared between structures for each subject (Figure 6C). The proportion of shared connections was particularly high (>70% shared) between Structures 3 and 4 (abstract contextual information), and between Structures 1 and 2 (perception). We then examined undirected pathways, which ignore the directionality of connections, and found pathways shared by all structures and subjects within the temporal cortex and from the temporal cortex to PFC (Figure 6D, top). Pathways appearing in at least one structure were widespread across cortex (Figure 6D, bottom).
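The overlap index can be illustrated with a short sketch. It assumes that each structure's connection strengths are stored as a mapping from a directed (source, target) pair to a strength; this data layout, the function names, and the rule used to round the number of retained connections are hypothetical choices made for illustration.

```python
def top_connections(strengths, fraction=0.10):
    """Return the set of directed connections whose strength ranks in the top fraction."""
    n_keep = max(1, int(len(strengths) * fraction))  # keep at least one connection
    ranked = sorted(strengths, key=strengths.get, reverse=True)
    return set(ranked[:n_keep])


def shared_ratio(strengths_i, strengths_j, fraction=0.10):
    """Shared top connections of Structures i and j, divided by the top connections of Structure i."""
    top_i = top_connections(strengths_i, fraction)
    top_j = top_connections(strengths_j, fraction)
    return len(top_i & top_j) / len(top_i)


# Toy example with four directed connections; a fraction of 0.5 keeps the two strongest.
structure_i = {(0, 1): 0.9, (1, 2): 0.8, (2, 0): 0.1, (0, 2): 0.2}
structure_j = {(0, 1): 0.7, (1, 2): 0.1, (2, 0): 0.9, (0, 2): 0.3}
toy_ratio = shared_ratio(structure_i, structure_j, fraction=0.5)
```

Because the denominator is the number of top connections in Structure i, the index need not be symmetric when the two structures have different numbers of connections.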

These results demonstrate the functional coordination and spatio-spectro-temporal co-activation of the five identified network structures, and reveal the multiplexing property of large-scale neuronal interactions in brain: simultaneous information transfer in similar frequency bands along similar anatomical pathways could be functionally reconstituted into distinct cognitive operations depending on other networks' ongoing status. This type of information would be difficult to extract from traditional EEG/MEG/fMRI analyses.

Discussion

In this study, we demonstrate that context can be represented by dynamic communication structures involving distributed brain areas and coordinated within large-scale neuronal networks, or neurocognitive networks (Varela et al., 2001; Fries, 2005; Bressler and Menon, 2010; Siegel et al., 2012). Our analysis combines three critical properties of neurophysiology—function, dynamics, and anatomy—to provide a high-resolution large-scale description of brain network dynamics for context. The five network structures we identified reveal how contextual information can be encoded and retrieved to modulate behavior with different bottom-up or top-down configurations. The coordination of distributed brain areas explains how context can regulate diverse neurocognitive operations for behavioral flexibility.

Context is encoded by interactions of large-scale network structures

These findings show that context can be encoded in large-scale bottom-up interactions from the posterior temporal cortex to the anterior temporal cortex and the lateral and medial PFC. The PFC is an important node in the ‘context’ network (Miller and Cohen, 2001; Bar, 2004), where the lateral PFC is believed to be critical for establishing contingencies between contextually related events (Fuster et al., 2000; Koechlin et al., 2003), and the medial PFC is involved in context-dependent cognition (Shidara and Richmond, 2002) and conditioning (Fuster et al., 2000; Koechlin et al., 2003; Frankland et al., 2004; Quinn et al., 2008; Maren et al., 2013). Our results indicate that abstract contextual information can be encoded not only within the PFC, but in PFC interactions with lower-level perceptual areas in the temporal cortex. These dynamic interactions between unimodal sensory and multimodal association areas could explain the neuronal basis of why context networks can affect a wide range of cognitive processes, from lower-level perception to higher-level executive functions.

Apart from the bottom-up network structure that encodes abstract context, we discovered other network structures that either process lower-level sensory inputs for context encoding or integrate contextual information for behavioral modulation. Evidently, brain contextual processing, from initial perception to subsequent retrieval, is represented not by sequential activation but rather by sequential modular communication among participating brain areas. Thus, we believe that the network structures we observed represent a module of modules, or ‘meta-module’ for brain communication connectivity. Further investigation of this meta-structure organization for brain network communication could help determine how deficits in context processing in psychiatric disorders such as schizophrenia (Barch et al., 2003) and post-traumatic stress disorder (Milad et al., 2009) could contribute to their etiology.

Cognition as a modular organization consisting of network structures

These results suggest a basic structural organization of large-scale communication within brain networks that coordinate context processing, and provide insight into how apparently seamless cognition is constructed from these network communication modules. In contrast to previous studies where brain modularity is defined as a ‘community’ of spatial connections (Bullmore and Sporns, 2009; Sporns, 2011), or coherent oscillations among neuronal populations in overlapping frequency bands (Siegel et al., 2012), our findings provide an even more general yet finer grained definition of modularity based on not only anatomical and spectral properties, but also temporal, functional, and directional connectivity data. The relationships among network properties in the functional, temporal, spectral, and anatomical domains revealed network structures whose activity coordinated with each other in a deterministic manner (Figure 7A), despite being highly overlapping in time, frequency, and space (Figure 7B). Such multiplexed, yet large-scale, neuronal network structures could represent a novel meta-structure organization for brain network communication. Further studies will be needed to show whether these structures are components of cognition.

Context as a sequence of interactions between network structures.

(A) Coordination between network structures (S1 to S5, circles), under Rn (top) or Rf (bottom) responses. In both response contingencies, context perception (S1) encoded contextual information (S3). However, when the response stimulus contained high emotional valence (Rf, bottom), response perception (S2) reactivates the contextual information (S4), resulting in top-down modulation feedback (S5) that shares the same context and response dependence as the gazing behavior (black arrow and rounded rectangles). Green, blue, and red arrows represent correlations in context dependence in Rn, context dependence in Rf, and in response dependence, respectively (see Figure 6A). (B) Temporal, spectral, and spatial profiles and overlap in defined network structures. Network structures can be characterized by frequency range (labeled on the left) and connectivity pattern (shown on the right). Their temporal activations are plotted over trial time, with a ‘sound-like’ presentation, where a higher volume represents stronger activation. Black vertical lines represent the events as indicated in Figure 2.

https://doi.org/10.7554/eLife.06121.026

Applications for large-scale functional brain network mapping

We developed an analytical approach using an unbiased deconvolution of comprehensive network activity under well-controlled and staged behavioral task conditions. This workflow enabled us to identify novel network structures and their dynamic evolution during ongoing behavior. In principle, this approach can be generally useful to investigate how network structures link neural activity and behavior. However, we caution that the latent network structures we identified were extracted computationally, and therefore will require further confirmatory experiments to verify their biological significance, particularly the causality of the connectivity patterns within each structure and the functional links bridging different structures. The biological meaning of the identified network structures could be established by selective manipulation of neuronal pathways by electrical or optogenetic stimulation linked to the ECoG array by neurofeedback, or by neuropharmacological manipulations.

The general class of network structures we identified is not necessarily unique to context. By recording with a hemisphere-wide ECoG array and applying our analytical methodologies to other cognitive behaviors and tasks in non-human primates, we fully expect to observe similar network structures. Our approach of pooling large-scale data across subjects may be useful to extract network structures that are generalizable, because neural processes specific to individual subjects or trials will cancel out. Indeed, the stable and consistent trial responses across subjects in our chronic ECoG recordings suggest that the network structures we isolated may be candidate innate, elemental units of brain organization. Conversely, future identification of unique differences in network structures between subjects could offer insight into structures related to individual trait and state variability, and the network-level etiology of brain diseases (Belmonte et al., 2004; Uhlhaas and Singer, 2006).

Materials and methods

Subjects and materials

Customized 128-channel ECoG electrode arrays (Unique Medical, Japan) containing 2.1 mm diameter platinum electrodes (1 mm diameter exposed from a silicone sheet) with an inter-electrode distance of 5 mm were chronically implanted in the subdural space in three Japanese macaques (Subjects 1, 2, and 3). The details of surgical methods can be found on Neurotycho.org. In Subject 1, electrodes were placed to cover most of the lateral surface of the right hemisphere, and also the medial parts of the frontal and occipital lobes. In Subject 2, a similar layout was used, but in the left hemisphere. In Subject 3, all electrodes were placed on the lateral surface of the left hemisphere, and no medial parts were covered. The reference electrode was also placed in the subdural space, and the ground electrode was placed in the epidural space. Electrical cables leading from the ECoG electrodes were connected to Omnetics connectors (Unique Medical) affixed to the skull with an adaptor and titanium screws. The locations of the electrodes were identified by overlaying magnetic resonance imaging scans and x-ray images. For brain map registration, the electrode locations and the brain outlines from Subjects 1 and 3 were manually registered to those from Subject 2 based on 13 markers in the lateral hemisphere and 5 markers in the medial hemisphere (see Figure 1—figure supplement 1).

All experimental and surgical procedures were performed in accordance with the experimental protocols (No. H24-2-203(4)) approved by the RIKEN ethics committee and the recommendations of the Weatherall report, ‘The use of non-human primates in research’. Implantation surgery was performed under sodium pentobarbital anesthesia, and all efforts were made to minimize suffering. No animal was sacrificed in this study. Overall care was managed by the Division of Research Resource Center at RIKEN Brain Science Institute. The animal was housed in a large individual enclosure with other animals visible in the room, and maintained on a 12:12-hr light:dark cycle. The animal was given food (PS-A; Oriental Yeast Co., Ltd., Tokyo, Japan) and water ad libitum, and also daily fruit/dry treats as a means of enrichment and novelty. The animal was occasionally provided toys in the cage. The in-house veterinary doctor checked the animal and updated daily feedings in order to maintain weight. We have attempted to offer as humane treatment of our subject as possible.

Task design

During the task, each monkey was seated in a primate-chair with its arms and head gently restrained, while a series of video clips was presented on a monitor (Videos 1–6). In one recording session, each of six video clips was presented 50 times, and all 300 stimuli were presented in a pseudorandom order in which the same stimulus was never presented twice in succession. In order to keep the monkey's attention on the videos, food items were given after every 100 stimuli. Each monkey participated in three recording sessions within a week. Each stimulus consisted of three periods: Waiting, Context, and Response periods. During the Waiting period, a still picture created by pixel-based averaging and randomizing all frames of the stimuli was presented without sound for 2. During the first 0.5 s of the Context period, a still image of an actor (a monkey) and an opponent (a monkey, a human, or a wall) was presented with the sound associated with the opponent. The actor was always positioned on the left side of the image. Then a curtain in the video started to close from the right side toward the center to cover the opponent. The curtain closing animation took 0.5 s, and the curtain stayed closed for another 0.5 s. During the Response period, one of two emotional expressions of the actor (frightened or neutral) was presented with sound for 3 s, followed by the Waiting period of the next trial.

ECoG and behavior recordings

An iMac personal computer (Apple, USA) was used to present the stimuli on a 24-in LCD monitor (IOData, Japan) located 60 cm away from the subject. The sound was presented through one MA-8BK monitor speaker (Roland, Japan) attached to the PC. The experiments were run by a program developed in MATLAB (MathWorks, USA) with Psychtoolbox-3 extensions (Brainard, 1997). The same PC was used to control the experiments and the devices for recording the monkey's gaze and neural signals via a USB-1208LS data acquisition device (Measurement Computing Co., USA). A custom-made eye-tracking system was used for monitoring and recording the monkey's left (Subject 1) or right (Subjects 2 and 3) eye at 30 Hz sampling (Nagasaka et al., 2011). Cerebus data acquisition systems (Blackrock Microsystems, USA) were used to record ECoG signals with a sampling rate of 1 kHz.

Trial screening

Trials during which the subject's eye position was within the screen area more than 80% of the time during the first 0.5 s of the Context period were classified as C+ trials. The rest of the trials were identified as C− trials, where the subject either closed its eyes or the eye position was outside the screen or outside the recording range (±30°).

Data analysis

ICA

ICA was performed on the data combining C+ and C− trials to acquire a common basis for easier interpretation of the results. On the other hand, dDTF and the following analyses (ERC and SD-ERC) were calculated from C+ and C− trials separately, and later combined in PARAFAC analysis.

Preprocessing

The 50 Hz line noise was removed from raw ECoG data by using the Chronux toolbox (Bokil et al., 2010). The data were then downsampled by a factor of four, resulting in a sampling rate of 250 Hz. Trials with abnormal spectra were rejected by using an automated algorithm from the EEGLAB library (Delorme et al., 2011), which has been suggested as the most effective method for artifact rejection (Delorme et al., 2007). The numbers of trials preserved are shown in Table 1.

Model order selection

The model order, that is, the number of components, for ICA was determined by PCA of the data covariance matrix, as the number of eigenvalues that accounted for 90% of the total observed variance. The resulting model order is shown in Table 1.
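The 90% criterion can be sketched as below, assuming the eigenvalues of the data covariance matrix have already been computed. The function name and the example eigenvalues are hypothetical; the actual model orders are those reported in Table 1.

```python
def model_order_from_eigenvalues(eigenvalues, variance_fraction=0.90):
    """Smallest number of leading eigenvalues whose sum reaches the target fraction of total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative >= variance_fraction * total:
            return count
    return len(ordered)


# Illustrative eigenvalues only: the cumulative variance is 50%, 80%, 95%, ...
toy_order = model_order_from_eigenvalues([50, 30, 15, 4, 1])
```

In this toy case three components are needed, because the first two explain only 80% of the variance.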

ICA algorithm

ICA was performed in multiple runs using different initial values and different bootstrapped data sets by using the ICASSO package (Himberg et al., 2004) with the FastICA algorithm (Hyvärinen and Oja, 1997), which could significantly improve the reliability of the results (Meinecke et al., 2002). In the end, artifactual components with extreme values and abnormal spectra were discarded by using an automated algorithm from EEGLAB (Delorme and Makeig, 2004). The number of components preserved after this screening process is shown in Table 1.

Multivariate spectral causality

Spectral connectivity measures for multitrial multichannel data, which can be derived from the coefficients of the multivariate autoregressive model, require that each time series be covariance stationary, that is, its mean and variance remain unchanged over time. However, ECoG signals are usually highly nonstationary, exhibiting dramatic and transient fluctuations. A sliding-window method was implemented to segment the signals into sufficiently small windows, and connectivity was calculated within each window, where the signal is locally stationary.

Preprocessing

Three preprocessing steps were performed to achieve local stationarity: (1) detrending, (2) temporal normalization, and (3) ensemble normalization (Ding et al., 2000). Detrending, which is the subtraction of the best-fitting line from each time series, removes the linear drift in the data. Temporal normalization, which is the subtraction of the mean of each time series and division by the standard deviation, ensures that all variables have equal weights across the trial. These processes were performed on each trial for each channel. Ensemble normalization, which is the pointwise subtraction of the ensemble mean and division by the ensemble standard deviation, targets rich task-relevant information that cannot be inferred from the event-related potential (Ding et al., 2000; Bressler and Seth, 2011).
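The three steps can be sketched for a single channel as follows. This is a minimal illustration, assuming the data for one channel are stored as a list of trials, each a list of samples, and that no standard deviation is zero. The use of the population standard deviation and all function names are assumptions made here, not details given in the text.

```python
import statistics


def detrend(series):
    """Subtract the least-squares best-fitting line from one time series."""
    n = len(series)
    t_mean = (n - 1) / 2
    x_mean = sum(series) / n
    slope_num = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(series))
    slope_den = sum((t - t_mean) ** 2 for t in range(n))
    slope = slope_num / slope_den
    return [x - (x_mean + slope * (t - t_mean)) for t, x in enumerate(series)]


def temporal_normalize(series):
    """Subtract the mean of one time series and divide by its standard deviation."""
    mean = statistics.fmean(series)
    sd = statistics.pstdev(series)  # assumption: population SD, nonzero
    return [(x - mean) / sd for x in series]


def ensemble_normalize(trials):
    """At each time point, subtract the mean across trials and divide by the SD across trials."""
    n_samples = len(trials[0])
    out = [list(trial) for trial in trials]
    for t in range(n_samples):
        column = [trial[t] for trial in trials]
        mean = statistics.fmean(column)
        sd = statistics.pstdev(column)  # assumption: population SD, nonzero
        for k in range(len(trials)):
            out[k][t] = (trials[k][t] - mean) / sd
    return out


def preprocess_channel(trials):
    """Apply detrending and temporal normalization per trial, then ensemble normalization."""
    per_trial = [temporal_normalize(detrend(trial)) for trial in trials]
    return ensemble_normalize(per_trial)
```

After `preprocess_channel`, every time point has zero mean and unit variance across trials, so what remains is the trial-to-trial variation around the event-related potential.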

Window length selection

The length and the step size of the sliding-window for segmentation were set as 250 ms and 50 ms, respectively. The window length selection satisfied the general rule that the number of parameters should be <10% of the data samples: to fit a VAR model with model order p on data of k dimensions (k ICs selected from ICA), the following relation needs to be satisfied: w ≥ 10 × (k^2 × p/n), where w and n represent the window length and the number of trials, respectively.
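The rule above can be checked numerically. The sketch below assumes that w is expressed in samples at the 250 Hz sampling rate, and it uses hypothetical values for k and n purely for illustration (the actual numbers of components and trials are those in Table 1). The 1 s segment used to list window positions is also hypothetical.

```python
SAMPLING_RATE_HZ = 250  # sampling rate after downsampling (from the text)
WINDOW_MS = 250         # sliding-window length (from the text)
STEP_MS = 50            # sliding-window step size (from the text)


def min_window_samples(k, p, n):
    """Lower bound on the window length w from the rule w >= 10 * (k^2 * p / n)."""
    return 10 * (k ** 2) * p / n


def window_starts_ms(segment_ms, window_ms=WINDOW_MS, step_ms=STEP_MS):
    """Start times (ms) of every full window that fits inside a segment."""
    return list(range(0, segment_ms - window_ms + 1, step_ms))


# Hypothetical example values: 10 ICs, model order 9, 150 trials.
required = min_window_samples(k=10, p=9, n=150)   # 60.0 samples
available = WINDOW_MS * SAMPLING_RATE_HZ / 1000   # 62.5 samples
rule_satisfied = available >= required
starts = window_starts_ms(segment_ms=1000)        # windows starting at 0, 50, ..., 750 ms
```

The bound grows with the square of the number of ICs and shrinks with the number of trials, which is why pooling many trials permits short windows.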

Model order selection

Model order, which is related to the length of the signal in the past that is relevant to the current observation, was determined by the Akaike information criterion (AIC) (Akaike, 1974). In all subjects, a model order of nine samples (equivalent to 9 × 4 = 36 ms of history) resulted in minimal AIC and was selected. The selected model order also passed the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test, thus maintained local stationarity. Furthermore, the VAR model was validated by the whiteness test and the consistency test.

Spectral connectivity

dDTF (Korzeniewska et al., 2003), a VAR-based spectral connectivity measure, was calculated by using the Source Information Flow Toolbox (SIFT) (Delorme et al., 2011) together with other libraries, such as Granger Causal Connectivity Analysis (Seth, 2010) and Brain-System for Multivariate AutoRegressive Timeseries (Cui et al., 2008). A detailed tutorial of VAR-based connectivity measures can be found in the SIFT handbook (http://sccn.ucsd.edu/wiki/SIFT).

ERC

To calculate ERC at time t and frequency f, or ERC(t, f), dDTF at time t and frequency f, or dDTF(t, f), was normalized by the median value during the baseline period at frequency f, or dDTFbaseline (f):

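(1) $$\mathrm{dDTF}_{\mathrm{baseline}}(f) = \mathrm{median}\big(\mathrm{dDTF}(t_{\mathrm{baseline}}, f)\big), \qquad \mathrm{ERC}(t, f) = 10 \log_{10}\left(\frac{\mathrm{dDTF}(t, f)}{\mathrm{dDTF}_{\mathrm{baseline}}(f)}\right).$$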
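Equation 1 can be illustrated with a short sketch. It assumes that the dDTF values of one connection are stored as a frequency-by-time map of strictly positive numbers and that the baseline period is given as a list of time indices; the function name and the toy values are hypothetical.

```python
import math
import statistics


def event_related_causality(ddtf_map, baseline_indices):
    """Convert dDTF(t, f) to ERC(t, f) in dB relative to the baseline median at each frequency."""
    erc = []
    for row in ddtf_map:  # one row per frequency
        baseline = statistics.median(row[t] for t in baseline_indices)
        erc.append([10.0 * math.log10(value / baseline) for value in row])
    return erc


# Toy map with one frequency and four time points; the first two points form the baseline.
toy_erc = event_related_causality([[0.01, 0.01, 0.1, 0.001]], baseline_indices=[0, 1])
```

In this toy case a tenfold increase over baseline maps to about +10 dB and a tenfold decrease to about -10 dB.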

Comparisons for context and response dependencies

Nine comparisons were performed on the ERCs obtained from different social scenarios in C+ and C− trials separately. To examine context dependency, three comparisons were performed between scenarios with different contexts followed by the frightened response (ChRf − CmRf, CmRf − CwRf, and CwRf − ChRf), and another three comparisons were performed between scenarios with different contexts followed by the neutral response (ChRn − CmRn, CmRn − CwRn, and CwRn − ChRn). To examine response dependency, three comparisons were performed between scenarios with different responses under the same contexts (ChRf − ChRn, CmRf − CmRn, and CwRf − CwRn). False discovery rate (FDR) control was used to correct for multiple comparisons in multiple hypothesis testing, and a threshold of αFDR = 0.05 was used.
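FDR control at a given threshold can be sketched as follows. The text does not name the specific procedure, so the example uses the Benjamini-Hochberg step-up procedure as an assumption; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True where it is rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: find the largest rank whose p-value is below alpha * rank / m.
        if p_values[idx] <= alpha * rank / m:
            largest_rank = rank
    rejected = [False] * m
    for idx in order[:largest_rank]:
        rejected[idx] = True
    return rejected


# Illustrative p-values only.
toy_rejected = benjamini_hochberg([0.5, 0.035, 0.001, 0.03], alpha=0.05)
```

Note that the p-value 0.03 is rejected here even though it exceeds its own rank threshold (0.025), because a larger p-value (0.035) falls below its threshold (0.0375); this is the step-up property of the procedure.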

PARAFAC

PARAFAC was performed by using the N-way toolbox (Andersson and Bro, 2000), with the following constraints: no constraint on the first tensor dimension, and non-negativity on the second and the third tensor dimensions. The non-negativity constraint was introduced mainly for simpler visualization of the results. The convergence criterion, that is, the relative change in fit for which the algorithm stops, was set to be 1e-6. The initialization method was set to be DTLD (direct trilinear decomposition) or GRAM (generalized rank annihilation method), which was considered the most accurate method (Cichocki et al., 2009). Initialization with random orthogonalized values (repeated 100 times, each time with different random values) was also shown for comparison.
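The trilinear model that PARAFAC fits can be illustrated by reconstructing a tensor from its three loading matrices. This sketch does not perform the decomposition itself (that was done with the N-way toolbox); the function name, the matrix names A, B, and C, and the toy values are hypothetical. A holds the unconstrained loadings of the first dimension, while B and C hold the non-negative loadings of the second and third dimensions.

```python
def parafac_reconstruct(A, B, C):
    """Rebuild x[i][j][k] = sum over components r of A[i][r] * B[j][r] * C[k][r]."""
    rank = len(A[0])
    return [
        [
            [sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank)) for k in range(len(C))]
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]


# Toy model with two components (columns).
A = [[1.0, -1.0], [0.5, 2.0]]              # first dimension: may be negative
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # second dimension: non-negative
C = [[2.0, 1.0], [0.0, 3.0]]               # third dimension: non-negative
toy_tensor = parafac_reconstruct(A, B, C)
```

Each component contributes a rank-one term, so a single column of A, B, and C together describes one latent network structure.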

Connectivity statistics

The connectivity statistics used in this study were calculated from the connectivity matrix (a weighted directed relational matrix) from each latent network structure and each subject, by using the Brain Connectivity Toolbox (Rubinov and Sporns, 2010).

Causal density and outflow

Node strength was measured as the sum of weights of links connected to the node (IC). The causal density of each node was measured as the sum of outward and inward link weights (out-strength + in-strength), and the causal outflow of each node was measured as the difference between outward and inward link weights (out-strength − in-strength). For visualization, each measure was spatially weighted by the absolute normalized spatial weights of the corresponding IC. For example, assume the causal density and causal outflow of IC ic are Densityic and Outflowic, respectively, and the spatial weight of IC ic on channel ch is Wic,ch; then the spatial distributions of causal density and causal outflow on each channel will be:

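(2) $$\mathrm{Density}_{ch} = \sum_{ic} \frac{\mathrm{abs}(W_{ic,ch})}{\max(\mathrm{abs}(W_{ic,ch}))}\,\mathrm{Density}_{ic}, \qquad \mathrm{Outflow}_{ch} = \sum_{ic} \frac{\mathrm{abs}(W_{ic,ch})}{\max(\mathrm{abs}(W_{ic,ch}))}\,\mathrm{Outflow}_{ic}.$$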
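The node measures and Equation 2 can be sketched as below. Two conventions are assumed here and are not stated in the text: the connectivity matrix entry `conn[i][j]` is taken as the weight of the link from IC i to IC j, and the maximum in Equation 2 is taken over the channels of each IC. Function names and toy values are hypothetical; the original analysis used the Brain Connectivity Toolbox.

```python
def causal_density_and_outflow(conn):
    """Per-node causal density (out + in strength) and causal outflow (out - in strength)."""
    n = len(conn)
    out_strength = [sum(conn[i][j] for j in range(n) if j != i) for i in range(n)]
    in_strength = [sum(conn[j][i] for j in range(n) if j != i) for i in range(n)]
    density = [o + i for o, i in zip(out_strength, in_strength)]
    outflow = [o - i for o, i in zip(out_strength, in_strength)]
    return density, outflow


def project_to_channels(node_values, spatial_weights):
    """Weight each IC's value by its absolute, peak-normalized spatial weights and sum over ICs."""
    n_channels = len(spatial_weights[0])
    projected = [0.0] * n_channels
    for value, weights in zip(node_values, spatial_weights):
        peak = max(abs(w) for w in weights)  # assumption: maximum over channels for this IC
        for ch, w in enumerate(weights):
            projected[ch] += abs(w) / peak * value
    return projected


# Toy example: 3 ICs and 2 channels.
toy_conn = [[0, 2, 0], [1, 0, 3], [0, 0, 0]]
toy_weights = [[1.0, -0.5], [0.0, 2.0], [-4.0, 4.0]]
toy_density, toy_outflow = causal_density_and_outflow(toy_conn)
density_on_channels = project_to_channels(toy_density, toy_weights)
outflow_on_channels = project_to_channels(toy_outflow, toy_weights)
```

A positive outflow marks a node that mainly sends information, and a negative outflow marks one that mainly receives it.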

Maximum flow between areas

Seven cortical areas were first manually determined: the visual (V), parietal (P), prefrontal (PF), medial prefrontal (mPF), motor (M), anterior temporal (aT), and posterior temporal (pT) cortices. The maximum link among all links connecting two areas was selected to represent the maximum flow between the two areas.
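The selection of the maximum link can be sketched as follows. The sketch assumes that each IC has been assigned to one cortical area and that flows are kept separately for each direction; both the assignment and the data layout are hypothetical, since the text does not specify them.

```python
def max_flow_between_areas(links, area_of):
    """Keep, for each ordered pair of distinct areas, the strongest link connecting them."""
    flows = {}
    for (src, dst), weight in links.items():
        pair = (area_of[src], area_of[dst])
        if pair[0] == pair[1]:
            continue  # links within one area do not connect two areas
        flows[pair] = max(weight, flows.get(pair, weight))
    return flows


# Toy example: ICs 0 and 1 lie in posterior temporal cortex (pT), IC 2 in prefrontal cortex (PF).
toy_areas = {0: "pT", 1: "pT", 2: "PF"}
toy_links = {(0, 2): 0.3, (1, 2): 0.7, (2, 0): 0.2, (0, 1): 0.9}
toy_flows = max_flow_between_areas(toy_links, toy_areas)
```

Here the strongest pT-to-PF link (0.7) represents the bottom-up flow and the single PF-to-pT link (0.2) represents the top-down flow.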

Activation levels from comparison loadings

For each experimental condition (C+ or C− trials), we determined the significant comparisons for each latent network structure by performing trial shuffling. For each shuffle, dDTF and SR-ERC were recalculated after trial type was randomly shuffled. A new tensor was formed and we estimated how the five spectrotemporal connectivity structures identified from the original data contributed in the shuffled data, by performing PARAFAC on the tensor from the shuffled data with the last two tensor dimensions (Time-Frequency and Connection-Subject) fixed with the values acquired from the original data. Trial shuffling was performed 50 times. A loading from the original data that was significantly different from the loadings from the shuffled data was identified as a significant loading (α = 0.05). The comparisons with significant loadings showed the context and response dependencies of each structure, and further revealed the activation levels of each structure in different scenarios.
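One way to compare an original loading against the 50 shuffled loadings is an empirical p-value, sketched below. The text does not specify the test that was used, so the two-sided comparison on absolute values is an assumption, as are the function name and the example values.

```python
def empirical_p_value(original_loading, shuffled_loadings):
    """Two-sided empirical p-value of one loading against its shuffle distribution."""
    extreme = sum(1 for s in shuffled_loadings if abs(s) >= abs(original_loading))
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (1 + extreme) / (1 + len(shuffled_loadings))


# Illustrative shuffle distribution with 50 values between 0.00 and 0.49.
toy_shuffled = [k / 100 for k in range(50)]
p_strong = empirical_p_value(0.8, toy_shuffled)    # no shuffled loading is as extreme
p_weak = empirical_p_value(0.255, toy_shuffled)    # many shuffled loadings are as extreme
```

With 50 shuffles the smallest attainable value under this scheme is 1/51, which is below the 0.05 threshold.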

Interdependencies among latent network structures

We determined the interdependencies among latent network structures by examining how the activation of one structure could affect the activation of the others. To achieve this, we examined the comparison loadings in the first tensor dimension. For the correlations of structures' activation differences across contexts under Rf, we focused on the six comparison loadings representing activation differences across contexts under Rf (ChRf − CmRf, CmRf − CwRf, and CwRf − ChRf) from C+ and C− trials, and evaluated the correlations of these comparison loadings from different structures. The p-values for testing the hypothesis of no correlation were then computed. For the correlations of structures' activation differences across contexts under Rn, we used the same approach to evaluate the correlations of structures' activation differences across contexts under Rn (ChRn − CmRn, CmRn − CwRn, and CwRn − ChRn). For the correlations of structures' activation differences across responses, we evaluated the correlations of structures' activation differences across responses (ChRf − ChRn, CmRf − CmRn, and CwRf − CwRn).
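The correlation between the comparison loadings of two structures can be sketched as below. The sketch computes only the Pearson correlation coefficient; the p-value for the hypothesis of no correlation is omitted because it requires a t-distribution that is not in the Python standard library. The function name and the six example loadings per structure are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Six illustrative comparison loadings per structure (three comparisons x two trial types).
loadings_a = [0.2, -0.1, -0.1, 0.3, -0.2, -0.1]
loadings_b = [2 * value for value in loadings_a]    # same pattern, larger amplitude
loadings_c = [-value for value in loadings_a]       # opposite pattern
r_same = pearson_r(loadings_a, loadings_b)
r_opposite = pearson_r(loadings_a, loadings_c)
```

Because the coefficient is insensitive to scale, two structures correlate perfectly when their dependence patterns match, regardless of their overall activation strength.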

Video 1
Video clip for CmRf trials.

The clip contains the Context and Response periods.

https://doi.org/10.7554/eLife.06121.005
Video 2
Video clip for ChRf trials.

The clip contains the Context and Response periods.

https://doi.org/10.7554/eLife.06121.006
Video 3
Video clip for CwRf trials.

The clip contains the Context and Response periods.

https://doi.org/10.7554/eLife.06121.007
Video 4
Video clip for CmRn trials.

The clip contains the Context and Response periods.

https://doi.org/10.7554/eLife.06121.008
Video 5
Video clip for ChRn trials.

The clip contains the Context and Response periods.

https://doi.org/10.7554/eLife.06121.009
Video 6
Video clip for CwRn trials.

The clip contains the Context and Response periods.

https://doi.org/10.7554/eLife.06121.010

References

  2. CA Andersson, R Bro (2000) The N-way Toolbox for MATLAB. Chemometrics and Intelligent Laboratory Systems 52:1–4. https://doi.org/10.1016/S0169-7439(00)00071-X
  4. M Bar (2004) Visual objects in context. Nature Reviews Neuroscience 5:617–629. https://doi.org/10.1038/nrn1476
  17. ZC Chao, N Fujii (2013) Advanced Methods in Neuroethological Research. Advanced Methods in Neuroethological Research, Springer.
  18. A Cichocki, R Zdunek, AH Phan, S-I Amari (2009) Nonnegative matrix and tensor factorizations: applications to exploratory multi-way data analysis and blind source separation. Wiley.
  26. J Freeman, TH Donner, DJ Heeger (2011) Inter-area correlations in the ventral visual pathway reflect feature integration. Journal of Vision, 11, 10.1167/11.4.15.
  28. K Friston (2005) A theory of cortical responses. Philosophical Transactions of the Royal Society B 360:815–836. https://doi.org/10.1098/rstb.2005.1622
  38. A Ibañez, F Manes (2012) Contextual social cognition and the behavioral variant of frontotemporal dementia. Neurology 78:1354–1362.
  48. G Rees, A House (2005) Visibility reflects dynamic changes of effective connectivity between V1 and fusiform cortex. Neuron 46:811–821.

Decision letter

  1. Timothy Behrens
    Reviewing Editor; Oxford University, United Kingdom

eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see review process). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.

Thank you for sending your work entitled “Mesoscopic brain networks regulate cognitive enchainment in social monitoring” for consideration at eLife. Your article has been favorably evaluated by three reviewers, one of whom, Timothy Behrens (Senior Editor and Reviewing Editor) is a member of our board.

The editor and the other reviewers discussed their comments before we reached this decision, and the editor has assembled the following comments to help you prepare a revised submission.

All three reviewers were very impressed by the unusual nature of the data, by the sophisticated and revealing analysis that was able to simplify complex data to the dynamics of network activity that underlies social observations in the task, and all reviewers were excited by the ability to study this particular brain network in macaque monkeys, given its importance in social cognition in many human studies.

Example praise for the manuscript in review was as follows:

Reviewer 1:

The neural mechanisms mediating social behavior appear to be distributed across the brain, including areas thought to be specialized for social information processing (such as those in the temporal lobe and medial prefrontal cortex) and others that serve more general purpose functions (such as those involved in reward, decision-making, executive control, and attention). One impediment to understanding how these circuits interact to translate sensory perception into behavior is that the methods typically applied suffer from either poor temporal resolution (fMRI in humans), poor spatial resolution (EEG), or limited coverage (single unit recording in animals). The current paper uses wide-scale recordings from an intracranial EEG (ECoG) system to simultaneously assess interactions amongst cortical areas during social information processing. Importantly, the authors apply sophisticated, data-driven analytical tools to derive the information flow between and amongst these areas. This is an important advance.

Reviewer 2:

This paper describes a novel approach to extract information about the functional connectivity of distributed brain networks, and how these connections evolve through time as a function of carrying out social observation. The paper is unique in two respects. First, the dataset contains whole-hemisphere ECoG collected whilst three non-human primates perform a social task (although the authors have published several papers with similar ECoG data previously). Second, the analysis approach is novel and innovative. dDTF is used to identify connections between regions, and then factor analysis is used to isolate how these vary as a function of different task conditions. Using this approach, the authors identify five separate networks (‘structures’). These superficially have similarities in terms of their connectional structure (e.g. structures 1/2, and structures 3/4), but differ in terms of their temporal dynamics and their activation across conditions. Intriguingly, some of the networks contain classic ‘social’ regions, such as the superior temporal sulcus. By examining how these networks evolve through time, the authors attempt to reveal the chain of events underlying different forms of social observation.

Reviewer 3:

This paper reports extremely unusual data from ECoG recordings of macaque monkeys viewing other monkeys engaged in socially threatening situations. It also reports a novel and potentially powerful set of analysis tools for analysing functional networks acquired at high temporal resolution in ECoG data.

There are several key strengths and novelties about the paper.

(1) Intriguing patterns of brain activity and functional connections are reported that have perhaps never been recorded from outside an fMRI scanner, and certainly not in social situations. These patterns are reminiscent of human brain areas that respond to complex social tasks.

(2) The broad coverage of the ECoG data by comparison to most other macaque monkey recordings allows the analysis of information flow between brain areas. Because of the temporal resolution, these analyses can also begin to make directional inferences.

(3) The complex nature of ECoG data requires sophisticated data compression techniques. The authors are extremely inventive in how they analyse their data – they develop tools which compress the data into its digestible patterns, but which maintain the key comparisons between conditions, between brain areas, and between task times. I find this very impressive.

However, there were several features of the data that limited the reviewers' enthusiasm. In brief, these were broadly to do with the task and the over-interpretation of the data. We believe that with a thorough rewrite of the manuscript, to focus on describing the data clearly rather than making interpretations of the data, both in terms of its relevance to particular social behaviours, and in terms of causal mechanisms that are not supported directly by the data, it will be possible to improve the manuscript to remove these concerns.

Specifically:

Comments about the task:

Reviewer #1:

1) The task, which the authors refer to as monitoring of a social context, involves only passive viewing without any differential response expected (or found) on the part of the observer monkey in reaction to different ‘social scenarios’. The only statistically significant difference in behavior is the reported difference in left vs. right gaze positions during the response phase of the trial when monkeys view a frightened vM vs. a neutral vM. Since the authors did not find this result to vary under different ‘social’ contexts, it remains unclear whether/what the observer monkeys made out of the different social situations examined in the study.

2) In the Abstract, it might be more precise to talk about “mapping the network structure” as monkeys viewed scenes leading up to examining the valence on a conspecific's face rather than calling it a “social cognitive behavior”/“social context monitoring” since the behavior per se doesn't inform us in this regard.

3) In the Methods, the task lacks a non-social control, which will in turn depend on what the authors, for the purpose of the expt., define as ‘social’. For instance, is the context of two monkeys looking at each other ‘social’ in which case, the non-social control could be another monkey looking away from the monkey on the left of the screen. Is monkey-monkey looking at each other more ‘social’ than monkey-human looking at each other? These are of course tough questions to answer from a single experiment but it will be still useful to discuss the authors' view on these issues as their basis for designing the expt.

On the other hand, if the element of affect/threatening the monkey on the left is what constitutes social in this case, a non-social control could be a non-threatening/neutral monkey on the right or perhaps an inanimate monkey/human with a threatening expression on it?

Furthermore, the monkey in the video facing an empty wall does not control for the presence of an object or an individual. There are clear sensory/perceptual differences between a monkey face, a human face, and a wall.

Reviewer #3:

It is not clear from the monkey's behaviour that the social nature of the task, rather than the perceptual differences between stimuli, is what is important. In my view this slightly confounds the clear interpretation of these signals as social signals.

Comments about the interpretation of the data:

Reviewer #1:

4) In the Results, the authors say that “subjects tended to focus on the right section during the Response period when the response stimuli were Rf (C+Rf trials)” vs C+Rn trials as well as C trials. They conclude that this “indicated that gazing behavior required not only vM responses with high emotional valence, but also the context of vM's response. This suggested that the gazing behavior by the subject represented an automatic or intuitive reaction to socially relatable scenarios (e.g. ‘vM was frightened after being threatened’).”

I think the comparison that the authors analyzed suggests that monkeys are interested in knowing what is behind the curtain when vM is frightened vs. when it is not. This comparison doesn't take into account the nature of context preceding the response since what frightened the monkey – monkey vs. human vs. empty wall – was not compared against each other here. In fact, when the context was indeed taken into account, the authors did not find any significant context dependence at all (Figure 1–figure supplement 3) and hence, in my opinion, the reported difference in gazing location does not make the case for subjects ‘monitoring the context of a social scenario’ at all. This is a critical concern.

Reviewer #2:

I came away with a less than clear impression of what the findings had taught us. I think that this was partly a result of the way the paper was structured. The focus, in the Abstract, Introduction and initial results, was quite heavily on the mathematical technique used as opposed to the results obtained with this technique. Upon reaching the Results, there were some clearly interesting findings, but many of the interpretations were dependent upon a reverse inference from the brain regions activated (Poldrack RA, TiCS 2006), as opposed to an inference based on the task manipulation. I would therefore urge the authors to shift the emphasis in the initial part of the paper towards what their technique and dataset tells us about the dynamics of connectivity in these brain regions, as opposed to being so heavily focused on the methodology.

Reviewer #3:

The nature of the task in combination with the complex analysis often makes interpretation of the results complex, and the authors resort to an interpretation that does not rely on the data. There are examples of this throughout the manuscript. Here is a typical one:

“This result suggests that Structure 5 underlies the context-dependent feedback modulation of response perception linked to social reasoning in the task (‘why vM is frightened?’ or ‘is vM frightened because it was threatened by something?’ or ‘should I be concerned that vM is frightened?’).”

This kind of inference is inappropriate, and is also unnecessary.

The remaining major comments were either about the technicalities of the analysis, or about the reporting of these technicalities. Where possible, we would like you to address these technical concerns. More broadly speaking, we would like you to focus on clarifying the more technical aspects of the manuscript so that the manuscript can be clearly understood by a broad audience.

Other comments:

Reviewer #1:

In the subsection “Deconvolution Analysis or Cortical Information Processing,” wouldn't it be more useful task-wise to identify independent sources by using only C+ trials instead of merging data from C+ & C trials for ICA? Did the authors do this analysis? How does it affect the results?

In the first paragraph of the subsection “Dynamic Cognitive Chain Describes Social Context Monitoring” to examine how the network interactions change as a function of context, wouldn't it be better to examine the correlations between activation differences across contexts (Cm & Ch vs. Cw) rather than between C+ vs. C trials?

In the same subsection, as per Figure 5A, none of the correlations with structure 2 are significant. In that case does a causal dependence of structure 4 and 5 on structure 2 apply?

In the second paragraph of the same subsection, does this analysis include the time courses of gaze positions and scanning behavior of all contexts (Cm, Ch & Cw) and trials (C+ & C)? Did you find the timing correlations to be different across contexts?

Reviewer #2:

I would therefore urge the authors to shift the emphasis in the initial part of the paper towards what their technique and dataset tells us about the dynamics of connectivity in these brain regions, as opposed to being so heavily focused on the methodology. That said, there were also times where it was unclear what order different techniques were being applied, and how they were being applied. For instance, it is mentioned that ICA was first applied, but it was unclear what the input dimensions of the ICA were, or how the obtained components were subsequently used for the dDTF and PARAFAC analysis. I felt that a clear ‘analysis pipeline’ diagram, starting with raw data and ending with the key results, would be very helpful to include as a supplementary figure.

Finally, I felt that the presentation of the comparison-condition component in Figures 3 and 4 was unclear, in that it seemed diagrammatic whereas presumably it was based upon the statistics of the comparison being performed. A more quantitative approach to presenting this data would help to make it more clear and compelling.

Reviewer #3:

The manuscript is written from a very technical perspective, and does not introduce the key neuroscience issues well. This makes it a very tough read, particularly for a broad interest journal such as eLife.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled “Mesoscopic brain networks regulate cognitive enchainment in social monitoring” for further consideration at eLife. Your revised article has been evaluated by Timothy Behrens (Senior Editor and Reviewing Editor) and the original reviewers.

The reviews are appended below, and the sentiment in the reviews was reiterated in discussion between reviewers. Essentially, we all remain impressed with the data, are aware that the volume and complexity of data requires innovative new analytical tools, and we still all believe that the proposed analyses are likely very interesting. However, two of the reviewers are clear (and the editorial team agree) that the manuscript cannot possibly be published in a broad interest journal in its present form. Whilst the technical details are more revealing and the claims better substantiated in this current revision, the clarity of the manuscript has, in our view, not improved. It is very difficult indeed to parse the manuscript to understand what the central contribution is. The figures are not well explained in the legends – the legends report details that might be more appropriate in a technical methods section and do not perform the main function of a figure legend, which is to explain how to read the figure.

We would like to ask you to follow the reviewers’ advice below and to restructure and rewrite the paper, so it can be followed in detail by a naïve reader.

Reviewer #1:

I think the authors have done a fine job revising this paper, which presents a novel analytical approach to analyzing high-density neurophysiological data gathered in monkeys viewing a set of different social scenarios. I'm not yet convinced that the descriptor “social scenario viewing task” is much better than “social monitoring task” – it's both a mouthful and I think still puts too much emphasis on the idea of a task. It might be more concise and precise to say the monkeys were viewing social scenarios.

Reviewer #2:

The authors have put quite some work into making the manuscript clearer, and providing more substantial information concerning the methodology. But the main concerns of the reviewers seem to have held: because of the task design, it is very difficult to draw any strong conclusions about the role of the identified networks in social cognition. In their own words, they acknowledge “without a proper control, we can't conclusively link our results to social cognition.” As such, the authors now explicitly state (in their response) that the main conclusion from the paper is about the development of a novel method. Not much can be learnt about the explicit meaning of the underlying cognitive processes.

The question then becomes, is this novel method of sufficiently broad interest and importance that it will change the views of the community? The authors' main claim seems to be that it will reveal previously undiscovered ‘cognitive chains’. What exactly is meant by this? That brain regions are activated sequentially in response to a task, and that different brain areas will be recruited depending upon the cognitive function? On the one hand, this seems to be something that we already know from many years of MEG and EEG research in humans. On the other, it is clear that the spatial resolution of the ECoG data far surpasses this research, and the extraction of network structure is very different from what has gone before. Nevertheless, I still very much struggled to understand whether this extraction of network structure was (a) valid or (b) important. In terms of validity, if I were reviewing this paper at a methods journal, I'd expect a set of examples in simulated data to convince me that the method works well and robustly. In terms of importance, it's precisely because the task is poorly controlled that I don't get an “aha!” moment when examining the results that convinces me that it has definitely worked.

I can see that there is a lot of potential in the paper, and it seems unfair to dismiss what could be an important set of findings about a novel technique. But equally I didn't find the results and structure of the paper sufficiently compelling or clear to warrant publication at present. I'd be open to other reviewers pointing out what they felt was the evidence that the technique works convincingly, or that it has produced a particularly important result.

Reviewer #3:

The authors have done a good job dealing with the technical concerns and no longer over-interpret the data.

However, there still remain concerns about whether the manuscript as currently written can be understood by a broad interest readership, or even a relatively specialised readership within the field. Indeed the mathematical expertise needed even to understand the basic analysis is extreme, and the authors do a very poor job in making the analysis comprehensible. Like the other reviewers, I suspect, I am still finding it difficult to really evaluate the neuroscientific findings, as I cannot fully understand their implications.

Despite the unusual and high quality of the data and the sophisticated nature of the analysis, it is therefore very difficult to understand what we have learnt about the cognitive processes.

For example, the figure legends do not explain how to read the figures. The new diagram asked for by a different reviewer is difficult to understand.

It is essential that the authors address this in both the Abstract and the main text before this can be published in a journal such as eLife.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled “Cortical network architecture for context processing in primate brain” for further consideration at eLife. Your revised article has been favorably evaluated by Timothy Behrens (Senior Editor).

The new manuscript is, in my view, dramatically clarified, and the data remain extremely exciting. However, in the rewrite the focus of the Abstract and the Introduction has been moved towards the technical achievements of the manuscript. This presents a difficulty for eLife because the manuscript has not been reviewed or considered as a technical manuscript. In the Discussion, the nature of the claims is much more balanced between the technical innovations and the neuroscience claims (except perhaps in the first overarching paragraph where the balance is again towards the technical). eLife cannot publish the manuscript on the basis of the technical claim alone, but I believe it will be relatively easy to adjust the Abstract and Introduction to highlight and to be clear about the new neuroscience claim.

In my view, the neuroscience claim is still not clearly stated in the Abstract or Introduction. You summarise your findings as follows in the Abstract:

“Collectively, the five structures delineated the flow of information in the network, including two isomorphic variants defining the encoding and retrieval, respectively, of contextual information.”

and in the Introduction:

“The structures we identified provide new insights on how contextual information is processed and help to identify relationships linking network communication and behavior.”

Both of these statements are descriptions of the success of the technique, and not of new neuroscience findings. In brief, what new insights do they provide?

In the Discussion you are much more clear about this in the subsection “Context is Encoded by Interactions of Large-Scale Network Structures”.

If I am correct, it seems like the key claims can be summarised as follows:

a) Large-scale network interactions are different in different contexts.

b) Bottom-up connections from posterior temporal cortex to anterior temporal cortex and medial PFC can encode a context whilst it is being first processed and held on-line.

c) These exact same brain regions exhibit the opposite top-down connectivity only seconds later when the context is being applied to incoming sensory information.

d) The extent to which the bottom-up connectivity is active during the processing of context predicts the extent to which the top-down connectivity will act when the context is later being applied.

To me, these claims seem to be striking and important and it is these claims that the manuscript has been judged on, but these claims no longer appear anywhere in the Abstract or Introduction and are only in a subsection of the Discussion. So I am asking for one further revision. In this round of revision, I am asking for changes to the Abstract, to the Introduction (and possibly also to the Discussion if you choose), which concisely and precisely highlight the new neuroscience claims, and which change the tone of the current version of the manuscript from being largely a methodological innovation, to being a balanced manuscript which introduces a new technique to make a clear and precise claim about the contextual processing of sensory information.

https://doi.org/10.7554/eLife.06121.027

Author response

Comments about the task:

Reviewer #1:

1) The task, which the authors refer to as monitoring of a social context, involves only passive viewing without any differential response expected (or found) on the part of the observer monkey in reaction to different ‘social scenarios’. The only statistically significant difference in behavior is the reported difference in left vs. right gaze positions during the response phase of the trial when monkeys view a frightened vM vs. a neutral vM. Since the authors did not find this result to vary under different ‘social’ contexts, it remains unclear whether/what the observer monkeys made out of the different social situations examined in the study.

We agree that the observer monkey’s behavior did not directly show “social context monitoring”. Therefore, in the revised manuscript, we have de-emphasized the “social” implication of the conclusions and provided a more precise, conservative interpretation in the paper. Instead, we emphasize the main conclusion that our data reveal a method to simplify complex brain physiological data into mesoscopic network modules underlying implicit cognitive processes related to sensory input of the observer monkey, and whose explicit meaning remains an interesting subject for future investigation. In the revised Introduction we write: “Current methods for studying neurocognitive networks in humans and primates depend on drawing correlations between brain network activity and behavior. […] Here, we employ a new approach that uses an unbiased decomposition of total network activity under diverse task conditions, enabling the computational extraction of latent structure in functional network interactions and post-hoc examination of its dynamic evolution during behavior”.

However, we assert that our task was fundamentally social in nature, since the subjects viewed video clips containing explicit social interactions between a conspecific and second agent. Furthermore, while the eye scanning behavior during the Context period indeed did not show context specificity, it clearly indicated the subjects’ attempt to access the social relationship between the agents and demonstrated the subjects’ observational association with the vM. Therefore, we renamed the task a “social scenario viewing task” to avoid the implication that the task required subjects to process social contextual information. Furthermore, our functional connectivity analysis demonstrated the existence of neural networks that were not observable by external behavior. Thus, we believe that the term “social monitoring” is appropriate since it could refer to an internal process.

2) In the Abstract, it might be more precise to talk about “mapping the network structure” as monkeys viewed scenes leading up to examining the valence on a conspecific's face rather than calling it a “social cognitive behavior”/“social context monitoring” since the behavior per se doesn't inform us in this regard.

We agree that the observer monkey’s behavior did not show the explicit processing of social context, but we believe there was clear social observation in the task (see response above). The reviewer’s proposed description is precise to a fault, but we would like to suggest that “the valence on a conspecific’s face” is part of a social context between two agents that the monkey is observing with interest, as measured by eye scanning between key elements in the scene.

To address this issue, in the Abstract, we now state: “Here, we describe the functional network structure of primate brain during social monitoring where subjects passively viewed social scenarios with staged situational contexts”.

3) Methods: The task lacks a non-social control, which will in turn depend on what the authors, for the purpose of the expt., define as ‘social’. For instance, is the context of two monkeys looking at each other ‘social’ in which case, the non-social control could be another monkey looking away from the monkey on the left of the screen. Is monkey-monkey looking at each other more ‘social’ than monkey-human looking at each other? These are of course tough questions to answer from a single experiment but it will be still useful to discuss the authors' view on these issues as their basis for designing the expt.

On the other hand, if the element of affect/ threatening the monkey on the left is what constitutes social in this case, a non-social control could be a non-threatening/ neutral monkey on the right or perhaps an inanimate monkey/ human with a threatening expression on it?

Furthermore, the monkey in the video facing an empty wall does not control for the presence of an object or an individual. There are clear sensory/perceptual differences between a monkey face, a human face, and a wall.

The reviewer’s former understanding of the social aspect in the task is correct: the interaction between a conspecific and a second agent in the video. We assert that both monkey-monkey and monkey-human scenarios are social conditions, and that monkey-wall is a non-social condition (no second agent). We agree the monkey-wall condition is not an appropriate non-social control, and without a proper control, we can’t conclusively link our results to social cognition. Therefore, we toned down the social aspect in the revised manuscript (see response above).

In the revised manuscript, we also clarified our rationale for the experimental design: “To allow the controlled recruitment of multiple interdependent cognitive processes in a simple and natural setting, we introduced a social scenario viewing task, where monkey passively viewed video clips in which another monkey (video monkey, or vM) was socially engaged with a second agent.”

Reviewer #3:

It is not clear from the monkey's behaviour that the social nature of the task, rather than the perceptual differences between stimuli, is what is important. In my view this slightly confounds the clear interpretation of these signals as social signals.

The subjects’ scanning behavior during the Context period suggested that they were monitoring the relationship between the two agents. Therefore, we believe that the resultant context-dependent brain signals we measured reflect not only perceptual differences, but also social situational differences. However, we agree that further non-social control experiments are needed to definitively conclude this. Thus, in the revised manuscript, we have toned down the implication of social cognition, and instead emphasized the concept of mesoscopic network modules underlying concurrent and intertwined cognitive processes (see response above).

Comments about the interpretation of the data:

Reviewer #1:

4) Results: The authors say that “subjects tended to focus on the right section during the Response period when the response stimuli were Rf (C+Rf trials)” vs C+Rn trials as well as C trials. They conclude that this “indicated that gazing behavior required not only vM responses with high emotional valence, but also the context of vM's response. This suggested that the gazing behavior by the subject represented an automatic or intuitive reaction to socially relatable scenarios (e.g. ‘vM was frightened after being threatened’).”

I think the comparison that the authors analyzed suggests that monkeys are interested in knowing what is behind the curtain when vM is frightened vs. when it is not. This comparison doesn't take into account the nature of context preceding the response since what frightened the monkey – monkey vs. human vs. empty wall – was not compared against each other here. In fact, when the context was indeed taken into account, the authors did not find any significant context dependence at all (Figure 1–figure supplement 3) and hence, in my opinion, the reported difference in gazing location does not make the case for subjects ‘monitoring the context of a social scenario’ at all. This is a critical concern.

While the eye scanning behavior during the Context period suggested that the subjects were evaluating the social situation presented in the context stimuli, we agree that there’s no contextual specificity in the subjects’ explicit behavior and that the implication of the subjects “monitoring the context of a social scenario” should be presented more conservatively in the text. Therefore, we renamed the task a “social scenario viewing task” instead of “social context monitoring task”, and the cognitive behavior underlying the identified brain networks as “social monitoring” instead of “social context monitoring” (see responses above).

In the revised manuscript, we elaborated on the explanation underlying this more conservative, nuanced conclusion of social monitoring, and instead placed more emphasis on our discovery of brain networks in structured cognition that are not observable via conventional behavioral measurements. In the Abstract we write: “Contextual specificity found in these components was absent in explicit behavioral output, revealing that the connectivity organization in cognition was internally generated.”

Reviewer #2:

I came away with a less than clear impression of what the findings had taught us. I think that this was partly a result of the way the paper was structured. The focus, in Abstract, Introduction and initial results, was quite heavily on the mathematical technique used as opposed to the results obtained with this technique. Upon reaching the results, there were some clearly interesting findings, but many of the interpretations were dependent upon a reverse inference from the brain regions activated (Poldrack RA, TiCS 2006), as opposed to an inference based on the task manipulation. I would therefore urge the authors to shift the emphasis in the initial part of the paper towards what their technique and dataset tells us about the dynamics of connectivity in these brain regions, as opposed to being so heavily focussed on the methodology.

In the revised manuscript, we set up our revised emphasis on neurobiology in the Introduction: “Current methods for studying neurocognitive networks in humans and primates depend on drawing correlations between brain network activity and behavior. […] Here, we employ a new approach that uses an unbiased decomposition of total network activity under diverse task conditions, enabling the computational extraction of latent structure in functional network interactions and post-hoc examination of its dynamic evolution during behavior”.

In the revised manuscript, we also clarified the key findings in the Introduction:

“Furthermore, the findings of functionally-specific brain network structures […] in long-hypothesized ‘cognitive chains’”.

Reviewer #3:

The nature of the task in combination with the complex analysis often makes interpretation of the results complex, and the authors resort to an interpretation that does not rely on the data. There are examples of this throughout the manuscript. Here is a typical one:

“This result suggests that Structure 5 underlies the context-dependent feedback modulation of response perception linked to social reasoning in the task (‘why vM is frightened?’ or ‘is vM frightened because it was threatened by something?’ or ‘should I be concerned that vM is frightened?’).”

This kind of inference is inappropriate, and is also unnecessary.

We agree that our interpretations on the behavior were overreaching, and we removed them across the revised manuscript. In the revised Results section, our interpretations of the network structures now solely rely on the functional specificity (Dimension 1) and temporal structure (Dimension 2), which we elaborated later in the Discussion section based on reverse inference from the frequency profiles (Dimension 2) and the brain regions activated (Dimension 3).

The remaining major comments were either about the technicalities of the analysis, or about the reporting of these technicalities. Where possible, we would like you to address these technical concerns. More broadly speaking, we would like you to focus on clarifying the more technical aspects of the manuscript so that the manuscript can be clearly understood by a broad audience.

Other comments:

Reviewer #1:

5) In the subsection “Deconvolution Analysis or Cortical Information Processing,” wouldn't it be more useful task-wise to identify independent sources by using only C+ trials instead of merging data from C+ & C trials for ICA? Did the authors do this analysis? How does it affect the results?

In an earlier analysis presented in the 2012 Japanese Neuroscience Meeting, we screened out the C trials and focused only on C+ trials. The results, including the ICA results and the network modules, were very similar to those from the merged C+ and C data. Thus, we included C trials to increase the variety in experimental conditions (Dimension 1), which could provide additional information to help verify the functionality of each structure. Moreover, our deconvolution of the merged data into discrete networks highlights the strength of our method to analytically simplify complex physiological data, and might be scalable to even larger datasets.

6) In the first paragraph of the subsection “Dynamic Cognitive Chain Describes Social Context Monitoring” to examine how the network interactions change as a function of context, wouldn't it be better to examine the correlations between activation differences across contexts (Cm & Ch vs. Cw) rather than between C+ vs. C trials?

Agreed. We re-analyzed the data and revised Figure 5 and the corresponding figure legends and supplemental materials. In the Results, we now state: “To achieve this, we evaluated the correlations of the structures’ activation differences […] top-down modulation can integrate response information with abstract contextual information”.

In the same subsection, as per Figure 5A, none of the correlations with structure 2 are significant. In that case does a causal dependence of structure 4 and 5 on structure 2 apply?

To address this concern we performed a new analysis (revised Figure 5A, right) showing that a significant correlation was observed only between Structures 2 and 5, suggesting that top-down modulation integrated response information with abstract contextual information. Based on the results of this further analysis, we revised the graphical relationships among the 5 structures in Figure 5C.

In the second paragraph of the same subsection, does this analysis include the time courses of gaze positions and scanning behavior of all contexts (Cm, Ch & Cw) and trials (C+ & C)? Did you find the timing correlations to be different across contexts?

The analysis included the time courses from all contexts in C+Rf. The reason is that only in C+Rf trials did we observe both scanning and gazing behaviors that were significantly different than baseline.

We clarified this in the text: “We focused on the behavior in C+Rf trials, where both scanning and gazing behaviors changed significantly during the trials. We measured cross-correlations between the temporal dynamics of each structure and the median time course of scanning frequency (blue trace in Figure 1B) and gaze position (blue trace in Figure 1C) (Figure 5B).”

Also, we found no significant difference in cross-correlations across contexts. This is not surprising since no significant differences in behavior were observed across contexts (Figure 1-figure supplement 4).
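The cross-correlation measure referred to above can be illustrated with a minimal sketch. This is not the analysis code used in the study: the function names, the toy time courses, and the choice of a biased (divide-by-n) normalization are all assumptions made for illustration only.

```python
from math import sqrt


def zscore(series):
    """Standardize a time course to zero mean and unit variance."""
    n = len(series)
    mean = sum(series) / n
    sd = sqrt(sum((v - mean) ** 2 for v in series) / n)
    if sd == 0:
        raise ValueError("a constant series has no defined correlation")
    return [(v - mean) / sd for v in series]


def cross_correlation(a, b, max_lag):
    """Return {lag: r}, correlating a[t] with b[t + lag].

    A positive lag of the peak means that b follows a.
    Normalization is the biased estimator (sum divided by n),
    an assumption of this sketch.
    """
    if len(a) != len(b):
        raise ValueError("time courses must have equal length")
    za, zb = zscore(a), zscore(b)
    n = len(za)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = sum(za[t] * zb[t + lag] for t in range(n) if 0 <= t + lag < n)
        result[lag] = total / n
    return result


# Hypothetical toy data: a structure's activation and a behavioral
# time course that repeats the same shape two samples later.
structure = [0, 0, 1, 3, 1, 0, 0, 0]
behavior = [0, 0, 0, 0, 1, 3, 1, 0]
cc = cross_correlation(structure, behavior, max_lag=3)
peak_lag = max(cc, key=cc.get)
```

In this toy case the lag with the largest coefficient indicates how far the behavioral trace trails the structure's activation; with real data, significance would still need to be assessed separately.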

Reviewer #2:

I would therefore urge the authors to shift the emphasis in the initial part of the paper towards what their technique and dataset tells us about the dynamics of connectivity in these brain regions, as opposed to being so heavily focussed on the methodology. That said, there were also times where it was unclear what order different techniques were being applied, and how they were being applied. For instance, it is mentioned that ICA was first applied, but it was unclear what the input dimensions of the ICA were, or how the obtained components were subsequently used for the dDTF and PARAFAC analysis. I felt that a clear ‘analysis pipeline’ diagram, starting with raw data and ending with the key results, would be very helpful to include as a supplementary figure.

In accord with the reviewer's suggestion, we revised the Abstract and Introduction to better motivate our questions and key findings around how our methodological approach can extract highly concurrent and intertwined cognitive processes that are not accessible by conventional behavioral measurements (see response above). That is, the neurobiological question we address is fundamentally how to access the structure of physiological information during intrinsic cognition, which at present may be ambiguously defined as mind-wandering or task-free networks.

As suggested by the reviewer, we also created a clear diagram for the analysis pipeline, including the increasing dimensionality of all variables from raw ECoG signals to the final latent network structures (new Figure 2-figure supplement 3).

Finally, I felt that the presentation of the comparison-condition component in Figures 3 and 4 was unclear, in that it seemed diagrammatic whereas presumably it was based upon the statistics of the comparison being performed. A more quantitative approach to presenting this data would help to make it more clear and compelling.

We agree that it is important to show the original data in the Results, instead of a diagrammatic view. Therefore, in revised Figures 3A and 4A, we show the original scores from 18 comparisons and their statistical significance. The original Figures 3A and 4A are now combined in Figure 3-figure supplement 1.

Reviewer #3:

The manuscript is written from a very technical perspective, and does not introduce the key neuroscience issues well. This makes it a very tough read, particularly for a broad interest journal such as eLife.

In the revised manuscript, we redirected our emphasis to the more accessible issue of how our novel approach can extract functionally meaningful units from high dimensionality data during highly concurrent and intertwined cognitive processes. Importantly, even apparently simple forms of cognition can be complex, showing contingencies and parallel processing, and critical subtleties in the physiological structure of neural data would not be accessible by standard or traditional methods in neuroscience that depend on drawing explicit correlations between brain network activity and an empirical and quantifiable behavior. More generally, we have tried to simplify the text in key areas for the broad eLife readership.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Reviewer #1:

I think the authors have done a fine job revising this paper, which presents a novel analytical approach to analyzing high-density neurophysiological data gathered in monkeys viewing a set of different social scenarios. I'm not yet convinced that the descriptor “social scenario viewing task” is much better than “social monitoring task” – it's both a mouthful and I think still puts too much emphasis on the idea of a task. It might be more concise and precise to say the monkeys were viewing social scenarios. Other than that concern, I think the paper is ready to go.

We have significantly restructured the paper. We removed almost all the “social” parts because the social aspect of the task is only between video agents and therefore not central to the main conclusions. Instead, we focus the revision on neural processing of “context”. We study context by combining large-scale recording and analysis to decipher brain networks for fast, internal, concurrent, and interdependent cognitive processing. Our findings demonstrate that context processing is composed of network structures. These can be encoded in bottom-up interactions between sensory and association areas, and top-down interactions for behavioral modulation.

Regarding the task, we now describe it simply as “monkeys watched videos of two agents interacting in different situational contexts.”

Reviewer #2:

I still remain quite unsure where I stand with this paper. The authors have put quite some work into making the manuscript clearer, and providing more substantial information concerning the methodology. But the main concerns of the reviewers seem to have held: because of the task design, it is very difficult to draw any strong conclusions about the role of the identified networks in social cognition. In their own words, they acknowledge this: “without a proper control, we can't conclusively link our results to social cognition.” As such, the authors now explicitly state (in their response) that the main conclusion from the paper is about the development of a novel method. Not much can be learnt about the explicit meaning of the underlying cognitive processes.

We agree that it’s difficult to draw firm conclusions on social cognition. Thus, we revised our paper to focus on functional network dynamics for context processing. We show how to extract computational structures from brain network dynamics, and map the internal processing of context. Our findings demonstrate the organization of a “context” network, showing that context is encoded in large-scale bottom-up interactions among distributed brain areas and can later be reactivated top-down in a manner correlated with eye movements.

The link between brain networks and behavior was established by a new eye movement analysis showing context and response dependence in gazing behavior (see new Figure 2). In the previous analysis, we compared eye movements among trials with different contexts (ChRf vs. CmRf vs. CwRf, and ChRn vs. CmRn vs. CwRn), and found no context dependence. In the new analysis, we performed the same 9 comparisons used in the brain activity analysis to identify context and response dependence: “We performed 9 pairwise comparisons on gaze positions from different context situations, separating C+ and C− conditions, to examine their context and response dependence. For context dependence, we compared behaviors from trials with different context stimuli but the same response stimulus (6 comparisons: CmRf vs CwRf, CwRf vs ChRf, and ChRf vs CmRf for context dependence in Rf; CmRn vs CwRn, CwRn vs ChRn, and ChRn vs CmRn for context dependence in Rn). For response dependence, we compared behaviors from trials with the same context stimulus but with different response stimuli (3 comparisons: CmRf vs CmRn, CwRf vs CwRn, and ChRf vs ChRn).”

By using the same comparisons for both behavior and brain activity analyses we could directly compare their context and response dependence.
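The set of 9 comparisons quoted above can be enumerated with a short sketch. The condition labels come from the text; the function names are hypothetical, and the order of the two conditions within each pair is arbitrary here and may differ from the order listed above.

```python
from itertools import combinations

# Condition labels taken from the text: three context stimuli
# (Cm, Cw, Ch) and two response stimuli (Rf, Rn).
CONTEXTS = ["Cm", "Cw", "Ch"]
RESPONSES = ["Rf", "Rn"]


def context_comparisons():
    """Pairs with different context stimuli but the same response stimulus."""
    return [
        (c1 + r, c2 + r)
        for r in RESPONSES
        for c1, c2 in combinations(CONTEXTS, 2)
    ]


def response_comparisons():
    """Pairs with the same context stimulus but different response stimuli."""
    return [(c + "Rf", c + "Rn") for c in CONTEXTS]


context_pairs = context_comparisons()    # 6 comparisons
response_pairs = response_comparisons()  # 3 comparisons
all_pairs = context_pairs + response_pairs

for first, second in all_pairs:
    print(f"{first} vs {second}")
```

Generating the pairs from the two factor lists, rather than typing them by hand, makes it easy to check that the behavioral and brain activity analyses use exactly the same comparison set.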

The question then becomes, is this novel method of sufficiently broad interest and importance that it will change the views of the community?

We believe that our new method and its application to the issue of context processing in primate brain more than meets the high criteria of eLife for novelty and advance. The computational methods we use in the paper have never been reported before for this type of data. Whether it is biologically significant remains to be verified by others in the field, but we believe this is not a limitation of the study. How to handle big data from neurophysiology and extract biological meaning from its computational analyses is one of the most important questions, if not the most important, in neuroscience in the next 25 years. Our study provides a road map for one type of big data analysis of high density ECoG data.

Our approach of combining large-scale high-resolution neuronal recording and unbiased data-driven analysis is aimed at disentangling simultaneous and interdependent functional brain networks to provide a high-resolution brain-wide network description of context processing. Our findings demonstrate the structure of a “context” brain network showing that contextual information can be encoded in the dynamic interactions between the sensory and association areas, which could explain the neuronal basis of why context processing can affect a wide range of cognitive operations, from lower-level perception to higher-level executive functions. Moreover, our data-driven analysis also discovered other modular brain networks that either process lower-level sensory inputs for context encoding or integrate contextual information for behavioral modulation.

The brain’s ability to process contextual information provides enormous behavioral flexibility, while deficits are thought to lead to psychiatric disorders such as schizophrenia and post-traumatic stress disorder. Even though contextual information processing has been studied in a variety of cognitive domains and brain areas, a comprehensive functional view of the brain circuits that mediate contextual processing and modulation remains lacking. The value of our study is that we can use unbiased computational tools to extract structure from physiological data that appears to track actual cognitive processing. New methods should be evaluated on their potential to provide quantitative assessments of raw data. We do not want to be penalized because our data do not fit with less complex models drawn from fMRI, EEG, or MEG methods.

The authors' main claim seems to be that it will reveal previously undiscovered ‘cognitive chains’. What exactly is meant by this? That brain regions are activated sequentially in response to a task, and that different brain areas will be recruited depending upon the cognitive function? On the one hand, this seems to be something that we already know from many years of MEG and EEG research in humans. On the other, it is clear that the spatial resolution of the ECoG data far surpasses this research, and the extraction of network structure is very different from what has gone before.

In the revision, we have eliminated discussion of “cognitive chains”. The concept we focused on instead is “sequential and modular network interactions”, which means that not only are brain regions activated at different times during a task, but their interactions also contain modular structures. In the new Figure 7, we show modular networks whose activity was coordinated with each other in a deterministic manner, even though they overlapped in time, frequency, and space. We believe that these multiplexed neuronal structures represent a meta-structure organization for brain network communication.

Nevertheless, I still very much struggled to understand whether this extraction of network structure was (a) valid or (b) important. In terms of validity, if I were reviewing this paper at a methods journal, I'd expect a set of examples in simulated data to convince me that the method works well and robustly. In terms of importance, it's precisely because the task is poorly controlled that I don't get an “aha!” moment when examining the results that convinces me that it has definitely worked.

I can see that there is a lot of potential in the paper, and it seems unfair to dismiss what could be an important set of findings about a novel technique. But equally I didn't find the results and structure of the paper sufficiently compelling or clear to warrant publication at present. I'd be open to other reviewers pointing out what they felt was the evidence that the technique works convincingly, or that it has produced a particularly important result.

In summary, we have addressed these concerns by:

1) Revising the paper to focus on cortical networks underlying “context processing”, instead of “social cognition” or “cognitive chains”.

2) Providing new behavioral analyses to reveal context-dependent eye movement to enable a more direct comparison between brain activity and behavior.

3) Clarifying and simplifying the methods and results with improved text and figures.

Reviewer #3:

The authors have done a good job dealing with the technical concerns and no longer over-interpret the data.

However, there still remain concerns about whether the manuscript as currently written can be understood by a broad interest readership, or even a relatively specialised readership within the field. Indeed the mathematical expertise needed even to understand the basic analysis is extreme, and the authors do a very poor job in making the analysis comprehensible. Like the other reviewers, I suspect, I am still finding it difficult to really evaluate the neuroscientific findings, as I cannot fully understand their implications.

Despite the unusual and high quality of the data and the sophisticated nature of the analysis, it is therefore very difficult to understand what we have learnt about the cognitive processes.

For example, the figure legends do not explain how to read the figures. The new diagram asked for by a different reviewer is difficult to understand.

It is essential that the authors address this in both the Abstract and the main text before this can be published in a journal such as eLife.

In this revision, we added a new figure (Figure 3) to give a more intuitive idea of our analysis. We also refined the main result figures (Figures 4 and 5), removed unnecessary supplemental figures, and moved the more technical descriptions from the main text and figure legends to the Materials and methods. Furthermore, we have given the paper to two readers outside of the field, and incorporated their feedback in this revision.

For neuroscientific findings, we focus the revision on neural processing of “context”: “The findings provide fundamental insights into context processing. […] These dynamic interactions between the unimodal sensory and multimodal association areas could explain the neuronal basis of why context processing can affect a wide range of cognitive operations, from lower-level perception to higher-level executive functions.”

And: “We showed that contextual processing, from initial perception to later modulation, was represented not by sequential activation of functionally specialized brain areas, but by sequential communication among functionally specialized brain areas. Thus, we believe that the network structures we observed represent a module of modules, or “meta-module” for brain connectivity, and further investigation on this meta-structure organization for brain network communication could help determine how deficits in context processing could contribute to psychiatric disorders such as schizophrenia (Barch et al., 2003) and post-traumatic stress disorder (Milad et al., 2009).”

[Editors' note: further revisions were requested prior to acceptance, as described below.]

[...] If I am correct, it seems like the key claims can be summarised as follows:

a) Large-scale network interactions are different in different contexts.

b) Bottom-up connections from posterior temporal cortex to anterior temporal cortex and medial PFC can encode a context whilst it is being first processed and held on-line.

c) These exact same brain regions exhibit the opposite top-down connectivity only seconds later when the context is being applied to incoming sensory information.

d) The extent to which the bottom-up connectivity is active during the processing of context predicts the extent to which the top-down connectivity will act when the context is later being applied.

To me, these claims seem to be striking and important and it is these claims that the manuscript has been judged on, but these claims no longer appear anywhere in the Abstract or Introduction and are only in a subsection of the Discussion. So I am asking for one further revision. In this round of revision, I am asking for changes to the Abstract, to the Introduction (and possibly also to the Discussion if you choose), which concisely and precisely highlight the new neuroscience claims, and which change the tone of the current version of the manuscript from being largely a methodological innovation, to being a balanced manuscript which introduces a new technique to make a clear and precise claim about the contextual processing of sensory information.

We deeply appreciate and fully agree with the comments and suggestions. In this revision, we rewrote the Abstract and significantly restructured the Introduction, with the goal of balancing the paper by clarifying and emphasizing the neuroscience insights our paper provides for context processing. We also fine-tuned the Discussion accordingly.

https://doi.org/10.7554/eLife.06121.028

Article and author information

Author details

  1. Zenas C Chao

    1. Laboratory for Adaptive Intelligence, RIKEN Brain Science Institute, Wako-shi, Japan
    Contribution
    ZCC, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    For correspondence
    1. zenas.c.chao@gmail.com
    Competing interests
    The authors declare that no competing interests exist.
  2. Yasuo Nagasaka

    1. Laboratory for Adaptive Intelligence, RIKEN Brain Science Institute, Wako-shi, Japan
    Contribution
    YN, Conception and design, Acquisition of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  3. Naotaka Fujii

    1. Laboratory for Adaptive Intelligence, RIKEN Brain Science Institute, Wako-shi, Japan
    Contribution
    NF, Conception and design, Drafting or revising the article, Contributed unpublished essential data or reagents
    For correspondence
    1. na@brain.riken.jp
    Competing interests
    The authors declare that no competing interests exist.

Funding

Ministry of Education, Culture, Sports, Science, and Technology (Grant-in-Aid for Scientific Research on Innovative Areas, 23118002)

  • Naotaka Fujii

The funder had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Charles Yokoyama for valuable discussion and paper editing, and Naomi Hasegawa and Tomonori Notoya for medical and technical assistance. We also thank Jun Tani and Douglas Bakkum for their critical comments.

Ethics

Animal experimentation: All experimental and surgical procedures were performed in accordance with the experimental protocols (No. H24-2-203(4)) approved by the RIKEN ethics committee and the recommendations of the Weatherall report, ‘The use of non-human primates in research’. Implantation surgery was performed under sodium pentobarbital anesthesia, and all efforts were made to minimize suffering. No animal was sacrificed in this study. Overall care was managed by the Division of Research Resource Center at RIKEN Brain Science Institute. The animal was housed in a large individual enclosure with other animals visible in the room, and maintained on a 12:12-hr light:dark cycle. The animal was given food (PS-A; Oriental Yeast Co., Ltd., Tokyo, Japan) and water ad libitum, and also daily fruit/dry treats as a means of enrichment and novelty. The animal was occasionally provided toys in the cage. The in-house veterinary doctor checked the animal and updated daily feedings in order to maintain weight. We have attempted to offer as humane treatment of our subject as possible.

Reviewing Editor

  1. Timothy Behrens, Reviewing Editor, Oxford University, United Kingdom

Publication history

  1. Received: December 16, 2014
  2. Accepted: August 30, 2015
  3. Version of Record published: September 29, 2015 (version 1)

Copyright

© 2015, Chao et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,777
    Page views
  • 378
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
