Figures and data

(A) Experimental tasks. Participants learnt two tasks with distinct structures – a hierarchy task (left) and a flat task (right). In the hierarchy task, one input feature hierarchically specifies which of the other input features determines the response category. In the flat task, the conjunction of all three input features determines the response category. (B) An example trial. The stimulus consists of a child’s face, an outdoor scene and the spoken number nine. The response panel consists of a square and a circle. If the subject selected the circle category, they would press the right key to indicate their response, as the circle is presented to the right of fixation. (C) Stimulus categories employed. Different stimulus sets were employed for the two tasks for each participant, and the stimulus–task mapping was counterbalanced across participants. (D) Task-agnostic and task-tailored representation hypotheses make distinct predictions for how the lPFC may accommodate tasks with different structures. Each plot represents a hypothetical neural space defined by the firing rates of 3 different neurons. The activity of the population in response to each trial type is shown as an individual dot. Geometries for the flat task (top panel) and hierarchy task (bottom panel) are illustrated under the two hypotheses. A task-agnostic representational geometry employs a high-dimensional format that enhances separability across many different dimensions, allowing for multiple different readouts; the same geometry can therefore serve both tasks (left). Task-tailored representations specifically enhance the separability of task-relevant dimensions while reducing it for irrelevant dimensions to enhance generalizability, such that the manifold is optimized for the task structure; distinct geometries are therefore learnt for each task.

Behavioral performance of the scanning group.
Error rates (A) and response times (B) during the training (lines) and scanning (dots) phases. Participants were slower to learn and had a higher asymptotic error rate on the flat task at the end of the training phase. Error rates increased slightly in the scanning phase. Lines represent block-wise performance in the training phase. Dots represent mean overall performance in the scanning phase. Participants did not show an improvement in response time on the flat task. Participants were significantly slower to respond in the scanner (points on the right) compared to the end of training. (C) Behavioral evidence that participants used distinct strategies in the flat vs hierarchy tasks. During training (left plots), RT costs in the flat task were similar for switches of each feature. In the hierarchy task, there was a significantly larger cost when the auditory dimension (AF), which signaled the superordinate context, switched. This task-specific pattern of switch costs was maintained during the scanning phase (right plots). (D) In the scanning phase, response category and motor response interacted in driving RT. VF1 = Visual Feature 1, VF2 = Visual Feature 2, AF = Auditory Feature, RC = Response Category, MR = Motor Response. All error bars reflect 95% confidence intervals.
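A minimal sketch of the feature-switch RT cost computation described in panel (C), assuming a trial-by-trial log with one column per feature. The column names ("VF1", "VF2", "AF", "rt") and the synthetic data are hypothetical placeholders, not the study's actual data structures.

```python
# Sketch: RT cost of a feature switching relative to the previous trial.
# Illustrative only; the trial log and column names are hypothetical.
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n_trials = 400
trials = pd.DataFrame({
    "VF1": rng.integers(0, 2, n_trials),
    "VF2": rng.integers(0, 2, n_trials),
    "AF": rng.integers(0, 2, n_trials),
    "rt": rng.normal(0.7, 0.1, n_trials),   # response times in seconds
})

# Switch cost per feature: mean RT when the feature changed from the previous
# trial minus mean RT when it repeated (first trial excluded).
for feat in ["VF1", "VF2", "AF"]:
    switched = trials[feat].ne(trials[feat].shift()).iloc[1:]
    rt = trials["rt"].iloc[1:]
    cost = rt[switched].mean() - rt[~switched].mean()
    print(f"{feat} switch cost: {cost * 1000:.1f} ms")
```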

Information content of lPFC and pAC representations.
Cross-validated decoding accuracies for all task features in left lPFC (A), right lPFC (B), left pAC (C) and right pAC (D). While left lPFC shows diverse coding of various task features, left and right pAC only show coding of auditory stimulus information. Across both the flat task (E) and the hierarchy task (F), left lPFC shows selective coding of task-relevant stimulus information. In contrast, left pAC shows obligatory coding of auditory stimulus information regardless of its relevance to the task. VF1 = Visual Feature 1, VF2 = Visual Feature 2, AF = Auditory Feature, RC = Response Category, MR = Motor Response. All error bars reflect 95% confidence intervals.
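For concreteness, a minimal sketch of cross-validated decoding of one binary task feature from ROI voxel patterns, using a linear classifier. The data, labels and classifier settings here are illustrative assumptions, not the study's pipeline.

```python
# Sketch: cross-validated decoding of a binary task feature (e.g. AF) from
# trial-by-voxel patterns. Data are synthetic placeholders.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
n_trials, n_voxels = 128, 200
X = rng.standard_normal((n_trials, n_voxels))   # trial-by-voxel activity patterns
y = rng.integers(0, 2, n_trials)                # binary labels for one task feature
X[y == 1] += 0.3                                # inject weak signal so decoding exceeds chance

clf = LinearSVC(max_iter=10000)
cv = StratifiedKFold(n_splits=8, shuffle=True, random_state=0)
acc = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
print(f"cross-validated decoding accuracy: {acc.mean():.3f}")
```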

(A) Separability (left panel) was assessed by decoding all possible balanced dichotomies. The same set of 8 patterns can be split into different dichotomies or classes; for example, three dichotomies are illustrated: red vs yellow points, outlined vs non-outlined points, and points marked with an x vs unmarked points. A total of 35 balanced dichotomies are possible. Shattering dimensionality is the cross-validated decoding accuracy averaged across all dichotomies. Generalizability or abstraction was assessed through cross-generalization (middle panel). Classifiers were trained on half the points assigned to each label (blue in middle panel) and tested on the other half (green in middle panel), with all possible combinations of training and test sets evaluated. Decoding performance averaged across these combinations yields the cross-classification generalization performance (CCGP). Finally, cross-cluster representational alignment was assessed by measuring the angle (θ) between the coding axes associated with the same feature in the two clusters, where the coding axis is the weight vector of the trained classifier. (B) Shattering dimensionality, defined here as the mean cross-validated decoding accuracy across all 35 balanced binary dichotomies, is a measure of the separability of the representation. The separability of representations in lPFC and pAC was significantly above chance levels and did not differ across tasks or regions (B, left panel). Separability for the representation of the task-relevant inputs was significantly higher than for the orthogonal, task-irrelevant inputs in both the flat (B, middle panel) and hierarchy (B, right panel) tasks. (C) Different task variables were encoded in an abstract, generalizable form in the lPFC (C, left panel) across the two tasks. In the flat task, the response category (RC) was encoded in abstract form, while in the hierarchy task it was the auditory feature (AF). In the pAC (C, right panel), only the auditory feature was abstractly coded. VF1 = Visual Feature 1, VF2 = Visual Feature 2, AF = Auditory Feature, RC = Response Category. All error bars reflect 95% confidence intervals.
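A minimal sketch of the two quantities described in panel (A): shattering dimensionality over all 35 balanced dichotomies of 8 condition patterns, and CCGP for one example dichotomy. The condition structure, classifier and data are illustrative assumptions under the definitions given above, not the study's code.

```python
# Sketch: shattering dimensionality and CCGP for 8 condition patterns.
# Data and condition labels are synthetic placeholders.
from itertools import combinations
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_conditions, n_reps, n_voxels = 8, 16, 100
cond = np.repeat(np.arange(n_conditions), n_reps)          # condition label per trial
# condition-specific means + trial noise, so dichotomies are decodable
X = rng.standard_normal((cond.size, n_voxels)) + rng.standard_normal((n_conditions, n_voxels))[cond]

# Shattering dimensionality: mean cross-validated accuracy over all 35 balanced dichotomies
# (each dichotomy assigns 4 of the 8 conditions to one class and 4 to the other).
dichotomies = [c for c in combinations(range(n_conditions), 4) if 0 in c]   # 35 unique splits
accs = []
for pos in dichotomies:
    y = np.isin(cond, pos).astype(int)
    accs.append(cross_val_score(LinearSVC(max_iter=10000), X, y, cv=5).mean())
print(f"shattering dimensionality: {np.mean(accs):.3f}")

# CCGP for one dichotomy: train on a subset of conditions from each side,
# test on the held-out conditions, and average over all such splits.
pos, neg = [0, 1, 2, 3], [4, 5, 6, 7]
ccgp = []
for tr_pos in combinations(pos, 2):
    for tr_neg in combinations(neg, 2):
        train = np.isin(cond, tr_pos + tr_neg)
        y = np.isin(cond, pos).astype(int)
        clf = LinearSVC(max_iter=10000).fit(X[train], y[train])
        ccgp.append(clf.score(X[~train], y[~train]))
print(f"CCGP: {np.mean(ccgp):.3f}")
```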

Properties of the local representation in the flat (orange) and hierarchy (purple) task.
Local representational content was strikingly different across the two task structures. (A) In the flat task, all task variables could be decoded above chance levels in each category cluster. (B) In the hierarchy task, only the context-relevant stimulus feature could be decoded above chance levels in each context cluster. (C) In the flat task, CCGP was at chance levels for all task variables. (D) In the hierarchy task, CCGP was above chance for the context-relevant stimulus feature in each context. (E) Local separability was starkly different across the two task structures. Assessed at the group level with permutation tests, 14 dichotomies were linearly separable in the flat task while only 2 were linearly separable in the hierarchy task (E, left panel), with a paired t-test confirming greater separability in the flat task (t(19) = 2.8, p = 0.006). Local shattering dimensionality (E, right panel) was also significantly higher in the flat task (t(19) = 2.67, p = 0.015). (F) Local representations were not aligned across clusters in either the flat task (F, left panel) or the hierarchy task (F, right panel). Mean angles between coding axes for the same variable across clusters were close to orthogonal and significantly larger than the within-cluster angles. Mean angles are pooled across all decodable variables. All error bars reflect 95% confidence intervals.
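A minimal sketch of the cross-cluster alignment measure used in panel (F): the angle between the weight vectors (coding axes) of classifiers trained separately on the same variable in two clusters. The cluster data and labels are hypothetical placeholders.

```python
# Sketch: angle between coding axes for the same task variable in two clusters.
# X_cluster1 / X_cluster2 hold the voxel patterns of trials in each cluster.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)
n_trials, n_voxels = 64, 100
X_cluster1 = rng.standard_normal((n_trials, n_voxels))
X_cluster2 = rng.standard_normal((n_trials, n_voxels))
y1 = rng.integers(0, 2, n_trials)     # labels for the same variable in cluster 1
y2 = rng.integers(0, 2, n_trials)     # and in cluster 2

w1 = LinearSVC(max_iter=10000).fit(X_cluster1, y1).coef_.ravel()
w2 = LinearSVC(max_iter=10000).fit(X_cluster2, y2).coef_.ravel()

cos = np.dot(w1, w2) / (np.linalg.norm(w1) * np.linalg.norm(w2))
theta = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
print(f"angle between coding axes: {theta:.1f} deg (90 deg = orthogonal, i.e. unaligned)")
```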

Task-relevant components of neural pattern variability in lPFC correlate with behavior.
Results of mixed-effects regressions of trial-by-trial response times and choices on trial-by-trial estimates of signed distance from the classification hyperplane for each task-relevant feature are presented as forest plots for the flat (A, B) and hierarchy (C, D) tasks. In the flat task, trial-by-trial variability in both response times (A) and choices (B) was uniquely explained by neural variability in the coding of response categories in left lPFC. In the hierarchy task, trial-by-trial response time variability was explained by variability in the coding of context information (C), while choices were explained by variability in the coding of the context-relevant stimulus feature (D). Error bars reflect standard error. *p < 0.05; **p < 0.01; ***p < 0.001.
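A minimal sketch of this analysis for response times, assuming one classifier per task-relevant feature and a single random intercept per subject. The synthetic data, the toy RT model and the simplified mixed model ("rt ~ dist" with subject as the only grouping factor) are illustrative assumptions, not the paper's full specification.

```python
# Sketch: regress trial-wise RT on signed distance from a decoding hyperplane,
# with a random intercept per subject. Data are synthetic placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.svm import LinearSVC

rng = np.random.default_rng(3)
n_sub, n_trials, n_voxels = 10, 80, 120
rows = []
for sub in range(n_sub):
    X = rng.standard_normal((n_trials, n_voxels))
    y = rng.integers(0, 2, n_trials)
    clf = LinearSVC(max_iter=10000).fit(X, y)
    # Signed, scale-normalised distance of each trial from the classification hyperplane.
    dist = clf.decision_function(X) / np.linalg.norm(clf.coef_)
    rt = 0.8 - 0.05 * dist + rng.normal(0, 0.1, n_trials)   # toy RT weakly coupled to distance
    rows.append(pd.DataFrame({"subject": sub, "dist": dist, "rt": rt}))

df = pd.concat(rows, ignore_index=True)
model = smf.mixedlm("rt ~ dist", df, groups=df["subject"]).fit()
print(model.summary())
```

For choices, the analogous model would be a mixed-effects logistic regression with the chosen category as the outcome.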