In the main text, we identify three striking properties of the two dominant shared modulators: (1) they each target one of the two V4 hemispheres; (2) they preferentially target the task-specific neurons within these hemispheres; and (3) their variance decreases under cued attention. Here we show that these features are preserved in higher-dimensional modulator models. In Figure 2—figure supplement 3, we show that the modulatory components beyond the dominant two do not convey any additional structure in these three domains. It is useful to first view the modulator model as a form of exponential-family Principal Component Analysis (PCA) (Collins et al., 2001; Solo and Pasha, 2013; Pfau et al., 2013). Standard PCA, like the shared modulator model, uncovers directions of maximal variation in signal space. However, PCA suffers from an identifiability problem: it can uniquely recover the subspace in which a small set of signals lies, but not the coordinate axes within that subspace. PCA does select a particular orthogonal coordinate system to represent this subspace, but this solution is not unique, is sensitive to noise, and typically reveals little about the underlying generative process. The same identifiability problem is present in the shared modulator model. In the two-modulator case, we are able to resolve the ambiguity in the coordinate system by exploiting anatomical information (Figure 2b–c; see Materials and methods). However, the problem of identifiability becomes more acute in higher dimensions. Here, we show that the results presented in the main text for the 1-modulator/hemisphere model also hold for the unconstrained 2-modulator model, and we extend the 2-modulator results to the 3- and 4-modulator cases. This is necessary because, unlike in standard PCA, the solutions in lower dimensions do not necessarily lie within subspaces of the higher-dimensional solutions: the regularization scheme and fitting algorithm we use introduce biases that disrupt any strict nesting.
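The identifiability problem described above can be illustrated numerically with standard PCA: the leading principal subspace is recovered uniquely, but any rotation of the chosen axes within that subspace describes the data equally well. A minimal sketch (using NumPy; the data and all variable names are illustrative, not taken from the recordings):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data lying mostly in a 2D subspace of a 10D signal space
latent = rng.normal(size=(500, 2)) * [3.0, 1.5]
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.1 * rng.normal(size=(500, 10))
X -= X.mean(axis=0)

# Top-2 principal axes via SVD
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt[:2].T                       # 10 x 2 orthonormal basis

# Rotating the basis within the subspace by any angle ...
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
V_rot = V @ R

# ... leaves the subspace itself unchanged: the projection
# operators onto the two bases are identical.
P, P_rot = V @ V.T, V_rot @ V_rot.T
print(np.allclose(P, P_rot))       # True: subspace identified, axes are not
```

This is why the analyses below are designed to be invariant to the choice of coordinate system within the modulation subspace.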
We therefore need to test explicitly whether the structure we uncover in the 2-modulator model is also present in the 3- and 4-modulator models, and this must be done within the limitations of the identifiability problem, i.e. without choosing a particular coordinate system for the modulation subspace. First column: In Figure 2b–c, we showed that the vectors of modulator coupling weights for LHS units and RHS units in the 2D modulator model were typically orthogonal. Here we show that this holds in higher dimensions. For each recording day, we measured the angle between the average weight vector for LHS units and the average weight vector for RHS units, i.e. the arc cosine of their normalized inner product. The 2-modulator hemisphere-constrained model used in most of the main text has this orthogonality enforced by constraint (top row). For the unconstrained 2-, 3-, and 4-modulator models (remaining rows), the blue histograms show the distribution of these angles across recording days. For comparison, we shuffle the anatomical labels on the units and repeat the analysis to obtain the red histograms. The clustering of the actual data around π/2 indicates near orthogonality of the hemispheric weights. Second column: In Figure 2d, we showed that neurons that were task-relevant (i.e. had larger d′ values) were more strongly coupled to the (1D) hemispheric shared modulators. Here, we show that this holds in higher dimensions. For each recording day, we measured the magnitudes of all units' coupling weight vectors. Green histograms show the distribution of magnitudes for the quartile of units with the largest d′ values; brown histograms show the distribution for the quartile with the smallest d′ values.
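The angle measure used in the first column can be sketched as follows. This is an illustrative reconstruction, not the analysis code used in the study: the coupling-weight matrix, the hemisphere labels, and the function name are all hypothetical, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)

def hemisphere_angle(W, is_lhs):
    """Angle between mean coupling-weight vectors of LHS and RHS units.

    W: (n_units, n_modulators) modulator coupling weights (hypothetical).
    is_lhs: boolean mask marking the left-hemisphere units.
    """
    w_l = W[is_lhs].mean(axis=0)
    w_r = W[~is_lhs].mean(axis=0)
    # Arc cosine of the normalized inner product
    cos = w_l @ w_r / (np.linalg.norm(w_l) * np.linalg.norm(w_r))
    return np.arccos(np.clip(cos, -1.0, 1.0))

# Toy example: LHS units load on modulator 1, RHS units on modulator 2,
# so the two hemispheric mean weight vectors are nearly orthogonal.
n_units = 40
is_lhs = np.arange(n_units) < n_units // 2
W = 0.05 * rng.normal(size=(n_units, 2))
W[is_lhs, 0] += 1.0
W[~is_lhs, 1] += 1.0

angle = hemisphere_angle(W, is_lhs)                        # close to pi/2
angle_shuf = hemisphere_angle(W, rng.permutation(is_lhs))  # shuffle control
```

Repeating the shuffled computation across recording days yields a null distribution analogous to the red histograms in the figure.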
Third column: In Figure 3b, we showed that the variance of the (1D) hemispheric shared modulators changed according to the attentional cue: specifically, when the cue switched, one hemispheric modulator decreased in variance, while the other increased in variance. To show that this holds in higher dimensions, it is necessary to construct an appropriate metric for this change in second-order statistics that generalizes to higher dimensions and does not depend on a choice of coordinate system. To accomplish this, we measure the effect of the attentional cue as a change in the covariance of the (multivariate) modulator. Considering the change from the cue-right to the cue-left condition, we can measure the effect on the modulator's second-order statistics via the ratio of the two modulator covariances, Σcue-left Σcue-right−1. The eigenvalues of this matrix then provide a coordinate-system-free measure of how the modulator statistics change. If the largest eigenvalue, λmax, is significantly greater than 1, then there is a direction in modulation space that became more variable due to the switch in cue. If the smallest eigenvalue, λmin, is significantly less than 1, then there is a direction in modulation space that became less variable due to the switch in cue. Eigenvalues close to 1 indicate directions in which the variance of modulation was unchanged by the cue. Thus these two values, λmax and λmin, play an analogous role to the ratios of modulator variance examined in Figure 3b. The scatter plots show the distribution of λmax and λmin for the higher-dimensional modulator models. Blue points show these eigenvalues from each recording day; red points show the distributions obtained if we shuffle the cue labels for each trial. Importantly, when λmax exceeds 1 and λmin is less than 1 (i.e. 
when the points lie in the lower right quadrant), the change in attentional cue causes an increase in modulator variance in one direction and a decrease in modulator variance in an orthogonal direction. These effects are clear (and significant, compared with the shuffled null distribution in red) in all cases.
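The eigenvalue metric from the third column can be sketched as follows. This is an illustrative reconstruction under assumed notation: the function and variable names are hypothetical, and the modulator samples are synthetic rather than fitted to data.

```python
import numpy as np

rng = np.random.default_rng(2)

def covariance_ratio_eigs(M_left, M_right):
    """Eigenvalues of C_left @ inv(C_right): a coordinate-system-free
    measure of how modulator covariance changes between cue conditions.

    M_left, M_right: (n_trials, n_modulators) modulator samples for the
    cue-left and cue-right conditions (hypothetical).
    """
    C_left = np.cov(M_left, rowvar=False)
    C_right = np.cov(M_right, rowvar=False)
    # The product is not symmetric, but it is similar to the SPD matrix
    # C_right^(-1/2) C_left C_right^(-1/2), so its eigenvalues are real
    # and positive.
    eigs = np.linalg.eigvals(C_left @ np.linalg.inv(C_right))
    return np.sort(eigs.real)

# Toy 2D modulator: switching the cue doubles the variance along one
# axis and halves it along the orthogonal axis.
M_right = rng.normal(size=(2000, 2))
M_left = M_right * np.array([np.sqrt(2.0), np.sqrt(0.5)])
lam = covariance_ratio_eigs(M_left, M_right)
# lam[0] (lambda_min) is approximately 0.5; lam[1] (lambda_max) is
# approximately 2.0, up to sampling error.
```

A point with λmax > 1 and λmin < 1, as in this toy example, corresponds to the lower right quadrant of the scatter plots: variance grows in one direction of modulation space and shrinks in an orthogonal one.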