Demographics of participants.

CTQ=Childhood Trauma Questionnaire, MZQ=Mentalisation Questionnaire, RGPTSB=Revised Green Paranoid Thoughts Scale (Persecutory Subscale), RGPTSA=Revised Green Paranoid Thoughts Scale (Referential Subscale), CAMSQ=Certainty About Mental States Questionnaire, ETMCQ=Epistemic Trust, Mistrust and Credulity Questionnaire, M=Male, F=Female, O=Other. For continuous variables, all means are stated with corresponding standard deviations in brackets. Significant differences are highlighted in bold.

Task and Model Space.

(A) Participants were invited to play a three-phase, repeated social value orientation paradigm, the Intentions Game, with virtual partners. Phase 1 of the Intentions Game lasted 36 trials and asked participants to make a forced choice between two options for how to split points with an anonymous virtual partner. An example of a prosocial-individualistic pair of options could be (self=5, other=5) or (self=10, other=5): a participant who chooses option 1 could be viewed as less individualistic and more prosocial, as the outcome to the other does not change but the self earns less. In phase 2, lasting 54 trials, participants were asked to predict the decisions of a new anonymous partner using the same two-option forced-choice set-up and the same option pairs; participants were given feedback on whether each prediction was correct or incorrect. We used Amazon Web Services to create a novel server architecture to match participants and (virtual) partners (Burgess et al., 2023). Partners in phase 2 were matched to be approximately 50% different from the participant with respect to their choices in phase 1, to ensure all participants needed to learn about their phase 2 partner and to provide a mechanism to examine whether beliefs about partners had an effect on the self. Phase 3 was identical to phase 1, although participants were informed that they were matched with a third anonymous partner, unconnected to the partners in phases 1 and 2. At the end of the game, participants who collected over 1000 points overall were entered into a lottery to win a bonus. (B) We created four models that may explain the data in order to test theories of social generalization. Model M1 assumes participants are subject to both self-insertion and social contagion; that is, participants used their own preferences as a prior about their partner in phase 2, and partner behaviour subsequently influenced participants’ preferences in phase 3.
Model M4 assumes participants are subject to neither self-insertion nor social contagion, instead forming a novel prior over the phase 2 partner rather than using their own preferences, and remaining uninfluenced by their partner after observation. Models M2 and M3 assume participants are subject to either self-insertion or social contagion, but not both. (C) We assume that participants’ choices in phase 1 are governed by both a median and a standard deviation. Participants insert their median preferences into their prior beliefs over their partner in phase 2, but with a different standard deviation to allow for flexibility and learning. The combination of the prior and posterior belief uncertainty about the partner, the precision participants have over their own preferences, and the median posterior of the participant and partner form the new median and standard deviation over participant preferences in phase 3. (D) In contrast to M1, M4 generates a new central tendency over the partner in phase 2, which disconnects participant preferences and prior beliefs. M4 also assumes that the same parameters that generated participant choices in phase 1 also generate choices in phase 3. (E) Simulating our model demonstrates how different combinations of α (preferences for absolute self-reward) and β (preferences for relative reward; prosocial-competitiveness) lead to changes in the discrepancy of value between participants and partners (left panel). We also show how increasing uncertainty over self-beliefs, and higher precision over partners, causally draws participants more toward the beliefs of their partner in phase 3 and increases their precision over their phase 3 beliefs (Moutoussis et al., 2016).
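The social-contagion step described in (C) and (E) can be illustrated with a minimal sketch. It assumes, purely for illustration, that self and partner beliefs are Gaussian and are combined by precision weighting; the function name `contagion_update` and the numeric values are hypothetical and are not taken from the fitted model.

```python
import math


def contagion_update(self_median, self_sd, partner_median, partner_sd):
    """Precision-weighted combination of two Gaussian beliefs.

    Illustrative assumption only: the model in the text may combine
    self and partner beliefs differently.
    """
    self_precision = 1.0 / self_sd ** 2
    partner_precision = 1.0 / partner_sd ** 2
    total_precision = self_precision + partner_precision
    new_median = (
        self_precision * self_median + partner_precision * partner_median
    ) / total_precision
    new_sd = math.sqrt(1.0 / total_precision)
    return new_median, new_sd


# A participant who is certain about themselves barely moves toward the partner.
confident = contagion_update(self_median=0.0, self_sd=1.0,
                             partner_median=10.0, partner_sd=2.0)

# A participant who is uncertain about themselves is drawn toward the partner.
uncertain = contagion_update(self_median=0.0, self_sd=4.0,
                             partner_median=10.0, partner_sd=2.0)

print(confident, uncertain)
```

Under this assumption, greater self-uncertainty pulls the phase 3 median further toward the partner, and the combined belief is more precise than the phase 1 belief, which matches the qualitative pattern described in panel (E).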

Parameter and model specification.

Grey shading = parameters relevant to representations of the self (ppt). Orange shading = parameters relevant to representations of the other (par). Free = parameters are random variables to fit through model inversion. Derived = parameter is calculated from latent values within the model. SD = standard deviation.

Beliefs between groups and within phases.

(A) We used random-effects hierarchical model fitting and comparison to jointly estimate group-level and individual-level parameters based on real data from participants (Piray et al., 2019). CON participants were best fit by M1, whereas BPD participants were best fit by M4 at the group level. Looking within each model by simulating the beliefs of each participant reveals that, as expected, CON participants use the median of their self-preferences (black distribution) as a basis for their prior beliefs about partners (light orange distribution), and that the precision of their posterior beliefs about partners (dark orange distribution) and the precision of their own self-preferences lead to a shifted model of the self (grey distribution). BPD participants, on the other hand, have a disintegrated prior over their partner which is not subject to their own self-representation. Likewise, there is no change in self-preferences following learning, and thus an absence of the light grey distribution. For illustration, we focus on beliefs over relative preferences (β) and use real individual participants as exemplars. (B) Across models we extracted the common parameters that generate the behaviour of both CON and BPD participants, that is, their median and standard deviation over both α (absolute reward preferences) and β (relative value preferences), the flexibility over participants’ prior beliefs about their partners over each dimension, and the absolute change in posterior beliefs in phase 2 over each dimension. Using hierarchical Bayesian t-tests we estimated the mean difference in parameter values between groups. Purple values lower than 0 indicate that the BPD participants had significantly smaller parameter values. Here we find that BPD participants were less individualistic, equally prosocial, and more certain about their self-preferences.
BPD participants were also less flexible over their beliefs about a partner’s absolute reward preferences and updated their beliefs less across the board. (C) We also calculated the Kullback-Leibler divergence (DKL) of beliefs between consecutive trials (t-1 vs t) throughout phase 2. We observed three things: 1. all participants display more sensitive updates initially, 2. all participants ‘cool off’ in their sensitivity over the course of phase 2, and 3. BPD participants make significantly less sensitive updates throughout the course of phase 2 vs. CON participants. (D) Examining reaction times of participants over phase 2 revealed that participants became faster at making predictions as phase 2 continued. We also find that participant similarity interacted with trial to change reaction times, such that higher participant-partner similarity reduced reaction times in earlier trials but this difference was attenuated over time. Participant (PPT)-partner (PAR) similarity was calculated as the combined distance between participant and partner parameters determined by server matching along absolute (α) and relative reward (β) axes. Similarity was visualised as dichotomous for illustration but treated as a continuous variable in our analyses. (E) Examining participants under a blanket assumption that participants in both BPD and CON groups were influenced by their partner revealed that BPD participants were significantly less influenced by their partner across the board, both with respect to their phase 3 median and standard deviation of beliefs. Kruskal-Wallis tests were used between groups within the visualisation. *=p<0.05, **=p<0.01, ***=p<0.001, ****=p<0.0001.
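The trial-wise belief updating measure in (C) can be sketched as follows. This is a minimal illustration that assumes beliefs are stored as discrete probability distributions over a shared parameter grid and that the divergence is taken from the belief at trial t to the belief at trial t-1; the direction, the grid, and the names `kl_divergence` and `trialwise_kl` are assumptions rather than details given in the text.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions on the same grid."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def trialwise_kl(beliefs):
    """Divergence between the belief at trial t and at trial t-1.

    `beliefs` is a list of distributions, one per trial (hypothetical layout).
    """
    return [
        kl_divergence(beliefs[t], beliefs[t - 1])
        for t in range(1, len(beliefs))
    ]


# Toy example: a flat prior, one informative update, then no further change.
beliefs = [
    [0.25, 0.25, 0.25, 0.25],
    [0.10, 0.20, 0.30, 0.40],
    [0.10, 0.20, 0.30, 0.40],
]
updates = trialwise_kl(beliefs)
print(updates)
```

A large first value followed by values near zero corresponds to the 'cooling off' pattern described above.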

Model Accuracy.

(A) We used random-effects hierarchical model fitting and comparison to jointly estimate group-level and individual-level parameters on simulated data (Piray et al., 2019). CON participants were best fit by M1, whereas BPD participants were best fit by M4. (B) Server matching between participant and partner in phase 2 was successful, with participants being approximately 50% different to their partners with respect to the choices each would have made on each trial in phase 2 (mean similarity=0.49, SD=0.12). Model accuracy across the task was very high (mean accuracy=0.8, SD=0.12). Model accuracy within each phase was very high (mean accuracy[phase1]=0.83, SD[phase1]=0.16; mean accuracy[phase2]=0.77, SD[phase2]=0.14; mean accuracy[phase3]=0.82, SD[phase3]=0.17). Log-likelihood values were also well above what would be expected had the model fitted the data by chance (median=-40.68, SD=22.7; chance value=-87.33). Choice probabilities generated by the model on each trial were also well above chance thresholds (median=0.91, SD=0.24; chance value=0.5). (C) The Spearman association between the responsibility allocated to each participant during real and recovered model comparison was high on the diagonal. There was some correlation between M1 and M2, but this was due to M2 being a nested model of M1, sharing similar free parameters; this was not worrying in light of excellent model identifiability overall in the synthetic comparison. Associations between real and recovered parameters from the dominant model within each of the BPD and CON groups were very high, with few cross-correlations on the off-diagonal. In both confusion and parameter recovery matrices, white spaces indicate non-significant associations (p > 0.01). (D) (top panel) The relationship between uncertainty over the self and uncertainty over the other with respect to the change in the precision (left) and median shift (right) in phase 3 beliefs.
CON participant self and other uncertainty is overlaid onto the plot to demonstrate the degree to which their beliefs should change in phase 3 according to the model. (bottom panel) Correlating the model-predicted median shift in beliefs and derived change in beliefs between phases 1 and 3 demonstrates a very strong association (r = 0.88, p < 0.001). For the purposes of visualisation we restrict real and simulated values to <15 for compactness, although the reported correlation is unaffected by this visual constraint. (E) (left panel) We overlay model-predicted (solid line) and real observed (dashed line) trial-by-trial probabilities extracted from a linear model for a correct prediction by participants. Both closely match. For raw trial-by-trial updating see Supplementary Figure 5. (middle panel) There was no significant difference (ns) between BPD and CON participants with respect to their total correct answers over phase 2. (right panel) Model-predicted and real observations in phase 2 total scores were highly correlated in both groups (CON r=0.84, p<0.001; BPD r=0.89, p<0.001).
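The chance log-likelihood quoted in (B) can be checked with a short calculation. The sketch below assumes that chance means choosing each of the two options with probability 0.5 on every trial, and that all three phases are counted, with phase 3 having the same 36 trials as phase 1; both are assumptions inferred from the task description rather than statements in the caption.

```python
import math

# Trials per phase: phase 3 is assumed to match phase 1 (36 trials).
TRIALS_PER_PHASE = (36, 54, 36)


def chance_log_likelihood(n_trials, n_options=2):
    """Log-likelihood of n_trials choices made uniformly at random."""
    return n_trials * math.log(1.0 / n_options)


total_trials = sum(TRIALS_PER_PHASE)
chance_ll = chance_log_likelihood(total_trials)
print(total_trials, round(chance_ll, 2))
```

Under these assumptions the result is close to the reported chance value, so the fitted median log-likelihood of -40.68 sits well above it.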

Psychometric correlations.

(A) We conducted Spearman rank correlations between belief flexibility and updating in phase 2, controlling for true baseline similarity with respect to server-derived parameters. We found that childhood trauma was negatively associated with flexibility and updating over relative reward preferences. Persecutory ideation scores were negatively associated with belief flexibility and updating across the board. (B) We conducted Spearman rank correlations between belief flexibility and absolute partner-participant dissimilarity (with respect to server-derived parameters) in phase 2. Only flexibility over relative reward preferences in phase 2 was associated with harmful intent attributions. Increased absolute participant-partner dissimilarity was associated with lower self-interest attributions, and increased relative participant-partner dissimilarity was associated with higher harmful intent attributions.
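A rank correlation that controls for a covariate, as in (A), can be sketched as a partial correlation computed on ranks. This is one common approach and is shown only as an assumption about how such a control could be implemented; the variable names and toy values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def partial_spearman(x, y, z):
    """Spearman correlation of x and y, partialling out covariate z."""
    rx, ry, rz = average_ranks(x), average_ranks(y), average_ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (hypothetical): trauma score, belief flexibility, baseline similarity.
trauma = [10, 20, 30, 40, 50, 60]
flexibility = [6, 5, 4, 3, 2, 1]
similarity = [3, 1, 2, 6, 4, 5]
rho = partial_spearman(trauma, flexibility, similarity)
print(rho)
```

In this toy data, flexibility falls monotonically with trauma, so the association remains strongly negative after the covariate is partialled out.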

Group Level Parameter Values.

BPD participants were best explained by M4, which has two more free parameters than M1, the model that best explained CON participants.

Individual Level Parameter Distributions Per Group.

BPD participants (purple) were best explained by M4, which has two more free parameters (alpha_par and beta_par) than M1, the model that best explained CON participants (blue).

Simulation of Phase 2 priors that may be drawn from a memory of an aversive other vs from the self alone.

We can imagine a scenario in which a prosocial participant (typical of BPD and CON) has a strong impression from memory of an other who is particularly aversive (competitive). Using a mixture of the median belief of the self (classified in phase 1) and the belief about how this notional competitive other would act, we can create a causal model of how priors in phase 2 about an anonymous partner might draw on different sources. Here, the median of the prior over the partner in phase 2 is a mixture of the median belief of the self and of the ‘notional’ other. An equal mixture of self and other belief would equally explain the naïve prior BPD participants hold over their partner in phase 2. However, as mentioned, the fact that BPD participants hold a naïve prior even when they are themselves competitive goes against this hypothesis. It is nonetheless worth testing.
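The mixture idea can be made concrete with a small sketch. It assumes a simple linear mixture of medians with a single weight and a sign convention in which negative β is prosocial and positive β is competitive; the function name, the weight and the numeric values are hypothetical.

```python
def mixture_prior_median(self_median, other_median, weight_self):
    """Median of the phase 2 prior as a weighted mix of self and notional other."""
    if not 0.0 <= weight_self <= 1.0:
        raise ValueError("weight_self must lie between 0 and 1")
    return weight_self * self_median + (1.0 - weight_self) * other_median


# Assumed sign convention: negative beta = prosocial, positive beta = competitive.
prosocial_self = mixture_prior_median(self_median=-10.0, other_median=10.0,
                                      weight_self=0.5)
competitive_self = mixture_prior_median(self_median=10.0, other_median=10.0,
                                        weight_self=0.5)
print(prosocial_self, competitive_self)
```

With an equal mixture, a prosocial self and a competitive notional other yield a neutral prior, whereas a competitive self would yield a competitive prior. A neutral prior in competitive participants is therefore hard to reconcile with this account, as noted above.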

Exemplar distribution from an individual with a diagnosis of BPD who was competitive in phase 1 and matched with a partner who was prosocial in phase 2. We note that irrespective of the valence of BPD participants’ preferences, there was still a neutral prior generated that was not integrated into the model of self.

(top panels) Raw trial-wise probability of correct responses from real and model-simulated observations for each group. Probabilities were approximated by grouping by trial across each group, summing the total correct responses and dividing by 54. (bottom panel) Cumulative percentage of correct predictions in phase 2 for each group are shown as thick solid lines. Individual cumulative scores are depicted as thin translucent lines.
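The cumulative curves in the bottom panel can be sketched as a running percentage of correct predictions. The function name and the toy outcome sequence are hypothetical; the sketch only illustrates the running calculation for a single participant.

```python
def cumulative_percent_correct(outcomes):
    """Running percentage of correct predictions after each trial.

    `outcomes` holds 1 (correct) or 0 (incorrect) for each phase 2 trial.
    """
    running_total = 0
    curve = []
    for trial_number, correct in enumerate(outcomes, start=1):
        running_total += int(correct)
        curve.append(100.0 * running_total / trial_number)
    return curve


# Toy sequence of four predictions: correct, incorrect, correct, correct.
curve = cumulative_percent_correct([1, 0, 1, 1])
print(curve)
```

Averaging such curves across the participants in a group would give a group-level line of the kind shown as thick solid lines.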

2D distribution of participant and partner parameters estimated through Bayesian inference at the AWS server backend during the participant-partner matching protocol. As a sanity check we also assessed the degree to which server-derived participant parameters matched parameters derived from model fitting; any discrepancy may have led the server to match partners to participants inappropriately. We observed excellent correlations between server-derived participant parameters (not used for analysis; only for partner matching in game) and model-derived phase 1 parameters.

Spearman Correlations Between Psychometric Scores at Baseline and Self/Other Parameters.

(Top) Psychometric correlations with parameters for self. (Bottom) Psychometric correlations with parameters for other. All correlations with p-values > 0.05 are omitted.

Spearman’s ρ between psychometric measures and absolute change in self-preferences from phase 1 to 3.

All belief metrics are extracted from M3, which assumes all participants engage in social contagion. Cred = Credulity. Delta = whether the shift in belief was along preferences for absolute (alpha) or relative (beta) reward.

Linear random effects relationship between reaction time (ms) and belief updating.

Grey lines are individual participants. Black line is the average linear effect. Reaction time is capped at 10000ms for visual illustration, but linear models do not apply an upper limit.

Option pair rewards for each phase and their corresponding ‘type’. The order of trials within each phase was randomised. P=Prosocial, I=Individualistic, C=Competitive. S1 = reward to self for option 1. S2 = reward to self for option 2. O1 = reward to other for option 1. O2 = reward to other for option 2.