Experimental paradigm.

(a) The task comprised three phases: Solo, Learning, and Observed. (e) During the Solo phase, participants made a series of risky choices alone, which were used to measure their own risk preferences. (a,b,f) During the Learning phase, participants were introduced to two random partners and asked to predict their choices. Unbeknownst to participants, one partner had risk-averse preferences (Risk-averse partner) and the other had risk-tolerant preferences (Risk-seeking partner). To aid partner identification, each partner was labeled with a letter (A or B) and color-coded (counterbalanced). On each trial, an agent identifier indicating the identity of the partner to be predicted was presented at the center of the screen. (a,g) During the Observed phase, participants made the same type of gamble choices as in the Solo phase. Critically, at the beginning of some trials (‘Observer trials’), participants were informed that their choice on that trial would later be used in the Learning phase for one of the two assigned partners. On these Observer trials, the designated partner was presented as an avatar observing through an open door. ‘No observer trials’, on which individuals’ choices would not be presented to any partner, were indicated by a vacant open door. (c) To depict individuals’ prediction performance during the Learning phase, participants’ prediction choices were binned with a bin size of 6 trials and 3-trial overlaps. Over repeated prediction trials with feedback, individuals successfully learned the two partners’ simulated risk preferences. Error bars indicate s.e.m. (d) At the end of the Learning phase, individuals were asked to answer a few questions regarding their impressions of each partner’s characteristics.
Participants’ reports on the question ‘How risky was this partner?’ were consistent with their prediction behavior: they evaluated the Risk-seeking partner as significantly riskier than the Risk-averse partner (t(42) = -35.83, P = 4.10e-33). Grey dots represent each individual’s evaluation score; error bars indicate s.e.m.; ***P < 0.001.
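
The binning described for panel (c), 6-trial bins with 3-trial overlaps, can be sketched as a sliding-window average. This is a minimal illustration, not the authors' analysis code: the function name `sliding_bin_means`, the example data, and the choice to drop an incomplete final window are all assumptions, and the caption does not state which quantity was averaged within each bin.

```python
from statistics import mean

def sliding_bin_means(values, bin_size=6, overlap=3):
    """Average `values` in overlapping windows.

    Consecutive windows start `bin_size - overlap` trials apart, so with
    6-trial bins and a 3-trial overlap the step is 3 trials.
    Trailing trials that do not fill a complete window are dropped
    (an assumption; the caption does not say how the tail was handled).
    """
    step = bin_size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than bin_size")
    return [
        mean(values[start:start + bin_size])
        for start in range(0, len(values) - bin_size + 1, step)
    ]

# Hypothetical data: 1 = predicted the risky option, 0 = predicted the safe option.
predictions = [0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1]
binned = sliding_bin_means(predictions)
```

With 12 trials this yields three windows, covering trials 1 to 6, 4 to 9, and 7 to 12, so each interior trial contributes to two adjacent bins.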

Behavioral results.

(a) During the Learning phase, individuals were asked to predict partners’ choices. On the very first prediction trial, individuals had to make predictions without any information about the partners. Compared with the choices that individuals would have made based on their estimated risk preferences, individuals predicted that both partners were more likely to choose the risky option (Risk-averse partner: χ² = 3.33, P = 0.068; Risk-seeking partner: χ² = 7.37, P = 0.0066). (b) Individuals did not receive feedback about their predictions on the last 10 trials of the Learning phase. On these trials, participants still predicted that the partner whose true risk preference was set to be riskier would make riskier choices than the other partner (t(42) = -21.54, P = 2.56e-24). Each dot represents an individual participant. (c) In the Observed phase, individuals made gambling choices under three different conditions: Risk-averse observer, No observer, and Risk-seeking observer trials. Relative to No observer trials, participants made more safe gambles when the Risk-averse partner was to observe their choices, and more risky gambles when the Risk-seeking partner was to observe (repeated-measures ANOVA, F(2, 42) = 6.82, P = 0.0018; paired t-tests: Risk-averse vs. No observer: t(42) = –2.28, P = 0.028; Risk-seeking vs. No observer: t(42) = 1.84, P = 0.072; Risk-averse vs. Risk-seeking: t(42) = –3.28, P = 0.0021). Error bars indicate s.e.m.; †P < 0.1, *P < 0.05, **P < 0.01, ***P < 0.001. (d) Model parameters were estimated using the Social reliance model. Light-colored dots represent individuals. Filled green dots and empty markers indicate the means and medians of each parameter, respectively. (e) Estimated Social reliance parameters explained individuals’ choices during the Observed phase well.
Specifically, individuals who relied the most on the observers’ choice tendencies chose the risky option the least when the Risk-averse partner would be observing, but chose the risky option the most when the Risk-seeking partner would be observing. Each dot represents an individual, and solid lines indicate regression lines.
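
The paired t-tests and s.e.m. error bars reported for panel (c) follow standard formulas, sketched below with the Python standard library only. The function names and the five-participant data are hypothetical and purely illustrative; the reported t(42) values are consistent with 43 participants, since a paired test has n − 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))

def paired_t(cond_a, cond_b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    if len(cond_a) != len(cond_b):
        raise ValueError("paired samples must have equal length")
    # Work on within-participant differences, as a paired test requires.
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    t_stat = mean(diffs) / sem(diffs)
    return t_stat, len(diffs) - 1

# Hypothetical proportions of risky choices for five participants.
risk_averse_observer = [0.40, 0.35, 0.50, 0.45, 0.30]
no_observer = [0.45, 0.40, 0.50, 0.55, 0.35]
t_stat, dof = paired_t(risk_averse_observer, no_observer)
```

A negative `t_stat` here means fewer risky choices under the Risk-averse observer than with no observer, matching the sign convention of the Risk-averse vs. No observer comparison in the caption.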

mPFC and TPJ are recruited for valuation under social observation in addition to the regions tracking non-social subjective value.

(a) When viewing gamble options during the Solo phase, the trial-by-trial probability of the chosen option was positively encoded in the vmPFC (x = –3, y = 62, z = –13, kE = 165, cluster-level PFWE, SVC = 0.009) and vStr (x = 3, y = 14, z = –10, kE = 40, cluster-level PFWE, SVC = 0.015), and negatively encoded in the dACC (x = 12, y = 32, z = 29, kE = 386, cluster-level PFWE, SVC = 0.005). These brain regions were set as regions of interest (ROIs) for the decision-making signals in the Observed phase, where gambling choices were identical apart from the social context. (b) To examine whether the same decision-tracking regions were recruited in the Observed phase, the trial-by-trial probability of the chosen option was calculated based on our proposed Social reliance model. As expected, the same type of decision probability information, comprising the social and non-social components, was tracked in the ROIs during the Observed phase. Each dot represents an individual, and error bars indicate s.e.m.; *P < 0.05; **P < 0.01; ***P < 0.001. (c) Whole-brain analysis revealed that the trial-by-trial probability of the chosen option was positively encoded in the bilateral TPJ when individuals were viewing gamble options during the Observed phase (left TPJ: x = –54, y = –37, z = 14, kE = 104, Punc. < 0.001; right TPJ: x = 63, y = –40, z = 17, kE = 191, cluster-level PFWE, SVC = 0.019). (d) An additional whole-brain analysis revealed that the mPFC responded to the initial social cue (x = –3, y = 50, z = 14, kE = 22, Punc. < 0.005).

TPJ-mPFC connectivity is associated with individuals’ social reliance.

To examine whether the mPFC and the TPJ interacted with each other while individuals made choices under social observation, we conducted psychophysiological interaction (PPI) analyses. (a,b) The functional connectivity between the mPFC from fig. 3d and its adjacent mPFC region was positively associated with log-transformed Social reliance (peak at [x = 3, y = 50, z = 5], kE = 74, cluster-level PFWE, SVC = 0.011). Clusters are displayed in yellow at Punc. < 0.005 and in red at Punc. < 0.001. (c) In an additional PPI analysis, the functional connectivity between the TPJ from fig. 3c and the mPFC from fig. 4a (a region sensitive to the initial social cue) was also positively associated with log-transformed Social reliance (r = 0.43, P = 0.018). (d) The positive relationship between individuals’ social reliance and TPJ-mPFC connectivity was mediated by the mPFC-mPFC connectivity (a: β = 0.15, P = 0.0013, b: β = 2.12, P = 1.42e-06, a × b: β = 0.32, P = 0.0016). Black and gray arrows indicate significant and non-significant associations between the components, respectively. The red arrow indicates a significant mediation effect; *P < 0.05; **P < 0.01; ***P < 0.001.
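
As a rough consistency check on panel (d), the indirect effect in the product-of-coefficients formulation of mediation is the product of paths a and b. Assuming the reported a × b follows that formulation (the caption does not state how the estimate or its P value was obtained), the rounded path coefficients reproduce the reported indirect effect:

```latex
a \times b = 0.15 \times 2.12 = 0.318 \approx 0.32
```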