Introduction

It is well known that individuals often make choices differently around social others (Abrams & Hogg, 1990; Cikara et al., 2011; Guassi Moreira et al., 2018). Although the most commonly observed form of social influence is conformity with the choices of social others (Hardin & Higgins, 1996), another line of research has shown that the mere presence of social others biases individuals’ choices to become riskier (Gardner & Steinberg, 2005; Haddad et al., 2014). This unidirectional behavioral bias has been observed particularly in adolescents (Chein et al., 2011; Ciranka & Van den Bos, 2019), and thus previous studies largely focused on characteristics of adolescents (e.g., heightened social sensitivity (Albert & Steinberg, 2011; Foulkes & Blakemore, 2016; Lundborg, 2006) and a developmental imbalance between the reward-sensitive system and the cognitive control system (Chein et al., 2011; Somerville et al., 2010)) to explain why this type of social context (the presence of social others) affects decision-makers in a seemingly different manner. However, recent studies showed that the unidirectional influence of social others’ presence may also be observed in adults (Otterbring, 2021), and that the extent of such influence depends on the identity of the observer (Van Hoorn et al., 2018). These data suggest that, besides neurodevelopmental characteristics, there exists an active processing of information about the social context which determines how individuals respond to others’ presence. Extending this perspective, it can be inferred that the beliefs individuals have about social others, which can be changed and established by learning, may have a crucial role in determining the direction of social influence. Yet, this hypothesis about the impact of social others’ presence has not been explicitly tested.

To examine how individuals’ beliefs about social observers affect their decisions about risky options, we used a three-phase gambling task (Fig. 1a, S1) in which 43 healthy participants (male/female = 25/18, age = 21.35 ± 2.42; Table S1) made choices between one safe (i.e., guaranteed payoff) and one risky option. In the first phase of the task (‘Solo phase’), individuals were asked to make a series of gambling choices alone (Fig. 1e). In the second phase (‘Learning phase’), individuals were asked to predict the gamble choices of two random partners, who were introduced as previous participants whose choices had been recorded (Fig. 1b,f). This phase was expected to give individuals the opportunity to learn about the two partners; unbeknownst to participants, one partner was set to be risk-averse and the other was set to be risk-seeking (see Fig. S3 for the partners’ preferences). In the third phase (‘Observed phase’), individuals were given the same set of gamble choices they had faced in the Solo phase, with each pair of gambles presented three times in total and all trials shuffled in a random order (Fig. 1g). Critically, on some trials, individuals were told that their choices would be used for one of the partners’ Learning phase. In this way, we implemented two types of trials where the choices would be observed by a partner (Risk-averse and Risk-seeking observer trials) and one type of trial where the choices would be made alone (No observer trials). By examining individuals’ choice patterns in the Observed phase, we aimed to examine the impacts of observers on individuals’ risky decision-making.

Experimental paradigm. (a) The task comprised three phases: the Solo, Learning, and Observed phases. (e) During the Solo phase, participants were asked to make a series of risky choices alone, which were used to measure their own risk preference. (a,b,f) During the Learning phase, participants were introduced to two random partners and asked to predict their choices. Unbeknownst to participants, one partner had risk-averse (Risk-averse partner) and the other had risk-tolerant (Risk-seeking partner) preferences. To help partner identification, each partner was labeled with a letter (A or B) and color-coded (counterbalanced). On each trial, an agent identifier indicating the identity of the predicted partner was presented at the center of the screen. (a,g) During the Observed phase, participants were asked to make the same type of gamble choices as in the Solo phase. Critically, at the beginning of some trials (‘Observer trial’), participants were informed that their choice on that trial would later be used in the Learning phase for one of the two assigned partners. On these Observer trials, the identity of the designated partner was presented as an avatar observing through an open door. ‘No observer trials’, on which individuals’ choices would not be presented to any partner, were signaled with a vacant open door. (c) To depict individuals’ prediction performance during the Learning phase, participants’ prediction choices were binned with a bin size of 6 trials and 3-trial overlaps. Over repeated prediction trials with feedback, individuals successfully learned the two partners’ simulated risk preferences. Error bars indicate s.e.m. (d) At the end of the Learning phase, individuals were asked to answer a few questions regarding their impression of each partner’s characteristics. Participants’ reports on the question ‘How risky was this partner?’ showed a pattern consistent with their prediction behavior, such that they evaluated the Risk-seeking partner to be significantly riskier than the Risk-averse partner (t(42)=-35.83, P=4.10e-33). Grey dots represent each individual’s evaluation score; error bars indicate s.e.m.; ***P < 0.001.

Previous functional neuroimaging studies on decision-making under social contexts revealed a set of brain regions that have critical roles in processing social information (Blakemore, 2008; Hiser & Koenigs, 2018; Mukerji et al., 2019). For example, the medial prefrontal cortex (mPFC) is known not only to encode social information not specific to valuation (Amodio & Frith, 2006; Mitchell et al., 2006; Van Overwalle, 2009), but also to encode value signals estimated from the perspective of others (Behrens et al., 2008; Ruff & Fehr, 2014; Sul et al., 2015; Wittmann et al., 2016). Such neural instantiation of social valuation is dissociable from the patterns observed in the ventromedial prefrontal cortex (vmPFC), a region known to track both the subjective value of non-social monetary rewards and the combined value of individuals’ own preferences and their preferences for (or aversion to) social contexts (Amodio & Frith, 2006; Hare et al., 2010; Mitchell et al., 2006; Van Overwalle, 2009). The temporoparietal junction (TPJ) is another region known to play an important role in social cognitive functions, including inferring others’ intentions and learning about others (Behrens et al., 2008; Boorman et al., 2013; Charpentier et al., 2020; Park et al., 2021; Samson et al., 2004; Saxe & Kanwisher, 2003; Saxe & Kanwisher, 2013; Van Overwalle, 2009; Young et al., 2010). Corroborating these neuroimaging data, excitatory stimulation of the TPJ improved social cognition (Santiesteban et al., 2012), while TPJ dysfunctions in psychiatric patients (e.g., schizophrenia and autism spectrum disorder) were associated with their social impairments (Carter & Barch, 2007; Lombardo et al., 2011). Previously, it was shown that when others’ choices were revealed, this network of brain regions was involved in inferring others’ intentions and using the social information in the process of decision-making (Hampton et al., 2008; Zhang & Gläscher, 2020). We hypothesized that even if others’ choices are not explicitly presented, the mere presence of social others may trigger inference about others’ potential choices, and the same set of brain regions will play an important role in value-based decision-making.

To test our hypotheses, we scanned a subset of 30 participants (male/female = 16/14, age = 21.77 ± 2.16; Table S1) and used a computational modeling approach in conjunction with neuroimaging to investigate the mechanisms via which the presence of a social observer affects individuals’ decision processes. Behaviorally, we first tested whether individuals successfully learned partners’ choice preferences, and then investigated whether their decisions, made under the belief of being observed by others, could be accounted for by a combination of their own preferences and the simulated choice tendencies of the observers. Neurally, we predicted that the choices made under social observation would be signaled both in the valuation-associated regions (vmPFC, mPFC) and the region associated with social inference (TPJ). Our results provide neural and behavioral evidence for a role of social observers in affecting individuals’ decisions.

Results

Individuals initially believe that others would make riskier choices than they would

As the main purpose of the current study, we aimed to investigate the impacts of two different observing partners, each of whom has a different risk preference, on individuals’ risky decision-making. To examine individuals’ own risk preferences before they were exposed to any potential social influence, all participants were asked to make a series of gamble choices by themselves in the initial phase of the task (Solo phase; Fig. 1a, e). After their preferences were measured, they were introduced to two random partners and asked to predict these partners’ gambling choices (Learning phase; Fig. 1b, f). Since individuals were not provided with any information about either of the partners, we assumed that individuals’ very first predictions about partners’ choices might reflect their initial beliefs about others’ preferences. Compared with the choices that individuals were expected to make based on their Solo phase choice patterns (i.e., simulated choices based on individuals’ risk preferences), they predicted that partners would make riskier choices (Fig. 2a; Risk-averse partner: χ2 = 3.33, P = 0.068; Risk-seeking partner: χ2 = 7.37, P = 0.0066). This result indicates that individuals may initially believe anonymous others to be more risk-seeking than themselves.

Behavioral results. (a) During the Learning phase, individuals were asked to predict partners’ choices. On the very first prediction trial, individuals had to make predictions without any information about the partners. Compared to the choices that individuals would have made based on their estimated risk preferences, individuals predicted that both partners would be more likely to choose the risky option (Risk-averse partner: χ2=3.33, P=0.068; Risk-seeking partner: χ2=7.37, P=0.0066). (b) Individuals did not receive feedback about their predictions on the last 10 trials of the Learning phase. On these trials, participants still predicted that the partner whose true risk preference was set to be riskier would make riskier choices than the other partner (t(42)= -21.54, P=2.56e-24). Each dot represents an individual participant. (c) In the Observed phase, individuals made gambling choices under three different conditions: Risk-averse observer, No observer, and Risk-seeking observer trials. Relative to No observer trials, participants made more safe choices when the risk-averse partner was to observe their choices, and more risky choices when the risk-seeking partner was to observe (repeated-measures ANOVA, F(2, 42) = 6.82, P = 0.0018; paired t-tests: Risk-averse vs. No observer: t(42) = –2.28, P = 0.028; Risk-seeking vs. No observer: t(42) = 1.84, P = 0.072; Risk-averse vs. Risk-seeking: t(42) = –3.28, P = 0.0021). Error bars indicate s.e.m.; †P < 0.1, *P < 0.05, **P < 0.01, ***P < 0.001. (d) Model parameters were estimated using the Social reliance model. Light-colored dots represent individuals. Filled green dots and empty markers indicate means and medians of each parameter, respectively. (e) Estimated Social reliance parameters explained individuals’ choices during the Observed phase well. Specifically, individuals who relied the most on the observers’ choice tendencies chose the risky option the least when the Risk-averse partner would be observing, but chose the risky option the most when the Risk-seeking partner would be observing. Each dot represents an individual, and solid lines indicate regression lines.

Across repeated attempts to predict partners’ choices with feedback, individuals successfully learned to dissociate and predict the choices each partner would have made (see Materials and Methods for details; Fig. 1c). Furthermore, individuals’ prediction responses on the subsequent 10 prediction trials where no feedback was provided (Fig. 2b), as well as self-reports about the perceived riskiness of the partners collected at the end of the Learning phase (Fig. 1d), consistently showed that they were able to distinguish one partner from the other and correctly estimate the partners’ risk preferences (Predicted risk preference: t(42) = -11.46, P = 1.66e-14; Self-report: t(42) = -35.83, P = 4.10e-33). These results suggest that, independent of their initial beliefs about social others, individuals can learn about others’ preferences through experience.

Social observation from safe and risky partners affects individuals’ choices

During the Observed phase, individuals were instructed that some of their choices would be ‘observed’ (see Materials and Methods for the instruction) by one of the two partners whose risk preferences they had learned about in the Learning phase (‘Observer trial’; Fig. 1g). Compared with the trials where the choices were made alone without any observers (‘No observer trial’), individuals were indeed affected by learned preferences of the observing partners (Fig. 2c). Specifically, under the observation of the risk-averse partner, individuals chose the risky gamble significantly less than they did under no social observation. Under the observation of the risk-seeking partner, individuals were swayed toward making riskier choices, such that the probability of choosing the risky over the safe gamble was marginally larger than that under no observation and significantly larger than that under the risk-averse partner’s observation. Participants’ average probability of choosing the risky gamble under no social observation was comparable to that measured during the Solo phase, which suggests that the observed changes in choices are indeed the results of social observation rather than task repetition. These results indicate that social observation affects individuals’ decisions, and that the direction of this social influence depends on individuals’ beliefs about others.

Individuals’ simulated choices of social observers shape how they make risky choices

Various previous studies examined the impacts of social context on decision-making processes, but the suggested mechanisms by which individuals were affected by social information depended on how the information was presented. For example, when the choices of social others were explicitly revealed, observing these choices added utility to the options chosen by others and swayed individuals to follow others (Chung et al., 2015; Chung et al., 2020). In contrast, when individuals were given the chance to learn about another player’s risk preference but were not explicitly shown her/his choices, their preferences tended to shift toward the learned preference of the other player and, in turn, their choices became similar to the other’s (Suzuki et al., 2016). Here, similar to the latter case, participants did not see what others chose. However, the current study differed from both cases in that individuals were informed that their choices would be observed by others whose preferences they had had the opportunity to learn.

We hypothesized that if individuals are sensitive to the identity of the currently observing partner, they would take into account the learned preferences of others in computing their choices, rather than simply using them to guide the direction in which their own preferences change. To test our hypothesis, we constructed a computational model assuming that individuals mix their choice tendencies based on their own preferences with the simulated choice tendencies derived from the perceived preferences of (or beliefs about) the observing partner. Specifically, we estimated a ‘Social reliance’ parameter ($\omega$) that accounts for the extent to which the simulated choice tendencies of others ($P_{\text{Partner}}$) contribute to individuals’ decisions ($P(\text{risky}) = (1 - \omega) P_{\text{Own}} + \omega P_{\text{Partner}}$; see Materials and Methods for details). Note that the tendencies of how individuals ($P_{\text{Own}}$) or partners ($P_{\text{Partner}}$) make risky choices were formalized using a power utility function (each option’s utility $U = \sum_i p_i (v_i)^{\rho}$ per expected utility theory (Kahneman & Tversky, 1979) for a gamble that has an $i$th outcome $v_i$ with probability $p_i$ for an individual whose risk preference is $\rho$) and the softmax decision rule (Huettel et al., 2006; Preuschoff et al., 2006). At one extreme, where $\omega = 0$, the model reduces to $P_{\text{Own}}$, indicating that individuals do not change how they make risky choices even under the social context (i.e., being under others’ observation). The other extreme ($\omega = 1$) makes the model equivalent to $P_{\text{Partner}}$, indicating that individuals always choose according to the risk preference they believe the partner holds.
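As a minimal sketch of the mixture described above, the following Python snippet computes P(risky) from a power utility function and a two-option softmax (logistic) choice rule. The function names, the example gamble, the parameter values, and the use of a single value-sensitivity parameter λ shared by the own and partner components are assumptions made for illustration only; the exact specification is the one given in Materials and Methods.

```python
import math

def power_utility(outcomes, rho):
    """Expected power utility: sum of p * v**rho over (p, v) outcome pairs."""
    return sum(p * (v ** rho) for p, v in outcomes)

def p_risky_softmax(risky, safe, rho, lam):
    """Two-option softmax probability of choosing the risky gamble.

    lam is a value-sensitivity (inverse temperature) parameter; this exact
    functional form is an assumption of the sketch.
    """
    du = power_utility(risky, rho) - power_utility(safe, rho)
    return 1.0 / (1.0 + math.exp(-lam * du))

def p_risky_social(risky, safe, rho_own, rho_partner, lam, omega):
    """Social reliance mixture: (1 - omega) * P_own + omega * P_partner."""
    p_own = p_risky_softmax(risky, safe, rho_own, lam)
    p_partner = p_risky_softmax(risky, safe, rho_partner, lam)
    return (1.0 - omega) * p_own + omega * p_partner

# Hypothetical gamble: 50% chance of 40 (else 0) versus a guaranteed 20.
risky = [(0.5, 40.0), (0.5, 0.0)]
safe = [(1.0, 20.0)]

# Hypothetical parameters: a risk-averse decision-maker (rho < 1) observed
# by a partner believed to be risk-seeking (rho > 1).
for omega in (0.0, 0.5, 1.0):
    p = p_risky_social(risky, safe, rho_own=0.8, rho_partner=1.2,
                       lam=0.5, omega=omega)
    print(f"omega={omega:.1f}  P(risky)={p:.3f}")
```

In this sketch, ω = 0 returns the decision-maker’s own choice probability and ω = 1 returns the simulated partner’s, with intermediate values interpolating linearly between the two.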

A formal model comparison against two alternative models, the Risk preference change model (Suzuki et al., 2016) and the Other-conferred utility model (Chung et al., 2015) (see Materials and Methods for model descriptions), confirmed that individuals’ choice behaviors were best explained by our suggested Social reliance model (Fig. S2a). We simulated individual choice data for 43 simulated subjects assuming the usage of each of the three potential models, and by re-estimating model parameters, we confirmed that the simulated behaviors from each model were best explained by the corresponding model (i.e., model recovery) (Fig. S2b). In addition, all parameters were recoverable (risk preference [$\rho$]: r = 0.98, P < 0.001; social reliance [$\omega$]: r = 0.43, P = 0.0038; value sensitivity for No observer trials [$\lambda_{\text{NoObs}}$]: r = 0.86, P < 0.001; value sensitivity for Risk-averse observer trials [$\lambda_{\text{AverseObs}}$]: r = 0.49, P < 0.001; value sensitivity for Risk-seeking observer trials [$\lambda_{\text{SeekingObs}}$]: r = 0.72, P < 0.001; see Materials and Methods for model and parameter recovery details).
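The logic of parameter recovery can be illustrated with a simplified sketch: choices are simulated from known ω values, ω is re-estimated from the simulated choices, and the generating and recovered values are correlated. The version below is deliberately reduced and is not the estimation procedure used in the study: it recovers only ω by grid-search maximum likelihood while holding ρ and λ at their generating values, and the gamble set, parameter values, and function names are all hypothetical.

```python
import math
import random

def p_risky(v_risky, v_safe, rho, lam):
    """Softmax probability of taking a 50/50 gamble (v_risky or 0) over a sure v_safe."""
    du = 0.5 * v_risky ** rho - v_safe ** rho
    return 1.0 / (1.0 + math.exp(-lam * du))

def mix(p_own, p_partner, omega):
    """Social reliance mixture of own and partner choice probabilities."""
    return (1.0 - omega) * p_own + omega * p_partner

def simulate(trials, omega, rng):
    """Draw one binary choice (1 = risky) per trial from the mixture probability."""
    return [1 if rng.random() < mix(po, pp, omega) else 0 for po, pp in trials]

def fit_omega(trials, choices, grid_size=101):
    """Grid-search maximum-likelihood estimate of omega on [0, 1]."""
    best_omega, best_ll = 0.0, -math.inf
    for k in range(grid_size):
        omega = k / (grid_size - 1)
        ll = 0.0
        for (po, pp), c in zip(trials, choices):
            # Clip to avoid log(0) when a probability saturates at 0 or 1.
            p = min(max(mix(po, pp, omega), 1e-9), 1.0 - 1e-9)
            ll += math.log(p if c == 1 else 1.0 - p)
        if ll > best_ll:
            best_omega, best_ll = omega, ll
    return best_omega

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)

# Hypothetical gamble set: 12 risky/safe pairs, each repeated 10 times.
gambles = [(v, s) for v in (30, 40, 50, 60) for s in (15, 20, 25)] * 10

# Per-trial choice probabilities for a risk-averse decision-maker (rho = 0.8)
# and a partner believed to be risk-seeking (rho = 1.2); values are assumed.
trials = [(p_risky(v, s, 0.8, 0.5), p_risky(v, s, 1.2, 0.5)) for v, s in gambles]

# 43 simulated subjects, matching the number of simulated subjects in the text.
true_omegas = [rng.random() for _ in range(43)]
recovered = [fit_omega(trials, simulate(trials, w, rng)) for w in true_omegas]

r = pearson_r(true_omegas, recovered)
print(f"recovery correlation r = {r:.2f}")
```

Because ρ and λ are treated as known here, this sketch will generally recover ω more cleanly than a full fit in which all parameters are estimated jointly from the same choices.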

The estimated Social reliance parameter (ω) showed a pattern consistent with model-agnostic measures that capture the extent to which individuals were influenced under partners’ observation (Fig. 2d, e). Specifically, individuals who relied the most on partners’ simulated choice tendency (i.e., largest ω) showed the largest shift toward making safer choices under the risk-averse partner’s observation compared to the choices they made under no observation (r = -0.57, P = 7.36e-05), and showed the largest shift toward riskier choices under the risk-seeking partner’s observation (r = 0.34, P = 0.024). Note that the rank order of individuals’ risk preference parameters (ρ) was highly consistent between contexts with and without social observation (r = 0.73, P = 2.14e-08; Fig. S3, S4a). However, individual differences in risk preference were more pronounced under partners’ observation (χ2 = 18.90, P = 1.45e-05; Fig. S4b), such that the risk preference of individuals who made riskier (or safer) choices alone shifted toward stronger risk-seeking (risk-aversion) under social observation (r = 0.33, P = 0.031; Fig. S4c). These results suggest that there are two independent impacts of social observation on risky decision-making: individuals’ own risk preferences become more pronounced, independent of the identity of the observing partner, and, furthermore, individuals’ choices are shaped by their beliefs about the currently observing partner’s preferences. Corroborating these model-based results, individuals’ self-reports about their impressions of the partners (e.g., similarity, trustworthiness), collected after the Learning phase (Table S9), were consistent with these parameterized impacts of social observation (Fig. S4, S5).

Neural responses support social reliance and the effect of social observation on shaping individuals’ final choices

To investigate the neural instantiation of decision processes under social observation, we tested two neural hypotheses constructed based on our suggested computational model. First, if individuals followed our model, we would expect to find neural representations that reflect individuals’ final choices. To set regions-of-interest (ROIs) for analyses in the Observed phase, we first analyzed individuals’ blood-oxygen-level dependent (BOLD) responses at the time at which they viewed choice options during the Solo phase, and searched for brain regions that track trial-by-trial decision probabilities (i.e., model-estimated Prob(chosen)) (see DM0 in Materials and Methods). We found that the ventromedial prefrontal cortex (vmPFC; x = –3, y = 62, z = –13, kE = 165, cluster-level PFWE, SVC = 0.009), ventral striatum (vStr; x = 3, y = 14, z = -10, kE = 40, cluster-level PFWE, SVC = 0.015), and dorsal anterior cingulate cortex (dACC; x = 12, y = 32, z = 29, kE = 386, cluster-level PFWE, SVC = 0.005), brain regions known to be involved in valuation and decision-making (Boorman et al., 2013; Christopoulos et al., 2009; Croxson et al., 2009; Kable & Glimcher, 2007; Knutson & Bossaerts, 2007; Rangel & Hare, 2010; Rudebeck et al., 2008; Rushworth et al., 2011; Wunderlich et al., 2009), tracked final decision probabilities in the Solo phase (Fig. 3a, Table S2). These results were largely the same when the trial-by-trial utility differences were used as a parametric modulator instead (Fig. S6, Table S3). In the subsequent ROI analyses of the Observed phase, consistent with the results in the Solo phase, BOLD responses from the ROIs significantly tracked individuals’ final decision probabilities (vmPFC: t(29) = 1.77, P = 0.044; vStr: t(29) = 3.21, P = 0.0016; dACC: t(29) = –4.07, P = 0.00017; Fig. 3b).

mPFC and TPJ are recruited for valuation under social observation in addition to the regions tracking non-social subjective value. (a) When viewing gamble options during the Solo phase, trial-by-trial probability of the chosen option was positively encoded in the vmPFC (x = –3, y = 62, z = –13, kE = 165, cluster-level PFWE, SVC = 0.009) and vStr (x = 3, y = 14, z = -10, kE = 40, cluster-level PFWE, SVC = 0.015), and negatively encoded in the dACC (x = 12, y = 32, z = 29, kE = 386, cluster-level PFWE, SVC = 0.005). These brain regions were set as regions-of-interest (ROI) for the decision-making signals in the Observed phase where gambling choices were identical besides the social context. (b) To examine whether the same decision-tracking regions were recruited in the Observed phase, trial-by-trial probability of the chosen option was calculated based on our suggested Social reliance model. As expected, the same type of decision probability information comprising the social and non-social components was tracked in the ROIs during the Observed phase. Each dot represents an individual and error bars indicate s.e.m.; *P < 0.05; **P < 0.01; ***P < 0.001. (c) Whole brain analysis revealed that trial-by-trial probability of the chosen option was positively encoded in the bilateral TPJ when individuals were viewing gamble options during the Observed phase (left TPJ: x = –54, y = –37, z = 14, kE = 104, Punc. < 0.001; right TPJ: x = 63, y = –40, z = 17, kE = 191, cluster-level PFWE, SVC = 0.019). (d) An additional whole-brain analysis revealed that the mPFC responded to the initial social cue (x = –3, y = 50, z = 14, kE = 22, Punc. < 0.005).

Second, we expected that under social observation, brain regions distinct from those involved in the Solo phase would be additionally recruited. A subsequent whole-brain analysis (see DM1 in Materials and Methods) revealed additional brain regions that tracked final decision probabilities during the Observed phase (Table S4). In particular, the bilateral TPJ, a region not implicated in the Solo phase, positively tracked trial-by-trial model-estimated decision probabilities (left TPJ: x = –54, y = –37, z = 14, kE = 64, Punc. < 0.001; right TPJ: x = 63, y = –40, z = 17, kE = 191, cluster-level PFWE, SVC = 0.019; Fig. 3c). The involvement of the TPJ in social contexts is consistent with previous studies that showed its role in social cognitive functions, including mentally simulating others’ motives (Behrens et al., 2008; Boorman et al., 2013; Charpentier et al., 2020; Park et al., 2021).

At the beginning of each trial in the Observed phase, individuals were first cued with the presence (or absence) of an observing partner (Fig. 1g, S1c), which we believe would trigger subsequent social processing. A subsection of the mPFC showed higher responses on Observer compared to No observer trials, indicating that the region is involved in social processing when necessary (x = –3, y = 50, z = 14, kE = 22, Punc. < 0.005; Fig. 3d, Table S5; see DM3 in Materials and Methods). We tested whether the mPFC was also involved significantly more during the decision process under social observation, particularly in individuals who relied more on the simulated behaviors of others. To examine this hypothesis, we conducted a psychophysiological interaction (PPI) analysis where the mPFC identified above (Fig. 3d, Table S5) was set as a seed, [Observer trials – No observer trials] was set as a psychological factor, and the model-estimated Social reliance parameter was entered as a covariate (see Materials and Methods for PPI design). We confirmed that the functional connectivity between the mPFC, which is sensitive to cues regarding the presence of an observing partner, and its adjacent mPFC region (x = 3, y = 50, z = 5, kE = 74, cluster-level PFWE, SVC = 0.011; Fig. 4a, b, Table S5), overlapping with the mPFC identified above, was positively associated with individuals’ social reliance.

TPJ-mPFC connectivity is associated with individuals’ social reliance.

To examine whether the mPFC and the TPJ interacted with each other while individuals made choices under social observation, we conducted psychophysiological interaction (PPI) analyses. (a,b) The functional connectivity between the mPFC from Fig. 3d and its adjacent mPFC region was positively associated with log-transformed Social reliance (peak at [x = 3, y = 50, z = 5], kE = 74, cluster-level PFWE, SVC = 0.011). Clusters are displayed in yellow (Punc. < 0.005) and red (Punc. < 0.001). (c) In an additional PPI analysis, the connectivity between the TPJ from Fig. 3c and the mPFC from Fig. 4a (a region sensitive to the initial social cue) was also positively associated with log-transformed Social reliance (r = 0.43, P = 0.018). (d) The positive relationship between individuals’ social reliance and TPJ-mPFC connectivity was mediated by the mPFC-mPFC connectivity (a: β = 0.15, P = 0.0013, b: β = 2.12, P = 1.42e-06, a × b: β = 0.32, P = 0.0016). Black and gray arrows indicate significant and non-significant associations between the components, respectively. The red arrow indicates a significant mediation effect; *P < 0.05; **P < 0.01; ***P < 0.001.

Per our Social reliance model, the simulated behavior of others is integrated into the final decision probability. We found that the final decision probability was represented in the TPJ specifically under social observation. If the mPFC region from the above PPI result (Fig. 4a, Table S5) plays a role in simulating the other’s behavior, it should interact with the TPJ to integrate the social information into the final decision. To examine this hypothesis, we conducted another PPI analysis where the TPJ was set as the seed region and the target region was the mPFC cluster identified in Fig. 4a (see Materials and Methods for details). Indeed, the TPJ-mPFC connectivity was positively associated with individuals’ social reliance (r = 0.43, P = 0.018; Fig. 4c). That is, particularly for individuals who relied strongly on the observers’ preferences, TPJ-mPFC connectivity was significantly stronger under others’ observation than when making choices alone. Consistent with previous findings on the TPJ (Behrens et al., 2008; Boorman et al., 2013; Charpentier et al., 2020; Park et al., 2021; Young et al., 2010), our model-based analyses indicate that the TPJ plays a key role in inferring others’ behavior and using that information to make individuals’ own decisions.

Supporting the role of the TPJ-mPFC connectivity in social information processing, the association between individuals’ social reliance and the connectivity was mediated by the extent to which individuals responded to the initial social cue (mPFC-mPFC connectivity, mediation effect: β = 0.32, P = 0.0016; see Materials and Methods for the mediation analysis; Fig. 4d). This links our model-based and neuroimaging results. Our data suggest that individuals who were predisposed to rely on beliefs about others’ choices were more sensitive to the social cue, and that individuals who exhibited higher mPFC sensitivity to the social cue showed stronger TPJ-mPFC connectivity in processing social information. Together, our results suggest that the mPFC has a critical role in relaying both social and non-social valuation information for final decision-making processes.
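The reported path estimates are numerically consistent with a product-of-coefficients reading of the mediation effect (0.15 × 2.12 ≈ 0.32). The sketch below illustrates that decomposition with ordinary least squares on synthetic data. The variable roles (X = social reliance, M = mPFC-mPFC connectivity, Y = TPJ-mPFC connectivity), the generating coefficients, the noise levels, and the function names are assumptions for illustration; the data are simulated, and the significance test used in the study (see Materials and Methods) is not reproduced here.

```python
import random

def centered_cross(u, v):
    """Sum of cross-products of two sequences after mean-centering."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def mediation_paths(x, m, y):
    """OLS path estimates for a single-mediator model X -> M -> Y.

    a:       slope of M regressed on X
    b:       slope of M when Y is regressed on both M and X
    direct:  slope of X when Y is regressed on both M and X (c')
    total:   slope of Y regressed on X alone (c)
    """
    s_xx = centered_cross(x, x)
    s_mm = centered_cross(m, m)
    s_xm = centered_cross(x, m)
    s_xy = centered_cross(x, y)
    s_my = centered_cross(m, y)
    det = s_mm * s_xx - s_xm ** 2
    a = s_xm / s_xx
    b = (s_my * s_xx - s_xy * s_xm) / det
    direct = (s_xy * s_mm - s_my * s_xm) / det
    total = s_xy / s_xx
    return {"a": a, "b": b, "indirect": a * b, "direct": direct, "total": total}

# Synthetic data only: generating slopes echo the reported path estimates,
# but the noise levels and sample are arbitrary assumptions.
rng = random.Random(1)
n = 30
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
m = [0.15 * xi + rng.gauss(0.0, 0.1) for xi in x]
y = [2.12 * mi + rng.gauss(0.0, 0.2) for mi in m]

paths = mediation_paths(x, m, y)
for name, value in paths.items():
    print(f"{name:>8}: {value:.3f}")
```

For OLS estimates of this model, the total effect equals the direct effect plus the indirect effect (c = c' + a × b), which offers a quick internal check on any such decomposition.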

Discussion

Being around social others is known to affect how individuals behave (Abrams & Hogg, 1990; Cikara et al., 2011; Guassi Moreira et al., 2018). Here, we developed a novel experiment where individuals made risky choices under the observation of social others after learning about their risk preferences. Individuals did not always act riskier under social observation, but rather tended to adjust their choices toward the direction they believed the observing partner would prefer. By using a computational model-based approach, we showed that the choice tendencies simulated based on the belief about the observer’s preference contribute to individuals’ decision-making processes. Our results provide a mechanistic explanation for how the presence of a social observer influences individuals’ decision-making processes.

Observational learning and mimicry of others’ behavior are patterns that can be found in social animals, such as nonhuman primates (Van de Waal et al., 2013). Such behavioral patterns are interpreted to be driven by a motivation to acquire more information (informative conformity) or, alternatively, by a motivation to follow the group norm even if doing so would not lead to better choices (e.g., higher accuracy) (Cialdini & Goldstein, 2004). In both cases, social influence is expected to be bidirectional, such that one may become more or less averse to the option one originally preferred (Suzuki et al., 2016). In contrast to this possibility of bidirectional social influence, the presence of social others has been known to have a unidirectional impact on individuals’ choices, such that individuals, particularly adolescents, tend to make riskier choices in the presence or under the observation of peers (Powers et al., 2018). The current study suggests that this seemingly contradictory finding may be accounted for by the beliefs individuals hold about social others. Before learning about others, individuals may perceive themselves as decision-makers who make stable and safe choices, and thus might attempt to make riskier choices when others are watching. However, after explicitly learning about others’ preferences, their initially misplaced beliefs about others would be adjusted, and the adjusted beliefs can then be incorporated into decision-making processes under social observation. Our computational framework provides an explanation for why and how social influence on individuals’ choices sometimes manifests as a unidirectional shift of their preferences, while at other times it results in a bidirectional shift. Future studies may directly examine the beliefs of adolescents about their peers and test whether such biased beliefs (e.g., “All the cool kids would not be afraid of jumping off the cliff.”) indeed drive their riskier behaviors around peers (Gardner & Steinberg, 2005; Haddad et al., 2014).

By using a computational modeling approach, we were able to delineate the impact of social influence across various cognitive processes embedded in decision-making (e.g., valuation, action selection). Previously, it was suggested that observing others’ explicitly presented choices adds value to the option others chose during the process of valuation (Chung et al., 2015). Under social others’ observation, no explicit choice information was provided, and thus we expected that the same mechanism would not explain the influence of social observation. Our data suggest that, as expected, being under social observation does not alter individuals’ valuation of the choice options, but instead alters their decision-making policies to mimic the behavior of others. This result may seem to contradict Suzuki et al. (2016), a previous study that examined the impact of another type of implicit social information. In that study, the authors suggested that repeated opportunities to predict others’ choices sway individuals’ risk preferences toward the learned preferences of others, a mechanism that we rejected for the current study. We note that such a mechanistic discrepancy is likely due to the differences in task design between that study and ours. Specifically, our study provided a comparable setting for learning about others (the Learning phase herein), but then implemented an additional layer of social influence (i.e., social observation), which was the main social factor we investigated. Our data suggest that individuals who have previously acquired knowledge about the social observer tend to rely on simulated choices which would be made by the observer, leading to a fine shaping of their decision policy.

Our modeling results were corroborated by the vmPFC response that tracked individuals’ final decision probabilities under social observation. Various decision-making studies suggested that the vmPFC encodes a subjective value signal (Clithero & Rangel, 2014; Levy & Glimcher, 2012), which also encompasses the value of social information (Chung et al., 2015) and the expected value across temporal horizons (Iigaya et al., 2020; Na et al., 2021). Our result expands this view and suggests that the vmPFC tracks individuals’ decision policies (i.e., final decision probabilities), specifically when their choices are adjusted for the extent to which they rely on what social others would do. Typically, individuals’ decisions are tightly linked to their subjective valuation, making it difficult to dissociate the representation pattern of one from the other. In the current study, as shown in our Social reliance model, the decision-making process under others’ observation clearly departs from combining information at the value level. By separately testing these two possibilities, our neural results provide evidence for an alternative view, suggesting that the vmPFC sometimes takes part in tracking action policies rather than tracking economic values per se (Hayden & Niv, 2021). Note that this does not rule out the broadly accepted role of the vmPFC as the subjective value encoder, but instead opens the possibility that the brain region may have a more generalizable capacity. Depending on the context, individuals may make (subjective) value-based choices or follow alternative heuristics (e.g., relying on beliefs about others’ choices). Independent of such specific cognitive processes, the vmPFC may combine information from multiple sources and track individuals’ behavioral intentions.

Inference about social others is known to recruit a set of brain regions including TPJ and mPFC (Amodio & Frith, 2006; Behrens et al., 2008; Boorman et al., 2013; Charpentier et al., 2020; Mitchell et al., 2006; Park et al., 2021; Ruff & Fehr, 2014; Samson et al., 2004; Saxe & Kanwisher, 2003; Saxe & Kanwisher, 2013; Sul et al., 2015; Van Overwalle, 2009; Wittmann et al., 2016; Young et al., 2010). Among the so-called ‘Social brain’ regions, the TPJ is known to take part in theory-of-mind (TOM), the ability to understand others’ intentions and to attribute others’ mental states to oneself (Saxe & Kanwisher, 2013; Schurz et al., 2017). Consistent with the known functional roles of the brain region, particularly when individuals believed others were observing their choices, the TPJ tracked individuals’ decision policies, indicating that the region plays a key role in simulating others’ choices and using that information to make their own decisions. Furthermore, in line with the known roles of the mPFC in the valuation process when making choices for others (Sul et al., 2015) and in processing information obtained from social others (Chung et al., 2020), our results show that the functional communication between the TPJ and mPFC is crucial for this social simulation.

The current study provides a mechanistic account for how the presence and observation of others influence individuals’ risky decision-making. Our Social reliance model provides a way in which individuals adjust their choices around others and may explain why adolescents behave well with their parents but are more likely to act out when with their peers (Van Hoorn et al., 2018). When others’ choices are not explicitly provided but rather presented as a context that individuals must infer, their tendency to rely on social information contributes to changes in their actions rather than changes in valuation. In the modern world, as much as we have means to obtain information about others, almost every choice we make is seen by others. Under this inevitable social environment, taking perspectives of others is a largely adaptive and essential ability to empathize with others’ pain (De Vignemont & Singer, 2006; Decety & Lamm, 2006; Lamm et al., 2007), make prosocial choices (Buckholtz et al., 2008; Nook et al., 2016), and follow social norms (Cialdini & Goldstein, 2004; Rilling & Sanfey, 2011). Our data shed light on the flip side, elucidating why and how incorrect beliefs about others and excessive or biased sensitivity to social context may lead to maladaptive behaviors, such as a higher tendency to commit crimes (Buck et al., 1989) and the formation of extremely polarized opinions (Geschke et al., 2019; Lefebvre et al., 2024).

Materials and Methods

Participants

Fifty-three individuals participated in the current study, 38 of whom were scanned while they were making decisions. All participants provided written informed consent. This study was approved by the Institutional Review Boards of the Ulsan National Institute of Science and Technology (UNISTIRB-19-45-A). One participant was excluded due to excessive movement (>3 mm in the x, y, or z direction or >2-degree rotation around the x, y, or z axis) and two participants were excluded due to scanner artefacts. Seven participants (five from the scanned and two from the behavior-only participants) were additionally excluded due to their failure to learn and distinguish the two partners’ risk preferences, as indicated by their post-task questionnaires (the self-reported riskiness difference between the two partners was smaller than [median – 3 × (median absolute deviation)], i.e., they judged the risk-averse partner as riskier than the risk-seeking partner). After exclusion, data from 43 participants (male/female = 25/18, age = 21.35 ± 2.42) were used for behavioral analyses, and a subsample of the data (N = 30; male/female = 16/14, age = 21.77 ± 2.16; Table S1) was used for neuroimaging analyses.

Experimental procedures

We developed a novel three-phase gambling task (Fig. 1, S1). Throughout the task, participants were asked to make a series of choices between one certain and one risky gamble option. Each option was presented as a pie chart. The certain option was presented as a whole pie, indicating a 100% probability of earning the payoff written on top of the pie. The risky option was divided into two pie pieces, with the size of each piece indicating the probability of earning the high or low payoff written on top of that piece. First, 30 sets of gambles were generated (Table S7). The gamble set was created with nine unique risky options, combining high-payoffs of 25, 48, and 90 with probabilities of 0.25, 0.5, and 0.75, as in a previous study (Chung et al., 2017). The low-payoff was always zero. Certain payoffs were set below, close to, and above the expected value of each risky option. In addition, one set of gambles was added with the largest expected value difference (EV difference = 38.5). Two sets of gambles were added as ‘catch trials’ (to ensure participants’ attention) where the payoff of the certain option was larger than the high-payoff of the risky option. Second, 30 sets of gambles were additionally generated in the same manner (Table S8). Nine unique risky options were created by pairing three levels of high-payoffs (15, 34, and 90) and three probabilities of high-payoff (0.25, 0.5, and 0.75). Certain payoffs were set as described above, but they were tailored so that the distribution of the expected value differences between the certain and risky options matched that of the first 30 sets of gambles. The first 30 sets of gambles were used for the first phase (‘Solo’) and the third phase (‘Observed’) of the task, and the second 30 sets of gambles were used for the second phase (‘Learning’).
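
The composition of the first gamble set can be illustrated with a minimal sketch. It only enumerates the nine risky options named above and their expected values; the certain payoffs themselves are listed in Table S7 and are not reproduced here, and the function and variable names are hypothetical.

```python
from itertools import product

# Values taken from the text for the Solo/Observed gamble set
HIGH_PAYOFFS = (25, 48, 90)
PROBABILITIES = (0.25, 0.5, 0.75)

def risky_options(high_payoffs, probabilities):
    """Return (high payoff, probability, expected value) for every combination.

    The low payoff is always zero, so the expected value is payoff * probability.
    """
    return [(v, p, v * p) for v, p in product(high_payoffs, probabilities)]

options = risky_options(HIGH_PAYOFFS, PROBABILITIES)
for payoff, prob, ev in options:
    print(f"high payoff = {payoff:>2}, p = {prob:.2f}, EV = {ev:.2f}")
```

With three certain payoffs per risky option (below, close to, and above the expected value), the nine options give 27 gambles; the extra large-EV-difference gamble and the two catch trials bring the total to 30.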

In the Solo phase, participants completed 30 gamble choices. These choices were used to measure individuals’ own risk preferences without the influence of social context. In the Learning phase, participants were randomly matched with two partners and asked to predict their choices. The partners were introduced as individuals who previously participated in the gambling task. Unbeknownst to the participants, one of the two partners was set to be risk-averse (risk preference = 0.43) and the other to be risk-seeking (risk preference = 1.12; see Partners’ choices and Fig. S3 for details). There were 40 prediction trials (20 trials for each partner) that provided feedback, followed by 20 additional prediction trials (10 trials for each partner) without feedback. The order of the partners to be predicted on each trial was randomized. On the last 20 prediction trials, individuals were asked to report their confidence level on a 10-point Likert scale (1: not confident at all, 10: very confident) after making their choices. At the end of the Learning phase, participants answered eight questions about their impressions of each predicted partner. The questions included likability, trustworthiness, riskiness, attractiveness, competence (metacognition, general academic grades), consistency, and preference similarity (see Table S9 for the full list of questions). One additional question was included as an attention check (“Answer 5”). The order of these questions was randomized.

In the Observed phase, participants played another gambling task. Most importantly, individuals were informed that their gamble choices on some trials would be observed by the partners they were matched with in the Learning phase (‘Observer trial’). Specifically, we instructed participants that the partners would revisit the lab and predict the corresponding gamble choices of the participants during their Learning phase, and that the partners would answer the eight impression questions about the participants (Table S9). To indicate which partner would observe the choice, the identity of the partner was presented in the center of the screen at the start of every Observer trial (Fig. 1g). On the other trials, participants were informed that their choices would not be observed by anyone (‘No observer trial’), which was indicated with an empty open door presented at the beginning of the trials. No observer and Observer trials were intermixed, and the Observer trials comprised trials with the risk-averse partner and the risk-seeking partner. In total, each participant had 90 trials (30 unique gamble sets × 3 trial types [No observer; Risk-averse observer; Risk-seeking observer]).

At the end of the study, participants were paid the base rate (20,000 Korean won per hour) and the bonus (2,300-7,010 Korean won), determined by the outcome of a random single gamble drawn from the Solo phase, the outcome of another gamble from the Observed phase, and the performance during the Learning phase.

Partners’ choices

Two partners’ choices over the 30 sets of the Learning phase gambles were generated and provided as feedback during the phase. We first analyzed behavioral choices of 74 non-overlapping participants on the Solo phase gamble sets. Assuming that individuals’ gamble choices are accounted for by expected utility theory (Bernoulli, 1954), we used a standard power utility function ($U = \sum_i P_i V_i^{\rho}$) and softmax choice rule ($P(\mathrm{Risky}) = [1 + \exp(-\mu (U_{\mathrm{Risky}} - U_{\mathrm{Certain}}))]^{-1}$), and estimated their risk preferences (ρ) and value sensitivities (μ). To simulate one partner who is risk-averse and one partner who is risk-seeking, we selected the risk-preference parameters that are [mean ± 1.5 × standard deviation of the risk preference distribution]: risk-averse ρ = 0.43, risk-seeking ρ = 1.12 (Fig. S3). The two partners’ choice behaviors on the gamble sets in the Learning phase were simulated by using a standard power utility function, value sensitivity μ = 5, and each partner’s risk preference.
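
The simulation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the original simulation code: the three example gambles, the function names, and the random seed are hypothetical, while ρ = 0.43, ρ = 1.12, and μ = 5 come from the text.

```python
import math
import random

def utility(payoff, prob, rho):
    """Power utility of winning `payoff` with probability `prob` (zero otherwise)."""
    return prob * payoff ** rho

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def prob_risky(high_payoff, prob, certain, rho, mu):
    """Softmax probability of choosing the risky over the certain option."""
    u_risky = utility(high_payoff, prob, rho)
    u_certain = utility(certain, 1.0, rho)
    return sigmoid(mu * (u_risky - u_certain))

def simulate_partner(gambles, rho, mu=5.0, seed=0):
    """Draw one choice per gamble: 1 = risky option, 0 = certain option."""
    rng = random.Random(seed)
    return [int(rng.random() < prob_risky(v, p, c, rho, mu)) for v, p, c in gambles]

# Hypothetical gambles: (high payoff, probability of high payoff, certain payoff)
gambles = [(48, 0.5, 24), (90, 0.25, 20), (25, 0.75, 19)]
risk_averse_choices = simulate_partner(gambles, rho=0.43)
risk_seeking_choices = simulate_partner(gambles, rho=1.12)
```

For a gamble whose certain payoff equals the expected value of the risky option, a risk-neutral agent (ρ = 1) is indifferent, whereas ρ = 0.43 strongly favors the certain option and ρ = 1.12 strongly favors the risky one.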

Behavioral analyses

To compare individuals’ own choice tendencies with their initial belief about partners, we simulated the choices that participants would have made for the same set of gambles, and used chi-square tests to compare these choices against the initial predictions. To simulate individuals’ choices, we used their estimated risk preferences from the Solo phase. To track the changes in individuals’ beliefs about the partners’ risk preferences, we binned their predictions about each partner (window size = 6 trials, overlap = 3 trials) and calculated the proportion of risky choices for each bin. In addition, the proportion of risky choices in individuals’ predictions during the last 10 trials in the Learning phase was calculated for each partner and compared to examine whether they successfully learned and distinguished the risk preferences of the two partners. For model-agnostic behavioral analyses, we used repeated measures analysis of variance (ANOVA) and compared proportion of risky choices among Observer trials with the risk-averse partner, No observer trials, and Observer trials with the risk-seeking partner. Paired t-tests were used for post-hoc analyses between each pair of trial types. After conducting model-based analyses (see Computational modeling below), we examined their consistency with model-agnostic measure by calculating Pearson’s correlation coefficient between individuals’ model-based Social reliance parameters and the impact of observation on the proportion of risky choices (i.e., the differences between Observer and No observer trials). MATLAB R2019a (MathWorks) was used for all statistical tests of behavior. All statistical tests were two-tailed with an alpha level of 0.05 unless noted otherwise.
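
The binning of predictions (window size = 6 trials, overlap = 3 trials) can be sketched as below. The sketch assumes that windows advance by window minus overlap trials and that a trailing partial window is dropped; how incomplete windows were handled in the original analysis is not stated, and the function name is hypothetical.

```python
def binned_risky_proportion(predictions, window=6, overlap=3):
    """Proportion of risky predictions (coded 1) in overlapping windows.

    Windows start every (window - overlap) trials; a trailing partial
    window is dropped (an assumption, see the lead-in).
    """
    step = window - overlap
    last_start = len(predictions) - window
    return [sum(predictions[s:s + window]) / window
            for s in range(0, last_start + 1, step)]

# Hypothetical prediction sequence: six certain predictions, then six risky ones
example = [0] * 6 + [1] * 6
print(binned_risky_proportion(example))
```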

Computational modeling

For model-based analyses, choices in the Solo phase were used to estimate individuals’ risk preferences, and choices in the Observed phase were used to examine the mechanisms by which individuals were affected by social observation. Individuals’ predictions in the Learning phase were used to estimate their learned beliefs about each partner.

Solo phase

As noted above and per expected utility theory (Bernoulli, 1954), we used a standard power utility function ($U = \sum_i P_i V_i^{\rho} = P_{\text{high-payoff}} \times V^{\rho}$) and softmax choice rule ($P(\mathrm{Risky}) = [1 + \exp(-\mu (U_{\mathrm{Risky}} - U_{\mathrm{Certain}}))]^{-1}$) to estimate individuals’ risk preference ρ and value sensitivity μ. Note that 0 < ρ < 1 captures risk-averse choices, ρ = 1 captures risk-neutrality, and ρ > 1 captures risk-seeking choices.

Learning phase

Individuals could use the feedback provided during the first 20 trials in the Learning phase to learn about each partner’s risk preference. We used individuals’ prediction choices for each partner during the last 10 trials in the Learning phase to estimate their beliefs about the partners’ risk preferences. As in the Solo phase analysis, we used a standard power utility function and softmax choice rule to explain individuals’ risky choices. Estimation of the beliefs about the partners’ risk preferences (ρpartner, risk-averse, ρpartner, risk-seeking) was conducted with a custom MATLAB script using maximum likelihood estimation (MLE) at the individual subject level (maximum of 25,000 function evaluations and iterations).
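
As an illustration of individual-level maximum likelihood estimation, the sketch below fits ρ and μ to a set of predictions. It is not the custom MATLAB script: it swaps the numerical optimizer for a coarse grid search, and the grid bounds, the example gambles (certain payoff set equal to the expected value), and all names are assumptions made for the example.

```python
import math

def log1pexp(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def neg_log_likelihood(rho, mu, trials):
    """trials: (high payoff, probability, certain payoff, predicted_risky)."""
    nll = 0.0
    for payoff, prob, certain, predicted_risky in trials:
        x = mu * (prob * payoff ** rho - certain ** rho)
        # -log P(risky) = log(1 + exp(-x)); -log P(certain) = log(1 + exp(x))
        nll += log1pexp(-x if predicted_risky else x)
    return nll

def fit_belief(trials, rho_grid, mu_grid):
    """Return the (rho, mu) grid point with the lowest negative log-likelihood."""
    return min(((r, m) for r in rho_grid for m in mu_grid),
               key=lambda pair: neg_log_likelihood(pair[0], pair[1], trials))

rho_grid = [0.05 * k for k in range(1, 41)]  # 0.05 to 2.00 (assumed bounds)
mu_grid = [0.5 * k for k in range(1, 21)]    # 0.5 to 10.0 (assumed bounds)

# Hypothetical predictions on gambles whose certain payoff equals the risky EV
gambles = [(v, p, v * p) for v in (15, 34, 90) for p in (0.25, 0.5, 0.75)]
averse_trials = [(v, p, c, 0) for v, p, c in gambles]   # always predict "certain"
seeking_trials = [(v, p, c, 1) for v, p, c in gambles]  # always predict "risky"

rho_averse, _ = fit_belief(averse_trials, rho_grid, mu_grid)
rho_seeking, _ = fit_belief(seeking_trials, rho_grid, mu_grid)
```

Because a certain payoff equal to the expected value is preferred whenever ρ < 1 and rejected whenever ρ > 1, the two hypothetical prediction patterns yield belief estimates below and above risk-neutrality, respectively.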

Observed phase

We hypothesized that under social observation, individuals may i) implement the belief about the partners’ preferences in making choices, ii) change their preferences to match that of the partners, or iii) alter their valuation depending on the identity of the currently observing partner. For a formal model comparison, we computed the integrated Bayesian Information Criteria (iBIC) to examine whether either of the alternative models explains individuals’ behavior equally well or better than the Social reliance model (a smaller iBIC score indicates a better model fit) (Fig. S2a).

Social reliance model

Our hypothesis was that individuals simulate the choices of the social observer and take into account this simulated choice tendency when making their own choices. Individuals had the opportunity to learn about the partners’ preferences (Learning phase) and might use these beliefs to simulate the choices that each partner would make. Constructed based on the same power utility function and softmax choice rule, individuals’ choice tendencies were defined in two components:
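
Assuming the same softmax form as in the Solo phase, the two components can be written as follows. This is a reconstruction from the definitions that follow; the subscript notation and the equation numbers are assumptions.

```latex
\begin{align}
P_{\mathrm{own}}(\mathrm{Risky}) &= \left[1 + \exp\!\left(-\mu_{\mathrm{own}}\,\bigl(U_{\mathrm{own,Risky}} - U_{\mathrm{own,Certain}}\bigr)\right)\right]^{-1} \tag{1}\\
P_{\mathrm{partner}}(\mathrm{Risky}) &= \left[1 + \exp\!\left(-\mu_{\mathrm{partner}}\,\bigl(U_{\mathrm{partner,Risky}} - U_{\mathrm{partner,Certain}}\bigr)\right)\right]^{-1} \tag{2}
\end{align}
```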

where Uown and Upartner are subjective values (utilities) calculated with individuals’ own risk preference ρown and the learned risk preference of the currently observing partner (ρpartner, risk-averse or ρpartner, risk-seeking), respectively. μown is individuals’ value sensitivity between the two options, and depending on the identity of the observing partner, μpartner is separated into μrisk-averse and μrisk-seeking. Individuals’ final choices under social observation were defined to be determined based on a weighted mixture between the two decision probabilities:
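
Given that ω weights reliance on the partner’s simulated choice tendency, a mixture consistent with this description is the following (a reconstruction; the label for the final probability and the equation number are assumptions):

```latex
\begin{equation}
P_{\mathrm{final}}(\mathrm{Risky}) = (1 - \omega)\,P_{\mathrm{own}}(\mathrm{Risky}) + \omega\,P_{\mathrm{partner}}(\mathrm{Risky}) \tag{3}
\end{equation}
```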

where ω is the Social reliance parameter (defined between 0 and 1) that represents the extent to which individuals rely on others’ choice tendencies. The Social reliance model included 5 free parameters: ρown, μown, μrisk-averse, μrisk-seeking, ω. Note that beliefs about the partners’ risk preferences were estimated from individuals’ predictions in the Learning phase.
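
A minimal sketch of the decision rule on an Observer trial is given below. It assumes that the mixture places weight ω on the partner-based probability and 1 − ω on the individual’s own probability; the parameter values in the example and the function names are hypothetical.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def utility_difference(high_payoff, prob, certain, rho):
    """U_risky - U_certain under a power utility function (low payoff is zero)."""
    return prob * high_payoff ** rho - certain ** rho

def prob_risky_observed(gamble, rho_own, mu_own, rho_partner, mu_partner, omega):
    """Weighted mixture of own and simulated-partner decision probabilities."""
    high_payoff, prob, certain = gamble
    p_own = sigmoid(mu_own * utility_difference(high_payoff, prob, certain, rho_own))
    p_partner = sigmoid(
        mu_partner * utility_difference(high_payoff, prob, certain, rho_partner))
    return (1.0 - omega) * p_own + omega * p_partner

# Hypothetical gamble and parameters: a risk-neutral participant observed by a
# partner believed to be risk-averse (rho = 0.43)
gamble = (48, 0.5, 24)
for omega in (0.0, 0.5, 1.0):
    p = prob_risky_observed(gamble, rho_own=1.0, mu_own=5.0,
                            rho_partner=0.43, mu_partner=5.0, omega=omega)
    print(f"omega = {omega:.1f}: P(risky) = {p:.3f}")
```

As ω increases from 0 to 1, the probability of a risky choice moves from the participant’s own tendency toward the tendency simulated for the observing partner, without any change in the participant’s own utilities.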

Social risk preference change model

If individuals alter their risk preferences under social observation as reported in a previous study (Suzuki et al., 2016), their choices in the Observed phase would be captured by separate risk preferences, each of which is assigned to a specific type of trial: ρown for No observer trials, ρpartner, risk-averse for Observer trials with the risk-averse partner, and ρpartner, risk-seeking for Observer trials with the risk-seeking partner. Including a common value sensitivity μ for all choice trials, the Social risk preference change model included 4 free parameters: ρown, ρpartner, risk-averse, ρpartner, risk-seeking, μ.

Other-Conferred Utility (OCU) model

Previously, an explicit presentation of social others’ choices added value to the choice option that others chose (Chung et al., 2015; Chung et al., 2020). Although the partners’ choices were not explicitly revealed, the social context of being under observation may also alter individuals’ valuation of options. To capture this possibility, we defined two ‘other-conferred utility (OCU)’ terms that represent additional values added to either the certain or the risky option, depending on the observer’s identity. Note that individuals successfully learned to distinguish the two partners’ risk preferences, and thus might use the identity information in valuation. Specifically, we assumed that under the risk-averse partner’s observation, individuals would add value to the certain option (Eq.4), whereas under the risk-seeking partner’s observation, they would add value to the risky option (Eq.5).
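
One form consistent with this description, assuming the OCU terms enter additively into the utilities that feed the softmax rule, is shown below. This is a reconstruction rather than the equations as printed; the subscript notation is an assumption.

```latex
\begin{align}
P(\mathrm{Risky} \mid \text{risk-averse observer}) &= \left[1 + \exp\!\left(-\mu\,\bigl(U_{\mathrm{Risky}} - (U_{\mathrm{Certain}} + \mathrm{OCU}_{\text{risk-averse}})\bigr)\right)\right]^{-1} \tag{4}\\
P(\mathrm{Risky} \mid \text{risk-seeking observer}) &= \left[1 + \exp\!\left(-\mu\,\bigl((U_{\mathrm{Risky}} + \mathrm{OCU}_{\text{risk-seeking}}) - U_{\mathrm{Certain}}\bigr)\right)\right]^{-1} \tag{5}
\end{align}
```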

The OCU model included 4 free parameters: ρ, OCUrisk-averse, OCUrisk-seeking, μ.

Parameter estimation

Parameters were estimated using a hierarchical Bayesian model estimation (Ahn et al., 2017; Daw, 2011) and Markov chain Monte Carlo (MCMC) sampling with the No-U-Turn variation of the Hamiltonian Monte Carlo technique implemented in Stan and its interface to R (Team, 2020). For all parameters, the group-level distributions were assumed to be Gaussian with free hyperparameters of group-level mean, standard deviation (SD), and a standard normal distribution (Normal(0, 1)) following noncentered parameterization (Team, 2020). For μ, ρ, and ω, we applied an inverse probit transformation and then multiplied a constant (50 for all μs [μ, μown, μrisk-averse, μrisk-seeking], 2 for all ρs [ρ, ρown, ρpartner, risk-averse, ρpartner, risk-seeking], and 1 for ω) to constrain the parameters between 0 and the multiplied constant. OCU parameters were not constrained. We estimated the hyperparameters of the group-level distributions using uninformative priors: means ∼ Normal(0, 10) and SDs ∼ Cauchy(0, 2.5) with lower bound of zero.
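
The bounding transformation can be illustrated with a short sketch. It assumes that the individual-level raw parameter is formed as group mean + group SD × standard-normal deviate (the non-centered form) and is then passed through the standard normal cumulative distribution function and scaled; the names are hypothetical and this is not the Stan code used in the study.

```python
import math

# Upper bounds taken from the text: 50 for all mu, 2 for all rho, 1 for omega
UPPER_BOUNDS = {"mu": 50.0, "rho": 2.0, "omega": 1.0}

def standard_normal_cdf(x):
    """Inverse probit transform: maps the real line onto (0, 1)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def constrain(group_mean, group_sd, z, upper):
    """Non-centered parameterization followed by bounding to (0, upper).

    z is the individual's standard-normal deviate, i.e. z ~ Normal(0, 1).
    """
    raw = group_mean + group_sd * z
    return upper * standard_normal_cdf(raw)

# A raw value of zero maps to the midpoint of each range
for name, upper in UPPER_BOUNDS.items():
    print(name, constrain(0.0, 1.0, 0.0, upper))
```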

Model and parameter recovery

To examine whether our model alternatives are identifiable in the empirical parameter range, we conducted a model recovery analysis (Wilson & Collins, 2019). First, the choices on the same set of gambles in the task were simulated for each model alternative using the parameters estimated from 43 participants’ empirical data. Second, the simulated data were fitted to all model alternatives, and iBIC scores were calculated to identify the best explanatory model. Based on 10 iterations of these procedures, the proportion of the best explanatory model was calculated for each model alternative and depicted as a confusion matrix (Fig. S2b). In addition, we conducted a parameter recovery analysis to assess whether model parameters in the best explanatory model (Social reliance model) were identifiable from each other (Wilson & Collins, 2019). Specifically, we simulated the gamble choices using the parameters (ρown, ω, μown, μrisk-averse, μrisk-seeking) estimated from 43 participants’ empirical data (true parameter), and then re-estimated the model parameters (recovered parameter). To examine identifiability of the parameters, we calculated Pearson’s correlation between the true and recovered parameter pairs (Fig. S2c).
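
The tallying step of the model recovery analysis can be sketched as follows. The sketch takes a list of (simulated model, best-fitting model) pairs, one per iteration, and returns row-normalized proportions; the model labels and the example outcomes are hypothetical, and the simulation and fitting steps themselves are not shown.

```python
from collections import Counter

MODELS = ("social_reliance", "risk_preference_change", "ocu")  # hypothetical labels

def confusion_matrix(results, models=MODELS):
    """Proportion of iterations in which each fitted model won, per simulated model.

    results: iterable of (simulated_model, best_fitting_model) pairs.
    """
    counts = Counter(results)
    matrix = {}
    for simulated in models:
        total = sum(counts[(simulated, fitted)] for fitted in models)
        matrix[simulated] = {
            fitted: (counts[(simulated, fitted)] / total if total else 0.0)
            for fitted in models
        }
    return matrix

# Hypothetical outcome of 10 iterations per simulated model
results = (
    [("social_reliance", "social_reliance")] * 9
    + [("social_reliance", "ocu")] * 1
    + [("risk_preference_change", "risk_preference_change")] * 10
    + [("ocu", "ocu")] * 8
    + [("ocu", "social_reliance")] * 2
)
matrix = confusion_matrix(results)
```

Values close to 1 on the diagonal indicate that data generated by a model are best explained by that same model.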

Neuroimaging acquisition and preprocessing

Functional and structural MRI brain scans were acquired with a Siemens MAGNETOM TRIO 3-T scanner. High-resolution T1-weighted structural images were acquired through a magnetization prepared-rapid gradient echo (MP-RAGE) sequence with the parameters: repetition time (TR) = 2300 ms, echo time (TE) = 2.28 ms, slices = 192, voxel size = 1.0 × 1.0 × 1.0 mm³, flip angle = 8°, field of view (FOV) = 256 mm. Echo planar images were collected during the Solo and Observed phases to measure blood oxygen-level-dependent (BOLD) signal. Scans were angled 30° from the anterior commissure–posterior commissure line. The functional images were acquired with the parameters: repetition time (TR) = 2000 ms, echo time (TE) = 20 ms, slices = 44, voxel size = 3.0 × 3.0 × 3.0 mm³, flip angle = 80°, field of view (FOV) = 192 mm. The functional images were preprocessed using MATLAB and Statistical Parametric Mapping (SPM) 12 (https://www.fil.ion.ucl.ac.uk/spm/). The preprocessing analysis included slice-timing correction, realignment, co-registration, segmentation, spatial normalization to the Montreal Neurological Institute (MNI) template, and smoothing using an 8-mm full width at half maximum (FWHM) Gaussian kernel. A high-pass filter of 1/128 Hz was applied to all scans and autocorrelation of the hemodynamic responses was modeled as a first-order autoregressive process.

General linear model (GLM) analyses

We performed event-related fMRI analyses of the BOLD responses during the Solo and Observed phases. One design matrix (DM) was used for the Solo phase to assess the brain regions that track trial-by-trial decision probabilities (see DM0 below). Four DMs were used for the Observed phase: two for assessing the impacts of social observation on individuals’ decision tendencies and subjective valuation (see DM1, DM2), and two for investigating the functional interactions between brain regions dependent on social observation (see DM3, DM4). For each design matrix, realignment parameters were included to model movement artifacts.

In DM0, all choice trials in the Solo phase were modeled as linear regressors. The task-related regressors were as follows:

  1. Fixation: a crosshair presentation that indicates the initiation of a new choice trial

  2. ViewOptions: revelation of new pair of gambles

  3. ChoiceCue: choice cue presentation informing individuals that a keypress response is enabled

  4. Keypress: all gamble choices during the decision period

Neural responses to ViewOptions were modeled as 6-s events and the responses to other events were modeled as stick functions. The events in each regressor were convolved with the canonical hemodynamic response function, and its temporal and dispersion derivatives. To assess the neural instantiation of final decision tendencies, parametric modulators associated with trial-by-trial decision probabilities for the chosen options (Prob(chosen)) were calculated using each individual’s parameters estimated from the Social reliance model, and were applied to ViewOptions. The parametric modulator was z-score transformed.

To assess neural responses in the Observed phase, DM1 and DM2 were constructed, modeling all choice trials in the Observed phase as linear regressors. Reflecting the task design, the task-related regressors were as follows:

  1. Fixation: a crosshair presentation that indicates the initiation of a new choice trial

  2. ViewPartner: revelation of the identity of the partner who will be observing the current choice trial [No observer, Risk-averse partner, or Risk-seeking partner]

  3. ViewOptions: revelation of new pair of gambles

  4. ChoiceCue: choice cue presentation informing individuals that a keypress response is enabled

  5. Keypress: all gamble choices during the decision period

Neural responses to ViewPartner and ViewOptions were modeled as 3-s and 6-s events, respectively, and the other events were modeled as stick functions. In the same way as DM0, all the events were convolved with the canonical hemodynamic response function and its temporal and dispersion derivatives. In DM1, parametric modulators associated with Prob(chosen) were applied to ViewOptions. In DM2, to examine the neural substrates of subjective valuations from individuals’ own and their partners’ perspectives, two sets of parametric modulators associated with the utility differences between chosen and unchosen options (ΔU) were applied to ViewOptions. One set was calculated using individuals’ own risk preference ρown, and the other set was calculated using the learned risk preference of the observing partner (ρpartner, risk-averse or ρpartner, risk-seeking). Each set of the parametric modulator was z-score transformed.

We constructed two additional design matrices to investigate the brain regions that process the context of social observation. In DM3, to investigate the neural substrates sensitive to social observation, we separated ViewPartner in DM1 into two separate event regressors: one for when the choices were observed by either the risk-averse or the risk-seeking partner (ViewPartner_Observer) and one for when the choices were not observed by any partners (ViewPartner_NoObserver). Other regressors and settings were kept the same as in DM1. To examine whether the functional connectivity between brain regions was associated with the social context, we constructed DM4 to have the same structure as DM1, except that ViewOptions was divided into Observer events (ViewOptions_Observer) and No observer events (ViewOptions_NoObserver). Note that this was to use [Observer trial – No observer trial] as a psychological factor in our subsequent psychophysiological interaction analyses (see Psychophysiological interaction analyses below).

Contrast images were generated for each individual at the first level, and at the second level, one-sample t-tests were conducted to estimate the group average response or the association between individuals’ BOLD responses and their model estimated Social reliance parameters. For the second-level regression, we added individuals’ Social reliance parameter as a covariate to DM4. The family-wise error rate (FWE) with small-volume correction (SVC) was used for multiple comparisons (see ROI analyses).

Region-of-interest (ROI) analyses

Three meta-maps were generated through Neurosynth term-based meta-analyses (Yarkoni et al., 2011), with a false discovery rate corrected at P < 0.01 and a cluster size > 100 voxels. First, for the Solo phase, a meta-map associated with the term ‘decision’ was used to investigate neural substrates of decision processes (vmPFC: [x = -3, y = 38, z = -10]; vStr: [x = 12, y = 11, z = -7]; dACC: [x = 3, y = 26, z = 44]; Fig. 3a, S7). 10 mm radius spheres around these specified coordinates were defined as the ROIs, and used for the small-volume correction. To examine whether the same brain regions involved in the representation of final decision probabilities were recruited under social observation, the significant clusters observed in the Solo phase (Prob(chosen) from DM0 thresholded at P < 0.001) were set as the ROIs for the Observed phase analyses. Specifically, we defined three ROI clusters: vStr (peak voxel at [x = 3, y = 14, z = -10], kE = 9), vmPFC (peak voxel at [x = –3, y = 62, z = –13], kE = 99), and dACC (peak voxel at [x = 12, y = 32, z = 29], kE = 118). Second, to investigate social processing during the Observed phase, a meta-map associated with the term ’social’ was used in setting ROIs (rTPJ: [x = 51, y = -52, z = 14]; lTPJ: [x = -51, y = -58, z = 17]; Fig. 3c, S8a). Third, to set an ROI in investigating neural responses to social cues and context, a meta-map associated with ‘social’, but not with ‘value’, was used (mPFC: [x = 0, y = 50, z = 14]; Fig. 3d, 4a, S8b). Within each meta-map, a sphere of 10 mm radius around the centers of gravity in each regional cluster defined the ROI.
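
The construction of a spherical ROI can be illustrated with a short sketch that lists the voxel centres falling within 10 mm of a coordinate. It assumes an isotropic 3 mm voxel grid aligned with the sphere centre, which is a simplification of how an image grid relates to an MNI coordinate; the function name is hypothetical and this is not the toolbox routine used in the study.

```python
def sphere_voxels(center, radius_mm=10.0, voxel_mm=3.0):
    """Return voxel-centre coordinates (mm) lying within radius_mm of center.

    Assumes an isotropic grid aligned with the sphere centre (see lead-in).
    """
    cx, cy, cz = center
    n = int(radius_mm // voxel_mm)  # maximum number of voxel steps per axis
    voxels = []
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            for k in range(-n, n + 1):
                dx, dy, dz = i * voxel_mm, j * voxel_mm, k * voxel_mm
                if dx * dx + dy * dy + dz * dz <= radius_mm ** 2:
                    voxels.append((cx + dx, cy + dy, cz + dz))
    return voxels

# Centre coordinate of the mPFC ROI reported in the text
mpfc_roi = sphere_voxels((0, 50, 14))
print(len(mpfc_roi), "voxels in the sphere")
```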

Psychophysiological interaction (PPI) analyses

We tested whether the functional connectivity between brain regions was associated with individuals’ social reliance. We conducted two psychophysiological interaction (PPI) analyses during the time at which the options were presented. First, to investigate brain regions that are sensitive to social cues (or context), we used the mPFC from Fig. 3d as a seed and [Observer trial – No observer trial] as a psychological factor (see DM4). Specifically, individuals’ BOLD responses were extracted from a 6 mm sphere centered on each individual’s local maximum nearest to the group peak voxel of the mPFC cluster active at P < 0.005 (x = –3, y = 50, z = 14; Fig. 3d, Table S5), and these time-series were then deconvolved and used as a physiological factor. Second, to examine whether the mPFC and TPJ interacted during individuals’ decision processes under social observation, we used the TPJ from Fig. 3c as a seed and [Observer trial – No observer trial] as a psychological factor (see DM4). Given that the bilateral TPJ represented Prob(chosen) in the Observed phase, the left and the right TPJ were used for two separate PPI analyses, respectively (rTPJ: [x = 63, y = –40, z = 17], lTPJ: [x = –54, y = –37, z = 14]; Fig. 3c, Table S4). The mPFC cluster from the result of the first PPI analysis (x = 3, y = 50, z = 5, kE = 74; Fig. 4a, S9a, Table S6) was set as the target region. For the second-level regression, we added individuals’ Social reliance parameter as a covariate.

Mediation analyses

To examine whether the positive association between TPJ-mPFC connectivity and individuals’ social reliance (log-transformed Social reliance estimates) (Fig. 4c, S9a) was mediated by the mPFC’s sensitivity to social cues (mPFC-mPFC connectivity; Fig. 4a), we conducted mediation analyses using an R package mediation (Tingley et al., 2014). Specifically, each individual’s log-transformed Social reliance parameter was used as a predictor, mPFC-mPFC connectivity was used as a mediator, and TPJ-mPFC connectivity was set as an outcome. Two separate mediation models were tested, with the outcome being the functional connectivity between the mPFC and either the left or the right TPJ. The significance of the effects was assessed using a non-parametric bootstrapping method with 5,000 samples, and an alpha level of 0.05 was applied to determine statistical significance.
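
The logic of a bootstrapped mediation test can be sketched as below. This is a simplified stand-in, not the algorithm of the R `mediation` package: it estimates the indirect effect as the product of two least-squares coefficients and builds a percentile confidence interval by resampling participants with replacement. The synthetic data, the seeds, and all names are assumptions for the example.

```python
import random

def _centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def indirect_effect(x, m, y):
    """Product-of-coefficients estimate a * b for the path x -> m -> y."""
    xc, mc, yc = _centered(x), _centered(m), _centered(y)
    sxx = sum(a * a for a in xc)
    smm = sum(a * a for a in mc)
    sxm = sum(a * b for a, b in zip(xc, mc))
    sxy = sum(a * b for a, b in zip(xc, yc))
    smy = sum(a * b for a, b in zip(mc, yc))
    det = sxx * smm - sxm * sxm
    if sxx == 0 or det == 0:
        raise ZeroDivisionError("degenerate sample")
    a = sxm / sxx                      # slope of the mediator on the predictor
    b = (sxx * smy - sxm * sxy) / det  # mediator coefficient in y ~ x + m
    return a * b

def bootstrap_ci(x, m, y, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(indirect_effect(
                [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]))
        except ZeroDivisionError:
            continue  # skip degenerate resamples
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic data with a true indirect effect of 2 * 3 = 6 (illustration only)
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [2 * xi + rng.gauss(0, 1) for xi in x]
y = [3 * mi + 0.5 * xi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
point = indirect_effect(x, m, y)
ci_low, ci_high = bootstrap_ci(x, m, y, n_boot=1000)  # the study used 5,000 samples
```

In the analysis described above, x would correspond to the log-transformed Social reliance parameter, m to mPFC-mPFC connectivity, and y to TPJ-mPFC connectivity; an interval that excludes zero is taken as evidence for mediation at the chosen alpha level.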

Acknowledgements

This work was supported in part by the National Research Foundation of Korea (NRF) (RS-2024-00420674 to D.C.) and the KBRI basic research program through Korea Brain Research Institute funded by Ministry of Science and ICT (24-BR-03-08 to D.C.).