A stochastic world model on gravity for stability inference

  1. Department of Psychology and Tsinghua Laboratory of Brain & Intelligence, Tsinghua University, Beijing, China

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a response from the authors (if available).


Editors

  • Reviewing Editor
    Jörn Diedrichsen
    Western University, London, Canada
  • Senior Editor
    Michael Frank
    Brown University, Providence, United States of America

Reviewer #1 (Public Review):

This paper presents a set of experiments designed to test whether gravity in people's intuitive physics engine is implemented as a simple deterministic representation or as a Gaussian distribution. The work shows experimentally that the probabilistic representation of gravity does a better job of capturing human judgments, including biases in stability inferences. The work further shows that Gaussian representations of gravity can evolve in a simple agent-environment reinforcement learning problem setup.

Strengths:
The paper approaches the problem from three different angles in an impressive way. The first is through a direct comparison of human judgments against model predictions. The second is through an analysis of whether the model correctly predicts cognitive illusions. The third is through a computational exploration of how these representations emerge in a reinforcement-learning setup. The idea of approaching the same problem from multiple independent angles, and seeking confirming evidence is laudable.

Weaknesses:
There are two differences between the "natural gravity" account and the "mental gravity" account. The first difference lies in the implementation of gravity. The second, however, is simply that the mental gravity model is integrating more uncertainty into the simulator. In my understanding, adding small amounts of noise to computational models will often increase their fit to human judgments (with softmaxing perhaps being the most common example of this). While counter-intuitive, this is because 'noiseless' models have perfect representations of the stimuli, which is an unrealistic assumption. In the case of intuitive physics, people might have noisy perceptual representations of exactly how flat the table is, the exact location of each block, what small disturbances might be happening in the environment, and so on. The absence of these sources of uncertainty in deterministic models can make them perform in a non-human-like manner.

All the data presented in the paper are consistent with the possibility that people have a stochastic representation of gravity. However, it is also possible that people have uncertainty over what unobservable forces a block tower might be under (e.g., wind, bumps to the table, etc.). Therefore, even if you have a firm belief that gravity goes down, you may want to add noise in your simulations to account for the fact that, in the real world, gravity is almost never the only force acting on an object that has started to move. While the paper acknowledges that such an account would be mathematically equivalent, it does not acknowledge that this raises the question of whether people actually have stochastic representations of gravity.

This alternative account could be particularly important because I believe it might be a more accurate representation of what people believe. I may be wrong, but I believe that it is common to emphasize the probabilistic nature of the models and the importance of implementing forces as distributions (e.g., the concept of 'noisy newtons').

Reviewer #2 (Public Review):

Summary:
Through a set of experiments and model simulations, the authors tested whether the commonly assumed world model of gravity was a faithful replica of the physical world. They found that participants did not model gravity as a single, fixed vector but instead as a distribution of possible vectors. Surprisingly, the width of this distribution was quite large (~20 degrees). While previous accounts had suggested that this uncertainty was due to perceptual noise or an inferred external perturbation, the authors suggest that this uncertainty simply arises from a noisy distribution of the representation of gravity's direction. A reinforcement learning model with an initial uniform distribution for gravity's direction ultimately converged to a precision of the same order as that of the human participants, which lends support to the authors' conclusion and suggests that this distribution is learned through experience. What's more, further simulations suggest that representing gravity with such a wide distribution may balance speed and accuracy, providing a potentially normative explanation for the world model with gravity as a distribution.

Strengths:
The authors present surprising findings in a relatively straightforward way in a now classic experimental task. They provide a normative explanation based on a resource-rational framework for why people may have a stochastic world model instead of a deterministic world model.

Weaknesses:
Support for gravity being represented as a Gaussian distribution (stochastic world model), as opposed to perceptual uncertainty or (inferred) external perturbations, is from an RL model simulation. It would be more convincing if the authors could experimentally demonstrate that potential external perturbations did not affect the distribution of gravity.

Reviewer #3 (Public Review):

Summary:
Previous studies suggest that humans may infer objects' stability through a world model that performs mental simulations with a priori knowledge of gravity acting upon objects. In this study, the authors test two alternative hypotheses about the nature of this a priori knowledge. According to the Natural Gravity assumption, the direction of gravity encoded in this world model is straight downwards as in the physical world. According to the alternative Mental Gravity assumption, that gravity direction is encoded in a Gaussian distribution, with the vertical direction as the maximum likelihood. They present two experiments and computer simulations as evidence in support of the Mental Gravity assumption. Their conclusion is that when the brain is tasked to determine the stability of a given structure it runs a mental simulation, termed Mental Gravity Simulation, which averages the estimated temporal evolutions of that structure arising from different gravity directions sampled from a Gaussian distribution.
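
To make the averaging step concrete, here is a minimal Python sketch of a Monte Carlo estimate over sampled gravity directions. It is an illustration under stated assumptions, not the authors' implementation: the tilt from vertical is drawn from a zero-mean Gaussian, the horizontal angle from a uniform distribution, and `toy_falls` is a hypothetical stand-in for a physics engine that declares a fall whenever the tilt exceeds a fixed threshold. The names `sample_gravity`, `estimate_fall_probability`, and `toy_falls` are invented for this example.

```python
import math
import random


def sample_gravity(sigma_deg, rng):
    """Draw one unit gravity vector (assumed parameterization).

    theta: tilt from straight down, Gaussian with std sigma_deg (degrees).
    phi:   horizontal direction of the tilt, uniform on [0, 2*pi).
    """
    theta = math.radians(rng.gauss(0.0, sigma_deg))
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        -math.cos(theta),  # z points up, so "down" is negative z
    )


def estimate_fall_probability(falls, sigma_deg, n_samples=2000, seed=0):
    """Average a fall/no-fall predicate over sampled gravity directions."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(n_samples) if falls(sample_gravity(sigma_deg, rng))
    )
    return hits / n_samples


def toy_falls(g, max_tilt_deg=15.0):
    """Hypothetical placeholder for a physics simulation.

    Declares a fall when gravity is tilted more than max_tilt_deg from vertical.
    """
    cos_tilt = max(-1.0, min(1.0, -g[2]))
    return math.degrees(math.acos(cos_tilt)) > max_tilt_deg


if __name__ == "__main__":
    # sigma = 0 corresponds to a fixed, straight-down gravity vector.
    print(estimate_fall_probability(toy_falls, sigma_deg=0.0))
    print(estimate_fall_probability(toy_falls, sigma_deg=20.0))
```

With the width set to zero the sketch reduces to a single straight-down gravity vector, so the same configuration is always judged stable. Any non-zero width gives a graded probability of falling.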

Weaknesses:
In spite of the fact that the Mental Gravity Simulation (MGS) seems to predict the data of the two experiments, it is an untenable hypothesis. I give the main reason for this conclusion by illustrating a simple thought experiment. Suppose you ask subjects to determine whether a single block (like those used in the simulations) is about to fall. We can think of blocks of varying heights. No matter how tall a block is, if it is standing on a horizontal surface it will not fall until some external perturbation disturbs its equilibrium. I am confident that most human observers would predict this outcome as well. However, the MGS would not produce this outcome. Instead, it would predict a non-zero probability of the block tipping over. A gravitational field that is not perpendicular to the base has the equivalent effect of a horizontal force applied on the block at the height corresponding to the vertical position of the center of gravity. Depending on the friction determined by the contact between the base of the block and the surface where it stands, there is a critical height where any horizontal force being applied would cause the block to fall while pivoting about one of the edges at the base (the one opposite to where the force has been applied). This critical height depends on both the size of the base and the friction coefficient. For short objects this critical height is larger than the height of the object, so that object would not fall. But for taller blocks, this is not the case. Indeed, the taller the block, the smaller the deviation from a vertical gravitational field needed for a fall to be expected. The discrepancy between this prediction and the most likely outcome of the simple experiment I have just outlined makes the MGS model implausible. Note also that a gravitational field that is not perpendicular to the ground surface is equivalent to the force field experienced by the block while standing on an inclined plane. For small friction values, the block is expected to slide down the incline; therefore, another prediction of this MGS model is that when we observe an object on a surface exerting negligible friction (think of a puck on ice) we should expect that object to spontaneously move. But of course, we don't, as we do not expect tall objects that are standing to suddenly fall if left unperturbed. In summary, a stochastic world model cannot explain these simple observations.
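
The geometry behind this argument can be sketched in a few lines of Python. The sketch assumes a rigid, uniform rectangular block on a horizontal surface and a two-dimensional tilt of gravity. Under those assumptions the block tips once the tangent of the tilt exceeds base width divided by height, and it slides once the tangent exceeds the static friction coefficient. The function names, and the Gaussian tilt with standard deviation `sigma_deg`, are introduced here for illustration only and are not taken from the manuscript.

```python
import math


def critical_tilt_deg(base_width, height):
    """Tilt of gravity (degrees from vertical) beyond which a uniform block tips.

    The center of mass sits at height/2 above the middle of the base, so the
    gravity line leaves the base when tan(tilt) > (base_width/2) / (height/2).
    """
    return math.degrees(math.atan2(base_width, height))


def slides_before_tipping(base_width, height, mu):
    """True if sliding starts at a smaller tilt than tipping.

    Sliding starts when tan(tilt) > mu (the inclined-plane condition),
    tipping when tan(tilt) > base_width / height.
    """
    return mu < base_width / height


def topple_probability(base_width, height, sigma_deg):
    """P(|tilt| > critical tilt) for tilt ~ Normal(0, sigma_deg**2), 2D case."""
    theta_c = critical_tilt_deg(base_width, height)
    return math.erfc(theta_c / (sigma_deg * math.sqrt(2.0)))


if __name__ == "__main__":
    # Unit-width blocks of increasing height, with an assumed 20-degree width.
    for h in (1.0, 3.0, 10.0):
        print(h, critical_tilt_deg(1.0, h), topple_probability(1.0, h, 20.0))
```

Taller blocks have a smaller critical tilt, so under a Gaussian tilt distribution their predicted probability of tipping rises with height. This is the prediction the thought experiment targets.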

The question remains as to how we can interpret the empirical data from the two experiments and their agreement with the predictions of the stochastic world model if we assume that the brain has internalized a vertical gravitational field. First, we need to look more closely at the questions posed to the subjects in the two experiments. In the first experiment, subjects are asked about how "normal" a fall of a block construction looks. Subjects seem to accept 50% of the time a fall is normal when the gravitational field is about 20 deg away from the vertical direction. The authors conclude that according to the brain, such an unusual gravitational field is possible. However, there are alternative explanations for these findings that do not require a perceptual error in the estimation of the direction of gravity. There are several aspects of the scene that may be misjudged by the observer. First, the 3D interpretation of the scene and the 3D motion of the objects can be inaccurate. Indeed, the simulation of a normal fall uploaded by the authors seems to show objects falling in a much weaker gravitational field than the one on Earth since the blocks seem to fall in "slow motion". This is probably because the perceived height of the structure is much smaller than the simulated height. In general, there are even more severe biases affecting the perception of 3D structures that depend on many factors, for instance, the viewpoint. Second, the distribution of weight among the objects and the friction coefficients acting between the surfaces are also unknown parameters. In other words, there are several parameters that depend on the viewing conditions and material composition of the blocks that are unknown and need to be estimated. The authors assume that these parameters are derived accurately and only that assumption allows them to attribute the observed biases to an error in the estimate of the gravitational field. 
Of course, if the direction of gravity is the only parameter allowed to vary freely then it is no surprise that it explains the results. Instead, a simulation with a tilted angle of gravity may give rise to a display that is interpreted as rendering a vertical gravitational field while other parameters are misperceived. Moreover, there is an additional factor that is intentionally dismissed by the authors that is a possible cause of the fall of a stack of cubes: an external force. Stacks that are initially standing should not fall all of a sudden unless some unwanted force is applied to the construction. For instance, a sudden gust of wind would create a force field on a stack that is equivalent to that produced by a tilted gravitational field. Such an explanation would easily apply to the findings of the second experiment. In that experiment subjects are explicitly asked if a stack of blocks looks "stable". This is an ambiguous question because the stability of a structure is always judged by imagining what would happen to the structure if an external perturbation is applied. The right question should be: "Do you think this structure would fall if unperturbed?" However, if stability is judged in the face of possible external perturbations then a tall structure would certainly be judged as less stable than a short structure occupying the same ground area. This is what the authors find. What they consider as a bias (tall structures are perceived as less stable than short structures) is instead a wrong interpretation of the mental process that determines stability. If subjects are asked the question "Is it going to fall?" then tall stacks of sound structure would be judged as stable as short stacks, just more precarious.

The RL model used as a proof of concept for how the brain may build a stochastic prior for the direction of gravity is based on very strong and unverified assumptions. The first assumption is that the brain already knows about the force of gravity, but it lacks knowledge of the direction of this force of gravity. The second assumption is that before learning the brain knows the effect of a gravitational field on a stack of blocks. How can the brain simulate the effect of a non-vertical gravitational field on a structure if it has never observed such an event? The third assumption is that from the visual input, the brain is able to figure out the exact 3D coordinates of the blocks. This has been proven to be untrue in a large number of studies. Given these assumptions and the fact that the only parameters the RL model modifies through learning specify the direction of gravity, I am not surprised that the model produces the desired results.

Finally, the argument that the MGS is more efficient than the NGS model is based on an incorrect analysis of the results of the simulation. It is true that 80% accuracy is reached faster by the MGS model than the 95% accuracy level is reached by the NGS model. But the question is: how fast does the NGS model reach 80% accuracy (before reaching the plateau)?

Author Response:

We would like to express our heartfelt gratitude for the reviewers’ scholarly and insightful reviews of our manuscript. The constructive comments and thought-provoking experimental proposals have been invaluable not only in improving the quality of this study but also in shaping the direction of future research. In revision, all comments will be addressed point-by-point, and the manuscript will be revised thoroughly. Here in this reply, we focus on the most critical issue regarding the source of noises during stability inference.

When faced with a stack of objects, individuals are more likely to assess taller stacks as more unstable than shorter ones (Fig. 2b & 2d). This bias persists even when comparing single objects of different heights that share the same contact area with the supporting surface. Known as “stability inference bias,” this phenomenon challenges deterministic models with a single, fixed vector for the representation of gravity’s direction (i.e., directly downward). To reconcile this bias with deterministic models, previous studies (e.g., Allen et al., 2020; Battaglia et al., 2013; Kubricht et al., 2017) have incorporated external noises such as perceptual uncertainty and external force perturbations to increase their fit to human performance, as also pointed out by Reviewer 1.

In this study, we introduced an alternative perspective through a stochastic model in which variability is instead embedded in the representation of gravity’s direction. In this framework, gravity’s direction is not a fixed vector but a distribution of possible vectors, with the vertical direction serving as the maximum likelihood. While the distinction between deterministic and stochastic models is conceptually clear, mathematically they are equivalent. In addition, our stochastic model does not negate the role of external noises in stability inference, because gravity is seldom the sole force acting upon a moving object in the physical world, as pointed out by Reviewer 1. Together, these two factors make it challenging to ascribe the source of variability to either external or internal noises (Smith & Vul, 2013). This is the major concern raised by all three reviewers.

To distinguish between the deterministic and stochastic models, we designed a series of experiments aimed at demonstrating that internal noises, rather than external noises such as perceptual uncertainty or external force perturbations, influence our inference about object stability. However, the supporting evidence was dispersed and at times implicit throughout the manuscript. In revision, we will thoroughly clarify the ambiguities. In this reply, we will consolidate and present the evidence comprehensively.

1. The examination of external noises.

1.1 External Force Perturbations. Deterministic models suggest that during object stability inference, individuals implicitly assume the presence of external forces (e.g., wind) that could destabilize stacks. While this assumption aligns with the omnipresence of such forces in natural settings, it overlooks a crucial variable: the directionality of these external forces. In psychological studies, individual differences are commonly observed, and the perceived force direction is no exception. That is, some may assume that it comes from the left, while others from the right. In essence, if external forces were to play a significant role in stability inference, one would expect the perceived force directions to exhibit non-uniform distributions (i.e., anisotropy) in the horizontal plane within individuals and to show substantial variability between individuals.

Contrary to this expectation, our study revealed a different pattern. In the study, we specifically measured the distribution of 𝜑, the horizontal component reflecting the direction of object collapse. Our results indicated that all participants exhibited a uniform distribution of gravity’s directions in the horizontal plane (Fig. 1d right; Extended Data Fig. 2 and 3). This uniformity suggests that if external forces were a key determinant in stability inference, participants would have to assume a varying direction of external force in each trial—an assumption we consider unlikely. Instead, our RL model simulation suggests that the isotropy of 𝜑 arises from agent-environment interactions, notably in the absence of external forces (Extended Data Fig. 6).

In summary, the uniform distribution of the horizontal direction component, 𝜑, observed in all participants, challenges the argument for the dominant role of external forces in stability inference. We are sorry that this aspect was not explicitly emphasized in the original text, and in revision we will explain why external forces are unlikely to substantially shape our perception of object stability.
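
One common way to check whether a set of directions is uniform on the circle is the Rayleigh test. The reply does not state which statistical test was applied to 𝜑, so the Python sketch below is only an illustration of how such a check could look. The function name `rayleigh_test` is introduced here, the p-value uses the simple large-sample approximation exp(-z), and the two samples are synthetic data generated for the example, not participant data.

```python
import math
import random


def rayleigh_test(angles_rad):
    """Rayleigh test of circular uniformity.

    Returns (r_bar, z, p) where r_bar is the mean resultant length,
    z = n * r_bar**2, and p is approximated by exp(-z), which is a
    large-sample approximation to the exact p-value.
    """
    n = len(angles_rad)
    c = sum(math.cos(a) for a in angles_rad) / n
    s = sum(math.sin(a) for a in angles_rad) / n
    r_bar = math.hypot(c, s)
    z = n * r_bar ** 2
    return r_bar, z, math.exp(-z)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic isotropic directions (no preferred horizontal direction).
    isotropic = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(500)]
    # Synthetic anisotropic directions clustered around angle 0.
    clustered = [rng.vonmisesvariate(0.0, 2.0) for _ in range(500)]
    print(rayleigh_test(isotropic))
    print(rayleigh_test(clustered))
```

A small mean resultant length is what an isotropic distribution of 𝜑 would produce, whereas a consistently assumed force direction would pull the resultant toward that direction and yield a very small p-value.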

1.2 Perceptual uncertainty. To assess the impact of perceptual uncertainty on stability inference, we examined whether the representation of gravity’s direction is cognitively impenetrable. Specifically, we posited that if noises are external (i.e., perceptual uncertainty), the inference bias should be modulated by task context; in contrast, if noises are internal, the stochastic representation of gravity’s direction will be encapsulated from the context. To test this idea, we inverted the virtual environment, making gravity appear to point upward (also see a similar idea by Reviewer 3). In this unfamiliar context, which diverges dramatically from daily experiences, one would expect heightened perceptual uncertainty, which according to deterministic models would result in a larger inference bias – manifested as an increased width of the distribution (𝜎) of gravity’s direction. Contrary to this prediction, we observed that the width of the distribution remained unchanged (Fig. 1d and 1f). Furthermore, there was a high correlation (r = 0.91) between widths in the upright and inverted conditions across participants (Extended Data Fig. 2 and 3).

In summary, this finding suggests that the manipulation of perceptual uncertainty is unable to cognitively penetrate the representation of gravity’s direction, casting doubt on its dominant role in stability inference. We are sorry that in the original text, we did not clarify the rationale for employing the approach of cognitive impenetrability. In revision, this will be clarified.

2. The origin of intrinsic noises in stability inference.

In deterministic models, either external force perturbations or perceptual uncertainty is often assumed but rarely empirically tested. Indeed, these external noises are introduced primarily to account for observed biases in stability inference. In this study, we explicitly examined the possible origin of the intrinsic noises embedded in the representation of gravity’s direction. Without assumed perceptual uncertainty and external perturbation of forces, the RL model simulation showed that the distribution could evolve naturally based mainly on the agent’s experience, as it used the mismatch between the expectation and the observed state of the stack under natural gravity to update its representation of gravity’s direction (Fig. 3a). Importantly, the width of the distribution for the agent was comparable to that of human participants as measured in the psychophysics experiments (Fig. 3b). Therefore, experience alone may be sufficient to generate a stochastic representation of gravity’s direction, obviating the need for external noises.

Taken together, these findings underscore the limitations of the combination of deterministic models and external noises in accounting for stability inference, and suggest that intrinsic noises embedded in the representation of gravity play a pivotal role in shaping our stability inference of the physical world.

3. Thought experiments.

Although the evidence shown above may provide valuable insights, our study does not definitively settle the debate between deterministic models and our proposed stochastic model. Specifically, our study only preliminarily investigates two sources of external noise, perceptual uncertainty and external force perturbations, leaving many other factors, such as object mass and surface friction, unexplored (for studies on these factors, please see Hamrick et al., 2016). As such, the reviewers have proposed a series of thought experiments that warrant further investigation. Below, we enumerate some of them, followed by ours.

3.1 Experiment 1. Reviewer 3 proposed a thought experiment in which participants assess stability of a single block of varying heights. The reviewer argues that a block, regardless of its height, will remain stable on a horizontal surface unless externally disturbed. This assertion is perfectly true in the physical realm. However, in the cognitive domain, both deterministic models and our stochastic model predict differently. Take an extreme example of a standing needle: while it would remain upright in the physical world without external disturbances, both deterministic and stochastic models, which account for mental inference of physical events, will predict a likelihood of it falling, aligning with our subjective feelings. This is because in both models, noises are considered in the intuitive physics engine. In deterministic models, external force perturbations, as well as perceptual uncertainty, are assumed to be omnipresent noises in probabilistic reasoning. In our stochastic model, noises are embedded in the representation of gravity’s direction. Therefore, although this thought experiment, along with other thought experiments on object mass, surface friction (proposed by Reviewer 3), and falling trajectories behind an occluder (proposed by Reviewer 1), is insightful, it cannot serve to differentiate deterministic and stochastic models.

3.2 Experiment 2. Reviewer 2 suggested constructing a wall on one side of the virtual scene to make it improbable that participants would infer an external force perturbation emanating from that direction. In this setting, deterministic models would predict a non-uniform distribution of the horizontal component, 𝜑, skewed away from the wall. In contrast, according to our stochastic model, the distribution of 𝜑 would remain unaffected, maintaining the uniform distribution observed in previous experiments. Extending this logic, another test scenario could contrast an indoor scene with an outdoor scene. In a confined and static indoor environment, the likelihood of external force perturbations should be much lower than in a dynamic, open outdoor setting. Here, deterministic models would predict an increase in the width of the distribution, 𝜎, in the outdoor environment, whereas our model would anticipate no such change. The underlying rationale for these experiments parallels that of our previous setup (Fig. 1e), where we inverted the virtual environment and reversed the direction of gravity. Indeed, they all aim to assess the extent to which manipulations of external factors can cognitively penetrate the representation of gravity’s direction.

3.3 Experiment 3. A noteworthy insight derived from our RL model simulation relates to variations in the number of blocks within the virtual worlds. Deterministic models would predict an enlarged bias in stability inference as the number of blocks increases, which is attributed to elevated levels of perceptual uncertainty and an expanded area susceptible to external force perturbations. However, the results from our RL model simulation contradict this prediction, revealing that a larger number of blocks instead led to a narrowing of the width of the distribution. This decrease in width can be ascribed to the richer information that a larger number of blocks provides the agent for refining its representation of gravity’s direction. In line with this rationale, we propose a new experiment from the perspective of ecological psychology, which emphasizes that cognitive processes are shaped by our interactions with the environment. Specifically, we hypothesize that individuals raised in mountainous terrains may exhibit more accurate representations of gravity’s direction than those raised in flat terrains. This proposed experiment could not only help resolve the ongoing debate between the two models to some extent, but also encourage future studies on intuitive physics within a more ecologically valid framework.

To conclude, both deterministic and stochastic models align closely with Bayesian principles, where stability inference is conceptualized as probabilistic reasoning. Nevertheless, the divergence between them is not trivial, as it hinges on distinct philosophical assumptions about the relationship between the inner mind and the external world. Deterministic models propose that the mind serves as a faithful reflection of the world; therefore, gravity’s direction is represented as a single, fixed vector directly downward, the same as that in the world. In these models, uncertainty for probabilistic reasoning emanates from factors external to the module of the intuitive physics engine. In contrast, our stochastic model underscores the notion that the mind is an active inference machine, continually reinterpreting inputs from the outside world; therefore, the mind gains increased adaptability, allowing for a more nuanced accounting of uncertainty in the world – factors often crucial for survival. Such active inference necessitates flexible representations; accordingly, within the model of the intuitive physics engine, variations are embedded into the representation of gravity’s direction. While resolving this philosophical debate is beyond the capacity of the present study, we contend that the field of intuitive physics offers a valuable lens through which to pry open the complex interplay between the mind and the world we live in.

References

  • Allen, K. R., Smith, K. A., & Tenenbaum, J. B. (2020). Rapid trial-and-error learning with simulation supports flexible tool use and physical reasoning. Proceedings of the National Academy of Sciences, 117(47), 29302–29310.
  • Battaglia, P. W., Hamrick, J. B., & Tenenbaum, J. B. (2013). Simulation as an engine of physical scene understanding. Proceedings of the National Academy of Sciences, 110(45), 18327–18332.
  • Kubricht, J. R., Holyoak, K. J., & Lu, H. (2017). Intuitive physics: Current research and controversies. Trends in Cognitive Sciences, 21(10), 749–759.
  • Smith, K. A., & Vul, E. (2013). Sources of uncertainty in intuitive physics. Topics in Cognitive Science, 5(1), 185–199.