
Perception as a closed-loop convergence process

  1. Ehud Ahissar (corresponding author)
  2. Eldad Assa
  1. Weizmann Institute of Science, Israel
Research Article
Cite as: eLife 2016;5:e12830 doi: 10.7554/eLife.12830

Abstract

Perception of external objects involves sensory acquisition via the relevant sensory organs. A widely-accepted assumption is that the sensory organ is the first station in a serial chain of processing circuits leading to an internal circuit in which a percept emerges. This open-loop scheme, in which the interaction between the sensory organ and the environment is not affected by its concurrent downstream neuronal processing, is strongly challenged by behavioral and anatomical data. We present here a hypothesis in which the perception of external objects is a closed-loop dynamical process encompassing loops that integrate the organism and its environment and converging towards organism-environment steady-states. We discuss the consistency of closed-loop perception (CLP) with empirical data and show that it can be synthesized in a robotic setup. Testable predictions are proposed for empirical distinction between open and closed loop schemes of perception.

https://doi.org/10.7554/eLife.12830.001

eLife digest

How do we perceive the world around us? Today the dominant view in brain research is that sensory information flows from the environment to our eyes, fingers and other sense organs. The input then continues on to the brain, which generates a percept. This process is referred to as “open-loop perception” because information flows through the system predominantly in one direction: from the environment, to the sense organs, to the brain.

Open-loop perception struggles to account for a number of key phenomena. The first is that sensation is an active process. Our eyes and hands constantly move as we interact with the world, and these movements are controlled by the brain. According to Ahissar and Assa, a more accurate view of perception is that the brain triggers the movement of the sense organs, and thereby alters the sensory information that these organs receive. This information is relayed to the brain, triggering further movement of the sense organs and causing the cycle to repeat. Perception is therefore a “closed loop”: information flows between the environment, sense organs and brain in a continuous loop with no clear beginning or end.

Closed-loop perception appears more consistent with anatomy and with the fact that perception is typically an incremental process. Repeated encounters with an object enable a brain to refine its previous impressions of that object. This can be achieved more easily with a ‘circular’ closed-loop system than with a linear open-loop one. Ahissar and Assa show that closed-loop perception can explain many of the phenomena that open-loop perception struggles to account for. This is largely because closed-loop perception considers motion to be an essential part of perception, and not an artifact that must be corrected for.

The open- and closed-loop hypotheses should now be compared systematically. One approach would be to construct an artificial perceiver (or robot) based on each hypothesis and examine its behavior. Another would be to perform experiments in which the two hypotheses make opposing predictions. Paralyzing a sensory organ without affecting the flow of sensory information, for example, would impair perception according to the closed-loop hypothesis, but would have no effect according to the open-loop hypothesis.

https://doi.org/10.7554/eLife.12830.002

Introduction

Until the middle of the 20th century psychologists and psychophysicists viewed perception as a primarily active process: perception is what emerges when an organism equipped with a brain interacts with its environment (James, 1890; Koffka, 1935; Mach, 1959; Merleau-Ponty, 1962; Uexkull, 1926). Indeed, behavioral studies revealed that although mammals can perceive events or objects while being passive, most of the time mammalian individuals seek objects and perceive the world via active body and sensor movements (Ahissar and Arieli, 2001; Diamond et al., 2008; Findlay and Gilchrist, 2003; Halpern, 1983; Kleinfeld et al., 2006; Konig and Luksch, 1998; Land, 2006; Lederman and Klatzky, 1987; Najemnik and Geisler, 2005; Rucci et al., 2007; Schroeder et al., 2010). Investigating perception at the neuronal level, however, proved to be extremely challenging and neuroscientists have adopted a series of reductionist methods in which various components of the process have been eliminated. One such critical component has been sensor motion – neuroscientists have invested enormous effort in precluding sensor movements as these movements, naturally, interfere with systematic characterizations of neuronal responses. This passive paradigm indeed yielded invaluable descriptions of neuronal circuits and pathways that can convey sensory information and suggested how these pathways might process sensory information. Crucially, however, passive paradigms cannot reveal how sensory information is actually processed during active perception (Ahissar and Arieli, 2001; Ennis et al., 2014; Maravall and Diamond, 2014). For that, a unified analysis of the motor and sensory components engaging brains with their environments is required.

Experimental data are usually examined in light of, and reflect on, implicit or explicit hypotheses. One salient outcome of the passive reductionist approach has been the overemphasis of open-loop schemes of perception. The elimination of motor components from the experimental scheme yielded a parallel elimination of motor variables from the corresponding theoretical schemes, leaving models of perception as sensory-only open-loop schemes (e.g., Connor and Johnson, 1992; Edelman, 1993; Marr, 1982; Poggio and Serre, 2013). In agreement with previous suggestions (Ahissar and Arieli, 2001; Dewey, 1896; Freeman, 2001; Kelso, 1997; Port and Van Gelder, 1995), we claim that such a reductionist paradigm should ultimately fail to elucidate neural mechanisms of natural perception. This is not to say that any reduction would fail, but to emphasize that an appropriate reductionist paradigm should leave the motor-object-sensory interactions intact. The current paper describes an attempt to bring the motor variables back to the theoretical modeling of perception, by proposing a motor-sensory closed-loop scheme for the perception of the external environment. The paper makes use of ideas previously developed in various dynamic theories (Ahissar and Vaadia, 1990; Ashby, 1952; Kelso, 1997; O'Regan and Noe, 2001; Port and Van Gelder, 1995; Powers, 1973; Wiener, 1949) and, in general, refers to the perception of external objects as a process of acquiring information about presently-existing external objects, whether consciously or not. The paper addresses perceptual acquisition; mechanisms of perceptual reports and their interactions with perceptual acquisition are not addressed here. It is noted, however, that a comprehensive understanding of perception depends on the understanding of report mechanisms as well. For simplicity, the term “brain” is often used in this article in an extended form that includes the sensory organs and their affiliated nerves and muscles.

The open loop perception (OLP) doctrine

Closed loops are systems in which every signal eventually affects its source; open loops are systems in which signals cannot affect their sources. Clearly, brains contain closed loops at all levels, some of which have been implicated in relation to perceptual processing (Ahissar and Kleinfeld, 2003; Edelman, 1993; Martin, 2002; Pollen, 1999). Yet, whether perceptual acquisition is considered an open-loop or closed-loop process does not depend on the existence of closed loops within the chain of processing, but on whether the entire chain of processing is closed (as a loop) or open. Thus, a perceptual process that starts at the sensory organ and ends somewhere in the brain, whether containing local loops or not, is termed here an open-loop perceptual (OLP) process (Figure 1A), whereas a perceptual process that includes the sensory organ but has neither a starting nor an ending point is termed a closed-loop perceptual (CLP) process (Figure 1B).
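
The distinction drawn above can be illustrated with a minimal sketch, offered here only as an illustration and not as part of the original analysis. It treats a perceptual scheme as a directed graph and asks whether the sensory organ lies on a cycle. The station names (`stage_a`, `stage_b`, `motor`) and both example graphs are hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # station -> stations it sends signals to


def reachable(graph: Graph, start: str) -> Set[str]:
    """Return every station reachable from `start` via at least one connection."""
    seen: Set[str] = set()
    stack = list(graph.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def on_loop(graph: Graph, station: str) -> bool:
    """True if signals leaving `station` can eventually affect `station` itself."""
    return station in reachable(graph, station)


# Hypothetical OLP-like scheme: a local loop exists between stage_a and stage_b,
# but nothing feeds back to the sensory organ.
olp_scheme: Graph = {
    "sensory_organ": ["stage_a"],
    "stage_a": ["stage_b"],
    "stage_b": ["stage_a"],
}

# Hypothetical CLP-like scheme: the same stations, plus a motor pathway
# that moves the sensory organ.
clp_scheme: Graph = {
    "sensory_organ": ["stage_a"],
    "stage_a": ["stage_b"],
    "stage_b": ["stage_a", "motor"],
    "motor": ["sensory_organ"],
}

print(on_loop(olp_scheme, "sensory_organ"))  # local loops only
print(on_loop(clp_scheme, "sensory_organ"))  # loop includes the organ
```

In this sketch the first scheme contains an internal loop yet remains open in the motor-sensory sense, because the sensory organ is not part of any cycle; only the second scheme closes the loop through the organ.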

Possible perceptual schemes.

(A) An open-loop scheme (in the motor-sensory sense) – perception begins with an interaction (uni- or bi-directional) between the object and the sensory organ (an eye in this illustration) and ends somewhere in the brain where a relevant neuronal representation (NR) is formed. (B) A closed-loop scheme (in the motor-sensory sense) – perception is a circular process, with no starting or ending points, which contains the sensory organ.

https://doi.org/10.7554/eLife.12830.003

The OLP doctrine holds that external objects and features are perceived in an open-loop manner, in the motor-sensory sense (Baars, 2002; Dehaene et al., 1998; Hochstein and Ahissar, 2002; Riesenhuber and Poggio, 2000; Tononi and Koch, 2008; Ullman, 2007). Thus, for example, an apple activates retinal receptors, which in turn initiate a stream of activations in the brain, some of which may depend on internal loops, i.e., loops that do not include the sensory organ. An activity pattern that is repeatedly evoked in a given neuronal network in response to a presentation of the apple, and/or when such an apple is perceived, is often termed a neuronal correlate or neuronal representation (NR) of that apple. NRs are representations that are not necessarily consistent or unique, i.e., they may appear in only some of the cases in which the apple is presented or perceived, and may appear also when other objects are presented or perceived. If a specific NR is evoked in a given brain for each and every perceived appearance of the apple, is invariant to changes in internal and environmental conditions, and is unique to the apple, it can be termed “the” invariant representation (IvR) of the apple in that specific brain. Assuming OLP, IvRs should be invariant to the acquisition mode. Visual IvRs of the apple, for example, should be the same in passive and active acquisition modes, i.e., when the eye is stationary and the object moves or flashes (passive mode) and when the object is stationary and the eye moves (active mode).

The search for NRs that are also IvRs, during the last 6–7 decades, yielded several key findings. Among those is the characterization of NRs of various external features along the relevant sensory streams. For example, NRs of brief presentations of visual elements, such as dots and bars, were characterized among retinal, thalamic and cortical neurons (Hartline, 1938; Hubel and Wiesel, 1962). NRs of more complex visual patterns were characterized in various cortical areas (Creutzfeldt and Nothdurft, 1978; Fujita et al., 1992; McMahon et al., 2014). Crucially, however, although partial invariance had been demonstrated for portions of the proposed NRs in some of the cases, none of these NRs has so far been shown to be “the” IvR of a specific external object or feature, namely an NR that is (at least substantially) invariant to changes in the most relevant conditions of perception. Moreover, none of these studies provides information that can discriminate between OLP and alternative hypotheses. Consider, for example, studies exhibiting single neurons that increase their firing rate significantly and selectively for a given object (e.g., a face) out of several presented objects, and for several variations of that object (McMahon et al., 2014; Quiroga et al., 2005; Viskontas et al., 2009). The critical factor here is that such a neuron cannot be considered as describing the IvR of that object, nor as describing a reliable projection of the IvR. Based on combinatorial considerations and response variations the assumption in such cases is that the elevated firing rate of such a neuron is a (tiny) component of the relevant NR, and not the NR itself. The question, then, is whether the assumed NR would be invariant to a sufficiently large portion of all relevant variations of object presentation and context. Given that these neurons are not completely invariant even to the limited sample of variations presented to them (as is evident from the substantial trial-by-trial variability of their responses) and their tiny contribution to the actual NR, it is impossible to infer the level of invariance of the actual NR out of the firing patterns measured from these neurons.

Studying the passive mode of sensation also revealed various forms of internal transformations between NRs, such as, for example, transformation from NRs of static dots to NRs of static bars (Hubel, 1996; Reid, 2001), from temporal-code based NRs to rate-code based NRs (Ahissar et al., 2000) or from rate-code based NRs to temporal-code based NRs (Cleland, 2010). Clearly, these mechanisms can function within both OLP and CLP schemes of perception. Passive-mode experiments were also instrumental in describing the minimal exposure times required for generating meaningful perceptual reports. Across a large set of stimuli it was found that, depending on practice, exposure times as short as a few tens of milliseconds already allow a categorization of the presented stimulus, at least in a binary manner. As will be shown below, these findings are consistent with both OLP and CLP schemes.

Challenges to the OLP doctrine

As described above, the OLP doctrine allowed an invaluable characterization of various components of the perceptual systems of mammals, using a set of reductionist steps. In order to verify that these specific reductions of the perceptual process are scientifically valid, one has to reconstruct perception by recombining the individual identified components. Succeeding in doing so will not only validate the specific reductionist approaches used, but, more importantly, show that OLP can be considered as a valid (i.e., self-consistent) theory of perception. At this stage we can ask whether OLP is consistent with the data collected so far. We describe here several major findings that appear to be inconsistent with OLP and thus significantly challenge the validity of OLP as a mechanism for natural perception in mammals.

Sensation is normally active

Mammalian sensory organs usually acquire information via movements (Ahissar and Arieli, 2001; Chapin and Woodward, 1982; Diamond et al., 2008; Kleinfeld et al., 2006; Konig and Luksch, 1998; Land, 2006; Lederman and Klatzky, 1987; Prescott et al., 2011; Rucci et al., 2007; Schroeder et al., 2010). The strategies employed by sensory systems are often similar. Visual and tactile systems, for example, employ movements of sensory organs that contain two-dimensional arrays of receptors. The movements serve several functions. Larger movements (e.g., ocular saccades and head or arm movements) quickly move the array of receptors from one region of interest to another. Smaller (and slower) movements (e.g., fixational drifts and finger or vibrissal scanning) scan the region of interest at fine resolution (Ahissar and Arieli, 2001). This move-dwell-move pattern is typical for perceptual exploration across a large range of temporal scales, from minutes to less than a second (Figure 2). Olfaction and taste are probably as active as touch and vision (Halpern, 1983; Kepecs et al., 2006; Mainland and Sobel, 2006; Welker, 1964). The extent of action in hearing is less clear - while cochlear amplification is considered active (Dallas, 1992; Nin et al., 2012), whether auditory sensation is typically obtained via sensor activation is still not known (see Perceptual systems are organized as motor-sensory-motor (MSM) loops and Contrasting OLP and CLP – discriminatory testable predictions below). Cross-modal effects between body and sensor movements, which are not discussed in this paper, are likely to play a significant role in perception as well (Ayaz et al., 2013; Fonio et al., 2016; Grion et al., 2016; Keller et al., 2012; Moore et al., 2013; Niell and Stryker, 2010). 
During sensor scanning, activations of individual (e.g., photo- or mechano-) receptors are functions of the interactions between the moving sensor and the physical features of external objects (Ahissar and Arieli, 2012; Ahissar and Vaadia, 1990; Bagdasarian et al., 2013; Boubenec et al., 2012; Friston, 2010; Gamzu and Ahissar, 2001; Gibson, 1962; Hires et al., 2013; Horev et al., 2011; Jarvilehto, 1999; Kuang et al., 2012; Mainland and Sobel, 2006; O'Regan and Noe, 2001; Pammer et al., 2013; Quist and Hartmann, 2012; Quist et al., 2014; Rucci and Victor, 2015; Saig et al., 2012; Saraf-Sinik et al., 2015; Smear et al., 2011). These dependencies are termed here in general motor-sensory contingencies (MS-contingencies); they form one class of the sensorimotor contingencies described by O’Regan and Noe (O'Regan and Noe, 2001).

Active sensing.

(A) Ocular scanning of a scene. The trajectory of a human subject’s gaze (of one eye) during free viewing of an image presented on a computer screen is depicted. “Drift” points to the slow eye movements scanning a region of interest during a fixational pause. “Saccade” points to a rapid saccadic eye movement moving the gaze from one fixational pause to another. Section duration: 60 s; sampling: 240 Hz. Courtesy of Moshe Fried and Amos Arieli. (B) Manual scanning of a surface. The trajectory of a human subject’s hand, while palpating a virtual surface with a varying density of elevated dots (black), is depicted. The surface was mimicked via a tactile computer mouse system (VTPlayer; VirTouch, Jerusalem) whose two 4x4 pin arrays, which were touched constantly with the index and middle fingers of the right hand, reflected the spatial details of the virtual surface according to mouse location. Section duration: 152 s; sampling: 125 Hz. Courtesy of Avraham Saig and Amos Arieli. (C) Facial scanning of an arena. The trajectory of the snout of a rat, exploring an arena using sniffing and touch, is depicted. Section duration: 828 s; sampling: 25 Hz. Courtesy of Ben Mitchinson, Chris J. Martin, Robyn A. Grant and Tony J. Prescott; see (Mitchinson et al., 2007). (D) Local vibrissal scanning. The trajectory of a point near the middle of whisker C1 of a rat, exploring a region of an arena, is depicted. All whiskers except row C were trimmed on both sides of the snout. Section duration: 1.5 s; sampling: 500 Hz. Courtesy of Tess Oram, Noy Barak and Dudi Deutsch. (E) Sensory granularity. Left, a sample of retinal photoreceptors array of the human foveal area (from Curcio et al., 1987). Middle, a schematic illustration of the organization of one type of mechanoreceptor (rapidly adapting) under the skin of the human fingertip. Right, whiskers array: left, the array of whiskers across the right snout of a rat, courtesy of Sebastian Haidarliu; right, a schematic illustration of a whisker’s follicle containing hundreds of mechanoreceptors, courtesy of Satomi Ebara.

https://doi.org/10.7554/eLife.12830.004

The fact that mammalian sensation is active significantly challenges the OLP doctrine. First, it turns out that the common reductionist approach in which stimuli are flashed on passive sensory organs cannot be extended back to natural conditions. This is because in such experiments no information is obtained about the dependency of sensory signals on natural active interactions with the object, interactions that cannot be mimicked with passive sensors. In vibrissal touch, for example, a crucial sensory variable is the whisker curvature (Bagdasarian et al., 2013; Boubenec et al., 2012; Quist and Hartmann, 2012), which cannot be physically mimicked with only external forces (Bagdasarian et al., 2013). In vision, while the conditions accompanying an ocular drift can be mimicked, in principle, by drifting the entire visual field, the conditions accompanying ocular saccades cannot be mimicked with passive eyes. Ocular saccades are accompanied by peri-saccadic suppression during which, unlike with flashed stimuli, activity along the visual sensory pathway is significantly suppressed (Hamker et al., 2011). Also, saccades always end with additional eye movements, such as overshoots, corrections and drifts, which are lacking in passive-eye experiments. In general, it seems that the conditions introduced when stimuli are flashed on passive sensors mimic a small set of naturally-occurring states such as lightning at night or a sudden wind blowing over the rat’s whiskers. It is thus not surprising that, when compared, the characteristics of NRs revealed with passive sensors are substantially different from those revealed with active sensors (e.g., Kagan et al., 2002).

OLP assumes that the presentation of an object retrieves the NR that represents it, i.e., its neuronal IvR. When this assumption was tested computationally in the presence of simulated eye movements it was found that such a retrieval is possible with a very simple environment (one stimulus) and a limited number of possible NRs (two), in which case the knowledge of the statistics of sensor motion (e.g., eye movements) can provide unique, unambiguous solutions (Pitkow et al., 2007). However, it is not clear if a similar mechanism can work with more crowded environments, even when the movement trajectory of the eye is tracked by the perceiver (Burak et al., 2010). The major challenge with IvR retrieval in OLP, even when the sensor trajectory is known (e.g., Ahissar et al., 2015a), is the instability of the sensory input. With spike-based representations and finite firing rates this instability is devastating – by the time required to construct a reliable representation the sensor may have already moved away and provided new inputs. With representations of fine visual details it has been shown that this is indeed the case (Ahissar and Arieli, 2012).

The realization that the visual system codes external objects differently in passive and active modes sets another major challenge to OLP. This difference can be attributed to the fact that while a passive eye that is stimulated by a flashed image can only use spatial coding to represent the image, a moving eye can use both spatial and temporal coding schemes. In fact, the temporal code appears to be much more accurate, and of higher spatial resolution, than the spatial code (Ahissar and Arieli, 2012; Berry et al., 1997; Reich et al., 1997). Thus an OLP theory assuming that the same IvR is retrieved with or without sensor motion must also assume that perception is based on the less accurate spatially-coded information and ignores (or corrects for) the more accurate temporally-coded information - clearly an inefficient strategy. As we will see below (in Contrasting OLP and CLP – discriminatory testable predictions), a more efficient OLP scheme, which is based on active sensing and can exploit its advantages, is also possible.

Sensory signals convey ambiguous information

Sensory signals may often be ambiguous if processed without the motor signals that yielded them. One example is the curvature signal generated at the base of a whisker upon its contact with an object. The same curvature can be generated when contacting objects at different locations, an ambiguity that is resolved if the angle by which the whisker is rotated is taken into account (Bagdasarian et al., 2013) (Figure 3). Similarly, temporal delays between two whiskers or two photoreceptors code spatial offsets ambiguously if sensor velocity is not considered (Ahissar and Arieli, 2001, 2012; Knutsen and Ahissar, 2009). Consistently, in vivo recordings from the primate retina ruled out pure sensory processing, such as lateral inhibition, as a basis for edge detection while supporting motor-sensory processes involving eye movements (Ennis et al., 2014). These pieces of evidence join a substantial list of evidence for the ambiguity of sensory signals and the unambiguity of MS-contingencies (Bompas and O'Regan, 2006; O'Regan and Noe, 2001). For vision this is further supported by a series of experiments and analyses indicating that retinal information depends on the nature and trajectory of miniature eye movements (Ahissar et al., 2014; Ko et al., 2010; Kuang et al., 2012; Olveczky et al., 2003; Rucci et al., 2007; Snodderly et al., 2001).
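
The velocity dependence of temporal delays noted above can be made concrete with a minimal sketch. It assumes the simplest possible relation, in which the delay between two receptors equals their spatial offset divided by a constant sensor velocity; the function names, units and numerical values are hypothetical and serve only as illustration.

```python
import math


def delay_from_offset(offset_deg: float, velocity_deg_per_s: float) -> float:
    """Sensory variable: delay (s) between two receptors crossing the same edge.

    Assumes constant sensor velocity (an illustrative simplification).
    """
    return offset_deg / velocity_deg_per_s


def offset_from_contingency(delay_s: float, velocity_deg_per_s: float) -> float:
    """Decode the spatial offset from the motor (velocity) and sensory (delay) pair."""
    return velocity_deg_per_s * delay_s


# Two hypothetical situations with different spatial offsets and sensor velocities.
delay_a = delay_from_offset(offset_deg=0.01, velocity_deg_per_s=1.0)
delay_b = delay_from_offset(offset_deg=0.04, velocity_deg_per_s=4.0)

# The sensory signal alone is ambiguous: both situations yield the same delay.
print(math.isclose(delay_a, delay_b))

# The motor-sensory contingency recovers each offset.
print(offset_from_contingency(delay_a, 1.0))
print(offset_from_contingency(delay_b, 4.0))
```

Under these assumptions, the delay by itself cannot separate the two offsets, whereas the pairing of the delay with the sensor velocity that produced it can.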

An example of MS-Contingency in vibrissal touch.

A schematic illustration of morphological coding of object location (Bagdasarian et al., 2013) is depicted. The motor-sensory phase plane describes the combinations of values of a motor (θp: push angle, maximal change in whisker angle from contact onset) and sensory (k, whisker base curvature) variables when a whisker actively contacts an object at various locations. The locations are defined by their coordinates in the horizontal plane (inset): three azimuth coordinates (Lθ = [p1, p2, p3]) and three radial coordinates (Lr = [60%, 75%, 90%] of whisker length) are depicted and coded by colors. Note that neither of the two variables provides unambiguous coding of object location by itself; for example, k around 0.02 mm⁻¹ codes for both ~[p2, 60%] and ~[p1, 90%]. In contrast, the contingency between the motor and sensory variables provides unique coding of both Lr and Lθ (see equations).

https://doi.org/10.7554/eLife.12830.005

Note that this challenge cannot be alleviated by adding efference copy information to open-loop perceptual processing – efference copies are not accurate enough to account for perceptual accuracy (Ahissar et al., 2015a; Pitkow et al., 2007; Simony et al., 2008). For example, perception of object location in rats (Knutsen and Ahissar, 2009) depends on the details of the motor trajectory at a resolution corresponding to movements induced by individual motor spikes (Herfst and Brecht, 2008; Simony et al., 2008), a resolution that is likely not available in internal efference copies (Fee et al., 1997; Hill et al., 2011). Similarly, the accuracy of visual efference copies is two orders of magnitude lower than the size of fine eye movements (Pitkow et al., 2007).

Perceptual systems are organized as motor-sensory-motor (MSM) loops

Sensory organs (eyes, hands, whiskers) are associated with muscles whose activations move the sensory organ and induce sensory signals (Simony et al., 2008). The neuronal motor and sensory systems that are associated with a given sensory organ are connected via an intricate system of loops that does not allow an isolated operation of either (see illustration of the vibrissal system in Figure 4A). When motor efferents of a specific sensory organ are activated, sensory signals are inevitably generated (Hentschke et al., 2006; Jarvilehto, 1999; Johansson and Flanagan, 2009; Keller et al., 2012; Poulet and Petersen, 2008) and when sensory signals are generated, motor efferents to the same sensory organ are naturally affected (Bonneh et al., 2013; Gilad et al., 2014; Ko et al., 2010; Mainland and Sobel, 2006; Nguyen and Kleinfeld, 2005). One needs to anesthetize the brain, eliminate specific pathways, or prevent the movements of the relevant sensory organs in order to ‘open’ this motor-to-sensory-to-motor loop.

Anatomy and perceptual schemes of a sensory modality.

(A) Closed-loop motor-sensory-motor (MSM) connections of the vibrissal system. A schematic diagram of the most relevant connections, through which sensory activities feed motor circuits at various levels, is depicted; efference copies are not explicitly depicted. Oval circles indicate brain regions [BPN, brainstem premotor nuclei (arbitrarily divided into two oval circles); BG, basal ganglia; Cer, cerebellum; FN, facial nucleus; MCx, motor cortex; POm, posteromedial thalamic nucleus; RN, red nucleus; SC, superior colliculus; SI, primary somatosensory cortex; SII, secondary somatosensory cortex; TG, trigeminal ganglion; TN, trigeminal brainstem nuclei; VL, ventrolateral thalamic nucleus; VPM, ventroposteromedial thalamic nucleus; ZI, zona incerta]. Black curves connecting brain regions indicate anatomical connections. Arrows indicate the direction of information flow between brain regions. Connections not labeled with arrows are reciprocal (for more details see Bosman et al., 2011; Diamond et al., 2008; Kleinfeld et al., 2006). Three examples of individual MSM-loops are illustrated by green (a brainstem loop), blue (a thalamic loop) and red (a cortical loop); the primary efferents (FN to muscles) and afferents (follicle to TN) may or may not be common to different pathways. Modified from (Ahissar and Knutsen, 2008; Ahissar et al., 2015b). Inset, top view of the head and whiskers of a rat performing a bilateral localization task. (B) An MSM-loop (left) activates and senses the same organ. Sensory-motor arcs (right), which sense one organ and activate another, are not discussed in this paper. (C) Inclusion in an MSM-loop. Re-afferent loops (green) are always closed and thus can be considered as constantly ‘perceiving’ their organs. Ex-afferent loops (magenta) are normally open (dotted). An ex-afferent loop is closed (solid) only when the sensory organ interacts with the object (right); neither object presence alone (left) nor sensor movement alone (middle) closes the loop.

https://doi.org/10.7554/eLife.12830.006

Brain loops that include the relevant sensory organ for a given perception (Ahissar and Arieli, 2012; Ahissar and Kleinfeld, 2003; Ahissar and Vaadia, 1990; Diamond et al., 2008; Kleinfeld et al., 2006; Saig et al., 2012) are termed here motor-sensory-motor loops, or briefly MSM-loops. For example, vibrissal MSM-loops include loops running via brainstem stations, thalamic stations and cortical stations, all sharing the same sensory organ (Figure 4A; colored arcs). Finger-touch MSM-loops include loops that are similar to those of the vibrissal system, running through homologous stations (Ahissar et al., 2015b). Existing anatomical descriptions of Visual MSM-loops are less detailed, although it is known that they also follow a multi-pathway architecture (Bishop, 1959; Casagrande, 1994; Diamond, 1983; Lappe et al., 1999; Nassi and Callaway, 2009; Wang et al., 2007), with sensory information feeding back onto oculomotor pathways at virtually all brain levels (Dhande and Huberman, 2014; Fries et al., 1985; Guillery, 2005; Guillery and Sherman, 2002; Krauzlis and Lisberger, 1991; Malik et al., 2015). Likewise, sniffing and tasting are likely to be controlled via modality specific MSM-loops as well (Kareken et al., 2004; Kepecs et al., 2007; Moore et al., 2013). As for the auditory system, relevant MSM-loops are likely those whose motor efferents activate the outer hair cells in the cochlea, which in turn change the tuning of the basilar membrane (Guinan, 1996; Jennings and Strickland, 2012), those which activate the muscles of the middle ear (Kobler et al., 1992) and those which control the direction of the pinnae. MSM-loops that control head movements can be shared by all cranial senses.

Throughout this paper, when we refer to MSM-loops we refer both to their anatomy and function. We use the term “motor-sensory-motor” instead of the common term “sensory-motor” in order to emphasize the fact that the loops that we refer to are those controlling a single sensory organ, and in which the flow of information is from the sensory organ to itself, via the brain. These loops should be distinguished from multi-modal sensory-motor loops, which include sensory-motor arcs that link different modalities (e.g., eye – hand or eye – whisker; Figure 4B) – these inter-modal loops and arcs are not addressed here.

The closed-loop architecture of the perceptual systems challenges the OLP doctrine. How would an open-loop mechanism emerge, and how would it function, in such a closed-loop system? In natural conditions every sensory activity will affect the movement of the sensory organ and evoke new sensory activations, assuming that the external object does not disappear after its first interaction with the brain. As loop cycle times are typically shorter than the typical perceptual epoch (e.g., Deutsch et al., 2012), a sequence of such sensory activations is typically expected within each perceptual epoch (i.e., a period of continuous engagement with the object). How would this sequence of activations be ignored? And, more importantly perhaps, why would it be ignored? Moreover, it is known that increased stimulus exposure durations increase perceptual accuracy and confidence (Packer and Williams, 1992; Saig et al., 2012); if this is achieved in an open-loop manner, then it would mean that the brain does use those additional sensory signals, and “corrects for” the motion that evoked them using efference copy signals. Unfortunately, as mentioned above (in Sensory signals convey ambiguous information), efference copy signals are not accurate enough to account for fine perception.

Perception can be masked “backwardly”

Although the loops are anatomically closed, they can be opened functionally. For example, projecting a flash of an image on the retina or skin, for a duration that is shorter than the duration of the minimal MSM-loop cycle, does not allow closure of the loop. When such a ‘virtual knife’ is used, the system is forced to function in an open-loop mode, regardless of its architecture. According to the OLP doctrine, this reductionist step does not interfere with the fundamental process underlying perception and thus the natural perceptual process can be reconstructed from such individual open-loop processes. However, backward masking, a robust perceptual phenomenon, challenges this assumption. The presentation of a second object within tens of milliseconds after the presentation of a target object prevents or impairs the perception of this target object (Enns and Di Lollo, 2000). Such a “backward in time” effect can occur in some open loop scenarios, for example if perception depended on the integration of two processes, one fast and one slow, such that the fast process activated by the mask would interfere with the slow process activated by the target (Breitmeyer and Ogmen, 2000). Experimental data, however, were found to be inconsistent with such open loop schemes, while supporting a dependency of perception on closed loop (“re-entrant”) mechanisms, in which the stimulus is repetitively sampled (Enns and Di Lollo, 2000). The dependency on repetitive sampling strongly challenges the assumption that the ‘virtual knife’ does not interfere with the natural process of perception. Backward masking indicates that flashed stimuli allow, at best, an examination of the first step of a perceptual process, as explained below (see CLP propositions).

Perception involves motor-sensory convergence

Perception takes time – typical perceptual epochs last hundreds of milliseconds. The first wave of sensory-driven neuronal activity typically reaches most of the relevant cortical areas within ~100 milliseconds, and quick saccadic reports on the crude category of the perceived item can be generated as fast as 150 milliseconds after stimulus onset (Wu et al., 2014). Yet, the identification of more delicate categories and the perception of item details take typically hundreds of milliseconds from first sensor-object encounter, a period during which perceptual acuity continuously improves (Micheyl et al., 2012; Packer and Williams, 1992; Saig et al., 2012). Consistently, scalp EEG recordings reveal that perceptual thresholds are correlated with neuronal activities that are recorded after the first transient neuronal response (Censor et al., 2009).

Careful analyses of rodent and human behavior during tactile perception reveal signatures of a converging process. Object features, such as location and texture, are perceived via a sequence of sensor-object interactions whose motor and sensory variables show a pattern of convergence towards asymptotic values (Chen et al., 2015; Horev et al., 2011; Knutsen et al., 2006; McDermott et al., 2013; Saig et al., 2012; Saraf-Sinik et al., 2015; Voigts et al., 2015). This behavior is consistent with previous descriptions of perception as a dynamic process (Ahissar and Vaadia, 1990; Ashby, 1952; Kelso, 1997; O'Regan and Noe, 2001; Port and Van Gelder, 1995; Powers, 1973; Wiener, 1949), but not with an open-loop one. Converging dynamics, i.e., dynamics during which the state of the entire system gradually approaches a steady state, are hallmarks of closed-loops – an open-loop system does not converge as a whole. Thus, while the OLP doctrine could accept neuronal convergence in local circuits, it cannot account for perceptually-relevant MSM converging dynamics.

Hypothesis and Results

The closed-loop perception (CLP) hypothesis

Here we propose a closed-loop scheme of perceptual acquisition, and suggest to refer to it as a possible alternative to the OLP doctrine. Within the scope of this paper we describe the acquisition of information about the organism’s immediate environment and do not address the interactions between perceptual acquisition and perceptual report. The CLP scheme is consistent with the same data challenging OLP, primarily because it considers sensor motion as an integral part of perception rather than as a factor that needs to be corrected for. We propose to continue comparing the two alternative schemes on equal grounds against accumulating data, and for aiding such a comparison we list potentially discriminative experiments towards the end of this article.

The CLP hypothesis is based on the following assumptions

  1. Sensation is normally active. Sensory organs obtain information about external objects via active interactions with the physical attributes of the object.

  2. MSM-loops are fundamental units of mammalian perception. These loops, as every closed loop, can approach lag-less, steady states. During steady-states all changes in the loop are fully predictable and the loop functions as one unit, with no beginning or end and with no causal order; changes in one component of the loop cannot be considered as lagging or leading changes in any other component of the loop.

  3. There are two basic types of MSM-loops (Figure 4C, left). The first uses proprioceptive (re-afferent; Figure 4C, green) signals to monitor sensor state. Such loops are always closed; that is, information about the sensor state is always conveyed back to the rest of the loop. Importantly, these loops can also sense external features in a rough way (Berryman et al., 2006), probably via sensing significant deviations between intended and actual sensor kinematics. The other type uses sensory signals to directly monitor features of external objects (ex-afferent; Figure 4C, magenta). The receptors of “ex-afferent loops”, i.e., loops that contain ex-afferents, do not respond to sensor movement per-se, but to sensor interactions with external objects. These loops remain open if no object exists in the external field scanned by the sensor. They will be closed (i.e., meaningful neuronal activity will flow along the loop) only through interactions of the sensory organ with specific external features to which their receptors are responsive. For example, whisker contacts with external objects activate a family of vibrissal mechanoreceptors that otherwise would remain silent ("Touch cells") (Knutsen and Ahissar, 2009; Szwed et al., 2003) - the loops containing these neurons will be closed only through the interaction of whiskers with an external object present in the field of whisking (Figure 4C). Similarly, most photoreceptors are activated by luminance changes and thus would remain silent when the eye rotates against a uniform background. Visual ex-afferent MSM-loops are thus likely to be closed only via the existence of specific optical features in the visual field.

CLP propositions

The following set of propositions is consistent with our assumptions and defines a hypothesis for perception.

  1. Perception (of external feature(s)) ≡ a process of inclusion in MSM-loop(s). During this process the entire MSM-loop, including its muscles, receptors and neurons, and with the external feature being included, converges towards a steady-state. Had the loop, with the external feature included, reached steady-state, that feature could be considered as being “directly perceived” by the loop, with no mediation and no delay. However, as such a steady-state is an idealized state in which nothing new is perceived, MSM-loops never reach the absolute steady-states. Rather, they rove dynamically between being perturbed (by external or internal processes) and approaching steady-states.

Our hypothesis thus asserts that a given percept is associated with a given steady-state of the motor-sensory-neuronal variables space. This steady-state can be referred to as the IvR of the relevant feature or object. The steady-states can be of various types: a fixed point in the motor-sensory-neuronal space, a closed trajectory within this space (limit-cycle) or a chaotic attractor. We name these attractors perceptual attractors (Freeman, 2001; Kelso, 1997; Port and Van Gelder, 1995) since perceiving according to our hypothesis is equivalent to converging towards one such specific attractor in the relevant motor-sensory-neuronal space. A crucial aspect of such an attractor is that the dynamics leading to it encompass the entire relevant MSM-loop and thus depend on the function transferring sensor motion into receptors activation; this transfer function describes the perceived object or feature via its physical interactions with sensor motion. Thus, ‘memories’ stored in such perceptual attractors are stored in brain-world interactions, rather than in brain internal representations (see also Dreyfus, 2002; Merleau-Ponty, 1962; O'Regan, 1992).

During the dynamic convergence process the state of the entire MSM-loop (with the external feature included) gradually approaches a steady-state. This can be illustrated by the dynamics of an internal variable, termed here “perceptual confidence” (Cj, where j indicates the perceived feature), whose maximal value is obtained at steady-state (Figure 5A). Cj starts to build up upon the first interaction with the object and gradually increases towards the steady-state asymptote as additional interactions occur (see thalamo-cortical correlates of such a process in Figure 6 of Yu et al., 2015). This convergence process allows for partial perception (e.g., binary classification) to occur even with very brief presentations of external stimuli (VanRullen and Thorpe, 2001) (Figure 5A, red mark).

According to CLP, thus, artificially flashed stimuli initiate a perceptual process, and provide some perceptual information, but do not allow further accumulation of perceptual information as would normally occur with natural stationary objects (Figure 5A). CLP thus predicts that, although the percepts evoked by flashed stimuli can be robust, they would typically include significantly less information than the information actively acquired from continuously-present objects during typical perceptual epochs. Within the CLP scheme, psychophysical data obtained with flashed stimuli are valuable for assessing the degree of convergence that can be reached upon a single interaction with the object, and its reportable resolution.

We leave the details of the generation of the confidence signal, Cj, outside the scope of this article. Yet, for the sake of clarity, we outline here one possible mechanism, which is based on internal models (Anderson et al., 2012; Gordon and Ahissar, 2012; Kawato, 1999; Lalazar and Vaadia, 2008; Nijhawan and Kirschfeld, 2003; Wolpert et al., 1998). Internal models implement simulations of the interactions of the brain with the external world, simulations that are tightly coupled to the actual interactions. A continuous comparison of the predictions of internal models with the signals resulting from the actual interactions can provide a measure of the deviation of the actual convergence process from an expected one - the closer the actual and simulated processes the higher the confidence. If internal models are also continuously updated along with the developing history of the organism, as usually assumed, they can provide a close estimation of Cj. Internal models are often hypothesized to be implemented via cerebro-cerebellar, basal-ganglia, or thalamo-cortical loops; in principle, internal models affiliated with different MSM-loops can be implemented via different brain areas or circuits. Furthermore, the internal models are likely active players in the operation of the MSM-loop and its convergence dynamics, which is consistent with reports of neuronal signals that are involved in both perceptual processing and perceptual confidence (Fetsch et al., 2014).

The converging process is expected to end by another external perturbation, by reaching a certain level of Cj (as in bounded evidence accumulation, Shadlen and Kiani, 2013), by the passage of a certain time interval or by an overriding or coordinated operation of another MSM-loop. The guiding principle of brain-object disengagement, when controlled by the brain, is likely to be based on information gain – when subsequent interactions are expected to provide relatively little relevant information, the brain would typically detach from the perceived feature or object and orient its MSM-loops towards other features or objects (Creutzig et al., 2009; Horev et al., 2011; Little and Sommer, 2013; Polani, 2009; Saig et al., 2012). In principle, modeling of loop disengagement can follow the modeling of decision making dynamics (Gold and Shadlen, 2007; Shadlen and Kiani, 2013) and dynamic perception (Kelso, 1997), targeted to entire MSM-loops rather than to local circuits and assuming active, self-induced sampling of evidence.

  2. Perception of an external object ≡ a coordinated process of inclusion in a collection of MSM-loops. An individual MSM-loop is assumed here to typically perceive an individual feature. An ‘object’ is a certain set of such features, proposed here to be a set that is delineated by a coordinated convergence process. As the dynamics of such multiple-loop convergence are beyond the scope of this article, we would only mention that they should depend on two major processes. One is a binding process in which the loops share information - one candidate vehicle for inter-loop binding is a link established by fast frequency oscillations (Fries et al., 2007; Tallon-Baudry and Bertrand, 1999), as they allow several inter-loop iterations per each motor-sensory-motor iteration. The second is a selection process (Humphries et al., 2007; Prescott et al., 2006) that determines the control over the sensory organ. This selection process is not unique to ‘within object’ loops – it should operate constantly, as naturally more than one MSM-loop is expected to be functional at any given time. We consider two, not mutually exclusive, major schemes of control selection. In one, every loop controls a sub-set of the muscle units attached to the sensory organ (e.g., Takatoh et al., 2013). In the other, there is a dynamic selection of the MSM-loop(s) that control sensor-object interactions at any given moment. This process can be implemented by a variety of architectures, including subsumption-like (Brooks, 1986), hierarchical curiosity (Gordon and Ahissar, 2012; Gordon et al., 2013; Gordon et al., 2014) and others (Arkin, 1998). The binding between the loops is expected to break at the end of the perceptual epoch, upon the disengagement of one or more of the loops from their external features.

Hierarchical dynamics of MSM-loops can be illustrated by considering a visual scanning of an object or a scene or a tactile scanning of a surface (Figure 2). For example, when looking at an object or a scene the eyes saccade through a sequence of fixation areas, following a trajectory that is often termed “scanpath” (Ko et al., 2010; Noton and Stark, 1971; Walker-Smith et al., 1977; Yarbus, 1967), and drift around within each fixation area for several hundreds of milliseconds (Ahissar et al., 2014; Rucci et al., 2007; Steinman and Levinson, 1990). The scanpath trajectory, which moves the visual gaze from one region of interest to another, is considered in our scheme to be part of converging dynamics in one level of MSM-loops, and the local drift scanning trajectories, which acquire local visual details (Ennis et al., 2014), are considered to be parts of converging dynamics of MSM-loops at lower levels (Ahissar et al., 2014). Moving on from a given fixation area depends on the Cj obtained at that area by the lower loops, on the perceptual dynamics of the scanpath loop, on variables of still higher loops depending on the context, task and brain state and on changes in the external object or scene.

  3. Perceptual time is determined by the MSM-loop’s cycle time. Physical time is unlikely to have a neuronal metric, or ‘yardstick,’ enabling its direct measurement. In contrast, a yardstick that is available for each MSM-loop is its own cycle time, which can be sensed by each of its components. Durations of external events can be measured by the counts of such ‘ticks’ (Ahissar, 1998). In this case, the resolution of perceptual time is the loop cycle time; events occurring within one cycle are considered simultaneous (Poppel, 2004). A possible relationship between physical and perceptual times can be described using a helix metaphor (Figure 5B). The helix should be considered flexible in its ‘perceptual axis’, being affected by the state of the perceiving loop. As changes in the loop’s cycle time can also be sensed by neurons (Ahissar, 1998; Ahissar and Vaadia, 1990; Buonomano and Merzenich, 1995), online calibration between perceptual and physical time is possible to some extent. The assessment of physical time by an MSM-loop is predicted here to depend on the loop cycle time, which of course can change according to the perceptual scenario.
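The tick-counting idea in the last proposition can be made concrete with a small sketch. The following Python fragment is only an illustration under simplifying assumptions: the cycle time is held fixed within an epoch (the text notes that it can change), times are given in milliseconds, and the function names are hypothetical rather than taken from the article.

```python
import math


def perceptual_tick(t_physical_ms, cycle_time_ms):
    """Index of the loop cycle ('tick') in which a physical time falls."""
    if cycle_time_ms <= 0:
        raise ValueError("cycle_time_ms must be positive")
    return math.floor(t_physical_ms / cycle_time_ms)


def perceived_duration(t_on_ms, t_off_ms, cycle_time_ms):
    """Duration of an external event, measured as a count of loop ticks."""
    return (perceptual_tick(t_off_ms, cycle_time_ms)
            - perceptual_tick(t_on_ms, cycle_time_ms))


def perceived_simultaneous(t_a_ms, t_b_ms, cycle_time_ms):
    """Two events are treated as simultaneous if they fall within the same cycle."""
    return (perceptual_tick(t_a_ms, cycle_time_ms)
            == perceptual_tick(t_b_ms, cycle_time_ms))
```

Under this reading, the same physical interval yields a different tick count when the loop runs with a different cycle time, which is one way to picture the flexibility of the ‘perceptual axis’ of the helix.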

Corollaries of the CLP hypothesis

Major corollaries of the CLP propositions are:

  1. An individual MSM-loop is the elemental unit of perception, namely is both necessary and sufficient for perception (of at least one external feature) to emerge in natural conditions. Thus, any reductionist study of perception must include at least one MSM-loop.

  2. Nested MSM-loops can present different dynamics simultaneously. A higher-order loop can perceive (i.e., include) a scene at (close to) a steady-state, while lower-order loops dynamically rove along their perturbed – steady-state axis. Thus, an environment (e.g., a room) can be perceived in (close to) a “direct” manner by higher loops for the entire period in which its details are sequentially scanned by lower-order loops.

  3. Perception is a continuous dynamic and interactive process and not a momentary event (Cleeremans and Sarrazin, 2007; Edelman and Tononi, 2001); during the perceptual process, a percept gradually emerges.

  4. Perception is associated with changes in brain dynamics rather than with the construction of invariant internal representations. Given that sensor movements are never identical, and in fact vary significantly between perceptual epochs even when objects and contexts are constant (e.g., Knutsen et al., 2006; Saraf-Sinik et al., 2015), what remain invariant are the relationships between the variables of the entire MSM-loop(s) (see also Merleau-Ponty, 1962; O'Regan and Noe, 2001). Individual neuronal variables anywhere in the brain are unlikely to remain invariant (Rokni et al., 2007).

  5. Perceptual time is determined by the dynamics of the relevant MSM-loops and thus depends on the perceived environment. Also, within one loop cycle period, changes in external features and internal processes occur at different physical times but at the same perceptual time.

  6. Perception is not necessarily conscious. The brain can perceive external features by loops that are not accessible, at that moment, to conscious report. Thus, conscious perception is defined here as one category of perception.

CLP dynamics.

(A) The dynamics of perception of an individual feature by an individual MSM-loop follows a convergence pattern. The loop starts converging towards its steady-state (in which state perception is complete and “direct”) upon the first interaction with the object, whether active or passive (e.g., a flashed stimulus). The confidence of perceiving feature j (Cj) gradually increases during convergence. The loop may quit the process when Cj becomes larger than a certain internal threshold (Cd) or upon an internal or external perturbation. (B) The relationships between physical and perceptual time during CLP convergence are presented via a spiral metaphor, in which the physical time can be measured along the spiral, and the perceptual time can be measured across the spiral, e.g., by counting the number of activations of a given point along the loop. A steady-state can be reached at some point along the process.

https://doi.org/10.7554/eLife.12830.007

CLP mathematical framework and models

One natural choice of a mathematical framework for CLP is the framework of dynamical systems (Kelso, 1997; Port and Van Gelder, 1995). Within this framework each MSM-loop is modeled as a dynamical system that includes motor, sensory and neuronal variables, as well as the differential equations which describe their relations. The following is a general mathematical description of such a model (Figure 6A):

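$$
\begin{aligned}
\bar{s} &= f(\bar{m}, u) \\
\dot{\bar{n}} &= g(\bar{n}, \bar{s}) \\
\dot{\bar{m}} &= h(\bar{m}, \bar{n})
\end{aligned}
$$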

The bars above the letters indicate that they represent a vector (of one variable or more). g and h are functions describing the intrinsic dynamics of the variables ($\bar{n}$ and $\bar{m}$, respectively) and their dependency on the variables in the preceding stations of the loop ($\bar{s}$ and $\bar{n}$, respectively). The sensory variables ($\bar{s}$) do not depend on their intrinsic dynamics in this formalization, which assumes short sensory time constants; they are determined by the motor variables ($\bar{m}$) and the state of the environment (u), according to the function f. The function f encapsulates the physical laws governing the sensory organ-environment interactions and the transduction of physical signals to neuronal ones.

The state of the system is defined as the vector containing all the variables ($\bar{m}$, $\bar{s}$, $\bar{n}$). Perception is achieved through the convergence of the system to a steady-state within this state-space. The information of the perceived feature is contained in the values of the dynamic variables (the system’s state) at this steady-state. High-level functions such as integration of the general context or a report mechanism are not included in this model of single-feature acquisition.
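As an illustration of this template, the following minimal Python sketch integrates a scalar MSM-loop with forward Euler steps. The particular functions f_toy, g_toy and h_toy, the step size and all names are assumptions made for the example and are not part of the model described in this article; they were chosen so that the loop settles on a fixed point at which the motor variable equals the environmental feature u.

```python
def simulate_msm_loop(f, g, h, m0, n0, u, dt=0.01, steps=4000):
    """Forward-Euler integration of s = f(m, u), dn/dt = g(n, s), dm/dt = h(m, n).

    Scalars are used for brevity; the template in the text allows vectors.
    Returns the list of visited states as (m, s, n) tuples.
    """
    m, n = m0, n0
    trajectory = []
    for _ in range(steps):
        s = f(m, u)          # sensory variable: set by motor state and environment
        trajectory.append((m, s, n))
        dn = g(n, s)         # neuronal dynamics driven by the sensory variable
        dm = h(m, n)         # motor dynamics driven by the neuronal variable
        n += dt * dn
        m += dt * dm
    return trajectory


# Hypothetical toy loop (an assumption for illustration only):
def f_toy(m, u):
    return u - m             # sensed mismatch between feature and sensor state


def g_toy(n, s):
    return -n + s            # leaky integration of the sensory signal


def h_toy(m, n):
    return n                 # neuronal variable sets the motor velocity


if __name__ == "__main__":
    for u in (0.3, 0.8):
        m, s, n = simulate_msm_loop(f_toy, g_toy, h_toy, 0.0, 0.0, u)[-1]
        print(f"u = {u}: m = {m:.4f}, s = {s:.4f}, n = {n:.4f}")
```

In this toy loop, different values of u lead to different steady states, so the feature can be read from the state on which the loop settles rather than from any single sensory sample.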

Synthesis of closed-loop perception in a robotic setup.

(A) A sketch of the MSM-loop model template. m, motor variable; SO, sensory organ; s, sensory variable; n, neuronal variable; h, f, g, transfer functions; u, the environment dynamics. The arrows depict the direction of information flow within the loop. (B) The SYCLOP robotic platform. A sketch of the robot with its different components: Pan-Tilt control unit (PTCU, only the pan axis was used here) (1), DVS camera (2), and desktop computer (3). The computer sends commands to the PTCU which controls the camera’s rotations in the azimuth (θ) and elevation (ε) axes. The DVS camera sends visual ‘on’ and ‘off’ events to the computer. (C-F) Implementation of a specific contrast perceiving CLP model (see text). (C,D) n1, the integrated difference between ‘ON’ and ‘OFF’ events, and ω, sensor angular velocity along the pan axis, (C and D, respectively) as a function of time in two different runs of the CLP algorithm, one facing a contrast of 0.9 (red) and one facing a contrast of 0.5 (green). n1 is scaled in units of 1000 events. (E) System’s trajectories in the 2D n1-ω state space. Same data as in C and D. (F) Example of emergent smooth-pursuit like behavior when using a moving edge as a stimulus. The trajectory of the system in the n1-ω plane (gray line) overlaid on a heat map where the color of each segment corresponds to the amount of time in seconds the system spent within this segment. The smooth pursuit periods are represented by the white and light red squares. While in a smooth pursuit, the camera was moving with a constant angular velocity – smoothly tracking the edge.

https://doi.org/10.7554/eLife.12830.008

Synthesis of CLP in a robotic setup

One way to test such CLP models and demonstrate their basic behavior is to implement them using a synthetic agent. We built a simple robot for this purpose; the robot (SYCLOP: SYnthetic Closed-LOop Perceiver) includes two motors, one sensor and their bilateral connections (Figure 6B). This platform allows the implementation of minimal MSM-loops based models (one motor DOF and one sensor). The SYCLOP uses a biomimetic camera (DVS128, iniLabs Ltd Zurich, Switzerland, Lichtsteiner et al., 2008) as its sensor; this camera, like a retina, sends signals only upon luminance intensity changes. The camera is mounted on a pan-tilt control unit (PTU-46-17, DirectedPerception, CA, USA). The motor-to-sensory connection is implemented by moving the camera along the pan-tilt axes while the sensory-to-motor connection is implemented by a computer that implements the model's equations.

The SYCLOP platform was used, for example, to implement and test the behavior of a single MSM-loop model which was designed to perceive a visual contrast. The stimulus, in this case, was presented on a computer screen: half of the screen was kept dark and on the other half a uniform grayscale surface was displayed. The grayscale values ranged from dark to white. We defined two sensory variables, $r_{on}$ and $r_{off}$: the rate of ‘ON’ events (single-pixel events in which the luminance intensity increased) and the rate of ‘OFF’ events (single-pixel events in which the luminance intensity decreased), both integrated over the entire camera’s field. The characterization of the dependency of these two sensory variables on the chosen motor variable (sensor angular velocity along the pan axis, ω) and the external feature (contrast, γ) resulted in the following equation:

$$
r_{on} - r_{off} = C_1 \gamma \omega
$$

Where C1 represents a constant and noise is ignored. The MSM-loop model is completed by the addition of two transfer functions that define two differential equations, sensory-to-neuronal (g) and neuronal-to-motor (h):

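$$
\begin{cases}
\dot{n}_1 = g(n_1, r_{on}, r_{off}) = C_2 (r_{on} - r_{off}) = C_2 C_1 \gamma \omega \\
\dot{\omega} = h(\omega, n_1) = \frac{1}{C_3}\left(\mu \left(1 - C_4 n_1^2\right) C_3 \omega - n_1\right)
\end{cases}
$$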

Where n1 is defined as the (single) neuronal variable, which integrates the difference between $r_{on}$ and $r_{off}$ (Demb and Singer, 2012), ω is the (single) motor variable defined as the sensor’s angular velocity and C2, C3 and C4 represent constants. The functions g and h were chosen such that the resulting dynamical system would be equivalent (up to multiplication by constants, assuming all constants and parameters are positive) to a Van der Pol oscillator (Kanamaru, 2007). This specific system was chosen due to its known dynamics: the system converges to a single closed trajectory within its 2D phase plane (i.e., a limit cycle) independently of the initial values of the variables. After convergence each of the dynamic variables is a periodic function of time (e.g., Figure 6C and D). Clearly, other dynamical systems could fit as well.
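The equivalence to a Van der Pol oscillator can be checked by combining the two first-order equations into one second-order equation for n1. The short derivation below reads the system as $\dot{n}_1 = C_2 C_1 \gamma \omega$ and $\dot{\omega} = \frac{1}{C_3}(\mu(1 - C_4 n_1^2) C_3 \omega - n_1)$ and assumes that γ is constant during the perceptual epoch; the symbols k, Ω0, x and τ are introduced here only for the derivation.

```latex
% Let k = C_1 C_2 \gamma, so that \dot{n}_1 = k\,\omega.
\begin{aligned}
\ddot{n}_1 &= k\,\dot{\omega}
            = \frac{k}{C_3}\left(\mu\,(1 - C_4 n_1^2)\,C_3\,\omega - n_1\right) \\
           &= \mu\,(1 - C_4 n_1^2)\,\dot{n}_1 - \frac{k}{C_3}\,n_1 ,
\end{aligned}
\qquad\Longrightarrow\qquad
\ddot{n}_1 - \mu\,(1 - C_4 n_1^2)\,\dot{n}_1 + \frac{C_1 C_2 \gamma}{C_3}\,n_1 = 0 .

% Rescaling x = \sqrt{C_4}\,n_1 and \tau = \Omega_0 t with
% \Omega_0 = \sqrt{C_1 C_2 \gamma / C_3} gives the standard form
\frac{d^2 x}{d\tau^2} - \frac{\mu}{\Omega_0}\,(1 - x^2)\,\frac{dx}{d\tau} + x = 0 .
```

In this form the contrast γ enters through Ω0, that is, through both the time scale and the effective nonlinearity μ/Ω0 of the oscillator.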

This model was implemented on the SYCLOP platform with the aid of a C program running on the computer incorporated in the platform (Figure 6B, item 3). The program received the ON and OFF events from the DVS camera, computed the $r_{on}$ and $r_{off}$ sensory variables and used them to compute the values of n1 and ω by integrating the two differential equations described above. The value of ω was then sent by the program to the pan-tilt controller and modified the camera’s pan velocity. This implementation illustrates a simple CLP convergence process (Figure 5) and shows how different percepts can be differentiated in CLP. The convergence dynamics involves different dynamics of the sensory ($r_{on}$ and $r_{off}$), neuronal (n1, Figure 6C) and motor (ω, Figure 6D) variables. Yet, the variables are strongly linked, as demonstrated by the phase diagram of the neuronal and motor variables (Figure 6E); these two variables quickly converge to a limit cycle (i.e., a constant closed trajectory in the phase plane). Similar behavior is observed in the other phase planes (sensory-motor and sensory-neuronal, not shown). Importantly, in all these phase planes the limit cycle depends on the external contrast (γ); while maintaining all loop parameters constant, a monotonic change in γ results in a corresponding monotonic change of the limit cycle (green and red trajectories in Figure 6E for contrasts of 0.5 and 0.9, respectively). Hence, the image’s contrast can be inferred from the asymptotic behavior of the system or, in other words, the motor-sensory-neuronal trajectory that is uniquely associated with (or, equivalently, the CLP’s IvR of) a given contrast can be “retrieved” by the presentation of that contrast to the perceiver.
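The loop described above can also be explored in simulation. The sketch below is not the program that ran on the SYCLOP; it replaces the camera with the characterized relation between the ON/OFF rate difference and C1γω, sets all constants to illustrative values (C1 = C2 = C3 = C4 = 1 and μ = 0.5, all assumed), and integrates the two equations with a fourth-order Runge-Kutta step. All function names are hypothetical.

```python
def contrast_loop_derivatives(n1, omega, gamma, c1, c2, c3, c4, mu):
    """Right-hand sides of the contrast-perceiving loop equations."""
    r_diff = c1 * gamma * omega          # simulated sensor: r_on - r_off
    dn1 = c2 * r_diff                    # neuronal variable integrates the difference
    domega = (mu * (1.0 - c4 * n1 ** 2) * c3 * omega - n1) / c3
    return dn1, domega


def run_contrast_loop(gamma, n1=0.1, omega=0.0, dt=0.01, t_end=200.0,
                      c1=1.0, c2=1.0, c3=1.0, c4=1.0, mu=0.5):
    """Integrate the loop with RK4 steps and return the (n1, omega) samples."""
    def deriv(n, w):
        return contrast_loop_derivatives(n, w, gamma, c1, c2, c3, c4, mu)

    samples = []
    for _ in range(round(t_end / dt)):
        k1 = deriv(n1, omega)
        k2 = deriv(n1 + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
        k3 = deriv(n1 + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
        k4 = deriv(n1 + dt * k3[0], omega + dt * k3[1])
        n1 += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        omega += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        samples.append((n1, omega))
    return samples


def limit_cycle_amplitudes(samples, tail_fraction=0.25):
    """Peak |n1| and |omega| over the final part of a run (after convergence)."""
    tail = samples[int(len(samples) * (1.0 - tail_fraction)):]
    return max(abs(n) for n, _ in tail), max(abs(w) for _, w in tail)


if __name__ == "__main__":
    for gamma in (0.5, 0.9):
        n_amp, w_amp = limit_cycle_amplitudes(run_contrast_loop(gamma))
        print(f"contrast {gamma}: max |n1| = {n_amp:.2f}, max |omega| = {w_amp:.2f}")
```

With these assumed constants, runs started from different initial values settle on the same closed trajectory for a given contrast, while the two contrasts settle on distinct trajectories: the peak of ω is smaller for the higher contrast. The numerical values are not meant to match Figure 6, which was obtained with the physical camera and its own constants.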

The behavior of the SYCLOP is described here in order to demonstrate what a possible implementation of our CLP model would look like. Interestingly, the SYCLOP also exhibits behaviors that it was not intentionally designed to exhibit, for example a smooth-pursuit behavior. When presented with a moving image (back and forth horizontal movement of the contrast image at a constant speed) SYCLOP tended to track the image smoothly in each direction (as indicated by the “dwelling spots” at ω ≈ −5 and 5 deg/s; Figure 6F).

Discussion

Summary of the CLP hypothesis

CLP suggests that perception of the external environment is a process in which the brain temporarily ‘grasps’ external objects and incorporates them in its MSM-loops. Such objects become virtual components of the relevant loops, hardly distinguishable, as long as they are perceived, from other components of the loop such as muscles, receptors and neurons. What primarily distinguishes external objects from body parts are inclusion duration and state; short and transient inclusions mark external objects while long and steady inclusions mark body parts (see also Uexkull, 1926). Interestingly, the perceptual dynamics suggested by this hypothesis reconciles a conflict between objective scientific observations and the subjective everyday experience of perceiving objects with no mediation (see also A philosophical angle below). Everyday perception of a given external object, CLP suggests, is the dynamic process of inclusion of its features in MSM-loops. This process starts with a perturbation, internal or external, and gradually converges towards a complete inclusion - approaching, although never reaching, a state of “direct” perception. A laboratory-induced flashed stimulus, according to this model, probes the initiation of a perceptual process, whereas dreaming and imagining evoke internal components of the process.

Contrasting OLP and CLP – discriminatory testable predictions

We consider here all versions of OLP, i.e., all versions of hypotheses in which perception does not depend on the integrity of the MSM-loop and its closed-loop dynamics within individual perceptual epochs. We consider here two major OLP classes: in one, sensory OLP (sOLP), the movement of the sensory organ is not an essential component of perception, and in the other, motor-sensory OLP (msOLP), it is (Figure 7). sOLP thus assumes that IvRs are confined to the brain (i.e., they are specific NRs) and can be fully retrieved by sensory activations alone when the sensor is passive. msOLP, in contrast, postulates that IvRs are not confined to the brain, and can form the basis for perception only if they include the relevant MS-contingencies (Figure 7). According to msOLP, IvRs cannot be retrieved with passive sensory organs. Importantly, however, msOLP does not assume a motor-sensory-motor loop; that is, its scheme includes a motor-to-sensory arc but not a sensory-to-motor arc (Figure 7). Hence, with msOLP, movements of the sensory organ are predetermined for each perceptual epoch and are not affected by the ongoing sensory input during that epoch. In contrast to the OLP hypotheses, using the same representational terminology, CLP postulates that IvRs can form the basis for perception only if they contain the dynamics and state of the entire MSM-loop including the relevant features of the object (Figure 7). Thus, the minimal set of variables that must be included in the IvR of each object, or feature, is different for each hypothesis (Figure 7, bluish ellipses): internal-only sensory variables in sOLP, internal sensory variables and MS-contingencies in msOLP and the entire perceiving loop in CLP.

Functional connectivity and essential elements of perceptual schemes.

The essential elements in each scheme are indicated by solid curves and blue titles. MSM, motor-sensory-motor; MS, motor-sensory; S, sensory; NR, neuronal representation; green curves, re-afferent related pathways; magenta curves, ex-afferent related pathways. Note that re-afferent related pathways can form closed-loops with their sensory organs also in OLP schemes (dashed curves). Arrows indicate optional whisker (black) or object (magenta) movement; solid arrows indicate movements that are essential for perception; in the sOLP scheme none of the movements is essential in itself, but it is essential that at least one of them will occur in order to activate the receptors. Appropriate experimental paradigms are indicated by green titles; CLP and msOLP schemes can be studied only via active sensing paradigms. The minimal sets for invariant representations (IvRs) of external features, i.e., the components that must be included in any IvR according to each perceptual scheme, are marked by the bluish ellipses. sOLP: internal, sensory only NRs. msOLP: sensory NRs + motor-object-sensory contingencies. CLP: entire motor-object-sensory-motor loops.

https://doi.org/10.7554/eLife.12830.009

Perhaps the first question that comes to mind when considering msOLP and CLP is whether paralyzed subjects perceive stationary (i.e., not flashing or moving) objects similarly to non-paralyzed subjects. If they do - here go the msOLP and CLP hypotheses. Unfortunately, however, this is not a trivial test. Note that the paralysis must include the relevant sensory organ and the object must be entirely stationary. In the case of touch it should be evident that while contacts may be detectable, no object perception is possible with paralyzed hands – we are not aware of any study contradicting this conjecture. In contrast, our intuition regarding hearing is that action is not a fundamental requirement for hearing. Yet, two important points are relevant here. First, our intuition may be misled by the fact that we cannot be aware of motor activation of the outer hair cells and the muscles of the middle ear – we are not aware of perceptual experiments in which these activations were blocked, or measured. Second, no stationary object exists in audition. Acoustic waves are always dynamic and always activate the inner hair cells. This makes auditory sensation less dependent on self-motion, a fact that indeed may put audition in a motor-sensory regime that is distinct from those of touch and vision.

Regarding vision, we are aware of only one study analysing visual perception in a congenital ophthalmoplegic patient, a patient who had no eye movements since birth; in this case, the patient developed a pattern of head movements that resembled that of natural eye movements, only at a slower rhythm (Gilchrist et al., 1997). This adaptation clearly indicates the need for active sensation in visual perception, at least in that patient. Natural employment of active vision is indicated by the "weird, confusing and indescribable" forms of perceptions reported during acute partial paralysis of the ocular muscles (Stevens et al., 1976). These data are certainly not consistent with sOLP. Yet, these data, as well as part of the OLP-challenging data presented above, may still be consistent with msOLP. The distinction between msOLP and CLP hypotheses is thus more demanding, and requires specifically designed experiments.

We describe here examples of potentially discriminative experiments in three categories.

  1. The motor-to-sensory arc. The following manipulations are predicted to impair perception according to msOLP or CLP but not according to sOLP: (i) Paralysis of the sensory organ while keeping the sensory flow unimpaired. (ii) Replacing continuous presentation of an object with a series of one or more brief presentations (flashes) while keeping the total stimulus time and/or energy equal.

  2. The sensory-to-motor arc. The following manipulations are predicted to impair perception according to CLP but not according to sOLP or msOLP: (i) Limiting or forcing sensor movement trajectory via instructions in humans or interventions in rodents. For example, asking humans to scan a scene according to verbal instructions or by pursuing a target, or moving the sensory organ according to a trajectory that was recorded in a previous active session. (ii) Allowing active touch but with the motion of one hand determining the sensory flow to the other hand. (iii) Perturbing neuronal specific sensory-to-motor pathways, such as those connecting the sensory cortex to the motor cortex (Colechio and Alloway, 2009), those connecting sensory cortex to motor nuclei (typically via layer 5B neurons), or those connecting the thalamus to motor (cortical and sub-cortical) stations (Smith et al., 2012) (Figure 4A). The exact design should depend on available genetic markers and the testing of these predictions should be conducted in a balanced way, using appropriate sham perturbations. The following observations are predicted by CLP but not by sOLP or msOLP during natural perception: (iv) The motion trajectories of the sensory organ will differ for different object features (expected from affective sensory-to-motor connections). (v) The motion of the sensory organ will depend on the concurrent sensory input; for example, when a rat perceives an object’s shape or texture, the movement trajectory of its whiskers will depend on the sequence of curvatures and stick-slip events preceding it within the same perceptual epoch. Similarly, the motion trajectory of the eye will depend on the retinal activations preceding it within the same perceptual epoch.

  3. Motor-sensory-motor convergence. The following observations are predicted by CLP but not by sOLP or msOLP: (i) The movement trajectory of the sensory organ will show convergence dynamics, i.e., gradual approach to a steady-state pattern, during natural perception. (ii) Convergence will be to different steady-state patterns while perceiving different features or values. (iii) Specific steady-state patterns will be associated with specific perceptual reports. (iv) Convergence dynamics can predict perceptual report timing and/or error. (v) A virtual object can be perceived when sensory neuronal activity is manipulated to mimic the activity expected from the movement of the sensory organ and the presence of a real object (O'Connor et al., 2013); with reliable mimicry a convergence process should follow. (vi) CLP predicts that the lag between the actual and perceived times of an external transition (“perceptual lag”) should decrease along the process of perceptual convergence; in steady-state no lag is expected. It has been previously shown that perceptual lags of the onset of transient stimuli are longer than those of continuously-present ones (Nijhawan, 2001). CLP thus predicts that with similar experimental protocols (e.g., a rotating arm is shown continuously and a dot is transiently displayed (flashed) for various durations at various positions traversed by the rotating arm), when looking at the offset of the transient stimulus rather than its onset, the temporal perceptual lag of the offset will decrease with increasing transient durations. (vii) With a motion-induced-blindness (MIB) protocol, in which stationary targets are surrounded by moving background dots, the targets ‘disappear’ occasionally (Bonneh et al., 2001). In one possible implementation of CLP the visual system would control the velocity of retinal image slip, and maintain it within a certain working range, instead of directly controlling drift velocity. This would be achieved by modifying drift speed in a manner that is inversely proportional to the speed of the retinal slip. When the retinal slip is dominated by external motion, such as in MIB, eye drift speed would be reduced significantly. When the drift speed is reduced below a certain level, retinal receptors at corresponding eccentricities may not receive sufficient luminance changes to be activated by the stationary parts of the image. Thus, in MIB conditions in which the drift speed is inversely correlated with the dots’ speed, target disappearance is expected to be preceded by a reduction of the drift speed below a certain threshold; threshold level should depend on the eccentricity of the disappearing target.
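The MIB prediction in (vii) can be illustrated with a toy controller. This is only a sketch under stated assumptions, not the authors' model: speeds are non-negative scalars in arbitrary units, retinal slip speed is taken to be the sum of drift speed and external (dot) speed, and the names `steady_drift_speed`, `target_disappears`, `gain`, `rate` and `threshold` are hypothetical. The controller repeatedly nudges the drift speed towards a value inversely proportional to the current slip speed.

```python
import math


def steady_drift_speed(external_speed, gain=1.0, rate=0.5, steps=200):
    """Iterate a damped controller that pushes drift speed towards gain / slip.

    Assumptions (not from the article): scalar speeds in arbitrary units and
    slip speed = drift speed + external speed.
    """
    drift = 1.0  # arbitrary positive starting drift speed
    for _ in range(steps):
        slip = drift + external_speed        # assumed additive retinal slip
        target = gain / slip                 # inversely proportional to slip speed
        drift += rate * (target - drift)     # damped update towards the target
    return drift


def target_disappears(external_speed, threshold):
    """Toy criterion: the stationary target 'disappears' when the settled
    drift speed falls below an (eccentricity-dependent) threshold."""
    return steady_drift_speed(external_speed) < threshold


# Faster background dots lead to a slower settled drift speed.
for dots in (0.0, 2.0, 10.0):
    print(dots, round(steady_drift_speed(dots), 4), target_disappears(dots, threshold=0.2))
```

In this sketch the settled drift speed solves d = gain / (d + external_speed), so it decreases monotonically as the external speed grows; the threshold argument stands in for the eccentricity dependence mentioned in the text.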

Ideally, the comparison of the behaviors predicted by CLP and OLP, related to the inter-dependencies of motor, object, sensory and report variables, should be done in natural conditions. Practically, as the scientific method enforces reductionist steps, it is important to notice what reductions are allowed, as behavioral predictions of CLP or msOLP, regarding natural perception, cannot be tested in paradigms in which their basic assumptions are “reduced out.” Clearly, if eye or whisker motion is prevented, critical predictions of CLP or msOLP cannot be tested. Experiments in which eye or whisker motion is allowed but head motion is restrained have a limited discriminative power; conclusions in these cases should take into account the possibility that head-restrained animals develop unique compensatory active strategies which may not be indicative of the head-free condition. When MSM-loops are not given enough time to converge, as is the case with passive sensing (e.g., visual flashes), discrimination between CLP and OLP is usually not possible (as both predict partial perception, Figure 5).

A philosophical angle

For at least four centuries the philosophical community, and during the last century also the neuroscience community, have been puzzled by the contrast between objective scientific observations that relate to perception and the everyday subjective experience of perception. What feels direct and immediate to every human perceiver appears indirect and mediated when physical constraints are taken into account (Crane, 2005; Kelso, 1997; Port and Van Gelder, 1995; Ullman, 1980). Our CLP hypothesis proposes a reconciliation of objective scientific observations and subjective everyday experience via closed-loop dynamics between the perceiver and the perceived. Such closed-loops converge gradually to a state in which the perceiver and the perceived are inseparable. The idea is that, although the loops never actually reach an ideal steady-state, they get closer and closer to these states during a perceptual epoch and typically quit the convergence process when the distance from a steady state is barely sensible. Being close enough to the steady state can give rise to the feeling of direct and immediate perception.
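The idea of approaching a steady state without ever reaching it, and quitting once the remaining distance is barely sensible, can be sketched numerically. This is a minimal illustration only: the geometric contraction per cycle, the parameter values and the names `converge`, `contraction` and `barely_sensible` are assumptions made here, not quantities given in the article.

```python
def converge(initial_distance, contraction=0.5, barely_sensible=0.01, max_cycles=100):
    """Shrink the distance to the steady state on every loop cycle and quit
    once it is no larger than the 'barely sensible' level.

    Assumption (not from the article): each cycle reduces the distance by a
    fixed factor, so the distance approaches zero but never equals it.
    """
    distance = initial_distance
    history = [distance]
    while distance > barely_sensible and len(history) <= max_cycles:
        distance *= contraction   # one motor-sensory-motor cycle moves closer
        history.append(distance)
    return history


trajectory = converge(1.0)
print(len(trajectory) - 1, "cycles; final distance", trajectory[-1])
```

With these illustrative values the loop stops after a handful of cycles, at a distance that is small but still positive.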

In practical terms, this article proposes to open the discussion about the phenomenology and mechanisms of perception, and in particular to confront open- and closed-loop schemes. We hope that the set of predictions listed here will serve as a starting point for informative experimental confrontation.

References

  1. 1
  2. 2
  3. 3
    Seeing via miniature eye movements: A dynamic hypothesis for vision
    1. E Ahissar
    2. A Arieli
    (2012)
    Frontiers in Computational Neuroscience, 6, 10.3389/fncom.2012.00089, 23162458.
  4. 4
    On the possible roles of microsaccades and drifts in visual perception
    1. E Ahissar
    2. A Arieli
    3. M Fried
    4. Y Bonneh
    (2016)
    Vision Research, 118, 10.1016/j.visres.2014.12.004, 25535005.
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10
  11. 11
  12. 12
    Behavior-Based Robotics
    1. RC Arkin
    (1998)
    Cambridge, MA: MIT Press.
  13. 13
    Design for a Brain
    1. WR Ashby
    (1952)
    Chapman and Hall / Science Paperbacks.
  14. 14
  15. 15
  16. 16
  17. 17
    The structure and precision of retinal spike trains
    1. MJ Berry
    2. DK Warland
    3. M Meister
    (1997)
    Proceedings of the National Academy of Sciences of the United States of America 94:5411–5416.
    https://doi.org/10.1073/pnas.94.10.5411
  18. 18
  19. 19
  20. 20
  21. 21
  22. 22
  23. 23
    Anatomical pathways involved in generating and sensing rhythmic whisker movements
    1. LW Bosman
    2. AR Houweling
    3. CB Owens
    4. N Tanke
    5. OT Shevchouk
    6. N Rahmati
    7. WH Teunissen
    8. C Ju
    9. W Gong
    10. SK Koekkoek
    11. CI De Zeeuw
    (2011)
    Frontiers in Integrative Neuroscience, 5, 10.3389/fnint.2011.00053, 22065951.
  24. 24
    Whisker encoding of mechanical events during active tactile exploration
    1. Y Boubenec
    2. DE Shulz
    3. G Debrégeas
    (2012)
    Frontiers in Behavioral Neuroscience, 6, 10.3389/fnbeh.2012.00074, 23133410.
  25. 25
  26. 26
  27. 27
  28. 28
    Bayesian model of dynamic image stabilization in the visual system
    1. Y Burak
    2. U Rokni
    3. M Meister
    4. H Sompolinsky
    (2010)
    Proceedings of the National Academy of Sciences of the United States of America 107:19525–19530.
    https://doi.org/10.1073/pnas.1006076107
  29. 29
  30. 30
  31. 31
  32. 32
  33. 33
  34. 34
  35. 35
  36. 36
    Neural coding of tactile texture: Comparison of spatial and temporal mechanisms for roughness perception
    1. CE Connor
    2. KO Johnson
    (1992)
    The Journal of Neuroscience 12:3414–3426.
  37. 37
    The Problem of Perception
    1. T Crane
    (2005)
    The Stanford Encyclopedia of Philosophy, http://plato.stanford.edu/entries/perception-problem/.
  38. 38
  39. 39
    Past-future information bottleneck in dynamical systems
    1. F Creutzig
    2. A Globerson
    3. N Tishby
    (2009)
    Physical Review. E, Statistical, Nonlinear, and Soft Matter Physics, 79, 10.1103/PhysRevE.79.041925, 19518274.
  40. 40
  41. 41
    The active cochlea
    1. P Dallos
    (1992)
    The Journal of Neuroscience 12:4575–4585.
  42. 42
    A neuronal model of a global workspace in effortful cognitive tasks
    1. S Dehaene
    2. M Kerszberg
    3. JP Changeux
    (1998)
    Proceedings of the National Academy of Sciences of the United States of America 95:14529–14534.
    https://doi.org/10.1073/pnas.95.24.14529
  43. 43
  44. 44
  45. 45
  46. 46
  47. 47
    Parallel pathways in the auditory, visual and somatic systems
    1. IT Diamond
    (1983)
    In: G Macchi, A Rustioni, R Spreafico, editors. Somatosensory Integration in the Thalamus. Amsterdam: Elsevier. pp. 251–272.
  48. 48
  49. 49
  50. 50
    A Universe of Consciousness: How Matter Becomes Imagination
    1. G Edelman
    2. G Tononi
    (2001)
    New York, NY: Basic Books.
  51. 51
  52. 52
  53. 53
  54. 54
    Central versus peripheral determinants of patterned spike activity in rat vibrissa cortex during whisking
    1. MS Fee
    2. PP Mitra
    3. D Kleinfeld
    (1997)
    Journal of Neurophysiology 78:1144–1149.
  55. 55
  56. 56
  57. 57
  58. 58
    How Brains Make Up Their Minds
    1. W Freeman
    (2001)
    New York: Columbia University Press.
  59. 59
  60. 60
  61. 61
  62. 62
  63. 63
    Importance of temporal cues for tactile spatial-frequency discrimination
    1. E Gamzu
    2. E Ahissar
    (2001)
    The Journal of Neuroscience 21:7416–7427.
  64. 64
  65. 65
  66. 66
  67. 67
  68. 68
  69. 69
  70. 70
  71. 71
  72. 72
  73. 73
    The thalamus as a monitor of motor outputs
    1. RW Guillery
    2. SM Sherman
    (2002)
    Philosophical Transactions of the Royal Society B: Biological Sciences 357:1809–1821.
    https://doi.org/10.1098/rstb.2002.1171
  74. 74
    Physiology of Olivocochlear Efferents
    1. JJ Guinan
    (1996)
    In: The Cochlea. New York, NY: Springer New York. pp. 435–502.
    https://doi.org/10.1007/978-1-4612-0757-3_8
  75. 75
  76. 76
    Computational models of spatial updating in peri-saccadic perception
    1. FH Hamker
    2. M Zirnsak
    3. A Ziesche
    4. M Lappe
    (2011)
    Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 366:554–571.
    https://doi.org/10.1098/rstb.2010.0229
  77. 77
    The response of single optic nerve fibers of the vertebrate eye to illumination of the retina
    1. HK Hartline
    (1938)
    American Journal of Physiology 121:400–415.
  78. 78
  79. 79
  80. 80
  81. 81
    Tapered whiskers are required for active tactile sensation
    1. SA Hires
    2. L Pammer
    3. K Svoboda
    4. D Golomb
    (2013)
    eLife, 2, 10.7554/eLife.01350, 24252879.
  82. 82
    View from the top: hierarchies and reverse hierarchies in the visual system
    1. S Hochstein
    2. M Ahissar
    (2002)
    Neuron 36:791–804.
  83. 83
  84. 84
  85. 85
  86. 86
    Is there a brainstem substrate for action selection?
    1. MD Humphries
    2. K Gurney
    3. TJ Prescott
    (2007)
    Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 362:1627–1639.
    https://doi.org/10.1098/rstb.2007.2057
  87. 87
  88. 88
  89. 89
    Evaluating the effects of olivocochlear feedback on psychophysical measures of frequency selectivity
    1. SG Jennings
    2. EA Strickland
    (2012)
    The Journal of the Acoustical Society of America, 132, 10.1121/1.4742723, 23039443.
  90. 90
  91. 91
  92. 92
  93. 93
  94. 94
  95. 95
  96. 96
    Dynamic Patterns: The Self-Organization of Brain and Behavior
    1. JS Kelso
    (1997)
    MIT Press.
  97. 97
  98. 98
  99. 99
  100. 100
  101. 101
  102. 102
  103. 103
    Acoustic reflex frequency selectivity in single stapedius motoneurons of the cat
    1. JB Kobler
    2. JJ Guinan
    3. SR Vacher
    4. BE Norris
    (1992)
    Journal of Neurophysiology 68:807–817.
  104. 104
    Principles of Gestalt Psychology
    1. K Koffka
    (1935)
    New York: Harcourt, Brace and Company.
  105. 105
    Active sensing--closing multiple loops
    1. P König
    2. H Luksch
    (1998)
    Zeitschrift Für Naturforschung C 53:542–549.
  106. 106
  107. 107
  108. 108
  109. 109
  110. 110
  111. 111
  112. 112
  113. 113
    Learning and exploration in action-perception loops
    1. DY Little
    2. FT Sommer
    (2013)
    Frontiers in Neural Circuits, 7, 10.3389/fncir.2013.00037, 23579347.
  114. 114
    The Analysis of Sensations
    1. E Mach
    (1959)
    New York: Dover publications.
  115. 115
  116. 116
  117. 117
  118. 118
    Vision
    1. D Marr
    (1982)
    San Francisco: W. H. Freeman.
  119. 119
  120. 120
  121. 121
    Face-selective neurons maintain consistent visual responses across months
    1. DBT McMahon
    2. AP Jones
    3. IV Bondar
    4. DA Leopold
    (2014)
    Proceedings of the National Academy of Sciences of the United States of America 111:8251–8256.
    https://doi.org/10.1073/pnas.1318331111
  122. 122
    Phenomenology of Perception
    1. M Merleau-Ponty
    (1962)
    London: Routledge.
  123. 123
  124. 124
  125. 125
  126. 126
  127. 127
  128. 128
  129. 129
  130. 130
  131. 131
  132. 132
  133. 133
  134. 134
    Neural coding during active somatosensation revealed using illusory touch
    1. DH O'Connor
    2. SA Hires
    3. ZV Guo
    4. N Li
    5. J Yu
    6. QQ Sun
    7. D Huber
    8. K Svoboda
    (2013)
    Nature Neuroscience, 16, 10.1038/nn.3419, 23727820.
  135. 135
  136. 136
  137. 137
  138. 138
  139. 139
  140. 140
  141. 141
  142. 142
  143. 143
  144. 144
    Lost in time: a historical frame, elementary processing units and the 3-second window
    1. E Pöppel
    (2004)
    Acta Neurobiologiae Experimentalis 64:295–301.
  145. 145
    Mind as Motion: Explorations in the Dynamics of Cognition
    1. RF Port
    2. T Van Gelder
    (1995)
    MIT Press.
  146. 146
  147. 147
  148. 148
    Active touch sensing
    1. TJ Prescott
    2. ME Diamond
    3. AM Wing
    (2011)
    Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 366:2989–2995.
    https://doi.org/10.1098/rstb.2011.0167
  149. 149
  150. 150
  151. 151
  152. 152
  153. 153
    Response variability and timing precision of neuronal spike trains in vivo
    1. DS Reich
    2. JD Victor
    3. BW Knight
    4. T Ozaki
    5. E Kaplan
    (1997)
    Journal of Neurophysiology 77:2836–2841.
  154. 154
  155. 155
  156. 156
  157. 157
  158. 158
  159. 159
  160. 160
  161. 161
  162. 162
  163. 163
  164. 164
  165. 165
  166. 166
  167. 167
    The role of eye movement in the detection of contrast and spatial detail
    1. RM Steinman
    2. JZ Levinson
    (1990)
    In: E Kowler, editor. Eye Movements and Their Role in Visual and Cognitive Processes. Amsterdam: Elsevier. pp. 115–212.
  168. 168
  169. 169
  170. 170
  171. 171
  172. 172
  173. 173
    Theoretical Biology
    1. J von Uexkull
    (1926)
    London: K. Paul, Trench, Trubner & Co. Ltd.
  174. 174
  175. 175
  176. 176
  177. 177
    Human medial temporal lobe neurons respond preferentially to personally relevant images
    1. IV Viskontas
    2. RQ Quiroga
    3. I Fried
    (2009)
    Proceedings of the National Academy of Sciences of the United States of America 106:21329–21334.
    https://doi.org/10.1073/pnas.0902319106
  178. 178
  179. 179
  180. 180
  181. 181
  182. 182
    Cybernetics
    1. N Wiener
    (1949)
    New York: John Wiley & Sons.
  183. 183
  184. 184
    At 120 msec you can spot the animal but you don't yet know it's a dog
    1. C-T Wu
    2. SM Crouzet
    3. SJ Thorpe
    4. M Fabre-Thorpe
    (2014)
    Journal of Cognitive Neuroscience, 10.1162/jocn_a_00701.
  185. 185
  186. 186

Decision letter

  1. David Kleinfeld
    Reviewing Editor; University of California, San Diego, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your work entitled "Perception as a closed-loop convergence process" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, including the Reviewing Editor, David Kleinfeld, and the evaluation has been overseen by Eve Marder as the Senior Editor. The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission. While there was considerable interest and enthusiasm for this work, the reviewers also had some substantive critiques that will require attention.

Summary:

The basic premise of this manuscript is that "…passive paradigms cannot reveal how sensory information is actually processed during active perception […] For that, a unified analysis of the motor and sensory components engaging brains with their environments is required." Toward addressing this premise, "The current paper describes an attempt to bring the motor variables back to theoretical modeling of perception, by proposing a motor-sensory closed-loop scheme for the perception of the external environment." The crux proposal is that internal representation may be a time-varying signal that reaches a steady state behavior perhaps, if only because "Mammalian sensory organs usually acquire information via movements…".

This is a "Viewpoint" with some original material as opposed to an "Original Article" per se. Yet it is timely and important. One would think that most rational neuroscientists would agree. Yet much of mammalian systems neuroscience went through a dark period of working with primarily anesthetized animals or highly constrained animals. This was particularly egregious in vision, where one could argue that a generation of neuroscientists attended to second-order effects to Hubel and Wiesel receptive fields while a basic role of visual areas for motion control went undiscovered until a few years back; see, e.g., Carandini (Nat Neurosci 2013), Bonhoeffer (Neuron 2012), and Stryker (Neuron 2010). The current work reviews this dark period, although the authors should note that some areas, including the study of the VOR and the OKR, the study of hippocampal function during learning and memory (2015 Nobel prize), and clearly the study of motor control for locomotion and manipulation, did not fall into this trap. The pioneering gating experiments of Chapin and Woodward (Exp Neurol 1982), which show how motor output gates sensation, deserve special mention as counterpoint to the authors' sarcasm, i.e., "The (re) discovery that mammalian sensation is active…".

The authors propose that the internal representation of a stimulus depends on motor output. Thus, unless the animal acts on the information, one does not know if the representation, presumably the pattern of neurons spiking in different brain areas, is a complete or only a partial representation. Further, the partial representation could be too incomplete for action to occur. I think we all would agree. Many highly cited studies on internal representation view a change in motor output in response to a change in internal representation – which could be the act of pushing a lever to declare a sensorimotor process is terminating – as a gold standard. There is a fair literature on this – including the pioneering ICMS experiments of Newsome and colleagues (Nature 1990, J Neurophysiol 1992, Neuron 2014). In the vibrissa literature, which appears prominently in this manuscript, there are the reafferent coding studies of Kleinfeld and colleagues (J Neurophysiol 1997; Neuron 2011). The authors go through many arguments to describe why the notion of a motor-free, or open loop representation, will fail. A key argument involves the time it takes – presumably cycles of recurrence – to form an internal representation. This is reminiscent of the argument by Martin (TiNS 1988) on the formation of visual representation as a recurrent of feedback, which was written as a challenge to the feed forward processing implicit in the wiring maps of Feldman and Van Essen.

Essential revisions:

The full reports of all reviewers are appended. All reviewers found merit with the timeliness and importance of the work but all reviewers also found faults that require attention. It is essential to address these issues:

1) Draw a clear distinction about dynamics that spread beyond sensory areas to involve decision making and motor output, each of which may contain local feedback loops, as opposed to brain-wide feedback dynamics per se.

2) Provide clearer and more thoughtful experiments to distinguish between the manifestation of open loop and closed loop representation of the sensory world – at least an object!

3) Properly define and clarify the output from the model / robot (Figure 7).

Specific points:

Reviewer # 1 (annotated by BRE David Kleinfeld):

1) In their paper, “Perception as a closed-loop convergence process”, Ahissar and Assa conceptualize perception as an interactive dynamical process. Specifically the authors propose that perception is a convergence process that involves about 4 repeated sensory-interactions through which an object percept is dynamically generated. Further the authors emphasize the constitutive active nature of sensing and stress the presence of loops rather than of a feed-forward architecture in the brain.

DK: This summary is telling. The reviewer focuses solely on dynamics per se rather than on motor output and control as an integral part of sensation. This implies that the larger message from Ahissar and Assa may have failed to get through.

2) The paper is a strange mix of high-level assumptions and details of rodent active touch. To me these two different levels never fully merged, i.e. it did not become clear to me where in the rodent brain the dynamical process that forms the perceptual object happens.

DK: In fact, there is published evidence that all of vibrissa L5b cells (in both sensory and motor cortices) have a role in motor control; this goes back to work by Glickman. So this is a clear place to note the origin of a perception and one that is, in terms of hypothesis testing, (just barely) accessible with Ca2+-imaging.

3) The predictions that differentiate the OpenLoop and the ClosedLoop model of perception are neither very strong nor very clear. More work is required here.

DK: All reviewers agree on this point. The section on predictions is a crux aspect of the paper that requires significant improvement, as the key is to entice experimentalists to try to falsify or verify the ideas inherent in representation through motor control. This will take thought and time and still may not work out!

4) I have major doubts that the authors are right. It is obvious from the literature that passive, or briefly flashed stimulus presentations, which do not allow active sensing, still evoke robust percepts. I would predict that we will also find a lot more single touch percepts in the active touch system, once we look harder in situations where animals are sensing under time pressure.

DK: I think the confusion results from a mixing of loops for perception, which are hypothesized to include sensory and motor function, and local sensory loops solely for reverberation. The latter are well known to occur with sensory processes, and the most dramatic case is the > 20 s of reverberatory signal in AIT cortex during the delay period of a match to sample task (Fuster & Alexander 1971 Science). Please clarify your text.

Reviewer # 2 (annotated by BRE David Kleinfeld):

1) The distinction between "neural representation" and "internal representation" seems unnecessary – and ill-defined.

DK: This should be fixed.

2) Although I think I understand the intuition behind referring to a sensory-motor loop as "motor-sensory-motor loop", this seems unnecessary; "loop" already implies circularity.

DK: I suspect that this was done to separate loops that are local and lie solely in sensory or solely in motor regions from brain-wide loops that span the nervous system. As noted above, this needs to be clarified.

3) It is unclear to me what the distinction between msOLP and CLP is?

DK: Please either drop or clarify this issue. It should be a straightforward fix.

4) At times the argument is speculative and unnecessarily strong – to the point of likely already being wrong? E.g., in the subsection “Contrasting OLP and CLP – discriminatory testable predictions” "the question is […] whether paralyzed subjects perceive. If they do – here goes the closed-loop hypothesis". We clearly perceive by hearing without moving. I will grant the authors that it is still unclear what the function of outer hair cells is – but as far as we know, this is a counter example to their hypothesis?

DK: Please provide a more graded presentation. The manuscript started out this way, in that open loop representations were primarily seen as a subset of the larger internal representation.

5) The "mathematical model" and the robotic setup seem to add little to the manuscript. The robotic system example only proves that an oscillatory system can be driven to different attractors with different inputs?

DK: All reviewers commented on the opaque nature of this presentation. It needs to be rewritten. I do not see a fundamental flaw.

6) The testable predictions part is a great idea – but as formulated they are not very helpful. A specific motor output can be thought of as the correlate of a dynamic neural attractor instead of an "instantaneous state".

DK: All reviewers agree. The section on predictions is a crux aspect of the paper that requires significant improvement, as the key is to entice experimentalists to try to falsify or verify the ideas inherent in representation through motor control. This will take thought and time and still may not work out!

Reviewer # 3 (RE Kleinfeld):

1) The statement that "[motor-sensory-motor]-loops are fundamental units of mammalian perception" cannot be right. These loops can support activity and thus a motor-sensory-motor representation, but an anatomical loop per se is not a representation.

2) The discussion of two types of loops notes that the "first uses proprioceptive signals […] The other type uses sensory signals to monitor features of external objects…". In fact, this extends confusion in the literature. Signals from muscle spindles, usually regarded as proprioceptive in the sense that they are used only for motor feedback, are also sensory. See the pioneering work by Hsiao (2006 J Neurophysiol) on discriminating the size of objects based on muscle stretch.

3) The discussion of reading out the convergence, say, to a limit cycle is muddled. Are convergence cycles related to the accumulation of evidence? If so, it is an interesting idea. But the authors need to describe a mechanism to link dynamics with estimates of confidence. They end with "although the loops never reach the ultimate steady-states, they typically quit the convergence process when the distance from that state is no more sensible". This seems too soft a statement for a serious article.

4) The authors discuss the need to share information between limit cycles (perceptual loops). They are a bit glib in listing possibilities as the locking and unlocking of activity in different loops is essential to their scheme of hierarchical loops. Coherence between different loops is tricky – if the interactions cause a pair of loops to phase-lock, then it is not clear how they separate and dephase. The authors have neglected issues of noise, which is a mechanism to break locking and to dephase.

5) The equations for the SYCLOP model need to be explained. As it stands, this section will lose almost all readers. None of the symbols are explained. I would also start by saying that the simplest model of a loop uses Van der Pol relaxation dynamics. On the one hand it is a bit of a let-down to have the work condensed to a single oscillator that came out of the days of vacuum triodes. On the other hand, the presentation of the realization with the Van der Pol oscillator (Figure 7) is very condensed. I think Figure 7 needs to be considerably unpacked. Panels A, B, and example dynamics like panel H can be one figure, while panels D-G and I could be a second figure. Also, define "k-events", label the ordinates of panels H and I.

6) The authors end with a number of proposed experiments to address the claims of closed versus open loop object representation. One involves the detection of the phase of contact in the whisking cycle, yet is followed by the claim that "…predictions of CLP and OLP can be distinguished only in natural perceiving conditions." This appears to obviate the use of head-fixed animals, an excellent preparation for combined behavior and electrophysiology. Why is head fixing bad for whisking? It seems that perception must often work under partial constraints.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Perception as a closed-loop convergence process" for further consideration at eLife. Your revised article has been favorably evaluated by Eve Marder (Senior editor), Reviewing editor David Kleinfeld, and one reviewer.

The manuscript has been improved but there are a few remaining issues raised by the reviewer and verified by Reviewing editor Kleinfeld. In order to complete this odyssey, please address these queries.

1) In the subsection “Synthesis of CLP in a robotic setup”. Please expand on the solution of the model, a van der Pol oscillator, to make it transparent to the "typical" biologically trained reader. The statement "The implementation of these equations using the SYCLOP platform" needs to be detailed – even in the appendix – so a reader can duplicate your calculation.

2) In the subsection “Perception can be masked “backwardly”” – "If perception could be reduced to a sequence of pure open loop processes backward masking should not occur." One might think that any slow integration step in a feed-forward processing system would explain backward masking through injection of a signal within the integration time. Perhaps your statement could be better explained as dependent on a system with only "fast" integration.

3) In the subsection “Perception can be masked “backwardly”” – "Perceptual masking thus challenges the validity of the 'virtual knife' reduction and the ability to reconstruct perception based on experiments with flashed stimuli only." The argument leading up to this is not clear.

Finally, please reread the manuscript in a "copy-edit" manner to improve the grammar and correct any number of typos in punctuation.

https://doi.org/10.7554/eLife.12830.010

Author response

The basic premise of this manuscript is that "…passive paradigms cannot reveal how sensory information is actually processed during active perception […] For that, a unified analysis of the motor and sensory components engaging brains with their environments is required." Toward addressing this premise, "The current paper describes an attempt to bring the motor variables back to theoretical modeling of perception, by proposing a motor-sensory closed-loop scheme for the perception of the external environment." The crux proposal is that internal representation may be a time-varying signal that reaches a steady state behavior perhaps, if only because "Mammalian sensory organs usually acquire information via movements…".

This is a "Viewpoint" with some original material as opposed to an "Original Article" per se. Yet it is timely and important. One would think that most rational neuroscientists would agree. Yet much of mammalian systems neuroscience went through a dark period of working with primarily anesthetized animals or highly constrained animals. This was particularly egregious in vision, where one could argue that a generation of neuroscientists attended to second-order effects to Hubel and Wiesel receptive fields while a basic role of visual areas for motion control went undiscovered until a few years back; see, e.g., Carandini (Nat Neurosci 2013), Bonhoeffer (Neuron 2012), and Stryker (Neuron 2010). The current work reviews this dark period, although the authors should note that some areas, including the study of the VOR and the OKR, the study of hippocampal function during learning and memory (2015 Nobel prize), and clearly the study of motor control for locomotion and manipulation, did not fall into this trap. The pioneering gating experiments of Chapin and Woodward (Exp Neurol 1982), which show how motor output gates sensation, deserve special mention as counterpoint to the authors' sarcasm, i.e., "The (re) discovery that mammalian sensation is active…".

Thank you for this concise summary. We would only comment here that our crux proposal is that the perception of external objects is a dynamical process encompassing loops that integrate the organism and its environment and converging towards organism-environment steady-states. We now emphasize this in the Abstract.

The Introduction was revised to cover (and cite) the studies mentioned above and the sarcastic term was removed (subsection “Sensation is normally active”, third paragraph).

The authors propose that the internal representation of a stimulus depends on motor output. Thus, unless the animal acts on the information, one does not know if the representation, presumably the pattern of neurons spiking in different brain areas, is a complete or only a partial representation. Further, the partial representation could be too incomplete for action to occur. I think we all would agree. Many highly cited studies on internal representation view a change in motor output in response to a change in internal representation – which could be the act of pushing a lever to declare that a sensorimotor process is terminating – as a gold standard. There is a fair literature on this – including the pioneering ICMS experiments of Newsome and colleagues (Nature 1990, J Neurophysiol 1992, Neuron 2014). In the vibrissa literature, which appears prominently in this manuscript, there are the reafferent coding studies of Kleinfeld and colleagues (J Neurophysiol 1997; Neuron 2011). The authors go through many arguments to describe why the notion of a motor-free, or open loop representation, will fail. A key argument involves the time it takes – presumably cycles of recurrence – to form an internal representation. This is reminiscent of the argument by Martin (TiNS 1988) on the formation of visual representation as a recurrent of feedback, which was written as a challenge to the feed forward processing implicit in the wiring maps of Feldman and Van Essen.

Thank you for these valuable points. We have modified the text to refer to these points and cite the relevant papers – ICMS and confidence (subsection “CLP propositions”), reafference and efference-copy signals (subsection “Perceptual systems are organized as motor-sensory-motor (MSM) loops”) and feedforward versus recurrent processing (subsection “The open loop perception (OLP) doctrine”).

Essential revisions:

The full reports of all reviewers are appended. All reviewers found merit with the timeliness and importance of the work but all reviewers also found faults that require attention. It is essential to address these issues:

1) Draw a clear distinction about dynamics that spread beyond sensory areas to involve decision making and motor output, each of which may contain local feedback loops, as opposed to brain-wide feedback dynamics per se.

This distinction is now clearer. The common dynamics of all components of the relevant MSM-loop(s) is now clearly stated in the Abstract, subsection “CLP propositions”, I, the legend of Figure 5 and the predictions section. The distinction from dynamics of local circuits, as in typical models of decision making, was also added (subsections “The open loop perception (OLP) doctrine” and “CLP propositions”, I).

2) Provide clearer and more thoughtful experiments to distinguish between the manifestation of open loop and closed loop representation of the sensory world – at least an object!

The predictions section was substantially revised. We have categorized the predictions in three groups, focusing on motor-to-sensory, sensory-to-motor, and convergence effects. We have also clarified all predictions, made them more explicit, added examples, and added points raised by the reviewers. In order to emphasize the differences between the predictions of the different schemes, we now refer to the invariant representation (IvR), instead of the internal representation (IR), as the representational comparative variable throughout the article. We thank the reviewers for these comments, which significantly helped clarify our thoughts.

3) Properly define and clarify the output from the model / robot (Figure 7).

The output of the model, as demonstrated by the robot, is now clarified and explained. We modified the description substantially by simplifying the figure (Figure 7, now Figure 6) and extending the text (see below).

Specific points:

Reviewer # 1 (annotated by BRE David Kleinfeld):

1) In their paper, “Perception as a closed-loop convergence process”, Ahissar and Assa conceptualize perception as an interactive dynamical process. Specifically the authors propose that perception is a convergence process that involves about 4 repeated sensory-interactions through which an object percept is dynamically generated. Further the authors emphasize the constitutive active nature of sensing and stress the presence of loops rather than of a feed-forward architecture in the brain.

DK: This summary is telling. The reviewer focuses solely on dynamics per se rather than on motor output and control as an integral part of sensation. This implies that the larger message from Ahissar and Assa may have failed to get through.

We have modified the Abstract to sharpen the crucial suggestions of our hypothesis and make them more explicit, and in particular the crucial role of organism-environment loops. Also, our modifications throughout the paper were done with this issue in mind.

2) The paper is a strange mix of high-level assumptions and details of rodent active touch. To me these two different levels never fully merged, i.e. it did not become clear to me, where in the rodent brain the dynamical process happens that forms the perceptual object.

DK: In fact, there is published evidence that all of vibrissa L5b cells (in both sensory and motor cortices) have a role in motor control; this goes back to work by Glickman. So this is a clear place to note the origin of a perception and one that is, in terms of hypothesis testing, (just barely) accessible with Ca2+-imaging.

We probably failed to state it clearly. The dynamical process that forms the perceived object occurs along the entire MSM-loop(s). Thus, in principle there is no single brain site that preferentially represents the object. Still, the comment about L5b makes sense – we have now added it to the list of potential tests for the dependency of perception on S-M coupling (subsection “II. The sensory-to-motor arc”). We also make clearer statements about the whole-loop representation in the Abstract and throughout the paper.

3) The predictions that differentiate the OpenLoop and the ClosedLoop model of perception are neither very strong nor very clear. More work is required here.

DK: All reviewers agree on this point. The section on predictions is a crux aspect of the paper that requires significant improvement, as the key is to entice experimentalists to try to falsify or verify the ideas inherent in representation through motor control. This will take thought and time and still may not work out!

The predictions section was substantially revised. Please see our detailed description in reply to Essential revisions (2) above.

4) I have major doubts that the authors are right. It is obvious from the literature that passive, or briefly flashed stimulus presentations, which do not allow active sensing, still evoke robust percepts. I would predict that we will find also a lot more single touch percepts in the active touch system, once we look harder in situations where animals are sensing under time pressure.

DK: I think the confusion results from a mixing of loops for perception, which are hypothesized to include sensory and motor function, and local sensory loops solely for reverberation. The latter are well known to occur with sensory processes, and the most dramatic case is the > 20 s of reverberatory signal in AIT cortex during the delay period of a match to sample task (Fuster & Alexander 1971 Science). Please clarify your text.

We agree with the reviewer that briefly flashed stimuli evoke robust percepts. However, we argue that this observation is consistent with both OLP and CLP schemes. According to CLP such artificial stimuli initiate the perceptual process, which indeed normally would continue longer and include motor-sensory dynamics but which also can gain information from this initial step (Figure 5 and associated text). What precludes discrimination between OLP and CLP based on flashed stimuli is that although the percepts evoked by flashed stimuli can be robust, they most likely include significantly less information than the information actively acquired from stationary objects. Subjects indeed can differentiate between flashed cars and houses, or animals and humans, but probably cannot perceive the details of the images. We have added a paragraph explaining this in subsection “I. Perception (of external feature(s)) ≡ a process of inclusion in MSM-loop(s)”, fourth paragraph.

We use the terms “most likely” and “probably” because we are not aware of a systematic quantitative comparison of perceptual accuracies of complex images when they are flashed versus being stationary. The first set of predictions in our list includes such a comparison (prediction I-ii; we have now added the word “flashes” to make it more explicit) – there we suggest equalizing the total time or energy of the stimuli and thus using a series of flashes rather than a single one.

Reviewer # 2 (annotated by BRE David Kleinfeld):

1) The distinction between "neural representation" and "internal representation" seems unnecessary – and ill-defined.

DK: This should be fixed.

We have fixed it. We now distinguish between a “Neuronal Representation” (NR), which can be any pattern that shows some correlation with the external feature, and the “Invariant Representation” (IvR), which is the representation that represents the feature consistently and uniquely – i.e., it occurs always and only when that feature occurs. This is now explained better in the subsection “The open loop perception (OLP) doctrine”. The minimal sets for IvR according to each perceptual scheme are now described better in Figure 7 (previously Figure 8), its legend and associated text (subsection “Contrasting OLP and CLP – discriminatory testable predictions”).

2) Although I think I understand the intuition behind referring to a sensory-motor loop as "motor-sensory-motor loop", this seems unnecessary; "loop" already implies circularity.

DK: I suspect that this was done to separate loops that are local and lie in solely sensory or solely motor regions from brain-wide loops that span the nervous system. As noted above, this needs to be clarified.

Both the criticism and comment are well taken. The major reason for using the term MSM-loop instead of MS-loop is that our repeated experience with presenting the ideas discussed in this paper to colleagues indicated that people often automatically think of sensory-motor arcs, or of inter-modal sensory-motor loops, when sensory-motor loops are mentioned (see our Figure 4). Thus, many people imagine a kind of loop that combines visual sensation with arm movement, for example, closing the loop via visual sensation of the arm. This is of course not what we refer to in this paper – we refer to loops that include only one sensory organ, control its movement and sense its signals. We try to emphasize the flow of the loop signals from and to the same sensory organ by using the term motor-sensory-motor loop. We have now expanded the explanation of this term (subsection “Perceptual systems are organized as motor-sensory-motor (MSM) loops”, third paragraph), which we believe will also help clarify the scheme we are talking about.

3) It is unclear to me what the distinction between msOLP and CLP is?

DK: Please either drop or clarify this issue. It should be a straight forward fix.

This distinction is important, and we hope that we are now doing a better job in explaining it. The difference is that in msOLP there is no loop – the sensory-to-motor arc is open. We now explain it better in the subsection “Contrasting OLP and CLP – discriminatory testable predictions”, and emphasize it via the classification of our predictions.

4) At times the argument is speculative and unnecessarily strong – to the point of likely already being wrong? E.g., in the subsection “Contrasting OLP and CLP – discriminatory testable predictions” "the question is […] whether paralyzed subjects perceive. If they do – here goes the closed-loop hypothesis". We clearly perceive by hearing without moving. I will grant the authors that it is still unclear what the function of outer hair cells is – but as far as we know, this is a counter example to their hypothesis?

DK: Please provide a more graded presentation. The manuscript started out this way, in that open loop representations were primarily seen as a subset of the larger internal representation.

The auditory case is indeed interesting. We agree with the reviewer that, based on currently available knowledge, it could very well be the case that hearing does not crucially depend on motor outputs. Yet, the experiment has not yet been done – one needs to paralyze or block the outputs to the outer hair cells and the muscles of the middle ear in order to test it. We now state this in the second paragraph of the subsection “Contrasting OLP and CLP – discriminatory testable predictions”.

There is an additional point here. The auditory stimulus is fundamentally different than the visual and tactile ones – it can never be stationary. There is no stationary acoustic wave. Thus, the inner hair cells are always activated by a sound. This makes the dependence on sensor motion less crucial. Thus, we have modified the question to “whether paralyzed subjects perceive stationary (i.e., not flashing or moving) objects similarly to non-paralyzed subjects” (see aforementioned paragraph). Given the reviewer’s comment we also found it important to elaborate further on the special case of auditory sensation.

5) The "mathematical model" and the robotic setup seem to add little to the manuscript. The robotic system example only proves that an oscillatory system can be driven to different attractors with different inputs?

DK: All reviewers commented on the opaque nature of this presentation. It needs to be rewritten. I do not see a fundamental flaw.

We agree with the criticism – the description was too laconic and cryptic. We modified this section substantially by simplifying the figure (Figure 7, now Figure 6 – removing the open-loop responses) and explaining the outcome of the robotic model and its significance (subsection “Synthesis of CLP in a robotic setup”). We think that the demonstrations of what a convergence process may look like, and what a steady-state may look like, are of value in this paper as they can help the reader grasp the type of processes we refer to – we thus prefer to leave this section in.

6) The testable predictions part is a great idea – but as formulated they are not very helpful. A specific motor output can be thought of as the correlate of a dynamic neural attractor instead of an "instantaneous state".

DK: All reviewers agree. The section on predictions is a crux aspect of the paper that requires significant improvement, as the key is to entice experimentalists to try to falsify or verify the ideas inherent in representation through motor control. This will take thought and time and still may not work out!

The predictions section was substantially revised. Please see our detailed description in reply to Essential revisions (2) above.

Reviewer # 3 (BRE David Kleinfeld):

1) The statement that "[motor-sensory-motor]-loops are fundamental units of mammalian perception" cannot be right. These loops can support activity and thus a motor-sensory-motor representation, but anatomical loops per se are not a representation.

Good point. Indeed, whenever we refer to MSM loops we refer to both their anatomical and functional levels. We now state this explicitly in the subsection “Perception can be masked “backwardly””.

2) The discussion of two types of loops notes that the "first uses proprioceptive signals […] The other type uses sensory signals to monitor features of external objects…". In fact, this extends confusion in the literature. Signals from muscle spindles, usually regarded as proprioceptive in the sense that they are used only for motor feedback, are also sensory. See the pioneering work by Hsiao (2006 J Neurophysiol) on discriminating the size of objects based on muscle stretch.

Another good point – thank you! Indeed, proprioceptive loops can provide (limited) information about external objects. For example, when a large error between the planned and executed movement occurs, an external object that blocks the movement is a natural interpretation by the brain. We have now added this point and a citation of Hsiao’s paper in the subsection “The closed-loop perception (CLP) hypothesis”, last paragraph.

3) The discussion of reading out the convergence, say, to a limit cycle is muddled. Are convergence cycles related to the accumulation of evidence? If so, it is an interesting idea. But the authors need to describe a mechanism to link dynamics with estimates of confidence. They end with "although the loops never reach the ultimate steady-states, they typically quit the convergence process when the distance from that state is no more sensible". This seems too soft a statement for a serious article.

In general we consider the formalization of the link between dynamics and confidence to lie outside the scope of the current paper. Yet, we agree that the description of possible mechanisms is in place here. We have thus added a paragraph describing the potential mechanism we prefer, which is based on internal models (subsection “I Perception (of external feature(s)) ≡ a process of inclusion in MSM-loop(s)”, fifth paragraph).

4) The authors discuss the need to share information between limit cycles (perceptual loops). They are a bit glib in listing possibilities as the locking and unlocking of activity in different loops is essential to their scheme of hierarchical loops. Coherence between different loops is tricky – if the interactions cause a pair of loops to phase-lock, then it is not clear how they separate and dephase. The authors have neglected issues of noise, which is a mechanism to break locking and to dephase.

We assume here that the loops composing an object perception remain engaged until the perceptual epoch ends. In this case dephasing will automatically follow. We have added a sentence mentioning that in the first paragraph of the subsection “II. Perception of an external object ≡ a coordinated process of inclusion in a collection of MSM-loops”.

5) The equations for the SYCLOP model need to be explained. As it stands, this section will lose almost all readers. None of the symbols are explained. I would also start by saying that the simplest model of a loop uses Van der Pol relaxation dynamics. On the one hand it is a bit of a let-down to have the work condensed to a single oscillator that came out of the days of vacuum triodes. On the other hand, the presentation of the realization with the Van der Pol oscillator (Figure 7) is very condensed. I think Figure 7 needs to be considerably unpacked. Panels A, B, and example dynamics like panel H can be one figure, while panels D-G and I could be a second figure. Also, define "k-events", label the ordinates of panels H and I.

The equations of the SYCLOP model are now better explained (subsection “Synthesis of CLP in a robotic setup”). The outcome of SYCLOP model is now described in detail and the motivation for using Van der Pol dynamics is now explained as well (in the aforementioned subsection).

Figure 7 (now Figure 6) was substantially simplified – the open-loop response panels were removed and the figure now conveys the main messages in a clearer manner.

6) The authors end with a number of proposed experiments to address the claims of closed versus open loop object representation. One involves the detection of the phase of contact in the whisking cycle, yet is followed by the claim that "…predictions of CLP and OLP can be distinguished only in natural perceiving conditions." This appears to obviate the use of head-fixed animals, an excellent preparation for combined behavior and electrophysiology. Why is head fixing bad for whisking? It seems that perception must often work under partial constraints.

We agree. The scope of the term “natural conditions” is too wide. We have modified this paragraph substantially and now explain better which reductionist paradigms would not allow a meaningful testing of the hypotheses. We also now explain better in that paragraph how different conditions, including head fixation, should be taken into account (subsection “III. Motor-sensory-motor convergence”, last paragraph).

[Editors' note: further revisions were requested prior to acceptance, as described below.]

The manuscript has been improved but there are a few remaining issues raised by the reviewer and verified by Reviewing editor Kleinfeld. In order to complete this odyssey, please address these queries.

1) In the subsection “Synthesis of CLP in a robotic setup”. Please expand on the solution of the model, a van der Pol oscillator, to make it transparent to the "typical" biologically trained reader. The statement "The implementation of these equations using the SYCLOP platform" needs to be detailed – even in the appendix – so a reader can duplicate your calculation.

We have expanded on the solution of the model, and explained the choice of the van der Pol oscillator (subsection “Synthesis of CLP in a robotic setup”). We also provide more details on the SYCLOP implementation, and explain how the model equations were implemented (in the aforementioned subsection).
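For readers unfamiliar with the oscillator named in this query, the following is a minimal sketch of the textbook van der Pol equation, x'' - mu(1 - x^2)x' + x = 0, integrated numerically. It is not the SYCLOP implementation and does not reproduce the model equations of the paper; the function names, the value of mu, the step size and the run lengths are all assumptions chosen for illustration. It only shows the generic property the discussion relies on: trajectories that start from very different states converge to the same limit cycle.

```python
def van_der_pol_step(x, v, mu, dt):
    """Advance the textbook van der Pol oscillator by one RK4 step.

    State: position x and velocity v. Dynamics (an assumption for
    illustration, not the SYCLOP equations):
        dx/dt = v
        dv/dt = mu * (1 - x**2) * v - x
    """
    def deriv(x_, v_):
        return v_, mu * (1.0 - x_ * x_) * v_ - x_

    k1x, k1v = deriv(x, v)
    k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
    k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
    k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    v_new = v + dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x_new, v_new


def late_amplitude(x0, v0=0.0, mu=1.0, dt=0.01, t_total=100.0, t_tail=20.0):
    """Return the peak |x| over the last t_tail time units of the run.

    Hypothetical helper: after the transient has died out, the peak
    approximates the amplitude of the limit cycle.
    """
    n_steps = int(t_total / dt)
    n_tail = int(t_tail / dt)
    x, v = x0, v0
    peak = 0.0
    for i in range(n_steps):
        x, v = van_der_pol_step(x, v, mu, dt)
        if i >= n_steps - n_tail:
            peak = max(peak, abs(x))
    return peak


if __name__ == "__main__":
    # Two very different starting points: one near rest, one far outside.
    for start in (0.1, 4.0):
        print(start, late_amplitude(start))
```

Both starting points should settle onto an oscillation whose amplitude is close to 2, which is the sense in which the steady state is a property of the dynamics rather than of the initial condition.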

2) In the subsection “Perception can be masked “backwardly”” – "If perception could be reduced to a sequence of pure open loop processes backward masking should not occur." One might think that any slow integration step in a feed-forward processing system would explain backward masking through injection of a signal within the integration time. Perhaps your statement could be better explained as dependent on a system with only "fast" integration.

The section on backward masking is now clearer. We explain the ‘standard model’ for open loop schemes, which is based on dual channel (fast and slow) integration and interaction (Breitmeyer & Ogmen, 2000; reference added), its inconsistency with experimental data, and the challenge it poses for OLP (subsection “Perception can be masked “backwardly””).

3) In the subsection “Perception can be masked “backwardly”” – "Perceptual masking thus challenges the validity of the 'virtual knife' reduction and the ability to reconstruct perception based on experiments with flashed stimuli only." The argument leading up to this is not clear.

We hope that the improved explanation of the backward masking challenge (point 2) provides a better background for understanding the statement about the virtual knife. In addition, we have modified this sentence to be more explicit and clear (subsection “Perception can be masked “backwardly””). A related change was introduced in the fourth paragraph of the subsection “I Perception (of external feature(s)) ≡ a process of inclusion in MSM-loop(s)”.

Finally, please reread the manuscript in a "copy-edit" manner to improve the grammar and correct any number of typos in punctuation.

We have reread the entire paper, fixed typos and grammatical mistakes (with the aid of a linguistic editor) and improved the clarity of the text where needed – thanks for noting that. Figures 1 and 7 were slightly modified (Figure 1: eye – object arrows added; Figure 7: graphics).

https://doi.org/10.7554/eLife.12830.011

Article and author information

Author details

  1. Ehud Ahissar

    Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel
    Contribution
    EAh, Contributed to all aspects of this work
    For correspondence
    ehud.ahissar@weizmann.ac.il
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0003-1223-9767
  2. Eldad Assa

    Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel
    Contribution
    EAs, Contributed to all aspects of this work
    Competing interests
    The authors declare that no competing interests exist.

Funding

Israel Science Foundation (1127/14)

  • Ehud Ahissar

United States-Israel Binational Science Foundation (2011432)

  • Ehud Ahissar

The NSF-BSF Brain Research EAGER program (2014906)

  • Ehud Ahissar

The Minerva Foundation funded by the Federal German Ministry for Education and Research

  • Ehud Ahissar

The Israel Ministry of Defense

  • Ehud Ahissar

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Merav Ahissar, Amos Arieli, Asher Cohen, Coralie Ebert, Ram Frost, Andrei Gorea, Liron Gruber, Ealan Henis, Rafi Malach, Guy Nelinger, Tess Oram, Kevin O’Regan, Dov Sagi and Avi Saig for helpful comments and discussions and Michal Ahissar for linguistic editing. This work was supported by the Israel Science Foundation (grant #1127/14), the United States-Israel Binational Science Foundation (grant #2011432), the NSF-BSF Brain Research EAGER program (grant #2014906), the Israel Ministry of Defense and the Minerva Foundation funded by the Federal German Ministry for Education and Research. EA holds the Helen Diller Family Chair in Neurobiology.

Reviewing Editor

  1. David Kleinfeld, Reviewing Editor, University of California, San Diego, United States

Publication history

  1. Received: November 4, 2015
  2. Accepted: May 8, 2016
  3. Accepted Manuscript published: May 9, 2016 (version 1)
  4. Version of Record published: June 8, 2016 (version 2)

Copyright

© 2016, Ahissar et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,807
    Page views
  • 474
    Downloads
  • 6
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
