A chromatic feature detector in the retina signals visual context changes
Abstract
The retina transforms patterns of light into visual feature representations supporting behaviour. These representations are distributed across various types of retinal ganglion cells (RGCs), whose spatial and temporal tuning properties have been studied extensively in many model organisms, including the mouse. However, it has been difficult to link the potentially nonlinear retinal transformations of natural visual inputs to specific ethological purposes. Here, we discover a nonlinear selectivity to chromatic contrast in an RGC type that allows the detection of changes in visual context. We trained a convolutional neural network (CNN) model on large-scale functional recordings of RGC responses to natural mouse movies, and then used this model to search in silico for stimuli that maximally excite distinct types of RGCs. This procedure predicted centre colour opponency in transient suppressed-by-contrast (tSbC) RGCs, a cell type whose function is being debated. We confirmed experimentally that these cells indeed responded very selectively to Green-OFF, UV-ON contrasts. This type of chromatic contrast was characteristic of transitions from ground to sky in the visual scene, as might be elicited by head or eye movements across the horizon. Because tSbC cells performed best among all RGC types at reliably detecting these transitions, we suggest a role for this RGC type in providing contextual information (i.e. sky or ground) necessary for the selection of appropriate behavioural responses to other stimuli, such as looming objects. Our work showcases how a combination of experiments with natural stimuli and computational modelling allows discovering novel types of stimulus selectivity and identifying their potential ethological relevance.
Editor's evaluation
This study presents a fundamental and very technically strong dataset of mouse ganglion cells responding to natural stimuli that include more natural chromatic properties. Fits of convolutional neural networks to experimental measurements highlighted a novel form of color opponency in suppressed-by-contrast ganglion cells. More generally, the work provides a compelling example of how modern experimental and computational tools can be used to generate and test hypotheses about sensory function under natural conditions.
https://doi.org/10.7554/eLife.86860.sa0

Introduction
Sensory systems evolved to generate representations of an animal’s natural environment useful for survival and procreation (Lettvin et al., 1959). These environments are complex and high dimensional, and different features are relevant for different species (reviewed in Baden et al., 2020). As a consequence, the representations are adapted to an animal’s needs: features of the world relevant for the animal are represented with enhanced precision, whereas less important features are discarded. Sensory processing is thus best understood within the context of the environment an animal evolved in and that it interacts with (reviewed in Turner et al., 2019; Simoncelli and Olshausen, 2001).
The visual system is well suited for studying sensory processing, as the first features are already extracted at its experimentally well-accessible front-end, the retina (reviewed in Kerschensteiner, 2022; Baden et al., 2020). In the mouse, this tissue gives rise to around 40 parallel channels that detect different features (Goetz et al., 2022; Baden et al., 2016; Bae et al., 2018; Rheaume et al., 2018), represented by different types of retinal ganglion cells (RGCs), whose axons send information to numerous visual centres in the brain (Martersteck et al., 2017). Some of these channels encode basic features, such as luminance changes and motion, that are only combined in downstream areas to support a range of behaviours such as cricket hunting in mice (Johnson et al., 2021). Other channels directly extract specific features from natural scenes necessary for specific behaviours. For instance, transient OFF α cells trigger freezing or escape behaviour in response to looming stimuli (Münch et al., 2009; Yilmaz and Meister, 2013; Kim et al., 2020; Wang et al., 2021).
For many RGC types, however, we lack an understanding of the features they encode and how these link to behaviour (Schwartz and Swygart, 2020). One reason for this is that the synthetic stimuli commonly used to study retinal processing fail to drive retinal circuits ‘properly’ and, hence, cannot uncover critical response properties triggered in natural environments. This was recently illustrated by the example of spatial nonlinear processing, which was found to be more complex for natural scenes than for simpler synthetic stimuli (Karamanlis and Gollisch, 2021). Such nonlinearities, which are crucial for the encoding of natural stimuli, cannot be captured by linear-nonlinear (LN) models of retinal processing, and several alternative methods have been proposed for the analysis of natural stimulus responses (reviewed in Sharpee, 2013).
One approach to modelling nonlinear visual processing is to train a convolutional neural network (CNN) to predict neuronal responses. This approach has gained popularity in recent years, both in the retina (McIntosh et al., 2016; Maheswaranathan et al., 2023; Tanaka et al., 2019; Batty et al., 2017) and in higher visual areas (Yamins et al., 2014; Cadena et al., 2019; Ustyuzhaninov et al., 2024). The resulting models, also referred to as ‘digital twins’, offer a number of analysis techniques that have been used to investigate, for example, the contributions of different interneurons to a cell’s response (Maheswaranathan et al., 2023), or the effects of stimulus context (Fu et al., 2024; Goldin et al., 2022) and behavioural state (Franke et al., 2022) on neural coding. In particular, feature visualisations (Olah et al., 2017) can be used to quickly generate stimuli that would maximally excite the modelled neurons (Walker et al., 2019; Bashivan et al., 2019), which in turn can serve as interpretable short-hand descriptions of nonlinear neuronal selectivities. In visual cortex, the resulting maximally exciting inputs (MEIs) revealed more complex and diverse neuronal selectivities than were expected based on previous results obtained with synthetic stimuli and linear methods (Walker et al., 2019; Bashivan et al., 2019).
Here, we combined the power of CNN-based modelling with large-scale recordings from RGCs to investigate colour processing in the mouse retina under natural stimulus conditions. Colour is a salient feature in nature, and the mouse visual system dedicates intricate circuitry to the processing of chromatic information (Szél et al., 1992; Joesch and Meister, 2016; Baden et al., 2013; Szatko et al., 2020; Khani and Gollisch, 2021; Mouland et al., 2021). Studies using simple synthetic stimuli have revealed nonlinear and centre-surround (i.e. spatial) interactions between colour channels, but it is not clear how these are engaged in retinal processing of natural, temporally varying environments. Since mouse photoreceptors are sensitive to green and UV light (Jacobs et al., 2004), we recorded RGC responses to stimuli capturing the chromatic composition of natural mouse environments in these two chromatic channels. A model-guided search for MEIs in chromatic stimulus space predicted a novel type of chromatic tuning in tSbC RGCs, a type whose function is being debated (Mani and Schwartz, 2017; Tien et al., 2015; Tien et al., 2022).
A detailed in silico characterisation followed up by experimental validation ex vivo confirmed this cell type’s pronounced and unique selectivity for dynamic full-field changes from green-dominated to UV-dominated scenes, a type of visual input that matches the scene statistics of transitions across the horizon (Qiu et al., 2021; Abballe and Asari, 2022; Gupta et al., 2022). We therefore suggest a role for tSbC RGCs in detecting behaviourally relevant changes in visual context, such as transitions from ground (i.e. below the horizon) to sky (i.e. above the horizon).
Results
Here, we investigated colour processing in the mouse retina under natural stimulus conditions. To this end, we trained a CNN model on RGC responses to a movie covering both achromatic and chromatic contrasts occurring naturally in the mouse environment, and then performed a model-guided search for stimuli that maximise the responses of RGCs.
Mouse RGCs display diverse responses to a natural movie stimulus
Using two-photon population Ca2+ imaging, we recorded responses from 8388 cells (in 72 recording fields across 32 retinae) in the ganglion cell layer (GCL) of the isolated mouse retina (Figure 1a) to a range of visual stimuli. Since complex interactions between colour channels have been mostly reported in the ventral retina and opsin-transitional zone, we focused our recordings on these regions (Szatko et al., 2020; Khani and Gollisch, 2021).
The stimuli included two achromatic synthetic stimuli – a contrast and frequency modulation (‘chirp’ stimulus) and a bright-on-dark bar moving in eight directions (‘moving bar’, MB) – to identify the functional cell type (see below), as well as a dichromatic natural movie (Figure 1b–d). The latter was composed of footage recorded outside in the field using a camera that captured the spectral bands (UV and green; Qiu et al., 2021) to which mouse photoreceptors are sensitive (peak sensitivities of ≈360 and ≈510 nm for S- and M-cones, respectively; Jacobs et al., 2004). We used 113 different movie clips, each lasting 5 s, that were displayed in pseudo-random order. Five of these constituted the test set and were repeated three times: at the beginning, in the middle, and at the end of the movie presentation, thereby allowing us to assess the reliability of neuronal responses across the recording (Figure 1b, top).
The responses elicited by the synthetic stimuli and the natural movie were diverse, displaying ON (Figure 1d, rows 4–9), ON-OFF (row 3), and OFF (rows 1 and 2), as well as sustained and transient characteristics (e.g. rows 8 and 4, respectively). Some responses were suppressed by temporal contrast (generally, rows 10, 11; at high contrast and frequency, row 9). A total of 6984 GCL cells passed our response quality criteria (see Methods); 3527 cells could be assigned to 1 of 32 previously characterised functional RGC groups (Baden et al., 2016) based on their responses to the chirp and MB stimuli using our recently developed classifier (Figure 1e; Figure 1—figure supplement 1; Qiu et al., 2021). Cells assigned to any of groups 33–46 were considered displaced amacrine cells and were not analysed in this study (for detailed filtering pipeline, see Figure 1—figure supplement 1c).
CNN model captures diverse tuning of RGC groups and predicts MEIs
We trained a CNN model on the RGCs’ movie responses (Figure 2a) and evaluated model performance as the correlation between predicted and trial-averaged measured test responses (Figure 2b). This metric can be interpreted as an estimate of the achieved fraction of the maximally achievable correlation (see Methods). The mean correlation per RGC group ranged from 0.32 (G14) to 0.79 (G24) (Figure 1—figure supplement 1) and averaged 0.48 (for all N=3527 cells passing filtering steps 1–3, Figure 1—figure supplement 1). We also tested the performance of our nonlinear model against a linearised version (see Methods; equivalent to an LN model, hence ‘LN model’ from here on) and found that the nonlinear CNN model achieved a higher test set correlation for all RGC groups (average correlation LN model: 0.38; G14: 0.2, G24: 0.65; Figure 2c).
Next, we wanted to leverage our nonlinear CNN model to search for potentially nonlinear stimulus selectivities of mouse RGC groups. Towards this goal, we aimed to identify stimuli that optimally drive RGCs of different groups. For linear systems, the optimal stimulus is equivalent to the linear filter and can be identified with classical approaches such as reverse correlation (Chichilnisky, 2001). However, since both the RGCs and the CNN model were nonlinear, a different approach was necessary. Other recent modelling studies in the visual system have leveraged CNN models to predict static MEIs for neurons in monkey visual area V4 (Bashivan et al., 2019; Willeke et al., 2023) and mouse visual area V1 (Walker et al., 2019; Franke et al., 2022). We adopted this approach to predict dynamic (i.e. time-varying) MEIs for mouse RGCs. We used gradient ascent on a randomly initialised, contrast- and range-constrained input to find the stimulus that maximised the mean activation of a given model neuron within a short time window (0.66 s; see Methods; Figure 2d).
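For illustration, below is a minimal sketch of this optimisation loop in PyTorch, assuming a trained model `model` that maps a two-channel movie snippet to one (time-averaged) response per neuron; all names, shapes, and hyperparameter values are illustrative rather than the published implementation:

```python
import torch

def find_mei(model, neuron_idx, shape=(2, 50, 18, 16),
             n_steps=1000, lr=10.0, norm_budget=30.0):
    # randomly initialised stimulus: (batch, colour, time, height, width)
    stim = torch.randn(1, *shape, requires_grad=True)
    optimizer = torch.optim.SGD([stim], lr=lr)
    for _ in range(n_steps):
        optimizer.zero_grad()
        # mean activation of the target model neuron in the optimised window
        activation = model(stim)[0, neuron_idx]
        (-activation).backward()               # minimising -activation = gradient ascent
        optimizer.step()
        with torch.no_grad():                  # re-impose the stimulus constraints
            stim *= norm_budget / stim.norm()  # fixed overall contrast (norm)
            stim.clamp_(-1.0, 1.0)             # keep within the display's range
    return stim.detach().squeeze(0)
```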
It is important to note that MEIs should not be confused with, or interpreted as, the linear filters derived from classical approaches such as reverse correlation (Chichilnisky, 2001; Schwartz et al., 2006). While both MEIs and linear filters offer simplified views of a neuron’s nonlinear response properties, they emphasise different aspects. The linear filter is optimised to provide the best possible linear approximation of the response function, identifying the stimulus direction to which the cell is most sensitive on average across the stimulus ensemble. In contrast, the MEI maximises the neuron’s response by finding the single stimulus that activates the cell most strongly. Consequently, MEIs can differ significantly from linear filters, often exhibiting greater complexity and higher frequency components (Walker et al., 2019).
MEIs reflect known functional RGC group properties
The resulting MEIs were short, dichromatic movie clips; their spatial, temporal, and chromatic properties and interactions thereof are best appreciated in lower-dimensional visualisations (Figure 3a–c; more example MEIs in Figure 3—figure supplement 1).
To analyse the MEIs in terms of these properties, we decomposed them into their spatial and temporal components, separately for green and UV, and parameterised the spatial component as a difference-of-Gaussians (DoG) (Gupta et al., 2022) (N=1613 out of 1947, see Methods). We then located MEIs along the axes in stimulus space corresponding to three properties: centre size, mean temporal frequency, and centre contrast, separately for green and UV (Figure 3d–f). These MEI properties reflect RGC response properties classically probed with synthetic stimuli, such as spots of different sizes (Goetz et al., 2022), temporal frequency modulations (Baden et al., 2016), and stimuli of varying chromatic contrast (Szatko et al., 2020; Khani and Gollisch, 2021). Using the MEI approach, we were able to reproduce known properties of RGC groups (Figure 3g–i). For example, sustained ON RGCs (G24), which are known to prefer large stimuli (Baden et al., 2016; Mani and Schwartz, 2017), had MEIs with large centres (G24, N=20 cells: green centre size, mean ± SD: 195 ±82 μm; UV centre size 178 ±45 μm; average across all RGC groups: green 148 ±42 μm, UV 141 ±42 μm; see Figure 3g).
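One common way to obtain such spatial and temporal components is a rank-1 decomposition of each colour channel; the sketch below assumes the MEI is available as a numpy array of shape (2, T, H, W) and uses a singular value decomposition, which is a plausible implementation choice rather than necessarily the published one:

```python
import numpy as np

def decompose_channel(mei_channel):
    """Rank-1 space-time approximation of one colour channel (T, H, W)."""
    T, H, W = mei_channel.shape
    U, S, Vt = np.linalg.svd(mei_channel.reshape(T, H * W), full_matrices=False)
    temporal = U[:, 0] * S[0]             # temporal component
    spatial = Vt[0].reshape(H, W)         # spatial component, input to the DoG fit
    return temporal, spatial

mei = np.random.randn(2, 50, 18, 16)      # placeholder MEI (green, UV)
green_t, green_s = decompose_channel(mei[0])
uv_t, uv_s = decompose_channel(mei[1])
```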
The MEI’s temporal frequency relates to the temporal frequency preference of an RGC: MEIs of G20 and G21, termed ON high frequency and ON low frequency (Baden et al., 2016), had high and low average temporal frequency, respectively (G20, N=40 cells, green, mean ± SD: 2.71 ±0.16 Hz, UV 2.86 ±0.22 Hz; G21, N=50 cells, green, mean ± SD: 2.32 ±0.63 Hz, UV 1.98 ± 0.5 Hz; see Figure 3h). Some MEIs exhibit fast oscillations (Figure 3e and Figure 3—figure supplement 1). This is not an artefact but rather a consequence of optimising a stimulus to maximise activity over a 0.66 s time window (Figure 2d). To maximise the response of a transient RGC over several hundred milliseconds, it has to be stimulated repetitively, hence the oscillations in the MEI. Maximising the response over a shorter time period results in MEIs without oscillations (Figure 3—figure supplement 2).
Finally, the contrast of an MEI reflects what is traditionally called a cell’s ON vs. OFF preference: MEIs of ON and OFF RGCs had positive and negative contrasts, respectively (Figure 3i). An ON-OFF preference can be interpreted as a tuning map with two optima – one in the OFF- and one in the ON-contrast regime. For an ON-OFF cell, there are hence two stimuli that are approximately equally effective at eliciting responses from that cell. Consequently, for the ON-OFF RGC groups, optimisation resulted in MEIs with ON or OFF contrast, depending on the relative strengths of the two optima and on the initial conditions (Figure 3—figure supplement 1, G10, and see Discussion).
MEIs were also largely consistent within functional RGC groups (Figure 3—figure supplement 1). Where this was not the case, the heterogeneity of MEIs could be attributed to a known heterogeneity of cells within that group. For example, MEIs of G31 RGCs were diverse (Figure 3—figure supplement 1), and the cells that were originally grouped to form G31 probably spanned several distinct types, as suggested by the group’s unusually high coverage factor (Baden et al., 2016). Together, these results provided strong evidence that RGCs grouped based on responses to synthetic stimuli (chirp and MB) also form functional groups in natural movie response space.
CNN model predicts centre colour opponency in RGC group G28
Our goal was to explore chromatic tuning of RGCs and to identify novel stimulus selectivities related to chromatic contrast. Therefore, we specifically focused on regions in stimulus space where a given stimulus property differs for green and UV. For centre size and temporal frequency, we asked which RGC groups contributed to the MEIs outside of the 95th percentile around the diagonal (Figure 3g, h, j, and k). The 5% of MEIs furthest from the diagonal were contributed almost exclusively by ON cells; and among these, more by slow than by fast ON cells.
MEI contrast needed to be analysed differently than size and temporal frequency for two reasons. First, due to the dominance of UV-sensitive S-opsin in the ventral retina (Szél et al., 1992), stimuli in the UV channel were much more effective at eliciting RGC responses. As a result, the contrast of most MEIs is strongly shifted towards UV (Figure 3i). Second, contrast in green and UV can not only vary along positive-valued axes (as is the case for size and temporal frequency), but can also take on opposite signs, resulting in colour-opponent stimuli. Whereas most MEIs had the same contrast polarity in both colour channels (i.e. both ON or OFF, Figure 3c, blue and turquoise trajectories), some MEIs had opposing contrast polarities in UV and green (Figure 3c, orange trajectory, and Figure 3i, upper left quadrant). Thus, for contrast, we asked which RGC groups contributed to colour-opponent MEIs (i.e. MEIs in the colour-opponent, upper left or lower right quadrant in Figure 3i). Again, slow ON RGCs made up most of the cells with colour-opponent MEIs. Here, G28 stood out: 66% (24/36) of all cells of this group had colour-opponent MEIs (UVON-greenOFF), followed by G27 with 42% colour-opponent MEIs.
The colour opponency we found in G28 was not centre-surround, as described before in mice (Szatko et al., 2020), but rather a centre opponency (‘co-extensive’ colour-opponent RF; reviewed in Schwartz, 2021), as can be seen in the lower-dimensional visualisations (Figure 3a and b, right column; Figure 3c, orange trajectory).
In conclusion, our model-guided in silico exploration of chromatic stimulus space revealed a variety of preferred stimuli that captured known properties of RGC groups, and revealed a preference of G28 RGCs for centre colour-opponent, UVON-greenOFF stimuli, a feature previously unknown for this RGC group.
Experiments confirm selectivity for chromatic contrast
Next, we verified experimentally that the MEIs predicted for a given RGC group actually drive cells of that group optimally. To this end, we performed new experiments in which we added to our battery of stimuli a number of MEIs chosen according to the following criteria: We wanted the MEIs (i) to span the response space (ON, ON-OFF, OFF, transient, sustained, and contrast-suppressed) and (ii) to represent both well-described RGC types, such as α cells (i.e. G5, 24), and poorly understood RGC types, such as suppressed-by-contrast cells (G28, 31, 32) (Figure 4a). We therefore chose MEIs of RGCs from groups G1 (OFF local), G5 (OFF sustained), G10 (ON-OFF local-edge), G18 (ON transient), G20 (ON high frequency), G21 (ON low frequency), G23 (ON mini α), G24 (sustained ON α), G28 (ON contrast suppressed), G31 (OFF suppressed 1), and G32 (OFF suppressed 2). For simplicity, in the following we refer to the MEI of an RGC belonging to group g as group g’s MEI, or MEI g.
We presented these MEIs on a regularly spaced 5 × 5 grid to achieve approximate centring of stimuli on RGC RFs in the recording field (Figure 4b and c). For these recordings, we fit models whose readout parameters allowed us to estimate the RGCs’ RF locations. We used these RF location estimates to calculate a spatially weighted average of the responses to the MEIs displayed at different locations, weighting the response at each location proportional to the RF strengths at those locations (Figure 4b, red highlight, and Figure 4d, top). We then performed the same experiment in silico, confirming that the model accurately predicts responses to the MEIs (Figure 4d, bottom; Figure 4—figure supplement 1). These experiments allowed us to evaluate MEI responses at the RGC group level (Figure 4e and f; Figure 3—figure supplement 1).
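In code, this weighting amounts to a normalised dot product between the grid of responses and the grid of RF strengths; a minimal sketch with placeholder arrays (names are illustrative):

```python
import numpy as np

responses = np.random.rand(5, 5)    # placeholder: response to the MEI at each grid position
rf_weights = np.random.rand(5, 5)   # placeholder: readout RF strength at each position

# average the responses, weighting each location by its RF strength
weighted_response = np.sum(responses * rf_weights) / np.sum(rf_weights)
```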
We expected RGCs to show a strong response to their own group’s MEI, a weaker response to the MEIs of functionally related groups, and no response to MEIs of groups with different response profiles. Indeed, most RGC groups exhibited their strongest (G5,20,21,28,32) or second-strongest (G1,10,23) response to their own group’s MEI (Figure 4g, top). Conversely, RGC groups from opposing regions in response space showed no response to each others’ MEIs (e.g. G1,5 [OFF cells] vs. G21-28 [slow ON cells]). The model’s predictions showed a similar pattern (Figure 4g, bottom), thereby validating the model’s ability to generalise to the MEI stimulus regime.
Notably, G28 RGCs responded very selectively to their own MEI 28, displaying only weak responses to most other MEIs (Figure 4f and g; selectivity index of G28 to MEI 28, defined as the average difference in response between MEI 28 and all other MEIs in units of the standard deviation of the response; see Methods). This was in contrast to other RGC groups, such as G23 and G24, that responded strongly to MEI 28, but also to other MEIs from the slow ON response regime (Figure 4g, top; Figure 4—figure supplement 1). Hence, our validation experiments confirm the model’s prediction that RGC group G28 is selective for centre colour-opponent, UVON-greenOFF stimuli.
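A sketch of how such a selectivity index can be computed from a cell’s responses to the full MEI battery; taking the standard deviation across the responses to all MEIs is an assumption made here for illustration:

```python
import numpy as np

def selectivity_index(responses, own_idx):
    """Average difference between the response to the cell's own-group MEI
    and the responses to all other MEIs, in units of response SD."""
    others = np.delete(responses, own_idx)
    return np.mean(responses[own_idx] - others) / np.std(responses)
```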
G28 corresponds to the tSbC RGC type
Next, we sought to identify which RGC type G28 corresponds to. In addition to its unique centre colour opponency, the responses of G28 displayed a pronounced transient suppression to temporal contrast modulations (chirp response in Figure 1e). Therefore, we hypothesised that G28 corresponds to the tSbC RGC type (Tien et al., 2015; Tien et al., 2016; Tien et al., 2022), which is one of the three SbC RGC types identified so far in the mouse and is also referred to as ON delayed (OND) because of its delayed response onset (Jacoby and Schwartz, 2018).
To test this hypothesis, we performed cell-attached electrophysiology recordings (Figure 5) targeting tSbC/OND cells (N=4), identified by their responses to spots of multiple sizes (Goetz et al., 2022), and later confirmed by their distinctive morphology (Jacoby and Schwartz, 2018; type 73 in Bae et al., 2018; Figure 5c and d). We recorded spikes while presenting the MEI stimuli (Figure 5a, top). Just like G28 RGCs in the Ca2+ imaging, tSbC/OND cells exhibited a pronounced selectivity for MEI 28, and were suppressed by most other MEIs (Figure 5a, middle and bottom). Notably, the characteristic delayed response onset was visible in both the Ca2+ (Figure 4f, top) and electrical (Figure 5a) responses but was not predicted by the model (Figure 4f, bottom).
As a control, we also recorded MEI responses of a different, well-characterised RGC type, sustained (s) ON α (G24; Krieger et al., 2017; Figure 5b, top; N=4). Again, the electrical recordings of the cells’ MEI responses yielded virtually the same results as the Ca2+ imaging (Figure 5b, middle and bottom; Figure 4—figure supplement 1). Crucially, sON α cells were not selective for MEI 28. The fact that these experiments with precise positioning of stimuli on the cells’ RFs yielded the same results as the 2P imaging experiments confirms the validity of the grid approach for stimulus presentation used in the latter.
Chromatic contrast selectivity derives from a nonlinear transformation of stimulus space
Next, we asked whether the selectivity of G28 (tSbC) RGCs is a linear feature, as could be achieved by two linear filters with opposite signs for the two colour channels, or a nonlinear feature. To address this question, we tested whether an LN model (implemented using convolutions; see Methods) could recover the chromatic selectivity of G28 by using it to predict MEIs. We found that the LN model predicted colour-opponent MEIs for only 9 out of 36 (25%) G28 RGCs (nonlinear CNN: 24 out of 36 [66%] colour-opponent MEIs; Figure 6a–c). This finding argues against the possibility that G28’s colour opponency can be explained at the computational level by two opposite-sign linear filters operating on the two colour channels, which an LN model could recover. Instead, it suggests a nonlinear dependency between chromatic contrast (of the stimulus) and chromatic selectivity (of the cell). In other words, G28 RGCs process stimuli differently depending on their chromatic contrast. This is a nonlinear property that cannot be accurately captured by an LN model, which makes a single estimate of the linear filter for the whole stimulus space.
To understand the nature of this dependency, we expanded the estimate of the model RGCs’ tuning to colour contrast around the maximum (the MEI). We did this by mapping the model neurons’ response and its gradient in 2D chromatic contrast space (Figure 6b and c). This analysis revealed that G28 RGCs indeed have a nonlinear tuning for colour contrast: they are strongly UV-selective at lower contrasts, but become colour-opponent, i.e., additionally inhibited by green, at higher contrasts. For individual neurons whose very strong colour opponency extends over a large region of chromatic contrast space, the LN model’s approximation also reflects this colour opponency, demonstrating that the LN model can, in principle, capture colour opponency (Figure 6—figure supplement 1).
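Because the model is differentiable, both the response and its gradient can be evaluated on a grid of chromatic contrasts; a sketch, with a placeholder helper `render_stimulus` standing in for the construction of an MEI-like stimulus at given green and UV centre contrasts (all names and shapes are illustrative):

```python
import torch

def render_stimulus(green, uv, t=50, h=18, w=16):
    # placeholder: full-field stimulus with the given contrast in each channel
    g = green * torch.ones(1, 1, t, h, w)
    u = uv * torch.ones(1, 1, t, h, w)
    return torch.cat([g, u], dim=1)          # shape (1, 2, t, h, w)

def contrast_tuning_map(model, neuron_idx, n=21, c_max=1.0):
    contrasts = torch.linspace(-c_max, c_max, n).tolist()
    resp = torch.zeros(n, n)
    grad = torch.zeros(n, n, 2)
    for i, g in enumerate(contrasts):        # green centre contrast
        for j, u in enumerate(contrasts):    # UV centre contrast
            c = torch.tensor([g, u], requires_grad=True)
            r = model(render_stimulus(c[0], c[1]))[0, neuron_idx]
            r.backward()
            resp[i, j] = r.detach()
            grad[i, j] = c.grad              # d(response)/d(green, UV contrast)
    return resp, grad
```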
We confirmed the model’s predictions about G28’s nonlinear tuning for colour contrast using electrical recordings as described above. The example G28 (tSbC) cells shown in the figure exhibit similar nonlinear tuning in chromatic contrast space (Figure 6e–g). The first example cell’s firing rate (Figure 6f) and, consequently, the tuning curve (Figure 6e) peak for UVON-greenOFF stimuli (top left in panel e; upper row, second from left in f), and are lower for UVON-greenON stimuli (top right in panel e; upper row, far right in f), reflecting the suppressive effect of green contrast on the cell’s response. The same is true for the second example cell (Figure 6g).
The nonlinearity in tuning to colour contrast of G28 RGCs leads to a transformation of stimulus space (Figure 6) that amplifies the distance of colour-opponent stimuli from non-colour-opponent stimuli and is thereby expected to increase their discriminability. We therefore hypothesised that the representation of visual input formed by G28 might serve to detect an ethologically relevant, colour-opponent feature from the visual scene.
Visual context changes are characterised by changes in chromatic contrast
Chromatic contrast changes at the horizon (Khani and Gollisch, 2021; Qiu et al., 2021; Gupta et al., 2022; Abballe and Asari, 2022), and so does visual context: from sky to ground or vice versa. We therefore hypothesised that G28 might leverage chromatic contrast as a proxy for detecting changes in visual context, such as might be caused by head or eye movements that cross the horizon. Detecting these changes in visual context could provide information that is crucial for interpreting signals in other RGC channels.
In our natural movie stimulus, the transitions between movie clips (inter-clip transitions; Figure 1b) can be categorised into those with and without change in visual context: ground-to-sky and sky-to-ground transitions for vertical movements with a change in visual context, and as controls, ground-to-ground and sky-to-sky transitions for horizontal movements without change in visual context. We calculated the contrast of these transitions in the green and UV channel to map them to chromatic contrast stimulus space (Figure 7a). We found that ground-to-ground and sky-to-sky transitions were distributed along the diagonal, indicating that they reflect largely achromatic contrast changes. The two transitions resembling visual input elicited by vertical movements crossing the horizon fell into the two colour-opponent quadrants of the stimulus space: sky-to-ground transitions in the lower right quadrant, and ground-to-sky transitions in the upper left quadrant (Figure 7a and b). The UVON-greenOFF MEIs 28, the preferred stimuli of G28, shared a location in stimulus space with ground-to-sky transitions, indicating that these two stimuli are similar in terms of chromatic contrast (Figure 3i).
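A sketch of how each inter-clip transition can be mapped into chromatic contrast space; the specific contrast definition used here, the difference in mean intensity across the transition per colour channel, is an illustrative assumption:

```python
import numpy as np

def transition_contrast(movie, t_transition, window=30):
    """movie: array (2, T, H, W); returns (green, UV) contrast of one transition."""
    before = movie[:, t_transition - window:t_transition].mean(axis=(1, 2, 3))
    after = movie[:, t_transition:t_transition + window].mean(axis=(1, 2, 3))
    return after - before                    # one contrast value per colour channel
```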
Chromatic contrast selectivity allows detecting visual context changes
Next, we tested if G28 RGCs indeed respond strongly to visual context changes as occur in ground-to-sky transitions. To this end, we extracted the RGC responses to the inter-clip transitions, mapping their tuning to chromatic contrasts (Figure 7—figure supplement 1, Figure 7—figure supplement 2), and then averaged the resulting single-cell tuning maps for each RGC group (e.g. see Figure 7c–e). G28 is most strongly tuned to full-field transitions in the upper left quadrant containing mostly ground-to-sky inter-clip transitions (Figure 7c) – unlike, for example, non-colour-opponent reference RGC groups from the slow ON and OFF response regime (Figure 7d and e).
Could a downstream visual area detect ground-to-sky visual context changes based on input from G28 RGCs? To answer this question, we performed a linear signal detection analysis for each RGC by sliding a threshold across its responses to the inter-clip transitions, classifying all transitions that elicited an above-threshold response as ground-to-sky, and evaluating false-positive and true-positive rates (FPR and TPR, respectively) for each threshold (Figure 7f). Plotting the resulting TPRs for all thresholds as a function of FPRs yields a receiver operating characteristic (ROC) curve (Fawcett, 2006; Figure 7f, middle). The area under this curve (AUC) can be used as a measure of detection performance: it is equivalent to the probability that a given RGC will respond more strongly to a ground-to-sky transition than to any other type of transition. Indeed, G28 RGCs achieved the highest AUC on average (Figure 7f, bottom, and Figure 7g; G28, mean ± SD AUC (N=78 cells): 0.68 ± 0.08; two-sample permutation test of G28 vs. all other groups with at least N=4 cells (see Methods), significant for each group after Bonferroni correction for 30 multiple comparisons).
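A minimal sketch of this per-cell ROC analysis, assuming `responses` holds one response per inter-clip transition and `is_ground_to_sky` the matching boolean labels:

```python
import numpy as np

def roc_auc(responses, is_ground_to_sky):
    """Area under the ROC curve: probability that a ground-to-sky transition
    elicits a stronger response than any other transition type."""
    pos = is_ground_to_sky.sum()
    neg = (~is_ground_to_sky).sum()
    tpr, fpr = [0.0], [0.0]
    for th in np.sort(np.unique(responses))[::-1]:   # slide the threshold down
        detected = responses >= th    # above-threshold -> classified ground-to-sky
        tpr.append((detected & is_ground_to_sky).sum() / pos)
        fpr.append((detected & ~is_ground_to_sky).sum() / neg)
    return np.trapz(tpr, fpr)         # integrate TPR over FPR
```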
Ground-to-sky transitions, and therefore visual context changes, can also appear in the lower visual field, which is processed by the dorsal retina, where RGCs receive weaker UV input (Szatko et al., 2020). We recorded additional fields in the dorsal retina (Figure 7—figure supplement 2) and found that here, too, G28 (tSbC) RGCs displayed the strongest tuning to ground-to-sky transitions among all dorsal RGCs (Figure 7—figure supplement 3c–h; for statistics, see legends).
Visual context changes triggered by different behaviours, such as locomotion and head or eye movements, will differ strongly with respect to their statistics – in particular with respect to their speed. Therefore, for G28 (tSbC) RGCs to play a role in detecting context changes, their detection performance should be robust across velocities. To test whether this is the case, we conducted additional in silico experiments where we ran model inference on stimuli that simulate transitions across the visual field with and without context change (Figure 7h) at different velocities: 50, 150, 250, and 350 visual degrees per second (° s–1; see Methods; Figure 7—figure supplement 3a and b). The slowest speed simulated visual input such as could be elicited by locomotion, and the fastest speed approached that of saccades (Meyer et al., 2020). We then performed an ROC analysis on the model cell responses. It should be noted that, because model predictions are noise-free, results from the ROC analysis based on simulated responses will overestimate detection performance. However, under the assumption of approximately equal noise levels across RGC groups, we can still draw conclusions about the relative performance of different RGC groups. This analysis confirmed that G28 RGCs could distinguish ground-to-sky context changes from all other types of transitions robustly across different speeds (Figure 7i and j). Interestingly, the advantage of G28 over other RGC groups in performing this detection task diminished with increasing speed (Figure 7—figure supplement 3c and d; see also Discussion).
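A sketch of how such transition stimuli can be simulated at a given velocity, assuming a panoramic two-channel scene array spanning from ground to sky; with 437 pixels per 180° (see Methods), one pixel corresponds to ≈0.41°, and the remaining geometry is illustrative:

```python
import numpy as np

def simulate_transition(scene, speed_deg_s, fps=30, deg_per_px=0.41, crop_h=64):
    """scene: array (2, H_pan, W), ground at top, sky at bottom;
    returns a movie (2, T, crop_h, W) sweeping across the scene."""
    px_per_frame = speed_deg_s / deg_per_px / fps    # vertical shift per frame
    n_frames = int((scene.shape[1] - crop_h) / px_per_frame)
    frames = [scene[:, int(k * px_per_frame):int(k * px_per_frame) + crop_h]
              for k in range(n_frames)]
    return np.stack(frames, axis=1)
```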
Together, these analyses demonstrate that a downstream area, reading out from a single RGC group, would achieve the best performance in detecting ground-to-sky context changes if it based its decisions on inputs from G28 RGCs, robustly across different lighting conditions (transitions between movie snippets), retinal location (ventral and dorsal), and speeds.
Discussion
We combined large-scale recordings of RGC responses to natural movie stimulation with CNN-based modelling to investigate colour processing in the mouse retina. By searching the stimulus space in silico to identify maximally exciting inputs (MEIs), we found a novel type of chromatic tuning in tSbC RGCs. We revealed this RGC type’s pronounced and unique selectivity for full-field changes from green-dominated to UV-dominated scenes, a stimulus that matches the chromatic statistics of ground-to-sky transitions in natural scenes. Therefore, we suggest that tSbC cells may signal context changes within their RF. Beyond our focus on tSbC cells, our study demonstrates the utility of an in silico approach for generating and testing hypotheses about the ethological relevance of sensory representations.
Nonlinear approaches for characterising neuronal selectivities
We leveraged image-computable models in combination with an optimisation approach to search in dynamic, chromatic stimulus space for globally optimal inputs for RGCs, the MEIs. The resulting MEI represents the peak in the nonlinear loss landscape that describes the neuron’s tuning in high-dimensional stimulus space. This approach has also been used to reveal the complexities and nonlinearities of neuronal tuning in monkey visual cortex area V4 (Bashivan et al., 2019; Willeke et al., 2023) and mouse area V1 (Walker et al., 2019; Franke et al., 2022).
Finding optimal stimuli via predictive models is by no means the only way to reveal nonlinear selectivity. Several alternative approaches exist (Schwartz et al., 2006; Sharpee et al., 2004; Liu et al., 2017; Globerson et al., 2009; Maheswaranathan et al., 2023) that could in principle also recover the type of tuning we report (although it is not a trivial question whether and under what conditions they would). More importantly, these approaches are not readily applicable to our data. Some approaches, such as spike-triggered covariance (Schwartz et al., 2006) or spike-triggered non-negative matrix factorisation (Liu et al., 2017), typically make assumptions about the distribution of input stimuli that are violated, or at least not guaranteed, for naturalistic stimuli, which consequently precludes using these methods to probe cells under their natural stimulus statistics (adaptations of these methods for naturalistic stimuli exist, e.g., see Aljadeff et al., 2013; Aljadeff et al., 2016). Other approaches, including maximally informative dimensions (Sharpee et al., 2004) or maximum noise entropy (Globerson et al., 2009), use information theory as a framework. They require estimating mutual information between stimulus and responses, which is a challenge when dealing with high-dimensional stimuli and continuous responses.
Predictive models, on the other hand, can handle high-dimensional input distributions well and can easily be adapted to different response modalities. However, one important limitation of our approach for identifying neuronal stimulus selectivities is that searching for the maximally exciting input will return a single input – even when there are several inputs that would elicit an equal response, such as ON and OFF stimuli for ON-OFF cells (see Figure 3—figure supplement 1, G10 MEIs). A remedy for this limitation is to search for diverse exciting inputs by generating stimuli that are both highly effective at eliciting neuronal responses and at the same time distinct from one another. Ding et al., 2023 used this approach to study bipartite invariance in mouse V1 (see also Cadena et al., 2018).
Another limitation is that identifying the MEI does not immediately provide insight into how the different stimulus dimensions contribute to the neuron’s response, i.e., how varying the stimulus along these dimensions affects the neuron’s response. However, differentiable models readily lend themselves to explore the interactions and contributions of different stimulus dimensions in generating the neuronal response, e.g., by inspecting the gradient field along dimensions of interest as done here, or by searching for locally optimal stimulus perturbations (Goldin et al., 2022). These models can also be used to understand better the properties that distinguish cell types from each other by generating most discriminative stimuli, as recently demonstrated for RGCs in mouse and marmoset (Burg et al., 2023).
Together, these studies showcase the versatility of optimisation-based approaches for characterising nonlinear neuronal operations in high-dimensional, natural stimulus spaces. We add to this toolkit by first searching for a globally optimal stimulus, and then searching locally in its vicinity to map the cells’ loss landscape around the peak.
Circuit mechanisms for colour opponency in tSbC RGCs
Most previous studies of colour opponency in the mouse retina have identified sparse populations of colour-opponent RGCs that have not been systematically assigned to a particular functional type (Szatko et al., 2020; Khani and Gollisch, 2021; Gouras and Ekesten, 2004). The only studies that have examined the mechanisms of colour opponency in identified mouse RGC types showed a centre-surround organisation, with RF centre and surround having different chromatic preferences (Chang et al., 2013; Joesch and Meister, 2016; Stabio et al., 2018, but see Sonoda et al., 2020). While we do not specifically analyse centre-surround opponency in this study, we see a similar trend as described previously in many RGC types, with stronger surrounds in the green channel relative to the UV channel (see Figure 4a, Figure 3—figure supplement 1). tSbC RGCs, in contrast, respond to spatially co-extensive colour-opponent stimuli, functionally reminiscent of colour-opponent RGCs in guinea pig (Yin et al., 2009) and ground squirrels (Michael, 1968).
In mice, centre-surround opponency has been attributed to the opsin gradient (Chang et al., 2013) and rod contributions in the outer retina (Joesch and Meister, 2016; Szatko et al., 2020), whereas the circuitry for spatially co-extensive opponency remains unknown. It seems unlikely, though, that the opsin gradient plays a major role in the tSbC cell’s colour opponency, because both ventral and dorsal tSbC cells preferentially responded to full-field green-to-UV transitions. In primates, spatially co-extensive colour opponency in small bistratified RGCs is thought to arise from the selective wiring of S-ON and M/L-OFF bipolar cells onto the inner and outer dendritic strata, respectively (Dacey and Lee, 1994, but see Field et al., 2007). A similar wiring pattern seems unlikely for tSbC RGCs, since their inner dendrites do not co-stratify with the S-ON (type 9) bipolar cells, nor do their outer dendrites co-stratify with the candidate M-OFF bipolar cell (type 1) (Behrens et al., 2016). The bistratified dendritic arbour also distinguishes the mouse tSbC from the colour-opponent ON RGC type in guinea pig, which is monostratified (Yin et al., 2009).
The large RF centres of the tSbC cells, extending well beyond their dendritic fields, come from a non-canonical circuit, in which tonic inhibition onto the RGC via GABAB receptors is relieved via serial inhibition from different amacrine cells using GABAC receptors (Mani and Schwartz, 2017). An intriguing possibility is that a colour-selective amacrine cell is part of this circuit, perhaps supporting chromatically tuned disinhibition in the absence of selective wiring from the aforementioned cone-selective bipolar cells onto the RGC.
A new functional role for tSbC RGCs
Suppressed-by-contrast responses have been recorded along the early visual pathway in dorsal lateral geniculate nucleus (dLGN), superior colliculus (SC), and primary visual cortex (V1) (Niell and Stryker, 2010; Piscopo et al., 2013; Ito et al., 2017), with their function still being debated (Masland and Martin, 2007). In the retina, three types of SbC RGCs have so far been identified (reviewed in Jacoby and Schwartz, 2018), among them the tSbC cell (Mani and Schwartz, 2017; Tien et al., 2015; Tien et al., 2022). Despite their relatively recent discovery, tSbC RGCs have been suggested to play a role in several different visual computations. The first report of their light responses in mice connected them to the SbC RGCs previously discovered in rabbit, cat, and macaque, and suggested a role in signalling self-generated stimuli, perhaps for saccade suppression (Tien et al., 2015). Aided by a new intersectional transgenic line to selectively label tSbC RGCs (Tien et al., 2022), their projections were traced to areas in SC, v- and dLGN, and nucleus of the optic tract (NOT). The latter stabilises horizontal eye movements; however, as the medial terminal nucleus, which serves stabilisation of vertical eye movements, lacks tSbC innervation, it is unclear whether and how these RGCs contribute to gaze stabilisation.
A retinal study identified the circuit mechanisms responsible for some of the unique spatial and temporal response properties of tSbC cells and suggested a possible role in defocus detection to drive emmetropization in growing eyes and accommodation in adults (Mani and Schwartz, 2017; Baden et al., 2017). Here, we identified another potential role for these RGCs in vision based on the chromatic properties of their RFs: signalling visual context changes (see next section). These different possible functional roles are not mutually exclusive, and might even be complementary in some cases, highlighting the difficulty in assigning single features to distinct RGC types (Schwartz and Swygart, 2020). In particular, the centre colour opponency that we discovered in tSbC RGCs could serve to enhance their role in defocus detection by adding a directional signal (myopic vs. hyperopic) based on the chromatic aberration of lens and cornea (Gawne and Norton, 2020). Future studies may test these theories by manipulating these cells in vivo using the new transgenic tSbC mouse line (Tien et al., 2022).
Behavioural relevance of context change detection
The horizon is a prominent landmark in visual space: it bisects the visual field into two regions, ground and sky. This is particularly relevant in animals like mice, where eye movements largely compensate for head movements and keep the visual field stable with respect to the horizon (Meyer et al., 2020). Visual stimuli carry different meaning depending on where they occur relative to the horizon, and context-specific processing of visual inputs is necessary for selecting appropriate behavioural responses (reviewed in Evans et al., 2019). For example, it is sensible to assume that a looming stimulus above the horizon is a predator, the appropriate response to which would be avoidance (i.e. escape or freezing). A similar stimulus below the horizon, however, is more likely to be harmless or even prey. To allow for time-critical perceptual decisions – predator or prey – and corresponding behavioural response selection – avoidance or approach – it might be useful for stimulus information (e.g. dark moving spot) and contextual information to converge early in the visual circuitry.
Notably, VGluT3-expressing amacrine cells (a ‘hub’ for distributing information about motion) represent a shared element in upstream circuitry, providing opposite-sign input to tSbC cells and to RGCs implicated in triggering avoidance behaviour, such as tOFF α (Krieger et al., 2017; Münch et al., 2009) and W3 cells (Zhang et al., 2012). In downstream circuitry, SbC inputs have been found to converge with ‘conventional’ RGC inputs onto targets in dLGN and NOT; whether tSbC axons specifically converge with tOFF α or W3 axons remains to be tested. Such convergence may allow ‘flagging’ the activity of these RGCs with their local context (sky/threat or ground/no threat).
Depending on the behaviour that elicits a context change – be it a head or eye movement or locomotion – the parameters of the incoming stimulus, such as illumination level and velocity, may change. To be behaviourally useful, a context-change-flagging signal needs to be reliable and robust across these different stimulus parameters. While many slow-ON RGCs achieve high detection performance at higher transition velocities, probably reacting to the increasingly flash-like stimuli, tSbC/G28 RGCs were the only type with robustly high performance across different levels of illumination and all simulated speeds.
In silico approaches to linking neural tuning and function
Modelling studies have advanced our understanding of the complexity and organisation of retinal processing in recent years. It is helpful to consider the contributions of different studies in terms of three perspectives on the retinal encoding of natural scenes: the circuit perspective (‘how?’), the normative perspective (‘why?’), and the coding perspective (‘what?’) (Marr, 2010; Karamanlis et al., 2022). For example, an in silico dissection of a CNN model of the retina offered explanations on how the surprisingly complex retinal computations, such as motion reversal, omitted stimulus response, and polarity reversal, emerge from simpler computations within retinal circuits (Maheswaranathan et al., 2023; Tanaka et al., 2019). From the normative perspective, networks trained on an efficient coding objective accurately predicted the coordination of retinal mosaics (Roy et al., 2021).
Here, we proposed an approach that allows investigating the complexity of retinal processing simultaneously from the coding and the normative perspectives: A global search for most exciting mouse RGC inputs in dynamic, chromatic stimulus space provides answers to the question of what it is that retinal neurons encode. Interpreting the abstract features extracted by the retina against the backdrop of natural stimulus space points to why these features might be behaviourally relevant. And finally, classifying individual RGCs into types allows bringing in the circuit perspective through targeted experiments aimed at dissecting how specific retinal computations are implemented.
Methods
Animals and tissue preparation
All imaging experiments were conducted at the University of Tübingen; the corresponding animal procedures were approved by the governmental review board (Regierungspräsidium Tübingen, Baden-Württemberg, Konrad-Adenauer-Str. 20, 72072 Tübingen, Germany) and performed according to the laws governing animal experimentation issued by the German Government. All electrophysiological experiments were conducted at Northwestern University; the corresponding animal procedures were performed according to standards provided by Northwestern University Center for Comparative Medicine and approved by the Institutional Animal Care and Use Committee (IACUC).
For all imaging experiments, we used 4- to 15-week-old C57Bl/6J mice (N=23; JAX 000664) of either sex (10 male, 13 female). These animals were housed under a standard 12 hr day/night rhythm at 22 °C and 55% humidity. On the day of the recording experiment, animals were dark-adapted for at least 1 hr, then anaesthetised with isoflurane (Baxter) and killed by cervical dislocation. All following procedures were carried out under very dim red (>650 nm) light. The eyes were enucleated and hemisected in carboxygenated (95% O2, 5% CO2) artificial cerebrospinal fluid (ACSF) solution containing (in mM): 125 NaCl, 2.5 KCl, 2 CaCl2, 1 MgCl2, 1.25 NaH2PO4, 26 NaHCO3, 20 glucose, and 0.5 L-glutamine at pH 7.4. Next, the retinae were bulk-electroporated with the fluorescent Ca2+ indicator Oregon-Green BAPTA-1 (OGB-1), as described earlier (Briggman and Euler, 2011). In brief, the dissected retina was flat-mounted onto an Anodisc (#13, 0.2 μm pore size, GE Healthcare) with the RGCs facing up, and placed between a pair of 4 mm horizontal plate electrodes (CUY700P4E/L, Nepagene/Xceltis). A 10 µl drop of 5 mM OGB-1 (hexapotassium salt; Life Technologies) in ACSF was suspended from the upper electrode and lowered onto the retina. Next, nine pulses (≈9.2 V, 100 ms pulse width, at 1 Hz) from a pulse generator/wide-band amplifier combination (TGP110 and WA301, Thurlby handar/Farnell) were applied. Finally, the tissue was placed into the microscope’s recording chamber, where it was perfused with carboxygenated ACSF (at ≈36 °C) and left to recover for ≥30 min before recordings started. To visualise vessels and damaged cells in the red fluorescence channel, the ACSF contained ≈0.1 µM Sulforhodamine-101 (SR101, Invitrogen) (Euler et al., 2009).
For electrophysiology experiments, we used ChAT-Cre (JAX 006410) × Ai14 (JAX 007914) mice on a C57Bl/6J background (N=2, male, aged 27 and 30 weeks). Mice were housed with siblings in groups of up to 4, fed normal mouse chow and maintained on a 12:12 hr light/dark cycle. Before the experiment, mice were dark-adapted overnight and sacrificed by cervical dislocation. Retinal tissue was isolated under infrared illumination (900 nm) with the aid of night-vision goggles and IR dissection scope attachments (BE Meyers). Retinal orientation was identified using scleral landmarks (Wei et al., 2010), and preserved using relieving cuts in cardinal directions, with the largest cut at the dorsal retina. Retinas were mounted on 12 mm poly-D-lysine-coated glass affixed to a recording dish with grease, with the GCL up. Oxygenation was maintained by superfusing the dish with carboxygenated Ames medium (US Biological, A1372-25) warmed to 32°C. For cell-attached single-cell recordings, we used Symphony software (Cafaro et al., 2019) with custom extensions (Schwartz and Ala-Laurila, 2024).
Owing to the exploratory nature of our study, we did not use randomisation and blinding. No statistical methods were used to predetermine sample size.
Two-photon calcium imaging
We used a MOM-type two-photon microscope (designed by W Denk, purchased from Sutter Instruments) (Euler et al., 2009; Euler et al., 2019), which was equipped with a mode-locked Ti:sapphire laser (MaiTai-HP DeepSee, Newport Spectra-Physics) tuned to 927 nm, two fluorescence detection channels for OGB-1 (HQ 510/84, AHF/Chroma) and SR101 (HQ 630/60, AHF), and a water immersion objective (CF175 LWD ×16/0.8 W, DIC N2, Nikon, Germany). Image acquisition was performed with custom-made software (ScanM by M Müller and TE) running under IGOR Pro 6.3 for Windows (Wavemetrics), taking time-lapsed 64 × 64 pixel image scans (≈(100 μm)²) at 7.8125 Hz (Figure 1c). For simplicity, we refer to such a time-lapsed scan of a local population of GCL cells as a ‘recording’. Despite the low frame rate, the Ca2+ responses can be related to the spike rate (Trapani et al., 2023; Román Rosón et al., 2019; Berens et al., 2018; Theis et al., 2016). For documenting the position of the recording fields, the retina under the microscope was oriented such that the most ventral edge always pointed towards the experimenter. In addition, higher resolution images (512 × 512 pixels) were acquired, and recording field positions relative to the optic nerve were routinely logged.
Data preprocessing
Ca2+ traces were extracted for individual ROIs as described previously (Baden et al., 2016; Szatko et al., 2020). Extracted traces were then detrended to remove slow drifts in the recorded signal that were unrelated to changes in the neural response. First, a smoothed version of the traces, $y_{\text{smooth}}$, was calculated by applying a Savitzky-Golay filter of third polynomial order and a window length of 60 s using the SciPy implementation scipy.signal.savgol_filter. This smoothed version was then subtracted from the raw traces to yield the detrended traces, $y_{\text{detrended}} = y_{\text{raw}} - y_{\text{smooth}}$.

To make traces non-negative ($y^{+}$), we then clipped all values smaller than the 2.5th percentile, $p_{2.5}$, to that value, and then subtracted $p_{2.5}$ from the detrended traces:

$$y^{+} = \max\left(y_{\text{detrended}},\, p_{2.5}\right) - p_{2.5}$$

This procedure (i.e. clipping to, and subtracting, $p_{2.5}$) was more robust than simply subtracting the minimum.

Finally, traces were then divided by the standard deviation within the time window before stimulus start at $t_{0}$:

$$y = \frac{y^{+}}{\mathrm{SD}\left[y^{+}\right]_{t<t_{0}}}$$
For training the model on movie responses, we then estimated firing rates from the detrended Ca2+ traces using the package C2S (https://github.com/lucastheis/c2s, copy archived at lucastheis, 2016; Theis et al., 2016).
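A sketch of this preprocessing chain for a single trace sampled at 7.8125 Hz (variable names are illustrative; the spike-rate inference step via C2S is omitted):

```python
import numpy as np
from scipy.signal import savgol_filter

def preprocess(trace, fs=7.8125, t0_idx=None):
    # detrend: subtract a slow Savitzky-Golay fit (3rd order, ~60 s window)
    win = int(60 * fs) // 2 * 2 + 1                  # nearest odd window length
    detrended = trace - savgol_filter(trace, window_length=win, polyorder=3)
    # make non-negative: clip to, then subtract, the 2.5th percentile
    p = np.percentile(detrended, 2.5)
    nonneg = np.maximum(detrended, p) - p
    # normalise by the SD of the baseline before stimulus onset at t0
    baseline = nonneg[:t0_idx] if t0_idx is not None else nonneg
    return nonneg / baseline.std()
```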
Inclusion criteria
We applied a sequence of quality filtering steps to recorded cells before analysis, as illustrated in Figure 1—figure supplement 1. As a first step, we applied a general response quality criterion, defined as a sufficiently reliable response to the MB stimulus or to the chirp stimulus (as quantified by the respective quality index, $QI_{MB}$ or $QI_{chirp}$, exceeding a threshold). The quality index is defined as in Baden et al., 2016:

$$QI = \frac{\operatorname{Var}\left[\langle C \rangle_{r}\right]_{t}}{\left\langle \operatorname{Var}\left[C\right]_{t} \right\rangle_{r}}$$

where $C$ is the $T$-by-$I$ response matrix (time samples by stimulus repetitions), and $\langle \cdot \rangle_{x}$ and $\operatorname{Var}[\cdot]_{x}$ denote the mean and variance across the indicated dimension $x$, respectively.
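In code, the quality index is compact (a sketch for a response matrix `C` of shape (T, I)):

```python
import numpy as np

def quality_index(C):
    """Variance over time of the repetition-averaged response, divided by
    the repetition-average of the per-repetition variance over time."""
    return np.var(C.mean(axis=1)) / np.mean(np.var(C, axis=0))
```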
The second and third step made sure only cells were included that were assigned to a ganglion cell group (i.e. group index between 1 and 32) with sufficient confidence. Confidence is defined as the probability assigned to the predicted class by the random forest classifier (see Qiu et al., 2023), and the threshold was set at ≥0.25.
The fourth step made sure only cells with a sufficient model prediction performance, defined as an average single-trial test set correlation exceeding a threshold, were included.
All cells passing steps 1–3 were included in the horizon detection analysis (Figure 7); all cells passing steps 1–4 were included in the MEI analysis (Figure 3); the ‘red’ cells passing steps 1–4 were included in the MEI validation analysis (Figure 4). In the process of analysing MEIs, we fitted DoGs to their green and UV spatial components (see Methods section Concentric anisotropic 2D DoG fit). For the analysis of MEI properties (temporal frequency, centre size, chromatic contrast), we only included cells with a sufficient DoG goodness-of-fit, defined as a value of the cost function below 0.11 for both green and UV. This threshold was determined by visual inspection of the DoG fits and led to the inclusion of 1613 out of 1947 RGCs in the MEI property analysis.
Visual stimulation
For light stimulation (imaging experiments), we projected the image generated by a digital light processing projector (lightcrafter DPM-FE4500MKIIF, EKB Technologies Ltd) through the objective onto the tissue. The lightcrafter featured a light-guide port to couple in external, band-pass-filtered UV and green LEDs (light-emitting diodes) (green: 576 BP 10, F37-576; UV: 387 BP 11, F39-387; both AHF/Chroma) (Franke et al., 2019). To optimise spectral separation of mouse M- and S-opsins, LEDs were band-pass-filtered (390/576 dual-band, F59-003, AHF/Chroma). LEDs were synchronised with the microscope’s scan retrace. Stimulator intensity (as photoisomerisation rate, in 10³ P*/s per cone) was calibrated to range from ≈0.5 (black image) to ≈20 for M- and S-opsins, respectively. Additionally, we estimated a steady illumination component of ≈10⁴ P*/s per cone to be present during the recordings because of two-photon excitation of photopigments (Euler et al., 2009; Euler et al., 2019). Before data acquisition, the retina was adapted to the light stimulation by presenting a binary noise stimulus (20 × 15 matrix, (40 μm)² pixels, balanced random sequence) at 5 Hz for 5 min to the tissue. Stimuli were presented using the software QDSpy (RRID:SCR_016985; https://github.com/eulerlab/QDSpy).
For electrophysiology experiments, stimuli were presented using a digital projector (DPM-FE4500MKII, EKB Technologies Ltd) at a frame rate of 60 Hz and a spatial resolution of 1140 × 912 pixels (1.3 μm per pixel) focused on the photoreceptor layer. Neutral density filters (Thorlabs), a triple-band-pass filter (405 BP 20, 485 BP 20, 552 BP 16; 69,000×, Chroma), and a custom LED controller circuit were used to attenuate the light intensity of stimuli either to match that of the Ca²⁺ imaging experiments (for MEI presentation) or to range from ≈0 to 200 photoisomerisations per second and rod (for cell identification). Stimuli were presented using Symphony software (https://symphony-das.github.io/) with custom extensions (https://github.com/Schwartz-AlaLaurila-Labs/sa-labs-extension, copy archived at Schwartz-AlaLaurila-Labs, 2024).
Identifying RGC types
To functionally identify RGC groups in the Ca²⁺ imaging experiments, we used our default ‘fingerprinting’ stimuli, as described earlier (Baden et al., 2016). These stimuli included a full-field (700 μm in diameter) chirp stimulus, and a 300 × 1000 μm bright bar moving at 1000 μm/s in eight directions across the recording field (with the shorter edge leading; Figure 1b).
The procedure and rationale for identifying cells in the electrophysiological recordings are presented in Goetz et al., 2022. Cells with responses that qualitatively matched those of the OND and ON types were included in the study. Following recording, cells were filled with Alexa Fluor-488 by patch pipette and imaged under a two-photon microscope. Dendrites were traced in Fiji (NIH) using the SNT plugin (Arshadi et al., 2021). Dendritic arbours were computationally flattened using a custom MATLAB tool (https://doi.org/10.5281/zenodo.6578530) based on the method in Sümbül et al., 2014, to further confirm their identity as morphological type 73 from Bae et al., 2018.
Mouse natural movies
The natural movie stimulus consisted of clips of natural scenes recorded outside in the field with a specialised, calibrated camera (Qiu et al., 2021). This camera featured a fish-eye lens and two spectral channels, UV (band-pass filter F37-424, AHF, >90% transmission at 350–419 nm) and green (F47-510, >90%, 470–550 nm, AHF), approximating the spectral sensitivities of mouse opsins (Jacobs et al., 2004). In mice, eye movements often serve to stabilise the image on the retina during head movements (Meyer et al., 2020). Therefore, the camera was also stabilised by mounting it on a gimbal. As a result, the horizon bisected the camera’s visual field.
A mouse cam movie frame contained a circular field of view of 180°, corresponding to 437 pixels along the diameter. To minimise the influence of potential chromatic and spatial aberrations introduced by the lenses, we focused on image cut-outs (crops; 30° × 26°, equivalent to 72 × 64 pixels in size) from the upper and lower visual field, centred at [28°, 56°] and [−42°, −31°], respectively, relative to the horizon (for details, see Qiu et al., 2021). Our stimulus movie consisted of 113 movie clips, each 150 frames (= 5 s) long. 108 clips were randomly reordered for each recording and split into two 54-clip-long training sequences. The remaining 5 clips formed a fixed test sequence that was presented before, in between, and after the training sequences (Figure 1b). To keep intensity changes at clip transitions small, we only used clips with mean intensities between 0.04 and 0.22 (for intensities in [0, 1]). For display during the experiments, intensities were then mapped to the range covered by the stimulator, i.e., [0, 255].
CNN model of the retina
We trained a CNN model to predict responses of RGCs to a dichromatic natural movie. The CNN model consisted of two modules, a convolutional core that was shared between all neurons, and a readout that was specific for each neuron (Klindt et al., 2017).
The core module was modelled as a two-layer CNN with 16 feature channels in each layer. Both layers consisted of space-time separable 3D convolutional kernels followed by a batch normalisation layer and an ELU (exponential linear unit) nonlinearity. In the first layer, sixteen 2 × 11 × 11 × 21 (c = #input channels (green and UV) × h = height × w = width × t = #frames) kernels were applied as valid convolution; in the second layer, sixteen 16 × 5 × 5 × 11 kernels were applied with zero padding along the spatial dimensions. We parameterised the temporal kernels as Fourier series and added one time-stretching parameter per recording to account for inter-experimental variability affecting the speed of retinal processing. More precisely, every temporal kernel was represented by its first $k$ sine and cosine components, with trainable weights and phases, on an evenly spaced temporal grid, with $k$ chosen separately for the first and the second layer. For example, the first-layer temporal kernels are 21 time steps long; to stay well under the Nyquist limit, they are parameterised with fewer sine and cosine components than time steps.
For each of those sines and cosines, a weight $w_i$ is learned to represent the shape of the temporal response kernel (shared among cells within a recording). Per scan $s$, the time grid $t$ (21 steps from 0 to 1) is stretched by a factor $\alpha_s$ to account for different response speeds. To avoid adding additional cycles (e.g. for stretch factors $\alpha_s > 1$), the kernel is masked by an exponential envelope. Thus, the temporal kernel parameterisation

$$k_s(t) = e^{-\gamma t}\,\sum_{i=1}^{k}\left[w_i^{\sin}\sin\!\left(2\pi i\,\alpha_s t\right) + w_i^{\cos}\cos\!\left(2\pi i\,\alpha_s t\right)\right]$$

allows the model to learn a shared temporal filter that is made faster or slower for each specific scan (Zhao et al., 2020).
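A minimal numpy sketch of this parameterisation; the envelope decay rate and the toy weights are assumptions:

```python
import numpy as np

def temporal_kernel(w_sin, w_cos, stretch, n_steps=21, decay=4.0):
    """Fourier-parameterised temporal kernel with per-recording stretch.

    w_sin, w_cos: trainable weights of the first k sine/cosine components
    (shared among cells within a recording); stretch: per-scan factor
    alpha_s; decay: rate of the exponential envelope (assumed value)
    masking extra cycles introduced by stretch factors > 1.
    """
    base = np.linspace(0, 1, n_steps)
    t = base * stretch                               # stretched time grid
    k = len(w_sin)
    sines = np.stack([np.sin(2 * np.pi * (i + 1) * t) for i in range(k)])
    cosines = np.stack([np.cos(2 * np.pi * (i + 1) * t) for i in range(k)])
    envelope = np.exp(-decay * base)                 # exponential mask
    return envelope * (w_sin @ sines + w_cos @ cosines)

kernel = temporal_kernel(np.array([1.0, 0.3, 0.1]), np.zeros(3), stretch=1.2)
```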
In the readout, we modelled each cell’s spatial receptive field (RF) as a 2D isotropic Gaussian, parameterised by a centre position $\mu_n$ and a width $\sigma_n$ per neuron $n$. We then modelled the neural response as an affine function of the core feature maps weighted by the spatial RF, followed by a softplus nonlinearity.
For the linearised version of the model, the architecture was exactly the same, except that there was no ELU nonlinearity after the two convolutional layers. The resulting CNN was therefore equivalent to an LN (linear-nonlinear) model.
Model training and evaluation
We trained our network by minimising the Poisson loss

$$\mathcal{L}_{\text{Poisson}} = \sum_{n=1}^{N}\left(\hat{r}_n - r_n \log \hat{r}_n\right),$$

where $N$ is the number of neurons, $r_n$ is the measured and $\hat{r}_n$ the predicted firing rate of neuron $n$ for an input of duration $t = 50$ frames. We followed the training schedule of Lurz et al., 2021. Specifically, we used early stopping (Prechelt, 1998) on the correlation between predicted and measured neuronal responses on the validation set, which consisted of 15 of the 108 movie clips. If the correlation failed to increase during any five consecutive passes through the entire training set (epochs), we stopped the training and restored the model to the best performing state over the course of training. We went through four cycles of early stopping, restoring the model to the best performing state, and continuing training, each time reducing the initial learning rate of 0.01 by a learning rate decay factor of 0.3. Network parameters were iteratively optimised via stochastic gradient descent (SGD) using the Adam optimiser (Kingma and Ba, 2015) with a batch size of 32 and a chunk size (number of frames for each element in the batch) of 50. For all analyses and MEI generation, we used an ensemble of models as described in Franke et al., 2022. Briefly, we trained five instances of the same model initialised with different random seeds. Inputs to the ensemble model were passed to each member, and the final ensemble prediction was obtained by averaging the outputs of the five members. For ease of notation, we thus redefine $\hat{r}_n$ to be the ensemble model prediction.
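A minimal PyTorch sketch of this objective (constant terms of the Poisson log-likelihood omitted); the epsilon for numerical stability and the toy tensors are assumptions:

```python
import torch

def poisson_loss(pred: torch.Tensor, target: torch.Tensor,
                 eps: float = 1e-8) -> torch.Tensor:
    """Poisson negative log-likelihood, summed over all neurons and time bins.

    pred, target: (batch, time, neurons); pred must be positive,
    e.g. the output of a softplus nonlinearity.
    """
    return (pred - target * torch.log(pred + eps)).sum()

# Toy usage: random positive rates and Poisson-sampled "observations"
pred = (torch.rand(32, 50, 100) + 0.1).requires_grad_()
target = torch.poisson(pred.detach())
poisson_loss(pred, target).backward()  # gradients flow back into `pred`
```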
After training, we evaluated model performance for each modelled neuron as the correlation to the mean, i.e., the correlation between the predicted response $\hat{r}_n$ and the measured response to the held-out test sequence, the latter averaged across three repetitions: $\rho(\hat{r}_n, \bar{r}_n)$ with $\bar{r}_n = \frac{1}{3}\sum_{i=1}^{3} r_n^{(i)}$. Unlike the single-trial correlation, which is always limited to values well below 1 by inherent neuronal noise, a perfect model can in theory achieve a correlation to the mean of 1, in the limit of infinitely many repetitions when the sample average $\bar{r}_n$ is a perfect estimate of the true underlying response. The observed correlation to the mean can thus be interpreted as an estimate of the fraction of the maximally achievable correlation achieved by our model. For deciding which cells to exclude from analysis, we used the average single-trial correlation, since this measure reflects both model performance and the reliability of the neuronal response of neuron $n$ to the movie stimulus (see also Methods section on Inclusion criteria).
Synthesising MEIs
We synthesised MEIs for RGCs as described previously (Walker et al., 2019). Formally, for each model neuron $n$, we wanted to find

$$x^*_n = \arg\max_x \; \frac{1}{21}\sum_{t=30}^{50} \hat{r}_n(x)_t,$$

i.e., the input $x$ for which the model neuron’s response, averaged across frames 30–50, attains a maximum, subject to norm and range constraints (see below). To this end, we randomly initialised an input of duration $t = 50$ frames with Gaussian white noise, and then iteratively updated it according to the gradient of the model neuron’s response:

$$x \leftarrow x + \eta\,\nabla_x \hat{r}_n(x),$$

where $\eta$ was the learning rate. The optimisation was performed using SGD and was subject to a norm and a range constraint. The norm constraint was applied jointly across both colour channels and ensured that the L2 norm of each MEI did not exceed a fixed budget of $b = 30$. The norm-constrained MEI was calculated at each iteration as

$$x \leftarrow x \cdot \frac{\min\left(\lVert x \rVert_2,\, b\right)}{\lVert x \rVert_2}.$$

The range constraint was defined and applied for each colour channel separately and ensured that the range of the MEI values stayed within the range covered by the training movie. This was achieved by clipping values of the MEI exceeding the range covered by the training movie to the minimum or maximum value. Optimisation was run for at least 100 iterations, and then stopped when the number of iterations reached 1000 or when it had converged (whichever occurred first). Convergence was defined as 10 consecutive iterations with a change in model neuron activation of less than 0.001; model neuron activations ranged from ≈1 to ≈10. We denote the resulting MEI for neuron $n$ as $x^*_n$.
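A minimal gradient-ascent sketch of this optimisation loop; the toy model neuron, learning rate, and the simplified joint range clipping are assumptions (the actual range constraint is applied per colour channel):

```python
import torch

def synthesize_mei(model_neuron, shape=(2, 50, 18, 16), lr=1.0,
                   norm_budget=30.0, bounds=(-1.0, 1.0), n_iter=1000):
    """Gradient ascent on a differentiable scalar response function."""
    x = torch.randn(shape, requires_grad=True)     # Gaussian white noise init
    opt = torch.optim.SGD([x], lr=lr)
    for _ in range(n_iter):
        opt.zero_grad()
        (-model_neuron(x)).backward()              # ascend the response
        opt.step()
        with torch.no_grad():
            # joint L2 norm constraint across both colour channels
            x.mul_(torch.clamp(norm_budget / x.norm(), max=1.0))
            # range constraint (simplified: one bound for both channels)
            x.clamp_(*bounds)
    return x.detach()

# Toy "neuron": energy of the response window, frames 30-50
mei = synthesize_mei(lambda x: x[:, 30:50].pow(2).mean())
```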
Analysing MEIs
We analysed MEIs to quantify their spatial, temporal, and chromatic properties.
Spatial and temporal components of MEIs
For each colour channel $c$, we decomposed the spatiotemporal MEIs into a spatial component and a temporal component by singular value decomposition:

$$X_c = U \Sigma V^{\top},$$

where $X_c \in \mathbb{R}^{50 \times 288}$ for $c \in \{\text{green}, \text{UV}\}$ is the MEI of neuron $n$ in a given colour channel with its spatial dimensions (18 × 16 = 288) flattened out. As a result, any spatiotemporal dependencies are removed and we analyse spatial and temporal properties separately. The following procedures were carried out in the same manner for the green and the UV component of the MEI, and we drop the colour channel index for ease of notation. The temporal component is then defined as the first left singular vector, $u_1$, and the spatial component is defined as the first right singular vector, $v_1$, reshaped to the original dimensions 18 × 16.
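A minimal numpy sketch of this decomposition (sign and scale conventions are a choice):

```python
import numpy as np

def decompose_mei(mei_channel: np.ndarray):
    """Rank-1 space-time decomposition of one colour channel of an MEI.

    mei_channel: (50, 18, 16) array, frames x height x width.
    Returns the first temporal (left) and spatial (right) singular vectors.
    """
    T, H, W = mei_channel.shape
    X = mei_channel.reshape(T, H * W)              # flatten spatial dims
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    temporal = U[:, 0] * S[0]                      # first left singular vector
    spatial = Vt[0].reshape(H, W)                  # first right singular vector
    return temporal, spatial
```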
Concentric anisotropic 2D DoG fit
We modelled the spatial component as a concentric anisotropic DoG using the nonlinear least-squares solver scipy.optimize.least_squares with a soft-L1 loss function (Gupta et al., 2022). The DoGs were parameterised by a location $(\mu_x, \mu_y)$ shared between centre and surround, amplitudes $A_c$ and $A_s$, variances $\sigma_x^2$ and $\sigma_y^2$, and rotation angles $\theta_c$ and $\theta_s$, separately for centre and surround:

$$f(x, y) = A_c\, g_c(x, y) - A_s\, g_s(x, y),$$

with

$$g(x, y) = \exp\!\left(-\left[a\,(x - \mu_x)^2 + 2b\,(x - \mu_x)(y - \mu_y) + c\,(y - \mu_y)^2\right]\right)$$

and

$$a = \frac{\cos^2\theta}{2\sigma_x^2} + \frac{\sin^2\theta}{2\sigma_y^2}, \qquad b = \frac{\sin 2\theta}{4}\left(\frac{1}{\sigma_y^2} - \frac{1}{\sigma_x^2}\right), \qquad c = \frac{\sin^2\theta}{2\sigma_x^2} + \frac{\cos^2\theta}{2\sigma_y^2},$$

evaluated separately with the centre and the surround parameters. We initialised the fit in the following way: Since we set the model readout’s location parameters to (0, 0) for all model neurons when generating their MEIs, we also expected the MEIs to be centred at (0, 0). Hence, we determined the locations of the minimum and the maximum value of the MEI; whichever was closer to the centre (0, 0) provided the initial values for the location parameters $(\mu_x, \mu_y)$. Starting from there, we then first fit a single Gaussian to the MEI, and took the resulting parameters as initial parameters for the DoG fit. This was a constrained optimisation problem, with lower and upper bounds on all parameters; in particular, the location parameters were bounded such that they would not exceed the canvas of the MEI, and the variances such that they would be strictly positive.
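A minimal scipy sketch of such a fit; the initial values and bounds below are illustrative placeholders for the procedure described above:

```python
import numpy as np
from scipy.optimize import least_squares

def dog(p, x, y):
    """Concentric anisotropic DoG: shared location, separate amplitude,
    widths and rotation angle for centre (c) and surround (s)."""
    mx, my, Ac, sxc, syc, thc, As, sxs, sys_, ths = p
    def gauss(A, sx, sy, th):
        a = np.cos(th)**2 / (2*sx**2) + np.sin(th)**2 / (2*sy**2)
        b = np.sin(2*th) / 4 * (1/sy**2 - 1/sx**2)
        c = np.sin(th)**2 / (2*sx**2) + np.cos(th)**2 / (2*sy**2)
        return A * np.exp(-(a*(x-mx)**2 + 2*b*(x-mx)*(y-my) + c*(y-my)**2))
    return gauss(Ac, sxc, syc, thc) - gauss(As, sxs, sys_, ths)

def fit_dog(spatial: np.ndarray):
    """spatial: (18, 16) MEI spatial component."""
    H, W = spatial.shape
    yy, xx = np.mgrid[0:H, 0:W]
    xx, yy = xx - W / 2, yy - H / 2               # MEI centred at (0, 0)
    a0 = max(float(np.abs(spatial).max()), 1e-3)  # illustrative init
    p0 = [0, 0, a0, 2, 2, 0, a0 / 2, 4, 4, 0]
    lb = [-W/2, -H/2, 0, 1e-3, 1e-3, -np.pi, 0, 1e-3, 1e-3, -np.pi]
    ub = [W/2, H/2, np.inf, W, H, np.pi, np.inf, W, H, np.pi]
    res = lambda p: (dog(p, xx, yy) - spatial).ravel()
    return least_squares(res, p0, loss='soft_l1', bounds=(lb, ub))
```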
MEI properties
Centre size
We defined the diameters of the MEI centre in the horizontal and the vertical orientation from the standard deviations of the fitted centre Gaussian, and calculated the centre size as the mean of the two diameters. We then estimated a contour outlining the MEI centre as the line defined by all points at which the 2D centre Gaussian attains a fixed fraction of its peak value. The centre mask was then defined as a binary matrix with all pixels within the convex hull of this contour set to 1 and all other pixels set to 0. This mask is used for calculating centre chromatic contrast (see below).
Temporal frequency
To estimate the temporal frequency of the MEIs, we computed the power spectrum of the temporal components using a fast Fourier transform, after attenuating high-frequency noise with a fifth-order low-pass Butterworth filter (cutoff frequency 10 Hz). We then estimated the mean frequency of the temporal component as the average of the frequency components, each weighted by its relative power.
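A minimal scipy sketch of this estimate; zero-phase filtering via sosfiltfilt is an implementation choice:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def mean_temporal_frequency(temporal: np.ndarray, fs=30.0, cutoff=10.0):
    """Power-weighted mean frequency of an MEI temporal component.

    temporal: 1D array, e.g. 50 frames at fs = 30 Hz.
    """
    sos = butter(5, cutoff, btype='low', fs=fs, output='sos')
    smoothed = sosfiltfilt(sos, temporal)          # attenuate high-freq noise
    power = np.abs(np.fft.rfft(smoothed)) ** 2
    freqs = np.fft.rfftfreq(len(smoothed), d=1 / fs)
    return (freqs * power).sum() / power.sum()     # power-weighted average
```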
Contrast
The contrast of the MEIs in the two channels, $C_c$ for $c \in \{\text{green}, \text{UV}\}$, was defined as the difference between the mean values within the centre mask at the two last peaks of the temporal component of the MEI in the UV channel, at time points $t_1$ and $t_2$:

$$C_c = \left\langle x_c(t_2) \odot M \right\rangle - \left\langle x_c(t_1) \odot M \right\rangle,$$

where ⊙ denotes the element-wise multiplication of the MEI frame and the binary centre mask $M$, and $\langle\cdot\rangle$ the mean over pixels within the mask (see Figure 3f). The peaks were found with the function scipy.signal.find_peaks, and the peaks found for the UV channel were used to calculate contrast in both the green and the UV channel.
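A minimal sketch of this contrast measure; peak finding on the absolute temporal trace (to capture troughs as well) and the requirement of at least two peaks are assumptions:

```python
import numpy as np
from scipy.signal import find_peaks

def centre_contrast(mei, temporal_uv, mask):
    """Chromatic centre contrast of an MEI.

    mei: (2, 50, 18, 16) array (green, UV); temporal_uv: (50,) UV temporal
    component; mask: (18, 16) binary centre mask.
    """
    peaks, _ = find_peaks(np.abs(temporal_uv))
    t1, t2 = peaks[-2], peaks[-1]                  # two last peaks
    w = mask / mask.sum()                          # mask-weighted mean
    return np.array([(mei[c, t2] * w).sum() - (mei[c, t1] * w).sum()
                     for c in range(2)])           # (C_green, C_UV)
```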
Validating MEIs experimentally
Generating MEI stimuli
To test experimentally whether the model correctly predicts which stimuli maximally excite RGCs of different groups, we performed a new set of experiments (numbers indicated in red in Figure 1—figure supplement 1) in which we complemented our stimulus set with MEI stimuli. For the MEI stimuli, we selected 11 RGCs, chosen to span the response space and to represent both well-described and poorly understood RGC groups, for which we generated MEIs at different positions on a 5 × 5 grid (spanning 110 μm in the vertical and horizontal direction). We decomposed the MEIs as described above, and reconstructed them as rank-1 tensors by taking the outer product of the temporal and spatial components:

$$\hat{x}_c = u_1 \otimes v_1, \quad \text{i.e.} \quad \hat{x}_c(t, i, j) = u_1(t)\, v_1(i, j)$$

for each colour channel $c$.
The MEI stimuli, lasting 50 frames (1.66 s), were padded with 10 frames (0.34 s) of inter-stimulus grey and were randomly interleaved. With 11 stimuli, presented at 25 positions and lasting 2 s each, the total stimulus duration was 11 × 25 × 2 s = 550 s. Since the model operated on a z-scored (0 mean, 1 SD) version of the movie, MEIs as predicted by the model lived in the same space and had to be transformed back to the stimulator range ([0, 255]) before being used as stimuli in an experiment, by scaling with the movie’s SD and adding the movie’s mean. The MEIs’ green channel was then displayed with the green LED, and the UV channel was displayed with the UV LED. For experiments at Northwestern University, an additional transform was necessary to achieve the same levels of photoreceptor activation (photoisomerisation rates) for M- and S-cones with different LEDs. To ensure proper chromatic scaling between the different experimental apparatuses with different spectral profiles, we described the relative activation of M- and S-cones by the green and UV LEDs of the stimulation setup used in the two-photon imaging experiments (setup A) by a 2 × 2 matrix $A$, and the relative activation of M- and S-cones by the stimulation setup used in the patch-clamp experiments (setup B) by a matrix $B$, where the diagonal entries describe the activation of M-cones by the green LED and of S-cones by the UV LED, and the off-diagonal entries describe the cross-activation (i.e. of M-cones by the UV LED and of S-cones by the green LED). The activation of M- and S-cones by a stimulus $x$ displayed on a given stimulation setup was approximated as the product of that setup’s matrix with $x$ (Christenson et al., 2022). Hence, a stimulus displayed on setup B, defined as $x_B = B^{-1} A\, x_A$, will achieve the same photoreceptor activation as the stimulus $x_A$ displayed on setup A. Since this solution could exceed the valid range of the stimulator ([0, 255]), we added an offset and multiplied with a scalar factor to ensure all stimuli were within the valid range.
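A minimal numpy sketch of this matching step; the matrix entries below are hypothetical, not the calibrated values:

```python
import numpy as np

# Rows: M-cone, S-cone; columns: green LED, UV LED (entries hypothetical)
A = np.array([[1.00, 0.10],
              [0.05, 1.00]])   # imaging setup (A)
B = np.array([[0.80, 0.20],
              [0.10, 0.90]])   # patch-clamp setup (B)

def match_stimulus(x_a: np.ndarray) -> np.ndarray:
    """Return x_b with B @ x_b == A @ x_a (equal cone activations),
    then rescale with an offset and gain into the stimulator range [0, 255]."""
    x_b = np.linalg.solve(B, A @ x_a)
    x_b = x_b - x_b.min()                          # offset
    return x_b / x_b.max() * 255                   # scalar gain

print(match_stimulus(np.array([120.0, 200.0])))
```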
Analysing RGC responses to MEI stimuli
We wanted to evaluate the responses of RGCs to the MEI stimuli in a spatially resolved fashion, i.e., weighting responses to MEIs displayed at different locations in proportion to the strength of the RGC’s RF at that location. To be able to meaningfully compare MEI responses between RGCs and across groups, for each RGC we first centred and scaled the responses to zero mean and a standard deviation of 1. Then, for each RGC $n$, we computed a spatial average of its responses, weighting the response at each spatial location $p$ in proportion to the Gaussian density $w_p = \mathcal{N}(p;\, \mu_n, \sigma_n^2)$, where the parameters of the Gaussian were the model’s estimated readout parameters for neuron $n$ (Figure 4b, c, and d, left):

$$r_n = \frac{\sum_p w_p\, r_{n,p}}{\sum_p w_p},$$

where $r_{n,p}$ is the 60 frames (2 s) long response of neuron $n$ to an MEI at position $p$, resampled from the recording frame rate of 7.81 Hz to 30 Hz. We then averaged across time in the optimisation time window, i.e., frames 30–50, to get a scalar response for each MEI stimulus (Figure 4d).
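A minimal sketch of this weighted average; array shapes and parameter names follow the description above but are otherwise illustrative:

```python
import numpy as np
from scipy.stats import multivariate_normal

def weighted_mei_response(responses, positions, mu, sigma):
    """Spatially weighted scalar MEI response for one neuron.

    responses: (25, 60) z-scored responses at the 25 grid positions;
    positions: (25, 2) grid coordinates; mu, sigma: the model's readout
    parameters (RF centre and isotropic SD).
    """
    w = multivariate_normal(mean=mu, cov=sigma**2 * np.eye(2)).pdf(positions)
    w = w / w.sum()
    weighted = (w[:, None] * responses).sum(axis=0)   # (60,) time course
    return weighted[30:50].mean()                     # optimisation window
```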
Selectivity index
To quantify the selectivity of the response of an RGC to an MEI $s$, we defined a selectivity index (SI) as follows. First, we standardised the responses across all MEIs by subtracting the mean and dividing by the standard deviation. The selectivity index of RGC group $G_g$ for MEI $s$ was then defined as

$$SI_{g,s} = \frac{1}{|G_g|}\sum_{n \in G_g}\sum_{s'}\left(\delta_{ss'} - \frac{1 - \delta_{ss'}}{10}\right) r_{n,s'},$$

where $\delta_{ss'}$ is the Kronecker delta. In words, the SI is the difference (in units of SD response) between the response to the MEI of interest, $r_{n,s}$, and the mean response to all other (10) MEIs, averaged across all cells belonging to the group of interest $G_g$.
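A minimal numpy sketch of this index:

```python
import numpy as np

def selectivity_index(responses: np.ndarray, s: int) -> float:
    """SI of one RGC group for the MEI with index s.

    responses: (n_cells, 11) scalar responses of the group's cells to the
    11 MEIs, standardised per cell across MEIs (zero mean, unit SD).
    """
    others = np.delete(responses, s, axis=1).mean(axis=1)  # mean over other 10
    return float((responses[:, s] - others).mean())        # average over cells
```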
Characterising nonlinear processing of chromatic contrast
We wanted to analyse the tuning of G28/tSbC RGCs to chromatic contrast, and to this end we mapped the model response and its gradient across a range of chromatic contrasts (Figure 6). Specifically, the MEIs have 18 × 16 pixels and 50 frames, i.e. $d$ = 14,400 dimensions for each colour channel. Now let $x^* \in \mathbb{R}^{2d}$ be the cell’s MEI estimated using the CNN model, with the first $d$ = 14,400 dimensions defining the green pixels and the remaining $d$ dimensions defining the UV pixels. Then, for each cell, we first consider a two-dimensional subspace spanned by two basis vectors

$$e_1 = \gamma_1\left(x^*_{1:d},\; \mathbf{0}\right), \qquad e_2 = \gamma_2\left(\mathbf{0},\; x^*_{d+1:2d}\right).$$

In words, the first basis vector consists of the green component of the MEI in the green channel, multiplied by $\gamma_1$, and of 0 s in the UV channel, and the second basis vector consists of the UV component in the UV channel, multiplied by $\gamma_2$, and of 0 s in the green channel. $\gamma_1$ and $\gamma_2$ are two scaling factors chosen to equalise contrast (as measured by the L2 norm) across colour channels while preserving the contrast of the stimulus as a whole. In the subspace spanned by these basis vectors, the point (1, 1) represents a contrast-scaled version of the original MEI. We then sampled 11 points along each dimension, equally spaced and symmetric around zero, which resulted in stimuli that are identical in terms of their spatial and temporal properties and only differ in their contrast. We then evaluated the model neuron response at these points in the subspace (Figure 6d). We also evaluated the gradient of the model neuron response at these points and plotted the direction of the gradient projected into the subspace spanned by $e_1$ and $e_2$ (Figure 6b and c).
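A minimal numpy sketch of the subspace sampling; the symmetric sampling range `lim` is an assumption:

```python
import numpy as np

def contrast_grid(mei_green, mei_uv, n=11, lim=1.0):
    """Sample an n x n grid in the 2D chromatic-contrast subspace of an MEI.

    Returns an (n, n, 2, T, H, W) array of stimuli sharing the MEI's
    spatiotemporal structure and differing only in per-channel contrast.
    """
    e_green = mei_green / np.linalg.norm(mei_green)   # contrast-equalised
    e_uv = mei_uv / np.linalg.norm(mei_uv)            # basis components
    coords = np.linspace(-lim, lim, n)
    grid = np.empty((n, n, 2) + mei_green.shape)
    for i, c_g in enumerate(coords):
        for j, c_uv in enumerate(coords):
            grid[i, j, 0] = c_g * e_green             # green channel
            grid[i, j, 1] = c_uv * e_uv               # UV channel
    return grid
```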
Detection performance analysis
To test the performance of individual RGCs of different groups in detecting the target class of inter-clip transitions (ground-to-sky) among all other classes of inter-clip transitions, we performed an ROC analysis (Fawcett, 2006). For each RGC, we calculated its response to an inter-clip transition occurring at time $t_0$ as the baseline-subtracted average response within 1 s following the transition, i.e., $\frac{1}{T}\sum_{\tau=1}^{T} r(t_0 + \tau) - r(t_0)$, with $T = 30$ frames at 30 Hz. For all $N = 40$ equally spaced thresholds within the response range of an RGC, we then calculated the true-positive rate (TPR) and false-positive rate (FPR) of a hypothetical classifier that classifies all transitions eliciting an above-threshold response as positive and all other transitions as negative. Plotting the TPR as a function of the FPR yields an ROC curve; the area under this curve (AUC) is equivalent to the probability that the RGC responds more strongly to a randomly chosen inter-clip transition of the target class than to a randomly chosen transition of a different class. The AUC is thus a measure of an RGC’s performance in this detection task.
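A minimal numpy sketch of this ROC analysis (trapezoidal AUC over the thresholded curve):

```python
import numpy as np

def roc_auc(responses: np.ndarray, is_target: np.ndarray, n_thresholds=40):
    """ROC analysis for one RGC.

    responses: per-transition responses; is_target: boolean array marking
    ground-to-sky transitions. Returns (AUC, TPR, FPR).
    """
    thresholds = np.linspace(responses.min(), responses.max(), n_thresholds)
    tpr = np.array([(responses[is_target] > th).mean() for th in thresholds])
    fpr = np.array([(responses[~is_target] > th).mean() for th in thresholds])
    order = np.argsort(fpr)                        # sort for integration
    return np.trapz(tpr[order], fpr[order]), tpr, fpr
```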
Detection task in simulation
We simulated the four types of transitions (sky-sky, sky-ground, ground-ground, ground-sky) in natural scenes at different velocities, as could be triggered by different behaviours such as locomotion or eye movements. With these simulated context-changing stimuli, we predicted model neuron responses in silico and then determined whether G28 could perform the detection task robustly across speeds.
For generating the stimuli, 500 frames were randomly extracted from the same mouse natural movies used for the 2P imaging experiments. For each frame, we simulated visual transitions by moving a 72 × 64 pixel window along a fixed trajectory (Figure 7h, bottom) at four different angular velocities: 50, 150, 250, and 350°/s, corresponding to 4, 12, 20, and 28 pixels per frame, respectively (Figure 7—figure supplement 3a and b). Each edge of the trajectory is 220 pixels long, covering 90.6° of visual angle. Each selected scene frame was sampled eight times (i.e. twice per velocity). To avoid potential biases due to asymmetries in the mouse natural movie, we sampled each frame for each velocity both in clockwise and in counterclockwise direction. The stimuli were then down-sampled to 18 × 16 pixels and shown to the model at a frame rate of 30 Hz. Because the trajectories contained different numbers of moving frames for the four velocities, we ‘padded’ the stimuli at the beginning and the end of each transition stimulus by duplicating the start and end frames, resulting in a total of 60 frames each (see illustration in Figure 7—figure supplement 3).
Statistical analysis
Permutation test
We wanted to test how likely the differences in AUC observed between RGC groups are to occur under the null hypothesis that the underlying distributions they are sampled from are equal. To this end, we performed a permutation test. We generated a null distribution for our test statistic, the absolute difference in mean AUC values $|\Delta AUC|$, by shuffling the RGC group labels of the two groups of interest (e.g. G28 and G24) and calculating the test statistic with shuffled labels 100,000 times. We only included RGC groups with at least $N = 4$ cells in this analysis. We then obtained a p-value for the $|\Delta AUC|$ observed with true labels as the proportion of entries in the null distribution larger than the observed value.
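A minimal numpy sketch of this permutation test:

```python
import numpy as np

def permutation_test(auc_a, auc_b, n_perm=100_000, seed=0):
    """p-value for |mean(auc_a) - mean(auc_b)| under label shuffling."""
    rng = np.random.default_rng(seed)
    observed = abs(np.mean(auc_a) - np.mean(auc_b))
    pooled = np.concatenate([auc_a, auc_b])
    n_a = len(auc_a)
    null = np.empty(n_perm)
    for i in range(n_perm):
        rng.shuffle(pooled)                        # shuffle group labels
        null[i] = abs(pooled[:n_a].mean() - pooled[n_a:].mean())
    return float((null > observed).mean())
```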
Bootstrapped confidence intervals
We bootstrapped confidence intervals for $\Delta AUC$ and $SI$ (Figure 7 and Figure 7—figure supplement 2). For $\Delta AUC$, we generated a bootstrapped distribution by sampling 100 times with replacement from the AUC values of the two groups being compared and calculating $\Delta AUC$ for each sample. We then estimated the 95% confidence interval for $\Delta AUC$ as the interval defined by the 2.5th and 97.5th percentiles of the bootstrapped distribution.
For $SI$, we generated a bootstrapped distribution by sampling 100 times with replacement from the MEI responses of the RGC group of interest and then calculating the $SI$ for each sample. We then estimated the 95% confidence interval for $SI$ as the interval defined by the 2.5th and 97.5th percentiles of the bootstrapped distribution.
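A minimal numpy sketch for the $\Delta AUC$ case:

```python
import numpy as np

def bootstrap_ci_delta_auc(auc_a, auc_b, n_boot=100, seed=0):
    """95% bootstrap CI for the difference in mean AUC of two groups."""
    rng = np.random.default_rng(seed)
    deltas = [rng.choice(auc_a, size=len(auc_a)).mean()
              - rng.choice(auc_b, size=len(auc_b)).mean()
              for _ in range(n_boot)]
    return np.percentile(deltas, [2.5, 97.5])      # 2.5th/97.5th percentiles
```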
Estimating effect size
The effect size of the difference in AUC observed for two RGC groups $i$ and $j$ (Figure 7 and Figure 7—figure supplement 2) was estimated as Cohen's $d$ (Cohen, 1988; Goulet-Pelletier and Cousineau, 2018):

$$d = \frac{\bar{x}_i - \bar{x}_j}{s},$$

with the pooled standard deviation

$$s = \sqrt{\frac{(n_i - 1)\, s_i^2 + (n_j - 1)\, s_j^2}{n_i + n_j - 2}},$$

where $\bar{x}_k$ and $s_k$ are the sample mean and standard deviation, respectively, of the AUC values observed for the RGCs of group $k$, and $n_k$ is the number of cells in that group.
Estimating linear correlation
Wherever the linear correlation between two paired samples $x$ and $y$ of size $N$ was calculated for evaluating model performance (Figure 2, Figure 1—figure supplement 1, Figure 4—figure supplement 1), we used Pearson's correlation coefficient:

$$\rho_{xy} = \frac{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}}.$$
Data availability
The data and the movie stimulus are available at https://gin.g-node.org/eulerlab/rgc-natstim. Code for training models and reproducing analyses and figures is available at https://github.com/eulerlab/rgc-natstim-model (copy archived at Hoefling et al., 2024).
References
- Spike triggered covariance in strongly correlated Gaussian stimuli. PLOS Computational Biology 9:e1003206. https://doi.org/10.1371/journal.pcbi.1003206
- Visual neuroscience: a retinal ganglion cell to report image focus? Current Biology 27:R139–R141. https://doi.org/10.1016/j.cub.2016.12.022
- Understanding the retinal basis of vision across species. Nature Reviews Neuroscience 21:5–20. https://doi.org/10.1038/s41583-019-0242-1
- Multilayer recurrent network models of primate retinal ganglion cell responses. Proceedings of the 5th International Conference on Learning Representations.
- Community-based benchmarking improves spike rate inference from two-photon calcium imaging data. PLOS Computational Biology 14:e1006157. https://doi.org/10.1371/journal.pcbi.1006157
- Bulk electroporation and population calcium imaging in the adult mammalian retina. Journal of Neurophysiology 105:2601–2609. https://doi.org/10.1152/jn.00722.2010
- Diverse feature visualizations reveal invariances in early layers of deep neural networks. In: Cadena SA, Weis MA, editors. Computer Vision – ECCV 2018. Springer. pp. 225–240. https://doi.org/10.1007/978-3-030-01258-8_14
- Deep convolutional models improve predictions of macaque V1 responses to natural images. PLOS Computational Biology 15:e1006897. https://doi.org/10.1371/journal.pcbi.1006897
- Exploiting colour space geometry for visual stimulus design across animals. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences 377:20210280. https://doi.org/10.1098/rstb.2021.0280
- Studying a light sensor with light: multiphoton imaging in the retina. In: Hartveit E, editor. Multiphoton Microscopy. Humana Press. pp. 225–250. https://doi.org/10.1007/978-1-4939-9702-2_10
- Cognitive control of escape behaviour. Trends in Cognitive Sciences 23:334–348. https://doi.org/10.1016/j.tics.2019.01.012
- An introduction to ROC analysis. Pattern Recognition Letters 27:861–874. https://doi.org/10.1016/j.patrec.2005.10.010
- Spatial properties and functional organization of small bistratified ganglion cells in primate retina. The Journal of Neuroscience 27:13261–13272. https://doi.org/10.1523/JNEUROSCI.3437-07.2007
- Context-dependent selectivity to natural images in the retina. Nature Communications 13:5556. https://doi.org/10.1038/s41467-022-33242-8
- A review of effect sizes and their confidence intervals, part I: the Cohen's d family. The Quantitative Methods for Psychology 14:242–265. https://doi.org/10.20982/tqmp.14.4.p242
- Why do mice have ultra-violet vision? Experimental Eye Research 79:887–892. https://doi.org/10.1016/j.exer.2004.06.031
- rgc-natstim-model, version swh:1:rev:fab17f3c34167effb97f2fbbf534ea5f5cfcb8a8. Software Heritage.
- Segregation of visual response properties in the mouse superior colliculus and their modulation during locomotion. The Journal of Neuroscience 37:8428–8443. https://doi.org/10.1523/JNEUROSCI.3689-16.2017
- Typology and circuitry of suppressed-by-contrast retinal ganglion cells. Frontiers in Cellular Neuroscience 12:269. https://doi.org/10.3389/fncel.2018.00269
- Nonlinear spatial integration underlies the diversity of retinal ganglion cell responses to natural images. The Journal of Neuroscience 41:3479–3498. https://doi.org/10.1523/JNEUROSCI.3075-20.2021
- Retinal encoding of natural scenes. Annual Review of Vision Science 8:171–193. https://doi.org/10.1146/annurev-vision-100820-114239
- Feature detection by retinal ganglion cells. Annual Review of Vision Science 8:135–169. https://doi.org/10.1146/annurev-vision-100419-112009
- Linear and nonlinear chromatic integration in the mouse retina. Nature Communications 12:1900. https://doi.org/10.1038/s41467-021-22042-1
- Adam: a method for stochastic optimization. Proceedings of the 3rd International Conference on Learning Representations 01:1–15. https://doi.org/10.1145/1830483.1830503
- What the frog's eye tells the frog's brain. Proceedings of the IRE 47:1940–1951. https://doi.org/10.1109/JRPROC.1959.287207
- Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. The MIT Press.
- The unsolved mystery of vision. Current Biology 17:R577–R582. https://doi.org/10.1016/j.cub.2007.05.040
- Deep learning models of the retinal response to natural scenes. Neural Information Processing Systems. pp. 1–9. https://doi.org/10.1172/JCI44752.288
- Two distinct types of eye-head coupling in freely moving mice. Current Biology 30:2116–2130. https://doi.org/10.1016/j.cub.2020.04.042
- Approach sensitivity in the retina processed by a multifunctional neural circuit. Nature Neuroscience 12:1308–1316. https://doi.org/10.1038/nn.2389
- Diverse visual features encoded in mouse lateral geniculate nucleus. The Journal of Neuroscience 33:4642–4656. https://doi.org/10.1523/JNEUROSCI.5187-12.2013
- Early stopping - but when? In: Prechelt L, editor. Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science. Springer. pp. 55–69. https://doi.org/10.1007/3-540-49430-8_3
- Efficient coding of natural scenes improves neural system identification. PLOS Computational Biology 19:e1011037. https://doi.org/10.1371/journal.pcbi.1011037
- Circuits for feature selectivity in the inner retina. In: Schwartz GW, editor. The Senses: A Comprehensive Reference. Elsevier. pp. 275–292. https://doi.org/10.1016/B978-0-12-809324-5.24186-2
- Chapter 16: Color processing. In: Schwartz GW, editor. Retinal Computation. Academic Press. pp. 288–317. https://doi.org/10.1016/B978-0-12-819896-4.00017-2
- sa-labs-extension, version swh:1:rev:6ffbbc63bd3b1f558db2dd8bf604ae12983491e9. Software Heritage.
- Computational identification of receptive fields. Annual Review of Neuroscience 36:103–120. https://doi.org/10.1146/annurev-neuro-062012-170253
- Natural image statistics and neural representation. Annual Review of Neuroscience 24:1193–1216. https://doi.org/10.1146/annurev.neuro.24.1.1193
- Overlapping morphological and functional properties between M4 and M5 intrinsically photosensitive retinal ganglion cells. The Journal of Comparative Neurology 528:1028–1040. https://doi.org/10.1002/cne.24806
- A genetic and computational approach to structurally classify neuronal types. Nature Communications 5:3512. https://doi.org/10.1038/ncomms4512
- Unique topographic separation of two spectral classes of cones in the mouse retina. The Journal of Comparative Neurology 325:327–342. https://doi.org/10.1002/cne.903250302
- From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction. Proceedings of the 33rd Conference on Neural Information Processing Systems. pp. 1–11. https://doi.org/10.5555/3454287
- Genetically identified suppressed-by-contrast retinal ganglion cells reliably signal self-generated visual stimuli. The Journal of Neuroscience 35:10815–10820. https://doi.org/10.1523/JNEUROSCI.1521-15.2015
- Layer-specific developmentally precise axon targeting of transient suppressed-by-contrast retinal ganglion cells. The Journal of Neuroscience 42:7213–7221. https://doi.org/10.1523/JNEUROSCI.2332-21.2022
- Differences in nonlinearities determine retinal cell types. Journal of Neurophysiology 130:706–718. https://doi.org/10.1152/jn.00243.2022
- Stimulus- and goal-oriented frameworks for understanding natural vision. Nature Neuroscience 22:15–24. https://doi.org/10.1038/s41593-018-0284-0
- Inception loops discover what excites neurons most using deep predictive models. Nature Neuroscience 22:2060–2065. https://doi.org/10.1038/s41593-019-0517-x
- Rapid innate defensive responses of mice to looming visual stimuli. Current Biology 23:2011–2015. https://doi.org/10.1016/j.cub.2013.08.015
- The temporal structure of the inner retina at a single glance. Scientific Reports 10:4399. https://doi.org/10.1038/s41598-020-60214-z
Article and author information
Funding
Deutsche Forschungsgemeinschaft (CRC 1233 project number 276693517)
- Matthias Bethge
- Philipp Berens
- Katrin Franke
- Thomas Euler
Deutsche Forschungsgemeinschaft (Heisenberg Professorship BE5601/8-1)
- Philipp Berens
Deutsche Forschungsgemeinschaft (EXC 2064 390727645)
- Philipp Berens
Deutsche Forschungsgemeinschaft (CRC 1456 project number 432680300)
- Alexander S Ecker
Bundesministerium für Bildung und Forschung (FKZ 01IS18039A)
- Philipp Berens
National Institutes of Health (NEI EY031029)
- Gregory W Schwartz
National Institutes of Health (NEI F30EY031565)
- Zachary Jessen
European Research Council (grant agreement No. 101041669)
- Alexander S Ecker
National Institutes of Health (NEI EY031329)
- Gregory W Schwartz
National Institutes of Health (NIGMS T32GM008152)
- Zachary Jessen
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Jonathan Oesterle and Dominic Gonschorek for feedback on the manuscript, Jan Lause for statistical consulting, and Merle Harrer for general assistance. We also thank all members of the Sinz lab for regular discussions on the project. This work was supported by the German Research Foundation (DFG; CRC 1233 'Robust Vision: Inference Principles and Neural Mechanisms', project number 276693517 to PB, MB, TE, KF; Heisenberg Professorship, BE5601/8-1 to PB; Excellence Cluster EXC 2064, project number 390727645 to PB; CRC 1456 'Mathematics of Experiment' project number 432680300 to ASE), the Federal Ministry of Education and Research (FKZ 01IS18039A to PB), National Institutes of Health (NIH; NEI EY031029, NEI EY031329 to GWS; NEI F30EY031565, NIGMS T32GM008152 to ZJ), and the European Research Council (ERC) under the European Union’s Horizon Europe research and innovation programme (ASE, grant agreement No. 101041669). We acknowledge support from the Open Access Publication Fund of the University of Tübingen.
Ethics
All imaging experiments were conducted at the University of Tübingen; the corresponding animal procedures were approved by the governmental review board (Regierungspräsidium Tübingen, Baden-Württemberg, Konrad-Adenauer-Str. 20, 72072 Tübingen, Germany) and performed according to the laws governing animal experimentation issued by the German Government. All electrophysiological experiments were conducted at Northwestern University; the corresponding animal procedures were performed according to standards provided by Northwestern University Center for Comparative Medicine and approved by the Institutional Animal Care and Use Committee (IACUC).
Copyright
© 2024, Höfling et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.