A computationally informed comparison between the strategies of humans and rodents in visual object recognition

Anna Elisabeth Schnell
Maarten Leemans
Kasper Vinken
Hans Op de Beeck

Department of Brain and Cognition & Leuven Brain Institute, KU Leuven, Leuven, Belgium
Department of Neurobiology, Harvard Medical School, Boston, Massachusetts, United States of America

https://doi.org/10.7554/eLife.87719.1

Open access

Figures and data

The design of the animal study, including the stimuli.
Animals started with a standardized shaping procedure, followed by three training protocols, as indicated by the dashed outline. In these protocols, animals received real reward, i.e. reward for touching the target, and correction trials after incorrect answers. After the three training protocols, the animals went through a number of testing protocols. The order of the first six protocols (*) and of the last two testing protocols (**) was counterbalanced between animals. During testing protocols, one third of the trials were old trials and two thirds were new trials. In the new trials, animals received random reward in 80% of the trials, whereas in the old trials they received real reward and, if necessary, correction trials.
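The trial composition of a testing session can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `make_test_session`, the dictionary fields, and the reading of "random reward in 80% of the trials" as a reward delivered with probability 0.8 regardless of the response are ours, not taken from the study's code.

```python
import random

def make_test_session(n_trials, p_random_reward=0.8, seed=0):
    """Build a hypothetical trial list: 1/3 old trials, 2/3 new trials."""
    rng = random.Random(seed)
    n_old = n_trials // 3
    trials = []
    for i in range(n_trials):
        if i < n_old:
            # Old trials: reward depends on the response, with correction trials.
            trials.append({"kind": "old", "reward": "real", "correction": True})
        else:
            # New trials: assumed here to be rewarded at random, independent
            # of the response, on a fraction p_random_reward of the trials.
            rewarded = rng.random() < p_random_reward
            trials.append({"kind": "new",
                           "reward": "random" if rewarded else "none",
                           "correction": False})
    rng.shuffle(trials)  # interleave old and new trials
    return trials

session = make_test_session(90)
n_old = sum(t["kind"] == "old" for t in session)
n_new = sum(t["kind"] == "new" for t in session)
print(n_old, n_new)
```

Decoupling reward from the response on new trials is the usual reason for such a schedule: it avoids training the animals on the test stimuli, while the interleaved old trials keep the learned rule in place.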

Results of the Dimension learning and Transformations training protocols.
Each cell of the matrix indicates the average performance for one stimulus pair, pooled over all animals. Columns correspond to distractors and rows to targets. The colour bar indicates performance in terms of correct responses.
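A target-by-distractor matrix of this kind can be computed by pooling trials over animals before averaging. The sketch below assumes a hypothetical trial format of `(target_index, distractor_index, is_correct)` tuples and returns the proportion of correct responses per cell; neither the format nor the function name comes from the paper.

```python
def pairwise_performance(trials, n_targets, n_distractors):
    """Proportion correct per (target, distractor) pair, pooled over trials.

    Rows index targets and columns index distractors, as in the caption.
    Cells without any trials are returned as None.
    """
    correct = [[0] * n_distractors for _ in range(n_targets)]
    total = [[0] * n_distractors for _ in range(n_targets)]
    for target, distractor, is_correct in trials:
        total[target][distractor] += 1
        correct[target][distractor] += int(is_correct)
    return [
        [correct[r][c] / total[r][c] if total[r][c] else None
         for c in range(n_distractors)]
        for r in range(n_targets)
    ]

# Toy data: trials from all animals concatenated into one list.
toy_trials = [(0, 0, True), (0, 0, False), (1, 0, True), (1, 1, True)]
matrix = pairwise_performance(toy_trials, n_targets=2, n_distractors=2)
print(matrix)
```

Pooling before averaging weights every trial equally, so animals that completed more trials contribute more to each cell than they would if per-animal averages were computed first.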

Pairwise percentage matrices of all nine testing protocols for the rat data.
The colour bar indicates the percentage correct of the responses pooled over all animals. The redder a cell, the higher the average performance. Values at or below chance are shown in the most intense blue.

Correlation of the Classification Score for single target/distractor pairs between single cDNN layers and the rat performance, for all nine test protocols together.
The black and grey horizontal lines on the x-axis indicate the layer blocks (block 1 consisting of conv1, norm1, pool1; block 2 consisting of conv2, norm2, pool2; blocks 3 and 4 corresponding to conv3 and conv4, respectively; block 5 consisting of conv5, pool5; blocks 6, 7 and 8 corresponding to fc6, fc7 and fc8, respectively). The vertical grey dashed line indicates the division between convolutional and fully connected layer blocks. The horizontal dashed line indicates a correlation of 0. The markers indicate the layer type: circles for convolutional layers, triangles for normalization layers, points for pooling layers, and squares for fully connected layers. Asterisks indicate significant correlations according to a permutation test (* p < 0.05, ** p < 0.01, *** p < 0.001).
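A permutation test for a correlation of this kind can be sketched as follows. The caption does not specify the correlation measure, the number of permutations, or whether the test was one- or two-sided, so the Pearson correlation, the two-sided comparison, and the helper names `pearson`, `permutation_test` and `stars` are all assumptions made for illustration.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def permutation_test(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the correlation of x and y."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the pairing between x and y
        if abs(pearson(x, shuffled)) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly 0.
    return observed, (extreme + 1) / (n_perm + 1)

def stars(p):
    """Map a p-value to the asterisk code used in the caption."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""

# Toy example: layer classification scores vs. behavioural performance per pair.
layer_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9, 0.2, 0.6]
behaviour = [0.52, 0.61, 0.58, 0.83, 0.74, 0.88, 0.55, 0.70]
r, p = permutation_test(layer_scores, behaviour)
print(round(r, 3), p, stars(p))
```

Shuffling one variable while keeping the other fixed generates the distribution of correlations expected when stimulus pairs and scores are unrelated; the p-value is the fraction of shuffles that reach the observed magnitude.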

Average performance of humans versus rats.
On the x-axis, the nine test protocols and the performance on all old stimuli are presented in the following order: rotation x (RotX), rotation y (RotY), rotation z (RotZ), size, position (Pos), light location (LL), combination rotation (CR), zero vs. high (ZVH), high vs. zero (HVZ) and All Old. The dashed horizontal line indicates chance level. The error bars indicate the standard error across humans/rats.

Correlation of the Classification Score for single target/distractor pairs between single cDNN layers and the human performance, for all nine test protocols together.
The naming convention on the x-axis corresponds to the layers of the network, as in Figure 4. The black and grey horizontal lines on the x-axis indicate the layer blocks (block 1 consisting of conv1, norm1, pool1; block 2 consisting of conv2, norm2, pool2; blocks 3 and 4 corresponding to conv3 and conv4, respectively; block 5 consisting of conv5, pool5; blocks 6, 7 and 8 corresponding to fc6, fc7 and fc8, respectively). The vertical grey dashed line indicates the division between convolutional and fully connected layer blocks. The horizontal dashed line indicates a correlation of 0. The markers indicate the layer type: circles for convolutional layers, triangles for normalization layers, points for pooling layers, and squares for fully connected layers. Asterisks indicate significant correlations according to a permutation test (* p < 0.05, ** p < 0.01, *** p < 0.001).

The performance of the cDNN after training on our training stimuli, with noise added to its input.
The naming convention on the x-axis corresponds to the layers of the network, as in Figure 4. The performance (y-axis) illustrates that each layer is challenged by at least part of the test protocols. The purple line indicates the training performance and the green line indicates the test performance of the neural network. The x-axis of each subplot indicates the layers: layers 1-13 correspond to convolutional layer 1, normalization layer 1, pool layer 1, convolutional layer 2, normalization layer 2, pool layer 2, convolutional layer 3, convolutional layer 4, convolutional layer 5, pool layer 5, fully connected layer 6, fully connected layer 7 and fully connected layer 8, respectively. The black and grey horizontal lines on the x-axis indicate the layer blocks (block 1 consisting of conv1, norm1, pool1; block 2 consisting of conv2, norm2, pool2; blocks 3 and 4 corresponding to conv3 and conv4, respectively; block 5 consisting of conv5, pool5; blocks 6, 7 and 8 corresponding to fc6, fc7 and fc8, respectively). The vertical grey dashed line indicates the division between convolutional and fully connected layer blocks. The horizontal dashed line indicates chance level. The shaded error bounds correspond to 95% confidence intervals calculated using jackknife standard error estimates, as done previously (Vinken & Op de Beeck, 2021). The markers indicate the layer type: circles for convolutional layers, triangles for normalization layers, points for pooling layers, and squares for fully connected layers.
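A jackknife standard error and the resulting 95% confidence interval can be sketched as below. The caption does not state which units were left out (for example stimulus pairs or trials) or how the interval was built from the standard error, so the leave-one-out scheme over a list of accuracy values, the normal approximation with z = 1.96, and the function name `jackknife_ci` are assumptions for illustration only.

```python
from math import sqrt

def jackknife_ci(values, z=1.96):
    """Mean with jackknife standard error and a normal-approximation 95% CI.

    Each leave-one-out estimate is the mean of all values except one;
    the jackknife SE is derived from the spread of those estimates.
    """
    n = len(values)
    total = sum(values)
    mean = total / n
    leave_one_out = [(total - v) / (n - 1) for v in values]
    loo_mean = sum(leave_one_out) / n
    se = sqrt((n - 1) / n * sum((m - loo_mean) ** 2 for m in leave_one_out))
    return mean, se, (mean - z * se, mean + z * se)

# Toy example: accuracy of one layer on four hypothetical subsets.
mean, se, (low, high) = jackknife_ci([0.6, 0.7, 0.8, 0.9])
print(round(mean, 3), round(se, 4), round(low, 3), round(high, 3))
```

For a simple mean, the jackknife standard error coincides with the ordinary standard error of the mean; the procedure becomes useful when the statistic is more complex, such as the accuracy of a classifier refit on each leave-one-out subset.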
