What do adversarial images tell us about human vision?

  1. Marin Dujmović (corresponding author)
  2. Gaurav Malhotra
  3. Jeffrey S Bowers
  1. School of Psychological Science, University of Bristol, United Kingdom
19 figures, 1 table and 1 additional file

Figures

Examples of two types of adversarial images.

(a) fooling adversarial images taken from Nguyen et al., 2015 that do not look like any familiar object. The two images on the left (labelled ‘Electric guitar’ and ‘Robin’) have been generated using …

Average levels of agreement in Experiment 1 (error bars denote 95% confidence intervals).
Example of best-case and worst-case images for the same category (‘penguin’) used in Experiment 2.
Average levels of agreement in Experiment 2 (error bars denote 95% confidence intervals).
Examples of images from Nguyen et al., 2015 used in the four experimental conditions in Experiment 3.

Images are generated by an evolutionary algorithm using either the direct or the indirect encoding, and are designed to fool a network trained on either ImageNet or MNIST.

Agreement (mean percentage of images on which a participant's choices agree with the DCNN) as a function of experimental condition in Experiment 3 (error bars denote 95% confidence intervals).
Average levels of agreement in Experiment 4 (error bars denote 95% confidence intervals).

The inset depicts a single trial in which participants were shown three fooling adversarial images and naturalistic examples from the target category. Their task was to choose the adversarial image …

Results for images that are confidently classified with high network-to-network agreement on Alexnet, Densenet-161, GoogLeNet, MNASNet 1.0, MobileNet v2, Resnet 18, Resnet 50, Shufflenet v2, Squeezenet 1.0, and VGG-16.

(a) Examples of images used in the experiment (for all the stimuli, see Appendix 2—figures 4 and 5); (b) average levels of agreement between participants and DCNNs under the random and competitive

Appendix 1—figure 1
Agreement across adversarial images from Experiment 3b in Zhou and Firestone, 2019.

The red line represents the mean, the blue line represents the median, and the black reference line represents chance agreement. The inset contains a histogram of agreement levels across the 48 …

Appendix 1—figure 2
Participant responses ranked by frequency (Experiment 3b).

Each row contains the adversarial image, the DCNN label for that image, and the top eight participant responses. Shaded cells contain the DCNN choice; when not ranked in the top 8, it is shown at the …

Appendix 1—figure 3
Participant responses ranked by frequency (Experiment 3b).

Continued.

Appendix 1—figure 4
Per-item histograms of response choices from Experiment 3b in Zhou and Firestone, 2019.

Each histogram contains the adversarial stimuli and shows the percentage of responses for each choice (y-axis). The choice labels (x-axis) are ordered the same way as in Appendix 1—figures 2 and 3

Appendix 1—figure 5
Per-item histograms of response choices from Experiment 3b in Zhou and Firestone, 2019.

Continued.

Appendix 1—figure 6
Per-item histograms of response choices from Experiment 3b in Zhou and Firestone, 2019.

Continued.

Appendix 2—figure 1
Experiment 1 stimuli and competitive alternative labels.
Appendix 2—figure 2
An item-wise breakdown of agreement levels in Experiment 2 as a function of experimental condition and category.

Average agreement levels for each category in each condition with 95% CI are presented in (a) with the black line referring to chance agreement. The best case stimuli are presented in (b), these …

Appendix 2—figure 3
An item-wise breakdown of agreement levels for the four conditions in Experiment 3.

Each bar shows the agreement level for a particular image, that is, the percentage of participants that agreed with DCNN classification for that image. Each sub-figure also shows the images that …

Appendix 2—figure 4
Experiment 5 stimuli and competitive alternative labels.
Appendix 2—figure 5
Experiment 5 stimuli and competitive alternative labels.

Continued.

Tables

Table 1
Mean DCNN-participant agreement in the experiments conducted by Zhou and Firestone, 2019
| Exp. | Test type | Mean agreement | Chance |
|------|-----------|----------------|--------|
| 1 | Fooling 2AFC N15 | 74.18% (35.61/48 images) | 50% |
| 2 | Fooling 2AFC N15 | 61.59% (29.56/48 images) | 50% |
| 3a | Fooling 48AFC N15 | 10.12% (4.86/48 images) | 2.08% |
| 3b | Fooling 48AFC N15 | 9.96% (4.78/48 images) | 2.08% |
| 4 | TV-static 8AFC N15 | 28.97% (2.32/8 images) | 12.5% |
| 5 | Digits 9AFC P16 | 16% (1.44/9 images) | 11.11% |
| 6 | Naturalistic 2AFC K18 | 73.49% (7.3/10 images) | 50% |
| 7 | 3D Objects 2AFC A17 | 59.55% (31.56/53 images) | 50% |
* To give the readers a sense of the levels of agreement observed in these experiments, we have also computed the average number of images in each experiment where humans and DCNNs agree as well as the level of agreement expected if participants were responding at chance.

Stimuli sources: N15 - Nguyen et al., 2015; P16 - Papernot et al., 2016; K18 - Karmon et al., 2018; A17 - Athalye et al., 2017.
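
The arithmetic behind the two derived quantities in Table 1 can be sketched as follows. This is an illustrative reconstruction, not the authors' analysis code: it assumes that chance agreement in an n-alternative forced-choice (nAFC) task is 100/n percent, and that the average number of agreed images is the mean agreement percentage multiplied by the number of images. The function names `chance_percent` and `mean_images_agreed` are hypothetical.

```python
def chance_percent(n_alternatives: int) -> float:
    """Agreement (in %) expected from uniform random guessing in an nAFC task."""
    return 100.0 / n_alternatives


def mean_images_agreed(agreement_percent: float, n_images: int) -> float:
    """Average number of images on which participants and the DCNN agree."""
    return agreement_percent / 100.0 * n_images


# A few rows of Table 1: (experiment, mean agreement %, images, alternatives).
rows = [
    ("1", 74.18, 48, 2),
    ("3a", 10.12, 48, 48),
    ("4", 28.97, 8, 8),
    ("5", 16.0, 9, 9),
]

for exp, agreement, n_images, n_alt in rows:
    agreed = mean_images_agreed(agreement, n_images)
    chance = chance_percent(n_alt)
    print(f"Exp. {exp}: {agreed:.2f}/{n_images} images, chance {chance:.2f}%")
```

Under these assumptions, the values for the rows shown match the table after rounding to two decimal places; for example, 74.18% of 48 images is about 35.61 images, and chance in a 48AFC task is about 2.08%.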
