The interaction between image content and model sensitivities greatly affects asymptotic performance, most noticeably in the synthesized vs. synthesized comparison for the energy model, while critical scaling varies much less. (A) Performance for each image, averaged across subjects, comparing synthesized images to natural images. Most images show similar performance, with one obvious outlier whose performance never rises above 60%. Data points represent the average across subjects, with 288 trials per data point for half the images and 144 for the other half. Lines represent the posterior predictive means across subjects, with the shaded region giving the 95% HDI. (B) Example model metamers for two extreme images. The top row (nyc) is the image with the best performance (purple line in panel A), while the bottom row (llama) has the worst (red line in panel A). In each row, the leftmost image is the target image, and the next two show model metamers with the lowest and highest tested scaling values for this comparison. Performance on the llama image is poor because much of the image content resembles pink noise; thus, even at larger scaling values, the model metamers are very difficult to distinguish from the target image. The nyc image, by contrast, contains hard edges with precise alignment of phase across scales. Because the energy model discards phase information, this phase structure is lost in the model metamers, which are consequently easy to distinguish from the target image at all tested scaling values. However, this pattern does not hold for the luminance model or for synthesized vs. synthesized comparisons, in which both images exhibit typical performance (see appendix figure 6). A full-resolution version of this figure can be found on the OSF.
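The point about phase can be illustrated with a minimal sketch (not the authors' synthesis procedure): a hard edge requires phases to be aligned across frequencies, so a model that matches only the amplitude (energy) spectrum while discarding phase turns a step edge into a pink-noise-like signal. The 1-D example below is purely illustrative; all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 1-D "image" with one hard edge: a step function.
n = 256
signal = np.zeros(n)
signal[n // 2:] = 1.0

# Fourier transform: separate amplitude (kept) from phase (discarded).
spectrum = np.fft.rfft(signal)
amplitude = np.abs(spectrum)

# Replace the phase with random values while keeping the amplitude fixed.
# (DC and Nyquist phases are zeroed so the inverse transform is real-valued.)
random_phase = rng.uniform(-np.pi, np.pi, amplitude.shape)
random_phase[0] = 0.0
random_phase[-1] = 0.0
scrambled = np.fft.irfft(amplitude * np.exp(1j * random_phase), n=n)

# The amplitude spectrum is unchanged, but the sharp edge is gone:
# an energy-only description cannot distinguish the two signals.
assert np.allclose(np.abs(np.fft.rfft(scrambled)), amplitude)
```

Because a step edge has a roughly 1/f amplitude spectrum, the phase-scrambled version resembles pink noise, which also suggests why an image that already looks like pink noise (llama) changes so little under this kind of scrambling.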