Experimental paradigm for auditory-visual label learning.
A) Subjects were exposed to four different visual-auditory pairs over three days (6 repetitions of each pair, 3-minute video). Two pairs were always presented in the ‘visual-then-auditory’ order (object to label), and two in the ‘auditory-then-visual’ order (label to object). During the test phase, this canonical order was kept on 80% of trials, including 10% incongruent pairs to test memory of the learned pairs, and was reversed on 20% of trials. On reversed trials, half the pairs were congruent and half were incongruent (each 10% of total trials), thus testing reversibility of the pairings without affording additional learning. B, C) Activation in sensory cortices. Although each trial comprises both auditory and visual stimuli, the two could be separated by their temporal offset. Images show significantly activated regions in the contrasts image > sound (red-yellow) and sound > image (blue-light blue), averaged across all subjects and runs for humans (B) and monkeys (C). D, E) Average finite-impulse-response (FIR) estimates of the deconvolved hemodynamic responses for humans (D) and monkeys (E) within the clusters shown in B and C respectively, plotted separately for visual-audio (VA) and audio-visual (AV) trials. The sign of the y-axis is flipped for monkey responses.
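
The test-phase proportions described in panel A can be made explicit with a short sketch. It assumes that every percentage is expressed relative to the total number of test trials, which the caption states only for the reversed trials. The names `trial_mix` and `trial_counts`, and the trial total used in the usage note, are hypothetical and serve only as illustration.

```python
from fractions import Fraction

# Shares as read from the caption. Assumption: all percentages refer to
# the total number of test trials.
canonical_total = Fraction(80, 100)        # canonical order kept
canonical_incongruent = Fraction(10, 100)  # incongruent pairs, canonical order
reversed_total = Fraction(20, 100)         # order reversed

trial_mix = {
    ("canonical", "congruent"): canonical_total - canonical_incongruent,
    ("canonical", "incongruent"): canonical_incongruent,
    # Reversed trials are split evenly between congruent and incongruent.
    ("reversed", "congruent"): reversed_total / 2,
    ("reversed", "incongruent"): reversed_total / 2,
}


def trial_counts(n_trials):
    """Return the number of trials in each (order, congruency) cell."""
    counts = {cell: share * n_trials for cell, share in trial_mix.items()}
    # Reject totals that do not split into whole trials.
    if any(count.denominator != 1 for count in counts.values()):
        raise ValueError("n_trials must be a multiple of 10")
    return {cell: int(count) for cell, count in counts.items()}
```

Under this reading, a hypothetical block of 100 test trials would contain 70 canonical congruent trials and 10 trials in each of the other three cells. The reversed cells are balanced between congruent and incongruent pairs, consistent with the caption's point that reversed trials afford no additional learning.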