A matter of time

Reward learning depends not on the probability of reward but on when it occurs.

Image of a rat. Image credit: Kapa65 (CC0)

All animals, including humans, can learn when one event signals that another is about to occur, such as when a flash of lightning signals a thunderclap a few seconds later. Such learning is faster when the two events are closer together in time. It is also faster when the events occur less frequently: spacing learning episodes apart is more effective than cramming them together.

Most theories of associative learning assume that it involves the gradual strengthening of a mental connection between representations of the two events each time they are experienced together. This strengthening is often described with the same simple rules used in the neural networks that power modern AI. However, these theories are not particularly well-suited to explaining why learning is so sensitive to the timing and frequency of learned events. This limitation is also why AI finds it easier to learn sequential dependencies in language than to predict events in real time.

Harris and Gallistel sought to determine if they could replicate the decades-old finding that the timing and frequency of events influence learning reciprocally, as they affect how informative the first event is about when the second will occur. Here, informativeness is defined as the time separating each instance of the second event divided by the time between the first and second events. For example, the longer the wait between thunderclaps, and the shorter the wait after each flash of lightning, the more informative the lightning is about the thunder.
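The ratio described above can be sketched in a few lines of Python. This is only an illustration of the definition given here, not the authors' analysis code: the function name, the parameter names `cycle_time` and `trial_time`, and the example durations are all assumptions chosen for the lightning-and-thunder example.

```python
def informativeness(cycle_time: float, trial_time: float) -> float:
    """Return the ratio of two intervals, both in the same time unit.

    cycle_time: average time separating successive instances of the
                second event (e.g. the wait between thunderclaps).
    trial_time: time between the first event and the second event
                (e.g. the wait from lightning to thunder).
    Both names are hypothetical labels for the intervals in the text.
    """
    if cycle_time <= 0 or trial_time <= 0:
        raise ValueError("both intervals must be positive")
    return cycle_time / trial_time


# Illustrative durations in seconds (not data from the study).
frequent_thunder = informativeness(cycle_time=60.0, trial_time=10.0)
rare_thunder = informativeness(cycle_time=300.0, trial_time=10.0)
rare_and_prompt = informativeness(cycle_time=300.0, trial_time=5.0)

# A longer wait between thunderclaps, or a shorter wait after the
# lightning, both make the lightning more informative.
print(frequent_thunder, rare_thunder, rare_and_prompt)
```

Because only the ratio matters, multiplying both intervals by the same factor leaves the informativeness unchanged.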

Harris and Gallistel trained rats to associate food with a light stimulus using various timing intervals between training trials. The results showed that the speed at which rats learned that a light predicts food can be explained by how informative the light is about when food will arrive. Knowing how infrequent the food is (i.e., how long the rat must wait between food deliveries) and how quickly the food follows the light, the researchers could use the ratio of these intervals to predict how long it takes the rats to learn.

Previous research has shown this relationship in pigeons, but Harris and Gallistel are the first to extend it to a different species, thereby establishing a general principle of learning. Moreover, the frequency with which the rats checked for food was directly proportional to how long they had to wait for it, suggesting the rats learned about the temporal relationship between light and food.

The findings of Harris and Gallistel highlight the differences in how real animals and artificial networks achieve learning. This should guide the efforts of both neuroscientists studying real brains and computer scientists aiming to reproduce animal intelligence.