Study sheds light on relationship between stimuli and delayed rewards, explaining why Pavlov's dogs learned to drool when they heard a bell.
More than a century ago, Pavlov figured out that dogs fed after hearing a bell eventually began to salivate when they heard the ring. A Johns Hopkins University-led research team has now figured out a key aspect of why.
In an article published in the journal Neuron, Johns Hopkins neuroscientist Alfredo Kirkwood settles a mystery of neurology that has stumped scientists for years: Precisely what happens in the brain when we learn, or how Pavlov's dogs managed to associate an action with a delayed reward to create knowledge. For decades scientists had a working theory of how it happened, but Kirkwood's team is now the first to prove it.
"If you're trying to train a dog to sit, the initial neural stimuli, the command, is gone almost instantly—it lasts as long as the word sit," said Kirkwood, a professor with the university's Zanvyl Krieger Mind/Brain Institute. "Before the reward comes, the dog's brain has already turned to other things. The mystery was, 'How does the brain link an action that's over in a fraction of a second with a reward that doesn't come until much later?'"
The working theory, which Kirkwood's team has now validated, is that invisible "eligibility traces" effectively tag the synapses activated by a stimulus so that the association can be cemented as true learning when a reward arrives.
In the case of a dog learning to sit, when the dog gets a treat or other reward, neuromodulators like dopamine flood the dog's brain with "good feelings." Though the brain has long since processed the sit command, the eligibility traces left on the synapses respond to the neuromodulators, prompting a lasting synaptic change.
The team was able to prove the theory by isolating cells in the visual cortex of a mouse. When they stimulated the axon of one cell with an electrical impulse, they sparked a response in another cell. By doing this repeatedly, they mimicked the synaptic response between two cells as they process a stimulus and create an eligibility trace. When the researchers later flooded the cells with neuromodulators, simulating the arrival of a delayed reward, the response between the cells strengthened or weakened, showing the cells had "learned" and were able to do so because of the eligibility trace.
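For readers who think in code, the idea can be illustrated with a toy model. The sketch below is not the researchers' model or data: the function name `run_trial`, the decay rate, the learning rate and the step counts are all hypothetical values chosen only for illustration. It assumes a trace that fades by a fixed fraction each time step, and a neuromodulator pulse that changes the synapse in proportion to whatever trace is left.

```python
def run_trial(stimulus_step, reward_step, modulator_sign=1.0,
              n_steps=50, trace_decay=0.9, learning_rate=0.5):
    """Return the synaptic weight after one trial of a toy model.

    All parameter values are illustrative assumptions, not measurements.
    Pass None for stimulus_step or reward_step to omit that event.
    """
    weight = 1.0  # starting synaptic strength (arbitrary units)
    trace = 0.0   # eligibility trace: a fading tag on the synapse

    for step in range(n_steps):
        trace *= trace_decay  # the tag fades a little every step

        if step == stimulus_step:
            trace += 1.0  # paired cell activity tags the synapse

        if step == reward_step:
            # The neuromodulator arrives. Only a synapse that is still
            # tagged changes; the sign picks strengthening or weakening.
            weight += modulator_sign * learning_rate * trace

    return weight


if __name__ == "__main__":
    print("reward soon after stimulus:", run_trial(0, 10))
    print("reward long after stimulus:", run_trial(0, 30))
    print("reward with no stimulus:   ", run_trial(None, 10))
    print("stimulus with no reward:   ", run_trial(0, None))
    print("weakening modulator:       ", run_trial(0, 10, modulator_sign=-1.0))
```

In this sketch a reward changes the synapse only if a trace is still present, and the change shrinks as the delay grows. That is the qualitative behavior the theory describes, though the real time course and magnitudes would have to come from the published study.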
"This is the basis of how we learn things through reward," Kirkwood said, "a fundamental aspect of learning."
In addition to providing a greater understanding of the mechanics of learning, these findings could enhance teaching methods and lead to treatments for cognitive problems.
The research team included Johns Hopkins postdoctoral fellow Su Hong; Johns Hopkins graduate student Xiaoxiu Tie; former Johns Hopkins research associate Kaiwen He; Marco Huertas and Harel Shouval, neurobiology researchers at the University of Texas at Houston; and Johannes W. Hell, a professor of pharmacology at the University of California, Davis. The research was supported by grants from JHU's Science of Learning Institute and the National Institutes of Health.