Imagine teaching a dog to play fetch. You throw a ball, and your dog runs after it, picks it up and runs back. You then reward your panting puppy with a treat. But here’s the real trick for your dog: working out which part of this sequence earned the treat. Scientists call this the “credit assignment problem” in the brain. It is a fundamental question: which of our actions are responsible for the positive results we achieve?
Dopamine, a key chemical messenger in the brain, is known to play a crucial role in this process. But exactly how the brain connects specific actions to dopamine release remains unclear.
A study published in Nature by scientists from the Allen Institute, the Zuckerman Mind Brain Behavior Institute at Columbia University, the Champalimaud Center for the Unknown, and the Seattle Children’s Research Institute sheds new light on this mystery. It reveals how dopamine not only signals reward, but also guides animals toward the specific behaviors that lead to those rewards through trial and error.
Intriguingly, the research also shows that the brain’s reward system can quickly and dynamically change the full range of an animal’s movements and behaviors. This highlights a sophisticated learning strategy in which behaviors are not only reinforced, but actively shaped and refined through experience, said Rui Costa, DVM, Ph.D., the senior author of the study.
“When you reinforce a behavior, we often think it’s just that action,” said Costa, president and CEO of the Allen Institute. “But no: you change the whole behavioral structure. And what’s really surprising is how quickly that happened.”
Decoding how dopamine shapes learning
To uncover this information, the team collaborated with engineers and neuroscientists at the Champalimaud Center for the Unknown to develop a new “closed-loop” system that could link specific mouse actions to dopamine release in real time.
Researchers equipped mice with wireless sensors to track their movements in a simple controlled space. They then fed this data into a machine learning algorithm, which classified these actions into distinct groups. The researchers then used optogenetics, a method of controlling neurons with light, to stimulate dopamine neurons after the mice performed predefined “target actions.”
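The closed-loop idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's software: the nearest-centroid classifier, the centroid values, the action names, and the `stimulate` callback are all hypothetical stand-ins for the team's sensors, machine learning algorithm, and optogenetic hardware.

```python
from typing import Callable, Iterable, List, Sequence

# Hypothetical cluster centres for three action classes (illustrative values only).
CENTROIDS = {
    "rear": (0.0, 1.0),
    "turn_left": (-1.0, 0.0),
    "turn_right": (1.0, 0.0),
}


def classify_action(sample: Sequence[float]) -> str:
    """Label a sensor sample with its nearest centroid (a stand-in classifier)."""
    def squared_distance(center: Sequence[float]) -> float:
        return sum((s - c) ** 2 for s, c in zip(sample, center))

    return min(CENTROIDS, key=lambda name: squared_distance(CENTROIDS[name]))


def closed_loop(
    samples: Iterable[Sequence[float]],
    target: str,
    stimulate: Callable[[], None],
) -> List[int]:
    """Call `stimulate` whenever the classified action matches `target`.

    Returns the indices of the samples that triggered stimulation.
    """
    triggered = []
    for index, sample in enumerate(samples):
        if classify_action(sample) == target:
            stimulate()  # in a real rig this would drive the light source
            triggered.append(index)
    return triggered


if __name__ == "__main__":
    pulses = []
    data = [(0.1, 0.9), (-0.8, 0.1), (0.9, -0.2), (0.0, 1.2)]
    hits = closed_loop(data, "rear", lambda: pulses.append("pulse"))
    print(hits, len(pulses))
```

The point of the sketch is the ordering: movement is classified first, and stimulation follows only when the classified action is the predefined target.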
They found that the mice quickly changed their behavior in response to the release of dopamine. At first, they increased not only the frequency of the target action, but also that of similar actions and of actions occurring a few seconds before the dopamine release. Meanwhile, non-target actions declined rapidly. Over time, this refinement became more precise, with the mice increasingly focusing on the exact action that led to the release of dopamine.
The study also examined how mice learn a series of actions, revealing a key process akin to rewinding time to work out what led to a reward. When the dopamine-triggering actions in a sequence were spaced further apart in time, the mice learned more slowly. This suggests that longer gaps between actions make it harder for mice to connect the sequence to the reward.
Essentially, actions just before the reward are quickly understood and improved, while earlier actions are refined more gradually. This “rewinding” process reinforces behavior and helps mice gradually identify the precise actions and sequences that yield reward.
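One simple way to picture this “rewinding” is to give each past action a share of credit that shrinks with the time elapsed before the reward. The sketch below borrows that exponential-decay idea from reinforcement learning. It is an assumption made for illustration, not the analysis used in the study, and the function name, action names, and decay rate are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def assign_credit(
    history: List[Tuple[float, str]],
    reward_time: float,
    decay_per_second: float = 0.5,
) -> Dict[str, float]:
    """Share credit among past actions, weighting recent ones more heavily.

    `history` holds (time_in_seconds, action_name) pairs. An action performed
    `d` seconds before the reward receives decay_per_second ** d credit;
    actions performed after the reward receive none.
    """
    credit: Dict[str, float] = defaultdict(float)
    for time, action in history:
        if time <= reward_time:
            credit[action] += decay_per_second ** (reward_time - time)
    return dict(credit)


if __name__ == "__main__":
    fetch = [(0.0, "run"), (2.0, "pick_up"), (3.0, "return")]
    print(assign_credit(fetch, reward_time=3.0))
```

With these illustrative numbers, the action that coincides with the reward gets full credit, the one a second earlier gets half, and the one three seconds earlier gets an eighth. That mirrors the pattern described above, in which actions close to the reward are picked up quickly while earlier ones are refined more gradually.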
The findings could impact diverse fields such as education and artificial intelligence (AI), said lead author Jonathan Tang, Ph.D., an assistant professor of medicine-pediatrics at the University of Washington and at the Seattle Children’s Research Institute. For example, allowing exploration, mistakes, and incremental refinement in the classroom may be more in line with our brains’ innate learning processes.
In the field of AI, the knowledge gained could lead to more sophisticated and efficient learning systems. By better replicating biological learning processes, we could create AI that is more capable of adapting to new data and situations.
This study offers deeper insight into how our brains learn and adapt through trial and error, whether you’re a scientist or a puppy.
“We take a lot of things for granted about how things work, including credit assignment,” said Tang, who began his research with Costa while at Columbia University. “But it’s when you really start to delve into it that you realize the complexity. That’s why people do science: to understand the truth.”
More information:
Dynamic behaviour restructuring mediates dopamine-dependent credit assignment, Nature (2023). DOI: 10.1038/s41586-023-06941-5. www.nature.com/articles/s41586-023-06941-5
Provided by the Allen Institute for Brain Science
Citation: New study sheds light on how the brain learns to seek reward (December 13, 2023) retrieved December 13, 2023 from