When you do something right, you can't learn anything from your success without a system in the brain for assigning credit to whatever action led to the desired outcome.
Say, for instance, you've forgotten which of your 10 usual passwords logs you into a favorite website. When you finally enter the right one, your brain should have a mechanism for noting which one led to that success. You have such a mechanism, of course, and now a study in non-human primates is the first to directly pinpoint it at work in the dorsolateral prefrontal cortex.
The new findings in the Journal of Neuroscience not only add insight into how the brain works, but also could lead to improvements in the care of patients who suffer traumatic brain injuries that affect the area, said lead author Dr. Wael Asaad, an assistant professor of neurosurgery at the Warren Alpert Medical School of Brown University and a surgeon at Rhode Island Hospital.
Positioned on the brain's surface behind the top of the forehead, the dorsolateral prefrontal cortex (dPFC) is in a vulnerable place. The long-term impairments that can result from the traumatic brain injuries he treats in the operating room are what motivate Asaad's research, he says.
"The frontal lobes are sites where you often find traumatic hemorrhages that lead to all kinds of problems in these people's lives in the future, and many of those problems have to do with poor decision making," said Asaad, who is affiliated with the Brown Institute for Brain Science. "Our hypothesis was that this part of the brain is responsible for trying to figure out why things happen the way they did and linking causes and effects so that you can make better choices in the future."
A simple game
The research team gathered its data more than six years ago at Massachusetts General Hospital. Two rhesus macaques performed a simple experimental game: figure out which image among four presented on a screen was the "correct" one to earn a rewarding sip of juice. The task was an exercise in credit assignment because to consistently get the juice, the subjects needed to recognize which image had led to the reward and would continue to do so in the future. In a control experiment, the researchers rewarded the subjects not for choosing a correct image, but instead for choosing a correct corner of the on-screen grid in which the images appeared.
As the macaques played the simple game thousands of times (the correct image or location would change every so often to produce the need for a new credit assignment), the researchers recorded the electrical spiking activity of hundreds of individual neurons in the animals' dPFC, a region whose possible role in credit assignment neuroscientists have debated.
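For readers who want a concrete picture of what "credit assignment" means in a game like this, the following is a minimal sketch of a simulated learner, not the study's model or analysis. The function name, learning rate and exploration rate are all illustrative assumptions. The learner keeps a running reward estimate for each of four images and, after every trial, updates only the estimate for the image it actually chose.

```python
import random

def run_block(n_trials=200, n_images=4, correct=2, lr=0.3, epsilon=0.1, seed=0):
    """Simulate one block of a four-image choice game (hypothetical parameters)."""
    rng = random.Random(seed)
    values = [0.0] * n_images  # estimated reward credited to each image
    choices = []
    for _ in range(n_trials):
        if rng.random() < epsilon:
            # Occasionally explore by picking any image at random.
            choice = rng.randrange(n_images)
        else:
            # Otherwise pick the image with the highest estimate,
            # breaking ties at random.
            best = max(values)
            choice = rng.choice([i for i, v in enumerate(values) if v == best])
        reward = 1.0 if choice == correct else 0.0
        # Credit assignment: only the chosen image's estimate is updated.
        values[choice] += lr * (reward - values[choice])
        choices.append(choice)
    return values, choices

values, choices = run_block()
print("estimated value per image:", [round(v, 2) for v in values])
print("share of correct picks in last 50 trials:",
      sum(c == 2 for c in choices[-50:]) / 50)
```

In this toy version, the juice ends up credited to the one image that produced it, and the learner settles on that image. Changing which image is "correct" partway through, as the experimenters did, would force the estimates to be relearned.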
If some dPFC neurons have that function, Asaad and his colleagues reasoned, then they should behave in four specific ways during the task.
"If you lay out a list of criteria that the neural activity would need to conform to in order to solve credit assignment, these neurons fulfill all those criteria," Asaad said.
First, for neurons to associate a cause (the correct image) with a desired effect (juice reward and green circle on the screen), their spiking activity should simultaneously represent the image and the successful outcome at the time the positive feedback was delivered. Using software called a decoder, researchers were able to isolate the unique neural spiking patterns associated with each image and with positive or negative feedback, the latter being simply the absence of juice and a red X symbol on the screen. Using this technique, they were able to see that the representation of the chosen image persisted in many neurons through the time when those same neurons also responded to the positive feedback, producing the required simultaneous representation.
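The article does not say which decoding algorithm the team used. As a rough illustration of the general idea, the sketch below trains a nearest-centroid decoder on synthetic spike counts. The tuning table, the noise level and all function names are assumptions made for this example and do not come from the study. Each "trial" is a list of spike counts, one per simulated neuron, and the decoder guesses which image was shown by finding the closest average pattern.

```python
import random

def make_trials(n_per_image, tuning, noise, rng):
    """Generate synthetic trials: (spike counts per neuron, image label)."""
    trials = []
    for image, rates in enumerate(tuning):
        for _ in range(n_per_image):
            # Noisy spike counts around each neuron's mean rate, floored at zero.
            counts = [max(0.0, rng.gauss(r, noise)) for r in rates]
            trials.append((counts, image))
    return trials

def fit_centroids(trials):
    """Average the population activity pattern for each image label."""
    sums, n = {}, {}
    for counts, label in trials:
        if label not in sums:
            sums[label] = [0.0] * len(counts)
            n[label] = 0
        sums[label] = [s + c for s, c in zip(sums[label], counts)]
        n[label] += 1
    return {label: [s / n[label] for s in sums[label]] for label in sums}

def decode(counts, centroids):
    """Return the label whose average pattern is closest to this trial."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(counts, centroids[label]))

def accuracy(trials, centroids):
    """Fraction of trials whose image label is decoded correctly."""
    hits = sum(decode(counts, centroids) == label for counts, label in trials)
    return hits / len(trials)

rng = random.Random(0)
# Hypothetical tuning: four neurons, each firing most for one of four images.
TUNING = [[20, 5, 5, 5], [5, 20, 5, 5], [5, 5, 20, 5], [5, 5, 5, 20]]
train = make_trials(50, TUNING, noise=2.0, rng=rng)
held_out = make_trials(50, TUNING, noise=2.0, rng=rng)
centroids = fit_centroids(train)
print("held-out decoding accuracy:", accuracy(held_out, centroids))
```

The same approach could be applied to activity recorded at the moment of feedback: if a decoder trained on image labels still performs well on spikes from that later time window, the image is still represented when the outcome arrives.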
Another requirement was that neurons represent the correct image consistently over time, specifically when the picture was visible and when the outcome was revealed, even though by that time the picture had disappeared to make way for the green circle or the red X. Indeed, neurons that represented the images (via unique spiking patterns) did so consistently throughout the task, meaning the same neural code used to represent that picture could link cause and effect.
Because credit assignment is a learning process, Asaad noted, there should be a greater degree and fidelity of neural activity across time when the learning was occurring than when it was well established and merely being reapplied. Again, the researchers saw just such a pattern. Early in each new credit assignment exercise, neural fidelity associated with the correct image and rewarding result was greater than it was when the macaques had become accustomed to selecting the same correct image simply by habit.
Finally, if the neurons were truly assigning credit for the reward to the correct cause, then in the control experiment where location was the cause rather than image, the representation of the now-irrelevant image shouldn't still appear at the time of feedback. Again that proved to be the case. While some neurons still showed activity when images were first presented, that increased activity did not reoccur simultaneously with reward when the right answer was grid position rather than image.
"Together, these results are consistent with the notion that neurons in the dPFC provide the necessary, selective and stable representation of relevant features at the time of feedback to enable credit assignment," the researchers wrote in the journal.
The study does not rule out that other brain regions are also involved, Asaad noted. In fact, credit assignment almost certainly does involve other regions. But the new evidence shows that the dPFC is a key player.
Wael F. Asaad et al., Prefrontal neurons encode a solution to the credit assignment problem, The Journal of Neuroscience (2017). DOI: 10.1523/JNEUROSCI.3311-16.2017