An optimal decision-making strategy emerges from nonstop learning
Unlike machines, the behavior of animals and humans almost always has an element of unpredictability. Countless experiments have shown that our responses to the exact same challenge are sometimes faster, sometimes slower, sometimes correct and sometimes wrong.
In the field of neuroscience, this variability is often attributed to what is called noise, an ever-present "neural babble" that influences the way brains process and respond to incoming information.
A new collaborative study in rodents by a team of scientists from the Champalimaud Centre for the Unknown in Portugal, Harvard Medical School in the U.S., and the University of Geneva in Switzerland, shows that this variability could sometimes be wrongly interpreted as noise. Instead, it may actually be the reflection of a behavioral strategy that was overlooked due to prior assumptions about how the subject should behave. Their results—published today (June 2nd) in the scientific journal Nature Communications—call into question what "optimal behavior" really means.
An unexpected strategy
"It all started with a simple experiment," recalls Maria Inês Vicente, who collected the experimental data as part of her graduate work at the Champalimaud Centre for the Unknown and is currently working at Leiden University . "We took two different odors and created several mixtures of the two. During the experiment, the different mixtures were presented to the rats, one at a time. On each trial, the rats had to report which of two odors was more dominant. If it thought the answer was odor A, it would approach a water spout on the right, and if it opted for odor B, it would go to the left. Some mixtures had much more of one odor compared with the other, making it easier to tell which was more salient. Whereas in other mixtures, the difference was more subtle. If the rat got the correct answer, it received a water reward."
The researchers recorded how quickly the rats responded and whether their answer was right or wrong. To their surprise, when they analyzed the data, they realized that the rats' behavior didn't follow a common decision-making rule. "In these types of tasks we tend to see a clear dependency between difficulty and decision time: On the harder, more subtle trials, animals (and humans) take longer to decide than on easy trials," says André Mendonça of the Champalimaud Centre for the Unknown. "Instead, our rats would take, on average, the same amount of time to make both hard and easy decisions."
"The explanation for this unexpected observation wasn't easy to come by," adds Jan Drugowitsch, a co-author affiliated with Harvard Medical School. "Finally, we found it by constructing a mathematical model that united separate branches in the field of decision-making. In a sense, our goal was to replicate the rats' behavior in a 'machine brain' with the hope of discovering the underlying variables that produced this surprising result."
The model revealed an unexpected strategy. On each trial, the rat was readjusting its behavior according to the results of the previous trial. If the rat was correct in one trial, it would be biased towards the same odor in the next one. And vice versa, an incorrect response in one trial would lead to switching in the next.
Why did the animals adopt this particular strategy? "This strategy is consistent with a worldview where the environment is continuously changing, which leads the animals to update their decision-making approach on a trial-by-trial basis. From the outside, their behavior appears highly variable, but in fact, they were just adapting too quickly. That is why it would have been easy to wrongly interpret this variablity as just noise," Drugowitsch says.
Optimality is in the eye of the beholder
Why did the rats opt for a different strategy from the expected one? The authors explain that there are several reasons. The first is the nature of the task. "There isn't just one type of sensory discrimination task," says Mendonça. "Various elements in the design of the task may draw out different decision-making strategies. For instance, if we had asked rats to localize the side a sound comes from instead of discriminating between odors, their strategy would have aligned with our initial expectation. This is because there is a built-in right-left category in the brain for certain sensory modalities that are naturally spatially separated, but that's not the case for olfaction."
Another reason is confidence. "Just like humans, rats appear to evaluate their own decisions and change their behavior accordingly. When you are very confident and end up making the correct decision, there's really not much to learn. But what happens when you're confident but then find out that you're actually wrong? In this case, you should change your behavior drastically. Which is precisely what we saw with our rats," says Zachary Mainen, one of the group leaders who headed the study and who is affiliated with the Champalimaud Centre for the Unknown.
According to the authors, another explanation for the rat's choice of strategy is their "hard-wired" circuitry for learning. "Ironically, if they would not constantly readjust their responses according to the outcome of the last trial, they would actually do better. In fact, what we were originally expecting them to do is to construct an 'odor A-odor B' category and implement it," says Alex Pouget, who is a group leader at the University of Geneva and co-author of the study. "Still, the rats' strategy makes sense."
As the authors explain, this observation doesn't mean the rat is a maladapted animal. On the contrary, they claim that the scientific community should reconsider what they define as "optimal behavior."
"Rats have evolved over millions of years to search and explore an ever-changing environment. Therefore, when we assess the behavior of these animals, we should remember that it's not necessarily only about performance per se. Optimality should depend both on the problem at hand and the nature of the problem solver," Pouget argues.
"We believe that our work is a good starting point for exploring further how different subfields of decision-making may interact. We also hope that other scientists will use and refine our models in follow-up experiments. It would be fascinating and informative to see when, how and why our model starts to fail. Making an error is an opportunity for learning something new, and that is both the result and take-home message of our study," Mendonça says.