Eye tracking during visual challenges reveals neural encoding
"Seeing eye to eye" is an expression of harmony, but do different people literally see the same thing when they view the same external world?
"The short answer is—no," says Dr. Liron Gruber. "Even the same person sees the same thing differently each time they look at it," adds Prof. Ehud Ahissar.
Gruber and Ahissar, of the Brain Sciences Department at the Weizmann Institute of Science, arrived at these conclusions after conducting a study in which they investigated intriguing discrepancies between human and computer vision uncovered by Weizmann mathematicians.
Those researchers, headed by Prof. Shimon Ullman of the Computer Science and Applied Mathematics Department, had found that a computer algorithm, no matter how clever, was much worse than humans at interpreting image fragments known as minimally recognizable configurations, or MIRCs (middle row in the figure above)—that is, at telling from which objects (top row) these fragments had been derived. Moreover, when the researchers gradually cropped or blurred the MIRCs, recognition by the computer decreased in a linear fashion, whereas among human participants it dropped abruptly at a certain cut-off point (bottom row).
Gruber realized that experiments involving MIRCs could provide a wealth of data about the workings of the human visual system. In an earlier study, she and Ahissar, her Ph.D. advisor, had already shown that, contrary to the widely accepted view, the human eye doesn't work like a camera that takes passive snapshots. In a later study published in the journal Proceedings of the National Academy of Sciences, she and Ahissar teamed up with computer scientist Ullman to put human vision to the test.
Identifying MIRCs typically takes people a relatively long time—over two seconds, which is more than six times longer than the 300 or so milliseconds needed to recognize whole objects. The researchers recorded the eye movements of people attempting to recognize MIRCs and, using a computational model, simulated the resulting activities of neurons in the retina.
These activity patterns not only varied with different eye movements but also differed depending on whether or not people managed to recognize the object in the picture. On average, recognition took four rounds of scanning, with the eyes moving to a different point in the picture each time; at each point the eyes drifted locally in all directions for several hundred milliseconds.
The results indicated that the interactions between eye movements and the object are critical to recognition. In fact, when the researchers canceled out the interactions between the objects and the eye movements—for example, by moving the pictures in step with the eyes—study participants failed to recognize the objects.
"The retina doesn't create copies of the outside world—unlike a camera, which reproduces external patterns on film or digitally. Rather, human vision is an active process that involves interactions between the external objects and eye movements," Ahissar says. "The eyes of different people follow different paths when viewing the same thing, and even the eyes of the same person never copy the same trajectory, so in a way, each time we look at something, it's a one-off experience."
So how does the brain encode visual reality? More precisely, how does this encoding result from the interactions between the eye movements and the object? Gruber says, "When we look at an object or scene, the light picked up by each receptor in the retina changes in intensity with every eye movement. The resultant patterns of neuronal activity can be interpreted and perhaps stored by the brain."
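Gruber's description can be illustrated with a toy simulation. The sketch below is not the computational model used in the study; it is a minimal illustration under simplifying assumptions: a 32-by-32 image containing a single bright bar, a random walk standing in for ocular drift, and one receptor that reports the raw image intensity under it. All function names and parameters (make_image, drift_path, receptor_signal, count_changes) are hypothetical and chosen only for this example.

```python
import random


def make_image(width=32, height=32):
    """Toy image: dark background with a bright vertical bar at columns 14-17."""
    return [[1.0 if 14 <= x < 18 else 0.0 for x in range(width)]
            for _ in range(height)]


def drift_path(steps, start=(16, 16), seed=0, lo=4, hi=27):
    """Random-walk gaze positions: a crude, assumed stand-in for ocular drift."""
    rng = random.Random(seed)
    x, y = start
    path = []
    for _ in range(steps):
        # Move by at most one pixel per step, clamped so the receptor
        # (gaze position plus a small offset) stays inside the image.
        x = min(hi, max(lo, x + rng.choice((-1, 0, 1))))
        y = min(hi, max(lo, y + rng.choice((-1, 0, 1))))
        path.append((x, y))
    return path


def receptor_signal(image, path, offset=(0, 0), stabilized=False):
    """Intensity seen over time by one receptor at a fixed offset from the gaze.

    With stabilized=True the image is assumed to move in step with the eye,
    so the receptor keeps sampling the same image location.
    """
    x0, y0 = path[0]
    signal = []
    for x, y in path:
        gx, gy = (x0, y0) if stabilized else (x, y)
        signal.append(image[gy + offset[1]][gx + offset[0]])
    return signal


def count_changes(signal):
    """Number of time steps at which the receptor's input changes."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a != b)


if __name__ == "__main__":
    image = make_image()
    for seed in (0, 1):
        path = drift_path(300, seed=seed)
        free = receptor_signal(image, path)
        fixed = receptor_signal(image, path, stabilized=True)
        print("seed", seed,
              "| changes with free drift:", count_changes(free),
              "| changes when stabilized:", count_changes(fixed))
```

In this toy setting, each drift path produces its own time series at the receptor, which mirrors the point that no two viewings yield the same retinal input. When the image is made to follow the eye, as in the stabilization experiment described earlier, the receptor's input stops changing altogether, so the movement-dependent signal disappears.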
These findings represent a new direction in the search for the neural code, that is, how information is encoded in the brain. Unlike the ubiquitous genetic code, the neural code probably varies from one brain region to another. The findings show that the retinal code results from a dynamic process in which the brain interacts with the external reality it encounters through the senses. They explain why it takes time to recognize a blurred object or to figure out optical illusions, for example, to spot a "hidden" Dalmatian amidst black patches on a white surface: grasping such complex images requires scanning with the eyes.
Once human vision—from eye movement to neural encoding—is better understood, it may be possible to develop efficient artificial aids for the visually impaired and to teach robots to catch up with humans in recognizing objects under challenging conditions.
More information: Liron Zipora Gruber et al., "Oculo-retinal dynamics can explain the perception of minimal recognizable configurations," Proceedings of the National Academy of Sciences (2021). DOI: 10.1073/pnas.2022792118