Why AI fails to reproduce human vision
Toronto: While computers may be able to spot a familiar face or an oncoming vehicle faster than the human brain, their accuracy is questionable.
Computers can be taught to process incoming data, such as images of faces and cars, using a form of artificial intelligence (AI) known as deep neural networks, or deep learning. This type of machine learning uses interconnected nodes, or neurons, in a layered structure that resembles the human brain.
The key word is “resembles.” Despite the power and promise of deep learning, computers have yet to master human calculations and, crucially, the communication and connection between the body and the brain, specifically when it comes to visual recognition, according to a study led by Marieke Mur, a neuroimaging expert at Western University in Canada.
“While promising, deep neural networks are far from being perfect computational models of human vision,” said Mur.
Previous studies have shown that deep learning cannot perfectly reproduce human visual recognition, but few have attempted to establish which aspects of human vision deep learning fails to emulate.
The team used a non-invasive medical test called magnetoencephalography (MEG) that measures the magnetic fields produced by a brain’s electrical currents. Using MEG data acquired from human observers during object viewing, Mur and her team detected one key point of failure.
They found that readily nameable parts of objects, such as “eye,” “wheel,” and “face,” can account for variance in human neural dynamics over and above what deep learning can deliver.
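The phrase "over and above" can be illustrated with a nested regression: fit the neural signal once with model-derived predictors alone, once with the part labels added, and compare the variance explained. The sketch below is only an illustration of that general idea, not the study's analysis, which is not described here. The data are synthetic, and the names `dnn_features`, `part_labels`, and `neural` are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """Fraction of variance in y explained by a least-squares fit on X."""
    X1 = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return 1.0 - resid.var() / y.var()

# Synthetic stand-ins (assumptions, not real MEG or network data).
rng = np.random.default_rng(0)
n = 500
dnn_features = rng.normal(size=(n, 3))  # hypothetical network-derived predictors
part_labels = rng.normal(size=(n, 2))   # hypothetical "eye"/"wheel"-style predictors
neural = (
    dnn_features @ np.array([1.0, 0.5, -0.5])
    + part_labels @ np.array([0.8, -0.6])
    + rng.normal(scale=0.5, size=n)     # measurement noise
)

# Variance explained by the network predictors alone, then with part labels added.
r2_dnn = r_squared(dnn_features, neural)
r2_full = r_squared(np.column_stack([dnn_features, part_labels]), neural)

# The difference is the variance the part labels explain beyond the network predictors.
unique_parts = r2_full - r2_dnn
print(f"network only: {r2_dnn:.2f}, with part labels: {r2_full:.2f}, "
      f"unique to part labels: {unique_parts:.2f}")
```

A positive difference in this toy setup means the part labels carry information about the signal that the network predictors miss, which is the kind of gap the finding describes.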
“These findings suggest that deep neural networks and humans may in part rely on different object features for visual recognition and provide guidelines for model improvement,” said Mur.
The study shows that deep neural networks cannot fully account for neural responses measured in human observers viewing photos of objects, including faces and animals. This has major implications for the use of deep learning models in real-world settings, such as self-driving vehicles.
“This discovery provides clues about what neural networks are failing to understand in images, namely visual features that are indicative of ecologically relevant object categories such as faces and animals,” said Mur.
“We suggest that neural networks can be improved as models of the brain by giving them a more human-like learning experience, like a training regime that more strongly emphasises behavioural pressures that humans are subjected to during development.”
For example, it is important for humans to quickly identify whether an object is an approaching animal and, if so, to predict its next consequential move. Integrating these pressures during training may improve the ability of deep learning approaches to model human vision.