In his epic 1968 film 2001: A Space Odyssey, Stanley Kubrick incorporated unforgettable scenes in which the sentient computer, HAL, watches Dave, the scientist on board the Discovery One spacecraft. This portrayal of how a machine perceives the world is ultimately defined by our own perception of it.
In practice, we have built and taught AI to understand the world as if looking through our own eyes. As a result, most artificial vision systems rely on cameras that produce images intended for human viewing, which then form the basis for training a neural network. However, forcing robotic eyes to see through our own cognitive interpretation may actually impair their true potential.
But what alternatives could robots use? It is hard to think about perceiving the world beyond our own experience. Biological organisms' ability to comprehend reality was shaped by survival necessities and molded by evolution over hundreds of millions of years. AI, however, is not confined to this construct.
Research published in the journal Advanced Intelligent Systems tries to address this issue by proposing a new approach to artificial vision. It aims to implement an intelligent visual perception apparatus that mimics biological retinal cells and their connecting neurons at a fundamental level. In addition, it incorporates a small, hardware-based artificial neural network intended to perform rudimentary tasks similar to the elementary functions of the visual cortex.
How “smart” do these robotic eyes need to be?
Robotic eyes with artificial visual perception are key to some important technologies, such as automotive safety systems, industrial fabrication, and even advanced medical equipment. Such platforms are usually expensive, since they are based on a camera that captures images, which are then processed by a complex AI algorithm running on a powerful CPU. Moreover, the AI must undergo extensive offline training on large, detailed databases.
But do all intelligent machine applications require high-end hardware with high-level cognition in their learning process?
Abstract comprehension can be enough in some applications, where the AI only needs to make basic or sweeping judgments. For example, identifying a ball or round object can suffice in certain situations, without requiring the system to differentiate between a basketball and a baseball. Such an AI does not need to rely on a CPU and would be considerably cheaper. Moreover, it may even operate without a conventional camera. For instance, abstract vision may be used simply to identify defective, misshapen balls.
The answer to the previous question thus directly affects the cost and complexity of AI systems. In the first case, a large artificial neural network is required, while the second can be addressed by fabricating small, cheap building blocks that operate in parallel. Returning to the previous example, both basketballs and baseballs have unique, differentiating features.
A high-comprehension network must account for such nuances and learn to classify them correctly. The amount of information fed into the AI can therefore be quite large. As a result, network size and complexity can grow rapidly, and the associated energy expenditure grows even faster.
On the other hand, an AI with abstract comprehension may be small and simple, designed merely to identify whether or not the captured image contains a shape with a single axis of symmetry. A cluster of such units can tell when a ball-shaped object is present by finding multiple axes of symmetry within the image. The integrated decision from all the individual units thus produces a relatively sophisticated answer. Such a system may be implemented using dedicated, minimalistic hardware, as was shown in the current study.
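To make the idea concrete, here is a minimal software sketch of such a cluster. It assumes the input has already been reduced to a centred, square, binary (0/1) silhouette; the function names and the threshold value are hypothetical, and the study's hardware works very differently:

```python
import numpy as np

def mirror_scores(img):
    """Four minimalistic 'units', each scoring mirror symmetry about one axis
    of a centred, square, binary (0/1) silhouette. 1.0 means perfectly symmetric."""
    img = np.asarray(img, dtype=float)
    reflections = {
        "vertical axis":   np.fliplr(img),      # left-right mirror
        "horizontal axis": np.flipud(img),      # up-down mirror
        "main diagonal":   img.T,               # transpose
        "anti-diagonal":   np.rot90(img.T, 2),  # transpose plus 180-degree rotation
    }
    return {axis: 1.0 - np.abs(img - ref).mean() for axis, ref in reflections.items()}

def looks_round(img, threshold=0.9):
    """Integrated decision: a ball-like blob is symmetric about all four axes,
    while a misshapen one typically fails at least one unit."""
    return all(score >= threshold for score in mirror_scores(img).values())
```

Each unit does almost nothing on its own; only the combined vote over all axes says anything ball-like is present, which is the essence of the abstract-comprehension approach.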
Ubiquitous intelligent vision systems usually adhere to the high-level approach and rely on software, where the AI is implemented as a learning algorithm. Those algorithms must go through a preliminary training process before they can make their own predictions.
Object-oriented programming provides many degrees of freedom and allows for a straightforward implementation, where code entities take the role of artificial neurons. Each such neuron is represented by a mathematical function built from a very large number of multiplication and summation operations. These complex algorithms require powerful CPUs that consume a great deal of power to process the interactions between thousands of multi-variable software neurons.
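As a rough illustration of what each of those software neurons computes, here is the generic textbook formulation (not code from the study): a weighted sum of the inputs followed by a nonlinear activation.

```python
import numpy as np

class SoftwareNeuron:
    """A single software neuron: a weighted sum of its inputs followed by a
    nonlinear activation. Large networks chain thousands of these, which is
    where the heavy multiply-and-add workload comes from."""

    def __init__(self, n_inputs, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.normal(scale=0.1, size=n_inputs)
        self.bias = 0.0

    def forward(self, x):
        # One multiplication and one addition per input, repeated for every
        # neuron in every layer of the network.
        z = float(np.dot(self.weights, x)) + self.bias
        return 1.0 / (1.0 + np.exp(-z))  # sigmoid activation
```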
As for abstract-level intelligence, dedicated neural processors may be used. Such computers are composed of hardware-based neurons orchestrated to cooperatively perform tasks that surpass their inherent level of sophistication. In this light, the research demonstrated a cheap and simple implementation, used to control a robotic vehicle, with merely four hardware neurons that could be trained on the fly. In addition, bioinspired image acquisition allowed the input data size to be reduced considerably.
These concepts were demonstrated using a prototype vision platform that maneuvered a small robotic vehicle. It was based on a microcontroller and a fully programmable integrated circuit. The system incorporated a neural processor that was trained on the fly to associate a set of hieroglyphs with motor-control instructions. These hieroglyphic symbols were linked to commands such as "go forward and then turn right" or "go back and then turn left".
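The sketch below gives a flavour of how a handful of simple neurons could learn such symbol-to-command associations on the fly. It is only a software analogy with hypothetical names (COMMANDS, TinySymbolController) and a perceptron-style update, assuming each symbol has already been reduced to a small feature vector; it is not the hardware circuit described in the paper.

```python
import numpy as np

# Hypothetical command set; the study links hieroglyph-like symbols to
# compound moves such as "go forward and then turn right".
COMMANDS = ["forward-then-right", "forward-then-left",
            "back-then-right", "back-then-left"]

class TinySymbolController:
    """Four perceptron-style neurons, one per motor command, trained online
    (on the fly) from labelled symbol presentations. A software stand-in for
    the study's hardware neurons, not a description of its actual circuit."""

    def __init__(self, n_features, lr=0.1):
        self.W = np.zeros((len(COMMANDS), n_features))  # one weight row per neuron
        self.lr = lr

    def predict(self, features):
        # Winner-take-all over the four neuron activations.
        return COMMANDS[int(np.argmax(self.W @ features))]

    def train_step(self, features, label):
        # Strengthen the neuron for the presented command, weaken the others.
        target = np.full(len(COMMANDS), -1.0)
        target[COMMANDS.index(label)] = 1.0
        output = np.tanh(self.W @ features)
        self.W += self.lr * np.outer(target - output, features)
```

In a winner-take-all scheme like this, a few labelled presentations of each symbol are typically enough for the weights to settle, which is what makes on-the-fly training plausible with so few neurons.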
So do robotic eyes need to "see" through a camera that was designed to satisfy human perception? Honestly, no. And lifting this limitation may allow them to move into areas previously unimagined.
Dan Berco, Chih-Hao Chiu, and Diing Shenp Ang, Hieroglyphically-supervised Bioinspired Visio-neural Controller, Advanced Intelligent Systems (2022), DOI: 10.1002/aisy.202200066
Disclaimer: The author of this article was involved in the study