To better understand the meaning and, above all, the complexity of machine vision, it is worth taking a step back to examine the theories of vision. Today's vision systems have deep historical roots, developed over many centuries, that deserve analysis if we want an overall picture of the subject.
The eye, and consequently vision, has been the subject of sharply contrasting interpretations since ancient times.
Plato, a firm believer in an "active action of the eye", wrote that light emanated from the eye itself, identifying and enveloping objects with its rays. Theophrastus, a disciple of Aristotle, was of the same opinion and wrote that the eye had "the fire inside", departing from the ideas of his master, who held instead that the eye received the rays.
The emissionist theory
Around 300 BC, Euclid wrote "Optica", which reported the results of his studies on the properties of light. He hypothesized that light travels in straight lines, described the laws of reflection and studied them mathematically, and argued that visual rays are emitted by the eye to capture the objects being observed.
Ptolemy and Galen further explored the refraction of light and also supported the emissionist theory, according to which objects are seen by means of rays of light that emerge from the eyes. Galen's great medical authority allowed his theory to exert considerable influence in Europe for much of the following thousand years. Proponents of the emissionist theory offered two pieces of evidence to support their belief:
- The salute of Greek soldiers, who raised a hand in front of their eyes, was said to stem from the need to protect themselves from the powerful light emanating from the eyes of their commanders;
- The light from the eyes of some animals, such as cats, was said to allow them to see in the dark.
The eye was also a topic of particular interest for medieval Islamic medicine and philosophy, which were strongly influenced by the treatises written by Galen. Numerous specialist treatises on ophthalmology appeared between the ninth and fourteenth centuries, and leading scholars, such as Al-Kindi and Hunain Ibn Ishaq, favored the emission theory of vision.
However, both scholars began to pay particular attention to the anatomy of the eye and to investigate the role of important parts such as the retina and the crystalline lens.
Criticisms of the emissionist theory
Avicenna, a famous Persian philosopher and scientist, offered a systematic critique of the Galenic theory of the eye. While he kept faith with the results of Galen's anatomical studies of the hollow nerves and the crystalline lens, at some point he began to express doubts about the emissionist theory, gradually approaching the concept of light entering the eyes.
At the beginning of the 10th century, the great clinician of Baghdad, Al-Razi (Rhazes), noticed the contraction and dilation of the pupil. A century later, Ibn al-Haytham (Alhazen) noted in his "Book of Optics" that the eye was hurt by strong light. Both scholars began to think that it was the light that struck the eye and not vice versa. Islamic literature was translated from Arabic into Latin between the eleventh and thirteenth centuries; as a result, medieval European doctors had much to discuss and investigate.
Renaissance anatomists studied the anatomy of the eye extensively. Leonardo da Vinci, for example, substantially revised his theory of vision: he first supported the emissionist theory but later moved toward the intromissionist theory.
The intromissionist theory
The theory of intromission had already been proposed by Democritus, who lived in Greece between the 5th and 4th centuries BC, and then by Lucretius (1st century BC), before finally being given concrete form by the Arab scholar Alhazen, who lived around the year one thousand. The intromissionist theory holds that vision is the result of various types of substances traveling into the eye, with nothing coming out of it. In the Renaissance, technological advances made important new contributions that allowed the intromissionist theory to gradually assert itself.
For example, the development of linear perspective in painting, a better understanding of the anatomy of the eye, the recognition of the real shape of the lens, the study and construction of the camera obscura (the "darkroom") and finally the study of spectacle lenses all played an important role. These advances provided the essential ingredients for Kepler's theory of retinal images, published in 1604.
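The inversion implied by a retinal image formed as in a darkroom (camera obscura) can be illustrated with the ideal pinhole model. This is only a minimal sketch under modern textbook assumptions, not Kepler's own formulation: the function name `project_pinhole`, the coordinate convention and the distance `f` between aperture and image surface are all hypothetical choices made for illustration.

```python
def project_pinhole(x: float, y: float, z: float, f: float = 1.0) -> tuple:
    """Project a scene point through an ideal pinhole (illustrative model only).

    (x, y, z) is a point in front of the aperture, with z > 0 its distance
    along the optical axis; f is the assumed distance from the aperture to
    the image surface. The minus signs express the inversion: what is up
    and to the right in the scene lands down and to the left on the image.
    """
    if z <= 0:
        raise ValueError("the point must lie in front of the aperture (z > 0)")
    return (-f * x / z, -f * y / z)


# A point above and to the right of the axis is imaged below and to the left.
image_point = project_pinhole(1.0, 2.0, 4.0, f=2.0)
print(image_point)
```

Both image coordinates change sign relative to the scene, which is exactly the "reversed and upside down" picture at the heart of the debate.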
It is at this point that proponents of the emissionist theory raised the question of why we do not see everything reversed and upside down if the eye behaves like a camera obscura, and this inability to explain perception has haunted science ever since. Proponents of the theory of interference began to argue that all images, real and virtual, somehow end up being realized inside the
Combination of emissionist and intromissionist theory
In his study of the intellectual development of children, Piaget (1896-1980) found that the majority of children between the ages of 10 and 11 think that sight involves an influence that moves outward from the eyes. Eighty percent of children between the ages of 8 and 9 agree that vision involves the movement, whether inward or outward, of rays, energy or something else indefinable.
In the same age group, 75% said they could perceive the gaze of others and 38% said they could perceive the gaze of animals. There is also a significant correlation between adults' belief in the ability to perceive gazes and their belief that something leaves the eyes when a person looks.
The belief in the ability to perceive the gaze of others is high among both children and adults and appears to increase with age: 92% of older children and adults are convinced that they can feel observed even when they cannot see the person watching them.
The computational approach
Vision has baffled scientists and philosophers for centuries and continues to do so. According to scholars of computational theory, the human being should be regarded as an information processor, for whom knowledge and understanding, in a broad sense, are a complex series of processes that lead to the construction of various representations of reality. The visual processes that lead to the complete realization of vision are likewise very complicated.
David Marr, an English psychologist, held that a full account of the visual process requires working through three successive levels, called computational, algorithmic and implementation. He also showed that even a low-level computational process, such as visual perception, is actually very complex.
Vision is therefore a cognitive system, organized into subsystems, that can be treated as a black box: it receives as input a pair of retinal images and produces as output a description of the objects contained in those images. The subsystems in this case could be color perception, size detection, surface type recognition and so on.
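The black-box view described above can be sketched in a few lines of Python. Everything here is a hypothetical illustration rather than Marr's model or any real vision library: the names `RetinalPair`, `perceive_color`, `detect_size` and `vision`, the RGB pixel representation and the brightness threshold are assumptions chosen only to show how independent subsystems could each contribute one field to the output description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Pixel = Tuple[int, int, int]   # assumed (R, G, B) values, each 0-255
Image = List[List[Pixel]]      # assumed non-empty grid of pixels


@dataclass
class RetinalPair:
    """Input of the black box: one image per eye (hypothetical type)."""
    left: Image
    right: Image


def perceive_color(pair: RetinalPair) -> str:
    """Toy color subsystem: name the channel with the highest mean value."""
    pixels = [p for image in (pair.left, pair.right) for row in image for p in row]
    means = [sum(p[c] for p in pixels) / len(pixels) for c in range(3)]
    return ("red", "green", "blue")[means.index(max(means))]


def detect_size(pair: RetinalPair, threshold: int = 128) -> int:
    """Toy size subsystem: count left-image pixels whose brightest channel exceeds threshold."""
    return sum(1 for row in pair.left for p in row if max(p) > threshold)


# Each subsystem is an independent function of the same input.
SUBSYSTEMS: Dict[str, Callable[[RetinalPair], object]] = {
    "color": perceive_color,
    "size_px": detect_size,
}


def vision(pair: RetinalPair) -> Dict[str, object]:
    """The black box: retinal images in, a description of the scene out."""
    return {name: subsystem(pair) for name, subsystem in SUBSYSTEMS.items()}


red, black = (200, 10, 10), (0, 0, 0)
image = [[red, black], [black, red]]
print(vision(RetinalPair(left=image, right=image)))
```

Keeping the subsystems in a dictionary means that a further one, such as surface type recognition, could be added without touching `vision` itself, which mirrors the modular organization the text describes.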
Vision theories: conclusions
In light of the most recent research, the process of vision appears to be quite complex. If the computational approach is accepted, a complete account of vision must also address the algorithmic and implementation levels.
Understanding the theory at the algorithmic level is the psychologist's task, while the implementation level involves correlating the descriptions and representations produced at the previous levels with the available data on the biology of the brain and the functioning of neurophysiological mechanisms.
A complete understanding of the visual process would therefore require coordinated, collaborative work across philosophy, psychology, computer science and neuroscience.