
Problems with Machine Vision

Ever thought about how hard it would be to get a computer to see? Ever thought about the complexities of the human eye? Hopefully this essay will enlighten you, or provoke your thinking, in these two areas.

Edge Detection

An essential part of vision is edge detection, since any vision system must be able to tell objects apart. Simple objects such as spheres and cubes (see below) are easy to define, but even then the computer is fooled by shadows and reflections.

Take a more complicated object, such as a computer. The example below shows how hard a real-life object can be to recognize, even for the trained human mind, let alone a computer. Some areas come out well: the CPU and the modem are distinguishable. The monitor, however (which is showing an 'infinite' image of itself), confuses the edge detection completely. For a robot or AI software to generalize and call that object a monitor would take an incredible amount of computation.

Why do edges help? Let's say you were using a neural network for image recognition. With grey values instead of RGB (red, green, blue), and stark white lines instead of subtle changes, the network has a much simpler input to work with, and accurate results become far more likely.
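To make the idea concrete, here is a minimal sketch of turning an RGB image into grey values and then into stark white edge lines. It uses the Sobel operator with a fixed threshold, which is one common choice and not necessarily what produced the pictures in this essay. The names `to_grey`, `edge_map` and the default `threshold` are assumptions made for illustration.

```python
# Sobel kernels: estimate the horizontal and vertical change in brightness.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def to_grey(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of grey values (plain average)."""
    return [[(r + g + b) / 3 for (r, g, b) in row] for row in rgb_image]


def edge_map(grey, threshold=100):
    """Return an image of 255 (edge) or 0 (no edge). Border pixels stay 0."""
    height, width = len(grey), len(grey[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    value = grey[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * value
                    gy += SOBEL_Y[j][i] * value
            # A large gradient magnitude means a sharp change in brightness.
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                out[y][x] = 255
    return out


# A tiny test image: dark on the left, bright on the right.
image = [[(0, 0, 0)] * 3 + [(255, 255, 255)] * 3 for _ in range(5)]
edges = edge_map(to_grey(image))
```

In this toy image the only white pixels in `edges` sit along the boundary between the dark and bright halves. A shadow falling across a real photograph would produce exactly the same kind of boundary, which is why shadows fool this approach.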

Depth Perception

Once the edges have been computed, they are not much use if nothing is known about depth. How does your brain decipher depth from the information it is given? It compares points and their relative positions in the images it is given, and 'calculates' how far away each object is.

To see this for yourself, try this example. Below is a picture of "generation5" taken from two slightly different angles. Focus your eyes so that a third image forms in the centre, between the two images. This third image will appear in 3D.

Imagine the processing power required to do this on a computer: it would be tremendous, and the result would probably be very slow. The only applications this could have are perhaps creating 3D models of terrain from satellite photography, and others where speed is not as important. A more practical method for robots might be to use lasers as range finders. This would be a lot quicker, but the results would be coarser.
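The point-comparison idea described in this section can be sketched for a single row of pixels. The code below slides a small window along the right-hand image to find where a patch from the left-hand image has moved to, then converts that shift (the disparity) into a distance using the standard pinhole relation depth = focal length × baseline / disparity. The function names, the window size and the camera numbers are all assumptions chosen for illustration. Repeating the search for every pixel of a full image is what makes the approach so expensive.

```python
def best_disparity(left_row, right_row, x, window=1, max_disparity=8):
    """Find how far the patch centred on left_row[x] has shifted in right_row.

    Uses the sum of absolute differences as the matching cost.
    The caller must keep x at least `window` pixels away from both ends.
    """
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        if x - d - window < 0:
            break  # the shifted window would fall off the left edge
        cost = sum(
            abs(left_row[x + k] - right_row[x - d + k])
            for k in range(-window, window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity, focal_length_px, baseline_m):
    """Pinhole stereo relation: nearer objects shift more between the views."""
    if disparity == 0:
        return float("inf")  # no shift: the point is effectively at infinity
    return focal_length_px * baseline_m / disparity


# The same small feature (9, 5, 9) seen at x=6 on the left and x=3 on the right.
left = [0, 0, 0, 0, 0, 9, 5, 9, 0, 0]
right = [0, 0, 9, 5, 9, 0, 0, 0, 0, 0]

shift = best_disparity(left, right, x=6)
# Hypothetical camera: 600 pixel focal length, lenses 0.1 m apart.
distance = depth_from_disparity(shift, focal_length_px=600, baseline_m=0.1)
```

Here the feature has moved three pixels between the two views, which with the assumed camera works out to roughly 20 metres.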

Dealing with Shadows and Shades

In the edge detection picture above, you can see how the computer thought that the shadows were shapes. If a successful machine vision program is ever to be created, a way of eliminating such problems will have to be devised. There is currently a way to do that, but again it is slow. It involves mapping an RGB colour into a cube representing all colours, with one diagonal representing all intensities (basically all shades from white to black). Then, using various properties of the plane, the hue of the colour can be determined. Again, this process is very slow, and is only useful in AI software packages created for image analysis. A faster algorithm has yet to be devised (as far as the author knows).
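As a rough illustration of the colour-cube idea, the sketch below projects an RGB colour onto the plane perpendicular to the grey diagonal of the cube and takes the angle within that plane as the hue. This is one common formulation and only an assumption about the method described above. The function name `hue_degrees` is made up for the example.

```python
import math


def hue_degrees(r, g, b):
    """Hue as an angle around the grey (black-to-white) diagonal of the RGB cube.

    alpha and beta are the colour's coordinates in the plane perpendicular
    to that diagonal, so brightness along the diagonal drops out.
    Returns None for pure greys, which have no hue.
    """
    alpha = r - 0.5 * (g + b)
    beta = (math.sqrt(3) / 2) * (g - b)
    if alpha == 0 and beta == 0:
        return None
    return math.degrees(math.atan2(beta, alpha)) % 360


# The same surface colour in full light and at half intensity (in shadow).
lit = hue_degrees(200, 100, 50)
shadowed = hue_degrees(100, 50, 25)
```

Because a shadow mostly scales a colour towards black, `lit` and `shadowed` come out as the same angle, so a program comparing hues rather than raw brightness would not mistake the shadow boundary for the edge of a new object. Real shadows also shift colour slightly, so this is only an approximation.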