Indeed on the AIBO, the camera is the main source of sensory information, and as such, we placed a strong emphasis on the vision component of our team. Since computer vision is a current area of active research, there is not yet any perfect solution. As such, our vision module has undergone continual development over the course of this multi-year project. This lecture focusses on the progress made during our first year as an example of what can be done relatively quickly. During that time, the vision reached a sufficient level to support all of the localization and behavior achievements described in the rest of this lecture. Our progress since the first year is detailed in our 2004 and 2005 team technical reports , as well as a series of research papers Our vision module processes the images taken by the CMOS camera located on the AIBO. The module identifies colors in order to recognize objects, which are then used to localize the robot and to plan its operation.
Our visual processing is done using the established procedure of color segmentation followed by object recognition. Color segmentation is the process of classifying each pixel in an input image as belonging to one of a number of predefined color classes based on the knowledge of the ground truth on a few training images. Though the fundamental methods employed in this module have been applied previously (both in RoboCup and in other domains), it has been built from scratch like all the other modules in our team. Hence, the implementation details provided are our own solutions to the problems we faced along the way.
We have drawn some of the ideas from the previous technical reports of CMU [89]andUNSW[9]. This module can be broadly divided into two stages: (i) low-level vision, where the color segmentation and region building operations are performed and (ii) high-level vision, wherein object recognition is accomplished and the position and bearing of the various objects in the visual field are determined.