Disclosed is a robot visuoauditory system that processes data in real time to
track an object both visually and auditorily, that integrates visual and auditory
information on the object so that the object can be tracked without fail, and
that visualizes the real-time processing.
In the system, the audition module (20) in response to sound
signals from microphones extracts pitches therefrom, separates their sound sources
from each other and locates the sound sources so as to identify a sound source as
at least one speaker, thereby extracting an auditory event (28) for each
object speaker. The vision module (30) on the basis of an image taken by
a camera identifies by face, and locates, each such speaker, thereby extracting
a visual event (39) therefor. The motor control module (40) for turning
the robot horizontally extracts a motor event (49) from a rotary position
of the motor. The association module (60) for controlling these modules
forms from the auditory, visual and motor events an auditory stream (65)
and a visual stream (66) and then associates these streams with each other
to form an association stream (67). The attention control module (64)
effects attention control that plans the course along which to control
the drive motor, e.g., upon locating the sound source for the auditory event and
locating the face for the visual event, thereby determining the direction in which
each speaker lies. The system also includes a display (27, 37, 48, 68) for
displaying at least a portion of auditory, visual and motor information. The attention
control module (64) servo-controls the robot on the basis of the association
stream or streams.
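
The association step described above can be illustrated with a minimal sketch. It
is not the disclosed implementation. It assumes that each stream carries a
direction relative to the robot's heading and the motor's rotary position at the
time of observation, and that an auditory and a visual stream are associated when
their absolute directions lie within a hypothetical tolerance (max_gap_deg). The
names Stream, to_absolute, angular_gap and associate are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stream:
    kind: str             # "auditory" or "visual"
    direction_deg: float  # direction relative to the robot's heading
    motor_deg: float      # motor rotary position when observed (motor event)


def to_absolute(s: Stream) -> float:
    """Combine relative direction and motor position into an absolute direction."""
    return (s.direction_deg + s.motor_deg) % 360.0


def angular_gap(a_deg: float, b_deg: float) -> float:
    """Smallest absolute angle between two directions, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def associate(auditory: List[Stream], visual: List[Stream],
              max_gap_deg: float = 10.0) -> List[Tuple[Stream, Stream]]:
    """Greedily pair each auditory stream with the closest unused visual stream.

    max_gap_deg is an assumed tolerance, not a value taken from the source.
    """
    pairs = []
    used = set()
    for a in auditory:
        best, best_gap = None, max_gap_deg
        for i, v in enumerate(visual):
            if i in used:
                continue
            gap = angular_gap(to_absolute(a), to_absolute(v))
            if gap <= best_gap:
                best, best_gap = i, gap
        if best is not None:
            used.add(best)
            pairs.append((a, visual[best]))
    return pairs
```

In this sketch an auditory stream with no visual stream inside the tolerance
simply stays unassociated, so it could still be tracked on its own.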