The system includes an image display system, a direct annotation creation
module, an annotation display module, a vocabulary comparison module and
a dynamic updating module. These modules are coupled together by a bus
and provide for the direct multi-modal annotation of media
objects. The direct annotation creation module creates annotation
objects. The annotation display module works in cooperation with the
image display system to display the annotations themselves or graphic
representations of the annotations, positioned relative to the images of
the objects. Given an audio input and one or more selected images, the
system automatically creates the annotation, associates it with the
selected images, and displays either a graphic representation of the
annotation or a text translation of the audio input.
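
The create, associate, and display flow described above can be sketched roughly as follows. This is only an illustrative sketch, not the implementation described here: the names Annotation, AnnotationSystem, annotate, render, and the transcribe callback are all hypothetical, and the "[audio annotation]" string merely stands in for a graphic representation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Annotation:
    """Hypothetical annotation object: raw audio plus an optional text form."""
    audio: bytes
    image_ids: List[str]
    text: Optional[str] = None


class AnnotationSystem:
    """Hypothetical stand-in for the creation and display modules."""

    def __init__(self, transcribe: Optional[Callable[[bytes], str]] = None):
        # Assumption: a caller-supplied speech-to-text function, if any.
        self._transcribe = transcribe
        self._by_image: Dict[str, List[Annotation]] = {}

    def annotate(self, audio: bytes, selected_image_ids: List[str]) -> Annotation:
        # Create the annotation, converting audio to text when possible.
        text = self._transcribe(audio) if self._transcribe else None
        annotation = Annotation(audio, list(selected_image_ids), text)
        # Associate the annotation with every selected image.
        for image_id in annotation.image_ids:
            self._by_image.setdefault(image_id, []).append(annotation)
        return annotation

    def render(self, image_id: str) -> List[str]:
        # Show the text form if available, otherwise a placeholder
        # standing in for a graphic representation of the annotation.
        return [
            a.text if a.text is not None else "[audio annotation]"
            for a in self._by_image.get(image_id, [])
        ]
```

In this sketch a single annotation is shared by all selected images, so one audio input can label several images at once; whether the described system stores annotations this way is an assumption.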