Techniques for retrieval of multimedia data through visual representations
are provided. Such visual representations, preferably in the form of
visual activity maps or spatio-temporal activity maps, serve as an
efficient and intuitive graphical user interface for multimedia
retrieval, particularly when the media streams are derived from multiple
sensors observing a physical environment. An architecture for interactive
media retrieval is also provided, which combines such visual activity maps
with domain-specific event information. Visual activity maps are derived
from the motion trajectories of objects in the environment. Techniques
based on visual activity maps help users quickly and effectively discover
interesting portions of the data, and then randomly access and retrieve
the corresponding portions of the media streams.
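
As an illustration only, the following minimal Python sketch shows one way a spatio-temporal activity map might be built from object trajectories and then used to look up the time spans of media to retrieve. The grid-based representation, the function names (build_activity_map, activity_counts, time_spans_for_region), and the parameters cell_size and max_gap are assumptions made for this example; they are not taken from the described techniques.

```python
from collections import defaultdict


def build_activity_map(trajectories, cell_size=1.0):
    """Accumulate trajectory samples into a spatial grid (assumed representation).

    trajectories: dict mapping object id -> list of (t, x, y) samples.
    Returns a dict mapping (col, row) grid cell -> list of (t, object_id).
    """
    grid = defaultdict(list)
    for obj_id, samples in trajectories.items():
        for t, x, y in samples:
            cell = (int(x // cell_size), int(y // cell_size))
            grid[cell].append((t, obj_id))
    return grid


def activity_counts(grid):
    """Number of trajectory samples per cell, e.g. for rendering a heat map."""
    return {cell: len(hits) for cell, hits in grid.items()}


def time_spans_for_region(grid, cells, max_gap=1.0):
    """Merge the timestamps seen in the selected cells into (start, end) spans.

    Timestamps closer together than max_gap are treated as one span, so each
    span can be used as a random-access range into the media streams.
    """
    times = sorted(t for cell in cells for t, _ in grid.get(cell, []))
    spans = []
    for t in times:
        if spans and t - spans[-1][1] <= max_gap:
            spans[-1][1] = t  # extend the current span
        else:
            spans.append([t, t])  # start a new span
    return [tuple(span) for span in spans]


# Hypothetical trajectories: object "a" crosses two cells, "b" appears later.
trajectories = {
    "a": [(0.0, 0.5, 0.5), (1.0, 0.6, 0.7), (2.0, 1.5, 0.5)],
    "b": [(10.0, 0.2, 0.2)],
}
grid = build_activity_map(trajectories)
counts = activity_counts(grid)
spans = time_spans_for_region(grid, [(0, 0)])
```

In this sketch, a user selecting a region of the map corresponds to passing its cells to time_spans_for_region; the returned spans would then drive retrieval from whichever media streams cover that region.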