The present invention relates to a descriptor for the representation, from a
video indexing viewpoint, of motions of a camera or any kind of observer or observing
device within any sequence of frames in a video scene. The motions comprise at
least one of the following basic operations: fixed, panning (horizontal rotation),
tracking (horizontal transverse movement), tilting (vertical rotation), booming
(vertical transverse movement), zooming (changes of the focal length), dollying
(translation along the optical axis) and rolling (rotation around the optical axis),
or any combination of at least two of these operations. Each of said motion types,
except fixed, is oriented and subdivided into two components that stand for two
different directions, and represented by means of a histogram in which the values
correspond to a predefined size of displacement. The invention also relates to
an image retrieval system in which a video indexing device uses said descriptor.
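
By way of illustration only, the descriptor outlined above may be sketched as a data structure holding one histogram per oriented motion type and direction. This sketch is not the claimed implementation. The class name CameraMotionDescriptor, the direction labels, the parameters bin_size and num_bins, the sign convention and the per-frame input format are all assumptions introduced for the example. In particular, it assumes that each histogram bin counts the frames whose displacement falls within one multiple of the predefined displacement size.

```python
from dataclasses import dataclass, field

# Oriented motion types named in the abstract. Each is split into two
# directions; the direction labels themselves are illustrative assumptions.
DIRECTIONS = {
    "panning": ("left", "right"),
    "tracking": ("left", "right"),
    "tilting": ("down", "up"),
    "booming": ("down", "up"),
    "zooming": ("out", "in"),
    "dollying": ("backward", "forward"),
    "rolling": ("counterclockwise", "clockwise"),
}


@dataclass
class CameraMotionDescriptor:
    """Hypothetical container: one histogram per (motion type, direction)."""

    bin_size: float = 1.0   # assumed "predefined size of displacement"
    num_bins: int = 8       # assumed histogram length; last bin saturates
    fixed_frames: int = 0   # "fixed" is not oriented, so it is only counted
    histograms: dict = field(default_factory=dict)

    def __post_init__(self):
        for motion, directions in DIRECTIONS.items():
            for direction in directions:
                self.histograms[(motion, direction)] = [0] * self.num_bins

    def add_frame(self, displacements):
        """Record one frame.

        displacements maps a motion type to a signed magnitude. By assumption,
        a negative value selects the first direction in DIRECTIONS and a
        positive value selects the second.
        """
        moved = False
        for motion, value in displacements.items():
            if value == 0:
                continue
            negative, positive = DIRECTIONS[motion]
            direction = positive if value > 0 else negative
            index = min(int(abs(value) / self.bin_size), self.num_bins - 1)
            self.histograms[(motion, direction)][index] += 1
            moved = True
        if not moved:
            self.fixed_frames += 1


if __name__ == "__main__":
    descriptor = CameraMotionDescriptor(bin_size=2.0, num_bins=4)
    descriptor.add_frame({"panning": 3.0, "zooming": -9.0})  # combined motion
    descriptor.add_frame({})                                 # fixed frame
    print(descriptor.histograms[("panning", "right")])
    print(descriptor.fixed_frames)
```

A combination of basic operations is recorded simply by updating several histograms for the same frame, which matches the statement that the motions may be any combination of at least two of the listed operations.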