The present invention provides a method for the recognition of objects in
an image, where the objects may consist of an arbitrary number of parts
that are allowed to move with respect to each other. In the offline phase,
the invention automatically learns the relative movements of the individual
object parts from a sequence of example images and builds a hierarchical
model that incorporates a description of the individual object parts, the
relations between the parts, and an efficient search strategy. This is
done by analyzing the pose variations (e.g., variations in position,
orientation, and scale) of the individual object parts in the example images.
The poses can be obtained by an arbitrary similarity measure for object
recognition, e.g., normalized cross correlation, Hausdorff distance,
generalized Hough transform, the modification of the generalized Hough
transform, or the similarity measure. In the online phase, the invention
uses the hierarchical model to efficiently find the entire object in the
search image. During the online phase, only valid instances of the object
are found, i.e., the object parts are not searched for in the entire
image but only in a restricted portion of parameter space that is defined
by the relations between the object parts within the hierarchical model,
which facilitates an efficient search and makes a subsequent validation
step unnecessary.
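
The two phases described above can be illustrated with a minimal sketch. It is not the claimed implementation: the pose format (x, y, angle), the helper names relative_pose, learn_relation, and is_valid, the part names, the example poses, and the tolerance values are all assumptions introduced for illustration. The sketch further assumes that the poses of the parts in each example image have already been obtained by some similarity measure, that scale is ignored, and that the relative angles do not wrap around at plus or minus pi.

```python
import math

# Hypothetical pose format: (x, y, angle in radians) in image coordinates.

def relative_pose(ref, part):
    """Express the pose of `part` in the coordinate frame of `ref`."""
    rx, ry, ra = ref
    px, py, pa = part
    dx, dy = px - rx, py - ry
    c, s = math.cos(-ra), math.sin(-ra)
    # Wrap the angle difference into [-pi, pi).
    da = (pa - ra + math.pi) % (2 * math.pi) - math.pi
    return (c * dx - s * dy, s * dx + c * dy, da)

def learn_relation(example_poses, ref_name, part_name):
    """Offline phase: range of relative poses observed over the example images.

    Returns (lows, highs), each a 3-tuple over (x, y, angle).
    """
    rels = [relative_pose(p[ref_name], p[part_name]) for p in example_poses]
    lows = tuple(min(r[i] for r in rels) for i in range(3))
    highs = tuple(max(r[i] for r in rels) for i in range(3))
    return lows, highs

def is_valid(relation, ref_pose, candidate_pose, pos_tol=1.0, ang_tol=0.01):
    """Online phase: accept a candidate only inside the learned range.

    The tolerances (assumed values) widen the range slightly to allow for noise.
    """
    lows, highs = relation
    rel = relative_pose(ref_pose, candidate_pose)
    tols = (pos_tol, pos_tol, ang_tol)
    return all(lo - t <= v <= hi + t
               for v, lo, hi, t in zip(rel, lows, highs, tols))

# Made-up example poses of two parts in two example images.
examples = [
    {"body": (100.0, 100.0, 0.0), "arm": (150.0, 100.0, 0.0)},
    {"body": (120.0, 80.0, math.pi / 2), "arm": (120.0, 130.0, math.pi / 2 + 0.2)},
]
relation = learn_relation(examples, "body", "arm")
```

In this sketch, once the reference part has been found in the search image, only candidate poses for which is_valid returns True need to be evaluated for the dependent part. This corresponds to searching a restricted portion of parameter space rather than the entire image.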