Object recognition techniques are disclosed that provide both accuracy and
speed. One embodiment of the present invention is an identification
system. The system is capable of locating objects in images by searching
for local features of an object. The system can operate in real time. The
system is trained from a set of images of an object or objects. The
system computes interest points in the training images, and then extracts
local image features (tokens) around these interest points. The set of
tokens from the training images is then used to build a hierarchical
model structure. During identification/detection, the system computes
interest points from incoming target images. The system matches tokens
around these interest points with the tokens in the hierarchical model.
Each successfully matched image token votes for an object hypothesis at a
certain scale, location, and orientation in the target image. Object
hypotheses that receive insufficient votes are rejected.
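
The voting step can be illustrated with a minimal sketch. It assumes each matched token has already been converted into a predicted object pose (object identifier, location, scale, orientation). The names pose_bin and accept_hypotheses, the bin sizes, and the vote threshold are hypothetical choices made for illustration; they are not taken from the disclosure.

```python
import math
from collections import Counter

def pose_bin(x, y, scale, angle_deg, xy_step=16.0, angle_step=30.0):
    """Quantize a pose into a coarse bin (step sizes are illustrative assumptions)."""
    return (
        int(x // xy_step),
        int(y // xy_step),
        round(math.log2(scale)),                 # nearest power-of-two scale level
        int((angle_deg % 360.0) // angle_step),
    )

def accept_hypotheses(matches, min_votes=3):
    """Let each match cast one vote; keep hypotheses with at least min_votes votes."""
    votes = Counter()
    for object_id, x, y, scale, angle_deg in matches:
        votes[(object_id, pose_bin(x, y, scale, angle_deg))] += 1
    # Hypotheses with too few votes are rejected.
    return {hyp: n for hyp, n in votes.items() if n >= min_votes}

# Hypothetical matched tokens: (object_id, x, y, scale, angle_deg)
matches = [
    ("mug", 100.0, 52.0, 1.0, 10.0),
    ("mug", 104.0, 50.0, 1.1, 12.0),
    ("mug", 98.0, 55.0, 0.9, 8.0),
    ("book", 300.0, 200.0, 2.0, 90.0),
]
accepted = accept_hypotheses(matches)
```

In this sketch the three "mug" matches fall into the same pose bin and survive the threshold, while the single "book" vote is rejected. Coarse bins tolerate small pose errors between tokens, at the cost of splitting votes when a true pose lies near a bin boundary.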