A system and method for acquiring, processing, and comparing an image with
a stored image to determine if a match exists. In particular, the system
refines the image data associated with an object based on pre-stored
color values, such as flesh tone color. The system includes a storage
element for storing flesh tone colors of a plurality of people, and a
defining stage for localizing a region of interest in the image. A
combination stage combines the unrefined region of interest with one or
more pre-stored flesh tone colors to refine the region of interest based
on color. This flesh tone color matching ensures that at least a portion
of the image corresponding to the unrefined region of interest and having
flesh tone color is incorporated into the refined region of interest.
Hence, the system can rapidly localize the head based on the flesh tone
color of the facial skin. According to one practice, the
refined region of interest is smaller than or about equal to the
unrefined region of interest. This method and apparatus are particularly
well suited to consumer devices such as hand-held devices and cars.
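
The combination stage described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the RGB representation, the rectangular region of interest, the Euclidean color distance, the tolerance value, and the names is_flesh_tone and refine_roi are all assumptions made for illustration. The sketch keeps only those pixels of the unrefined region that lie close to a stored flesh tone color and returns their bounding box, so the refined region is never larger than the unrefined one.

```python
from typing import List, Optional, Sequence, Tuple

Color = Tuple[int, int, int]      # (R, G, B), each 0-255 (assumed representation)
Box = Tuple[int, int, int, int]   # (top, left, bottom, right); bottom/right exclusive


def is_flesh_tone(pixel: Color, stored_tones: Sequence[Color], tolerance: int) -> bool:
    """Return True if the pixel is within `tolerance` (Euclidean RGB
    distance, an assumed metric) of at least one stored flesh tone."""
    limit = tolerance * tolerance
    return any(
        sum((p - t) ** 2 for p, t in zip(pixel, tone)) <= limit
        for tone in stored_tones
    )


def refine_roi(
    image: List[List[Color]],
    roi: Box,
    stored_tones: Sequence[Color],
    tolerance: int = 40,  # hypothetical default, not taken from the source
) -> Optional[Box]:
    """Shrink the unrefined ROI to the bounding box of its flesh tone pixels.

    Returns None when no pixel inside the ROI matches a stored tone.
    """
    top, left, bottom, right = roi
    matched_rows: List[int] = []
    matched_cols: List[int] = []
    for y in range(top, bottom):
        for x in range(left, right):
            if is_flesh_tone(image[y][x], stored_tones, tolerance):
                matched_rows.append(y)
                matched_cols.append(x)
    if not matched_rows:
        return None
    # The box is built only from pixels inside the ROI, so it cannot exceed it.
    return (min(matched_rows), min(matched_cols),
            max(matched_rows) + 1, max(matched_cols) + 1)
```

Calling refine_roi with the unrefined region from the defining stage and the stored colors yields a refined region that contains every matching pixel of the unrefined region. A practical system would likely compare colors in a space less sensitive to illumination than raw RGB; that choice is left open here.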