A system and method indexes an image database by partitioning an image thereof
into a plurality of cells, combining the cells into intervals and then spots according
to perceptual criteria, and generating a set of spot descriptors that characterize
the perceptual features of the spots, such as their shape, color and relative position
within the image. The shape descriptor is preferably derived from the coefficients of a
Discrete Fourier Transform (DFT) of the perimeter trace of the spot. The set of
spot descriptors forms an index entry for the spot. This process is repeated for
the various images of the database. To search the index, a key comprising a set
of spot descriptors for a query image is generated and compared according to a
perceptual similarity metric to the entries of the index. The metric determines
the degree to which the features of the query image perceptually match those of the
indexed image. The search results are presented as a scored list of the indexed
images. A wide variety of image types can be indexed and searched, including: bi-tonal,
gray-scale, color, "real scene" originated, and artificially generated images.
Continuous-tone "real scene" images such as digitized still pictures and video
frames are of primary interest. There are stand-alone and networked embodiments.
A hybrid embodiment generates keys locally and performs image and index storage
and perceptual comparison on a network or web server.
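
By way of illustration only, the following minimal sketch shows how a shape descriptor might be derived from the DFT of a perimeter trace and how a key might be compared against index entries to produce a scored list. It is not the claimed implementation. The function names (`shape_descriptor`, `similarity`, `search`), the counterclockwise trace convention, the normalization by the first coefficient, the use of one shape descriptor per image, and the distance-based score are all assumptions made for this example; the color and relative-position descriptors and the perceptual similarity metric itself are not modeled.

```python
import cmath
import math


def shape_descriptor(perimeter, n_coeffs=4):
    """Normalized DFT magnitudes of a closed perimeter trace.

    perimeter: list of (x, y) points ordered counterclockwise (assumed).
    Skipping coefficient 0 discards position, dividing by |c1| discards
    scale, and keeping only magnitudes discards rotation and the choice
    of starting point.
    """
    z = [complex(x, y) for x, y in perimeter]
    n = len(z)

    def coeff(k):
        # k-th DFT coefficient of the complex-valued trace.
        return sum(z[t] * cmath.exp(-2j * math.pi * k * t / n)
                   for t in range(n)) / n

    c1 = abs(coeff(1))
    if c1 == 0:
        raise ValueError("degenerate trace: first coefficient is zero")
    return [abs(coeff(k)) / c1 for k in range(2, 2 + n_coeffs)]


def similarity(a, b):
    """Hypothetical stand-in score in (0, 1]; 1.0 means identical."""
    d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + d)


def search(index, key):
    """Return (image_id, score) pairs, best match first."""
    scored = [(image_id, similarity(desc, key))
              for image_id, desc in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Two indexed "spots": a square and a 2:1 rectangle, eight points each.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
rect = [(0, 0), (2, 0), (4, 0), (4, 1), (4, 2), (2, 2), (0, 2), (0, 1)]
index = {"square": shape_descriptor(square), "rect": shape_descriptor(rect)}

# Query: the same square, scaled by 3 and translated.
query = [(3 * x + 10, 3 * y - 5) for x, y in square]
results = search(index, shape_descriptor(query))
```

In this sketch the scaled and translated square yields the same descriptor as the indexed square, so it ranks first in the scored list, ahead of the rectangle.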