A method includes deriving a plurality of semantic categories for representing
important semantic cues in images, where each semantic category is modeled through
a combination of perceptual features that define the semantics of that category
and that discriminate that category from other categories; for each semantic category,
forming a set of the perceptual features comprising required features and frequently
occurring features; comparing an image to said semantic categories; and classifying
said image as belonging to one of said semantic categories if all of the required
features and at least one of the frequently occurring features for that semantic
category are present in said image. A database contains image information, where
the image information includes at least one of already classified images, network
locations of already classified images and documents containing already classified
images. The database is searched for images matching an input query, comprising,
e.g., an image, text, or both.