A system, method and computer program product for identifying spam in an
image using a grey scale representation of the image, including identifying
a plurality of contours in the image, the contours corresponding to
probable symbols (letters, numbers, punctuation signs, etc.); ignoring
contours that are too small or too large given the specified limits;
identifying text lines in the image, based on the remaining contours;
parsing the text lines into words; ignoring words in the identified text
lines that are too short or too long; ignoring text lines that are
too short; verifying that the image contains text by comparing a number
of pixels of a symbol color within remaining contours to a total number
of pixels of the symbol color in the image; and, if the image contains
text, rendering a spam/no-spam verdict based on comparing a signature of
the remaining text against a spam template.
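
The contour filtering and text verification steps can be illustrated with a minimal sketch. It assumes that each contour is represented by its bounding box, that the image has already been reduced to a binary grid in which 1 marks a pixel of the symbol color, and that the size limits and the ratio threshold are supplied by the caller. The names `filter_contours`, `symbol_pixel_ratio` and `contains_text`, as well as all numeric values, are hypothetical and are not taken from the source.

```python
from typing import List, Tuple

# Assumption: a contour is represented by its bounding box (x, y, width, height).
Box = Tuple[int, int, int, int]


def filter_contours(boxes: List[Box], min_size: int, max_size: int) -> List[Box]:
    """Keep boxes whose width and height both lie within [min_size, max_size]."""
    return [
        b for b in boxes
        if min_size <= b[2] <= max_size and min_size <= b[3] <= max_size
    ]


def symbol_pixel_ratio(image: List[List[int]], boxes: List[Box]) -> float:
    """Fraction of symbol-color pixels (value 1) that fall inside the boxes."""
    total = sum(sum(row) for row in image)
    if total == 0:
        return 0.0  # no symbol-color pixels at all
    covered = set()  # a set, so overlapping boxes do not count a pixel twice
    for x, y, w, h in boxes:
        for r in range(y, min(y + h, len(image))):
            for c in range(x, min(x + w, len(image[r]))):
                if image[r][c] == 1:
                    covered.add((r, c))
    return len(covered) / total


def contains_text(image: List[List[int]], boxes: List[Box],
                  min_size: int = 2, max_size: int = 4,
                  ratio_threshold: float = 0.5) -> bool:
    """Hypothetical decision rule: enough symbol pixels lie in the kept boxes."""
    kept = filter_contours(boxes, min_size, max_size)
    return symbol_pixel_ratio(image, kept) >= ratio_threshold


# Toy input: two 2x2 "symbols" and one isolated noise pixel.
IMAGE = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
BOXES = [(1, 1, 2, 2), (4, 1, 2, 2), (7, 3, 1, 1)]

if __name__ == "__main__":
    print(filter_contours(BOXES, 2, 4))
    print(contains_text(IMAGE, BOXES))
```

In this toy input the 1x1 box is discarded as too small, and eight of the nine symbol-color pixels lie inside the two remaining boxes, so the image is treated as containing text under the assumed threshold. The line and word parsing steps and the signature comparison are not covered by this sketch.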