A system, method and computer program product for identifying spam in an
image, including (a) identifying a plurality of contours in the image,
the contours corresponding to probable symbols; (b) ignoring contours
that are too small or too large; (c) identifying text lines in the image,
based on the remaining contours; (d) parsing the text lines into words;
(e) ignoring words in the identified text lines that are too short or
too long; (f) ignoring text lines that are too short; (g) verifying
that the image contains text by comparing a number of pixels of a symbol
color within remaining contours to a total number of pixels of the symbol
color in the image, and that there is at least one text line after
filtration; and (h) if the image contains text, rendering a spam/no spam
verdict based on a contour representation of the text that remains
after step (f).
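
The filtration in steps (b) through (g) can be illustrated with a minimal
sketch. It assumes the contours of step (a) have already been extracted and
reduced to bounding boxes with a count of symbol-colored pixels. The Contour
type, every threshold value, the vertical-overlap rule for forming lines, and
the gap rule for splitting words are hypothetical choices made for
illustration only and are not taken from the source. Step (h), the verdict
itself, is omitted because its classifier is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contour:
    """Hypothetical bounding-box summary of one contour from step (a)."""
    x: int    # left edge
    y: int    # top edge
    w: int    # width
    h: int    # height
    ink: int  # symbol-colored pixels inside the contour


# Illustrative thresholds (assumptions, not values from the source).
MIN_SIDE, MAX_SIDE = 4, 40   # step (b): acceptable contour size
WORD_GAP = 6                 # step (d): horizontal gap that separates words
MIN_WORD, MAX_WORD = 2, 15   # step (e): acceptable symbols per word
MIN_LINE_SYMBOLS = 5         # step (f): minimum symbols per text line
MIN_INK_RATIO = 0.5          # step (g): share of ink that must lie in text


def filter_by_size(contours):
    """Step (b): drop contours that are too small or too large."""
    return [c for c in contours
            if MIN_SIDE <= c.w <= MAX_SIDE and MIN_SIDE <= c.h <= MAX_SIDE]


def group_into_lines(contours):
    """Step (c): put contours whose vertical extents overlap on one line."""
    lines = []
    for c in sorted(contours, key=lambda c: (c.y, c.x)):
        for line in lines:
            ref = line[0]  # first contour placed on the line is its reference
            if c.y < ref.y + ref.h and ref.y < c.y + c.h:
                line.append(c)
                break
        else:
            lines.append([c])
    return [sorted(line, key=lambda c: c.x) for line in lines]


def split_into_words(line):
    """Step (d): start a new word wherever the horizontal gap is wide."""
    words = [[line[0]]]
    for prev, cur in zip(line, line[1:]):
        if cur.x - (prev.x + prev.w) > WORD_GAP:
            words.append([cur])
        else:
            words[-1].append(cur)
    return words


def extract_text_lines(contours):
    """Steps (b)-(f): return surviving lines, each a list of words."""
    result = []
    for line in group_into_lines(filter_by_size(contours)):
        # Step (e): ignore words that are too short or too long.
        words = [w for w in split_into_words(line)
                 if MIN_WORD <= len(w) <= MAX_WORD]
        # Step (f): ignore lines with too few remaining symbols.
        if sum(len(w) for w in words) >= MIN_LINE_SYMBOLS:
            result.append(words)
    return result


def contains_text(lines, total_ink):
    """Step (g): at least one line survives and enough ink lies in it."""
    kept_ink = sum(c.ink for line in lines for word in line for c in word)
    return bool(lines) and total_ink > 0 and kept_ink / total_ink >= MIN_INK_RATIO
```

In use, extract_text_lines would be applied to all contours found in the
image, and contains_text would then decide whether the image proceeds to the
spam/no spam verdict of step (h).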