Translation of text or messages provides a message that can be analyzed
more reliably or efficiently for purposes such as detecting spam in
email messages. One translation process takes into account statistics of
erroneous and intentional misspellings. Another process identifies and
removes characters or character codes that do not generate visible
symbols in a message displayed to a user. Another process detects symbols
such as periods, commas, and dashes that are interspersed in text in such
a way that they do not unduly interfere with, or prevent, a user's
perception of a spam message. Another process can detect use of
foreign-language
symbols and terms. Still other processes and techniques are presented to
counter obfuscating spammer tactics and to provide for efficient and
accurate analysis of message content. Groups of similar content items
(e.g., words, phrases, images, ASCII text) are correlated, and analysis
can proceed after items in a group are substituted with other items in
the same group, so that "sameness" of content can be detected more
accurately. Dictionaries are used for spam or ham words or
phrases. Other features are described.
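
Three of the processes summarized above (removing characters that generate no visible symbol, collapsing interspersed punctuation, and substituting group members before a dictionary lookup) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the described implementation: the function names, the substitution groups, and the spam dictionary are all hypothetical, and the choice of Unicode categories Cf and Cc as "invisible" is a simplification.

```python
import re
import unicodedata

# Hypothetical substitution groups: each variant maps to one canonical item.
SUBSTITUTION_GROUPS = {
    "viagra": {"viagra", "v1agra", "vlagra"},
    "free": {"free", "fr33"},
}
CANONICAL = {
    variant: canonical
    for canonical, variants in SUBSTITUTION_GROUPS.items()
    for variant in variants
}

# Hypothetical spam dictionary, keyed on canonical forms.
SPAM_WORDS = {"viagra", "free"}

# Single word characters separated by punctuation, e.g. "V.I.A.G.R.A".
INTERSPERSED = re.compile(r"\b\w(?:[.,\-*]\w)+\b")


def strip_invisible(text):
    """Drop format/control characters (assumed invisible), keeping whitespace."""
    return "".join(
        ch for ch in text
        if ch in " \t\n" or unicodedata.category(ch) not in ("Cf", "Cc")
    )


def collapse_interspersed(text):
    """Remove punctuation placed between single characters of a word."""
    return INTERSPERSED.sub(
        lambda m: re.sub(r"[.,\-*]", "", m.group(0)), text
    )


def normalize(text):
    """Translate a message into canonical lowercase tokens."""
    text = collapse_interspersed(strip_invisible(text))
    tokens = re.findall(r"\w+", text.lower())
    return [CANONICAL.get(token, token) for token in tokens]


def spam_hits(text):
    """Return the canonical tokens found in the spam dictionary."""
    return [token for token in normalize(text) if token in SPAM_WORDS]
```

For example, `spam_hits("Buy V.1.A.G.R.A now, fr\u200b33!")` would report both obfuscated words, because the zero-width space and the periods are removed before the group substitution is applied. The interspersed-punctuation pattern is deliberately conservative: it only collapses runs of single characters, so ordinary hyphenated words such as "well-known" are left alone.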