A cascaded processing approach is used to clean noisy electronic mail or
other text messaging data. Non-text filtering is first performed on the
noisy data to remove non-text items. Text normalization is then
performed on the filtered data to produce cleaned data. The
cleaned data can be used in one or more of a wide variety of other
applications or processing systems.
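
The two-stage cascade described above can be illustrated with a minimal sketch. The abstract does not specify which items count as non-text or which normalization rules apply, so everything concrete here is an assumption made for illustration: the header pattern, the letter-ratio threshold, and the function names `is_non_text`, `filter_non_text`, `normalize_text`, and `clean` are all hypothetical.

```python
import re

# Hypothetical header pattern; the source does not define "non-text items".
HEADER_RE = re.compile(r"^(from|to|cc|subject|date|sent):", re.IGNORECASE)


def is_non_text(line: str) -> bool:
    """Heuristic guess at a non-text line (assumed rules, not from the source)."""
    stripped = line.strip()
    if not stripped:
        return False  # keep blank lines; they separate paragraphs
    if HEADER_RE.match(stripped):
        return True
    letters = sum(ch.isalpha() for ch in stripped)
    # Lines that are mostly symbols (rules, ASCII art) are treated as non-text.
    return letters / len(stripped) < 0.5


def filter_non_text(raw: str) -> str:
    """Stage 1: drop every line flagged as non-text."""
    return "\n".join(l for l in raw.splitlines() if not is_non_text(l))


def normalize_text(text: str) -> str:
    """Stage 2: simple normalization (assumed rules)."""
    # Collapse repeated sentence punctuation; note this also shortens "..." to ".".
    text = re.sub(r"([!?.])\1+", r"\1", text)
    text = re.sub(r"[ \t]+", " ", text)      # collapse runs of spaces and tabs
    text = re.sub(r"\n{3,}", "\n\n", text)   # allow at most one blank line
    return "\n".join(line.strip() for line in text.splitlines()).strip()


def clean(raw: str) -> str:
    """Cascade: non-text filtering first, then text normalization."""
    return normalize_text(filter_non_text(raw))


if __name__ == "__main__":
    noisy = (
        "From: a@example.com\n"
        "Subject: hi\n"
        "\n"
        "hello   there!!!\n"
        "-----------\n"
        "see you   soon.."
    )
    print(clean(noisy))
```

Running the filter before normalization means the normalization rules only ever see lines already judged to be text, which is the point of the cascade: each stage can stay simple because the previous one has narrowed its input.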