The subject invention provides for a feedback loop system and method that
facilitate classifying items in connection with spam prevention in server
and/or client-based architectures. The invention makes use of a
machine-learning approach as applied to spam filters, and in particular,
randomly samples incoming email messages so that examples of both
legitimate and junk/spam mail are obtained to generate sets of training
data. Users who are identified as spam-fighters are asked to vote on
whether each message in a selection of their incoming email is
legitimate mail or junk mail. A database stores the properties for
each message and voting transaction, such as user information, message
properties and content summary, and polling results for each message to
generate training data for machine learning systems. The machine learning
systems facilitate creating improved spam filter(s) that are trained to
recognize both legitimate mail and spam mail and to distinguish between
them.
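
The feedback loop described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `PollRecord`, `sample_for_polling`, `record_vote`, and `build_training_data`, the dictionary keys on each message, and the fixed sampling rate are all assumptions introduced for illustration. The sketch shows only the three steps named in the text: random sampling of incoming mail regardless of content, storing each vote with user and message information, and converting stored votes into labeled training examples.

```python
import random
from dataclasses import dataclass


@dataclass
class PollRecord:
    """One voting transaction, as it might be stored in the database (hypothetical schema)."""
    user_id: str
    message_id: str
    content_summary: str
    vote: str  # "legitimate" or "spam"


def sample_for_polling(messages, rate, rng):
    """Select messages at random, without inspecting content, so that both
    legitimate and spam examples can appear in the polled set."""
    return [m for m in messages if rng.random() < rate]


def record_vote(message, vote):
    """Build a stored record from a polled message and the user's vote."""
    if vote not in ("legitimate", "spam"):
        raise ValueError("vote must be 'legitimate' or 'spam'")
    return PollRecord(
        user_id=message["user_id"],
        message_id=message["message_id"],
        content_summary=message["summary"],
        vote=vote,
    )


def build_training_data(records):
    """Turn stored votes into (features, label) pairs; label 1 means spam."""
    return [(r.content_summary, 1 if r.vote == "spam" else 0) for r in records]


# Example usage with made-up messages (assumed data, for illustration only).
inbox = [
    {"user_id": "u1", "message_id": "m1", "summary": "meeting at noon"},
    {"user_id": "u1", "message_id": "m2", "summary": "win a free prize"},
]
polled = sample_for_polling(inbox, rate=0.5, rng=random.Random(0))
records = [record_vote(m, "legitimate") for m in polled]
training_data = build_training_data(records)
```

The resulting pairs would then be passed to whatever machine learning system trains the filter; the choice of learner is left open here, as it is in the text.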