Embodiments of the invention provide methods and apparatuses for
classifying electronic documents (e.g., electronic communications) as
either spam electronic documents or legitimate electronic documents. In
accordance with one embodiment of the invention, each of a plurality of
electronic communications is reduced to a corresponding multi-dimensional
vector based on a multi-dimensional vector space. The multi-dimensional
vectors represent corresponding electronic documents that have been
classified as at least one type of electronic document. Each subsequent
electronic document to be classified is reduced to a corresponding
multi-dimensional vector, which is inserted into the multi-dimensional
vector space. The electronic document corresponding to an inserted
multi-dimensional vector is classified based upon the proximity of the
inserted multi-dimensional vector to at least one previously classified
multi-dimensional vector of the multi-dimensional vector space.
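
By way of illustration only, the proximity-based classification described
above might be sketched as follows. The text does not specify how a document
is reduced to a vector or how proximity is measured, so everything here is an
assumption made for the example: the fixed vocabulary, the term-count
reduction, the Euclidean distance, the majority vote over the k nearest
vectors, and the names VOCABULARY, to_vector, distance, classify, and
TRAINING are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical vocabulary defining the axes of the vector space. This is an
# assumption; the text does not say how the dimensions are chosen.
VOCABULARY = ["free", "winner", "money", "meeting", "report", "schedule"]


def to_vector(text):
    """Reduce a document to a term-count vector over VOCABULARY (assumed scheme)."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in VOCABULARY]


def distance(a, b):
    """Euclidean distance, used here as one possible proximity measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(text, labeled_vectors, k=1):
    """Label a new document by majority vote of its k nearest labeled vectors."""
    vector = to_vector(text)
    nearest = sorted(labeled_vectors, key=lambda item: distance(vector, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Previously classified documents populate the vector space.
TRAINING = [
    (to_vector("free money winner free"), "spam"),
    (to_vector("winner winner claim free money"), "spam"),
    (to_vector("meeting schedule for the report"), "legitimate"),
    (to_vector("report due before the meeting"), "legitimate"),
]

if __name__ == "__main__":
    print(classify("you are a winner of free money", TRAINING))
    print(classify("schedule the meeting", TRAINING))
```

In this sketch a newly classified document could also be appended to TRAINING
so that later documents are compared against it, which corresponds to
inserting its vector into the vector space.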