In the context of applications such as finding messages dealing with a
particular topic, or finding inter-conversation topic groupings via
centroid-based clustering methods, the essential text of a first message
is adjusted to avoid vector distance distortions caused by differences in
quoting styles. Text is deleted from the first message if that text
constitutes an entire prefixed or suffixed second message (typically a
parent message), while selective quotes in the first message are included
in the adjusted message because these are considered to form a logical
part of the message. When the first message does not quote any
portions of the second message, an analysis is done to determine whether all
or part of the second message is logically referenced by the first
message. If so, all or some parts of the essential text of the second
(parent) message are included in the adjusted message.
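
The adjustment described above can be illustrated with a minimal Python
sketch. It is not the claimed method. It assumes quoted lines are marked
with a leading ">", that a fully quoted parent appears as one contiguous
block at the start or end of the message, and that a "logical reference"
can be approximated by a short reply opening with a cue word. The names
adjust_message and looks_like_reference, and the cue list, are hypothetical.

```python
import re

# Assumed quoting convention: quoted lines start with one or more ">".
QUOTE = re.compile(r"^\s*>+\s?")

def _is_quoted(line):
    return bool(QUOTE.match(line))

def _unquote(line):
    return QUOTE.sub("", line)

def _norm(lines):
    """Collapse whitespace and case, dropping blank lines."""
    return [" ".join(l.split()).lower() for l in lines if l.strip()]

def looks_like_reference(message, parent):
    """Hypothetical stand-in for the logical-reference analysis:
    a short reply whose first word is a cue such as 'yes' or 'that'."""
    cues = {"yes", "no", "agreed", "that", "this", "ditto"}
    words = message.lower().split()
    return bool(words) and len(words) < 30 and words[0].strip(",.!?") in cues

def adjust_message(message, parent, is_reference=looks_like_reference):
    """Return the adjusted text of `message` given its parent message."""
    lines = message.splitlines()
    parent_norm = _norm(parent.splitlines())

    # No quoting at all: include the parent only if the reply refers to it.
    if not any(_is_quoted(l) for l in lines):
        if is_reference(message, parent):
            return parent.strip() + "\n" + message.strip()
        return message.strip()

    # Find the leading and trailing runs of quoted (or blank) lines.
    start = 0
    while start < len(lines) and (_is_quoted(lines[start]) or not lines[start].strip()):
        start += 1
    end = len(lines)
    while end > start and (_is_quoted(lines[end - 1]) or not lines[end - 1].strip()):
        end -= 1

    kept = []
    blocks = ((lines[:start], True), (lines[start:end], False), (lines[end:], True))
    for block, removable in blocks:
        unquoted = [_unquote(l) for l in block]
        # Delete a prefixed or suffixed block only if it is the whole parent.
        if removable and parent_norm and _norm(unquoted) == parent_norm:
            continue
        # Selective quotes stay, with the quote markers stripped.
        kept.extend(unquoted)
    return "\n".join(l for l in kept if l.strip())
```

In this sketch a reply that appends the entire parent as a quoted block
loses that block, a reply that quotes a single parent line keeps the line,
and an unquoted reply beginning with "Yes" has the parent text prepended.
A production system would presumably use a more robust quote detector and
reference analysis than these placeholders.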