A method and a system for automatically segmenting text based on header tokens
are described. A relevance value and an irrelevance value are determined
for each token in a description, assuming that no tokens are excluded from
the computations. The irrelevance value is based on occurrences of a token in
a sample set of descriptions. The relevance value is an estimated
probability of relevance based on the header of the description being
segmented.
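
By way of illustration only, the two per-token values might be computed as in the following Python sketch. The abstract does not specify the estimators, so everything here is an assumption: the sample set is taken to be a list of (header, body) pairs, the irrelevance value is taken to be the fraction of sample bodies that contain the token, and the relevance value is taken to be the same fraction restricted to samples whose header shares at least one token with the header of the description being segmented. The function names `tokenize`, `irrelevance_value`, `relevance_value`, and `score_tokens` are hypothetical.

```python
def tokenize(text):
    # Assumption: tokens are lowercase, whitespace-separated words.
    return text.lower().split()


def irrelevance_value(token, samples):
    # Assumed estimator: fraction of all sample bodies containing the token.
    if not samples:
        return 0.0
    hits = sum(1 for _, body in samples if token in tokenize(body))
    return hits / len(samples)


def relevance_value(token, header, samples):
    # Assumed estimator: fraction of sample bodies containing the token,
    # counted only over samples whose header shares a token with `header`.
    header_tokens = set(tokenize(header))
    matching = [body for h, body in samples
                if header_tokens & set(tokenize(h))]
    if not matching:
        return 0.0
    hits = sum(1 for body in matching if token in tokenize(body))
    return hits / len(matching)


def score_tokens(header, description, samples):
    # Every token in the description is scored; none is left out.
    return {
        token: (relevance_value(token, header, samples),
                irrelevance_value(token, samples))
        for token in tokenize(description)
    }
```

Under these assumptions, a token that appears in most sample descriptions regardless of header receives a high irrelevance value, while a token that appears mainly under similar headers receives a relevance value higher than its irrelevance value. How the two values are then combined to segment the description is not stated in the abstract and is not modeled here.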