A method and a system for automatically segmenting text based on header tokens are described. For each token in a description, a relevance value and an irrelevance value are determined; no token is excluded from the computation. The irrelevance value is based on occurrences of the token in a sample set of descriptions. The relevance value is an estimated probability of relevance based on the header of the description being segmented.
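The abstract does not give the estimators themselves, so the following Python sketch is only one possible reading of it. It assumes that the irrelevance value is the fraction of sample descriptions containing the token, and that the relevance value is 1.0 for a token that appears in the header and a small default otherwise. The function names, the `default` parameter, and the rule "relevant if relevance exceeds irrelevance" are all hypothetical choices made for illustration, not the patented method.

```python
from itertools import groupby


def tokenize(text):
    # Naive whitespace tokenizer; the abstract does not specify one.
    return text.lower().split()


def irrelevance(token, samples):
    # Assumed estimator: share of sample descriptions that contain the token.
    if not samples:
        return 0.0
    hits = sum(1 for s in samples if token in set(tokenize(s)))
    return hits / len(samples)


def relevance(token, header, default=0.1):
    # Assumed estimator: 1.0 if the token occurs in the header, else a
    # small hypothetical default probability.
    return 1.0 if token in set(tokenize(header)) else default


def segment(header, description, samples):
    # Score every token (none are skipped), label it relevant when its
    # relevance exceeds its irrelevance, then group consecutive tokens
    # that share a label into segments.
    labeled = []
    for tok in tokenize(description):
        is_relevant = relevance(tok, header) > irrelevance(tok, samples)
        labeled.append((tok, is_relevant))
    return [
        (is_rel, [tok for tok, _ in group])
        for is_rel, group in groupby(labeled, key=lambda pair: pair[1])
    ]


header = "canon camera"
description = "new canon camera free shipping"
samples = [
    "free shipping on all orders",
    "new item free shipping",
    "free shipping today",
]
segments = segment(header, description, samples)
```

With these made-up inputs, the tokens found in the header form one segment, and the tokens that are frequent across the sample set fall into the surrounding segments.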

 