A method for repairing a wrapper associated with an information source, includes defining a classifier, based on content features of extracted and labeled information using the wrapper, using the classifier to extract content information from the file according to a set of classifier extraction rules; analyzing the extracted content information according to the content features and assigning a label to any extracted content information which satisfies the label's rules; and defining a repaired wrapper as the classifier and those labels in the set which have been assigned to extracted content information. Additional content information and labels can be extracted by iteratively creating a classifier based on both content features and structure features of extracted strings.

 
Web www.patentalert.com

> Presence information based on media activity

~ 00303