A method for repairing a wrapper associated with an information source includes:
defining a classifier based on content features of information extracted and labeled
using the wrapper; using the classifier to extract content information from the
file according to a set of classifier extraction rules; analyzing the extracted
content information according to the content features and assigning a label to
any extracted content information which satisfies the label's rules; and defining
a repaired wrapper as the classifier and those labels in the set which have been
assigned to extracted content information. Additional content information and labels
can be extracted by iteratively creating a classifier based on both content features
and structure features of extracted strings.
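
The content-feature steps above can be illustrated with a minimal sketch. It is not the claimed implementation: the nearest-centroid classifier, the five features, the distance threshold, and every name in it (content_features, train_classifier, classify, repair_wrapper) are assumptions chosen for illustration. The iterative pass that adds structure features is not shown.

```python
import math
import re
from collections import defaultdict


def content_features(s):
    """Hypothetical content features: character-class ratios, length, '@' flag."""
    n = max(len(s), 1)
    return (
        sum(c.isdigit() for c in s) / n,
        sum(c.isalpha() for c in s) / n,
        sum(c.isupper() for c in s) / n,
        min(len(s), 50) / 50,
        1.0 if "@" in s else 0.0,
    )


def train_classifier(labeled):
    """Build one feature centroid per label from (string, label) pairs
    that the original wrapper extracted and labeled."""
    sums = defaultdict(lambda: [0.0] * 5)
    counts = defaultdict(int)
    for text, label in labeled:
        for i, value in enumerate(content_features(text)):
            sums[label][i] += value
        counts[label] += 1
    return {lab: tuple(v / counts[lab] for v in vec) for lab, vec in sums.items()}


def classify(centroids, s, max_dist=0.35):
    """Return the nearest label, or None if no centroid is within max_dist
    (an assumed stand-in for 'satisfies the label's rules')."""
    feats = content_features(s)
    best_label, best = None, max_dist
    for label, centroid in centroids.items():
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(feats, centroid)))
        if dist < best:
            best_label, best = label, dist
    return best_label


def repair_wrapper(labeled, page):
    """Extract strings from a changed page, label them by content, and return
    the repaired wrapper as (classifier, set of labels actually assigned)."""
    centroids = train_classifier(labeled)
    # Assumed extraction rule: every non-empty text run between markup tags.
    strings = [t.strip() for t in re.split(r"<[^>]+>", page) if t.strip()]
    extracted = []
    for s in strings:
        label = classify(centroids, s)
        if label is not None:
            extracted.append((s, label))
    assigned = {label for _, label in extracted}
    return (centroids, assigned), extracted


# Usage: output of the old wrapper, then a page whose layout has changed.
old_output = [("1998", "year"), ("2001", "year"),
              ("Smith", "author"), ("Jones", "author")]
new_page = "<li><i>Brown</i> <b>2003</b> <u>a@b.c</u></li>"
(classifier, labels), extracted = repair_wrapper(old_output, new_page)
```

In this sketch a string that matches no label closely enough is left unlabeled, so the repaired wrapper keeps only the labels that were actually assigned.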