A method of information extraction from a Web page using a broken wrapper
includes: using the wrapper to extract strings from the Web page parsed in
a forward direction; analyzing the extracted strings according to a set of
rules for assigning labels associated with the wrapper; assigning labels
to those strings which satisfy the label rules; classifying the extracted
strings based on content features of the labeled extracted strings; and
validating those labeled extracted strings which satisfy the label rules
within some threshold value.
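
The steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation. Everything in it is an assumption made for illustration: the left/right-delimiter wrapper, the regular-expression label rules in LABEL_RULES, the two character-share content features, the nearest-centroid classifier, and the reading of "threshold value" as the minimum fraction of a label's rules that a string must satisfy.

```python
import re
from collections import defaultdict

# Hypothetical label rules: each label maps to regex checks on the string.
LABEL_RULES = {
    "price": [r"^\$", r"\d", r"\.\d{2}$"],
    "title": [r"^[A-Z]", r"[a-z]{3,}", r"^[^$]*$"],
}

def extract(page, left, right):
    """Scan the page front to back, returning text between the delimiters."""
    pattern = re.escape(left) + r"(.*?)" + re.escape(right)
    return [m.group(1).strip() for m in re.finditer(pattern, page, re.S)]

def rule_score(text, rules):
    """Fraction of a label's rules that the string satisfies."""
    return sum(bool(re.search(rule, text)) for rule in rules) / len(rules)

def assign_label(text):
    """Return (label, score) for the best-scoring label, or (None, 0.0)."""
    scored = [(label, rule_score(text, rules)) for label, rules in LABEL_RULES.items()]
    label, score = max(scored, key=lambda pair: pair[1])
    return (label, score) if score > 0 else (None, 0.0)

def features(text):
    """Content features: share of alphabetic and of digit characters."""
    n = max(len(text), 1)
    return (sum(c.isalpha() for c in text) / n, sum(c.isdigit() for c in text) / n)

def classify(strings, labeled):
    """Assign each string to the label whose mean feature vector is nearest."""
    groups = defaultdict(list)
    for text, label in labeled:
        groups[label].append(features(text))
    centroids = {lbl: tuple(sum(col) / len(col) for col in zip(*vecs))
                 for lbl, vecs in groups.items()}

    def nearest(text):
        f = features(text)
        return min(centroids,
                   key=lambda lbl: sum((a - b) ** 2 for a, b in zip(f, centroids[lbl])))

    return {text: nearest(text) for text in strings}

def run(page, left, right, threshold=0.8):
    """Extract, label, classify, and validate; threshold is an assumed cutoff."""
    strings = extract(page, left, right)
    scored = [(s, *assign_label(s)) for s in strings]
    labeled = [(s, lbl) for s, lbl, _ in scored if lbl is not None]
    classes = classify(strings, labeled) if labeled else {}
    validated = [(s, lbl) for s, lbl, score in scored
                 if lbl is not None and score >= threshold]
    return strings, classes, validated

# A wrapper that has become too permissive also picks up "12 items".
page = "<td>Blue Widget</td><td>$19.99</td><td>12 items</td><td>$5.00</td>"
strings, classes, validated = run(page, "<td>", "</td>")
print(validated)
# [('Blue Widget', 'title'), ('$19.99', 'price'), ('$5.00', 'price')]
```

In this sketch the stray string "12 items" satisfies only two of the three hypothetical "title" rules, so it is labeled and classified but falls below the assumed threshold of 0.8 and is not validated.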