A method for repairing a wrapper associated with an information source includes: defining a classifier based on content features of information extracted and labeled using the wrapper; using the classifier to extract content information from the file according to a set of classifier extraction rules; analyzing the extracted content information according to the content features and assigning a label to any extracted content information that satisfies the label's rules; and defining a repaired wrapper as the classifier and those labels in the set that have been assigned to extracted content information. Additional content information and labels can be extracted by iteratively creating a classifier based on both content features and structure features of extracted strings.
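
The steps in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the patented implementation: the function names (`content_features`, `define_classifier`, `extract_candidates`, `repair_wrapper`), the choice of a coarse character-class "shape" as the only content feature, and the rule that a label is satisfied when a string's shape matches one seen for that label. Structure features and the iterative refinement step are not shown.

```python
import re
from collections import defaultdict

def content_features(s):
    """Hypothetical content feature: collapse letter runs to 'a' and digit runs to '9'."""
    shape = re.sub(r"[A-Za-z]+", "a", s)
    return re.sub(r"\d+", "9", shape)

def define_classifier(labeled):
    """Build {label: set of shapes} from strings the wrapper extracted and labeled
    while it still worked."""
    return {label: {content_features(s) for s in strings}
            for label, strings in labeled.items()}

def extract_candidates(page):
    """Assumed extraction rule: every non-empty text run between markup tags."""
    return [t.strip() for t in re.split(r"<[^>]+>", page) if t.strip()]

def repair_wrapper(classifier, page):
    """Label each candidate whose features satisfy a label's rule; the repaired
    wrapper keeps the classifier plus only the labels that were actually assigned."""
    assigned = defaultdict(list)
    for s in extract_candidates(page):
        for label, shapes in classifier.items():
            if content_features(s) in shapes:
                assigned[label].append(s)
                break
    return {"classifier": classifier, "labels": dict(assigned)}

# Strings labeled by the old wrapper before the source changed its layout.
labeled = {"price": ["$12.99", "$5.00"], "phone": ["555-1234"], "email": ["a@b.com"]}
classifier = define_classifier(labeled)

# A page in the new layout, which the old wrapper can no longer parse.
page = "<td>$7.50</td><td>Call now</td><td>555-9876</td>"
repaired = repair_wrapper(classifier, page)
print(repaired["labels"])
```

In this toy run the `email` label is dropped from the repaired wrapper because no string on the new page satisfies its rule, which mirrors the abstract's "those labels in the set that have been assigned".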

 