An information extraction model is trained on format features identified within labeled training documents. To extract information from a document, the model assigns labels to units of the document based on the format features of those units. A begin label and an end label are identified, and the information between the begin label and the end label is extracted. The extracted information can then be used in document processing tasks such as ranking.
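The flow described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not something the abstract specifies: a unit is taken to be one line of text, the format features are font size and boldness, the "model" is a simple lookup of the most frequent label per feature combination (a stand-in for whatever learner is actually used), and extraction is treated as inclusive of the begin and end units. The names `format_features`, `train`, and `extract` are hypothetical.

```python
from collections import Counter, defaultdict


def format_features(unit):
    """Reduce a unit (assumed: one line) to a tuple of format features."""
    return (unit["font_size"] >= 14, unit["bold"])


def train(labeled_docs):
    """Map each feature tuple to the label it most often carries in training."""
    counts = defaultdict(Counter)
    for doc in labeled_docs:
        for unit, label in doc:
            counts[format_features(unit)][label] += 1
    return {feats: c.most_common(1)[0][0] for feats, c in counts.items()}


def extract(model, units):
    """Label every unit, then join the text from the begin unit through the end unit."""
    labels = [model.get(format_features(u), "other") for u in units]
    if "begin" not in labels:
        return None
    start = labels.index("begin")
    if "end" not in labels[start:]:
        return None
    stop = labels.index("end", start)
    # Assumption: the begin and end units themselves are part of the result.
    return " ".join(u["text"] for u in units[start:stop + 1])


# One labeled training document (illustrative data only).
training_docs = [[
    ({"text": "ACME Corp", "font_size": 10, "bold": False}, "other"),
    ({"text": "Extracting Titles", "font_size": 18, "bold": True}, "begin"),
    ({"text": "from Formatted Documents", "font_size": 18, "bold": False}, "end"),
    ({"text": "Body text follows.", "font_size": 10, "bold": False}, "other"),
]]
model = train(training_docs)

# An unlabeled document to extract from.
units = [
    {"text": "Internal memo", "font_size": 10, "bold": False},
    {"text": "Quarterly Report:", "font_size": 16, "bold": True},
    {"text": "Sales and Forecasts", "font_size": 16, "bold": False},
    {"text": "Revenue grew this quarter.", "font_size": 11, "bold": False},
]
extracted = extract(model, units)
```

In this sketch the labels depend only on how a unit is formatted, never on its words, which is the point of using format features. A downstream task such as ranking could then weight `extracted` more heavily than the surrounding body text.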

 