The text format of the input data is checked, and the data is converted into a format that the system can manipulate. Tags, heading information, and the like are then used to determine whether the input data is in HTML or e-mail format. The converted data is first divided into blocks in a simple manner, based on the repetition of predetermined character patterns, so that the elements within each block can be checked. Each block section is marked with a tag identifying it as a block. The block-divided data is then parsed based on tags, character patterns, and so on, and is structured into sentences. Any table in the text is also parsed and segmented into cells. Finally, tree-structured data having a hierarchical structure is generated from the sentence-structured data. Sentences are extracted using a sentence-extraction template paired with the tree-structured data.
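
The steps described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Node`, `detect_format`, `split_blocks`, `parse_block`, `build_tree`, and `extract_sentences` are hypothetical, blank lines stand in for the "predetermined character patterns", pipe-delimited lines stand in for a text table, and the template is assumed to be a simple path of node kinds. Conversion of HTML into an internal format is omitted.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the hierarchical (tree-structured) output."""
    kind: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def detect_format(raw: str) -> str:
    """Guess the input format from tags or header lines (simplified heuristic)."""
    if re.search(r"<\s*(html|body|p|table)\b", raw, re.IGNORECASE):
        return "html"
    if re.match(r"(From|To|Subject):", raw, re.IGNORECASE):
        return "email"
    return "plain"


def split_blocks(text: str) -> list[str]:
    """Divide text into blocks at a repeated pattern (assumed here: blank lines)."""
    return [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]


def parse_block(block: str) -> Node:
    """Structure one block: pipe-delimited lines become a table, the rest sentences."""
    lines = block.splitlines()
    if all("|" in line for line in lines):
        rows = [
            Node("row", children=[Node("cell", c.strip())
                                  for c in line.strip("|").split("|")])
            for line in lines
        ]
        return Node("table", children=rows)
    sentences = re.split(r"(?<=[.!?])\s+", " ".join(lines))
    return Node("block", children=[Node("sentence", s) for s in sentences if s])


def build_tree(raw: str) -> Node:
    """Build the tree; the root's text records the detected input format."""
    root = Node("document", text=detect_format(raw))
    root.children = [parse_block(b) for b in split_blocks(raw)]
    return root


def extract_sentences(tree: Node, template: tuple[str, ...]) -> list[str]:
    """Return the text of nodes whose path of kinds from the root equals `template`."""
    found: list[str] = []

    def walk(node: Node, path: tuple[str, ...]) -> None:
        path = path + (node.kind,)
        if path == template:
            found.append(node.text)
        for child in node.children:
            walk(child, path)

    walk(tree, ())
    return found


sample = "Subject: Report\n\nFirst point. Second point.\n\nname | qty\nbolt | 4"
tree = build_tree(sample)
print(extract_sentences(tree, ("document", "block", "sentence")))
```

In this sketch the template is just a tuple of node kinds; a different template such as `("document", "table", "row", "cell")` would pull out table cells instead of sentences.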

 