A method for processing semi-structured data. The method includes
receiving semi-structured data in a first format from a real business
process. Preferably, the semi-structured data are machine-generated. The
method includes tokenizing the semi-structured data into a second format,
storing the semi-structured data in the second format in one or more
memories, and clustering the tokenized data to form a plurality of
clusters. The method also includes identifying a selected low-frequency
term in each of the clusters, and processing at least two of the clusters
and their associated selected low-frequency terms to form a single
template for the at least two clusters. In a preferred embodiment, the
method replaces the selected low-frequency term with a wildcard
character.
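
The steps above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not say how tokenizing, clustering, or frequency counting are performed, so whitespace tokenization, exact-match clustering, and corpus-wide term counts are all assumptions made here for illustration. The function names (`tokenize`, `cluster`, `low_frequency_index`, `build_templates`) and the sample log lines are likewise hypothetical.

```python
from collections import Counter, defaultdict

WILDCARD = "*"  # assumed wildcard character; the abstract does not name one


def tokenize(line):
    """Split a raw line into whitespace-delimited tokens (the assumed 'second format')."""
    return tuple(line.split())


def cluster(token_rows):
    """Group identical token sequences; each distinct sequence is one cluster (assumption)."""
    clusters = defaultdict(int)
    for row in token_rows:
        clusters[row] += 1
    return clusters


def low_frequency_index(row, term_counts):
    """Return the position of the rarest term in the row (the first one on ties)."""
    return min(range(len(row)), key=lambda i: term_counts[row[i]])


def build_templates(lines):
    """Map each wildcard template to the clusters that collapse into it."""
    token_rows = [tokenize(line) for line in lines if line.strip()]
    # Term frequency is counted over all tokenized lines (assumption).
    term_counts = Counter(tok for row in token_rows for tok in row)
    templates = defaultdict(list)
    for row in cluster(token_rows):
        i = low_frequency_index(row, term_counts)
        # Replace the selected low-frequency term with the wildcard.
        template = row[:i] + (WILDCARD,) + row[i + 1:]
        # Clusters that yield the same template are merged under it.
        templates[template].append(row)
    return {" ".join(t): members for t, members in templates.items()}


if __name__ == "__main__":
    sample = [
        "user alice logged in",
        "user alice logged in",
        "user bob logged in",
        "disk sda1 is full",
        "disk sdb2 is full",
    ]
    for template, members in build_templates(sample).items():
        print(template, "<-", len(members), "clusters")
```

With the sample input, the two "user ... logged in" clusters collapse into the single template `user * logged in`, and the two "disk ... is full" clusters into `disk * is full`. A practical system would likely need a more tolerant clustering step and a rule for lines with more than one variable field, neither of which this sketch attempts.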