Automated taxonomy generation

In a hierarchical taxonomy of document, the categories of information may be structured as a binary tree with the nodes of the binary tree containing information relevant to the search. The binary tree may be `trained` or formed by examining a training set of documents and separating those documents into two child nodes. Each of those sets of documents may then be further split into two nodes to create the binary tree data structure. The nodes may be generated to maximize the likelihood that all of the training documents are in either or both of the two child nodes. In one example, each node of the binary tree may be associated with a list of terms and each term in each list of terms is associated with a probability of that term appearing in a document given that node. New documents may be categorized by the nodes of the tree. For example, the new documents may be assigned to a particular node based upon the statistical similarity between that document and the associated node.

Web www.patentalert.com

> Scheduling the recording of television programs

~ 00375