Methods and apparatus are provided for generating a decision tree using
linear discriminant analysis and implementing such a decision tree in the
classification (also referred to as categorization) of data. The data is
preferably in the form of multidimensional objects, e.g., data records
including feature variables and class variables in a decision tree
generation mode, and data records including only feature variables in a
decision tree traversal mode. Such an approach can, for example, yield
more effective supervised classification systems. In general, the
present invention comprises splitting a decision tree, recursively, such
that the greatest amount of separation among the class values of the
training data is achieved. This is accomplished by finding effective
combinations of variables in order to recursively split the training data
and create the decision tree. The decision tree is then used to classify
input testing data.
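
The recursive splitting and traversal described above can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes NumPy, takes the leading Fisher discriminant direction as the "combination of variables" at each node, and picks the split threshold by weighted Gini impurity. The function names (lda_direction, build_tree, classify), the ridge term, the stopping parameters, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def lda_direction(X, y):
    """Leading Fisher discriminant direction for the records at one node."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    Sw += 1e-6 * np.eye(d)  # assumed small ridge so Sw stays invertible
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    w = np.real(vecs[:, np.argmax(np.real(vals))])
    return w / np.linalg.norm(w)

def gini(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_threshold(z, y):
    """Midpoint threshold on projections z with the lowest weighted Gini."""
    order = np.argsort(z)
    z, y = z[order], y[order]
    best_score, best_t = np.inf, None
    for i in range(1, len(z)):
        if z[i] == z[i - 1]:
            continue
        score = (i * gini(y[:i]) + (len(z) - i) * gini(y[i:])) / len(z)
        if score < best_score:
            best_score, best_t = score, 0.5 * (z[i] + z[i - 1])
    return best_t

def majority(y):
    vals, counts = np.unique(y, return_counts=True)
    return vals[np.argmax(counts)]

def build_tree(X, y, depth=0, max_depth=5, min_size=2):
    """Generation mode: records carry feature variables X and class variable y."""
    if len(np.unique(y)) == 1 or depth >= max_depth or len(y) < min_size:
        return {"leaf": majority(y)}
    w = lda_direction(X, y)
    z = X @ w
    t = best_threshold(z, y)
    if t is None:
        return {"leaf": majority(y)}
    left = z <= t
    if left.all() or not left.any():
        return {"leaf": majority(y)}
    return {
        "w": w,
        "t": t,
        "left": build_tree(X[left], y[left], depth + 1, max_depth, min_size),
        "right": build_tree(X[~left], y[~left], depth + 1, max_depth, min_size),
    }

def classify(tree, x):
    """Traversal mode: the record carries feature variables only."""
    while "leaf" not in tree:
        tree = tree["left"] if x @ tree["w"] <= tree["t"] else tree["right"]
    return tree["leaf"]

# Toy training data: the classes overlap on each single feature but are
# separated along the combined direction x1 + x2.
offsets = np.array([[0, 0], [1, -1], [-1, 1], [2, -2], [-2, 2]], dtype=float)
X_train = np.vstack([offsets, offsets + 1.0])
y_train = np.array([0] * 5 + [1] * 5)
tree = build_tree(X_train, y_train)
predictions = [classify(tree, x) for x in X_train]
```

In this toy data neither feature separates the classes by itself, so a tree that splits on one variable at a time would need several levels, whereas a split on a linear combination of the two features can separate them at the root.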