A two-class analysis system for summarizing features and determining
features appropriate to use in training a classifier related to a data
mining operation. Exemplary embodiments describe how to select features
that are suited to training a classifier for a two-class text
classification problem. Bi-Normal Separation methods are defined in
which each feature is scored by a measure that is based on the inverse
cumulative distribution function of a standard probability distribution
and is representative of the difference in occurrences of the feature
between the two classes. In addition
to training a classifier, the system provides a means of summarizing
differences between classes.
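
As an illustration of the scoring described above, the following is a
minimal sketch and not the claimed implementation. It assumes the
standard probability distribution is the standard Normal distribution
(as the name Bi-Normal suggests) and that the measure is the absolute
difference between the inverse CDF of the feature's occurrence rate in
each class. The names bns_score and select_features, the per-class count
inputs, and the clipping constant eps are hypothetical choices made for
this example.

```python
from statistics import NormalDist

# Assumption: the "standard probability distribution" is the standard Normal.
_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def bns_score(pos_with, pos_total, neg_with, neg_total, eps=0.0005):
    """Score one feature from its occurrence counts in the two classes.

    pos_with / neg_with: documents of each class that contain the feature.
    pos_total / neg_total: total documents in each class.
    eps: hypothetical clipping bound; it keeps rates away from 0 and 1,
         where the inverse CDF is undefined.
    """
    if pos_total <= 0 or neg_total <= 0:
        raise ValueError("each class needs at least one document")
    # Occurrence rate of the feature in each class, clipped to [eps, 1 - eps].
    pos_rate = min(max(pos_with / pos_total, eps), 1 - eps)
    neg_rate = min(max(neg_with / neg_total, eps), 1 - eps)
    # Difference of inverse-CDF values: large when the classes disagree.
    return abs(_STD_NORMAL.inv_cdf(pos_rate) - _STD_NORMAL.inv_cdf(neg_rate))


def select_features(counts, pos_total, neg_total, k):
    """Return the k highest-scoring features.

    counts maps feature -> (pos_with, neg_with).
    """
    scores = {
        feature: bns_score(p, pos_total, n, neg_total)
        for feature, (p, n) in counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical counts over 100 documents per class.
    counts = {"refund": (80, 10), "the": (50, 50), "invoice": (5, 60)}
    print(select_features(counts, pos_total=100, neg_total=100, k=2))
```

Under these assumptions, a feature that occurs equally often in both
classes scores zero, while a feature concentrated in either class scores
high. The same ranked list can therefore serve both to choose training
features and to summarize how the two classes differ.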