A method of generating an anomaly detection model for classifying
activities of a computer system uses a training set of data
corresponding to activity on the computer system, the training set
comprising a plurality of instances of data having a plurality of features,
wherein each feature in said plurality of features has a plurality of values. For
a selected feature and a selected value of the selected feature, a
quantity is determined which corresponds to the relative sparsity of such
value. The quantity may correspond to the difference between the number
of occurrences of the selected value and the number of occurrences of the
most frequently occurring value. These instances are classified as
anomalies and added to the training set of normal data to generate a rule
set or other detection model.
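
The quantity described above, the difference between the count of the most frequently occurring value of a feature and the count of a selected value, may be illustrated with a minimal sketch. The function name `sparsity_quantities`, the dictionary representation of instances, and the sample feature names and values are assumptions made for illustration only and do not appear in the original text.

```python
from collections import Counter

def sparsity_quantities(instances, feature):
    """Hypothetical helper: for each observed value of `feature`, return
    (count of the most frequent value) - (count of that value)."""
    counts = Counter(inst[feature] for inst in instances)
    if not counts:
        return {}  # no instances, so no values to score
    max_count = max(counts.values())
    # A larger result means the value is sparser relative to the most common one.
    return {value: max_count - n for value, n in counts.items()}

# Illustrative training set (assumed data, not taken from the source).
training_set = [
    {"protocol": "tcp", "service": "http"},
    {"protocol": "tcp", "service": "http"},
    {"protocol": "tcp", "service": "ftp"},
    {"protocol": "udp", "service": "dns"},
]

quantities = sparsity_quantities(training_set, "protocol")
```

In this sketch the most frequent value of a feature receives a quantity of zero, and rarer values receive larger quantities, which is one way to read "relative sparsity" as stated above.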