A method of anomaly detection applicable to telecommunications or retail
fraud or software vulnerabilities uses inductive logic programming to
develop anomaly characterization rules from relevant background knowledge
and a training data set, which includes positive anomaly samples of data
covered by rules. Each data sample includes a flag of 1 or 0 indicating
whether or not it is associated with an anomaly. An anomaly is detected by
a rule having a condition set which the anomaly fulfils. Rules are developed by addition
of conditions and unification of variables, and are filtered to remove
duplicates, equivalents, symmetric rules and unnecessary conditions.
Overfitting to noisy data is avoided by an encoding cost criterion.
Termination of rule construction involves criteria of rule length,
absence of negative examples, rule significance and accuracy, and absence
of recent refinement. Iteration of rule construction involves selecting
rules with unterminated construction, selecting rule refinements
associated with high accuracies, and iterating a rule refinement,
filtering and evaluation procedure to identify any refined rule usable for
testing data. Rule development may use first-order or higher-order
logic.
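
The refine, filter and evaluate cycle described above can be illustrated with a minimal sketch. It is a propositional simplification, not the inductive logic programming procedure itself: there is no variable unification, no encoding cost criterion and no significance test. The condition names, the retail-style sample fields and the helper functions (`covers`, `accuracy`, `refine`, `dedupe`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Sample = Dict[str, object]
Rule = FrozenSet[str]  # a rule is modelled here as a set of condition names

# Hypothetical conditions over a sample's fields (assumed for this sketch).
CONDITIONS: Dict[str, Callable[[Sample], bool]] = {
    "refund": lambda s: s["kind"] == "refund",
    "no_receipt": lambda s: not s["receipt"],
    "high_value": lambda s: s["amount"] > 100,
}

# Each sample carries a label: 1 if associated with an anomaly, else 0.
DATA: List[Tuple[Sample, int]] = [
    ({"kind": "refund", "receipt": False, "amount": 250}, 1),
    ({"kind": "refund", "receipt": True, "amount": 40}, 0),
    ({"kind": "sale", "receipt": True, "amount": 300}, 0),
    ({"kind": "refund", "receipt": False, "amount": 20}, 1),
    ({"kind": "refund", "receipt": True, "amount": 500}, 0),
]


def covers(rule: Rule, sample: Sample) -> bool:
    """A rule detects a sample when the sample fulfils every condition."""
    return all(CONDITIONS[name](sample) for name in rule)


def accuracy(rule: Rule, data: List[Tuple[Sample, int]]) -> float:
    """Fraction of covered samples that are labelled as anomalies."""
    labels = [label for sample, label in data if covers(rule, sample)]
    return sum(labels) / len(labels) if labels else 0.0


def refine(rule: Rule) -> List[Rule]:
    """Refine a rule by adding one condition it does not already contain."""
    return [rule | {name} for name in CONDITIONS if name not in rule]


def dedupe(rules: List[Rule]) -> List[Rule]:
    """Drop duplicate rules; sets ignore condition order, so reordered
    copies of the same conditions count as duplicates."""
    return list(dict.fromkeys(rules))


start: Rule = frozenset({"refund"})
candidates = dedupe(refine(start))
# Keep the refinement with the highest accuracy on the training data.
best = max(candidates, key=lambda r: accuracy(r, DATA))
```

In this toy data set the starting rule covers both anomalous and normal refunds, and the refinement that adds `no_receipt` covers only the anomalous ones, so it is the one selected. A fuller treatment would repeat the cycle on rules whose construction has not terminated and would apply the termination criteria listed above.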