Methods and systems are described for reducing the number of false alarms
in fault correlation software used to detect and diagnose faults in computer networks
and similar systems. The fault correlation software includes rules, each of which
monitors a set of indicators that, when occurring together within a window of time,
are known to cause or reflect the occurrence of a fault. The method involves monitoring the
transition of these indicators from one state to another over the time window and
determining the extent of the correlation of the transitions of the indicators.
A determination that the indicators monitored by a rule do not correlate closely
in their transitions is used to reduce the likelihood that the rule finds the
indicators, as a whole, to be correlated. This in turn reduces the number of false
alarms that the rule-based system might otherwise have transmitted.
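
The idea of discounting a rule by how closely its indicators' transitions line up can be illustrated with a minimal Python sketch. The abstract does not specify how the extent of transition correlation is measured, so everything below is an assumption made for illustration: indicators are modeled as equally spaced state samples over the time window, two transitions are treated as coinciding when they fall within a `tolerance` number of samples, and the rule's score is simply multiplied by the matched fraction. The names `transition_times`, `transition_correlation`, and `adjusted_score` are hypothetical and do not come from the described system.

```python
from typing import Dict, List, Sequence


def transition_times(states: Sequence[int]) -> List[int]:
    """Return the sample indices at which an indicator changes state."""
    return [i for i in range(1, len(states)) if states[i] != states[i - 1]]


def transition_correlation(
    indicators: Dict[str, Sequence[int]], tolerance: int = 1
) -> float:
    """Fraction of all transitions that are matched, within `tolerance`
    samples, by a transition in every other indicator.

    Assumed measure for illustration only; returns 0.0 when no indicator
    changes state during the window.
    """
    times = {name: transition_times(s) for name, s in indicators.items()}
    total = 0
    matched = 0
    for name, own in times.items():
        others = [t for other, t in times.items() if other != name]
        for t in own:
            total += 1
            # A transition counts as matched only if every other indicator
            # also has a transition close to it in time.
            if all(any(abs(t - u) <= tolerance for u in o) for o in others):
                matched += 1
    return matched / total if total else 0.0


def adjusted_score(
    rule_score: float, indicators: Dict[str, Sequence[int]], tolerance: int = 1
) -> float:
    """Scale a rule's correlation score by the transition correlation
    (an assumed way of 'reducing the likelihood' that the rule fires)."""
    return rule_score * transition_correlation(indicators, tolerance)


# Indicators that change state together: the rule's score is preserved.
together = {"cpu_high": [0, 0, 1, 1, 1, 0], "errors_high": [0, 0, 0, 1, 1, 0]}
print(transition_correlation(together))  # 1.0

# Indicators that are both high late in the window but rose at unrelated
# times: the rule's score is suppressed, avoiding a likely false alarm.
unrelated = {
    "cpu_high": [0, 1, 1, 1, 1, 1, 1, 1],
    "errors_high": [0, 0, 0, 0, 0, 0, 1, 1],
}
print(adjusted_score(0.9, unrelated))  # 0.0
```

In the second case both indicators are in the high state at the end of the window, so a rule that looked only at co-occurrence could raise an alarm. Because their transitions are five samples apart, this sketch reduces the rule's score instead. A real system would likely use a softer discount than a plain product.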