A method and system for improving the reliability of a system, such as a
software system, are disclosed. "Service measurements" that are routinely
collected and monitored in connection with the operation of systems (e.g.,
QOS (Quality Of Service) measurements) are utilized in the reliability
architecture of the system. Service measurements include a number of
alternative data types, such as transactions completed, messages
received, messages sent, calls completed, bytes transmitted, and jobs
processed. Any operations of the system that are typically monitored
for other purposes can be utilized in the reliability architecture of the
present invention. Most systems track these types of statistics as, for
example, part of their billing procedures or part of their performance
benchmarking or QOS processes. Because these statistics are already kept,
they can readily be analyzed to create the historical signatures, and the
statistics of the currently operating system can then be monitored to
perform the signature checking process.
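
The two steps described above, creating a historical signature from statistics that are already kept and then checking the currently operating system against it, can be illustrated with a minimal sketch. The text does not specify how a signature is represented or compared, so everything here is an assumption made for illustration: the signature is taken to be the mean and sample standard deviation of past per-interval counts, the check is a simple tolerance band around that mean, and the names build_signature, check_signature, and tolerance are hypothetical.

```python
from statistics import mean, stdev


def build_signature(history):
    """Summarize past per-interval counts of one service measurement.

    Assumption: the "historical signature" is modeled as a mean and a
    sample standard deviation. The source does not define its form.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    return {"mean": mean(history), "stdev": stdev(history)}


def check_signature(signature, current, tolerance=3.0):
    """Return True if the current count is consistent with the signature.

    Assumption: "consistent" means within `tolerance` standard deviations
    of the historical mean. The threshold is illustrative only.
    """
    spread = signature["stdev"]
    if spread == 0:
        # History never varied, so any difference is a deviation.
        return current == signature["mean"]
    return abs(current - signature["mean"]) <= tolerance * spread


# Hypothetical data: transactions completed per interval, as might
# already be recorded for billing or QOS purposes.
past_counts = [100, 104, 98, 101, 97]
signature = build_signature(past_counts)

normal_interval_ok = check_signature(signature, 102)
stalled_interval_ok = check_signature(signature, 40)
```

In this sketch an interval with 102 completed transactions falls within the band and passes the check, while an interval with only 40 does not, which is the kind of deviation a reliability architecture could treat as a sign of a fault.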