Disclosed is a mechanism for tracking and reporting the reliability and
availability of software applications. The mechanism collects event data
from target computers, analyzes the data, and produces reliability and
availability reports. A
network administrator specifies target computers for which event data are
collected. The collected event data, together with a reliability model,
are provided to a reliability and availability analysis engine. Output from
the engine includes reliability and availability data expressed as
durations of time spent in each state and as associations with the
events. The reliability and availability data are fed to a report
generator which computes reliability and availability metrics. The
metrics are used to generate reports that can be interpreted by the
network administrator without the need for specialized data analysis
skills. The metrics are also aggregated to provide historical and
relative-ranking data on reliability and availability, useful for planning
and for tracking against reliability and availability objectives.
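
The flow described above, from event data and a reliability model to
per-state durations and then to metrics, can be illustrated with a minimal
sketch. Everything named here is an assumption made for illustration and
is not taken from the disclosure: the event kinds ("start", "crash",
"shutdown"), the two states ("up", "down"), the MODEL mapping, and the
choice of availability and mean time between failures as example metrics.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since the start of the observation window
    kind: str         # hypothetical event kinds: "start", "crash", "shutdown"


# Hypothetical reliability model: the state entered when each event occurs.
MODEL = {"start": "up", "crash": "down", "shutdown": "down"}


def state_durations(events, window_end, initial_state="down", window_start=0.0):
    """Analysis step: total time spent in each state over the window."""
    durations = defaultdict(float)
    state, since = initial_state, window_start
    for ev in sorted(events, key=lambda e: e.timestamp):
        durations[state] += ev.timestamp - since   # close the current interval
        state, since = MODEL[ev.kind], ev.timestamp
    durations[state] += window_end - since         # close the final interval
    return dict(durations)


def metrics(events, window_end):
    """Report step: derive example metrics from the per-state durations."""
    d = state_durations(events, window_end)
    up = d.get("up", 0.0)
    total = sum(d.values())
    failures = sum(1 for e in events if e.kind == "crash")
    return {
        "availability": up / total if total else 0.0,
        # Mean time between failures; undefined when no failure was observed.
        "mtbf": up / failures if failures else None,
    }


# Made-up sample data for one target computer over a 200-second window.
sample = [Event(0.0, "start"), Event(90.0, "crash"), Event(100.0, "start")]
report = metrics(sample, window_end=200.0)
```

Separating state_durations from metrics mirrors the split between the
analysis engine and the report generator; aggregating the metrics across
computers or over time would be a further step that this sketch omits.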