The present invention provides a method and system for resetting a
computer system after an operating system (OS) hang condition, the
computer system including an interrupt handler not accessible by the OS.
The method includes determining if an interrupt has been generated by a
watchdog timer; monitoring for an OS hang condition by the interrupt
handler if the interrupt has been generated and after it is known that the
OS is operating; and resetting the OS if a device driver within the OS has
not set a bit in a register, the bit for indicating that the OS is
operating. The method and system in accordance with the present invention
use existing hardware and software within a computer system to reset the
OS. The present invention uses a method by which a critical hardware
watchdog periodically wakes a critical interrupt handler of the computer
system. The critical interrupt handler determines if the OS is in a hang
condition by polling a shared hardware register that a device driver,
running under the OS, will set periodically. If the critical interrupt
handler does not see that the device driver has set the register bit, it
will assume the OS has hung and will reset the system. In addition, the
critical interrupt handler will record the reset in non-volatile memory,
and the reset can be logged in the system error log. Because the method and
system in accordance with the present invention use existing hardware and
software within the computer system, instead of requiring an additional
processor, they are cost-efficient to implement while also providing a reset
of the OS without human intervention.
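
The interaction between the watchdog-driven interrupt handler and the device driver described above can be illustrated with a small behavioral simulation. This is only a sketch under stated assumptions, not the claimed implementation: the names `SharedRegister`, `CriticalInterruptHandler`, `driver_heartbeat`, and `HEARTBEAT_BIT` are hypothetical, the handler is assumed to clear the bit after each check so that the driver must set it again, the OS is assumed to be "known to be operating" once the bit has been seen set at least once, and a Python list stands in for non-volatile memory.

```python
from dataclasses import dataclass, field

HEARTBEAT_BIT = 0x01  # hypothetical bit position in the shared register


@dataclass
class SharedRegister:
    """Stand-in for the hardware register shared by driver and handler."""
    value: int = 0


def driver_heartbeat(register: SharedRegister) -> None:
    """Device driver running under the OS: periodically sets the bit."""
    register.value |= HEARTBEAT_BIT


@dataclass
class CriticalInterruptHandler:
    """Handler woken by the watchdog timer; not accessible by the OS."""
    register: SharedRegister
    error_log: list = field(default_factory=list)  # stands in for non-volatile memory
    os_known_running: bool = False
    resets: int = 0

    def on_watchdog_interrupt(self, tick: int) -> bool:
        """Poll the register; return True if a reset was triggered."""
        alive = bool(self.register.value & HEARTBEAT_BIT)
        # Assumption: the handler clears the bit so the driver must set it again.
        self.register.value &= ~HEARTBEAT_BIT

        if alive:
            self.os_known_running = True
            return False
        if not self.os_known_running:
            # OS not yet known to be operating (e.g. still booting): do not monitor.
            return False

        # Bit not set while the OS was expected to be running: assume a hang.
        self.error_log.append(f"tick {tick}: OS hang detected, system reset")
        self.resets += 1
        self.os_known_running = False  # wait for the driver again after the reset
        return True
```

In this sketch, a watchdog interrupt that arrives before the driver has ever set the bit is ignored, which mirrors the condition that monitoring begins only after the OS is known to be operating. Once the driver has checked in, a single interval without the bit being set is treated as a hang; a real system might tolerate several missed intervals before resetting.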