An apparatus and method for a computer system implement an
extended distributed recovery block fault tolerance scheme. The computer
system includes a supervisory node, an active node and a standby node.
Each of the nodes has a primary routine, an alternate routine and an
acceptance test for testing the output of the routines. Each node also
includes a device driver, a monitor and a node manager for determining
the operational configuration of the node. The supervisory node
coordinates the operation of the active and standby nodes. The primary
and alternate routines are implemented with an application task through a
plurality of agent objects operating as finite state machines. A reliable
data link extends between the monitors of the active and standby nodes.
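
The per-node arrangement of a primary routine, an alternate routine and an acceptance test can be illustrated with a minimal sketch. It is not the claimed implementation. The function name `recovery_block` and the example routines `primary`, `alternate` and `acceptance_test` are hypothetical. The sketch assumes that a routine signals failure either by raising an exception or by producing an output that the acceptance test rejects.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def recovery_block(
    primary: Callable[[T], R],
    alternate: Callable[[T], R],
    acceptance_test: Callable[[R], bool],
    data: T,
) -> Optional[R]:
    """Run the primary routine, then the alternate, returning the first
    output that passes the acceptance test, or None if neither passes."""
    for routine in (primary, alternate):
        try:
            result = routine(data)
        except Exception:
            # Assumption: a raised exception counts as a failed attempt.
            continue
        if acceptance_test(result):
            return result
    # Assumption: None stands in for the node reporting a failure.
    return None


# Hypothetical routines, for illustration only.
def primary(x: int) -> int:
    return x - 10  # produces an unacceptable (negative) output for x < 10


def alternate(x: int) -> int:
    return abs(x)


def acceptance_test(result: int) -> bool:
    return result >= 0


accepted_from_alternate = recovery_block(primary, alternate, acceptance_test, 3)
accepted_from_primary = recovery_block(primary, alternate, acceptance_test, 25)
```

In this sketch the first call falls back to the alternate routine because the primary output fails the acceptance test, while the second call accepts the primary output directly. Coordination between the active and standby nodes, and the role of the supervisory node, are outside the scope of the sketch.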