A method is described for checkpointing a multithreaded application
program, based on the egalitarian and competitive active replication strategy. The
invention enables different threads to be checkpointed at different times
in such a way that the checkpoints restore a consistent state of the
threads at a new or recovering replica, even though the threads operate
concurrently and asynchronously. Separate checkpoints are generated for
the local state of each thread and for the data that are shared between
threads and are protected by mutexes. The checkpoint of the shared data
is communicated in a special message that also determines the order in
which the claims of mutexes are granted to the threads. A source-code
preprocessor tool is described for inserting code into an application
program to checkpoint the state of each thread during normal operation and
to subsequently restore that state from the checkpoint.
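
The separation between per-thread checkpoints and the shared-data checkpoint can be illustrated with a minimal Python sketch. It is not the implementation described in the invention: the names `Replica`, `ClaimMessage`, `checkpoint_thread`, `restore_thread`, `make_claim_message`, and `deliver` are hypothetical, and the ordered delivery of messages to all replicas is assumed and simulated here by direct method calls rather than by a real group communication layer or real threads.

```python
import copy
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class ClaimMessage:
    """Hypothetical ordered message that grants a mutex to a thread."""
    seq: int                                  # position in the assumed delivery order
    mutex_id: str
    thread_id: str
    shared_checkpoint: Optional[Any] = None   # shared-data checkpoint, if carried


class Replica:
    """Toy replica: thread-local state and mutex-protected data kept apart."""

    def __init__(self) -> None:
        self.local: Dict[str, Any] = {}       # thread_id -> local state
        self.shared: Dict[str, Any] = {}      # mutex_id -> data it protects
        self.grants: List[Tuple[int, str, str]] = []  # order of mutex grants

    def checkpoint_thread(self, thread_id: str) -> Any:
        # Each thread's local state is checkpointed on its own.
        return copy.deepcopy(self.local[thread_id])

    def restore_thread(self, thread_id: str, checkpoint: Any) -> None:
        self.local[thread_id] = copy.deepcopy(checkpoint)

    def make_claim_message(self, seq: int, mutex_id: str, thread_id: str,
                           with_checkpoint: bool = False) -> ClaimMessage:
        # The shared-data checkpoint rides in the message that orders the claim.
        ckpt = copy.deepcopy(self.shared[mutex_id]) if with_checkpoint else None
        return ClaimMessage(seq, mutex_id, thread_id, ckpt)

    def deliver(self, msg: ClaimMessage) -> None:
        # Install the shared data (if present), then record the grant.
        if msg.shared_checkpoint is not None:
            self.shared[msg.mutex_id] = copy.deepcopy(msg.shared_checkpoint)
        self.grants.append((msg.seq, msg.mutex_id, msg.thread_id))


def demo() -> Tuple[Replica, Replica]:
    existing = Replica()
    existing.local = {"t1": {"step": 3}, "t2": {"step": 7}}
    existing.shared = {"m1": {"balance": 100}}
    recovering = Replica()

    # Thread t1 is checkpointed first.
    recovering.restore_thread("t1", existing.checkpoint_thread("t1"))

    # A mutex claim message carries the shared-data checkpoint to every replica.
    msg = existing.make_claim_message(1, "m1", "t2", with_checkpoint=True)
    for replica in (existing, recovering):
        replica.deliver(msg)

    # Thread t2 makes progress and is checkpointed at a later time.
    existing.local["t2"]["step"] = 8
    recovering.restore_thread("t2", existing.checkpoint_thread("t2"))
    return existing, recovering


if __name__ == "__main__":
    old, new = demo()
    print(new.local, new.shared, new.grants)
```

In this sketch the recovering replica obtains the two thread states at different times, while the mutex-protected data arrives only through the claim message, so both replicas observe the same grant order alongside the same shared data.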