A high-speed method for maintaining a summary of thread activity reduces the number
of remote-memory operations for an n-processor, multiple-node computer system from
n^2 to (2n-1) operations. The method uses a hierarchical summary-of-thread-activity
data structure that includes structures such as first and second level bit masks.
The first level bit mask is accessible to all nodes and contains a bit per node,
the bit indicating whether the corresponding node contains a processor that has
not yet passed through a quiescent state. The second level bit mask is local to
each node and contains a bit per processor per node, the bit indicating whether
the corresponding processor has not yet passed through a quiescent state. The method
includes determining from a data structure on a processor's node (such as a second
level bit mask) whether the processor has passed through a quiescent state. If so, it
is then determined from the same data structure whether all other processors on its
node have passed through a quiescent state. If so, it is then indicated in a data structure
accessible to all nodes (such as the first level bit mask) that all processors on
the processor's node have passed through a quiescent state. The local generation
number can also be stored in the data structure accessible to all nodes. If a processor
determines from this data structure that it is the last processor to pass through a
quiescent state, the processor updates the data structure, stored in the memory of each
node, that holds the number of the current generation.
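
The steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name ThreadActivitySummary, its fields, and the quiescent method are hypothetical names chosen for illustration. The two bit masks are modeled as plain integers in a single process, and remote-memory accesses and locking are not modeled at all.

```python
class ThreadActivitySummary:
    """Illustrative two-level summary: one bit per node, one bit per processor per node."""

    def __init__(self, num_nodes, cpus_per_node):
        self.num_nodes = num_nodes
        self.cpus_per_node = cpus_per_node
        self.generation = 0
        # Assumed layout: each node keeps its own copy of the generation number.
        self.node_generation = [0] * num_nodes
        self._start_generation()

    def _start_generation(self):
        # A set bit means "has not yet passed through a quiescent state".
        all_cpus = (1 << self.cpus_per_node) - 1
        self.second_level = [all_cpus] * self.num_nodes  # one mask per node
        self.first_level = (1 << self.num_nodes) - 1     # shared by all nodes

    def quiescent(self, node, cpu):
        """Record a quiescent state for `cpu` on `node`.

        Returns True if this call completed the current generation.
        """
        bit = 1 << cpu
        if not self.second_level[node] & bit:
            return False  # already recorded in this generation
        self.second_level[node] &= ~bit
        if self.second_level[node]:
            return False  # other processors on this node are still pending
        # Whole node is done: clear its bit in the mask shared by all nodes.
        self.first_level &= ~(1 << node)
        if self.first_level:
            return False  # other nodes are still pending
        # Last processor overall: advance the generation on every node.
        self.generation += 1
        self.node_generation = [self.generation] * self.num_nodes
        self._start_generation()
        return True


summary = ThreadActivitySummary(num_nodes=2, cpus_per_node=2)
for node, cpu in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    done = summary.quiescent(node, cpu)
    print(node, cpu, done, summary.generation)
```

In this sketch, a processor touches only its own node's second level mask until it is the last processor on that node to report, and only then touches the first level mask. This is the property the hierarchy relies on to limit accesses to shared data.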