When a distributed computer system fails, e.g., as a result of a
split-brain failure, each node races to achieve a quorum by successfully
reserving two shared storage devices which are designated quorum
controllers. During
normal operation of the distributed computer system, each of the quorum
controllers is associated with and reserved by a respective node. During
the race for quorum in response to a detected failure of the distributed
computer system, each node which has not failed forcibly reserves the
quorum controller which is associated with the other node. If a node
simultaneously holds reservations for both quorum controllers, that node
has acquired a quorum. The forcible reservation of a shared storage device
does not fail even if another node holds a valid reservation of the same
storage device. Accordingly, a failed node which does not relinquish the
reservation of its own quorum controller cannot prevent another node
from acquiring a quorum. Prior to forcibly reserving the quorum controller
of another node, each node verifies that it continues to hold a
reservation of the node's own associated quorum controller. If a node no
longer holds a reservation of the node's own associated quorum controller,
that node has lost the race for quorum since another node has already
forcibly reserved the former node's associated quorum controller. Thus,
quorum can be efficiently and effectively determined by independent nodes
of a failing distributed computer system even if a failing node does not
relinquish the shared storage device reservations which it holds.
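
The race described above can be illustrated with a minimal in-memory sketch for a two-node system. The names SharedDevice, forcible_reserve, is_reserved_by, and race_for_quorum are hypothetical and chosen only for illustration; an actual system would issue reservation commands to real shared storage devices rather than update a Python object.

```python
class SharedDevice:
    """Hypothetical stand-in for a shared storage device used as a quorum controller."""

    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner  # identifier of the node currently holding the reservation

    def forcible_reserve(self, node_id):
        # A forcible reservation does not fail, even if another node
        # currently holds a valid reservation of this device.
        self.owner = node_id

    def is_reserved_by(self, node_id):
        return self.owner == node_id


def race_for_quorum(node_id, own_qc, other_qc):
    """Return True if node_id acquires a quorum, False if it has lost the race."""
    # Step 1: verify that this node still holds its own quorum controller.
    # If not, another node has already forcibly reserved it.
    if not own_qc.is_reserved_by(node_id):
        return False
    # Step 2: forcibly reserve the quorum controller of the other node.
    other_qc.forcible_reserve(node_id)
    # Step 3: a quorum is acquired only while both reservations are held.
    return own_qc.is_reserved_by(node_id) and other_qc.is_reserved_by(node_id)


def demo():
    # Normal operation: each quorum controller is reserved by its own node.
    qc_a = SharedDevice("qc_a", owner="A")
    qc_b = SharedDevice("qc_b", owner="B")
    # Node A races first; node B never relinquishes qc_b, yet A still succeeds.
    a_wins = race_for_quorum("A", qc_a, qc_b)
    # Node B races afterwards and finds that it no longer holds qc_b.
    b_wins = race_for_quorum("B", qc_b, qc_a)
    return a_wins, b_wins
```

In this sketch, the final check in race_for_quorum is an assumption added to cover the case in which both nodes pass the initial check before either forcibly reserves the other's quorum controller; in that interleaving neither node holds both reservations at once, so neither is treated as having acquired a quorum.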