Disclosed is a mechanism for handling failover of a data management
application for a shared disk file system in a distributed computing
environment having a cluster of loosely coupled nodes which provide
services. According to the mechanism, certain nodes of the cluster are
defined as failover candidate nodes. Configuration information for all
the failover candidate nodes is preferably stored in a central storage.
Message information, including but not limited to failure information of
at least one failover candidate node, is distributed among the failover
candidate nodes. By analyzing the distributed message information and the
stored configuration information, it is determined whether or not a
failover candidate node is to take over the service of a failed node.
After a failover candidate node has taken over a service, the
configuration information is updated in the central storage.
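
The decision step described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the names CentralStore, FailureMessage, decide_takeover and take_over are hypothetical, and the rule that the first healthy candidate in a fixed priority order takes over is an assumption made only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CentralStore:
    """Stand-in for the central storage (hypothetical layout)."""
    owner: dict[str, str]    # service name -> node currently providing it
    candidates: list[str]    # failover candidate nodes, in priority order (assumption)


@dataclass(frozen=True)
class FailureMessage:
    """Failure information distributed among the candidate nodes."""
    failed_node: str


def decide_takeover(me: str, store: CentralStore,
                    messages: list[FailureMessage]) -> list[str]:
    """Return the services that candidate `me` should take over."""
    failed = {m.failed_node for m in messages}
    healthy = [n for n in store.candidates if n not in failed]
    # Assumed rule: only the first healthy candidate takes over.
    if not healthy or healthy[0] != me:
        return []
    return sorted(s for s, n in store.owner.items() if n in failed)


def take_over(me: str, store: CentralStore,
              messages: list[FailureMessage]) -> list[str]:
    """Take over the selected services and update the central configuration."""
    services = decide_takeover(me, store, messages)
    for service in services:
        store.owner[service] = me
    return services


store = CentralStore(owner={"dm": "node1"},
                     candidates=["node1", "node2", "node3"])
messages = [FailureMessage("node1")]
print(take_over("node3", store, messages))  # [] (node2 has priority)
print(take_over("node2", store, messages))  # ['dm']
print(store.owner)                          # {'dm': 'node2'}
```

Every candidate evaluates the same messages against the same stored configuration, so in this sketch all candidates reach the same decision without further coordination. A real cluster would also need to guard the update of the central storage against concurrent writers.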