A method for monitoring a plurality of servers in a cluster and taking
corrective action for the servers. A request is sent to one of the
servers. Then, a determination is made as to whether the one server
successfully handled the request and how long the one server took to
handle it. If a response is received indicating that the one server
successfully handled the request, but it took the one server longer than
a predetermined time period to handle the request, a dispatcher for the
one server is notified to reduce, but not eliminate, a workload of the
one server. A number is also specified: the number of consecutive
requests that can be sent to a server, without the server handling each
request within a specified time period, before the server is considered
down. A request is sent to one of the servers, and a determination is
made that the one server did not successfully handle the request within
the specified time period. A determination is made that the number has
not yet been attained, and therefore no corrective action is taken. A
subsequent request is sent to the one server, and a determination is made
that the one server did not successfully handle the subsequent request
within the specified time period. A determination is made that the number
has now been attained, and therefore corrective action is taken.
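
The two decisions described above (reducing load on a slow but working server, and taking corrective action only after a set number of consecutive missed requests) can be illustrated with a minimal Python sketch. This is not the claimed implementation. The class name `ClusterMonitor`, the `notify` callback standing in for the dispatcher, the action strings, and the parameters `slow_after` and `down_after` are all hypothetical names chosen for illustration. The sketch also assumes that the caller has already timed the request and passes `ok=False` whenever the server failed or did not answer within the specified time period.

```python
from typing import Callable, Dict


class ClusterMonitor:
    """Tracks probe results per server and notifies a dispatcher callback."""

    def __init__(self, slow_after: float, down_after: int,
                 notify: Callable[[str, str], None]) -> None:
        self.slow_after = slow_after      # "predetermined time period", in seconds
        self.down_after = down_after      # consecutive misses that indicate "down"
        self.notify = notify              # hypothetical dispatcher hook (assumption)
        self.misses: Dict[str, int] = {}  # consecutive misses per server

    def record(self, server: str, ok: bool, elapsed: float) -> str:
        """Record one probe result and return the action taken.

        `ok` is False when the server failed or did not answer within the
        specified time period (the caller is assumed to enforce the timeout).
        """
        if not ok:
            count = self.misses.get(server, 0) + 1
            self.misses[server] = count
            if count < self.down_after:
                # The number has not yet been attained: do nothing.
                return "no_action"
            # The number has been attained: the server is treated as down.
            self.notify(server, "corrective_action")
            return "corrective_action"

        # A handled request breaks the run of consecutive misses.
        self.misses[server] = 0
        if elapsed > self.slow_after:
            # Handled, but too slowly: reduce (not eliminate) the workload.
            self.notify(server, "reduce_load")
            return "reduce_load"
        return "healthy"


# Usage: collect dispatcher notifications in a list.
events = []
monitor = ClusterMonitor(slow_after=0.5, down_after=2,
                         notify=lambda server, action: events.append((server, action)))
monitor.record("server-1", ok=True, elapsed=0.9)   # slow success
monitor.record("server-1", ok=False, elapsed=2.0)  # first miss
monitor.record("server-1", ok=False, elapsed=2.0)  # second consecutive miss
```

In this sketch a successful response resets the miss counter, which is one reading of "consecutive"; what the corrective action consists of is left entirely to the `notify` callback.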