I/O requests from hosts in a data storage system are blocked or
rate-restricted upon detection of an unbalanced or overload condition in
order to prevent timeouts by host computers and to achieve an aggregate
reduction in data access latency. The blockages are generally of short
duration and are transparent to hosts, so that host timeouts are
unlikely to occur. During these transitory suspensions of new I/O requests,
server queues shorten, after which I/O requests are again enabled.
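
The block-then-re-enable cycle described above can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the source: the `AdmissionGate` class, the use of queue length as the overload signal, and the two watermark thresholds are assumptions chosen only to show the idea. In particular, the sketch signals a blocked request by returning `False`, whereas the text describes blockages that are transparent to hosts, so an actual system would presumably defer the request rather than reject it.

```python
from collections import deque


class AdmissionGate:
    """Hypothetical sketch: suspend new I/O while a server queue is overloaded."""

    def __init__(self, high_watermark, low_watermark):
        # Assumption: overload is detected purely from queue length.
        if low_watermark >= high_watermark:
            raise ValueError("low_watermark must be below high_watermark")
        self.high = high_watermark
        self.low = low_watermark
        self.queue = deque()
        self.blocked = False

    def submit(self, request):
        """Return True if the request was queued, False if new I/O is suspended."""
        if self.blocked:
            return False
        self.queue.append(request)
        # Reaching the high watermark suspends further new requests.
        if len(self.queue) >= self.high:
            self.blocked = True
        return True

    def complete_one(self):
        """Service the oldest queued request; re-enable I/O once the queue is short."""
        if not self.queue:
            return None
        request = self.queue.popleft()
        # The queue has drained to the low watermark, so new requests are enabled again.
        if self.blocked and len(self.queue) <= self.low:
            self.blocked = False
        return request
```

Using two thresholds rather than one is a design choice of this sketch: it keeps the gate from toggling on every request when the queue length hovers near a single limit.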