A web site provides services to users over the Internet. End-user
service-level objectives (SLOs) such as availability and performance are
measured and reported to an SLO agent. User requests pass through several
tiers at the web site, such as a firewall tier, a web-server tier, an
application-server tier, and a database-server tier. Each tier has several
redundant service components that can process requests for that tier.
Local agents, working with any local resource managers present, monitor running
service components and report to a service agent. Node monitors also
monitor network-node status and report to the service agent. When a node
or service component fails, the service agent attempts to restart it using
the local agent, or replicates the service component to other nodes. When
the SLO agent determines that an SLO is not being met, it instructs the
service agent to replicate more of the constraining service components or
increase resources.
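
The recovery and scaling behavior described above can be illustrated with a minimal Python sketch. It is not the implementation described here, only one possible reading of it. All class and method names (Component, LocalAgent, ServiceAgent, SLOAgent, on_failure, replicate, check) are hypothetical, as are the use of a single latency threshold as the SLO and the "least-loaded healthy node" placement rule. The constraining tier is passed in by the caller, because how it is identified is not stated in this section.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Component:
    """One redundant service component running on a node (hypothetical model)."""
    tier: str
    node: str
    running: bool = True


@dataclass
class LocalAgent:
    """Hypothetical per-node agent that can try to restart a component."""
    node: str
    node_up: bool = True

    def restart(self, comp: Component) -> bool:
        # Assumption: a restart succeeds only if the node itself is still up.
        comp.running = self.node_up
        return comp.running


@dataclass
class ServiceAgent:
    """Receives failure reports and restarts or replicates components."""
    local_agents: Dict[str, LocalAgent]
    components: List[Component] = field(default_factory=list)

    def replicate(self, tier: str, exclude: str = "") -> Optional[Component]:
        # Assumed placement rule: the healthy node with the fewest running components.
        candidates = [a.node for a in self.local_agents.values()
                      if a.node_up and a.node != exclude]
        if not candidates:
            return None

        def load(node: str) -> int:
            return sum(1 for c in self.components if c.node == node and c.running)

        comp = Component(tier, min(candidates, key=load))
        self.components.append(comp)
        return comp

    def on_failure(self, comp: Component) -> str:
        # First try a restart through the local agent, then fall back to replication.
        comp.running = False
        if self.local_agents[comp.node].restart(comp):
            return "restarted"
        if self.replicate(comp.tier, exclude=comp.node) is not None:
            return "replicated"
        return "unrecovered"


class SLOAgent:
    """Compares a measured value against an objective and requests more replicas."""

    def __init__(self, service_agent: ServiceAgent, max_latency_ms: float):
        self.service_agent = service_agent
        self.max_latency_ms = max_latency_ms  # assumed form of the SLO

    def check(self, latency_ms: float, constraining_tier: str) -> bool:
        # Returns True only when the SLO was missed and a replica was added.
        if latency_ms <= self.max_latency_ms:
            return False
        return self.service_agent.replicate(constraining_tier) is not None
```

In this sketch a failed component on a healthy node is restarted in place, a component on a failed node is replicated to another node, and a missed latency objective adds one replica to the tier named by the caller. Increasing resources, the other remedy mentioned above, is not modeled.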