A massively scalable architecture provides a self-monitoring and
self-correcting storage system capable of handling hundreds of
millions of users and tens of billions of files. The system includes one
or more clusters storing data elements that are received from a plurality
of clients. Each cluster comprises a plurality of storage servers. The
storage system facilitates the addition of new storage servers, and the
fast recovery of failed storage servers, by logging system transactions in
multiple journals of different lengths. When a storage server fails, a
cluster backup determines the time of failure and replays one of the
journals to bring the failed storage server up to date.
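
The journal-based recovery described above can be illustrated with a minimal Python sketch. The abstract does not say how a journal is chosen or what a journal entry contains, so everything here is an assumption made for illustration: the names `Journal`, `choose_journal`, and `recover` are hypothetical, a journal's "length" is modeled as a retention window in seconds, and the selection policy (pick the shortest journal that still reaches back to the time of failure) is one plausible reading, not the method the source specifies.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical entry layout: (timestamp, key, value).
Entry = Tuple[float, str, bytes]


@dataclass
class Journal:
    """A transaction log that keeps entries for a bounded time window."""
    retention: float  # seconds of history kept (the journal's "length")
    entries: List[Entry] = field(default_factory=list)

    def append(self, timestamp: float, key: str, value: bytes) -> None:
        self.entries.append((timestamp, key, value))
        # Drop entries that have aged out of this journal's window.
        cutoff = timestamp - self.retention
        self.entries = [e for e in self.entries if e[0] >= cutoff]

    def covers(self, since: float, now: float) -> bool:
        """True if the window still reaches back to time `since`."""
        return now - since <= self.retention


def choose_journal(journals: List[Journal], failure_time: float,
                   now: float) -> Optional[Journal]:
    """Assumed policy: the shortest journal that covers the outage."""
    candidates = [j for j in journals if j.covers(failure_time, now)]
    return min(candidates, key=lambda j: j.retention, default=None)


def recover(server_state: Dict[str, bytes], journals: List[Journal],
            failure_time: float, now: float) -> bool:
    """Replay the transactions the failed server missed.

    Returns False when no journal reaches back far enough, in which
    case some other recovery path (not modeled here) would be needed.
    """
    journal = choose_journal(journals, failure_time, now)
    if journal is None:
        return False
    for timestamp, key, value in journal.entries:
        if timestamp >= failure_time:
            server_state[key] = value
    return True
```

Under these assumptions, a short outage is repaired from the short journal, which holds few entries and is quick to scan, while a longer outage falls back to a longer journal. That is one way multiple journals of different lengths could support fast recovery.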