An archive cluster application runs in a distributed manner across a
redundant array of independent nodes. Each node preferably runs a
complete archive cluster application instance. A given node provides a
data repository, which stores up to a large amount (e.g., a terabyte) of
data, while also acting as a portal that enables access to archive files.
Each symmetric node has a set of software processes, e.g., a request
manager, a storage manager, a metadata manager, and a policy manager. The
request manager manages requests to the node for data (i.e., file data),
the storage manager manages data read/write operations to and from a disk
associated with the node, and the metadata manager facilitates metadata
transactions and recovery across the distributed database. The policy
manager implements one or more policies, which are operations that
determine the behavior of an "archive object" within the cluster. The
archive cluster application provides object-based storage. Preferably,
the application permanently associates metadata and policies with the raw
archived data; together, these constitute an archive object. Object policies
govern the object's behavior in the archive. As a result, the archive
manages itself independently of client applications, acting automatically
to ensure that all object policies are valid.
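
By way of illustration only, the relationship between an archive object and the policy manager described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the actual implementation: the names ArchiveObject, PolicyManager, Node, min_replicas, and the "replicas" metadata key are all hypothetical, a policy is modeled as a simple predicate over the object, and an in-memory dictionary stands in for the storage manager and its disk.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical policy type: a predicate that reports whether the policy
# currently holds for a given archive object.
Policy = Callable[["ArchiveObject"], bool]


@dataclass(frozen=True)
class ArchiveObject:
    """Raw archived data bound to its metadata and policies.

    frozen=True prevents rebinding the fields, loosely modeling the
    permanent association described in the text.
    """
    data: bytes
    metadata: Dict[str, str]
    policies: Tuple[Tuple[str, Policy], ...]  # (policy name, predicate) pairs


def min_replicas(required: int) -> Policy:
    """Illustrative policy: the object must have at least `required` copies.

    Assumes the replica count is recorded under a 'replicas' metadata key.
    """
    def check(obj: ArchiveObject) -> bool:
        return int(obj.metadata.get("replicas", "0")) >= required
    return check


class PolicyManager:
    """Evaluates the policies carried by each archive object."""

    def violations(self, obj: ArchiveObject) -> List[str]:
        # Names of the policies that do not currently hold for this object.
        return [name for name, policy in obj.policies if not policy(obj)]


class Node:
    """A single symmetric node; only storage and policy roles are sketched."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, ArchiveObject] = {}  # stand-in for disk storage
        self.policy_manager = PolicyManager()

    def put(self, key: str, obj: ArchiveObject) -> None:
        self.store[key] = obj

    def get(self, key: str) -> ArchiveObject:
        return self.store[key]

    def audit(self) -> Dict[str, List[str]]:
        """Map each stored key to its violated policies, omitting clean objects."""
        report = {}
        for key, obj in self.store.items():
            failed = self.policy_manager.violations(obj)
            if failed:
                report[key] = failed
        return report


node = Node("node-1")
node.put(
    "report.pdf",
    ArchiveObject(
        data=b"raw bytes",
        metadata={"replicas": "1"},
        policies=(("min_replicas", min_replicas(2)),),
    ),
)
print(node.audit())
```

In this sketch the audit runs without any involvement from a client application, which mirrors the self-managing behavior described above. A fuller system would presumably act on a reported violation (for example, by creating another replica) rather than merely listing it.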