An electronic mail system with a methodology providing distributed message
storage and processing is described. In particular, this methodology
breaks up how the individual components of message data are stored.
Message data itself is broken up into two parts: a metadata (mutable)
portion, and an immutable portion. The metadata portion represents that
part of the message data that may change over time. This includes message
status flags (e.g., the IMAP "message deleted" flag) and the message's
position within a particular message folder, among other information. The
immutable portion, which comprises the bulk of electronic mail data
(namely, the message itself), once stored is never edited. Immutable data
is written f+1 times on as many unique servers, to tolerate f number of
server failures using Lampson's stable storage algorithm. The metadata
portion is stored 2f+1 times on as many unique servers to tolerate f
number of server failures using quorum voting. Once the message has been
stored once, instead of being copied, its location is passed around by
reference. The system utilizes a two-tier architecture. One tier consists
of servers which store message metadata and immutable data, the Data
Servers, and servers that operating upon those data, the Access Servers.
Message store integrity is maintained in the event of server failure and
as the set of Data Servers changes. In the latter case, I/O and storage
workloads are dynamically redistributed across Data Servers in an
efficient way.