Disclosed are a system, a method, and a computer program product to efficiently
create consistent transaction sets to maintain one or more copies of data at different
data storage sites. All transactions sent to a primary backup appliance during
a consistent transaction set creation interval are formed into a consistent transaction
set by efficiently adding new transactions as they are received and removing unnecessary
transfers as newer versions arrive. When the creation interval has expired, the
complete consistent transaction set is transferred to a secondary backup appliance
to be used to update a consistent backup copy of the primary site data. For each
consistent transaction set, there will be a tree data structure (a search tree)
created that contains the addressing information for all of the blocks of data
in the consistent transaction set. The tree data structure used is a modified splay
tree, which is a specialization of a binary search tree such that accessed nodes
are "percolated" to the top of the tree for faster subsequent access. Secondary
data consistency is maintained because the consistent transaction sets are applied
whole at the secondary site, and after application, the secondary volumes are exact
copies of the primary volumes at the time the consistent transaction set was completed.