The disclosed technology facilitates recovery from storage-related
failures by checkpointing copy-on-write operation sequences. An operation
sequence that incorporates such checkpoints into a copy-on-write
operation can include
the following: receive a write request that identifies payload data to be
written to a first data store, read original data associated with the
first data store, copy the original data to a second data store, record
transactional information associated with the write request, generate a
first checkpoint to confirm the successful recordation of the
transactional information and the successful copying of the original data
to the second data store, write the payload data to the first data store,
acknowledge a successful completion of the copy-on-write operation
sequence, and generate a second checkpoint that confirms the successful
completion of such operation sequence. The first and second checkpoints
are used to form a pre-failure representation of one or more storage
units (or parts thereof). The checkpoints can be stored with other
transactional information to facilitate recovery in the event of a
failure, and can also enable optimizations in the processing of I/O
operations.
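
The operation sequence described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the class `CowVolume`, its in-memory dictionaries, the journal entry names `checkpoint1` and `checkpoint2`, and the `crash_after_payload` flag are all hypothetical names introduced for illustration. The recovery policy shown (rolling back any write that has a first checkpoint but no second checkpoint) is one assumed way of forming a pre-failure representation, not necessarily the one intended by the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class CowVolume:
    """Toy in-memory model of a checkpointed copy-on-write sequence."""
    primary: dict = field(default_factory=dict)    # first data store: block -> data
    preserved: dict = field(default_factory=dict)  # second data store: txn_id -> (block, original)
    journal: list = field(default_factory=list)    # transactional records and checkpoints

    def write(self, txn_id, block, payload, crash_after_payload=False):
        original = self.primary.get(block)            # read the original data
        self.preserved[txn_id] = (block, original)    # copy it to the second data store
        self.journal.append(("txn", txn_id, block))   # record transactional information
        self.journal.append(("checkpoint1", txn_id))  # first checkpoint: record and copy succeeded
        self.primary[block] = payload                 # write the payload to the first data store
        if crash_after_payload:                       # simulated failure, for illustration only
            raise RuntimeError("simulated storage failure")
        acknowledged = True                           # acknowledge successful completion
        self.journal.append(("checkpoint2", txn_id))  # second checkpoint: sequence completed
        return acknowledged

    def incomplete(self):
        """Return transactions that reached checkpoint 1 but not checkpoint 2."""
        first = {entry[1] for entry in self.journal if entry[0] == "checkpoint1"}
        second = {entry[1] for entry in self.journal if entry[0] == "checkpoint2"}
        return first - second

    def recover(self):
        """Assumed policy: restore preserved originals for incomplete writes."""
        for txn_id in self.incomplete():
            block, original = self.preserved[txn_id]
            if original is None:
                self.primary.pop(block, None)         # block did not exist before the write
            else:
                self.primary[block] = original


vol = CowVolume(primary={"b0": b"old"})
vol.write(1, "b0", b"new")                            # completes; both checkpoints recorded
try:
    vol.write(2, "b0", b"bad", crash_after_payload=True)
except RuntimeError:
    pass                                              # failure between the two checkpoints
vol.recover()                                         # block "b0" is restored to b"new"
```

In this sketch the gap between the two checkpoints is what marks a write as in doubt after a failure. The sketch assumes at most one incomplete write per block at recovery time; handling several would require replaying the journal in order.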