A method for computing matching windows for delta compression. The method
computes pairs of matching source and target data segments irrespective
of target data segment size or file offset. The method includes (1)
representing a large source data file by a sequence of fixed-size
segments; (2) computing a signature for each data segment from its
contents such that, with high probability, two segments are identical
if their signatures match; (3) parsing the target data using a prefix
matching method against the sequence of source-data signatures to
compute matching sequences of segments; and (4) merging closely matched
segments as necessary to form matching windows.
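
The four steps can be illustrated with a minimal sketch. It is not the claimed implementation: the segment size SEG, the truncated BLAKE2b signature, the greedy longest-run search standing in for the prefix matching method, and the gap threshold used for merging are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import hashlib
from collections import defaultdict

SEG = 4  # hypothetical segment size in bytes (tiny, for illustration only)

def signature(block):
    # Assumed signature: a truncated BLAKE2b digest of the segment contents.
    return hashlib.blake2b(block, digest_size=8).digest()

def source_signatures(source, seg=SEG):
    # Steps (1)-(2): one signature per full fixed-size source segment.
    return [signature(source[i:i + seg])
            for i in range(0, len(source) - seg + 1, seg)]

def find_matches(source, target, seg=SEG):
    # Step (3): slide over the target one byte at a time, so matches do not
    # depend on target offset; on a hit, extend along the source signature
    # sequence as far as consecutive signatures keep matching.
    sigs = source_signatures(source, seg)
    index = defaultdict(list)
    for k, s in enumerate(sigs):
        index[s].append(k)
    matches = []  # (source_offset, target_offset, length) triples
    t = 0
    while t + seg <= len(target):
        best_k, best_n = -1, 0
        for k in index.get(signature(target[t:t + seg]), ()):
            n = 1
            while (k + n < len(sigs)
                   and t + (n + 1) * seg <= len(target)
                   and signature(target[t + n * seg:t + (n + 1) * seg]) == sigs[k + n]):
                n += 1
            if n > best_n:
                best_k, best_n = k, n
        if best_n:
            matches.append((best_k * seg, t, best_n * seg))
            t += best_n * seg
        else:
            t += 1
    return matches

def merge_windows(matches, gap=SEG):
    # Step (4): merge matches whose source and target ranges both follow the
    # previous window within `gap` bytes (an assumed notion of "closely matched").
    windows = []  # (src_start, src_end, tgt_start, tgt_end)
    for s, t, n in matches:
        if windows:
            ps, pe, pt, pte = windows[-1]
            if 0 <= s - pe <= gap and 0 <= t - pte <= gap:
                windows[-1] = (ps, s + n, pt, t + n)
                continue
        windows.append((s, s + n, t, t + n))
    return windows

source = b"AAAABBBBCCCCDDDDEEEE"
target = b"xyBBBBCCCC!DDDDEEEEz"
matches = find_matches(source, target)
windows = merge_windows(matches)
```

In this example the target contains two runs of source segments separated by a single inserted byte; the sketch finds both runs and merges them into one window covering source bytes 4 to 20 and target bytes 2 to 19. A signature match only indicates equality with high probability, so a real encoder would still verify the bytes inside each window.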