A system and method for unorchestrated determination of data sequences
using "sticky byte" factoring to determine breakpoints in digital
sequences such that common sequences can be identified. Sticky byte
factoring provides an efficient method of dividing a data set into pieces
that generally yields near-optimal commonality. This is effectuated by
employing a rolling hashsum and, in an exemplary embodiment disclosed
herein, a threshold function to deterministically set divisions in a
sequence of data. Both the rolling hash and the threshold function are
designed to require minimal computation. This low overhead makes it
possible to rapidly partition a data sequence for presentation to a
factoring engine or other applications that prefer subsequent
synchronization across the data set.
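
The partitioning described above can be illustrated with a minimal sketch. It is not the disclosed embodiment: the shift-and-add rolling hash, the SHA-256-derived byte table, the 32-bit hash width, the target average piece size, and the names find_breakpoints and split_at are all assumptions chosen for illustration. The sketch sets a breakpoint wherever the rolling hash falls below a fixed threshold, so the divisions depend only on nearby content and not on absolute position in the sequence.

```python
import hashlib

# Hypothetical parameters, chosen only for illustration.
HASH_BITS = 32
HASH_MASK = (1 << HASH_BITS) - 1
AVG_PIECE = 64                              # assumed target average piece size
THRESHOLD = (1 << HASH_BITS) // AVG_PIECE   # break when hash < THRESHOLD

# Fixed per-byte table so that independent parties derive the same
# breakpoints without coordinating. Deriving it from SHA-256 is an
# assumption, not something taken from the source text.
BYTE_TABLE = [
    int.from_bytes(hashlib.sha256(bytes([b])).digest()[:4], "big")
    for b in range(256)
]


def find_breakpoints(data):
    """Return the offsets just past each byte where the threshold test fires."""
    breakpoints = []
    h = 0
    for i, byte in enumerate(data):
        # Shift-and-add rolling hash: each byte's contribution is shifted
        # left once per step and falls out of the 32-bit mask after 32
        # steps, so h depends only on the most recent 32 bytes.
        h = ((h << 1) + BYTE_TABLE[byte]) & HASH_MASK
        # Threshold function: a cheap comparison decides the division.
        if h < THRESHOLD:
            breakpoints.append(i + 1)
    return breakpoints


def split_at(data, breakpoints):
    """Cut data into pieces at the given offsets."""
    pieces = []
    start = 0
    for bp in breakpoints:
        pieces.append(data[start:bp])
        start = bp
    if start < len(data):
        pieces.append(data[start:])
    return pieces
```

Because the hash in this sketch covers only the last 32 bytes, inserting bytes at the front of a sequence shifts the later breakpoints by the insertion length without otherwise changing them, which is what allows common pieces to be recognized again. A practical partitioner would likely also enforce minimum and maximum piece sizes, which this sketch omits.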