A method of discovering one or more patterns in two sequences of symbols
S1 and S2 includes the formation, for each sequence, of a
master offset table that groups for each symbol the position in the
sequence occupied by each occurrence of that symbol. The difference in
position between each occurrence of a symbol in one of the sequences and
each occurrence of that same symbol in the other sequence is determined
and a Pattern Map is formed. For each given value of a difference in
position the Pattern Map lists the position in the first sequence of each
symbol therein that appears in the second sequence at that difference in
position. The collection of the symbols tabulated for each value of
difference in position thereby defines a parent pattern in the first
sequence that is repeated in the second sequence. A computer readable
medium having instructions for controlling a computer system to perform
the method and a computer readable medium containing a data structure
used in the practice of the method are also disclosed.
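
The steps described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the function names (`master_offset_table`, `pattern_map`, `parent_pattern`), the use of zero-based positions, and the sign convention (position in the second sequence minus position in the first) are assumptions made only for illustration.

```python
from collections import defaultdict


def master_offset_table(seq):
    """Group, for each symbol, the positions at which it occurs in seq.

    Positions are zero-based here (an assumption of this sketch).
    """
    table = defaultdict(list)
    for pos, sym in enumerate(seq):
        table[sym].append(pos)
    return dict(table)


def pattern_map(s1, s2):
    """Map each difference in position to the positions in s1 whose
    symbol also appears in s2 at that difference.

    The difference is taken as (position in s2) - (position in s1),
    which is an assumed sign convention.
    """
    mot1 = master_offset_table(s1)
    mot2 = master_offset_table(s2)
    pmap = defaultdict(list)
    for sym, positions1 in mot1.items():
        # Symbols absent from s2 contribute nothing to the map.
        for p1 in positions1:
            for p2 in mot2.get(sym, ()):
                pmap[p2 - p1].append(p1)
    return {diff: sorted(positions) for diff, positions in pmap.items()}


def parent_pattern(s1, positions):
    """Collect the symbols of s1 tabulated for one difference value."""
    return "".join(s1[p] for p in positions)


if __name__ == "__main__":
    s1, s2 = "ABCA", "CABC"
    pmap = pattern_map(s1, s2)
    for diff, positions in sorted(pmap.items()):
        print(diff, positions, parent_pattern(s1, positions))
```

In this sketch each key of the returned dictionary is one value of difference in position, and the symbols at the listed positions of the first sequence form the parent pattern for that difference. The nested loops compare every occurrence of a symbol in one sequence with every occurrence in the other, so the cost grows with the product of the occurrence counts per symbol.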