A method and system for detecting and counting bytecode sequences in a data processing
system are provided. A bytecode tree data structure is used to represent sequences
of bytecodes. A bytecode sequence is a run of consecutive bytecodes within a larger
set of bytecodes. The bytecode tree data structure contains a set of nodes in which
each node represents a bytecode in a bytecode sequence or subsequence and in which
a path through the bytecode tree data structure represents a bytecode sequence
or subsequence. Each node of the bytecode tree data structure records one or more
bytecode occurrence statistics for its corresponding bytecode in a set of bytecode
sequences or subsequences. In order to determine the frequency of occurrence of
common bytecode sequences and subsequences, a bytecode sequence tree data structure
is generated from a set of bytecode sequences. The bytecode sequence tree data
structure is then convolved into a bytecode subsequence occurrence tree data structure,
which is a union of all subtrees of the bytecode sequence tree data structure.
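
The two-stage construction described above can be illustrated with a minimal sketch. It is not the claimed implementation: the names Node, build_sequence_tree, merge_subtree, build_subsequence_tree and count_of are hypothetical, the only statistic recorded per node is a single occurrence count, and "union of all subtrees" is interpreted here as re-rooting the subtree below every node of the sequence tree at a common root and summing the counts of coinciding paths. The JVM mnemonics in the example data are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One bytecode on a path; count is the occurrence statistic for that path."""
    count: int = 0
    children: dict = field(default_factory=dict)


def build_sequence_tree(sequences):
    """Build a prefix tree: each root-to-node path is a sequence prefix."""
    root = Node()
    for seq in sequences:
        node = root
        for bytecode in seq:
            node = node.children.setdefault(bytecode, Node())
            node.count += 1  # number of input sequences sharing this prefix
    return root


def merge_subtree(dst, src):
    """Add the counts of everything below src onto the matching paths below dst."""
    for bytecode, child in src.children.items():
        target = dst.children.setdefault(bytecode, Node())
        target.count += child.count
        merge_subtree(target, child)


def build_subsequence_tree(seq_root):
    """Union the subtree below every node of the sequence tree into one tree."""
    result = Node()
    stack = [seq_root]
    while stack:
        node = stack.pop()
        merge_subtree(result, node)  # re-root this node's subtree at result
        stack.extend(node.children.values())
    return result


def count_of(root, seq):
    """Return the count stored at the end of path seq, or 0 if the path is absent."""
    node = root
    for bytecode in seq:
        node = node.children.get(bytecode)
        if node is None:
            return 0
    return node.count


# Illustrative input: two short bytecode sequences.
sequences = [
    ["iload", "iload", "iadd"],
    ["iload", "iadd", "istore"],
]
sequence_tree = build_sequence_tree(sequences)
subsequence_tree = build_subsequence_tree(sequence_tree)
pair_count = count_of(subsequence_tree, ["iload", "iadd"])
```

Under this interpretation, a path in the sequence tree counts only sequences that begin with it, while the same path in the subsequence occurrence tree counts every position at which that run of bytecodes appears in the input. In the example, the pair iload, iadd begins only one input sequence but occurs once in each of the two, so its count rises from 1 to 2 after the union.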