Data destined for a client is compressed at a server in a manner that produces
a compressed data string that can be searched in its compressed state. The server
constructs a code table that assigns normally unused codes from a standard code set
(e.g., the ASCII code set) to selected character pairs in the data string
(e.g., the most frequently occurring character pairs). During compression, the
selected character pairs are replaced with the corresponding codes. Identifiers
are inserted into the compressed data string to separate substrings. To search
the compressed data string at the client, a search query is compressed and compared
to the compressed substrings. The substring identifiers are used to quickly locate
each successive compressed substring. When a match is found, the matching substring
is decompressed by replacing each code in the compressed substring with its corresponding
character pair from the code table.
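
The scheme summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation, and it rests on several assumptions that the abstract does not state: the input is 7-bit ASCII, so code points 0x80 through 0xFF are treated as the "normally unused" codes; a newline serves as the substring identifier; pairs are replaced greedily from left to right; and a search matches whole substrings only. All function and constant names are hypothetical.

```python
from collections import Counter

# Assumption: input is 7-bit ASCII, so code points 0x80-0xFF are free to use as pair codes.
UNUSED_CODES = [chr(c) for c in range(0x80, 0x100)]
# Assumption: a newline acts as the identifier that separates substrings.
SEPARATOR = "\n"


def build_code_table(substrings, max_codes=len(UNUSED_CODES)):
    """Assign unused codes to the most frequently occurring character pairs."""
    counts = Counter()
    for s in substrings:
        counts.update(s[i:i + 2] for i in range(len(s) - 1))
    pairs = [pair for pair, _ in counts.most_common(max_codes)]
    return dict(zip(pairs, UNUSED_CODES))


def compress(text, table):
    """Replace selected pairs with their codes, scanning greedily left to right."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in table:
            out.append(table[pair])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def decompress(data, table):
    """Replace each code with its character pair; other characters pass through."""
    reverse = {code: pair for pair, code in table.items()}
    return "".join(reverse.get(ch, ch) for ch in data)


def compress_records(substrings, table):
    """Compress each substring and join the results with the separator."""
    return SEPARATOR.join(compress(s, table) for s in substrings)


def search(compressed, query, table):
    """Compress the query, compare it to each compressed substring,
    and decompress only the substrings that match."""
    target = compress(query, table)
    return [
        decompress(record, table)
        for record in compressed.split(SEPARATOR)
        if record == target
    ]
```

Because the same deterministic routine compresses both the data and the query, two compressed strings are equal exactly when their originals are equal, which is why the comparison can be made without decompressing. Matching a query against only part of a substring would additionally have to deal with pairs that straddle the match boundary, which this sketch does not attempt.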