Method and apparatus for processing data sequences for the purpose of data
integrity assessment, data ownership demonstration and data
authentication. An input data sequence is processed to extract what could
be called naturally occurring digital watermarks. More appropriately,
these naturally occurring digital watermarks are referred to herein as
fileprints. Fileprints are data sequences that are extracted from
the input data sequence in a repeatable manner and that are, with high
probability, unique to that input. In this sense, a fileprint can be used
to identify the data sequence it came from, just as a human fingerprint
can identify the particular person to whom it belongs. Because the
fileprints have not been embedded, they are not actually digital
watermarks. However, fileprints can be used in much the same way as digital
watermarks for information protection.
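
The idea of repeatable extraction can be illustrated with a minimal sketch. The abstract does not specify how a fileprint is extracted, so the rule used here (reading bytes at positions chosen by a pseudorandom generator seeded with a secret key) is purely an assumption for illustration. The names `extract_fileprint`, `matches`, `key`, and `length` are hypothetical.

```python
import random


def extract_fileprint(data: bytes, key: int, length: int = 16) -> bytes:
    """Return bytes of `data` read at positions chosen by a PRNG seeded with `key`.

    Hypothetical illustration: this position-sampling rule is an assumption,
    not the extraction method described by the source.
    """
    count = min(length, len(data))
    # The same key and the same input length always yield the same positions,
    # which is what makes the extraction repeatable.
    positions = sorted(random.Random(key).sample(range(len(data)), count))
    return bytes(data[i] for i in positions)


def matches(data: bytes, key: int, fileprint: bytes) -> bool:
    """Re-extract with the same key and compare against a stored fileprint."""
    return extract_fileprint(data, key, len(fileprint)) == fileprint
```

Nothing is embedded in the input: the fileprint is only read out of it, and anyone holding the key can re-extract and compare. Note that this simple sampling rule detects a modification only if it touches one of the sampled positions, so a practical scheme would need a stronger extraction rule.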