A hash-optimized backup system and method takes data blocks and generates
a probabilistically unique digital fingerprint of the content of each
data block using a substantially collision-free algorithm. The process
compares the generated fingerprint to a database of stored fingerprints
and, if the generated fingerprint matches a stored fingerprint,
determines that the data block has already been backed up and therefore
does not need to be backed up again. Only if the generated fingerprint does
not match a stored fingerprint is the data block backed up, at which
point the generated fingerprint is added to the database of stored
fingerprints. Because the algorithm is substantially collision-free,
there is no need to compare actual data content if there is a hash-value
match. The process can also be used to audit software license compliance,
inventory software, and detect computer-file tampering, such as that
caused by viruses and other malware.
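
The fingerprint-and-compare flow described above can be illustrated with a minimal sketch. It makes several assumptions that are not stated in the text: SHA-256 stands in for the "substantially collision-free algorithm", an in-memory set stands in for the database of stored fingerprints, and the names `fingerprint`, `BackupStore`, `backup_stream`, and the 4096-byte block size are hypothetical choices made only for illustration.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    """Return a digital fingerprint of a block's content.

    SHA-256 is an assumption; the text only requires a substantially
    collision-free algorithm.
    """
    return hashlib.sha256(block).hexdigest()


class BackupStore:
    """Hypothetical in-memory stand-in for the backup and fingerprint database."""

    def __init__(self) -> None:
        self.fingerprints: set[str] = set()   # database of stored fingerprints
        self.blocks: dict[str, bytes] = {}    # backed-up block contents

    def backup(self, block: bytes) -> bool:
        """Back up a block only if its fingerprint is not already stored.

        Returns True if the block was stored, False if it was skipped.
        """
        fp = fingerprint(block)
        if fp in self.fingerprints:
            # Fingerprint match: treated as already backed up, with no
            # comparison of the actual data content.
            return False
        self.blocks[fp] = block
        self.fingerprints.add(fp)  # record the new fingerprint
        return True


def backup_stream(store: BackupStore, data: bytes, block_size: int = 4096) -> int:
    """Split data into fixed-size blocks and return how many were newly stored."""
    stored = 0
    for offset in range(0, len(data), block_size):
        if store.backup(data[offset:offset + block_size]):
            stored += 1
    return stored


if __name__ == "__main__":
    store = BackupStore()
    # Three blocks, two of which have identical content.
    data = b"A" * 4096 + b"B" * 4096 + b"A" * 4096
    print(backup_stream(store, data))  # first pass over the data
    print(backup_stream(store, data))  # second pass over the same data
```

In this sketch the first pass stores only the two distinct blocks and the second pass stores none, since every fingerprint is already present. The same lookup could in principle serve the auditing and tamper-detection uses: a file whose fingerprint no longer matches its recorded value has changed.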