A method for estimating similarity between two collections of information
is described herein. The method includes comparing a first Bloom filter
representing a first collection of information and a second Bloom filter
representing a second collection of information, and determining a
measure of similarity between the first collection of information and the
second collection of information based on the comparing.
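
As a minimal illustration of the comparing and determining steps, the following Python sketch builds two Bloom filters with identical parameters and scores their similarity as the ratio of bits set in both filters to bits set in either filter. The text above does not specify how the comparison is performed, so this bitwise ratio is an assumption, as are the class name BloomFilter, the function bloom_similarity, the filter size, the number of hash functions, and the SHA-256-based hashing.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; the sizes and hashing scheme are assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    @classmethod
    def from_items(cls, items, num_bits=1024, num_hashes=3):
        bf = cls(num_bits, num_hashes)
        for item in items:
            bf.add(item)
        return bf


def bloom_similarity(first, second):
    """Return a similarity score in [0, 1] from a bitwise comparison.

    Assumed measure: (bits set in both) / (bits set in either).
    Both filters must share the same size and hash count for the
    bit positions to be comparable.
    """
    if (first.num_bits, first.num_hashes) != (second.num_bits, second.num_hashes):
        raise ValueError("filters must use the same parameters")
    either = bin(first.bits | second.bits).count("1")
    if either == 0:
        return 1.0  # two empty filters are treated as identical (a convention)
    both = bin(first.bits & second.bits).count("1")
    return both / either


if __name__ == "__main__":
    a = BloomFilter.from_items(["apple", "banana", "cherry", "date"])
    b = BloomFilter.from_items(["banana", "cherry", "date", "elderberry"])
    print(f"estimated similarity: {bloom_similarity(a, b):.3f}")
```

Because the score is computed over filter bits rather than over the underlying items, it is only an estimate: hash collisions can inflate it, and larger filters reduce that effect at the cost of more storage.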