A method for searching a document collection includes providing an index of terms indicating the documents in which the terms appear. A first statistical distribution of each of at least some of the terms in the index and a second statistical distribution of each of at least some of the categories are estimated over the documents in the collection. A query including one or more of the terms and a category restriction referring to at least one of the categories is accepted. A modified term distribution is produced by operating on the first statistical distribution of at least one of the terms in the query using the second statistical distribution, responsively to the category restriction. The query is applied to the index to return a response, in which occurrences of the at least one of the terms are scored responsively to the modified term distribution.
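The steps above can be illustrated with a minimal Python sketch. It is one plausible reading of the abstract, not the claimed method. It assumes that a term's distribution is its per-document frequency, that a category's distribution is a per-document membership weight, and that the modified term distribution is their elementwise product. The scoring formula (term frequency times a category-local inverse document frequency) and all names (`build_index`, `category_distribution`, `modify`, `search`, `DOCS`, `DOC_CATEGORIES`) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy collection and category labels (illustrative data, not from the source).
DOCS = {
    1: "jaguar speed engine",
    2: "jaguar habitat jungle",
    3: "engine repair manual",
    4: "jungle habitat birds",
}
DOC_CATEGORIES = {1: {"cars"}, 2: {"animals"}, 3: {"cars"}, 4: {"animals"}}


def build_index(docs):
    """Index: term -> {doc_id: term frequency} (the first distribution)."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] += 1
    return index


def category_distribution(doc_categories, category):
    """Second distribution: doc_id -> membership weight for one category."""
    return {d: 1.0 for d, cats in doc_categories.items() if category in cats}


def modify(term_dist, cat_dist):
    """Assumed operation: elementwise product of term and category distributions."""
    return {d: tf * cat_dist[d] for d, tf in term_dist.items() if d in cat_dist}


def search(index, doc_categories, terms, category):
    """Score documents using only the category-modified term distributions."""
    cat_dist = category_distribution(doc_categories, category)
    n_cat = len(cat_dist)
    scores = defaultdict(float)
    for term in terms:
        modified = modify(index.get(term, {}), cat_dist)
        if not modified:
            continue  # term never occurs inside the restricted category
        # Assumed scoring: IDF computed within the category, not the collection.
        idf = math.log(1 + n_cat / len(modified))
        for doc_id, weight in modified.items():
            scores[doc_id] += weight * idf
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    index = build_index(DOCS)
    for doc_id, score in search(index, DOC_CATEGORIES, ["jaguar", "engine"], "cars"):
        print(doc_id, round(score, 3))
```

In this sketch, restricting the query to "cars" makes "jaguar" rarer than "engine" inside the restricted set, so it contributes more to the score than it would under collection-wide statistics. A production system would more likely estimate the modified distribution from stored summary statistics than recompute it per query.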

 