The present invention provides systems and methods that employ
statistical distributional analysis to improve the results returned by a
content search engine. In particular, a substring and/or a string sequence
distributional algorithm can be applied to a set of queries to generate a
distributional characteristic (e.g., a profile) for the set of queries,
wherein the set is selected from a plurality of queries stored in a query
log. Typically, the queries are selected based on a substring of interest
and/or an identification of a user initiating searches. The
distributional characteristic can then be employed to determine a
distributional similarity measure that can be utilized in connection with
a search to improve search results, for example by providing a mechanism to
identify synonymous search terms and spelling corrections/variations and to
facilitate collaborative filtering. Thus, the present invention employs a
novel technique that mines previous queries and uses them to enhance
search results.
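
The general idea of profiling logged queries and comparing the profiles can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names context_profile and cosine_similarity, the toy query log, whole-word matching in place of substring matching, and cosine similarity as the distributional similarity measure are all assumptions made for illustration only.

```python
import math
from collections import Counter


def context_profile(query_log, term):
    """Build a hypothetical distributional profile for `term`: the relative
    frequency of every other word appearing in logged queries that contain it."""
    counts = Counter()
    for query in query_log:
        tokens = query.lower().split()
        if term in tokens:  # assumption: whole-word match, not a raw substring
            counts.update(t for t in tokens if t != term)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def cosine_similarity(p, q):
    """Cosine of the angle between two sparse profiles (an assumed measure)."""
    dot = sum(weight * q.get(word, 0.0) for word, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


# Toy query log, invented for this example.
QUERY_LOG = [
    "cheap laptop deals",
    "cheap notebook deals",
    "laptop battery replacement",
    "notebook battery replacement",
    "chocolate cake recipe",
    "cake decorating ideas",
]

laptop = context_profile(QUERY_LOG, "laptop")
notebook = context_profile(QUERY_LOG, "notebook")
cake = context_profile(QUERY_LOG, "cake")

print(cosine_similarity(laptop, notebook))  # terms used in the same contexts
print(cosine_similarity(laptop, cake))      # terms used in unrelated contexts
```

In this toy log, "laptop" and "notebook" occur with the same surrounding words, so their profiles score as highly similar and the pair is a candidate for synonym expansion, whereas "laptop" and "cake" share no context. The same comparison could in principle be applied to profiles built per user rather than per term, which is one way the collaborative filtering use might be approached.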