A method and structure for searching a database containing hypertext
documents, comprising: searching the database using a query to produce a set
of hypertext documents; and geometrically clustering the set of hypertext
documents into various clusters using a toric k-means similarity measure
such that documents within each cluster are similar to each other, wherein
the clustering has a linear-time complexity in producing the set of
hypertext documents, wherein the similarity measure comprises a weighted
sum of maximized individual components of the set of hypertext documents,
and wherein the clustering is based upon words contained in each hypertext
document, out-links from each hypertext document, and in-links to each
hypertext document.
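
The clustering step described above can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes each hypertext document has already been reduced to a triple of feature vectors (words, out-links, in-links), that each component is scaled to unit length, and that the similarity between a document and a cluster representative is a weighted sum of the three per-component inner products. The names `normalize`, `toric_similarity`, and `toric_kmeans`, the weights, the toy vectors, and the choice to seed the clusters with the first k documents are all hypothetical and chosen only for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def toric_similarity(doc, concept, weights):
    """Weighted sum of per-component inner products (words, out-links, in-links)."""
    return sum(w * dot(d, c) for w, d, c in zip(weights, doc, concept))


def toric_kmeans(docs, k, weights, iterations=20):
    """Cluster triples of unit vectors; returns (labels, concepts).

    Each pass costs time proportional to len(docs) * k * vector length,
    so the work grows linearly with the number of documents.
    """
    # Assumption: seed each cluster representative with one of the first k documents.
    concepts = [tuple(list(comp) for comp in docs[i]) for i in range(k)]
    labels = [-1] * len(docs)
    for _ in range(iterations):
        # Assignment step: each document joins its most similar representative.
        new_labels = [
            max(range(k), key=lambda j: toric_similarity(doc, concepts[j], weights))
            for doc in docs
        ]
        if new_labels == labels:
            break  # assignments are stable, so stop early
        labels = new_labels
        # Update step: for each component separately, the normalized sum of the
        # members' vectors is the unit vector maximizing that component's total
        # inner product with the members.
        for j in range(k):
            members = [doc for doc, lab in zip(docs, labels) if lab == j]
            if not members:
                continue  # keep the previous representative for an empty cluster
            concepts[j] = tuple(
                normalize([sum(col) for col in zip(*(m[c] for m in members))])
                for c in range(len(weights))
            )
    return labels, concepts


# Toy data (hypothetical): (word counts, out-link counts, in-link counts).
raw_docs = [
    ([2, 0, 0], [1, 0], [1, 0]),
    ([0, 0, 3], [0, 1], [0, 1]),
    ([1, 1, 0], [1, 0], [1, 1]),
    ([0, 1, 2], [0, 2], [0, 1]),
]
docs = [tuple(normalize(comp) for comp in doc) for doc in raw_docs]

labels, concepts = toric_kmeans(docs, k=2, weights=(0.5, 0.25, 0.25))
print(labels)  # [0, 1, 0, 1]
```

In this sketch the weights control how much the words, out-links, and in-links each contribute to the grouping; setting a weight to zero ignores that component entirely.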