A system for indexing a document performs acts including: receiving a document to be
processed for inclusion in an index of documents; locating a set of
documents that include hyperlinks to the document; retrieving anchortext
associated with each hyperlink; and parsing the anchortext into one or more
tokens. For each token, the following acts are performed: determining
a weight for the token; determining whether the weight assigned to the
token exceeds a threshold token weight; and indexing the document under
the token if the weight assigned to the token exceeds the threshold
token weight.
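
The acts above can be illustrated with a minimal sketch. It assumes the anchortext strings have already been retrieved from the linking documents, and it uses a hypothetical weight (the fraction of anchortexts that contain the token), since the text does not say how the weight is determined. The names `tokenize`, `token_weights`, `index_by_anchortext`, and the 0.5 threshold are illustrative assumptions, not part of the described system.

```python
import re
from collections import Counter, defaultdict


def tokenize(anchortext):
    """Parse anchortext into lowercase alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", anchortext.lower())


def token_weights(anchortexts):
    """Hypothetical weight: fraction of anchortexts that contain each token."""
    total = len(anchortexts)
    if total == 0:
        return {}
    counts = Counter()
    for text in anchortexts:
        # Count each token at most once per anchortext.
        counts.update(set(tokenize(text)))
    return {token: n / total for token, n in counts.items()}


def index_by_anchortext(doc_id, anchortexts, index, threshold=0.5):
    """Index doc_id under each token whose weight exceeds the threshold."""
    for token, weight in token_weights(anchortexts).items():
        if weight > threshold:
            index[token].add(doc_id)
    return index


# Anchortext taken from hyperlinks that point at the document "doc-1".
index = defaultdict(set)
anchors = ["Python tutorial", "python docs", "click here", "Official Python site"]
index_by_anchortext("doc-1", anchors, index, threshold=0.5)
print(dict(index))
```

With this assumed weighting, generic tokens such as "click" and "here" appear in only one of the four anchortexts and fall below the threshold, so the document is indexed only under "python".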