Nodes of a web graph are distributed over a cluster of computers. Tables
distributed over the computers map each source location to a list of
destination locations, and each destination location to a list of source
locations. To accommodate traversing
hyperlinks forward, a table maps the location of a web page "X" to
locations of all the web pages "X" links to. To accommodate traversing
hyperlinks backward, a table maps the location of a web page "Y" to
locations of all web pages that link to "Y". URLs identifying web pages are
mapped to fixed-size checksums, reducing the storage required for each
node while still providing a way to map a URL to a node. The mapping is
chosen to preserve information about the web server component of the URL.
Nodes can
then be partitioned across the machines in the cluster such that nodes
corresponding to URLs on the same web server are assigned to the same
machine in the cluster.
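
The arrangement described above can be illustrated with a minimal Python sketch. It is not the implementation itself, only one way the pieces might fit together under stated assumptions: the checksum is 64 bits wide, its high 24 bits are derived from the host name alone and its low 40 bits from the whole URL, SHA-256 stands in for whatever checksum function is actually used, the checksum doubles as the node's location, and each machine's tables are modeled as in-process dictionaries. The names `url_checksum`, `machine_for`, `WebGraph`, `HOST_BITS`, `PATH_BITS` and `NUM_MACHINES` are all hypothetical.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit

# Assumed layout of the fixed-size checksum (not taken from the source):
# high HOST_BITS come from the host name, low PATH_BITS from the full URL.
HOST_BITS = 24
PATH_BITS = 40
NUM_MACHINES = 4  # hypothetical cluster size


def _hash_bits(text, bits):
    """Return the top `bits` bits of a 64-bit prefix of SHA-256(text)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") >> (64 - bits)


def url_checksum(url):
    """Map a URL to a 64-bit checksum whose high bits depend only on the host."""
    host = (urlsplit(url).hostname or "").lower()
    return (_hash_bits(host, HOST_BITS) << PATH_BITS) | _hash_bits(url, PATH_BITS)


def machine_for(checksum):
    """Pick a machine from the host bits, so one web server maps to one machine."""
    return (checksum >> PATH_BITS) % NUM_MACHINES


class WebGraph:
    """Forward and backward link tables, one pair per simulated machine."""

    def __init__(self):
        self.forward = [defaultdict(list) for _ in range(NUM_MACHINES)]
        self.backward = [defaultdict(list) for _ in range(NUM_MACHINES)]

    def add_link(self, src_url, dst_url):
        src, dst = url_checksum(src_url), url_checksum(dst_url)
        # The forward entry lives on the source's machine,
        # the backward entry on the destination's machine.
        self.forward[machine_for(src)][src].append(dst)
        self.backward[machine_for(dst)][dst].append(src)

    def links_from(self, url):
        """Checksums of all pages that `url` links to."""
        node = url_checksum(url)
        return self.forward[machine_for(node)].get(node, [])

    def links_to(self, url):
        """Checksums of all pages that link to `url`."""
        node = url_checksum(url)
        return self.backward[machine_for(node)].get(node, [])
```

Because `machine_for` looks only at the host-derived bits, every page on a given web server lands on the same simulated machine, so links between pages of one site can be followed without leaving that machine. In a real cluster the per-machine dictionaries would be replaced by tables stored on separate computers.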