A method of detecting web pages belonging to at least one similarity class
from a plurality of web pages includes determining clusters of the
plurality of web pages based on characteristics of the content of the web
pages. For each of the determined clusters, at least one metric indicative
of similarity among the resource locators associated with the web pages of
that cluster is determined. The web pages belonging to the at least one
similarity class are then determined based on the determined clusters and
the determined similarity metrics.
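
The following Python sketch illustrates one way the three steps could fit together. It is not the claimed implementation: the word-set content features, the greedy clustering, the tokenized-URL Jaccard metric, both thresholds, and all function names are assumptions made for illustration. In particular, the sketch treats high resource-locator similarity within a cluster as the signal for class membership, whereas the text above says only that the determination is based on the clusters and the metrics.

```python
from itertools import combinations
from urllib.parse import urlparse


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def content_features(text):
    """Assumed content characteristic: the set of lowercase words."""
    return set(text.lower().split())


def cluster_by_content(pages, threshold=0.6):
    """Greedy clustering: a page joins the first cluster whose first member
    is at least `threshold` similar in content, else starts a new cluster."""
    clusters = []
    for url, text in pages.items():
        feats = content_features(text)
        for cluster in clusters:
            if jaccard(feats, content_features(pages[cluster[0]])) >= threshold:
                cluster.append(url)
                break
        else:
            clusters.append([url])
    return clusters


def url_tokens(url):
    """Host plus path segments, with all-digit segments collapsed to <num>."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    return {parsed.netloc} | {"<num>" if s.isdigit() else s for s in segments}


def url_similarity(cluster):
    """Assumed metric: mean pairwise Jaccard similarity of URL tokens."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(url_tokens(a), url_tokens(b)) for a, b in pairs) / len(pairs)


def detect_similarity_class(pages, content_threshold=0.6, url_threshold=0.5):
    """Return URLs of multi-page clusters whose URL metric meets the threshold."""
    members = []
    for cluster in cluster_by_content(pages, content_threshold):
        if len(cluster) > 1 and url_similarity(cluster) >= url_threshold:
            members.extend(cluster)
    return members


if __name__ == "__main__":
    # Made-up example data: page URL -> page text.
    example_pages = {
        "http://shop.example/item/1": "buy widget now best price free shipping",
        "http://shop.example/item/2": "buy gadget now best price free shipping",
        "http://news.example/story/today": "local council approves new park budget",
    }
    print(detect_similarity_class(example_pages))
    # prints ['http://shop.example/item/1', 'http://shop.example/item/2']
```

In this sketch the two shop pages cluster together on content and also share a URL pattern, so they are reported; the news page forms a singleton cluster and is ignored. A cluster whose pages have similar content but unrelated resource locators would score low on the URL metric and be excluded under the assumed threshold.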