A method and system for determining relatedness of images of pages based
on link and page layout analysis. A link analysis system determines
relatedness between images by first identifying blocks within web pages
and then analyzing the importance of blocks to web pages, of web pages
to blocks, and of images to blocks. Based on this analysis, the link
analysis system determines the degree to which each image is related to
each other image. The link analysis system may also use the relatedness
of images to generate a ranking of the images. The link analysis system
may also generate a vector representation of the images based on their
relatedness and apply a clustering algorithm to the vector
representations to identify clusters of related images.
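
The analysis described above can be illustrated with a minimal sketch. The abstract does not give the formulas, so the following is an assumption rather than the claimed method: the three importance relationships are stored as small matrices (page-to-block, block-to-page, and block-to-image), block-to-block relatedness is taken as the product of the block-to-page and page-to-block matrices, and image-to-image relatedness is obtained by projecting that result through the block-to-image matrix. All function and parameter names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def image_relatedness(page_to_block, block_to_page, block_to_image):
    """Sketch of image-to-image relatedness (assumed formulation).

    page_to_block:  pages x blocks, importance of each block within its page
    block_to_page:  blocks x pages, importance of each page a block links to
    block_to_image: blocks x images, importance of each image within a block
    """
    # Block-to-block relatedness: follow a block's links to pages, then
    # spread that weight over the blocks of the linked pages.
    block_to_block = matmul(block_to_page, page_to_block)
    # Image-to-image relatedness: carry block relatedness over to the
    # images the blocks contain.
    return matmul(matmul(transpose(block_to_image), block_to_block),
                  block_to_image)


# Toy data: page 0 holds blocks 0 and 1, page 1 holds block 2.
page_to_block = [[0.5, 0.5, 0], [0, 0, 1]]
# Block 0 links to page 1, block 1 has no links, block 2 links to page 0.
block_to_page = [[0, 1], [0, 0], [1, 0]]
# Block 0 contains image 0, block 2 contains image 1.
block_to_image = [[1, 0], [0, 0], [0, 1]]

relatedness = image_relatedness(page_to_block, block_to_page, block_to_image)
```

Each row of the resulting matrix could serve as the vector representation of one image for ranking or clustering, although the abstract does not specify how those vectors are actually derived.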