A system and method for identifying crawl paths of a web crawl operation,
where each crawl path represents successive uniform resource locator
(URL) nodes in a parent/child relationship. One or more seed URLs are
identified for the web crawl operation, each seed URL defining an
origination of at least one crawl path. A set of attributes of each
parent URL in each crawl path is identified to be inherited by one or
more child URLs found in the web crawl operation. Then, each child URL
is associated with the set of attributes identified for all parent URLs
in the crawl path.
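
The inheritance of attributes along a crawl path can be illustrated with a minimal sketch. It is not taken from the described system: the `UrlNode` class, its field names, the example URLs, and the attribute keys are all hypothetical. The sketch also assumes that when two ancestors define the same attribute, the one nearer to the child takes precedence; the text above does not specify a conflict rule.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class UrlNode:
    """One URL node in a crawl path (hypothetical structure for illustration)."""
    url: str
    parent: Optional["UrlNode"] = None  # None marks a seed URL
    own_attributes: Dict[str, str] = field(default_factory=dict)

    def crawl_path(self) -> List[str]:
        """Return the URLs from the seed down to this node."""
        path = []
        node: Optional[UrlNode] = self
        while node is not None:
            path.append(node.url)
            node = node.parent
        path.reverse()
        return path

    def inherited_attributes(self) -> Dict[str, str]:
        """Merge the attributes of all parent URLs in the crawl path."""
        ancestors = []
        node = self.parent
        while node is not None:
            ancestors.append(node)
            node = node.parent
        merged: Dict[str, str] = {}
        # Apply from the seed downward so nearer ancestors override
        # farther ones (an assumed policy, not stated in the source).
        for ancestor in reversed(ancestors):
            merged.update(ancestor.own_attributes)
        return merged


# Hypothetical crawl path: seed -> section -> article
seed = UrlNode("http://example.com/",
               own_attributes={"language": "en", "topic": "news"})
section = UrlNode("http://example.com/sports", parent=seed,
                  own_attributes={"topic": "sports"})
article = UrlNode("http://example.com/sports/story1", parent=section)

print(article.crawl_path())
print(article.inherited_attributes())
```

Here the article node defines no attributes of its own, yet it is associated with `language` from the seed URL and `topic` from its immediate parent, which is the association of a child URL with the attributes of all parent URLs in its crawl path.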