A selection system and method. The selection method comprises receiving,
by a computing system, a taxonomy of data related to a specified domain
of knowledge on the web. A taxonomy tree is constructed from the
taxonomy. A sub-tree related to a sub-domain of the specified domain is
selected from the taxonomy tree. A first list comprising user-expected
uniform resource locators (URLs) related to the sub-domain is received.
A second list comprising topic expressions defining each node of the
taxonomy sub-tree is generated. A query based on the second list is
generated. The query is applied on an index of URLs generated from a web
crawling process to generate a third list. A recall value is calculated
based on the first list and the third list.
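
The final steps of the method can be illustrated with a minimal sketch. The abstract does not state how the query is formed or how the recall value is computed, so the sketch assumes a query that is a simple disjunction of topic expressions, an in-memory dictionary standing in for the crawled URL index, and the conventional definition of recall (expected URLs that were retrieved, divided by all expected URLs). The function names, example URLs, and page texts are hypothetical.

```python
from typing import Dict, Iterable, List


def build_query(topic_expressions: Iterable[str]) -> List[str]:
    """Assumption: the query is a disjunction of lower-cased topic expressions."""
    return sorted({expr.lower() for expr in topic_expressions})


def apply_query(query: List[str], index: Dict[str, str]) -> List[str]:
    """Return URLs whose indexed text contains at least one query term.

    The index is a hypothetical stand-in for the crawl-generated index:
    a mapping from URL to page text.
    """
    return [
        url
        for url, text in index.items()
        if any(term in text.lower() for term in query)
    ]


def recall(expected_urls: Iterable[str], retrieved_urls: Iterable[str]) -> float:
    """Conventional recall: |expected AND retrieved| / |expected| (an assumption)."""
    expected = set(expected_urls)
    if not expected:
        raise ValueError("expected URL list must not be empty")
    return len(expected & set(retrieved_urls)) / len(expected)


# Second list: topic expressions for the nodes of the sub-tree (hypothetical).
second_list = ["digital camera", "camera lens"]

# Hypothetical index of crawled pages.
index = {
    "http://example.com/cameras": "Reviews of every digital camera released this year",
    "http://example.com/lenses": "Choosing a camera lens for portraits",
    "http://example.com/phones": "Smartphone buying guide",
    "http://example.com/tripods": "Tripods and mounts",
}

# First list: URLs the user expects for the sub-domain (hypothetical).
first_list = [
    "http://example.com/cameras",
    "http://example.com/lenses",
    "http://example.com/tripods",
]

# Third list: result of applying the query to the index.
third_list = apply_query(build_query(second_list), index)
recall_value = recall(first_list, third_list)
print(third_list, recall_value)
```

In this sketch a recall value below 1 indicates that some user-expected URLs were not retrieved, which would suggest that the topic expressions do not yet cover the sub-domain.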