The present invention relates to a system and methodology that apply
automated learning procedures for determining document relevance and
assisting information retrieval activities. A system is provided that
facilitates a machine-learned approach to determine document relevance.
The system includes a storage component that receives a set of
human-selected items to be employed as positive test cases of highly
relevant documents. A training component trains at least one classifier
with the human-selected items as positive test cases and one or more
other items
as negative test cases in order to provide a query-independent model,
wherein the other items can be selected, for example, by a statistical
search. The trained classifier can also be employed to aid an individual
in identifying and selecting new positive cases, or utilized to filter or
re-rank results from a statistical search.
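
The training and re-ranking flow described above can be illustrated with a minimal sketch. It is not the claimed implementation: the choice of a naive Bayes classifier over whitespace tokens, the names NaiveBayesRelevance, log_odds, and rerank, and the sample documents are all assumptions made for illustration. Human-selected items serve as positive cases, items standing in for the output of a statistical search serve as negative cases, and the resulting query-independent score is used to re-rank a result list.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an illustrative simplification)."""
    return text.lower().split()


class NaiveBayesRelevance:
    """Hypothetical two-class multinomial naive Bayes with add-one smoothing.

    The model scores a document without reference to any query, so it is
    query-independent in the sense used above.
    """

    def fit(self, positives, negatives):
        # Label 1: human-selected items. Label 0: other items, for example
        # drawn from the results of a statistical search.
        self.counts = {1: Counter(), 0: Counter()}
        for label, docs in ((1, positives), (0, negatives)):
            for doc in docs:
                self.counts[label].update(tokenize(doc))
        self.vocab = set(self.counts[1]) | set(self.counts[0])
        self.totals = {c: sum(self.counts[c].values()) for c in (0, 1)}
        n = len(positives) + len(negatives)
        self.log_prior = {
            1: math.log(len(positives) / n),
            0: math.log(len(negatives) / n),
        }
        return self

    def log_odds(self, doc):
        """Log-odds that doc is relevant; positive means 'more likely relevant'."""
        score = self.log_prior[1] - self.log_prior[0]
        v = len(self.vocab)
        for tok in tokenize(doc):
            if tok not in self.vocab:
                continue  # tokens never seen in training carry no evidence
            p1 = (self.counts[1][tok] + 1) / (self.totals[1] + v)
            p0 = (self.counts[0][tok] + 1) / (self.totals[0] + v)
            score += math.log(p1 / p0)
        return score


def rerank(model, results):
    """Re-order search results by the classifier's relevance score."""
    return sorted(results, key=model.log_odds, reverse=True)


# Made-up sample data, for illustration only.
positives = [
    "official reference manual with complete api documentation",
    "authoritative tutorial with worked examples and documentation",
]
negatives = [
    "buy cheap pills click here now",
    "click here to win free prizes now",
]
model = NaiveBayesRelevance().fit(positives, negatives)

results = [
    "click here for free documentation prizes",
    "complete reference tutorial with examples",
]
ranked = rerank(model, results)
```

The same score could also be thresholded to filter results, or applied to unlabeled documents to suggest candidate positive cases for a human to review. Any classifier that produces a relevance score could stand in for the naive Bayes model in this sketch.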