A method for extracting information from a corpus of data includes
specifying a topic and a query term associated with the topic, and
defining adjunct terms which may occur in the corpus in a context of the
query term, the adjunct terms comprising one or more off-topic terms.
Occurrences of the query term are found in the corpus, the occurrences
including at least one occurrence of the query term together with at
least one of the off-topic terms in the context of the query term. The at
least one occurrence of the query term is classified as non-relevant to
the topic in response to the occurrence of the at least one of the
off-topic terms in the context of the query term.
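
The classification step described above can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes that the "context" of the query term is a fixed window of tokens on either side of it, that terms are single words matched case-insensitively, and that the presence of any off-topic term in the window is enough to mark the occurrence as non-relevant. The function name `classify_occurrences`, the `window` parameter, and the sample corpus are hypothetical and chosen only for illustration.

```python
import re


def classify_occurrences(corpus, query_term, off_topic_terms, window=5):
    """Label each occurrence of query_term in corpus.

    Returns a list of (doc_index, token_index, label) tuples, where label is
    "non-relevant" if any off-topic term appears within `window` tokens of
    the occurrence, and "relevant" otherwise.

    Assumption: "context" means a symmetric token window; the source text
    does not define it.
    """
    off_topic = {term.lower() for term in off_topic_terms}
    query = query_term.lower()
    results = []
    for doc_index, text in enumerate(corpus):
        # Simple word tokenization; real systems may tokenize differently.
        tokens = re.findall(r"\w+", text.lower())
        for i, token in enumerate(tokens):
            if token != query:
                continue
            # Tokens before and after the occurrence, excluding the term itself.
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            if off_topic.intersection(context):
                label = "non-relevant"
            else:
                label = "relevant"
            results.append((doc_index, i, label))
    return results


# Hypothetical example: the topic is the animal, the query term is "jaguar",
# and the off-topic adjunct terms belong to the automotive sense of the word.
corpus = [
    "The jaguar stalked its prey through the rainforest at night.",
    "The new Jaguar sedan has a quieter engine and leather seats.",
]
results = classify_occurrences(corpus, "jaguar", ["engine", "sedan", "dealer"])
for doc_index, token_index, label in results:
    print(doc_index, token_index, label)
```

In this sketch the first document's occurrence is kept as relevant, while the second is classified as non-relevant because "sedan" and "engine" fall inside the context window. A wider or narrower window, or a weighted rather than all-or-nothing decision, would be equally consistent with the description.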