Systems and techniques for generating a term taxonomy for a collection of
documents and filling the taxonomy with documents from the collection. In
general, in one implementation, the technique includes: extracting terms
from a plurality of documents; generating term pairs from the terms;
ranking terms in each term pair based on a relative specificity of the
terms; aggregating the ranks of the terms in each term pair; selecting
term pairs based on the aggregate rankings; and generating a term
hierarchy from the selected term pairs.
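
The steps listed above can be illustrated with a minimal sketch. It is not the implementation described here, only one possible reading under stated assumptions: terms are lowercase whitespace tokens, term pairs are terms that co-occur in a document, and relative specificity is judged by two stand-in measures (document frequency and number of distinct co-occurring terms). The names `extract_terms`, `build_hierarchy`, and `min_score` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def extract_terms(doc):
    # Assumption: lowercase whitespace tokens stand in for real term extraction.
    return set(doc.lower().split())


def sign(x):
    return (x > 0) - (x < 0)


def build_hierarchy(docs, min_score=1):
    doc_freq = defaultdict(int)   # number of documents containing each term
    neighbors = defaultdict(set)  # distinct terms each term co-occurs with
    pairs = set()                 # candidate term pairs (co-occurring terms)
    for terms in map(extract_terms, docs):
        for term in terms:
            doc_freq[term] += 1
        for a, b in combinations(sorted(terms), 2):
            pairs.add((a, b))
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Rank the two terms of each pair under each (assumed) specificity
    # measure, then aggregate the per-measure ranks into one signed score.
    # A positive score means `a` looks more general than `b`.
    candidates = defaultdict(list)  # more specific term -> possible parents
    for a, b in pairs:
        score = (sign(doc_freq[a] - doc_freq[b])
                 + sign(len(neighbors[a]) - len(neighbors[b])))
        # Select only pairs whose aggregate ranking is decisive enough.
        if abs(score) >= min_score:
            parent, child = (a, b) if score > 0 else (b, a)
            candidates[child].append(parent)

    # Build the hierarchy: attach each term to its most specific candidate
    # parent (lowest document frequency, ties broken alphabetically).
    hierarchy = defaultdict(list)
    for child, parents in candidates.items():
        parent = min(parents, key=lambda p: (doc_freq[p], p))
        hierarchy[parent].append(child)
    return {parent: sorted(children) for parent, children in hierarchy.items()}


docs = ["animal dog", "animal cat", "animal dog poodle", "animal bird"]
print(build_hierarchy(docs))
# {'animal': ['bird', 'cat', 'dog'], 'dog': ['poodle']}  (key order may vary)
```

In this sketch a pair is kept only when neither measure contradicts the other, so the selected pairs form a strict partial order and the resulting hierarchy cannot contain cycles. Other specificity measures or aggregation rules could be substituted without changing the overall shape of the pipeline.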