A method and apparatus for identifying the focus of a document in a
natural language processing application, the natural language processing
application comprising a hierarchical concept tree having a plurality of
nodes, each node being associated with a term, the method comprising the
steps of: mapping an input document to nodes in the concept tree to
determine the number of occurrences in the input document of each term
that also occurs at a node in the concept tree; weighting each node in the
concept tree, depending on the determined number of occurrences of the
term in the input document and a determined value assigned to each node
in the concept tree; traversing the concept tree to identify a most
heavily weighted path, in dependence on the weighting of each node in the
concept tree; and determining the focus of the input document by identifying
a node having the heaviest weight along the most heavily weighted path.
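
The steps above can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several details are assumptions that the abstract does not specify: a node's weight is taken to be its occurrence count multiplied by its assigned value, a path is taken to run from the root to a leaf, a path's weight is the sum of its node weights, and terms are single lowercase words. The names `Node`, `weight_tree`, `heaviest_path` and `find_focus` are hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    term: str
    value: float = 1.0                 # assumed per-node value (hypothetical)
    children: list = field(default_factory=list)
    weight: float = 0.0                # filled in by weight_tree()


def iter_nodes(node):
    """Yield every node of the tree, parents before children."""
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def weight_tree(root, text):
    """Map the document onto the tree and weight each node.

    Assumption: weight = occurrences of the node's term * node value.
    """
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    for node in iter_nodes(root):
        node.weight = counts[node.term.lower()] * node.value


def heaviest_path(node):
    """Return (total weight, nodes) of the heaviest root-to-leaf path.

    Assumption: a path's weight is the sum of its node weights.
    """
    if not node.children:
        return node.weight, [node]
    total, path = max(
        (heaviest_path(child) for child in node.children),
        key=lambda result: result[0],
    )
    return node.weight + total, [node] + path


def find_focus(root, text):
    """Return the heaviest node along the heaviest path."""
    weight_tree(root, text)
    _, path = heaviest_path(root)
    return max(path, key=lambda n: n.weight)


# Toy concept tree and document, invented for illustration only.
root = Node("animal", children=[
    Node("dog", children=[Node("poodle"), Node("terrier")]),
    Node("cat", children=[Node("siamese")]),
])
text = "The poodle is a dog. My dog likes another dog, not a cat."

focus = find_focus(root, text)
print(focus.term)  # dog
```

In this toy case the path animal, dog, poodle carries the most weight (0 + 3 + 1), and "dog" is its heaviest node, so it is reported as the focus. A different weighting function or path definition would fit the abstract equally well and could yield a different result.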