A system and method for producing semantically rich representations of
texts to amplify and sharpen the interpretation of those texts. The method
relies on the fact that there is a substantial amount of semantic content
associated with most text strings that is not explicit in those strings,
or in the mere statistical co-occurrence of the strings with other
strings, but which is nevertheless extremely relevant to the text. This
additional information is used both to sharpen the representation
derived directly from the text string and to augment that
representation with content that, while not explicitly mentioned in the
string, is implicit in the text and, if made explicit, can
support text processing applications including
document indexing and retrieval, document classification, document
routing, document summarization, and document tagging. These enhancements
may be used to support downstream processing, such as automated document
reading and understanding, online advertising placement, electronic
commerce, corporate knowledge management, and business and government
intelligence applications.
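
The idea of augmenting and sharpening a text representation can be illustrated with a minimal sketch. It is not the claimed method: the `IMPLIED_CONCEPTS` table, the concept labels, the `enrich` function, and the `min_support` threshold are all hypothetical, chosen only to show how content that is implicit in a string might be made explicit. Here, a concept is kept only when at least `min_support` distinct terms in the text imply it, which adds unstated content while filtering out senses that the rest of the text does not support.

```python
from collections import Counter

# Hypothetical background knowledge: each surface term maps to concepts it
# may imply. These entries are invented for illustration only.
IMPLIED_CONCEPTS = {
    "jaguar": ["animal:big_cat", "brand:automobile"],
    "engine": ["domain:automotive", "brand:automobile"],
    "sedan": ["domain:automotive", "brand:automobile"],
}


def explicit_representation(text):
    """Bag-of-words counts taken directly from the text string."""
    return Counter(text.lower().split())


def enrich(text, min_support=2):
    """Return the explicit term counts plus the implied concepts.

    A concept is retained only if at least `min_support` distinct terms
    in the text imply it (an assumed, simplistic sharpening rule).
    """
    explicit = explicit_representation(text)

    # Count how many distinct terms point at each concept.
    support = Counter()
    for term in explicit:
        for concept in IMPLIED_CONCEPTS.get(term, []):
            support[concept] += 1

    # Keep only the concepts with enough supporting terms.
    implicit = {c: n for c, n in support.items() if n >= min_support}
    return {"explicit": dict(explicit), "implicit": implicit}


result = enrich("Jaguar sedan engine")
```

For the string "Jaguar sedan engine", the automotive concepts are each implied by two or more terms and are added to the representation, while the big-cat sense of "jaguar" has only one supporting term and is dropped. A downstream indexer or classifier could then use the `implicit` entries alongside the explicit terms.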