A semantic discovery and exploration system is disclosed that provides an
environment enabling a developer or user to uncover, navigate, and
organize semantic patterns and structures in a document collection, with
or without the aid of structured knowledge. The semantic discovery and
exploration system provides techniques for searching document
collections, categorizing documents, inducing lists of related concepts,
and identifying clusters of related terms and documents. This system
operates both without and with infusions of structured knowledge such as
gazetteers, thesauruses, taxonomies and ontologies. System performance
improves when structured knowledge is incorporated. The semantic
discovery and exploration system may be used as a first step in
developing an information extraction system, for example to categorize or
cluster documents in a particular domain or to develop gazetteers, and as
a part of a deployed run-time information extraction system. It may also
be used as a standalone utility for searching, navigating, and organizing
document collections and structured knowledge bases such as dictionaries
or domain-specific reference works.
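
By way of illustration only, inducing a list of related terms from a document collection, optionally seeded from structured knowledge, might be sketched as follows. The co-occurrence vectors, the cosine measure, and the names `cooccurrence_vectors`, `cosine`, and `related_terms` are assumptions chosen for this sketch; the disclosure does not specify them, and the seed set merely stands in for entries drawn from a gazetteer.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vectors(docs):
    """Map each term to a Counter of the terms that share a document with it."""
    vectors = {}
    for doc in docs:
        terms = set(doc.lower().split())  # naive whitespace tokenization (assumption)
        for term in terms:
            vec = vectors.setdefault(term, Counter())
            for other in terms:
                if other != term:
                    vec[other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def related_terms(seeds, vectors, top_n=3):
    """Rank non-seed terms by similarity to the combined seed profile.

    `seeds` may be a single query term (no structured knowledge) or a set
    of entries taken from a hypothetical gazetteer (structured knowledge).
    """
    profile = Counter()
    for seed in seeds:
        profile.update(vectors.get(seed, {}))
    scores = {t: cosine(profile, v) for t, v in vectors.items() if t not in seeds}
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


if __name__ == "__main__":
    docs = [
        "paris is a city in france",
        "berlin is a city in germany",
        "the cat sat on the mat",
    ]
    vectors = cooccurrence_vectors(docs)
    print(related_terms({"paris"}, vectors))
```

In this sketch, supplying several seeds from a gazetteer simply sums their context vectors into one profile, which is one plausible way that structured knowledge could sharpen the induced list.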