Corpus analysis methods have previously been applied to text, typically to
annotated text. The invention shows how to apply corpus analysis methods
to information captured in databases whose columns mix structured
domains with unstructured domains containing text. It uses case-based
methods to automatically organize cases for
periodic review. The invention can help identify opportunities for
increasing knowledge about databases. By organizing a database around
common lexical, semantic, pragmatic and syntactic relationships, the
invention can be used to increase the effectiveness of existing corpus
analysis methods and to apply them to a wide range of commercial
applications. The invention applies contextual constraints to focus the
application of linguistic methods. The invention can provide a component
for medical records, enterprise databases, information retrieval,
question answering systems, interactive robots, interactive appliances,
linguistically competent speech recognition, speech understanding and
many other useful devices and applications that require a high level of
linguistic competence within operational contexts.
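
As an illustration only, the idea of using a structured column as a
contextual constraint on the analysis of a text column might be sketched
as follows. This is not the invention's method: the records, the column
names "department" and "note", and the helper functions are all
hypothetical, and simple word counting stands in for whatever corpus
analysis method would actually be applied.

```python
from collections import Counter, defaultdict
import re

# Hypothetical rows: "department" is a structured column, "note" is free text.
RECORDS = [
    {"department": "cardiology", "note": "Patient reports chest pain on exertion."},
    {"department": "cardiology", "note": "Chest pain resolved after rest."},
    {"department": "orthopedics", "note": "Knee pain after fall; swelling noted."},
]


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts_by_context(records, context_column, text_column):
    """Group rows by a structured column and count words in the text column.

    The structured value acts as the context, so each context gets its own
    small word-frequency profile instead of one profile for the whole table.
    """
    counts = defaultdict(Counter)
    for row in records:
        counts[row[context_column]].update(tokenize(row[text_column]))
    return dict(counts)


profiles = term_counts_by_context(RECORDS, "department", "note")
for context, counter in profiles.items():
    print(context, counter.most_common(3))
```

Grouping by the structured value before counting keeps each vocabulary
profile tied to one operational context, which is the sense in which a
contextual constraint can focus a linguistic method in this sketch.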