A method automatically determines groups of words or phrases that are descriptive names of a small set of documents, as well as infers concepts in the small set of documents that are more general and more specific than the descriptive names, without any prior knowledge of the hierarchy or the concepts, in a language independent manner. The descriptive names and the concepts may not even be explicitly contained in the documents. The primary application of the invention is for searching of the World Wide Web, but the invention is not limited solely to use with the World Wide Web and may be applied to any set of documents. Classes of features are identified in order to promote understanding of a set of documents. Preferably, there are three classes of features. "Self" features or terms describe the cluster as a whole. "Parent" features or terms describe more general concepts. "Child" features or terms describe specializations of the cluster. The self features can be used as a recommended name for a cluster, while parents and children can be used to place the clusters in the space of a larger collection. Parent features suggest a more general concept, while children features suggest concepts that describe a specialization of the self feature(s). Automatic discovery of parent, self and child features is useful for several purposes including automatic labeling of web directories and improving information retrieval.

 
Web www.patentalert.com

> System and method for navigating patient medical information

~ 00317