A system for enhancing web-based searching is provided. Categorizing and
clustering techniques are used to optimize searching. Businesses are
classified using a control group of predetermined categories. The
predetermined categories may be SIC codes or headings that are used to
describe business activities. The website address for a business listed
in the control group is determined, and the content of the business's
website is extracted. The extracted content is associated with the
predetermined category that the business is classified under. The
extracted content is used to further enhance the overall classification
scheme. The system may compare and match the extracted content with
content of other businesses' websites that are similarly categorized. If
a relevant keyword match is identified in several of the websites, the
keyword may be used to update the classification scheme. A new category
or sub-category can be created based on this keyword. Furthermore, when a
search is performed, the search results are organized by these
categories and, using various processes, the most common results are kept
and the less relevant results are discarded.
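
The keyword-matching step described above can be illustrated with a minimal sketch. It assumes the website content has already been extracted as plain text and grouped by predetermined category; the names `tokenize`, `propose_subcategories`, `STOPWORDS`, and `min_sites`, as well as the sample data, are hypothetical and do not come from the system described here. A keyword that appears on at least `min_sites` websites in the same category is proposed as a candidate sub-category.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would likely use a larger one.
STOPWORDS = {"the", "and", "a", "of", "for", "in", "to", "we", "our"}


def tokenize(text):
    """Lowercase the text and return its set of distinct non-stopword terms."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def propose_subcategories(sites_by_category, min_sites=2):
    """For each category, list keywords found on at least min_sites websites.

    sites_by_category maps a category name to a list of extracted website
    texts, one string per business website (an assumed input shape).
    """
    proposals = {}
    for category, pages in sites_by_category.items():
        counts = Counter()
        for page_text in pages:
            # Each website contributes a term at most once, so the count is
            # the number of websites that contain the term.
            counts.update(tokenize(page_text))
        # Skip terms that merely repeat the category name itself.
        category_terms = tokenize(category)
        proposals[category] = sorted(
            term
            for term, n in counts.items()
            if n >= min_sites and term not in category_terms
        )
    return proposals


# Made-up sample data for illustration only.
sample = {
    "Plumbing": [
        "Emergency plumbing and drain cleaning",
        "Drain repair for homes",
        "We install water heaters",
    ],
}
result = propose_subcategories(sample)
print(result)
```

Counting each keyword once per website, rather than once per occurrence, keeps a single keyword-heavy website from driving a change to the classification scheme; this is one possible design choice, not necessarily the one the system uses.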