Data files (205) are categorised to facilitate searching for
information. An analysis is performed to identify items that may be
considered high value without having been directly
specified. Occurrences of unspecified candidate items are identified (207)
in contexts for a preferred specified category. Occurrences of unspecified
candidate items are identified (209) in non-preferred contexts. For
each candidate item, the preferred occurrences are processed (211)
together with the non-preferred occurrences in order to select candidate
items as high value items. In the preferred embodiment, data relating to
companies is identified without specific company names being defined.
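
The identification (207, 209) and processing (211) steps can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the source: the preferred context is the hypothetical phrase "shares in X", any capitalised word counts as a candidate occurrence, and a candidate is selected when the share of its occurrences that fall in preferred contexts reaches a threshold. The names `PREFERRED`, `ANY_CONTEXT`, `select_high_value`, `min_ratio` and `min_preferred` are illustrative only.

```python
import re
from collections import Counter

# Hypothetical preferred context for a "company" category (an assumption,
# not defined in the source): a capitalised word following "shares in".
PREFERRED = [re.compile(r"\bshares in ([A-Z][a-z]+)")]
# Assumption: any capitalised word is an occurrence of a candidate item.
ANY_CONTEXT = re.compile(r"\b([A-Z][a-z]+)\b")


def select_high_value(texts, min_ratio=0.5, min_preferred=2):
    """Select candidates seen mostly in preferred contexts.

    The ratio rule is one possible way to process preferred against
    non-preferred occurrences; the source does not specify the rule.
    """
    preferred = Counter()
    total = Counter()
    for text in texts:
        for pattern in PREFERRED:
            preferred.update(pattern.findall(text))   # step 207
        total.update(ANY_CONTEXT.findall(text))       # all occurrences

    selected = []
    for item, pref_count in preferred.items():
        non_pref_count = total[item] - pref_count     # step 209
        ratio = pref_count / (pref_count + non_pref_count)
        if pref_count >= min_preferred and ratio >= min_ratio:  # step 211
            selected.append(item)
    return sorted(selected)
```

Under these assumptions, a word such as "Monday" that sometimes follows "shares in" but mostly appears elsewhere is rejected, while a name that appears predominantly in the preferred context is selected, without any company name having been listed in advance.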