The invention relates to indexing of digitized entities in a large and
comparatively unstructured data collection, for instance the Internet,
such that text-based searches with respect to the data collection can be
ordered via a user client terminal. Index information is generated for
each digitized entity and contains distinctive features that are ranked
according to a rank parameter. The rank parameter indicates the degree of
relevance of a particular distinctive feature with respect to a given
digitized entity and is derived from fields or tags associated with one
or more copies of the digitized entity in the data collection. The index
information is stored in a searchable database, which is accessible via a
user client interface and a search engine. The derived distinctive
features and the rank parameter thus make it possible to carry out
text-based searches for non-text digitized entities, such as images,
audio files and video sequences, and to obtain a highly relevant
search result.
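
The indexing and search steps described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The function names `build_index` and `search`, the input layout (one list of tags per copy of an entity) and the rank formula (the fraction of copies whose tags mention a term) are all assumptions made for illustration; the text states only that the rank parameter is derived from fields or tags associated with one or more copies of the entity.

```python
from collections import Counter, defaultdict


def build_index(copies):
    """Build a term -> {entity_id: rank} index.

    `copies` maps a hypothetical entity id to a list of tag lists, one
    list per copy of the entity found in the data collection.
    Assumed rank formula: fraction of copies whose tags contain the term.
    """
    index = defaultdict(dict)
    for entity_id, tag_lists in copies.items():
        counts = Counter()
        for tags in tag_lists:
            # Count each term at most once per copy, case-insensitively.
            counts.update({tag.lower() for tag in tags})
        for term, n in counts.items():
            index[term][entity_id] = n / len(tag_lists)
    return index


def search(index, query):
    """Return (entity_id, score) pairs, highest score first.

    The score is the sum of the rank parameters of all matching query
    terms (an assumed way of combining ranks).
    """
    scores = Counter()
    for term in query.lower().split():
        for entity_id, rank in index.get(term, {}).items():
            scores[entity_id] += rank
    # Sort by descending score, then by id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical tags found next to copies of two images.
    copies = {
        "img-001": [["sunset", "beach"], ["Sunset", "holiday"], ["sunset"]],
        "img-002": [["beach", "volleyball"], ["beach"]],
    }
    index = build_index(copies)
    for entity_id, score in search(index, "beach"):
        print(entity_id, round(score, 2))
```

In this sketch a text query such as "beach" ranks the image whose copies are most consistently tagged with that term first, even though the images themselves contain no text.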