To upgrade a data stream of multimedia data whose features carry
textual descriptions, a set of phonetic translation hints is included
in the data stream. The hints specify the phonetic transcription of parts
or words of the textual description. The phonetic transcriptions need not
be repeated for each occurrence of a word, which reduces the amount of
data needed to store or transmit the description text.
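
The idea of storing each transcription once and applying it to every occurrence of a word can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names (`build_hint_table`, `annotate`, `inline_size`, `shared_size`), the in-memory dictionary standing in for the hint set, the sample description, and the transcription string, which is a made-up placeholder rather than a verified transcription. The actual encoding of the hints in the data stream is not shown here.

```python
import re

def build_hint_table(hints):
    """Map each lower-cased word to its single stored transcription."""
    return {word.lower(): phonemes for word, phonemes in hints}

def annotate(description, hint_table):
    """Pair every word of the description with its transcription (or None)."""
    tokens = re.findall(r"[A-Za-z']+", description)
    return [(tok, hint_table.get(tok.lower())) for tok in tokens]

def inline_size(description, hint_table):
    """Transcription characters needed if repeated at every occurrence."""
    return sum(len(p) for _, p in annotate(description, hint_table) if p)

def shared_size(hint_table):
    """Transcription characters needed when each hint is stored once."""
    return sum(len(p) for p in hint_table.values())

# Hypothetical sample data; the transcription string is a placeholder.
hints = [("Knopfler", "n A p f l @r")]
description = "Mark Knopfler plays guitar. Knopfler wrote the song."

table = build_hint_table(hints)
annotated = annotate(description, table)

print(annotated)
print("inline:", inline_size(description, table))
print("shared:", shared_size(table))
```

In this sketch the word occurs twice in the description, so repeating the transcription inline would cost twice as many characters as keeping one shared hint. The saving grows with the number of occurrences.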