The invention relates to pre-processing of a pronunciation dictionary for
compression in a data processing device, the pronunciation dictionary
comprising at least one entry, the entry comprising a sequence of
character units and a sequence of phoneme units. According to one aspect
of the invention, the sequence of character units and the sequence of
phoneme units are aligned using a statistical algorithm. The aligned
sequence of character units and aligned sequence of phoneme units are
interleaved by inserting each phoneme unit at a predetermined location
relative to the corresponding character unit.
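
The interleaving step can be illustrated with a minimal sketch. It assumes the alignment has already been performed and has produced two sequences of equal length, with an empty-unit placeholder (here the hypothetical symbol "_") filling positions that have no counterpart. It also assumes the predetermined location is immediately after the corresponding character unit. The function names, the placeholder symbol, and the sample entry are illustrative assumptions and are not taken from the invention; the statistical alignment itself is not shown.

```python
# Hypothetical placeholder for an empty unit introduced by the alignment step.
EPS = "_"


def interleave(chars, phonemes):
    """Insert each phoneme unit immediately after its aligned character unit.

    Both inputs are assumed to be already aligned, i.e. of equal length.
    """
    if len(chars) != len(phonemes):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for c, p in zip(chars, phonemes):
        out.append(c)  # character unit first
        out.append(p)  # corresponding phoneme unit directly after it
    return out


def deinterleave(units):
    """Recover the two aligned sequences from an interleaved entry."""
    return units[0::2], units[1::2]


# Illustrative entry with an assumed alignment (not from the source).
chars = ["f", "a", "t", "h", "e", "r"]
phonemes = ["f", "A:", "D", EPS, "@", EPS]

entry = interleave(chars, phonemes)
restored_chars, restored_phonemes = deinterleave(entry)
```

Because every phoneme unit sits at a fixed offset from its character unit, the interleaved entry is a single sequence from which both aligned sequences can be recovered by position alone.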