An improved voice recognition system in which a Voice Keyword Table (VKT) is
generated and downloaded from a set-up device to a voice recognition
device. The VKT includes visual form data, spoken form data, phonetic
format data, and an entry corresponding to a keyword, as well as
text-to-speech (TTS) generated voice prompts and voice models corresponding
to the phonetic format data.
A voice recognition system on the voice recognition device is updated by
the set-up device. Furthermore, voice models in the voice recognition
device are modified by the set-up device.