A computerized pronunciation system is provided for generating
pronunciations for words and storing the pronunciations in a
pronunciation dictionary. The system includes a word list including at
least one word; transcribed acoustic data including at least one waveform
for the word and transcribed text associated with the waveform; a
pronunciation-learning module configured to accept as input the word list
and the transcribed acoustic data, the pronunciation-learning module
including: sets of initial pronunciations of the word, a scoring module
configured to score pronunciations and to generate phone probabilities, and
a set of alternate pronunciations of the word, wherein the set of
alternate pronunciations includes a highest-scoring set of initial
pronunciations with a highest-scoring substitute phone substituted for a
lowest-probability phone; and a pronunciation dictionary configured to
receive the highest-scoring set of initial pronunciations and the set of
alternate pronunciations.
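
The substitution step described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names `TOY_EVIDENCE`, `FLOOR`, `phone_probabilities`, `score`, and `learn` are hypothetical, the per-position probability table is an assumed stand-in for a scoring module that would derive phone probabilities from the transcribed acoustic data, and each "set" of pronunciations is simplified to a single pronunciation.

```python
import math
from typing import Dict, List, Sequence, Tuple

Pron = Tuple[str, ...]

# ASSUMPTION: toy per-position phone probabilities standing in for what a
# scoring module would compute from waveforms and their transcriptions.
TOY_EVIDENCE: List[Dict[str, float]] = [
    {"t": 0.9, "d": 0.1},
    {"ah": 0.2, "ow": 0.7, "aa": 0.1},
    {"m": 0.8, "n": 0.2},
]
FLOOR = 0.01  # ASSUMPTION: probability given to phones with no evidence


def phone_probabilities(pron: Pron) -> List[float]:
    """Return one probability per phone of the pronunciation."""
    return [
        TOY_EVIDENCE[i].get(p, FLOOR) if i < len(TOY_EVIDENCE) else FLOOR
        for i, p in enumerate(pron)
    ]


def score(pron: Pron) -> float:
    """Score a pronunciation as the product of its phone probabilities."""
    return math.prod(phone_probabilities(pron))


def learn(word: str, initial: Sequence[Pron],
          inventory: Sequence[str]) -> Dict[str, List[Pron]]:
    """Build a dictionary entry: best initial pronunciation plus one alternate."""
    # Pick the highest-scoring initial pronunciation.
    best = max(initial, key=score)
    # Locate its lowest-probability phone.
    probs = phone_probabilities(best)
    weakest = min(range(len(best)), key=probs.__getitem__)
    # Try every other phone in that slot and keep the highest-scoring variant.
    variants = [
        best[:weakest] + (p,) + best[weakest + 1:]
        for p in inventory
        if p != best[weakest]
    ]
    if not variants:
        return {word: [best]}
    alternate = max(variants, key=score)
    return {word: [best, alternate]}


entry = learn(
    "tom",
    [("t", "ah", "m"), ("d", "ah", "m")],
    ["t", "d", "ah", "ow", "aa", "m", "n"],
)
print(entry)  # {'tom': [('t', 'ah', 'm'), ('t', 'ow', 'm')]}
```

In this toy run the initial pronunciation t-ah-m scores highest, its phone "ah" has the lowest probability, and "ow" is the highest-scoring substitute, so both t-ah-m and t-ow-m are passed to the dictionary entry.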