A new word model is trained from synthetic word samples derived by Monte
Carlo techniques from one or more prior word models. The prior word model
can be a phonetic word model and the new word model can be a
non-phonetic, whole-word model. The prior word model can be trained
from data that has undergone a first channel normalization and the
synthesized word samples from which the new word model is trained can
undergo a different channel normalization, one matching that to be used in a
given speech recognition context. The prior word model can have a first
model structure and the new word model can have a second, different,
model structure. These differences in model structure can include, for
example, differences of model topology, differences of model complexity,
and differences in the type of basis function used to describe the models'
probability distributions.
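
The core idea above can be sketched in code. The following is a minimal illustration, not the actual method: it assumes a hypothetical 3-state left-to-right hidden Markov model with Gaussian emissions as the prior "phonetic" model, draws synthetic word samples from it by Monte Carlo simulation, and then trains a structurally simpler "whole-word" model (here, a single Gaussian over all frames) from those samples. All parameter values are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_from_hmm(trans, means, stds, n_samples, max_len=50):
    """Draw synthetic observation sequences from a prior left-to-right HMM
    by Monte Carlo simulation. Each row of `trans` has one extra column
    whose index (== number of states) represents the exit transition."""
    n_states = trans.shape[0]
    samples = []
    for _ in range(n_samples):
        state, seq = 0, []
        for _ in range(max_len):
            seq.append(rng.normal(means[state], stds[state]))
            state = rng.choice(n_states + 1, p=trans[state])
            if state == n_states:  # exit state reached: word is done
                break
        samples.append(np.array(seq))
    return samples

# Hypothetical prior "phonetic" model: 3 states, left-to-right topology.
trans = np.array([[0.6, 0.4, 0.0, 0.0],
                  [0.0, 0.6, 0.4, 0.0],
                  [0.0, 0.0, 0.6, 0.4]])
means = np.array([-1.0, 0.0, 1.0])
stds = np.array([0.3, 0.3, 0.3])

# Step 1: derive synthetic word samples from the prior model.
synthetic = sample_from_hmm(trans, means, stds, n_samples=200)

# Step 2: train a new model with a different structure from those samples.
# Here the "whole-word" model is just one Gaussian fit to all frames.
frames = np.concatenate(synthetic)
whole_word_mean = frames.mean()
whole_word_std = frames.std()
print(whole_word_mean, whole_word_std)
```

In a real system the synthesized samples would also be passed through the target channel normalization before the new model is trained, and the new model would be a full acoustic model rather than a single Gaussian; this sketch only shows the sample-then-retrain flow.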