A method for optimizing a language model is presented. The method comprises
developing an initial language model from a lexicon and a segmentation, both
derived from a received corpus using a maximum match technique, and iteratively
refining the initial language model by dynamically updating the lexicon and
re-segmenting the corpus according to statistical principles until a threshold of
predictive capability is achieved.
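
The following minimal Python sketch illustrates one way such a loop could be arranged. It is an illustration under stated assumptions, not the method as claimed: the names `max_match`, `per_char_perplexity`, and `refine` are hypothetical, the lexicon update (adding frequent adjacent token pairs) stands in for the unspecified "statistical principles", and a relative drop in per-character unigram perplexity stands in for the unspecified "threshold of predictive capability".

```python
import math
from collections import Counter


def max_match(text, lexicon, max_len=4):
    """Greedy forward maximum match: at each position, take the longest
    lexicon entry, falling back to a single character."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon or j == i + 1:
                tokens.append(text[i:j])
                i = j
                break
    return tokens


def per_char_perplexity(tokens, n_chars):
    """Unigram perplexity normalized per character, so that different
    segmentations of the same (non-empty) text are comparable."""
    counts = Counter(tokens)
    total = len(tokens)
    log_prob = sum(c * math.log(c / total) for c in counts.values())
    return math.exp(-log_prob / n_chars)


def refine(corpus, lexicon, max_len=4, min_count=2, tol=0.01, max_iter=10):
    """Assumed refinement loop: grow the lexicon, re-segment, and stop
    when the relative perplexity improvement falls below `tol`."""
    lexicon = set(lexicon)
    tokens = max_match(corpus, lexicon, max_len)
    best = per_char_perplexity(tokens, len(corpus))
    for _ in range(max_iter):
        # Assumed update rule: propose frequent adjacent token pairs as new words.
        pairs = Counter(
            a + b for a, b in zip(tokens, tokens[1:]) if len(a + b) <= max_len
        )
        new_words = {w for w, c in pairs.items() if c >= min_count} - lexicon
        if not new_words:
            break
        candidate = lexicon | new_words
        cand_tokens = max_match(corpus, candidate, max_len)
        ppl = per_char_perplexity(cand_tokens, len(corpus))
        if best - ppl < tol * best:  # assumed stopping threshold
            break
        lexicon, tokens, best = candidate, cand_tokens, ppl
    return lexicon, tokens, best


if __name__ == "__main__":
    lex, toks, ppl = refine("abababab", set())
    print(sorted(lex), toks, ppl)
```

In this sketch the perplexity is measured on the same corpus used to build the lexicon; a held-out portion of the corpus would give a more honest estimate of predictive capability.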