A method and apparatus are provided for adapting a language model to a task-specific
domain. Under the method and apparatus, the relative frequency of n-grams in a
small training set (i.e., a task-specific training data set) and the relative frequency
of n-grams in a large training set (i.e., an out-of-domain training data set) are used
to weight a distribution count of n-grams in the large training set. The weighted
distributions are then used to form a modified language model by identifying probabilities
for n-grams from the weighted distributions.
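
The weighting step described above can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so everything specific here is an assumption made for illustration: the weight is taken to be the ratio of smoothed unigram relative frequencies (small set over large set) for the predicted word, it is applied to bigram counts from the large set, and probabilities are obtained by normalizing the weighted counts per history word. The function names, the add-alpha smoothing, and the toy data are hypothetical and are not drawn from the source.

```python
from collections import Counter, defaultdict


def relative_freqs(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram relative frequencies over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def adapt_bigram_model(small_tokens, large_tokens, alpha=1.0):
    """Weight large-set bigram counts by a relative-frequency ratio, then normalize.

    ASSUMPTION: weight(w) = rf_small(w) / rf_large(w). This is an illustrative
    choice, not the formula of the described method.
    """
    vocab = set(small_tokens) | set(large_tokens)
    rf_small = relative_freqs(small_tokens, vocab, alpha)
    rf_large = relative_freqs(large_tokens, vocab, alpha)

    # Weighted counts of each bigram (history, word) observed in the large set.
    weighted = defaultdict(dict)
    bigram_counts = Counter(zip(large_tokens, large_tokens[1:]))
    for (history, word), count in bigram_counts.items():
        weighted[history][word] = count * rf_small[word] / rf_large[word]

    # Turn the weighted counts into conditional probabilities P(word | history).
    model = {}
    for history, row in weighted.items():
        norm = sum(row.values())
        model[history] = {word: value / norm for word, value in row.items()}
    return model


# Toy data (hypothetical): a finance-heavy large set and a river-themed small set.
large = "the bank raised rates the river bank flooded the bank closed".split()
small = "the river flooded the river rose".split()

model = adapt_bigram_model(small, large)
print(model["the"])
```

On this toy data the unweighted large-set estimate of P(river | the) is 1/3, and the weighted estimate rises to 0.75, since "river" is relatively more frequent in the small set than in the large one. Only bigrams seen in the large set receive probability in this sketch; a practical model would also need smoothing or backoff for unseen n-grams.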