A method and computer-readable medium are provided that construct a collocation
mistake pattern database for use in writing in a first language by a person whose
native language is a second language. The method includes obtaining a bilingual
corpus having sentences in first and second languages and extracting second language
word pairs from the second language sentences in the corpus. For each second language
word pair extracted from the corpus, a corresponding first language word pair is
extracted from the corresponding first language sentence in the corpus to determine
a correct first language translation for the second language word pair. Also, for
each second language word pair extracted from the corpus, a set of combinations
of first language translation words corresponding to the second language word pair
is created. Finally, for each second language word pair extracted from the corpus,
the correct first language translation is removed from the set of combinations
of first language translation words such that the set of combinations represents
a set of collocation mistake first language word pairs corresponding to the second
language word pair.
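
The steps above can be illustrated with a minimal sketch. It assumes that word-pair extraction and alignment have already been performed, so that each second language word pair arrives together with the first language word pair found in the corresponding sentence, and that a word-level translation dictionary is available. The function name `build_mistake_patterns`, the input shapes, and the romanized sample data are hypothetical and are not taken from the described method.

```python
from itertools import product


def build_mistake_patterns(aligned_pairs, translations):
    """Build a mapping from second language word pairs to mistake pairs.

    aligned_pairs: iterable of (second_lang_pair, first_lang_pair) tuples,
        assumed to come from an already aligned bilingual corpus.
    translations: dict mapping a second language word to its candidate
        first language translation words (assumed dictionary).
    """
    # Collect every correct first language translation seen in the corpus
    # for each second language word pair.
    correct = {}
    for second_pair, first_pair in aligned_pairs:
        correct.setdefault(second_pair, set()).add(first_pair)

    patterns = {}
    for second_pair, good_pairs in correct.items():
        word_a, word_b = second_pair
        # Create all combinations of first language translation words.
        combos = set(product(translations.get(word_a, ()),
                             translations.get(word_b, ())))
        # Remove the correct translations; what remains is treated as
        # the set of collocation mistake word pairs.
        patterns[second_pair] = combos - good_pairs
    return patterns


# Illustrative data only (hypothetical, romanized second language words).
aligned = [(("kan", "dianshi"), ("watch", "TV"))]
dictionary = {"kan": ["see", "look", "watch"], "dianshi": ["TV"]}
patterns = build_mistake_patterns(aligned, dictionary)
```

In this sketch, a combination is kept as a mistake only if it never appears as a correct translation of the same second language word pair anywhere in the aligned input, so the result depends on how well the corpus covers acceptable translations.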