The present invention adopts the fundamental architecture of a statistical
machine translation system, which uses statistical models learned from
the training data and does not require the expert knowledge that
rule-based machine translation systems depend on. From the parallel
training data, a subset of sentence pairs is selected for manual
alignment. These
sentences are aligned at the phrase level instead of at the word level.
Depending on the size of the training data, the optimal number of
sentence pairs to align manually may vary. The alignment is performed
with an alignment tool whose graphical user interface is convenient and
intuitive for users.
Manually aligned data are then utilized to improve the automatic word
alignment component. Model combination methods are also introduced to
improve the accuracy and the coverage of statistical models for the task
of statistical machine translation.
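
As an illustration only, one common way to combine models is linear
interpolation of two phrase translation tables, for example one estimated
from the manually aligned sentence pairs and one from the automatically
aligned data. The text above does not state which combination method is
used, so the interpolation scheme, the function name
combine_phrase_tables, the weight_manual parameter, and the table layout
below are all assumptions made for this sketch.

```python
def combine_phrase_tables(manual_table, auto_table, weight_manual=0.5):
    """Linearly interpolate two phrase tables (hypothetical helper).

    Each table maps a source phrase to a dict of
    {target phrase: translation probability}.
    """
    if not 0.0 <= weight_manual <= 1.0:
        raise ValueError("weight_manual must lie in [0, 1]")

    combined = {}
    # The union of source phrases gives the combined model wider coverage.
    for source in set(manual_table) | set(auto_table):
        manual_targets = manual_table.get(source, {})
        auto_targets = auto_table.get(source, {})
        scores = {}
        for target in set(manual_targets) | set(auto_targets):
            scores[target] = (
                weight_manual * manual_targets.get(target, 0.0)
                + (1.0 - weight_manual) * auto_targets.get(target, 0.0)
            )
        # Renormalize so each source phrase keeps a proper distribution,
        # even when it appears in only one of the two tables.
        total = sum(scores.values())
        if total > 0.0:
            combined[source] = {t: s / total for t, s in scores.items()}
    return combined


# Toy tables with made-up probabilities, for illustration only.
manual = {"la maison": {"the house": 1.0}}
automatic = {
    "la maison": {"the house": 0.6, "house": 0.4},
    "bleue": {"blue": 1.0},
}
combined = combine_phrase_tables(manual, automatic, weight_manual=0.5)
```

In this sketch the manually derived table sharpens the probabilities of
phrases it contains, while phrases found only in the automatically derived
table are retained, which is one way accuracy and coverage could both
benefit.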