Generating a transfer dictionary for use in a transfer-based machine translation system. A pair of source and target language sentences is received. The source language sentence comprises at least one marked idiom, at least one argument and at least one marked collocation. The target language sentence comprises the target language translation of the idiom and the source language word(s) for the argument. The source language sentence is parsed to generate a source language syntactic tree. Nodes are extracted from the source language syntactic tree. A least common ancestor node of the extracted nodes is calculated, and source language structure information is generated based on the source language syntactic tree data structure. Target language structure information is generated by adding part-of-speech information to each morpheme in the target language sentence and by replacing each source language word in the target language sentence with the corresponding syntactic information from the source language syntactic tree.
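The least-common-ancestor step can be illustrated with a minimal sketch. The abstract does not say how the ancestor is computed, so the approach below (comparing root-to-node paths over parent pointers) is an assumption, as are the `Node` class, the function names, and the example sentence and its parse.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality: two nodes are equal only if they are the same object
class Node:
    """Hypothetical syntactic-tree node; not taken from the patent."""
    label: str
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it, so trees can be built inline."""
        child.parent = self
        self.children.append(child)
        return child


def path_from_root(node: Node) -> List[Node]:
    """Return the list of nodes from the root down to `node`, inclusive."""
    path = []
    current: Optional[Node] = node
    while current is not None:
        path.append(current)
        current = current.parent
    path.reverse()
    return path


def least_common_ancestor(nodes: List[Node]) -> Optional[Node]:
    """Return the deepest node that lies on every root-to-node path.

    Returns None if the nodes do not share a root.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    paths = [path_from_root(n) for n in nodes]
    lca = None
    # Walk all paths in lockstep from the root; stop at the first divergence.
    for level in zip(*paths):
        if all(n is level[0] for n in level):
            lca = level[0]
        else:
            break
    return lca


# Illustrative parse of "John kicked the bucket" (assumed, not from the patent).
s = Node("S")
john = s.add(Node("NP", "John"))
vp = s.add(Node("VP"))
kicked = vp.add(Node("V", "kicked"))
obj = vp.add(Node("NP"))
the = obj.add(Node("Det", "the"))
bucket = obj.add(Node("N", "bucket"))

# Nodes for the marked idiom words alone share the VP node.
idiom_lca = least_common_ancestor([kicked, bucket])
# Adding the argument node widens the subtree to the whole sentence.
full_lca = least_common_ancestor([john, kicked, bucket])
print(idiom_lca.label, full_lca.label)
```

In this sketch, the subtree rooted at the returned node is the smallest one covering all of the extracted nodes, which is the kind of region from which source language structure information could plausibly be derived.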

 