Title | Improving alignment for SMT by reordering and augmenting the training corpus |
Publication Type | Conference Paper |
Year of Publication | 2009 |
Authors | Holmqvist M, Stymne S, Foo J, Ahrenberg L |
Conference Name | Fourth Workshop on Statistical Machine Translation (WMT09) |
Conference Location | Athens, Greece |
Keywords | machine translation |
Abstract | We describe the LIU systems for English-German and German-English translation in the WMT09 shared task. We focus on two methods to improve the word alignment: (i) by applying Giza++ in a second phase to a reordered training corpus, where reordering is based on the alignments from the first phase, and (ii) by adding lexical data obtained as high-precision alignments from a different word aligner. These methods were studied in the context of a system that uses compound processing, a morphological sequence model for German, and a part-of-speech sequence model for English. Both methods gave some improvements to translation quality as measured by Bleu and Meteor scores, though not consistently. All systems used both out-of-domain and in-domain data as the mixed corpus had better scores in the baseline configuration. |
URL | http://statmt.org/wmt09/WMT-09-2009.pdf#page=136 |