Improving alignment for SMT by reordering and augmenting the training corpus

Title: Improving alignment for SMT by reordering and augmenting the training corpus
Publication Type: Conference Paper
Year of Publication: 2009
Authors: Holmqvist M, Stymne S, Foo J, Ahrenberg L
Conference Name: Fourth Workshop on Statistical Machine Translation (WMT09)
Conference Location: Athens, Greece
Keywords: machine translation
Abstract

We describe the LIU systems for English-German and German-English translation in the WMT09 shared task. We focus on two methods to improve the word alignment: (i) by applying Giza++ in a second phase to a reordered training corpus, where reordering is based on the alignments from the first phase, and (ii) by adding lexical data obtained as high-precision alignments from a different word aligner. These methods were studied in the context of a system that uses compound processing, a morphological sequence model for German, and a part-of-speech sequence model for English. Both methods gave some improvements to translation quality as measured by Bleu and Meteor scores, though not consistently. All systems used both out-of-domain and in-domain data as the mixed corpus had better scores in the baseline configuration.

URL: http://statmt.org/wmt09/WMT-09-2009.pdf#page=136