tailieunhanh - Scientific report: "Training Phrase Translation Models with Leaving-One-Out"

Training Phrase Translation Models with Leaving-One-Out

Joern Wuebker and Arne Mauser and Hermann Ney
Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Germany
surname @

Abstract

Several attempts have been made to learn phrase translation probabilities for phrase-based statistical machine translation that go beyond pure counting of phrases in word-aligned training data. Most approaches report problems with over-fitting. We describe a novel leaving-one-out approach to prevent over-fitting that allows us to train phrase models that show improved translation performance on the WMT08 Europarl German-English task. In contrast to most previous work, where phrase models were trained separately from other models used in translation, we include all components such as single word lexica and reordering models in training. Using this consistent training of phrase models we are able to achieve improvements of up to points in BLEU. As a side effect, the phrase table size is reduced by more than 80%.

1 Introduction

A phrase-based SMT system takes a source sentence and produces a translation by segmenting the sentence into phrases and translating those phrases separately (Koehn et al., 2003).
The phrase translation table, which contains the bilingual phrase pairs and the corresponding translation probabilities, is one of the main components of an SMT system. The most common method for obtaining the phrase table is heuristic extraction from automatically word-aligned bilingual training data (Och et al., 1999). In this method, all phrases of the sentence pair that match constraints given by the alignment are extracted. This includes overlapping phrases. At extraction time it does not matter whether the phrases are extracted from a highly probable phrase alignment or from an unlikely one. Phrase model probabilities are typically defined as relative frequencies of phrases extracted from word-aligned parallel training data. The joint counts C(f, e) ...
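The heuristic extraction and relative-frequency estimate described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `extract_phrases` and `relative_frequencies`, the `max_len` limit, and the toy corpus are hypothetical. The sketch assumes the usual consistency constraint (no alignment link may leave the phrase pair) and assumes the joint counts are normalized by the target-phrase marginal to give p(f | e). It also omits the common extension of phrases over unaligned boundary words.

```python
from collections import Counter


def extract_phrases(src, tgt, alignment, max_len=4):
    """Return all phrase pairs consistent with the word alignment.

    alignment is a set of (i, j) links between src[i] and tgt[j].
    Assumed constraint: a span pair must contain at least one link, and
    no link may connect a word inside one span to a word outside the other.
    """
    pairs = []
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to the source span [i1, i2].
            linked = [j for (i, j) in alignment if i1 <= i <= i2]
            if not linked:
                continue
            j1, j2 = min(linked), max(linked)
            if j2 - j1 + 1 > max_len:
                continue
            # Reject spans whose target side links back outside [i1, i2].
            if any(j1 <= j <= j2 and not (i1 <= i <= i2)
                   for (i, j) in alignment):
                continue
            pairs.append((" ".join(src[i1:i2 + 1]),
                          " ".join(tgt[j1:j2 + 1])))
    return pairs


def relative_frequencies(corpus, max_len=4):
    """Estimate p(f | e) = C(f, e) / C(e) over all extracted phrase pairs."""
    joint = Counter()
    target_marginal = Counter()
    for src, tgt, alignment in corpus:
        for f, e in extract_phrases(src, tgt, alignment, max_len):
            joint[(f, e)] += 1
            target_marginal[e] += 1
    return {(f, e): c / target_marginal[e] for (f, e), c in joint.items()}


# Toy word-aligned corpus (hypothetical data, for illustration only).
corpus = [
    (["das", "haus"], ["the", "house"], {(0, 0), (1, 1)}),
    (["das", "auto"], ["the", "car"], {(0, 0), (1, 1)}),
    (["dieses", "haus"], ["the", "house"], {(0, 0), (1, 1)}),
]
probs = relative_frequencies(corpus)
```

Note that overlapping phrases ("das", "haus", and "das haus") are all counted, and every extracted pair contributes equally regardless of how plausible the underlying phrase alignment is. This is the property of pure counting that the paragraph above points out.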
