Semi-Supervised Training for Statistical Word Alignment

Alexander Fraser and Daniel Marcu
ISI, University of Southern California, 4676 Admiralty Way, Suite 1001, Marina del Rey, CA 90292
fraser@ marcu@

Abstract

We introduce a semi-supervised approach to training for statistical machine translation that alternates the traditional Expectation Maximization step, applied to a large training corpus, with a discriminative step aimed at increasing word-alignment quality on a small, manually word-aligned sub-corpus. We show that our algorithm leads not only to improved alignments but also to machine translation output of higher quality.

1 Introduction

The most widely applied training procedure for statistical machine translation, IBM Model 4 (Brown et al., 1993) unsupervised training followed by post-processing with symmetrization heuristics (Och and Ney, 2003), yields low-quality word alignments. When compared with gold-standard parallel data that was manually aligned using a high-recall/precision methodology (Melamed, 1998), the word-level alignments produced automatically have low F-measure accuracy (see Section 2 for details). In this paper we improve word alignment, and subsequently MT accuracy, by developing a range of increasingly sophisticated methods:

1. We first recast the problem of estimating the IBM models (Brown et al., 1993) in a discriminative framework, which leads to an initial increase in word-alignment accuracy.

2. We extend the IBM models with new sub-models, which leads to additional increases in word-alignment accuracy. In the process, we also show that these improvements are explained not only by the power of the new models but also by a novel search procedure for the alignment of highest probability.

3. Finally, we propose a training procedure that interleaves discriminative training with maximum-likelihood training.

These steps lead to word alignments
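The core idea behind step 3 can be sketched in a few lines of Python. This is an illustrative toy, not the authors' implementation: it shows an alignment F-measure over sets of word-link pairs, and a discriminative step that grid-searches a single hypothetical model weight to maximize mean F-measure on a small hand-aligned sub-corpus. The names `decode`, `weight_grid`, and `labeled_corpus` are assumptions introduced here for illustration.

```python
def alignment_f_measure(hypothesis, gold, alpha=0.5):
    """F-measure between hypothesized and gold sets of (src, tgt) link pairs.

    alpha trades off precision against recall; alpha=0.5 gives balanced F1.
    """
    if not hypothesis or not gold:
        return 0.0
    correct = len(hypothesis & gold)          # links present in both sets
    if correct == 0:
        return 0.0
    precision = correct / len(hypothesis)
    recall = correct / len(gold)
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)


def discriminative_step(decode, weight_grid, labeled_corpus):
    """Pick the weight whose decoded alignments score highest (by mean
    F-measure) on the hand-aligned sub-corpus -- a stand-in for the
    paper's discriminative tuning of sub-model weights."""
    def mean_f(w):
        scores = [alignment_f_measure(decode(pair, w), gold)
                  for pair, gold in labeled_corpus]
        return sum(scores) / len(scores)
    return max(weight_grid, key=mean_f)
```

In the full algorithm described in the abstract, each such discriminative step would be followed by another round of EM on the large unlabeled corpus under the new weights, and the alternation repeats until F-measure on the labeled sub-corpus stops improving.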
