Scientific report: "Discriminative Word Alignment with Conditional Random Fields"

Phil Blunsom and Trevor Cohn
Department of Software Engineering and Computer Science, University of Melbourne
pcbl tacohn @

Abstract

In this paper we present a novel approach for inducing word alignments from sentence-aligned data. We use a Conditional Random Field (CRF), a discriminative model, which is estimated on a small supervised training set. The CRF is conditioned on both the source and target texts, and thus allows for the use of arbitrary and overlapping features over these data. Moreover, the CRF has efficient training and decoding processes which both find globally optimal solutions. We apply this alignment model to both French-English and Romanian-English language pairs. We show how a large number of highly predictive features can be easily incorporated into the CRF, and demonstrate that even with only a few hundred word-aligned training sentences, our model improves over the current state-of-the-art with alignment error rates of and for the two tasks respectively.

1 Introduction

Modern phrase-based statistical machine translation (SMT) systems usually break the translation task into two phases. The first phase induces word alignments over a sentence-aligned bilingual corpus, and the second phase uses statistics over these predicted word alignments to decode (translate) novel sentences. This paper deals with the first of these tasks: word alignment.

Most current SMT systems (Och and Ney, 2004; Koehn et al., 2003) use a generative model for word alignment such as the freely available GIZA++ (Och and Ney, 2003), an implementation of the IBM alignment models (Brown et al., 1993). These models treat word alignment as a hidden process, and maximise the probability of the observed (e, f) sentence pairs using the expectation maximisation (EM) algorithm. After the maximisation process is complete, the word alignments are set to maximum posterior predictions of the model. While GIZA++ gives good results when
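
To make the generative baseline described in the introduction concrete, the following is a minimal sketch of EM training for IBM Model 1, the simplest of the IBM alignment models, followed by maximum posterior alignment. It is an illustration only: it is not GIZA++ code and not the CRF model this paper proposes. The function names `train_ibm1` and `align`, the three-sentence toy corpus, and the omission of the NULL word are all assumptions made for brevity.

```python
def train_ibm1(pairs, iterations=10):
    """EM for IBM Model 1 (sketch). pairs: list of (e_tokens, f_tokens).

    Returns a dict t with t[(f, e)] approximating P(f | e).
    Assumption: no NULL source word, to keep the example short.
    """
    f_vocab = {f for _, fs in pairs for f in fs}
    # Uniform initialisation over all co-occurring word pairs.
    t = {(f, e): 1.0 / len(f_vocab) for es, fs in pairs for f in fs for e in es}
    for _ in range(iterations):
        count = {}  # expected count of f being generated by e
        total = {}  # expected count of e generating anything
        # E-step: the alignment is hidden, so spread each f over all e
        # in proportion to the current translation probabilities.
        for es, fs in pairs:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    p = t[(f, e)] / z
                    count[(f, e)] = count.get((f, e), 0.0) + p
                    total[e] = total.get(e, 0.0) + p
        # M-step: re-estimate t from the expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


def align(es, fs, t):
    """Maximum posterior alignment: for each f, the index of its best e."""
    return [max(range(len(es)), key=lambda i: t.get((f, es[i]), 0.0)) for f in fs]


# Hypothetical toy corpus of (e, f) sentence pairs.
pairs = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
t = train_ibm1(pairs)
print(align(["the", "house"], ["das", "Haus"], t))
```

Nothing in this procedure uses word-aligned training data: the translation table is estimated purely from co-occurrence in sentence pairs. That is the contrast with the supervised, discriminative CRF described in the abstract.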
