tailieunhanh - Scientific report: "Distortion Models For Statistical Machine Translation"

In this paper, we argue that n-gram language models are not sufficient to address word reordering required for Machine Translation. We propose a new distortion model that can be used with existing phrase-based SMT decoders to address those n-gram language model limitations. We present empirical results in Arabic to English Machine Translation that show statistically significant improvements when our proposed model is used. We also propose a novel metric to measure word order similarity (or difference) between any pair of languages based on word alignments.

Distortion Models For Statistical Machine Translation
Yaser Al-Onaizan and Kishore Papineni
IBM T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY 10598, USA
onaizan papineni @

1 Introduction

A language model is a statistical model that gives a probability distribution over possible sequences of words. It computes the probability of producing a given word $w_1$ given all the words that precede it in the sentence. An n-gram language model is an n-th order Markov model where the probability of generating a given word depends only on the last $n - 1$ words immediately preceding it and is given by the following equation:

$$P(w_1^k) = P(w_1)\, P(w_2 \mid w_1) \cdots P(w_n \mid w_1^{n-1}) \qquad (1)$$

where $k \geq n$.

N-gram language models have been successfully used in Automatic Speech Recognition (ASR), as was first proposed by Bahl et al. (1983).
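To make the n-gram factorization concrete, the following is a minimal sketch, not the authors' implementation. It estimates each conditional probability by relative frequency on a toy corpus and multiplies the factors along a sentence. The function names (`train_ngram`, `sentence_probability`), the `<s>` start-padding token, the toy corpus, and the absence of smoothing are all assumptions made for illustration; in particular, padding with `<s>` is a common convention that replaces the shorter histories of the first few words in Equation 1.

```python
from collections import Counter


def train_ngram(sentences, n):
    """Count n-grams and their (n-1)-word histories (hypothetical helper)."""
    ngram_counts = Counter()
    history_counts = Counter()
    for words in sentences:
        # Assumption: pad with n-1 start symbols so every word has a full history.
        padded = ["<s>"] * (n - 1) + list(words)
        for i in range(n - 1, len(padded)):
            history = tuple(padded[i - n + 1:i])
            ngram_counts[history + (padded[i],)] += 1
            history_counts[history] += 1
    return ngram_counts, history_counts


def sentence_probability(words, n, ngram_counts, history_counts):
    """Multiply P(w_i | previous n-1 words) over the sentence, unsmoothed."""
    padded = ["<s>"] * (n - 1) + list(words)
    prob = 1.0
    for i in range(n - 1, len(padded)):
        history = tuple(padded[i - n + 1:i])
        if history_counts[history] == 0:
            return 0.0  # unseen history: no estimate without smoothing
        prob *= ngram_counts[history + (padded[i],)] / history_counts[history]
    return prob


# Toy corpus, invented for this example only.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["a", "cat", "ran"],
]
bigram_counts, bigram_histories = train_ngram(corpus, n=2)

p_seen_order = sentence_probability(
    ["the", "cat", "sat"], 2, bigram_counts, bigram_histories
)
p_scrambled = sentence_probability(
    ["cat", "the", "sat"], 2, bigram_counts, bigram_histories
)
print(p_seen_order, p_scrambled)
```

In this toy setting the bigram model gives the scrambled word order a probability of zero, because each factor looks only at the immediately preceding word. A real system would use a much larger corpus and a smoothing method so that unseen n-grams do not zero out the whole product.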
These models play an important role in selecting among several candidate word realizations of a given acoustic signal. N-gram language models have also been used in Statistical Machine Translation (SMT), as proposed by Brown et al. (1990) and Brown et al. (1993). The run-time search procedure used to find the most likely translation (or transcription, in the case of Speech Recognition) is typically referred to as decoding. There is a fundamental difference between decoding for machine translation and decoding for speech recognition. When decoding a speech signal, words are generated in the same order in which their corresponding
