Scientific report: "A DP based Search Using Monotone Alignments in Statistical Translation"

C. Tillmann, S. Vogel, H. Ney, A. Zubiaga
Lehrstuhl für Informatik VI, RWTH Aachen, D-52056 Aachen, Germany
tillmann ney @

Abstract

In this paper, we describe a Dynamic Programming (DP) based search algorithm for statistical translation and present experimental results. The statistical translation uses two sources of information: a translation model and a language model. The language model used is a standard bigram model. For the translation model, the alignment probabilities are made dependent on the differences in the alignment positions rather than on the absolute positions. Thus, the approach amounts to a first-order Hidden Markov Model (HMM), as used successfully in speech recognition for the time alignment problem. Under the assumption that the alignment is monotone with respect to the word order in both languages, an efficient search strategy for translation can be formulated. The details of the search algorithm are described. Experiments on the EuTrans corpus produced a word error rate of .

1 Overview: The Statistical Approach to Translation

The goal is the translation of a text given in some source language into a target language. We are given a source ("French") string $f_1^J = f_1 \ldots f_J$, which is to be translated into a target ("English") string $e_1^I = e_1 \ldots e_I$. Among all possible target strings, we choose the one with the highest probability, which is given by Bayes' decision rule (Brown et al., 1993):

$$\hat{e}_1^I = \arg\max_{e_1^I} \Pr(e_1^I \mid f_1^J) = \arg\max_{e_1^I} \Pr(e_1^I) \cdot \Pr(f_1^J \mid e_1^I).$$
$\Pr(e_1^I)$ is the language model of the target language, whereas $\Pr(f_1^J \mid e_1^I)$ is the string translation model. The argmax operation denotes the search problem. In this paper, we address: the problem of introducing structures into the probabilistic dependencies in order to model the string translation probability $\Pr(f_1^J \mid e_1^I)$; the search procedure, i.e. an algorithm to perform the argmax operation in an efficient way; and transformation steps for both the source and the target languages.
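
As a rough illustration of the decision rule, the following Python sketch scores a handful of candidate target strings with a bigram language model and a monotone, first-order HMM-style translation model whose transition probabilities depend only on the jump width, and then returns the best-scoring candidate. All probability tables, the uniform initial alignment distribution, the flooring constant, and the function names are assumptions made for this example. Enumerating a fixed candidate list is a simplification for illustration only; it is not the DP search of the paper, whose details are not given in this section.

```python
import math

# Toy, hand-picked probabilities: every number here is an illustrative assumption.
BIGRAM = {  # p(word | previous word); "<s>" marks the sentence start
    ("<s>", "the"): 0.6, ("the", "house"): 0.5, ("the", "home"): 0.2,
    ("<s>", "house"): 0.1, ("house", "the"): 0.1,
}
LEXICON = {  # p(source word | target word)
    ("la", "the"): 0.7, ("maison", "house"): 0.8, ("maison", "home"): 0.6,
}
# p(a_j - a_{j-1}): depends only on the jump width, not on absolute positions.
# Only non-negative jumps are allowed, which makes every alignment monotone.
JUMP = {0: 0.3, 1: 0.6, 2: 0.1}
FLOOR = 1e-6  # probability assigned to unseen events (assumption)


def lm_logprob(target):
    """log Pr(e) under the bigram language model."""
    score, prev = 0.0, "<s>"
    for word in target:
        score += math.log(BIGRAM.get((prev, word), FLOOR))
        prev = word
    return score


def tm_logprob(source, target):
    """Viterbi approximation of log Pr(f | e) over monotone alignments.

    Assumes non-empty source and target and a uniform distribution
    over the target position aligned to the first source word.
    """
    init = math.log(1.0 / len(target))
    # best[i]: best log score with the current source word aligned to target[i]
    best = [init + math.log(LEXICON.get((source[0], t), FLOOR)) for t in target]
    for f in source[1:]:
        new_best = []
        for i, t in enumerate(target):
            # Best predecessor among positions i - d for each allowed jump d.
            trans = max(best[i - d] + math.log(p)
                        for d, p in JUMP.items() if i - d >= 0)
            new_best.append(trans + math.log(LEXICON.get((f, t), FLOOR)))
        best = new_best
    return max(best)


def translate(source, candidates):
    """Pick the candidate e that maximizes Pr(e) * Pr(f | e) in log space."""
    return max(candidates,
               key=lambda e: lm_logprob(e) + tm_logprob(source, e))


if __name__ == "__main__":
    source = ["la", "maison"]
    candidates = [["the", "house"], ["the", "home"], ["house", "the"]]
    print(translate(source, candidates))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the inner maximization over predecessor positions is the same recursion used for Viterbi time alignment in HMM-based speech recognition.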
