tailieunhanh - Báo cáo khoa học: "Fast Decoding and Optimal Decoding for Machine Translation"

A good decoding algorithm is critical to the success of any statistical machine translation system. The decoder’s job is to find the translation that is most likely according to set of previously learned parameters (and a formula for combining them). Since the space of possible translations is extremely large, typical decoding algorithms are only able to examine a portion of it, thus risking to miss good solutions. In this paper, we compare the speed and output quality of a traditional stack-based decoding algorithm with two new decoders: a fast greedy decoder and a slow but optimal decoder that treats. | Fast Decoding and Optimal Decoding for Machine Translation Ulrich Germannf Michael JahrJ Kevin Knightf Daniel Marcuf and Kenji Yamadaf f Information Sciences Institute University of Southern California 4676 Admiralty Way Suite 1001 Marina del Rey CA 90292 germann knight marcu kyamada @ ỊDepartment of Computer Science Stanford University Stanford CA 94305 jahr@ Abstract A good decoding algorithm is critical to the success of any statistical machine translation system. The decoder s job is to find the translation that is most likely according to set of previously learned parameters and a formula for combining them . Since the space of possible translations is extremely large typical decoding algorithms are only able to examine a portion of it thus risking to miss good solutions. In this paper we compare the speed and output quality of a traditional stack-based decoding algorithm with two new decoders a fast greedy decoder and a slow but optimal decoder that treats decoding as an integer-programming optimization problem. 1 Introduction A statistical MT system that translates say French sentences into English is divided into three parts 1 a language model LM that assigns a probability P e to any English string 2 a translation model TM that assigns a probability P f e to any pair of English and French strings and 3 a decoder. The decoder takes a previously unseen sentence J and tries to find the e that maximizes P e f or equivalently maximizes P e -P fe . Brown et al. 1993 introduced a series of TMs based on word-for-word substitution and reordering but did not include a decoding algorithm. If the source and target languages are constrained to have the same word order by choice or through suitable pre-processing then the linear Viterbi algorithm can be applied Tillmann et al. 1997 . If re-ordering is limited to rotations around nodes in a binary tree then optimal decoding can be carried out by a high-polynomial algorithm Wu 1996 . For arbitrary .