Scientific report: "Collaborative Decoding: Partial Hypothesis Re-ranking Using Translation Consensus between Decoders"
Mu Li (1), Nan Duan (2), Dongdong Zhang (1), Chi-Ho Li (1), Ming Zhou (1)
(1) Microsoft Research Asia, Beijing, China
(2) Tianjin University, Tianjin, China
{muli, v-naduan, dozhang, chl, mingzhou}@

Abstract

This paper presents collaborative decoding (co-decoding), a new method to improve machine translation accuracy by leveraging translation consensus between multiple machine translation decoders. Unlike system combination and MBR decoding, which post-process the n-best lists or word lattices of machine translation decoders, in our method multiple machine translation decoders collaborate by exchanging partial translation results. Using an iterative decoding approach, n-gram agreement statistics between translations of multiple decoders are employed to re-rank both full and partial hypotheses explored in decoding. Experimental results on data sets for the NIST Chinese-to-English machine translation task show that the co-decoding method can bring significant improvements to all baseline decoders, and that the outputs from co-decoding can be used to further improve the result of system combination.

1 Introduction

Recent research has shown that substantial improvements can be achieved by utilizing consensus statistics obtained from the outputs of multiple machine translation systems.
Translation consensus can be measured either at the sentence level or at the word level. For example, Minimum Bayes Risk (MBR) decoding (Kumar and Byrne, 2004) over an n-best list tries to find the hypothesis with the lowest expected loss with respect to all the other translations, which can be viewed as sentence-level consensus-based decoding. Word-based methods range from straightforward consensus voting (Bangalore et al., 2001; Matusov et al., 2006) to more complicated word-based system combination models (Rosti et al., 2007; Sim et al., 2007). Typically the resulting systems take outputs of individual machine translation systems as input and build a .
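
To make the sentence-level consensus idea concrete, the following is a minimal sketch of MBR selection over an n-best list. It is an illustration under stated assumptions, not the formulation used by Kumar and Byrne (2004) or by this paper: the similarity function `overlap_gain` (mean clipped n-gram precision), the softmax over log scores used as a posterior, and the toy n-best list are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_gain(hyp, ref, max_n=2):
    """Assumed similarity in [0, 1]: mean clipped n-gram precision of hyp
    against ref for orders 1..max_n (a stand-in for a BLEU-style gain)."""
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp_tokens, n), ngrams(ref_tokens, n)
        total = sum(hyp_counts.values())
        if total == 0:
            precisions.append(0.0)
            continue
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        precisions.append(matched / total)
    return sum(precisions) / len(precisions)


def mbr_select(nbest):
    """Pick the hypothesis with the lowest expected loss.

    nbest is a list of (translation, log_score) pairs. Posteriors are
    approximated by a softmax over the log scores (an assumption), and
    the loss is 1 - overlap_gain.
    """
    top = max(score for _, score in nbest)
    weights = [math.exp(score - top) for _, score in nbest]
    z = sum(weights)
    posteriors = [w / z for w in weights]

    best, best_risk = None, float("inf")
    for hyp, _ in nbest:
        # Expected loss of hyp against every translation in the list;
        # the term for hyp itself contributes zero loss.
        risk = sum(p * (1.0 - overlap_gain(hyp, other))
                   for (other, _), p in zip(nbest, posteriors))
        if risk < best_risk:
            best, best_risk = hyp, risk
    return best, best_risk


# Toy n-best list (invented for illustration).
nbest = [
    ("the cat sat on the mat", -1.0),
    ("the cat sat on a mat", -1.2),
    ("a dog ran away", -0.9),
]
best, risk = mbr_select(nbest)
print(best, round(risk, 3))
```

In this toy list the hypothesis with the highest model score shares almost nothing with the other two, so its expected loss is high and one of the mutually similar hypotheses is selected instead. That is the consensus effect described above, applied to full translations only.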