tailieunhanh - Scientific report: "Forest-Based Translation"
Forest-Based Translation

Haitao Mi†, Liang Huang‡, Qun Liu†

†Key Lab. of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, P.O. Box 2704, Beijing 100190, China
{htmi,liuqun}@

‡Department of Computer and Information Science, University of Pennsylvania, Levine Hall, 3330 Walnut Street, Philadelphia, PA 19104, USA
lhuang3@

Abstract

Among syntax-based translation models, the tree-based approach, which takes as input a parse tree of the source sentence, is a promising direction, being faster and simpler than its string-based counterpart. However, current tree-based systems suffer from a major drawback: they only use the 1-best parse to direct the translation, which potentially introduces translation mistakes due to parsing errors. We propose a forest-based approach that translates a packed forest of exponentially many parses, which encodes many more alternatives than standard n-best lists. Large-scale experiments show an absolute improvement in BLEU points over the 1-best baseline. This result is also higher than decoding with 30-best parses, and takes even less time.

1 Introduction

Syntax-based machine translation has witnessed promising improvements in recent years.
Depending on the type of input, these efforts can be divided into two broad categories: the string-based systems, whose input is a string to be simultaneously parsed and translated by a synchronous grammar (Wu, 1997; Chiang, 2005; Galley et al., 2006), and the tree-based systems, whose input is already a parse tree to be directly converted into a target tree or string (Lin, 2004; Ding and Palmer, 2005; Quirk et al., 2005; Liu et al., 2006; Huang et al., 2006). Compared with their string-based counterparts, tree-based systems offer some attractive features: they are much faster in decoding (linear time vs. cubic time; see Huang et al., 2006), do not require a binary-branching grammar as in string-based models (Zhang et al., 2006), and can have separate grammars for parsing and translation.
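
The abstract's claim that a packed forest encodes exponentially many parses in a compact structure can be illustrated with a toy example. The Python sketch below is not the authors' implementation: it assumes a hypothetical forest in which every span of an n-word sentence is a single shared node and every binary split of that span is a hyperedge, and the names build_forest and count_trees are invented for illustration. A real parser's forest would be pruned and would carry labels and scores.

```python
from functools import lru_cache


def build_forest(n):
    """Build a toy packed forest over n words (illustrative assumption).

    Each node is a span (i, j). It maps to a list of incoming hyperedges,
    and each hyperedge is a tuple of child nodes. A single-word span has
    one hyperedge with no children (a leaf).
    """
    forest = {}
    for width in range(1, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            if width == 1:
                forest[(i, j)] = [()]
            else:
                # One hyperedge per split point k; children are shared nodes.
                forest[(i, j)] = [((i, k), (k, j)) for k in range(i + 1, j)]
    return forest


def count_trees(forest, root):
    """Count the distinct trees packed under root by dynamic programming."""

    @lru_cache(maxsize=None)
    def count(node):
        total = 0
        for tails in forest[node]:
            product = 1
            for child in tails:
                product *= count(child)
            total += product
        return total

    return count(root)


forest = build_forest(10)
num_nodes = len(forest)                   # grows quadratically with n
num_trees = count_trees(forest, (0, 10))  # grows exponentially with n
print(num_nodes, num_trees)
```

For a 10-word sentence this toy forest has 55 nodes but packs 4862 distinct binary trees, which is why an n-best list with a small n covers only a tiny fraction of the alternatives a forest represents.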