Scientific report: "Fine-grained Tree-to-String Translation Rule Extraction"

Xianchao Wu, Takuya Matsuzaki, Jun'ichi Tsujii
Department of Computer Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
School of Computer Science, University of Manchester
National Centre for Text Mining (NaCTeM), Manchester Interdisciplinary Biocentre, 131 Princess Street, Manchester M1 7DN, UK
{wxc, matuzaki, tsujii}@

Abstract

Tree-to-string translation rules are widely used in linguistically syntax-based statistical machine translation systems. In this paper, we propose to use deep syntactic information for obtaining fine-grained translation rules. A head-driven phrase structure grammar (HPSG) parser is used to obtain the deep syntactic information, which includes a fine-grained description of the syntactic property and a semantic representation of a sentence. We extract fine-grained rules from aligned HPSG tree/forest-string pairs and use them in our tree-to-string and string-to-tree systems. Extensive experiments on large-scale bidirectional Japanese-English translations confirmed the effectiveness of our approach.

1 Introduction

Tree-to-string translation rules are generic and applicable to numerous linguistically syntax-based Statistical Machine Translation (SMT) systems, such as string-to-tree translation (Galley et al., 2004; Galley et al., 2006; Chiang et al., 2009), tree-to-string translation (Liu et al., 2006; Huang et al., 2006), and forest-to-string translation (Mi et al., 2008; Mi and Huang, 2008). The algorithms proposed by Galley et al. (2004; 2006) are frequently used for extracting minimal and composed rules from aligned 1-best tree-string pairs. To deal with the parse error problem and the rule sparseness problem, Mi and Huang (2008) replaced the 1-best parse tree with a packed forest, which compactly encodes exponentially many parses, for tree-to-string rule extraction. However, current tree-to-string rules only make use of Probabilistic Context-Free Grammar (PCFG) tree fragments in which .
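To make the idea of rule extraction from an aligned tree-string pair more concrete, the following is a minimal sketch of one common first step in Galley-style extraction: finding the tree nodes whose aligned string span is not interrupted by words aligned outside the node (often called frontier nodes). This is an illustration under stated assumptions, not the authors' implementation. The `Node` class, the function names, the toy tree, and the toy alignment are all hypothetical, and the sketch simply skips nodes that cover no aligned word.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Node:
    """Hypothetical parse-tree node; leaves carry the index of their word."""
    label: str
    children: List["Node"] = field(default_factory=list)
    leaf_index: Optional[int] = None  # word position on the tree side (leaves only)


def leaf_indices(node: Node) -> Set[int]:
    """Collect the tree-side word positions covered by this node."""
    if node.leaf_index is not None:
        return {node.leaf_index}
    covered: Set[int] = set()
    for child in node.children:
        covered |= leaf_indices(child)
    return covered


def frontier_nodes(root: Node, alignment: Set[Tuple[int, int]]) -> List[Node]:
    """Return nodes (in pre-order) whose aligned string span contains no
    string word that is aligned to a tree word outside the node.

    alignment holds (tree_word_index, string_word_index) pairs.
    Assumption of this sketch: nodes covering no aligned word are skipped.
    """
    result: List[Node] = []

    def visit(node: Node) -> None:
        covered = leaf_indices(node)
        inside = {t for s, t in alignment if s in covered}
        if inside:
            lo, hi = min(inside), max(inside)
            outside = {t for s, t in alignment if s not in covered}
            if not any(lo <= t <= hi for t in outside):
                result.append(node)
        for child in node.children:
            visit(child)

    visit(root)
    return result


# Toy tree S -> NP-subj VP, VP -> V NP-obj, over tree-side words 0, 1, 2.
tree = Node("S", [
    Node("NP-subj", leaf_index=0),
    Node("VP", [Node("V", leaf_index=1), Node("NP-obj", leaf_index=2)]),
])
# Toy alignment: tree word 0 -> string word 1, 1 -> 0, 2 -> 2.
alignment = {(0, 1), (1, 0), (2, 2)}

labels = [n.label for n in frontier_nodes(tree, alignment)]
print(labels)  # ['S', 'NP-subj', 'V', 'NP-obj']
```

In this toy case VP is excluded because its string span (positions 0 to 2) contains position 1, which is aligned to the subject outside VP. A minimal rule would then be rooted at a frontier node and cut off at the next frontier nodes below it.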
