Scientific report: "Rule Filtering by Pattern for Efficient Hierarchical Translation"

Gonzalo Iglesias, Adria de Gispert, Eduardo R. Banga, William Byrne

University of Vigo. Dept. of Signal Processing and Communications. Vigo, Spain. giglesia erbanga @
University of Cambridge. Dept. of Engineering. CB2 1PZ Cambridge. ad465 wjb31 @

Abstract

We describe refinements to hierarchical translation search procedures intended to reduce both search errors and memory usage through modifications to hypothesis expansion in cube pruning and reductions in the size of the rule sets used in translation. Rules are put into syntactic classes based on the number of non-terminals and the pattern, and various filtering strategies are then applied to assess the impact on translation speed and quality. Results are reported on the 2008 NIST Arabic-to-English evaluation task.

1 Introduction

Hierarchical phrase-based translation (Chiang, 2005) has emerged as one of the dominant current approaches to statistical machine translation. Hiero translation systems incorporate many of the strengths of phrase-based translation systems, such as feature-based translation and strong target language models, while also allowing flexible translation and movement based on hierarchical rules extracted from aligned parallel text. The approach has been widely adopted and reported to be competitive with other large-scale data-driven approaches (Zollmann et al., 2008).
Large-scale hierarchical SMT involves automatic rule extraction from aligned parallel text, model parameter estimation, and the use of cube pruning k-best list generation in hierarchical translation. The number of hierarchical rules extracted far exceeds the number of phrase translations typically found in aligned text. While this may lead to improved translation quality, there is also the risk of lengthened translation times and increased memory usage, along with possible search errors due to the pruning procedures needed in search. We describe .
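
The abstract's idea of grouping rules into classes by number of non-terminals and by pattern, then filtering whole classes, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a pattern is obtained by collapsing every run of terminals into a single symbol `w` while keeping non-terminals in place, and that non-terminals are written `X`, `X1`, `X2`, and so on; neither convention is stated in this excerpt. The function names and the toy rules are hypothetical.

```python
def is_nonterminal(token):
    # Assumption: non-terminals are written X, X1, X2, ...
    return token.startswith("X") and (token[1:] == "" or token[1:].isdigit())


def side_pattern(tokens):
    """Collapse each maximal run of terminals into 'w'; keep non-terminals."""
    out = []
    for tok in tokens:
        if is_nonterminal(tok):
            out.append(tok)
        elif not out or out[-1] != "w":
            out.append("w")
    return " ".join(out)


def rule_class(source, target):
    """Return (number of source non-terminals, (source pattern, target pattern))."""
    src = source.split()
    n_nt = sum(1 for t in src if is_nonterminal(t))
    return n_nt, (side_pattern(src), side_pattern(target.split()))


def filter_rules(rules, excluded_classes):
    """Drop every rule whose class is in excluded_classes (one possible strategy)."""
    return [r for r in rules if rule_class(*r) not in excluded_classes]


# Hypothetical toy rules as (source side, target side) pairs.
rules = [
    ("a X1 b c", "X1 d"),
    ("X1 a X2", "X2 b X1"),
    ("a b", "c"),
]

# Exclude one whole class: two non-terminals with a reordering target pattern.
excluded = {(2, ("X1 w X2", "X2 w X1"))}
kept = filter_rules(rules, excluded)
```

Filtering by class rather than by individual rule keeps the decision table small: one entry in `excluded` removes every rule sharing that pattern, however many distinct word sequences the rule set contains.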
