tailieunhanh - Scientific report: "Rule Markov Models for Fast Tree-to-String Translation"

Ashish Vaswani
Information Sciences Institute, University of Southern California
avaswani@

Haitao Mi
Institute of Computing Technology, Chinese Academy of Sciences
htmi@

Liang Huang and David Chiang
Information Sciences Institute, University of Southern California
{lhuang,chiang}@

Abstract

Most statistical machine translation systems rely on composed rules (rules that can be formed out of smaller rules in the grammar). Though this practice improves translation by weakening independence assumptions in the translation model, it nevertheless results in huge, redundant grammars, making both training and decoding inefficient. Here, we take the opposite approach, where we only use minimal rules (those that cannot be formed out of other rules), and instead rely on a rule Markov model of the derivation history to capture dependencies between minimal rules. Large-scale experiments on a state-of-the-art tree-to-string translation system show that our approach leads to a slimmer model, a faster decoder, yet the same translation quality (measured using BLEU) as composed rules.

1 Introduction

Statistical machine translation systems typically model the translation process as a sequence of translation steps, each of which uses a translation rule, for example, a phrase pair in phrase-based translation or a tree-to-string rule in tree-to-string translation. These rules are usually applied independently of each other, which violates the conventional wisdom that translation should be done in context. To alleviate this problem, most state-of-the-art systems rely on composed rules, which are larger rules that can be formed out of smaller rules (including larger phrase pairs that can be formed out of smaller phrase pairs), as opposed to minimal rules, which are rules that cannot be formed out of other rules. Although this approach does improve translation quality dramatically by weakening the independence assumptions in the ...
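
The idea stated in the abstract, scoring each minimal rule in the context of the derivation history rather than independently, can be illustrated with a small sketch. This is not the authors' model: the text above only says that the model conditions on "derivation history", so the sketch assumes the history is just the single preceding rule in a flattened rule sequence (a bigram model) with add-alpha smoothing. The names `train_rule_bigram`, `log_score`, the start symbol `<s>` and the rule identifiers are all hypothetical.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical start-of-derivation symbol


def train_rule_bigram(derivations, alpha=1.0):
    """Return prob(rule, prev): add-alpha smoothed P(rule | previous rule).

    `derivations` is a list of rule-ID sequences. Treating a derivation as a
    flat sequence is a simplifying assumption made for this sketch.
    """
    pair_counts = Counter()     # counts of (prev, rule)
    context_counts = Counter()  # counts of prev as a conditioning context
    vocab = set()
    for rules in derivations:
        prev = START
        for rule in rules:
            pair_counts[(prev, rule)] += 1
            context_counts[prev] += 1
            vocab.add(rule)
            prev = rule

    def prob(rule, prev):
        numerator = pair_counts[(prev, rule)] + alpha
        denominator = context_counts[prev] + alpha * len(vocab)
        return numerator / denominator

    return prob


def log_score(rules, prob):
    """Log-probability of a rule sequence under the bigram rule model."""
    total = 0.0
    prev = START
    for rule in rules:
        total += math.log(prob(rule, prev))
        prev = rule
    return total
```

With training sequences such as `[["r1", "r2"], ["r1", "r3"], ["r1", "r2"]]`, the sequence `["r1", "r2"]` scores higher than `["r1", "r3"]`, because `r2` follows `r1` more often. A model that applies rules independently would ignore which rule came before.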
