Scientific report: "A Hierarchical Phrase-Based Model for Statistical Machine Translation"

David Chiang
Institute for Advanced Computer Studies (UMIACS)
University of Maryland, College Park, MD 20742, USA
dchiang@

Abstract

We present a statistical phrase-based translation model that uses hierarchical phrases, that is, phrases that contain subphrases. The model is formally a synchronous context-free grammar but is learned from a bitext without any syntactic information. Thus it can be seen as a shift to the formal machinery of syntax-based translation systems without any linguistic commitment. In our experiments using BLEU as a metric, the hierarchical phrase-based model achieves a relative improvement over Pharaoh, a state-of-the-art phrase-based system.

1 Introduction

The alignment template translation model (Och and Ney, 2004) and related phrase-based models advanced the previous state of the art by moving from words to phrases as the basic unit of translation. Phrases, which can be any substring and not necessarily phrases in any syntactic theory, allow these models to learn local reorderings, translation of short idioms, or insertions and deletions that are sensitive to local context. They are thus a simple and powerful mechanism for machine translation.

The basic phrase-based model is an instance of the noisy-channel approach (Brown et al., 1993) [1], in which the translation of a French sentence f into an English sentence e is modeled as:

\begin{align}
\arg\max_e P(e \mid f) &= \arg\max_e P(e, f) \tag{1} \\
&= \arg\max_e \bigl( P(e) \times P(f \mid e) \bigr) \tag{2}
\end{align}

[1] Throughout this paper, we follow the convention of Brown et al. (1993) of designating the source and target languages as French and English, respectively. The variables f and e stand for source and target sentences; $f_i^j$ stands for the substring of f from position i to position j inclusive, and similarly for $e_i^j$.

The translation model P(f | e) encodes e into f by the following steps:

1. segment e into phrases $\bar{e}_1 \ldots \bar{e}_I$, typically with a uniform distribution over segmentations;
2. reorder the $\bar{e}_i$ according to some distortion model;