Scientific report: "Machine Translation with a Stochastic Grammatical Channel"

Dekai Wu and Hongsing Wong
HKUST Human Language Technology Center
Department of Computer Science
University of Science and Technology
Clear Water Bay, Hong Kong
dekai wong @

Abstract

We introduce a stochastic grammatical channel model for machine translation that synthesizes several desirable characteristics of both statistical and grammatical machine translation. As with the pure statistical translation model described by Wu (1996), in which a bracketing transduction grammar models the channel, alternative hypotheses compete probabilistically, exhaustive search of the translation hypothesis space can be performed in polynomial time, and robustness heuristics arise naturally from a language-independent inversion-transduction model. However, unlike pure statistical translation models, the generated output string is guaranteed to conform to a given target grammar. The model employs only (1) a translation lexicon, (2) a context-free grammar for the target language, and (3) a bigram language model. The fact that no explicit bilingual translation rules are used makes the model easily portable to a variety of source languages. Initial experiments show that it also achieves significant speed gains over our earlier model.

1 Motivation

Speed of statistical machine translation methods has long been an issue.
A step was taken by Wu (1996), who introduced a polynomial-time algorithm for the runtime search for an optimal translation. To achieve this, Wu's method substituted a language-independent stochastic bracketing transduction grammar (SBTG) in place of the simpler word-alignment channel models reviewed in Section 2. The SBTG channel made exhaustive search possible through dynamic programming, instead of the previous stack search heuristics. Translation accuracy was not compromised, because the SBTG is apparently flexible enough to model word-order variation between English and Chinese, even though it eliminates large portions of the ...
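
As a rough illustration of how a bracketing channel with inversion constrains word order, the sketch below enumerates the target orders reachable when every span is split into two parts that are emitted either in the same order or swapped. It is not taken from the paper: the function names `itg_orders` and `count_itg_orders` are hypothetical, and the code ignores probabilities, the lexicon, and the target grammar entirely.

```python
from functools import lru_cache
from itertools import permutations


@lru_cache(maxsize=None)
def itg_orders(i, j):
    """Return every target order (tuple of source positions) for span [i, j).

    Assumption for illustration: each span is either a single word or is
    split in two, with the halves emitted straight or inverted.
    """
    if j - i == 1:
        return frozenset({(i,)})
    orders = set()
    for k in range(i + 1, j):              # every binary split point
        for left in itg_orders(i, k):
            for right in itg_orders(k, j):
                orders.add(left + right)   # straight: same order in both languages
                orders.add(right + left)   # inverted: halves swapped
    return frozenset(orders)


def count_itg_orders(n):
    """Number of distinct reorderings of n words reachable by bracketing."""
    return len(itg_orders(0, n))


if __name__ == "__main__":
    for n in range(1, 6):
        total = len(list(permutations(range(n))))
        print(n, count_itg_orders(n), "of", total)
    # For n = 4 this reports 22 of 24; the two unreachable orders are
    # (1, 3, 0, 2) and (2, 0, 3, 1).
```

The memoized recursion over spans has the same shape as a chart-based dynamic program, which is why exhaustive search over such a restricted space can stay polynomial, while the count of admissible reorderings falls further behind n! as n grows.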
