tailieunhanh - Scientific report: "Context-dependent SMT Model using Bilingual Verb-Noun Collocation"

Context-dependent SMT Model using Bilingual Verb-Noun Collocation

Young-Sook Hwang, ATR SLT Research Labs, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan
Yutaka Sasaki, ATR SLT Research Labs, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan

Abstract

In this paper, we propose a new context-dependent SMT model that is tightly coupled with a language model. It is designed to decrease translation ambiguities and to search efficiently for an optimal hypothesis by reducing the hypothesis search space. It works through reciprocal incorporation between source and target context: a source word is determined by the context of the previous and corresponding target words, and the next target word is predicted by the pair consisting of the previous target word and its corresponding source word. To alleviate data sparseness in chunk-based translation, we take a stepwise back-off translation strategy. Moreover, to obtain more semantically plausible translation results, we use bilingual verb-noun collocations; these are automatically extracted by using chunk alignment and a monolingual dependency parser. As a case study, we experimented on the Japanese-Korean language pair. As a result, we could not only reduce the search space but also improve the performance.

1 Introduction

For decades, many research efforts have contributed to the advance of statistical machine translation.
Recently, various works have improved the quality of statistical machine translation systems by using phrase translation (Koehn et al., 2003; Marcu et al., 2002; Och et al., 1999; Och and Ney, 2000; Zens et al., 2004). Most phrase-based translation models have adopted the noisy-channel-based IBM-style models (Brown et al., 1993):

$$\hat{e}_1^I = \arg\max_{e_1^I} \Pr(f_1^J \mid e_1^I) \, \Pr(e_1^I) \qquad (1)$$

In these models, we have two types of knowledge: the translation model $\Pr(f_1^J \mid e_1^I)$ and the language model $\Pr(e_1^I)$. The translation model links the source language sentence to the target language sentence.
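The noisy-channel decision rule can be illustrated with a minimal sketch. It is not the authors' decoder: it assumes a tiny, enumerable set of candidate target sentences, and the function name `decode`, the tables `TM` and `LM`, and all probability values are hypothetical and invented purely for illustration. English stands in for the target language for readability.

```python
import math
from typing import Dict, Optional, Tuple

def decode(source: str,
           tm: Dict[Tuple[str, str], float],
           lm: Dict[str, float]) -> Optional[str]:
    """Return the candidate e maximizing Pr(source | e) * Pr(e).

    tm maps (source, target) pairs to Pr(source | target);
    lm maps target sentences to Pr(target).
    Returns None when no candidate has non-zero probability.
    """
    best, best_score = None, -math.inf
    for target, p_target in lm.items():
        p_source_given_target = tm.get((source, target), 0.0)
        if p_source_given_target <= 0.0 or p_target <= 0.0:
            continue  # skip candidates the models rule out
        # Work in log space so products of small probabilities stay stable.
        score = math.log(p_source_given_target) + math.log(p_target)
        if score > best_score:
            best, best_score = target, score
    return best

# Hypothetical toy values, not taken from the paper.
SOURCE = "watashi wa hon o yomu"
TM = {
    (SOURCE, "a book read I"): 0.5,   # translation model alone prefers this
    (SOURCE, "I read a book"): 0.4,
}
LM = {
    "a book read I": 0.01,            # disfluent, so the language model penalizes it
    "I read a book": 0.2,
}

print(decode(SOURCE, TM, LM))
```

In this toy setting the translation model alone would favor the disfluent candidate, and the language model term is what tips the decision toward the fluent one. A real decoder cannot enumerate candidates like this and has to search a large hypothesis space, which is the cost the proposed model aims to reduce.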
