Enhancing Language Models in Statistical Machine Translation with Backward N-grams and Mutual Information Triggers

Deyi Xiong, Min Zhang, Haizhou Li
Human Language Technology, Institute for Infocomm Research
1 Fusionopolis Way, #21-01 Connexis, Singapore 138632
{dyxiong, mzhang, hli}@

Abstract

In this paper, with the belief that a language model that embraces a larger context provides better prediction ability, we present two extensions to standard n-gram language models in statistical machine translation: a backward language model that augments the conventional forward language model, and a mutual information trigger model that captures long-distance dependencies beyond the scope of standard n-gram language models. We integrate the two proposed models into phrase-based statistical machine translation and conduct experiments on large-scale training data to investigate their effectiveness. Our experimental results show that both models significantly improve translation quality and collectively achieve a gain of up to 1 BLEU point over a competitive baseline.

1 Introduction

The language model is one of the most important knowledge sources for statistical machine translation (SMT) (Brown et al., 1993). The standard n-gram language model (Goodman, 2001) assigns probabilities to hypotheses in the target language, conditioning each word on a context history of the preceding n-1 words. Along with the efforts that advance translation models from the word-based paradigm to syntax-based approaches in recent years, we have also witnessed increasing efforts dedicated to extending standard n-gram language models for SMT. We roughly categorize these efforts into two directions: data-volume-oriented and data-depth-oriented.

In the first direction, more data is better. In order to benefit from monolingual corpora, such as LDC news data or news data collected from web pages, which consist of billions or even trillions of English words, huge language models are built in a distributed manner (Zhang et al., 2006).
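Since this excerpt breaks off before the paper's own model definitions, the following is a minimal sketch, in standard notation, of the quantities the abstract and introduction refer to; the backward formulation and the use of pointwise mutual information shown here are assumptions based on common practice rather than the paper's exact equations. A forward n-gram model scores a target hypothesis e_1^m over preceding contexts, a backward n-gram model scores it over succeeding contexts, and a trigger pair (x, y) is weighted by its pointwise mutual information:

\[
P_{\text{fwd}}(e_1^m) = \prod_{i=1}^{m} P\!\left(e_i \mid e_{i-n+1}^{\,i-1}\right), \qquad
P_{\text{bwd}}(e_1^m) = \prod_{i=1}^{m} P\!\left(e_i \mid e_{i+1}^{\,i+n-1}\right), \qquad
\mathrm{PMI}(x, y) = \log \frac{P(x, y)}{P(x)\,P(y)}
\]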
