Scientific report: "Word Sense Disambiguation Improves Statistical Machine Translation"

Yee Seng Chan and Hwee Tou Ng
Department of Computer Science, National University of Singapore
3 Science Drive 2, Singapore 117543
chanys nght @

David Chiang
Information Sciences Institute, University of Southern California
4676 Admiralty Way, Suite 1001, Marina del Rey, CA 90292, USA
chiang@

Abstract

Recent research presents conflicting evidence on whether word sense disambiguation (WSD) systems can help to improve the performance of statistical machine translation (MT) systems. In this paper, we successfully integrate a state-of-the-art WSD system into a state-of-the-art hierarchical phrase-based MT system, Hiero. We show for the first time that integrating a WSD system improves the performance of a state-of-the-art statistical MT system on an actual translation task. Furthermore, the improvement is statistically significant.

1 Introduction

Many words have multiple meanings, depending on the context in which they are used. Word sense disambiguation (WSD) is the task of determining the correct meaning or sense of a word in context. WSD is regarded as an important research problem and is assumed to be helpful for applications such as machine translation (MT) and information retrieval.

In translation, different senses of a word w in a source language may have different translations in a target language, depending on the particular meaning of w in context. Hence, the assumption is that in resolving sense ambiguity, a WSD system will be able to help an MT system to determine the correct translation for an ambiguous word. To determine the correct sense of a word, WSD systems typically use a wide array of features that are not limited to the local context of w, and some of these features may not be used by state-of-the-art statistical MT systems.

To perform translation, state-of-the-art MT systems use a statistical phrase-based approach (Marcu and Wong, 2002; Koehn et al., 2003; Och and Ney, 2004) by treating ...
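The idea that the sense of a word w in context determines its translation can be illustrated with a toy sketch. This is not the WSD system or the Hiero integration described in the paper; it is a minimal word-overlap classifier written only to show the principle. The training examples, the French target words, and the function names train and choose_translation are all invented for this illustration.

```python
from collections import Counter

# Hypothetical toy data: (context words, translation of the English word "bank").
# These examples are invented for illustration and do not come from the paper.
TRAINING = [
    ("deposit money account loan".split(), "banque"),
    ("interest account branch teller".split(), "banque"),
    ("river water fishing shore".split(), "rive"),
    ("muddy river grass water".split(), "rive"),
]


def train(examples):
    """Count how often each context word co-occurs with each translation."""
    counts = {}
    for context, translation in examples:
        counts.setdefault(translation, Counter()).update(context)
    return counts


def choose_translation(counts, context):
    """Return the translation whose training contexts overlap most with `context`."""
    scores = {
        translation: sum(word_counts[word] for word in context)
        for translation, word_counts in counts.items()
    }
    return max(scores, key=scores.get)


model = train(TRAINING)
print(choose_translation(model, "she opened an account at the bank".split()))
print(choose_translation(model, "they walked along the river bank".split()))
```

With this toy data the first sentence selects "banque" because of the cue word "account", and the second selects "rive" because of "river". A real WSD system would use far richer features than word overlap, which is the point the introduction makes about features that statistical MT systems may not exploit.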
