Scientific report: "Generalizing Word Lattice Translation"

Christopher Dyer, Smaranda Muresan, Philip Resnik
Laboratory for Computational Linguistics and Information Processing, Institute for Advanced Computer Studies, Department of Linguistics, University of Maryland, College Park, MD 20742, USA
redpony, smara, resnik AT

Abstract

Word lattice decoding has proven useful in spoken language translation; we argue that it provides a compelling model for translation of text genres, as well. We show that prior work in translating lattices using finite state techniques can be naturally extended to more expressive synchronous context-free grammar-based models. Additionally, we resolve a significant complication that non-linear word lattice inputs introduce in reordering models. Our experiments evaluating the approach demonstrate substantial gains for Chinese-English and Arabic-English translation.

1 Introduction

When Brown and colleagues introduced statistical machine translation in the early 1990s, their key insight, harkening back to Weaver in the late 1940s, was that translation could be viewed as an instance of noisy channel modeling (Brown et al., 1990). They introduced a now standard decomposition that distinguishes modeling sentences in the target language (language models) from modeling the relationship between source and target language (translation models).
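For reference, the noisy channel decomposition mentioned above follows from Bayes' rule. The sketch below uses e for a target-language sentence and f for a source-language sentence; this is the standard textbook form of the decomposition and is not quoted from the paper itself.

```latex
\begin{align*}
\hat{e} &= \arg\max_{e} \Pr(e \mid f) \\
        &= \arg\max_{e} \frac{\Pr(f \mid e)\,\Pr(e)}{\Pr(f)} \\
        &= \arg\max_{e} \underbrace{\Pr(f \mid e)}_{\text{translation model}}\;
                        \underbrace{\Pr(e)}_{\text{language model}}
\end{align*}
```

The denominator Pr(f) can be dropped because it does not depend on e, so it has no effect on which hypothesis maximizes the expression.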
Today, virtually all statistical translation systems seek the best hypothesis e for a given input f in the source language, according to

$$\hat{e} = \arg\max_{e} \Pr(e \mid f) \qquad (1)$$

An exception is the translation of speech recognition output, where the acoustic signal generally underdetermines the choice of source word sequence f. There, Bertoldi and others have recently found that, rather than translating a single-best transcription f, it is advantageous to allow the MT decoder to consider all possibilities for f by encoding the alternatives compactly as a confusion network or lattice (Bertoldi et al., 2007; Bertoldi and Federico, 2005; Koehn et al., 2007). Why, however ...
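To make concrete the idea of encoding alternatives for f compactly as a lattice, here is a minimal Python sketch. It is an illustration only and not the authors' implementation: the names `LATTICE` and `enumerate_paths`, the example words, and the edge probabilities are all invented for this sketch. The lattice is a directed acyclic graph whose edges carry a word and a probability, and every path from the start node to the end node is one candidate source sentence.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical toy lattice: node -> list of (next_node, word, probability).
# Node 0 is the start and node 4 is the end; the probabilities are made up.
Lattice = Dict[int, List[Tuple[int, str, float]]]

LATTICE: Lattice = {
    0: [(3, "recognize", 0.6), (1, "wreck", 0.4)],
    1: [(2, "a", 1.0)],
    2: [(3, "nice", 1.0)],
    3: [(4, "speech", 0.7), (4, "beach", 0.3)],
}


def enumerate_paths(
    lattice: Lattice, node: int, end: int
) -> Iterator[Tuple[List[str], float]]:
    """Yield (word sequence, path probability) for every path node -> end."""
    if node == end:
        yield [], 1.0
        return
    for next_node, word, prob in lattice.get(node, []):
        for words, rest_prob in enumerate_paths(lattice, next_node, end):
            yield [word] + words, prob * rest_prob


all_paths = list(enumerate_paths(LATTICE, 0, 4))

# The single-best transcription is only one of the paths the lattice encodes.
best_words, best_prob = max(all_paths, key=lambda item: item[1])

for words, prob in all_paths:
    print(f"{prob:.2f}  {' '.join(words)}")
```

Six edges are enough here to represent four distinct candidate sentences, which is the sense in which a lattice is compact. Enumerating every path, as this sketch does, grows exponentially with lattice size, so a real decoder would work over the graph directly rather than listing paths.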
