tailieunhanh - Scientific report: "Word Lattices for Multi-Source Translation"

Word Lattices for Multi-Source Translation

Josh Schroeder, Trevor Cohn and Philipp Koehn
School of Informatics, University of Edinburgh
10 Crichton Street, Edinburgh EH8 9AB, Scotland, United Kingdom
jschroel tcohn pkoehn @

Abstract

Multi-source statistical machine translation is the process of generating a single translation from multiple inputs. Previous work has focused primarily on selecting from potential outputs of separate translation systems, and solely on multi-parallel corpora and test sets. We demonstrate how multi-source translation can be adapted for multiple monolingual inputs. We also examine different approaches to dealing with multiple sources, including consensus decoding, and we present a novel method of input combination to generate lattices for multi-source translation within a single translation model.

1 Introduction

Multi-source statistical machine translation was first formally defined by Och and Ney (2001) as the process of translating multiple meaning-equivalent source language texts into a single target language. Multi-source translation is of particular use when translating a document that has already been translated into several languages, either by humans or machines, and needs to be further translated into other target languages.

This situation occurs often in large multi-lingual organisations such as the United Nations and the European Parliament, which must translate their proceedings into the languages of the member institutions. It is also common in multi-national companies, which need to translate product and marketing documentation for their different markets. Clearly, any existing translations for a document can help automatic translation into other languages. These different versions of the input can resolve deficiencies and ambiguities (e.g. syntactic and semantic ambiguity) present in a single input, resulting in higher quality translation output.

In this paper, we present three models of multi-source translation, with increasing degrees of sophistication, which we ...
