tailieunhanh - Scientific report: "A Pattern-based Machine Translation System Extended by Example-based Processing"
Hideo Watanabe and Koichi Takeda
IBM Research, Tokyo Research Laboratory
1623-14 Shimotsuruma, Yamato, Kanagawa 242-8502, Japan
watanabe takeda @

Abstract
In this paper, we describe a machine translation system called PalmTree, which uses the "pattern-based" approach as its fundamental framework. The pure pattern-based translation framework has several issues. One is slow performance, caused by the many rules used in the parsing stage; the other is inefficient use of translation patterns, caused by exact matching. To overcome these problems, we describe several methods: pruning techniques for the former, and the introduction of example-based processing for the latter.

1 Introduction
While the World-Wide Web (WWW) has quickly turned the Internet into a treasury of information for every netizen, non-native English speakers now face a serious problem: textual data are more often than not written in a foreign language. This has led to an explosive popularity of machine translation (MT) tools in the world. Under these circumstances, we developed a machine translation system called PalmTree, which uses the pattern-based translation [6, 7] formalism. The key idea of pattern-based MT is to employ a massive collection of diverse transfer knowledge and to select the best translation among the translation candidates (ambiguities).
This is a natural extension of example-based MT in the sense that we incorporate not only sentential correspondences (bilingual corpora) but every other level of linguistic expression (lexical, phrasal, and collocational) into the transfer knowledge. It is also a rule-based counterpart to the word n-grams of stochastic MT, since our patterns intuitively capture frequent collocations. Although the pattern-based MT framework is promising, there are some drawbacks. One is speed, since it uses many rules when parsing. The other is inefficiency of usage of translation ...
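
The exact-matching limitation named in the abstract can be illustrated with a toy sketch. This is not PalmTree's algorithm: the pattern table, the function names `exact_match` and `nearest_pattern`, the 0.6 threshold, and the use of Python's `difflib` similarity as a stand-in for example-based matching are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher

# Hypothetical toy transfer patterns: source token tuple -> target phrase.
# A real system would hold far more patterns at lexical, phrasal,
# and collocational levels.
PATTERNS = {
    ("take", "a", "bus"): "prendre un bus",
    ("make", "a", "decision"): "prendre une décision",
}


def exact_match(tokens, patterns):
    """Return the target phrase only if the input equals a stored pattern."""
    return patterns.get(tuple(tokens))


def nearest_pattern(tokens, patterns, threshold=0.6):
    """Return (pattern, score) for the most similar stored pattern,
    or None if no pattern reaches the (assumed) similarity threshold."""
    best_key, best_score = None, 0.0
    for key in patterns:
        # Token-level similarity: 2 * matches / total tokens in both sequences.
        score = SequenceMatcher(None, tuple(tokens), key).ratio()
        if score > best_score:
            best_key, best_score = key, score
    if best_key is not None and best_score >= threshold:
        return best_key, best_score
    return None


sentence = ["take", "a", "taxi"]
print(exact_match(sentence, PATTERNS))      # no stored pattern matches exactly
print(nearest_pattern(sentence, PATTERNS))  # a similar pattern can still be reused
```

With exact matching, the pattern for "take a bus" is unusable for "take a taxi" even though only one word differs. A similarity-based lookup lets the nearest pattern be reused, which is the general intuition behind extending pattern matching with example-based processing.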