tailieunhanh - Scientific report: "Automatic Construction of Machine Translation Knowledge Using Translation Literalness"

Kenji Imamura, Eiichiro Sumita
ATR Spoken Language Translation Research Laboratories, Seika-cho, Soraku-gun, Kyoto, Japan
@

Yuji Matsumoto
Nara Institute of Science and Technology, Ikoma-shi, Nara, Japan
matsu@

Abstract

When machine translation (MT) knowledge is automatically constructed from bilingual corpora, redundant rules are acquired due to translation variety. These rules increase ambiguity or cause incorrect MT results. To overcome this problem, we constrain the sentences used for knowledge extraction to "the appropriate bilingual sentences for the MT." In this paper, we propose a method using translation literalness to select appropriate sentences or phrases. The translation correspondence rate (TCR) is defined as the literalness measure. Based on the TCR, two automatic construction methods are tested. One is to filter the corpus before rule acquisition. The other is to split the acquisition process into two phases, where a bilingual sentence is divided into literal parts and the other parts before different generalizations are applied. The effects are evaluated by the MT quality, and about of MT results were improved by the latter method.

1 Introduction

Along with the efforts made to accumulate bilingual corpora for many language pairs, quite a few machine translation (MT) systems that automatically construct their knowledge from corpora have been proposed (Brown et al., 1993; Menezes and Richardson, 2001; Imamura, 2002).
However, if we use corpora without any restriction, redundant rules are acquired due to translation variety. Such rules increase ambiguity and may cause inappropriate MT results. Translation variety increases with corpus size. For instance, large corpora usually contain multiple translations of the same source sentences. Moreover, peculiar translations that depend on context or situation proliferate in large corpora. Our .
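
The first method named in the abstract, filtering the corpus before rule acquisition, can be illustrated with a minimal sketch. This excerpt does not give the definition of the TCR, so the score below is an assumed stand-in: the share of words on both sides that are linked through a bilingual lexicon. The names `literalness_score`, `filter_corpus`, `LEXICON`, and `PAIRS`, the romanized toy sentences, and the threshold value are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

SentencePair = Tuple[List[str], List[str]]


def literalness_score(src: List[str], tgt: List[str],
                      lexicon: Dict[str, Set[str]]) -> float:
    """Assumed stand-in for a literalness measure (NOT the paper's TCR).

    Counts one-to-one word links licensed by the lexicon and normalizes
    by the total number of words on both sides, giving a value in [0, 1].
    """
    unused_tgt = list(tgt)  # target words not yet linked
    links = 0
    for word in src:
        # sorted() keeps the choice of candidate deterministic
        for candidate in sorted(lexicon.get(word, set())):
            if candidate in unused_tgt:
                unused_tgt.remove(candidate)
                links += 1
                break
    total = len(src) + len(tgt)
    return 2 * links / total if total else 0.0


def filter_corpus(pairs: List[SentencePair],
                  lexicon: Dict[str, Set[str]],
                  threshold: float) -> List[SentencePair]:
    """Keep only sentence pairs whose score reaches the threshold."""
    return [(s, t) for s, t in pairs
            if literalness_score(s, t, lexicon) >= threshold]


# Toy data: a hypothetical lexicon and two translations of one source sentence.
LEXICON = {"watashi": {"i"}, "hon": {"book"}, "yomu": {"read"}}
PAIRS = [
    (["watashi", "hon", "yomu"], ["i", "read", "book"]),            # literal
    (["watashi", "hon", "yomu"], ["reading", "is", "my", "hobby"]),  # free
]

kept = filter_corpus(PAIRS, LEXICON, threshold=0.5)
```

With this toy data, only the literal translation passes the filter, so rule acquisition would see one translation of the source sentence instead of two competing ones. The second method in the abstract, which splits a sentence pair into literal and non-literal parts, would need phrase-level alignment and is not covered by this sketch.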
