Scientific report: "Revisiting Pivot Language Approach for Machine Translation"

Hua Wu and Haifeng Wang
Toshiba (China) Research and Development Center
5/F., Tower W2, Oriental Plaza, Beijing 100738, China
wuhua wanghaifeng @

Abstract

This paper revisits the pivot language approach for machine translation. First, we investigate three different methods for pivot translation. Then we employ a hybrid method combining RBMT and SMT systems to fill the data gap for pivot translation, where the source-pivot and pivot-target corpora are independent. Experimental results on spoken language translation show that this hybrid method significantly improves translation quality and outperforms the method that uses a source-target corpus of the same size. In addition, we propose a system combination approach to select better translations from those produced by various pivot translation methods. This method regards system combination as a translation evaluation problem and formalizes it with a regression learning model. Experimental results indicate that our method achieves consistent and significant improvement over individual translation outputs.

1 Introduction

Current statistical machine translation (SMT) systems rely on large parallel and monolingual training corpora to produce translations of relatively high quality. Unfortunately, large quantities of parallel data are not readily available for some language pairs, which limits the potential use of current SMT systems.
In particular, for speech translation, the translation task often focuses on a specific domain such as the travel domain. It is especially difficult to obtain such a domain-specific corpus for some language pairs, such as Chinese to Spanish translation. To circumvent the data bottleneck, some researchers have investigated the use of a pivot language approach (Cohn and Lapata, 2007; Utiyama and Isahara, 2007; Wu and Wang, 2007; Bertoldi et al., 2008). This approach introduces a third language, named the pivot language, for which .
