Scientific report: "Learning Source-Target Surface Patterns for Web-based Terminology Translation"

Jian-Cheng Wu
Department of Computer Science, National Tsing Hua University
101 Kuangfu Road, Hsinchu 300, Taiwan
D928322@

Tracy Lin
Dep. of Communication Eng., National Chiao Tung University
1001 Ta Hsueh Road, Hsinchu 300, Taiwan
tracylin@

Jason S. Chang
Department of Computer Science, National Tsing Hua University
101 Kuangfu Road, Hsinchu 300, Taiwan
jschang@

Abstract

This paper introduces a method for learning to find translations of a given source term on the Web. In this approach, the source term is used both as a query and as part of the patterns that retrieve and extract translations from Web pages. The method uses a bilingual term list to learn source-target surface patterns. At runtime, the given term is submitted to a search engine; candidate translations are then extracted from the returned summaries and ranked based on the surface patterns, occurrence counts, and transliteration knowledge. We present a prototype called TermMine that applies the method to translate terms. Evaluation on a set of encyclopedia terms shows that the method significantly outperforms state-of-the-art online machine translation systems.

1 Introduction

Translators have long recognized the translation of terms as the bottleneck of translation. Reusing prior translations can save a significant part of the time spent translating terms. For many years now, Computer-Aided Translation (CAT) tools have been touted as very useful for improving translators' productivity and quality. CAT tools such as Trados typically require an up-front investment to populate multilingual terminology and translation memory. However, such investment has proven prohibitive for many in-house translation departments and freelance translators, and the actual productivity gains realized have been insignificant except for a few very repetitive types of content. Much more productivity gain could be ...
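
The runtime procedure outlined in the abstract (extract candidate translations from search-engine summaries, then rank them by surface patterns and occurrence counts) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from TermMine: the three patterns and their weights are hand-written rather than learned from a bilingual term list, the snippets are invented rather than returned by a search engine, candidates are assumed to be runs of Chinese characters next to an English source term, and transliteration knowledge is left out entirely.

```python
import re
from collections import Counter

# Assumption: target-language candidates are runs of CJK ideographs.
CJK_RUN = r"[\u4e00-\u9fff]+"

# Hypothetical surface patterns with hand-picked weights (not learned).
# "{S}" marks the source term; group "t" captures the candidate translation.
SURFACE_PATTERNS = [
    (r"(?P<t>" + CJK_RUN + r")\s*[(（]\s*{S}\s*[)）]", 1.0),  # T (S)
    (r"{S}\s*[(（]\s*(?P<t>" + CJK_RUN + r")\s*[)）]", 0.8),  # S (T)
    (r"{S}\s*(?P<t>" + CJK_RUN + r")", 0.5),                  # S T
]


def rank_candidates(source, snippets, patterns=SURFACE_PATTERNS):
    """Score each candidate by summing the weights of its pattern matches.

    A candidate that occurs in several snippets accumulates weight once per
    match, so the score reflects both pattern type and occurrence count.
    """
    scores = Counter()
    for template, weight in patterns:
        # Substitute the escaped source term into the pattern template.
        regex = re.compile(template.replace("{S}", re.escape(source)),
                           re.IGNORECASE)
        for snippet in snippets:
            for match in regex.finditer(snippet):
                scores[match.group("t")] += weight
    # Highest score first; ties broken by candidate string for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Invented snippets standing in for search-engine summaries.
snippets = [
    "防火牆 (firewall) 是一種網路安全系統",
    "Firewall（防火牆）protects the network",
    "安裝 firewall 防護牆 軟體",
]
ranked = rank_candidates("firewall", snippets)
for candidate, score in ranked:
    print(candidate, score)
```

The sketch takes each captured run of characters as a whole candidate, so extra words adjacent to the true translation would be absorbed into it; a fuller treatment would need to consider sub-strings of each run and weigh them by how often they recur.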
