Scientific report: "English-Chinese Bi-Directional OOV Translation based on Web Mining and Supervised Learning"

Yuejie Zhang, Yang Wang and Xiangyang Xue
School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai 200433, China
{yjzhang, 072021176, xyxue}@

Abstract

In Cross-Language Information Retrieval (CLIR), Out-of-Vocabulary (OOV) detection and the evaluation of translation pair relevance remain key problems. This paper presents an English-Chinese bi-directional OOV translation model that uses Web mining as the corpus source to collect translation pairs and applies supervised learning to evaluate their degree of association. The experimental results show that the proposed model can select the most likely translation candidate at a lower computational cost and improve OOV translation ranking, especially for popular new words.

1 Introduction

In Cross-Language Information Retrieval (CLIR), most queries are composed of short terms, many of which are Out-of-Vocabulary (OOV) terms such as named entities, new words, and terminology. The translation quality of OOV terms directly influences the precision of retrieving relevant multilingual information. OOV translation has therefore become an important and challenging issue in CLIR.
The translation of an OOV term can be acquired either from a parallel or comparable corpus (Lee, 2006) or by mining the Web (Lu, 2004). In both cases, however, it remains important to evaluate the degree of association between the source query term and its target translation. In this paper, an OOV translation model is established that combines Web mining with translation ranking. Given an OOV term, related information is obtained from the results returned by a search engine. Possible translation terms in the target language are extracted from these results and then ranked through supervised learning, such as Support Vector Machine (SVM) and Ranking-SVM (Cao, 2006). The basic framework
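
The ranking step described above can be illustrated with a minimal sketch. It assumes each candidate translation has already been extracted from search results and reduced to a numeric feature vector. The two features, the toy relevance labels, and the function names (`pairwise_differences`, `train_pairwise_ranker`, `rank_candidates`) are hypothetical and not taken from the paper. Instead of an SVM solver, the sketch uses plain subgradient descent on the regularised pairwise hinge loss, which is the objective behind a linear Ranking-SVM, so it should be read as an illustration of the idea and not as the authors' implementation.

```python
import random


def pairwise_differences(candidates):
    """Return feature differences (better - worse) for one query.

    `candidates` is a list of (features, relevance) tuples; a difference
    vector is produced whenever one candidate is more relevant than another.
    """
    diffs = []
    for feats_a, rel_a in candidates:
        for feats_b, rel_b in candidates:
            if rel_a > rel_b:
                diffs.append([a - b for a, b in zip(feats_a, feats_b)])
    return diffs


def train_pairwise_ranker(queries, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Learn weights w so that w . (better - worse) >= 1 on training pairs.

    Uses subgradient descent on the L2-regularised hinge loss; pairs are
    only formed between candidates of the same query.
    """
    rng = random.Random(seed)
    pairs = [d for cands in queries for d in pairwise_differences(cands)]
    w = [0.0] * len(pairs[0])
    for _ in range(epochs):
        rng.shuffle(pairs)
        for diff in pairs:
            margin = sum(wi * di for wi, di in zip(w, diff))
            for i in range(len(w)):
                # The hinge term contributes only while the margin is violated.
                hinge_grad = -diff[i] if margin < 1.0 else 0.0
                w[i] -= lr * (reg * w[i] + hinge_grad)
    return w


def rank_candidates(w, candidates):
    """Sort (term, features) candidates by descending learned score."""
    def score(feats):
        return sum(wi * fi for wi, fi in zip(w, feats))
    return sorted(candidates, key=lambda c: score(c[1]), reverse=True)


# Toy, made-up data: features are [co-occurrence score, proximity score],
# and larger relevance labels mean better translations (assumption).
training_queries = [
    [([0.9, 0.8], 2), ([0.4, 0.5], 1), ([0.1, 0.2], 0)],
    [([0.8, 0.6], 1), ([0.2, 0.3], 0)],
]
weights = train_pairwise_ranker(training_queries)

# Placeholder candidate names stand in for extracted translation terms.
new_candidates = [
    ("cand_a", [0.2, 0.1]),
    ("cand_b", [0.7, 0.9]),
    ("cand_c", [0.5, 0.4]),
]
ranked = rank_candidates(weights, new_candidates)
print([term for term, _ in ranked])
```

In a real system the feature vectors would come from the Web-mining stage, and a dedicated SVM library would normally replace the hand-written training loop.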
