Scientific report: "A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment"
Fan Yang, Jun Zhao, Kang Liu
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
fyang jzhao kliu @

Abstract

In this paper, we propose a novel system for translating organization names from Chinese to English with the assistance of web resources. First, we adopt a chunking-based segmentation method to improve the segmentation of Chinese organization names, which is plagued by the out-of-vocabulary (OOV) problem. Then, a heuristic query construction method is employed to build an efficient query for retrieving bilingual Web pages that contain translation equivalents. Finally, we align the Chinese organization name with English sentences using the asymmetric alignment method to find the best English fragment as the translation equivalent. The experimental results show that the proposed method outperforms the baseline statistical machine translation system.

1 Introduction

The task of Named Entity (NE) translation is to translate a named entity from the source language to the target language, which plays an important role in machine translation and cross-language information retrieval (CLIR). Organization name (ON) translation is the most difficult subtask in NE translation.
The structure of an ON is complex and usually nested, including person names, location names, sub-ONs, etc. For example, the organization name "北京诺基亚通信有限公司" (Beijing Nokia Communication Ltd.) contains a company name "诺基亚" (Nokia) and a location name "北京" (Beijing). Therefore, the translation of organization names should combine transliteration and translation. Many previous researchers have tried to solve the ON translation problem by building a statistical model or with the assistance of web resources. The performance of ON translation using web knowledge is determined by the solution of the following two .
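
The nested structure of the "Beijing Nokia Communication Ltd." example can be illustrated with a small sketch. It is only a toy under stated assumptions, not the system proposed in the paper: the `Chunk` class, the chunk labels, the two lookup tables, and the Chinese strings are all hypothetical stand-ins chosen for illustration. It shows why name-like chunks call for transliteration while ordinary keywords call for translation.

```python
from dataclasses import dataclass


# Hypothetical chunk type; the paper's own chunk inventory is not shown in this excerpt.
@dataclass(frozen=True)
class Chunk:
    text: str  # Chinese surface form of the chunk
    kind: str  # assumed labels: "location", "name", or "keyword"


# Toy lookup tables standing in for real transliteration and translation models.
TRANSLITERATION = {"北京": "Beijing", "诺基亚": "Nokia"}
TRANSLATION = {"通信": "Communication", "有限公司": "Ltd."}


def translate_chunk(chunk: Chunk) -> str:
    """Transliterate name-like chunks; translate ordinary keywords."""
    table = TRANSLITERATION if chunk.kind in {"location", "name"} else TRANSLATION
    return table[chunk.text]


def translate_organization_name(chunks: list[Chunk]) -> str:
    # Assumes English word order follows the Chinese chunk order,
    # which happens to hold for this example but not in general.
    return " ".join(translate_chunk(c) for c in chunks)


chunks = [
    Chunk("北京", "location"),
    Chunk("诺基亚", "name"),
    Chunk("通信", "keyword"),
    Chunk("有限公司", "keyword"),
]
print(translate_organization_name(chunks))
```

A lookup table fails as soon as a chunk is out of vocabulary or the English word order differs from the Chinese, which is the kind of gap that mining translation equivalents from the Web is meant to cover.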