tailieunhanh - Scientific report: "Chinese-English Backward Transliteration Assisted with Mining Monolingual Web Pages"
Fan Yang, Jun Zhao, Bo Zou, Kang Liu, Feifan Liu
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
{fyang, jzhao, bzou, kliu, ffliu}@

Abstract

In this paper, we present a novel backward transliteration approach that further assists the existing statistical model by mining monolingual web resources. First, we employ a syllable-based search to revise the transliteration candidates produced by the statistical model. By mapping all of them onto existing words, we can filter out or correct some pseudo candidates and improve the overall recall. Second, an AdaBoost model is used to rerank the revised candidates based on information extracted from monolingual web pages. To achieve better precision during reranking, a variety of web-based information is exploited to adjust the ranking score, so that candidates that are less likely to be transliterated names are assigned lower ranks. The experimental results show that the proposed framework significantly outperforms the baseline transliteration system in both precision and recall.

1 Introduction

The task of Named Entity (NE) translation is to translate a named entity from a source language into a target language. It plays an important role in machine translation and cross-language information retrieval (CLIR).
Transliteration is a subtask of NE translation that translates NEs based on phonetic similarity. In NE translation, most person names are transliterated, and some parts of location names or organization names also need to be transliterated. Transliteration has two directions: forward transliteration, which transforms an original name into the target language, and backward transliteration, which recovers a name back to its original expression. For instance, the original English person name Clinton can be forward ...

Contact: Jun ZHAO, jzhao@