Joint Learning of a Dual SMT System for Paraphrase Generation

Hong Sun
School of Computer Science and Technology, Tianjin University
kaspersky@tju.edu.cn

Ming Zhou
Microsoft Research Asia
mingzhou@microsoft.com

1 This work was done while the author was visiting Microsoft Research Asia.

Abstract

SMT has been used in paraphrase generation by translating a source sentence into another (pivot) language and then back into the source language. The resulting sentences can be used as candidate paraphrases of the source sentence. Existing work that uses two independently trained SMT systems cannot directly optimize the paraphrase results. Paraphrase criteria, especially the paraphrase rate, cannot be ensured in that way. In this paper, we propose a joint learning method of two SMT systems to optimize the process of paraphrase generation. In addition, a revised BLEU score called iBLEU, which measures the adequacy and diversity of the generated paraphrase sentence, is proposed for tuning parameters in SMT systems. Our experiments on NIST 2008 testing data, with automatic evaluation as well as human judgments, suggest that the proposed method is able to enhance the paraphrase quality by adjusting between semantic equivalency and surface dissimilarity.

1 Introduction

Paraphrasing (at word, phrase, and sentence levels) is a procedure for generating alternative expressions with an identical or similar meaning to the original text. Paraphrasing technology has been applied in many NLP applications, such as machine translation (MT), question answering (QA), and natural language generation (NLG).
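
The trade-off between adequacy and diversity mentioned in the abstract can be illustrated with a small sketch. This excerpt does not give the definition of iBLEU, so the code below is not the authors' metric: it assumes a simple linear trade-off, with a hypothetical weight `alpha`, between similarity to a reference paraphrase (adequacy) and similarity to the source sentence (lack of diversity). The helper `overlap` is a crude clipped n-gram precision used here as a stand-in for BLEU, and all function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap(candidate, reference, max_n=2):
    """Mean clipped n-gram precision of candidate against reference.

    A crude stand-in for BLEU (assumption): no brevity penalty,
    arithmetic rather than geometric mean, a single reference.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        total = sum(cand_counts.values())
        if total == 0:
            continue  # candidate is too short to have n-grams of this order
        ref_counts = ngrams(ref, n)
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(matched / total)
    return sum(precisions) / len(precisions) if precisions else 0.0


def tradeoff_score(candidate, source, reference, alpha=0.8):
    """Reward closeness to the reference, penalize closeness to the source.

    The linear form and the default alpha are assumptions for illustration.
    """
    adequacy = overlap(candidate, reference)
    self_similarity = overlap(candidate, source)
    return alpha * adequacy - (1 - alpha) * self_similarity


source = "the cat sat on the mat"
reference = "a cat was sitting on the mat"
candidates = [source, "a cat was sitting on the mat"]
best = max(candidates, key=lambda c: tradeoff_score(c, source, reference))
```

Under these assumptions, a candidate that merely copies the source is penalized by the second term, so the reworded candidate is selected as `best`. Raising `alpha` toward 1 weights adequacy more heavily, while lowering it favors surface dissimilarity.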

As paraphrasing can be viewed as a translation process between the original expression (as input) and the paraphrase results (as output), both in the same language, statistical machine translation (SMT) has been used for this task. Quirk et al. (2004) build a monolingual translation system using a corpus of sentence pairs extracted from news articles describing the same events. Zhao et al. (2008a) enrich this approach by .