Scientific report: "Combining Multiple Resources to Improve SMT-based Paraphrasing Model∗"
Shiqi Zhao¹, Cheng Niu², Ming Zhou², Ting Liu¹, Sheng Li¹
¹Harbin Institute of Technology, Harbin, China ({zhaosq, tliu, lisheng}@)
²Microsoft Research Asia, Beijing, China ({chengniu, mingzhou}@)

Abstract
This paper proposes a novel method that exploits multiple resources to improve statistical machine translation (SMT) based paraphrasing. In detail, a phrasal paraphrase table and a feature function are derived from each resource, and these are then combined in a log-linear SMT model for sentence-level paraphrase generation. Experimental results show that the SMT-based paraphrasing model can be enhanced using multiple resources. The phrase-level and sentence-level precision of the generated paraphrases is above 60% and 55%, respectively. In addition, the contribution of each resource is evaluated, which indicates that all the exploited resources are useful for generating paraphrases of high quality.

1 Introduction
Paraphrases are alternative ways of conveying the same meaning. Paraphrases are important in many natural language processing (NLP) applications, such as machine translation (MT), question answering (QA), information extraction (IE), multidocument summarization (MDS), and natural language generation (NLG).
This paper addresses the problem of sentence-level paraphrase generation, which aims at generating paraphrases for input sentences. An example of sentence-level paraphrases can be seen below:

S1: The table was set up in the carriage shed.
S2: The table was laid under the cart-shed.

∗ This research was finished while the first author worked as an intern in Microsoft Research Asia.

Paraphrase generation can be viewed as monolingual machine translation (Quirk et al., 2004), which typically includes a translation model and a language model. The translation model can be trained using monolingual parallel corpora. However, acquiring such corpora is not easy. Hence, data sparseness is a key problem for the SMT-based .
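
The log-linear combination described in the abstract, with one phrasal paraphrase table and one feature function per resource alongside a language model, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the resource names, table entries, weights, language-model probabilities, and the floor value for unseen phrase pairs are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy paraphrase tables, one per resource:
# (source phrase, paraphrase) -> paraphrase probability.
PARAPHRASE_TABLES = {
    "resource_a": {("set up", "laid"): 0.4},
    "resource_b": {("set up", "laid"): 0.2, ("set up", "arranged"): 0.5},
}

# Hypothetical feature weights (in practice these would be tuned).
WEIGHTS = {"resource_a": 1.0, "resource_b": 0.5, "lm": 1.0}

# Assumed floor probability for phrase pairs a resource does not contain.
FLOOR = 1e-6


def log_linear_score(phrase_pairs, lm_prob, tables=PARAPHRASE_TABLES, weights=WEIGHTS):
    """Return the weighted sum of log feature values for one candidate."""
    # Language-model feature.
    score = weights["lm"] * math.log(lm_prob)
    for name, table in tables.items():
        # One feature per resource: log-probability of the phrase pairs used.
        feature = sum(math.log(table.get(pair, FLOOR)) for pair in phrase_pairs)
        score += weights[name] * feature
    return score


def best_candidate(candidates):
    """Pick the candidate paraphrase with the highest log-linear score."""
    return max(candidates, key=lambda c: log_linear_score(c["pairs"], c["lm_prob"]))


# Usage: two candidate rewrites of "The table was set up ...".
candidates = [
    {"text": "The table was laid ...", "pairs": [("set up", "laid")], "lm_prob": 0.01},
    {"text": "The table was arranged ...", "pairs": [("set up", "arranged")], "lm_prob": 0.01},
]
print(best_candidate(candidates)["text"])
```

In this sketch, a phrase pair supported by several resources accumulates evidence from each feature, which is one way a multi-resource model can ease the data sparseness of any single table.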