Scientific report: "Syntax-Based Word Ordering Incorporating a Large-Scale Language Model"

Yue Zhang, University of Cambridge Computer Laboratory, yz360@
Graeme Blackwood, University of Cambridge Engineering Department, gwb24@
Stephen Clark, University of Cambridge Computer Laboratory, sc609@

Abstract

A fundamental problem in text generation is word ordering. Word ordering is a computationally difficult problem, which can be constrained to some extent for particular applications, for example by using synchronous grammars for statistical machine translation. There have been some recent attempts at the unconstrained problem of generating a sentence from a multi-set of input words (Wan et al., 2009; Zhang and Clark, 2011). By using CCG and learning-guided search, Zhang and Clark reported the highest scores on this task. One limitation of their system is the absence of an N-gram language model, which has been used by text generation systems to improve fluency. We take the Zhang and Clark system as the baseline and incorporate an N-gram model by applying online large-margin training. Our system significantly improved on the baseline by BLEU points.

1 Introduction

One fundamental problem in text generation is word ordering, which can be abstractly formulated as finding a grammatical order for a multiset of words.
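The formulation above can be made concrete with a brute-force baseline: enumerate every distinct ordering of the input multiset and keep the one that scores best. The sketch below is only an illustration, not the system described in this paper, which relies on CCG and learning-guided search. It assumes a toy bigram table with made-up log-probabilities, and the names `BIGRAM_LOGPROB`, `UNSEEN_LOGPROB`, `bigram_score` and `best_order` are hypothetical.

```python
from itertools import permutations

# Hypothetical toy bigram log-probabilities (illustrative values only).
BIGRAM_LOGPROB = {
    ("<s>", "the"): -0.5,
    ("the", "dog"): -1.0,
    ("dog", "chased"): -1.2,
    ("chased", "the"): -0.8,
    ("the", "cat"): -1.0,
    ("cat", "</s>"): -0.7,
    ("dog", "</s>"): -0.9,
    ("cat", "chased"): -1.5,
}
# Flat penalty for any bigram missing from the table (an assumption,
# standing in for proper smoothing).
UNSEEN_LOGPROB = -10.0


def bigram_score(words):
    """Sum bigram log-probabilities over a sentence padded with boundary markers."""
    padded = ["<s>", *words, "</s>"]
    return sum(
        BIGRAM_LOGPROB.get(pair, UNSEEN_LOGPROB)
        for pair in zip(padded, padded[1:])
    )


def best_order(bag):
    """Exhaustively search all distinct orderings of a multiset of words.

    Returns the best ordering and the number of orderings examined.
    """
    # set() collapses orderings that differ only by swapping identical words;
    # sorting makes tie-breaking deterministic.
    candidates = sorted(set(permutations(bag)))
    return max(candidates, key=bigram_score), len(candidates)


if __name__ == "__main__":
    order, examined = best_order(["cat", "the", "chased", "dog", "the"])
    print(" ".join(order), examined)
```

Even this five-word input, with one repeated word, yields 60 distinct orderings. The count grows factorially with input length, so exhaustive enumeration is only feasible for very short inputs.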
The word ordering problem can also include word choice, where only a subset of the input words are used to produce the output. Word ordering is a difficult problem. Finding the best permutation for a set of words according to a bigram language model, for example, is NP-hard, which can be proved by linear reduction from the traveling salesman problem. In practice, exploring the whole search space of permutations is often prevented by adding constraints. In phrase-based machine translation (Koehn et al., 2003; Koehn et al., 2007), a distortion limit is used to constrain the position of output phrases. In syntax-based machine translation systems such as Wu (1997) and Chiang (2007)
