tailieunhanh - Scientific report: "Insertion Operator for Bayesian Tree Substitution Grammars"

Insertion Operator for Bayesian Tree Substitution Grammars

Hiroyuki Shindo, Akinori Fujino, and Masaaki Nagata
NTT Communication Science Laboratories, NTT Corp.
2-4 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0237, Japan

Abstract

We propose a model that incorporates an insertion operator in Bayesian tree substitution grammars (BTSG). Tree insertion is helpful for modeling syntax patterns accurately with fewer grammar rules than BTSG. The experimental parsing results show that our model outperforms a standard PCFG and BTSG for a small dataset. For a large dataset, our model obtains results comparable to BTSG while using far fewer grammar rules than BTSG.

1 Introduction

Tree substitution grammar (TSG) is a promising formalism for modeling language data. TSG generalizes context-free grammars (CFG) by allowing nonterminal nodes to be replaced with subtrees of arbitrary size. A natural extension of TSG involves adding an insertion operator for combining subtrees, as in tree adjoining grammars (TAG) (Joshi, 1985) or tree insertion grammars (TIG) (Schabes and Waters, 1995). An insertion operator is helpful for expressing various syntax patterns with fewer grammar rules; thus, we expect that adding an insertion operator will improve parsing accuracy and yield a compact grammar. One of the challenges of adding an insertion operator is that the computational cost of grammar induction is high, since tree insertion significantly increases the number of possible subtrees.
Previous work on TAG and TIG induction (Xia, 1999; Chiang, 2003; Chen et al., 2006) has addressed the problem using language-specific heuristics and a maximum likelihood estimator, which leads to overfitting the training data (Post and Gildea, 2009). Instead, we incorporate an insertion operator in a Bayesian TSG (BTSG) model (Cohn et al., 2011) that learns grammar rules automatically without heuristics. Our model uses a restricted variant of subtrees for .
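
To make the two operators named in the introduction concrete, the following is a minimal Python sketch of substitution (as in TSG) and of a generic TAG/TIG-style insertion at an internal node. It is an illustration only, not the authors' model: the paper uses a restricted form of insertion that this excerpt does not specify, and the names Node, substitute, insert, plug_foot, and words, as well as the toy trees, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    label: str
    children: Tuple["Node", ...] = ()
    site: bool = False  # frontier nonterminal waiting for substitution
    foot: bool = False  # foot node of an auxiliary tree (assumed convention)


def substitute(tree: Node, sub: Node) -> Optional[Node]:
    """Replace the leftmost substitution site labeled like sub's root."""
    if tree.site:
        return sub if tree.label == sub.label else None
    for i, child in enumerate(tree.children):
        new_child = substitute(child, sub)
        if new_child is not None:
            kids = tree.children[:i] + (new_child,) + tree.children[i + 1:]
            return Node(tree.label, kids)
    return None  # no matching site in this subtree


def plug_foot(aux: Node, subtree: Node) -> Node:
    """Copy the auxiliary tree, hanging subtree where the foot node was."""
    if aux.foot:
        return subtree
    kids = tuple(plug_foot(c, subtree) for c in aux.children)
    return Node(aux.label, kids, aux.site)


def insert(tree: Node, aux: Node) -> Optional[Node]:
    """Insert aux at the first internal node (preorder) with aux's root label."""
    if tree.children and tree.label == aux.label:
        return plug_foot(aux, tree)
    for i, child in enumerate(tree.children):
        new_child = insert(child, aux)
        if new_child is not None:
            kids = tree.children[:i] + (new_child,) + tree.children[i + 1:]
            return Node(tree.label, kids)
    return None  # no internal node with a matching label


def words(tree: Node) -> list:
    """Read the leaf labels from left to right."""
    if not tree.children:
        return [tree.label]
    return [w for c in tree.children for w in words(c)]


# Toy elementary trees (hypothetical, for illustration only).
initial = Node("S", (Node("NP", site=True),
                     Node("VP", (Node("V", (Node("bark"),)),))))
np_tree = Node("NP", (Node("N", (Node("dogs"),)),))
aux_tree = Node("VP", (Node("ADV", (Node("often"),)),
                       Node("VP", foot=True)))

derived = substitute(initial, np_tree)
inserted = insert(derived, aux_tree)
print(" ".join(words(derived)))   # dogs bark
print(" ".join(words(inserted)))  # dogs often bark
```

In this sketch a single auxiliary tree for "often" can modify any VP, whereas a substitution-only grammar would need a separate elementary tree for each VP pattern with and without the adverb. This is the sense in which insertion can reduce the number of rules, at the price of a larger space of candidate subtrees during induction.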
