Scientific report: "A Best-First Probabilistic Shift-Reduce Parser"

Kenji Sagae and Alon Lavie
Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213
sagae alavie @

Abstract

Recently proposed deterministic classifier-based parsers (Nivre and Scholz, 2004; Sagae and Lavie, 2005; Yamada and Matsumoto, 2003) offer attractive alternatives to generative statistical parsers. Deterministic parsers are fast, efficient, and simple to implement, but generally less accurate than optimal (or nearly optimal) statistical parsers. We present a statistical shift-reduce parser that bridges the gap between deterministic and probabilistic parsers. The parsing model is essentially the same as one previously used for deterministic parsing, but the parser performs a best-first search instead of a greedy search. Using the standard sections of the WSJ corpus of the Penn Treebank for training and testing, our parser has [...] precision and [...] recall (using automatically assigned part-of-speech tags). Perhaps more interestingly, the parsing model is significantly different from the generative models used by other well-known accurate parsers, allowing for a simple combination that produces precision and recall of [...] and [...], respectively.

1 Introduction

Over the past decade, researchers have developed several constituent parsers trained on annotated data that achieve high levels of accuracy.
Some of the more popular and more accurate of these approaches to data-driven parsing (Charniak, 2000; Collins, 1997; Klein and Manning, 2002) have been based on generative models that are closely related to probabilistic context-free grammars. Recently, classifier-based dependency parsing (Nivre and Scholz, 2004; Yamada and Matsumoto, 2003) has shown that deterministic parsers are capable of high levels of accuracy, despite great simplicity. This work has led to the development of deterministic parsers for constituent structures as well (Sagae and Lavie, 2005; Tsuruoka and Tsujii, 2005). However, evaluations on the widely
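
The contrast the abstract draws between greedy and best-first search over shift-reduce actions can be illustrated with a small sketch. This is not the authors' implementation: the probability table `TOY_MODEL`, the two unlabeled actions `shift` and `reduce`, the bracketed-string trees, and the scoring of a state by the product of its action probabilities are all assumptions made for illustration. A real parser would obtain the action probabilities from a trained classifier.

```python
import heapq
import itertools
import math

# Hypothetical toy model (an assumption, not from the paper): action
# probabilities for the few states where the choice matters. Any other
# state gets a uniform distribution over its valid actions.
TOY_MODEL = {
    (("a", "b"), ("c", "d")): {"shift": 0.45, "reduce": 0.55},
    (("(a b)", "c"), ("d",)): {"shift": 0.45, "reduce": 0.55},
    (("a", "b", "c"), ("d",)): {"shift": 0.1, "reduce": 0.9},
    (("a", "(b c)"), ("d",)): {"shift": 0.9, "reduce": 0.1},
}


def valid_actions(state):
    """Shift needs a non-empty buffer; reduce needs two stack items."""
    stack, buffer = state
    actions = []
    if buffer:
        actions.append("shift")
    if len(stack) >= 2:
        actions.append("reduce")
    return actions


def action_probs(state):
    """Look up the toy model, falling back to a uniform distribution."""
    if state in TOY_MODEL:
        return TOY_MODEL[state]
    actions = valid_actions(state)
    return {a: 1.0 / len(actions) for a in actions}


def apply_action(state, action):
    """Return the successor state; states are immutable tuples."""
    stack, buffer = state
    if action == "shift":
        return stack + (buffer[0],), buffer[1:]
    left, right = stack[-2], stack[-1]
    return stack[:-2] + (f"({left} {right})",), buffer


def is_final(state):
    stack, buffer = state
    return len(stack) == 1 and not buffer


def greedy_parse(words):
    """Deterministic parsing: always take the locally most probable action.

    Assumes a non-empty word list.
    """
    state, prob = ((), tuple(words)), 1.0
    while not is_final(state):
        probs = action_probs(state)
        action = max(probs, key=probs.get)
        prob *= probs[action]
        state = apply_action(state, action)
    return state[0][0], prob


def best_first_parse(words):
    """Best-first search: always expand the most probable state so far.

    The heap is keyed by cumulative negative log probability. Because each
    action can only lower a state's probability, the first final state
    popped is the most probable complete derivation under the model.
    """
    tie_breaker = itertools.count()
    heap = [(0.0, next(tie_breaker), ((), tuple(words)))]
    while heap:
        cost, _, state = heapq.heappop(heap)
        if is_final(state):
            return state[0][0], math.exp(-cost)
        for action, p in action_probs(state).items():
            if p > 0.0:
                successor = apply_action(state, action)
                heapq.heappush(
                    heap, (cost - math.log(p), next(tie_breaker), successor)
                )
    return None  # no complete parse (for example, empty input)
```

On the input `["a", "b", "c", "d"]` with this toy table, the greedy parser commits to the locally preferred `reduce` and ends with `(((a b) c) d)` at a probability of about 0.30, while the best-first parser returns `(a ((b c) d))` at about 0.36. Both use the same model; only the search strategy differs.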
