tailieunhanh - Báo cáo khoa học: "Unary Constraints for Efficient Context-Free Parsing"

We present a novel pruning method for context-free parsing that increases efficiency by disallowing phrase-level unary productions in CKY chart cells spanning a single word. Our work is orthogonal to recent work on “closing” chart cells, which has focused on multi-word constituents, leaving span-1 chart cells unpruned. We show that a simple discriminative classifier can learn with high accuracy which span-1 chart cells to close to phrase-level unary productions. | Unary Constraints for Efficient Context-Free Parsing Nathan Bodenstab Kristy Hollingshead and Brian Roark Center for Spoken Language Understanding Oregon Health Science University Portland OR University of Maryland Institute for Advanced Computer Studies College Park MD bodensta roark @ hollingk@ Abstract We present a novel pruning method for context-free parsing that increases efficiency by disallowing phrase-level unary productions in CKY chart cells spanning a single word. Our work is orthogonal to recent work on closing chart cells which has focused on multi-word constituents leaving span-1 chart cells unpruned. We show that a simple discriminative classifier can learn with high accuracy which span-1 chart cells to close to phrase-level unary productions. Eliminating these unary productions from the search can have a large impact on downstream processing depending on implementation details of the search. We apply our method to four parsing architectures and demonstrate how it is complementary to the cell-closing paradigm as well as other pruning methods such as coarse-to-fine agenda and beam-search pruning. 1 Introduction While there have been great advances in the statistical modeling of hierarchical syntactic structure in the past 15 years exact inference with such models remains very costly and most rich syntactic modeling approaches resort to heavy pruning pipelining or both. Graph-based pruning methods such as best-first and beam-search have both be used within context-free parsers to increase their efficiency. Pipeline systems make use of simpler models to reduce the search space of the full model. For example the well-known Charniak parser Char-niak 2000 uses a simple grammar to prune the search space for a richer model in a second pass. 676 Roark and Hollingshead 2008 2009 have recently shown that using a finite-state tagger to close cells within the CKY chart can reduce the worst-case and average-case complexity of .

TỪ KHÓA LIÊN QUAN