tailieunhanh - Scientific report: "Beam-Width Prediction for Efficient Context-Free Parsing"

Beam-Width Prediction for Efficient Context-Free Parsing

Nathan Bodenstab, Aaron Dunlop, Keith Hall, and Brian Roark
Center for Spoken Language Understanding, Oregon Health & Science University, Portland, OR
Google, Inc., Zurich, Switzerland
bodensta dunlopa roark @ kbhall@

Abstract

Efficient decoding for syntactic parsing has become a necessary research area as statistical grammars grow in accuracy and size and as more NLP applications leverage syntactic analyses. We review prior methods for pruning and then present a new framework that unifies their strengths into a single approach. Using a log-linear model, we learn the optimal beam-search pruning parameters for each CYK chart cell, effectively predicting the most promising areas of the model space to explore. We demonstrate that our method is faster than coarse-to-fine pruning, exemplified in both the Charniak and Berkeley parsers, by empirically comparing our parser to the Berkeley parser using the same grammar and under identical operating conditions.

1 Introduction

Statistical constituent parsers have gradually increased in accuracy over the past ten years. This accuracy increase has opened the door to automatically derived syntactic information within a number of NLP tasks. Prior work incorporating parse structure into machine translation (Chiang, 2010) and Semantic Role Labeling (Tsai et al., 2005; Punyakanok et al., 2008) indicates that such hierarchical structure can have great benefit over shallow labeling techniques like chunking and part-of-speech tagging.

Although syntax is becoming increasingly important for large-scale NLP applications, constituent parsing is slow, too slow to scale to the size of many potential consumer applications. The exhaustive CYK algorithm has computational complexity $O(n^3 |G|)$, where $n$ is the length of the sentence and $|G|$ is the number of grammar productions, a non-negligible constant. Increases in accuracy have primarily been accomplished through an increase in the size of the
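
To make the cubic cost of exhaustive CYK and the idea of a per-cell beam concrete, here is a minimal Python sketch. It is not the authors' implementation: the function name `cyk_beam`, the toy grammar, and the `beam_width` callable are all hypothetical. In particular, `beam_width` is only a stand-in for the learned log-linear predictor mentioned in the abstract, and the grammar is assumed to be a binarized PCFG with log-probability scores.

```python
import math

def cyk_beam(words, lexicon, rules, beam_width):
    """Viterbi CYK over a binarized PCFG with a per-cell beam.

    lexicon:    dict word -> list of (nonterminal, log_prob)
    rules:      list of (parent, left_child, right_child, log_prob)
    beam_width: callable (start, end) -> int, the maximum number of entries
                kept in that chart cell (hypothetical stand-in for a
                learned predictor); 0 closes the cell entirely.
    Returns a dict mapping (start, end) -> {nonterminal: best log_prob}.
    """
    n = len(words)
    chart = {}

    # Span-1 cells come straight from the lexicon (left unpruned here).
    for i, word in enumerate(words):
        cell = {}
        for nt, lp in lexicon.get(word, []):
            if lp > cell.get(nt, -math.inf):
                cell[nt] = lp
        chart[(i, i + 1)] = cell

    # Loops over span, start, and split point give the n^3 factor;
    # the loop over rules gives the |G| factor.
    for span in range(2, n + 1):
        for start in range(0, n - span + 1):
            end = start + span
            k = beam_width(start, end)
            if k <= 0:
                continue  # closed cell: no work is done for this span
            cell = {}
            for mid in range(start + 1, end):
                left_cell = chart.get((start, mid), {})
                right_cell = chart.get((mid, end), {})
                if not left_cell or not right_cell:
                    continue
                for parent, left, right, lp in rules:
                    if left in left_cell and right in right_cell:
                        score = lp + left_cell[left] + right_cell[right]
                        if score > cell.get(parent, -math.inf):
                            cell[parent] = score
            # Beam pruning: keep only the k highest-scoring entries.
            kept = sorted(cell.items(), key=lambda kv: kv[1], reverse=True)[:k]
            chart[(start, end)] = dict(kept)
    return chart

# Toy grammar, invented purely for illustration.
words = ["the", "dog", "barks"]
lexicon = {"the": [("DT", 0.0)], "dog": [("NN", 0.0)], "barks": [("VP", 0.0)]}
rules = [
    ("NP", "DT", "NN", math.log(0.9)),
    ("X", "DT", "NN", math.log(0.1)),
    ("S", "NP", "VP", 0.0),
]

narrow = cyk_beam(words, lexicon, rules, lambda s, e: 1)    # beam of 1 everywhere
wide = cyk_beam(words, lexicon, rules, lambda s, e: 10)     # effectively exhaustive
```

With a beam of 1, the cell spanning "the dog" keeps only its best entry, while the wide beam keeps both candidates. Passing a function rather than a single constant is the point of the sketch: each cell can receive its own width, including 0, which skips the cell altogether.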
