Scientific report: "Efficient Parsing with Linear Context-Free Rewriting Systems"

Andreas van Cranenburgh
Huygens ING, Royal Netherlands Academy of Arts and Sciences, Postbus 90754, 2509 LT The Hague, the Netherlands; ILLC, University of Amsterdam

Abstract

Previous work on treebank parsing with discontinuous constituents using Linear Context-Free Rewriting Systems (LCFRS) has been limited to sentences of up to 30 words, for reasons of computational complexity. There have been some results on binarizing an LCFRS in a manner that minimizes parsing complexity, but the present work shows that parsing long sentences with such an optimally binarized grammar remains infeasible. Instead, we introduce a technique which removes this length restriction, while maintaining a respectable accuracy. The resulting parser has been applied to a discontinuous treebank with favorable results.

1 Introduction

Discontinuity in constituent structures (cf. figures 1 and 2) is important for a variety of reasons. For one, it allows a tight correspondence between syntax and semantics by letting constituent structure express argument structure (Skut et al., 1997). Other reasons are phenomena such as extraposition and word-order freedom, which arguably require discontinuous annotations to be treated systematically in phrase structures (McCawley, 1982; Levy, 2005).
Empirical investigations demonstrate that discontinuity is present in non-negligible amounts: around 30% of sentences contain discontinuity in two German treebanks (Maier and Søgaard, 2008; Maier and Lichte, 2009). Recent work on treebank parsing with discontinuous constituents (Kallmeyer and Maier, 2010; Maier, 2010; Evang and Kallmeyer, 2011; van Cranenburgh et al., 2011) shows that it is feasible to directly parse discontinuous constituency annotations, as given in the German Negra (Skut et al.

[Figure 1: A tree with WH-movement from the Penn treebank, in which traces have been converted to discontinuity. Sentence: "What should I do"; node labels: SBARQ, SQ, VP, WHNP, MD, NP, VB.]
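
To make the notion of a discontinuous constituent concrete, the following minimal Python sketch records, for each node label in Figure 1, the word positions it covers, and reports which nodes cover more than one contiguous block of words. The tree shape used here (VP dominating WHNP "What" and VB "do") is an assumption reconstructed from the figure's labels and caption, and the helper name `blocks` is hypothetical; this is an illustration of the concept, not the parser described in the paper.

```python
def blocks(positions):
    """Group word positions into maximal runs of consecutive indices."""
    runs = []
    for p in sorted(positions):
        if runs and p == runs[-1][-1] + 1:
            runs[-1].append(p)   # extends the current contiguous run
        else:
            runs.append([p])     # a gap: start a new run
    return runs


words = ["What", "should", "I", "do"]

# Word positions covered by each node (assumed reading of Figure 1).
yields = {
    "SBARQ": {0, 1, 2, 3},
    "SQ": {0, 1, 2, 3},
    "VP": {0, 3},      # "What ... do": two blocks separated by "should I"
    "WHNP": {0},
    "MD": {1},
    "NP": {2},
    "VB": {3},
}

# A constituent is discontinuous when its yield has more than one block.
discontinuous = [label for label, pos in yields.items() if len(blocks(pos)) > 1]

for label in discontinuous:
    spans = [" ".join(words[i] for i in run) for run in blocks(yields[label])]
    print(label, "covers separate blocks:", spans)
```

Under this reading, only the VP is discontinuous; a context-free grammar cannot derive such a node directly, which is why a formalism like LCFRS, whose non-terminals may cover several blocks, is used.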
