Trace Prediction and Recovery With Unlexicalized PCFGs and Slash Features

Helmut Schmid
IMS, University of Stuttgart
schmid@ims.uni-stuttgart.de

Abstract

This paper describes a parser which generates parse trees with empty elements in which traces and fillers are co-indexed. The parser is an unlexicalized PCFG parser which is guaranteed to return the most probable parse. The grammar is extracted from a version of the PENN treebank which was automatically annotated with features in the style of Klein and Manning (2003). The annotation includes GPSG-style slash features, which link traces and fillers, and other features which improve the general parsing accuracy. In an evaluation on the PENN treebank (Marcus et al., 1993), the parser outperformed other unlexicalized PCFG parsers in terms of labeled bracketing f-score. Its results for the empty category prediction task and the trace-filler coindexation task exceed all previously reported results, with 84.1 and 77.4 f-score, respectively.

1 Introduction

Empty categories (also called null elements) are used in the annotation of the PENN treebank (Marcus et al., 1993) in order to represent syntactic phenomena like constituent movement (e.g. wh-extraction), discontinuous constituents, and missing elements (PRO elements, empty complementizers and relative pronouns). Moved constituents are co-indexed with a trace which is located at the position where the moved constituent is to be interpreted.
Figure 1 shows an example of constituent movement in a relative clause.

[Figure 1: Co-indexation of traces and fillers. Parse tree of the NP "things of which they are unaware", in which the filler WHPP-1 ("of which") is co-indexed with the trace *T*-1 (a -NONE- element) after "unaware".]

Empty categories provide important information for the semantic interpretation, in particular for determining the predicate-argument structure of a sentence. However, most broad-coverage statistical parsers (Collins, 1997; Charniak, 2000; and others) which are trained on the PENN treebank generate parse trees without empty categories.
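
To make the notion of co-indexation concrete, the following is a minimal sketch, not the parser described in this paper. It reads a Penn-style bracketed tree and pairs each indexed filler with the trace that carries the same index. The bracketing in `EXAMPLE` is an assumed approximation of the tree in Figure 1, and the function names `parse_brackets` and `link_traces` are hypothetical, introduced only for illustration.

```python
import re

# Assumed bracketing approximating Figure 1 ("things of which they are unaware").
EXAMPLE = """
(NP (NP (NNS things))
    (SBAR (WHPP-1 (IN of) (WHNP (WDT which)))
          (S (NP-SBJ (PRP they))
             (VP (VBP are)
                 (ADJP-PRD (JJ unaware) (PP (-NONE- *T*-1)))))))
"""


def parse_brackets(text):
    """Parse a Penn-style bracketed string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])  # a word (leaf string)
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return read_node()


def link_traces(tree):
    """Map each index to (filler label, trace token) when both occur in the tree."""
    fillers, traces = {}, {}

    def walk(node):
        label, children = node
        if label == "-NONE-":
            # Empty element: only *T*-n traces are linked in this sketch.
            match = re.fullmatch(r"\*T\*-(\d+)", children[0])
            if match:
                traces[match.group(1)] = children[0]
            return
        match = re.search(r"-(\d+)$", label)
        if match:
            fillers[match.group(1)] = label  # category carrying a co-index
        for child in children:
            if isinstance(child, tuple):
                walk(child)

    walk(tree)
    return {i: (fillers[i], traces[i]) for i in fillers if i in traces}


links = link_traces(parse_brackets(EXAMPLE))
print(links)
```

A parser that outputs trees without empty categories gives such a lookup nothing to find, which is why the predicate-argument structure cannot be read off its output directly.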