Building Deep Dependency Structures with a Wide-Coverage CCG Parser

Stephen Clark, Julia Hockenmaier and Mark Steedman
Division of Informatics, University of Edinburgh
Edinburgh EH8 9LW, UK
{stephenc, julia, steedman}@cogsci.ed.ac.uk

Abstract

This paper describes a wide-coverage statistical parser that uses Combinatory Categorial Grammar (CCG) to derive dependency structures. The parser differs from most existing wide-coverage treebank parsers in capturing the long-range dependencies inherent in constructions such as coordination, extraction, raising and control, as well as the standard local predicate-argument dependencies. A set of dependency structures used for training and testing the parser is obtained from a treebank of CCG normal-form derivations, which have been derived (semi-)automatically from the Penn Treebank. The parser correctly recovers over 80% of labelled dependencies, and around 90% of unlabelled dependencies.

1 Introduction

Most recent wide-coverage statistical parsers have used models based on lexical dependencies (e.g. Collins, 1999; Charniak, 2000). However, the dependencies are typically derived from a context-free phrase structure tree using simple head percolation heuristics. This approach does not work well for the long-range dependencies involved in raising, control, extraction and coordination, all of which are common in text such as the Wall Street Journal.
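To make the head percolation idea concrete, the following is a minimal sketch of how word-word dependencies might be read off a phrase structure tree. The rule table, the tree encoding and all function names are assumptions made for illustration; they are not the head tables of Collins (1999) or Charniak (2000), nor any part of the parser described in this paper.

```python
# Toy head-percolation sketch. HEAD_RULES is an illustrative assumption,
# not the head table of any published parser.

# For each parent label: a search direction and a priority list of child labels.
HEAD_RULES = {
    "S": ("right", ["VP", "S"]),
    "VP": ("left", ["VBD", "VBZ", "VB", "VP"]),
    "NP": ("right", ["NN", "NNS", "NNP", "NP"]),
}


def is_leaf(node):
    # A leaf is (tag, word); an internal node is (label, [children]).
    return isinstance(node[1], str)


def head_index(label, children):
    """Pick the position of the head child using the toy rule table."""
    direction, priorities = HEAD_RULES.get(label, ("left", []))
    order = list(range(len(children)))
    if direction == "right":
        order.reverse()
    for wanted in priorities:
        for i in order:
            if children[i][0] == wanted:
                return i
    return order[0]  # fallback: first child in the search direction


def extract(node, deps):
    """Return the head word of node, appending (head, dependent) pairs to deps."""
    if is_leaf(node):
        return node[1]
    label, children = node
    heads = [extract(child, deps) for child in children]
    h = head_index(label, children)
    for i, word in enumerate(heads):
        if i != h:
            deps.append((heads[h], word))
    return heads[h]


def dependencies(tree):
    deps = []
    extract(tree, deps)
    return deps


# "Kim bought shares": only local dependencies are involved.
SIMPLE_TREE = ("S", [
    ("NP", [("NNP", "Kim")]),
    ("VP", [("VBD", "bought"), ("NP", [("NNS", "shares")])]),
])

# "Kim bought and sold shares": the object is shared by both verbs.
COORD_TREE = ("S", [
    ("NP", [("NNP", "Kim")]),
    ("VP", [
        ("VBD", "bought"),
        ("CC", "and"),
        ("VBD", "sold"),
        ("NP", [("NNS", "shares")]),
    ]),
])

print(dependencies(SIMPLE_TREE))  # [('bought', 'shares'), ('bought', 'Kim')]
print(dependencies(COORD_TREE))
```

On the coordinated example, this toy procedure attaches "shares" and "Kim" to "bought" only, so the dependencies between "sold" and its arguments are never produced. Under the stated assumptions, that is one simple instance of the kind of dependency that percolating a single head per constituent does not recover.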
Chiang (2000) uses Tree Adjoining Grammar as an alternative to context-free grammar, and here we use another "mildly context-sensitive" formalism, Combinatory Categorial Grammar (CCG; Steedman, 2000), which arguably provides the most linguistically satisfactory account of the dependencies inherent in coordinate constructions and extraction phenomena. The potential advantage from using such an expressive grammar is to facilitate recovery of such unbounded dependencies. As well as having a potential impact on the accuracy of the parser, recovering such dependencies may make the output more useful. CCG is unlike other .