Scientific report: "Generative Models for Statistical Parsing with Combinatory Categorial Grammar"

Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 335-342.

Julia Hockenmaier and Mark Steedman
Division of Informatics, University of Edinburgh, Edinburgh EH8 9LW, United Kingdom
julia steedman @

Abstract

This paper compares a number of generative probability models for a wide-coverage Combinatory Categorial Grammar (CCG) parser. These models are trained and tested on a corpus obtained by translating the Penn Treebank trees into CCG normal-form derivations. According to an evaluation of unlabeled word-word dependencies, our best model achieves a performance of , comparable to the figures given by Collins (1999) for a linguistically less expressive grammar. In contrast to Gildea (2001), we find a significant improvement from modeling word-word dependencies.

1 Introduction

The currently best single-model statistical parser (Charniak, 1999) achieves Parseval scores of over 89 on the Penn Treebank. However, the grammar underlying the Penn Treebank is very permissive, and a parser can do well on the standard Parseval measures without committing itself on certain semantically significant decisions, such as predicting null elements arising from deletion or movement.
The potential benefit of wide-coverage parsing with CCG lies in its more constrained grammar and its simple and semantically transparent capture of extraction and coordination. We present a number of models over syntactic derivations of Combinatory Categorial Grammar (CCG; see Steedman (2000) and Clark et al. (2002), this conference, for an introduction), estimated from and tested on a translation of the Penn Treebank to a corpus of CCG normal-form derivations. CCG grammars are characterized by much larger category sets than standard Penn Treebank grammars, distinguishing, for example, between many classes of verbs with different subcategorization frames. As a result, the categorial
