Scientific report: "A simple pattern-matching algorithm for recovering empty nodes and their antecedents∗"

Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 136-143.

Mark Johnson
Brown Laboratory for Linguistic Information Processing
Brown University
Mark_Johnson@

Abstract

This paper describes a simple pattern-matching algorithm for recovering empty nodes and identifying their co-indexed antecedents in phrase structure trees that do not contain this information. The patterns are minimal connected tree fragments containing an empty node and all other nodes co-indexed with it. This paper also proposes an evaluation procedure for empty node recovery procedures which is independent of most of the details of phrase structure, which makes it possible to compare the performance of empty node recovery on parser output with the empty node annotations in a gold-standard corpus. Evaluating the algorithm on the output of Charniak's parser (Charniak, 2000) and the Penn treebank (Marcus et al., 1993) shows that the pattern-matching algorithm does surprisingly well on the most frequently occurring types of empty nodes, given its simplicity.
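To make the idea of a structure-independent evaluation concrete, the following is a minimal sketch, not the paper's own procedure. It assumes that each empty node can be reduced to a (category, string position) pair, so that the nodes proposed for parser output can be compared with gold-standard annotations even when the surrounding trees differ. The function name `empty_node_prf` and the sample data are hypothetical.

```python
from collections import Counter


def empty_node_prf(gold, predicted):
    """Return (precision, recall, f-score) over multisets of empty-node descriptions.

    Assumption: each description is a hashable tuple such as
    (category, string_position); the paper's exact representation may differ.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: a description is credited at most as many
    # times as it occurs in the gold standard.
    matched = sum((gold_counts & pred_counts).values())
    n_pred = sum(pred_counts.values())
    n_gold = sum(gold_counts.values())
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_score


# Hypothetical data for one sentence: (category, string position) per empty node.
gold = [("NP", 3), ("WHNP", 1), ("NP", 7)]
predicted = [("NP", 3), ("NP", 5)]
p, r, f = empty_node_prf(gold, predicted)
```

Because the comparison uses only categories and string positions, it does not require the predicted and gold trees to share the same bracketing. Antecedents could be scored in the same way by extending each tuple with a description of the antecedent, again as an assumption.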
∗ I would like to thank my colleagues in the Brown Laboratory for Linguistic Information Processing (BLLIP) as well as Michael Collins for their advice. This research was supported by NSF awards DMS 0074276 and ITR IIS 0085940.

1 Introduction

One of the main motivations for research on parsing is that syntactic structure provides important information for semantic interpretation; hence syntactic parsing is an important first step in a variety of useful tasks. Broad coverage syntactic parsers with good performance have recently become available (Charniak, 2000; Collins, 2000), but these typically produce as output a parse tree that only encodes local syntactic information, i.e., a tree that does not include any empty nodes. (Collins (1997) discusses the recovery of one kind of empty node, viz., WH-traces.) This paper