A New Statistical Parser Based on Bigram Lexical Dependencies

Michael John Collins
Dept. of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104

Abstract

This paper describes a new statistical parser which is based on probabilities of dependencies between head-words in the parse tree. Standard bigram probability estimation techniques are extended to calculate probabilities of dependencies between pairs of words. Tests using Wall Street Journal data show that the method performs at least as well as SPATTER (Magerman 95; Jelinek et al. 94), which has the best published results for a statistical parser on this task. The simplicity of the approach means the model trains on 40,000 sentences in under 15 minutes. With a beam search strategy, parsing speed can be improved to over 200 sentences a minute with negligible loss in accuracy.

1 Introduction

Lexical information has been shown to be crucial for many parsing decisions, such as prepositional-phrase attachment (for example, Hindle and Rooth 93). However, early approaches to probabilistic parsing (Pereira and Schabes 92; Magerman and Marcus 91; Briscoe and Carroll 93) conditioned probabilities on non-terminal labels and part-of-speech tags alone. The SPATTER parser (Magerman 95; Jelinek et al. 94) does use lexical information, and recovers labeled constituents in Wall Street Journal text with above 84% accuracy, as far as we know the best published results on this task.
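The idea of extending standard bigram estimation to word-pair dependencies can be sketched as relative-frequency estimation over (head-word, modifier-word) co-occurrence counts. The following is a minimal illustrative sketch, not the paper's exact model: the `dependency_prob` function and the toy word pairs are assumptions for demonstration, whereas the paper trains on dependencies extracted from Wall Street Journal parse trees.

```python
from collections import Counter

# Toy dependency-annotated data: (head_word, modifier_word) pairs
# as would be extracted from parse trees. These pairs are invented
# purely for illustration.
dependencies = [
    ("bought", "IBM"), ("bought", "Lotus"), ("bought", "yesterday"),
    ("sold", "IBM"), ("bought", "Lotus"),
]

pair_counts = Counter(dependencies)
head_counts = Counter(head for head, _ in dependencies)

def dependency_prob(head, modifier):
    """Maximum-likelihood estimate of P(modifier | head),
    analogous to a standard bigram relative-frequency estimate."""
    if head_counts[head] == 0:
        return 0.0
    return pair_counts[(head, modifier)] / head_counts[head]

print(dependency_prob("bought", "Lotus"))  # 2 of 4 "bought" dependencies -> 0.5
```

In practice such raw counts would be sparse, so the full model would combine them with backed-off or smoothed estimates; the sketch shows only the core relative-frequency step.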
This paper describes a new parser which is much simpler than SPATTER, yet performs at least as well when trained and tested on the same Wall Street Journal data. The method uses lexical information directly by modeling head-modifier relations between pairs of words. (By modifier we mean the linguistic notion of either an argument or adjunct.) In this way it is similar to link grammars (Lafferty et al. 92), and dependency grammars in general. (This research was supported by ARPA Grant N6600194-C6043.)

2 The .