Parsing the Wall Street Journal using a Lexical-Functional Grammar and Discriminative Estimation Techniques
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 271-278.

Stefan Riezler, Palo Alto Research Center, Palo Alto, CA 94304, riezler@
Richard Crouch, Palo Alto Research Center, Palo Alto, CA 94304, crouch@
Tracy H. King, Palo Alto Research Center, Palo Alto, CA 94304, thking@
John T. Maxwell III, Palo Alto Research Center, Palo Alto, CA 94304, maxwell@
Ronald M. Kaplan, Palo Alto Research Center, Palo Alto, CA 94304, kaplan@
Mark Johnson, Brown University, Providence, RI 02912, mj@

Abstract

We present a stochastic parsing system consisting of a Lexical-Functional Grammar (LFG), a constraint-based parser, and a stochastic disambiguation model. We report on the results of applying this system to parsing the UPenn Wall Street Journal (WSJ) treebank. The model combines full and partial parsing techniques to reach full grammar coverage on unseen data. The treebank annotations are used to provide partially labeled data for discriminative statistical estimation using exponential models. Disambiguation performance is evaluated by measuring matches of predicate-argument relations on two distinct test sets.
On a gold standard of manually annotated f-structures for a subset of the WSJ treebank, this evaluation reaches 79% F-score. An evaluation on a gold standard of dependency relations for Brown corpus data achieves 76% F-score.

1 Introduction

Statistical parsing using combined systems of hand-coded, linguistically fine-grained grammars and stochastic disambiguation components has seen considerable progress in recent years. However, such attempts have so far been confined to a relatively small scale for various reasons. Firstly, the rudimentary character of functional annotations in standard treebanks has hindered the direct use of such data for statistical estimation of linguistically ...
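
The evaluation measure named in the abstract, F-score over matched predicate-argument relations, can be illustrated with a minimal sketch. The encoding of a relation as a (relation, head, dependent) triple, the function name f_score, and the toy relations are assumptions made for illustration only; they are not taken from the paper, whose exact matching procedure is not described in this excerpt.

```python
from typing import Iterable, Tuple

# Hypothetical encoding: (relation name, head, dependent).
Relation = Tuple[str, str, str]


def f_score(gold: Iterable[Relation],
            predicted: Iterable[Relation]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) over exact relation matches.

    Relations are compared as sets, so duplicates are ignored
    (an assumption of this sketch).
    """
    gold_set, pred_set = set(gold), set(predicted)
    matches = len(gold_set & pred_set)
    precision = matches / len(pred_set) if pred_set else 0.0
    recall = matches / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-score: harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy example: the parser attaches one adjunct to the wrong head.
gold = [("subj", "sleep", "dog"), ("adjunct", "dog", "old"), ("tense", "sleep", "past")]
pred = [("subj", "sleep", "dog"), ("adjunct", "sleep", "old"), ("tense", "sleep", "past")]
precision, recall, f = f_score(gold, pred)
```

In the toy example two of the three relations match on each side, so precision, recall, and F-score all come out at two thirds. A score reported as a percentage would be this value multiplied by 100.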