New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron

Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 263-270.

Michael Collins
AT&T Labs-Research, Florham Park, New Jersey
mcollins@

Nigel Duffy
iKuni Inc., 3400 Hillview Ave., Building 5, Palo Alto, CA 94304
nigeduff@

Abstract

This paper introduces new learning algorithms for natural language processing based on the perceptron algorithm. We show how the algorithms can be efficiently applied to exponentially sized representations of parse trees, such as the "all subtrees" (DOP) representation described by (Bod 1998), or a representation tracking all sub-fragments of a tagged sentence. We give experimental results showing significant improvements on two tasks: parsing Wall Street Journal text, and named-entity extraction from web data.

1 Introduction

The perceptron algorithm is one of the oldest algorithms in machine learning, going back to (Rosenblatt 1958). It is an incredibly simple algorithm to implement, and yet it has been shown to be competitive with more recent learning methods such as support vector machines; see (Freund & Schapire 1999) for its application to image classification, for example. This paper describes how the perceptron and voted perceptron algorithms can be used for parsing and tagging problems.
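To make the setup concrete, here is a minimal sketch of the classic perceptron update referred to above: weights are adjusted additively on each mistake. All names and parameters here are illustrative, not from the paper.

```python
def perceptron_train(examples, n_features, epochs=10):
    """Classic perceptron (Rosenblatt 1958), sketched for illustration.

    examples: list of (feature_vector, label) pairs with label in {-1, +1}.
    Returns the learned weight vector.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in examples:
            # Score the example under the current weights.
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:
                # Mistake: nudge the weights toward y * x.
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w
```

On linearly separable data the algorithm converges after a bounded number of mistakes; the voted variant used in the paper additionally weights intermediate hypotheses by how long each survived.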
Crucially, the algorithms can be efficiently applied to exponentially sized representations of parse trees, such as the "all subtrees" (DOP) representation described by (Bod 1998), or a representation tracking all sub-fragments of a tagged sentence. It might seem paradoxical to be able to efficiently learn and apply a model with an exponential number of features.[1] The key to our algorithms is the "kernel trick" ((Cristianini and Shawe-Taylor 2000) discuss kernel methods at length).

[1] Although see (Goodman 1996) for an efficient algorithm for the DOP model, which we discuss in section 7 of this paper.
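The kernel trick mentioned above can be sketched by running the perceptron in dual form: the weight vector is never built explicitly but kept as an implicit sum over mistake examples, so the algorithm only ever needs kernel evaluations K(x, z) = ⟨φ(x), φ(z)⟩. If a tree kernel counts shared subtrees in polynomial time, this gives access to the exponential all-subtrees feature space. The code below is an illustrative sketch under those assumptions; the function names are not from the paper.

```python
def dual_perceptron_train(examples, K, epochs=10):
    """Perceptron in dual form, sketched for illustration.

    examples: list of (object, label) pairs with label in {-1, +1}.
    K: kernel function on pairs of objects (e.g. a tree kernel).
    Returns the dual expansion: the list of mistake examples.
    """
    alphas = []  # (object, label) for each mistake made during training
    for _ in range(epochs):
        for x, y in examples:
            # Score uses only kernel values, never explicit feature vectors.
            score = sum(yi * K(xi, x) for xi, yi in alphas)
            if y * score <= 0:
                # Mistake: add x to the dual expansion.
                alphas.append((x, y))
    return alphas

def dual_score(alphas, K, x):
    """Score a new object via the dual expansion."""
    return sum(yi * K(xi, x) for xi, yi in alphas)
```

With a linear kernel this reproduces the primal perceptron exactly; substituting a subtree-counting kernel changes only the function K, which is the point of the dual formulation.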