Text Chunking using Regularized Winnow
Tong Zhang, Fred Damerau, and David Johnson
IBM T.J. Watson Research Center, Yorktown Heights, New York 10598, USA

Abstract

Many machine learning methods have recently been applied to natural language processing tasks. Among them, the Winnow algorithm has been argued to be particularly suitable for NLP problems due to its robustness to irrelevant features. However, in theory, Winnow may not converge for non-separable data. To remedy this problem, a modification called regularized Winnow has been proposed. In this paper, we apply this new method to text chunking. We show that this method achieves state-of-the-art performance with significantly less computation than previous approaches.

1 Introduction

Recently there has been considerable interest in applying machine learning techniques to problems in natural language processing. One method that has been quite successful in many applications is the SNoW architecture (Dagan et al., 1997; Khardon et al., 1999). This architecture is based on the Winnow algorithm (Littlestone, 1988; Grove and Roth, 2001), which in theory is suitable for problems with many irrelevant attributes. In natural language processing one often encounters a very high dimensional feature space, although most of the features are irrelevant. Therefore the robustness of Winnow to high dimensional feature spaces is considered an important reason why it is suitable for NLP tasks. However, the convergence of the Winnow algorithm is only guaranteed for linearly separable data. In practical NLP applications, data are often linearly non-separable. Consequently, a direct application of Winnow may lead to numerical instability. A remedy for this, called regularized Winnow, has been recently proposed in (Zhang, 2001). This method modifies the original Winnow algorithm so that it solves a regularized optimization problem. It converges both in the linearly separable case and in the linearly non-separable case.
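To make the multiplicative updates being discussed concrete, the following is a minimal sketch of the classical (positive) Winnow algorithm of Littlestone (1988), which the paper builds on; it is not the regularized variant of Zhang (2001), and the parameter names and values (ALPHA, the threshold choice, epochs) are illustrative assumptions rather than settings from the paper.

```python
# Minimal sketch of classical Winnow (Littlestone, 1988) for binary features.
# Illustrative only; the regularized variant described in the paper is not
# implemented here, and ALPHA / theta / epochs are assumed values.

ALPHA = 2.0  # promotion/demotion factor (assumed)

def train_winnow(examples, n_features, epochs=10):
    """examples: list of (x, y) where x is a 0/1 feature vector, y in {0, 1}."""
    w = [1.0] * n_features       # all weights start at 1
    theta = float(n_features)    # a common choice of threshold
    for _ in range(epochs):
        for x, y in examples:
            score = sum(w[i] for i in range(n_features) if x[i])
            y_hat = 1 if score >= theta else 0
            if y_hat != y:       # mistake-driven, multiplicative update
                factor = ALPHA if y == 1 else 1.0 / ALPHA
                for i in range(n_features):
                    if x[i]:
                        w[i] *= factor
    return w, theta
```

Because the update is applied only on mistakes, linearly non-separable data can cause the weights to be promoted and demoted indefinitely, which is the convergence issue that motivates the regularized formulation discussed above.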