Scientific report: "Making Tree Kernels practical for Natural Language Learning"

Alessandro Moschitti
Department of Computer Science, University of Rome Tor Vergata, Rome, Italy
moschitti@

Abstract

In recent years tree kernels have been proposed for the automatic learning of natural language applications. Unfortunately, they show (a) an inherent superlinear complexity and (b) a lower accuracy than traditional attribute/value methods. In this paper, we show that tree kernels are very helpful in the processing of natural language as (a) we provide a simple algorithm to compute tree kernels in linear average running time and (b) our study on the classification properties of diverse tree kernels shows that kernel combinations always improve the traditional methods. Experiments with Support Vector Machines on the predicate argument classification task provide empirical support to our thesis.

1 Introduction

In recent years tree kernels have been shown to be interesting approaches for the modeling of syntactic information in natural language tasks, e.g. syntactic parsing (Collins and Duffy, 2002), relation extraction (Zelenko et al., 2003), Named Entity recognition (Cumby and Roth, 2003; Culotta and Sorensen, 2004) and Semantic Parsing (Moschitti, 2004). The main tree kernel advantage is the possibility to generate a high number of syntactic features and let the learning algorithm select those most relevant for a specific application.
In contrast, their major drawbacks are (a) the computational time complexity, which is superlinear in the number of tree nodes, and (b) the accuracy they produce, which is often lower than the one provided by linear models on manually designed features. To solve problem (a), a linear complexity algorithm for the subtree (ST) kernel computation was designed in (Vishwanathan and Smola, 2002). Unfortunately, the ST set is rather poorer than the one generated by the subset tree (SST) kernel designed in (Collins and Duffy, 2002). Intuitively, an ST rooted in a node n of the target tree always .
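
To make the difference between the ST and SST fragment spaces concrete, the following is a minimal sketch of the naive kernel computation that sums a fragment count over all node pairs, which is the superlinear (quadratic) baseline discussed above and not the linear average-time algorithm proposed in this paper. It uses the recursion commonly attributed to Collins and Duffy (2002). The names `Node`, `delta`, `tree_kernel`, the decay factor `lam` and the switch `sigma` (1 for SST, 0 for ST) are illustrative assumptions, and the sketch assumes parse trees in which words occur only under pre-terminal nodes.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List


@dataclass
class Node:
    """A parse tree node; a node without children is a word (leaf)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def production(n: Node):
    """The grammar production at n: its label plus its children's labels."""
    return (n.label, tuple(c.label for c in n.children))


def is_preterminal(n: Node) -> bool:
    """True if n has children and all of them are words."""
    return bool(n.children) and all(not c.children for c in n.children)


def all_nodes(t: Node) -> List[Node]:
    """All nodes of the tree rooted at t, in pre-order."""
    out = [t]
    for c in t.children:
        out.extend(all_nodes(c))
    return out


def delta(n1: Node, n2: Node, lam: float = 1.0, sigma: int = 1) -> float:
    """Weighted count of common fragments rooted at both n1 and n2.

    sigma=1 counts subset trees (SST); sigma=0 counts only complete
    subtrees (ST). Both parameters are assumptions of this sketch.
    """
    if not n1.children or not n2.children:
        return 0.0  # words alone root no fragment
    if production(n1) != production(n2):
        return 0.0
    if is_preterminal(n1) and is_preterminal(n2):
        return lam
    result = lam
    for c1, c2 in zip(n1.children, n2.children):
        result *= sigma + delta(c1, c2, lam, sigma)
    return result


def tree_kernel(t1: Node, t2: Node, lam: float = 1.0, sigma: int = 1) -> float:
    """Naive kernel: sum delta over every pair of nodes (quadratic time)."""
    return sum(delta(a, b, lam, sigma)
               for a, b in product(all_nodes(t1), all_nodes(t2)))


# Two toy noun phrases: (NP (D a) (N dog)) and (NP (D a) (N cat)).
t1 = Node("NP", [Node("D", [Node("a")]), Node("N", [Node("dog")])])
t2 = Node("NP", [Node("D", [Node("a")]), Node("N", [Node("cat")])])

print(tree_kernel(t1, t2, sigma=1))  # SST: 3.0
print(tree_kernel(t1, t2, sigma=0))  # ST: 1.0
```

On these two toy trees the SST setting finds three shared fragments (the bare NP production, the NP production with the determiner expanded, and the determiner alone), whereas the ST setting finds only the complete determiner subtree, which illustrates why the ST feature set is the poorer of the two.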
