Scientific report: "Domain Kernels for Word Sense Disambiguation"

Domain Kernels for Word Sense Disambiguation

Alfio Gliozzo and Claudio Giuliano and Carlo Strapparava
ITC-irst, Istituto per la Ricerca Scientifica e Tecnologica, I-38050 Trento, ITALY
gliozzo giuliano strappa @

Abstract

In this paper we present a supervised Word Sense Disambiguation methodology that exploits kernel methods to model sense distinctions. In particular, a combination of kernel functions is adopted to estimate independently both syntagmatic and domain similarity. We defined a kernel function, namely the Domain Kernel, that allowed us to plug "external knowledge" into the supervised learning process. External knowledge is acquired from unlabeled data in a totally unsupervised way, and it is represented by means of Domain Models. We evaluated our methodology on several lexical sample tasks in different languages, significantly outperforming the state of the art on each of them while reducing the amount of labeled training data required for learning.

1 Introduction

The main limitation of many supervised approaches for Natural Language Processing (NLP) is the lack of available annotated training data. This problem is known as the Knowledge Acquisition Bottleneck. To reach high accuracy, state-of-the-art systems for Word Sense Disambiguation (WSD) are designed according to a supervised learning framework, in which the disambiguation of each word in the lexicon is performed by constructing a different classifier. A large set of sense-tagged examples is then required to train each classifier. This methodology is called the word expert approach (Small, 1980; Yarowsky and Florian, 2002).
However, this is clearly unfeasible for all-words WSD tasks, in which all the words of an open text should be disambiguated. On the other hand, the word expert approach works very well for lexical sample WSD tasks (i.e., tasks in which it is required to disambiguate only those words for which enough training data is provided). As the original rationale of the lexical sample tasks was to define a .