tailieunhanh - Scientific report: "Transfer Learning, Feature Selection and Word Sense Disambiguation"

Paramveer S. Dhillon and Lyle H. Ungar
Computer and Information Science, University of Pennsylvania, Philadelphia, PA
pasingh ungar @

Abstract

We propose a novel approach for improving feature selection for Word Sense Disambiguation (WSD) by incorporating a feature relevance prior for each word, indicating which features are more likely to be selected. We use transfer of knowledge from similar words to learn this prior over the features, which permits us to learn higher-accuracy models, particularly for the rarer word senses. Results on the OntoNotes verb data show significant improvement over the baseline feature selection algorithm and results that are comparable to or better than other state-of-the-art methods.

1 Introduction

The task of WSD has mostly been studied in a supervised learning setting (e.g., Florian and Yarowsky, 2002), and feature selection has always been an important component of high-accuracy word sense disambiguation, as one often has thousands of features but only hundreds of observations of the words (Florian and Yarowsky, 2002). The main problem that arises with supervised WSD techniques, including ones that do feature selection, is the paucity of labeled data.
For example, the training set of the Senseval-2 English lexical sample task has only 10 labeled examples per sense (Florian and Yarowsky, 2002), which makes it difficult to build high-accuracy models using only supervised learning techniques. It is thus an attractive alternative to use transfer learning (Ando and Zhang, 2005), which improves performance by generalizing from solutions to similar learning problems. Ando (2006), abbreviated as Ando CoNLL'06, successfully applied the ASO (Alternating Structure Optimization) technique proposed by Ando and Zhang (2005), in its transfer learning configuration, to the problem of WSD by doing joint empirical risk minimization of a set of related problems (words, in this case). In this paper we show .
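To make the idea of a per-word feature relevance prior learned from similar words more concrete, here is a minimal sketch. It is not the authors' algorithm: the excerpt above does not say how the prior is estimated, so this sketch simply assumes the prior for a feature is the Laplace-smoothed fraction of similar words whose models selected that feature. The function name `feature_relevance_prior`, the smoothing parameter `alpha`, and the toy words and feature names are all hypothetical.

```python
from collections import Counter


def feature_relevance_prior(selected_by_word, all_features, alpha=1.0):
    """Estimate, for each feature, a prior probability of being relevant.

    ASSUMPTION (not from the paper): the prior is the smoothed fraction of
    similar words whose feature selection kept the feature.
    """
    n_words = len(selected_by_word)
    counts = Counter()
    for features in selected_by_word.values():
        # Count each feature at most once per similar word.
        counts.update(set(features))
    # Laplace smoothing keeps the prior strictly between 0 and 1, so a
    # feature never selected for similar words is unlikely, not impossible.
    return {
        f: (counts[f] + alpha) / (n_words + 2.0 * alpha)
        for f in all_features
    }


# Hypothetical toy data: features selected for three words assumed to be
# similar to the target word.
selected_by_similar_word = {
    "arrest": {"prev_pos=NN", "obj_is_person"},
    "capture": {"prev_pos=NN"},
    "seize": {"prev_pos=NN", "subj_is_animate"},
}
all_features = ["prev_pos=NN", "obj_is_person", "subj_is_animate", "topic=finance"]

prior = feature_relevance_prior(selected_by_similar_word, all_features)
for name in all_features:
    print(f"{name}: {prior[name]:.2f}")
```

Under these assumptions, a feature chosen for every similar word receives a high prior and one chosen for none receives a low but non-zero prior. A feature selection procedure for the target word could then favor high-prior features when labeled examples are scarce, which is the situation the abstract highlights for rarer senses.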
