tailieunhanh - Scientific report: "A Kernel PCA Method for Superior Word Sense Disambiguation"

Dekai Wu1, Weifeng Su, Marine Carpuat
dekai@ weifeng@ marine@
Human Language Technology Center, HKUST
Department of Computer Science
University of Science and Technology, Clear Water Bay, Hong Kong

Abstract

We introduce a new method for disambiguating word senses that exploits a nonlinear Kernel Principal Component Analysis (KPCA) technique to achieve accuracy superior to the best published individual models. We present empirical results demonstrating significantly better accuracy compared to the state of the art achieved by either naïve Bayes or maximum entropy models on Senseval-2 data. We also contrast against another type of kernel method, the support vector machine (SVM) model, and show that our KPCA-based model outperforms the SVM-based model. It is hoped that these highly encouraging first results on KPCA for natural language processing tasks will inspire further development of these directions.

1 Introduction

Achieving higher precision in supervised word sense disambiguation (WSD) tasks without resorting to ad hoc voting or similar ensemble techniques has become somewhat daunting in recent years, given the challenging benchmarks set by naïve Bayes models (e.g., Mooney 1996; Chodorow et al. 1999; Pedersen 2001; Yarowsky and Florian 2002) as well as maximum entropy models (e.g., Dang and Palmer 2002; Klein and Manning 2002).
A good foundation for comparative studies has been established by the Senseval data and evaluations; of particular relevance here are the lexical sample tasks from Senseval-1 (Kilgarriff and Rosenzweig 1999) and Senseval-2 (Kilgarriff 2001). We therefore chose this problem to introduce an efficient and accurate new word sense disambiguation approach that exploits a nonlinear Kernel PCA technique to make predictions implicitly based on generalizations over feature combinations.

1 The author would like to thank the Hong Kong Research Grants Council (RGC) for supporting this research
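The excerpt above names nonlinear Kernel PCA but does not show how the technique is set up, so the following is only a minimal sketch of generic Kernel PCA in NumPy, not the authors' WSD model. The polynomial kernel, its degree, the function names, and the toy binary feature vectors are all assumptions made for illustration. The point it tries to show is how a kernel lets the projection depend on feature combinations without enumerating them explicitly.

```python
import numpy as np

def polynomial_kernel(X, Y, degree=2):
    """Polynomial kernel k(x, y) = (x . y + 1)^degree for all pairs of rows."""
    return (X @ Y.T + 1.0) ** degree

def kpca_fit(X, n_components, degree=2):
    """Fit generic Kernel PCA: center the kernel matrix, then eigendecompose it."""
    n = X.shape[0]
    K = polynomial_kernel(X, X, degree)
    one_n = np.full((n, n), 1.0 / n)
    # Center the data in the implicit feature space.
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    eigvals, eigvecs = np.linalg.eigh(K_centered)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[top], eigvecs[:, top]
    # Scale so the principal axes have unit norm in feature space.
    alphas = eigvecs / np.sqrt(eigvals)
    return {"X": X, "K": K, "alphas": alphas, "degree": degree}

def kpca_transform(model, X_new):
    """Project new points onto the learned nonlinear principal components."""
    X, K = model["X"], model["K"]
    n, m = X.shape[0], X_new.shape[0]
    K_new = polynomial_kernel(X_new, X, model["degree"])
    one_n = np.full((n, n), 1.0 / n)
    one_mn = np.full((m, n), 1.0 / n)
    # Center the new kernel rows consistently with the training kernel.
    K_new_centered = K_new - one_mn @ K - K_new @ one_n + one_mn @ K @ one_n
    return K_new_centered @ model["alphas"]

# Hypothetical toy data: each row is a binary context-feature vector.
X_train = np.array([
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
], dtype=float)

model = kpca_fit(X_train, n_components=2)
Z_train = kpca_transform(model, X_train)
```

With a degree-2 polynomial kernel, the implicit feature space contains products of pairs of input features, so each component of `Z_train` can reflect co-occurring context features even though only dot products between the original vectors are computed. How the projected vectors would then be turned into sense predictions is not described in this excerpt and is left open here.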
