Scientific report: "Revision Learning and its Application to Part-of-Speech Tagging"
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 497-504.

Tetsuji Nakagawa, Taku Kudo and Yuji Matsumoto
tetsu-na@ taku-ku matsu @
Graduate School of Information Science, Nara Institute of Science and Technology
8916-5 Takayama, Ikoma, Nara 630-0101, Japan

Abstract

This paper presents a revision learning method that achieves high performance with small computational cost by combining a model with high generalization capacity and a model with small computational cost. This method uses a high capacity model to revise the output of a small cost model. We apply this method to English part-of-speech tagging and Japanese morphological analysis, and show that the method performs well.

1 Introduction

Recently, corpus-based approaches have been widely studied in many natural language processing tasks, such as part-of-speech (POS) tagging, syntactic analysis, text categorization and word sense disambiguation. In corpus-based natural language processing, one important issue is to decide which learning model to use. Various learning models have been studied, such as Hidden Markov models (HMMs) (Rabiner and Juang, 1993), decision trees (Breiman et al., 1984) and maximum entropy models (Berger et al., 1996). Recently, Support Vector Machines (SVMs) (Vapnik, 1998; Cortes and Vapnik, 1995), a supervised machine learning algorithm for binary classification, have come into use.
SVMs have good generalization performance, can handle a large number of features, and have been applied successfully to some tasks (Joachims, 1998; Kudoh and Matsumoto, 2000). However, their computational cost is large, which is a weakness of SVMs. In general, a trade-off exists between the capacity and the computational cost of learning models. For example, SVMs have relatively high generalization capacity but high computational cost. On

Author footnote: Presently with Oki Electric Industry.
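The abstract's central idea, a high capacity model revising the output of a small cost model, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that the cheap model ranks candidate tags, that the expensive model is a binary accept/reject checker consulted in rank order, and that the top-ranked tag is kept when every candidate is rejected. The names `TAG_COUNTS`, `rank_tags`, `toy_checker`, `revise` and `tag_sentence`, the toy frequencies and the single rejection rule are all hypothetical and stand in for trained models.

```python
from typing import Callable, Dict, List

# Hypothetical cheap model: per-word tag frequencies (toy numbers, not from the paper).
TAG_COUNTS: Dict[str, Dict[str, int]] = {
    "the": {"DT": 100},
    "can": {"MD": 80, "NN": 15, "VB": 5},
    "rusts": {"VBZ": 3, "NNS": 1},
}


def rank_tags(word: str) -> List[str]:
    """Cheap model: order candidate tags by descending frequency."""
    counts = TAG_COUNTS.get(word, {"NN": 1})  # unknown words default to NN (assumption)
    return sorted(counts, key=lambda tag: (-counts[tag], tag))


def toy_checker(prev_tag: str, word: str, tag: str) -> bool:
    """Stand-in for the expensive binary classifier (an SVM in the paper's setting).

    Toy rule: reject a modal verb directly after a determiner.
    """
    return not (prev_tag == "DT" and tag == "MD")


def revise(ranked: List[str], accept: Callable[[str], bool]) -> str:
    """Return the first ranked candidate the checker accepts.

    Falling back to the top-ranked candidate is an assumption of this sketch.
    """
    for tag in ranked:
        if accept(tag):
            return tag
    return ranked[0]


def tag_sentence(words: List[str]) -> List[str]:
    """Tag left to right, calling the checker only on ranked candidates."""
    tags: List[str] = []
    prev = "<s>"  # sentence-start marker
    for word in words:
        ranked = rank_tags(word)
        tag = revise(ranked, lambda t, p=prev, w=word: toy_checker(p, w, t))
        tags.append(tag)
        prev = tag
    return tags


if __name__ == "__main__":
    print(tag_sentence(["the", "can", "rusts"]))
```

In this sketch the expensive checker is called once for most words, because the cheap model's first candidate is usually accepted. That is the intuition behind obtaining high capacity at a small additional cost.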