Scientific report: "Hierarchical Text Classification with Latent Concepts"

Hierarchical Text Classification with Latent Concepts

Xipeng Qiu, Xuanjing Huang, Zhao Liu and Jinlong Zhou
School of Computer Science, Fudan University
{xpqiu, xjhuang}@fudan.edu.cn, {zliu.fd, abc9703}@gmail.com

Abstract

Recently, hierarchical text classification has become an active research topic. The essential idea is that the descendant classes can share the information of the ancestor classes in a predefined taxonomy. In this paper, we claim that each class has several latent concepts and that its subclasses share information with these different concepts respectively. We then propose a variant of the Passive-Aggressive (PA) algorithm for hierarchical text classification with latent concepts. Experimental results show that the performance of our algorithm is competitive with recently proposed hierarchical classification algorithms.

1 Introduction

Text classification is a crucial and well-proven method for organizing large-scale document collections. The predefined categories are formed by different criteria, e.g. "Entertainment", "Sports" and "Education" in news classification, or "Junk Email" and "Ordinary Email" in email classification. In the literature, many algorithms (Sebastiani, 2002; Yang and Liu, 1999; Yang and Pedersen, 1997) have been proposed, such as Support Vector Machines (SVM), k-Nearest Neighbor (kNN), Naive Bayes (NB) and so on. Empirical evaluations have shown that most of these methods are quite effective in traditional text classification applications.
In the past several years, hierarchical text classification has become an active research topic in the database area (Koller and Sahami, 1997; Weigend et al., 1999) and the machine learning area (Rousu et al., 2006; Cai and Hofmann, 2007). Unlike in traditional classification, the document collections in many application fields are organized as a hierarchical class structure: web taxonomies (e.g. the Yahoo Directory, http://dir.yahoo.com, and the Open Directory Project (ODP), http://dmoz.org), email folders and product catalogs. The approaches of hierarchical .