Scientific report: "Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora"

Bin Lu (1,3), Chenhao Tan (2), Claire Cardie (2), and Benjamin K. Tsou (3,1)
(1) Department of Chinese, Translation and Linguistics, City University of Hong Kong, Hong Kong
(2) Department of Computer Science, Cornell University, Ithaca, NY, USA
(3) Research Centre on Linguistics and Language Information Sciences, Hong Kong Institute of Education, Hong Kong
lubin2010@ chenhao cardiel@ btsou99@

The work was conducted when the first author was visiting Cornell University.

Abstract

Most previous work on multilingual sentiment analysis has focused on methods to adapt sentiment resources from resource-rich languages to resource-poor languages. We present a novel approach for joint bilingual sentiment classification at the sentence level that augments available labeled data in each language with unlabeled parallel data. We rely on the intuition that the sentiment labels for parallel sentences should be similar and present a model that jointly learns improved monolingual sentiment classifiers for each language. Experiments on multiple data sets show that the proposed approach (1) outperforms the monolingual baselines, significantly improving the accuracy for both languages by [...]; (2) outperforms two standard approaches for leveraging unlabeled data; and (3) produces (albeit smaller) performance gains when employing pseudo-parallel data from machine translation engines.

1 Introduction

The field of sentiment analysis has quickly attracted the attention of researchers and practitioners alike (e.g. Pang et al., 2002; Turney, 2002; Hu and Liu, 2004; Wiebe et al., 2005; Breck et al., 2007; Pang and Lee, 2008). Indeed, sentiment analysis systems, which mine opinions from textual sources (e.g. news, blogs, and reviews), can be used in a wide variety of applications, including interpreting product reviews, opinion retrieval, and political polling. Not surprisingly, most methods for sentiment classification are supervised learning
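
The intuition stated in the abstract, that parallel sentences should carry similar sentiment labels, can be illustrated with a minimal sketch. This is not the model proposed in the paper, whose details are not given in this excerpt. It only assumes two hypothetical monolingual classifiers that each return a label distribution for their side of a parallel sentence pair, and it combines them by renormalizing the probability that both sides share the same label. The function name `agreement_posterior` and the example probabilities are illustrative assumptions.

```python
def agreement_posterior(p_lang1, p_lang2):
    """Combine two monolingual label distributions for one parallel pair.

    Illustrative assumption (not the paper's model): both sentences share
    a single sentiment label, so each label is scored by the product of
    the two classifiers' probabilities, and the scores are renormalized.
    """
    if set(p_lang1) != set(p_lang2):
        raise ValueError("both distributions must cover the same labels")
    # Probability that both classifiers assign label y, for each y.
    scores = {y: p_lang1[y] * p_lang2[y] for y in p_lang1}
    total = sum(scores.values())
    if total == 0:
        raise ValueError("the classifiers never agree on any label")
    return {y: s / total for y, s in scores.items()}


# Hypothetical outputs: the first classifier is unsure, the second is confident.
english = {"positive": 0.6, "negative": 0.4}
chinese = {"positive": 0.9, "negative": 0.1}
combined = agreement_posterior(english, chinese)
```

In this toy setting the uncertain prediction on one side is sharpened by the confident prediction on the other, which is one simple way unlabeled parallel data could inform both monolingual classifiers.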
