Scientific report: "Using Emoticons to Reduce Dependency in Machine Learning Techniques for Sentiment Classification"
Jonathon Read
Department of Informatics, University of Sussex, United Kingdom

Abstract

Sentiment Classification seeks to identify a piece of text according to its author's general feeling toward their subject, be it positive or negative. Traditional machine learning techniques have been applied to this problem with reasonable success, but they have been shown to work well only when there is a good match between the training and test data with respect to topic. This paper demonstrates that match with respect to domain and time is also important, and presents preliminary experiments with training data labeled with emoticons, which has the potential of being independent of domain, topic and time.

1 Introduction

Recent years have seen an increasing amount of research effort expended in the area of understanding sentiment in textual resources. A sub-topic of this research is that of Sentiment Classification. That is, given a problem text, can computational methods determine if the text is generally positive or generally negative? Several diverse applications exist for this potential technology, ranging from the automatic filtering of abusive messages (Spertus, 1997) to an in-depth analysis of market trends and consumer opinions (Dave et al., 2003).
This is a complex and challenging task for a computer to achieve; consider the difficulties involved in instructing a computer to recognise sarcasm, for example. Previous work has shown that traditional text classification approaches can be quite effective when applied to the sentiment analysis problem. Models such as Naive Bayes (NB), Maximum Entropy (ME) and Support Vector Machines (SVM) can determine the sentiment of texts. Pang et al. (2002) used a bag-of-features framework (based on unigrams and bigrams) to train these models from a corpus of movie reviews labelled as positive or negative. The best accuracy achieved was using .
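To make the bag-of-features idea concrete, the following is a minimal sketch of a Naive Bayes classifier trained on unigram and bigram presence features. It is an illustration under stated assumptions, not the setup used in the cited work: the whitespace tokeniser, the add-one smoothing, the names `extract_features` and `NaiveBayes`, and the four toy training sentences are all hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict


def extract_features(text):
    """Return the set of unigrams and bigrams present in a text.

    Assumption: lowercased whitespace tokenisation, chosen for brevity.
    """
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return set(tokens) | set(bigrams)


class NaiveBayes:
    """Naive Bayes over feature presence, with add-one smoothing."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)            # documents per class
        self.feature_counts = defaultdict(Counter)   # class -> feature -> count
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            features = extract_features(text)
            # Each feature is counted at most once per document (presence).
            self.feature_counts[label].update(features)
            self.vocabulary |= features
        return self

    def predict(self, text):
        # Features never seen in training are ignored.
        features = extract_features(text) & self.vocabulary
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)    # log prior
            class_total = sum(self.feature_counts[label].values())
            for feature in features:
                count = self.feature_counts[label][feature]
                score += math.log((count + 1) / (class_total + vocab_size))
            scores[label] = score
        return max(scores, key=scores.get)


# Toy training data, invented for illustration only.
train_texts = [
    "a great film",
    "great acting and a great plot",
    "a dull film",
    "dull plot and terrible acting",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("great plot"))
print(model.predict("terrible film"))
```

Because the features are plain strings, the same classifier could in principle be trained on any labelled corpus, including one whose labels come from emoticons rather than from review ratings.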