Scientific report: "Identifying Sarcasm in Twitter: A Closer Look"

Identifying Sarcasm in Twitter: A Closer Look

Roberto González-Ibánez, Smaranda Muresan, Nina Wacholder
School of Communication & Information
Rutgers, The State University of New Jersey
4 Huntington St, New Brunswick, NJ 08901
rgonzal, smuresan, ninwac @

Abstract

Sarcasm transforms the polarity of an apparently positive or negative utterance into its opposite. We report on a method for constructing a corpus of sarcastic Twitter messages in which determination of the sarcasm of each message has been made by its author. We use this reliable corpus to compare sarcastic utterances in Twitter to utterances that express positive or negative attitudes without sarcasm. We investigate the impact of lexical and pragmatic factors on machine learning effectiveness for identifying sarcastic utterances, and we compare the performance of machine learning techniques and human judges on this task. Perhaps unsurprisingly, neither the human judges nor the machine learning techniques perform very well.

1 Introduction

Automatic detection of sarcasm is still in its infancy. One reason for the lack of computational models has been the absence of accurately labeled, naturally occurring utterances that can be used to train machine learning systems.
Microblogging platforms such as Twitter, which allow users to communicate feelings, opinions and ideas in short messages and to assign labels to their own messages, have been recently exploited in sentiment and opinion analysis (Pak and Paroubek, 2010; Davidov et al., 2010). In Twitter, messages can be annotated with hashtags such as #bicycling, #happy and #sarcasm. We use these hashtags to build a labeled corpus of naturally occurring sarcastic, positive and negative tweets. In this paper, we report on an empirical study on the use of lexical and pragmatic factors to distinguish sarcasm from positive and negative sentiments expressed in Twitter messages. The contributions of this paper include (i) creation of a corpus that includes only sarcastic utterances
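
The idea of using author-assigned hashtags as class labels can be illustrated with a minimal sketch. It is not the procedure used in the paper: the function name `label_tweet`, the mapping `LABEL_TAGS`, the choice of #sad as a negative tag, the removal of the label hashtag from the text, and the rule of discarding tweets that carry conflicting label tags are all assumptions made for illustration.

```python
import re

# Hypothetical mapping from hashtag to class label. Only #sarcasm and #happy
# are named in the text above; #sad is an assumed example of a negative tag.
LABEL_TAGS = {
    "sarcasm": "sarcastic",
    "happy": "positive",
    "sad": "negative",
}

HASHTAG_RE = re.compile(r"#(\w+)")


def label_tweet(text):
    """Return (label, cleaned_text), or None if the tweet has no single label.

    A tweet is kept only when its label hashtags point to exactly one class.
    The label hashtags are then removed so that a classifier trained on the
    cleaned text cannot simply read the label off the message (an assumption
    of this sketch, not a step confirmed by the excerpt).
    """
    tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
    labels = {LABEL_TAGS[tag] for tag in tags if tag in LABEL_TAGS}
    if len(labels) != 1:
        return None  # unlabeled, or conflicting labels such as #happy #sarcasm

    def drop_label_tag(match):
        # Remove label hashtags; keep topical ones such as #bicycling.
        return "" if match.group(1).lower() in LABEL_TAGS else match.group(0)

    cleaned = " ".join(HASHTAG_RE.sub(drop_label_tag, text).split())
    return labels.pop(), cleaned


if __name__ == "__main__":
    for tweet in ["I love Mondays #sarcasm", "Great ride #bicycling #happy"]:
        print(label_tweet(tweet))
```

Topical hashtags such as #bicycling are left in place here because they describe the subject of the message and do not reveal its class.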
