Scientific report: "A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts"
Bo Pang and Lillian Lee
Department of Computer Science, Cornell University, Ithaca, NY 14853-7501
pabo llee @

Abstract

Sentiment analysis seeks to identify the viewpoint(s) underlying a text span; an example application is classifying a movie review as “thumbs up” or “thumbs down”. To determine this sentiment polarity, we propose a novel machine-learning method that applies text-categorization techniques to just the subjective portions of the document. Extracting these portions can be implemented using efficient techniques for finding minimum cuts in graphs; this greatly facilitates incorporation of cross-sentence contextual constraints.

1 Introduction

The computational treatment of opinion, sentiment, and subjectivity has recently attracted a great deal of attention (see references), in part because of its potential applications. For instance, information-extraction and question-answering systems could flag statements and queries regarding opinions rather than facts (Cardie et al., 2003). Also, it has proven useful for companies, recommender systems, and editorial sites to create summaries of people's experiences and opinions that consist of subjective expressions extracted from reviews (as is commonly done in movie ads) or even just a review's polarity: positive (“thumbs up”) or negative (“thumbs down”).

Document polarity classification poses a significant challenge to data-driven methods, resisting traditional text-categorization techniques (Pang, Lee, and Vaithyanathan, 2002). Previous approaches focused on selecting indicative lexical features (e.g., the word “good”), classifying a document according to the number of such features that occur anywhere within it. In contrast, we propose the following process: (1) label the sentences in the document as either subjective or objective, discarding the latter; and then (2) apply a standard machine-learning classifier to the resulting extract. This can ...
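
The two-step process described above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: the word lists, the cue-based subjectivity detector, and the count-based polarity classifier are hypothetical stand-ins for trained classifiers, and every function name is introduced here for illustration only.

```python
import re
from typing import Callable, List

# Hypothetical toy lexicons (assumptions for illustration, not from the paper).
SUBJECTIVE_CUES = {"i", "my", "me", "think", "felt", "loved", "hated"}
POSITIVE_WORDS = {"good", "great", "brilliant", "loved"}
NEGATIVE_WORDS = {"bad", "boring", "awful", "hated"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(document: str) -> List[str]:
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", document)
    return [p.strip() for p in parts if p.strip()]


def is_subjective(sentence: str) -> bool:
    """Stand-in subjectivity detector: true if any cue word occurs."""
    return any(tok in SUBJECTIVE_CUES for tok in tokenize(sentence))


def classify_polarity(text: str) -> str:
    """Stand-in polarity classifier: compare positive and negative word counts."""
    tokens = tokenize(text)
    pos = sum(tok in POSITIVE_WORDS for tok in tokens)
    neg = sum(tok in NEGATIVE_WORDS for tok in tokens)
    return "positive" if pos >= neg else "negative"


def subjective_extract(
    document: str, detector: Callable[[str], bool] = is_subjective
) -> str:
    """Step 1: keep the subjective sentences and discard the objective ones."""
    kept = [s for s in split_sentences(document) if detector(s)]
    return " ".join(kept)


def document_polarity(
    document: str,
    detector: Callable[[str], bool] = is_subjective,
    classifier: Callable[[str], str] = classify_polarity,
) -> str:
    """Step 2: classify only the extract produced by step 1."""
    return classifier(subjective_extract(document, detector))


REVIEW = (
    "The story follows a good cop and a great detective chasing a brilliant thief. "
    "I felt the plot was boring. I hated the ending."
)
print(classify_polarity(REVIEW))   # whole document, plot summary included
print(document_polarity(REVIEW))   # subjective extract only
```

In this toy review, the plot-summary sentence contains positive words that say nothing about the reviewer's opinion. Counting words over the whole document yields "positive", while classifying only the subjective extract yields "negative". Both `detector` and `classifier` are passed as parameters so that trained models could replace the stand-ins.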