Scientific report: "Generalization Methods for In-Domain and Cross-Domain Opinion Holder Extraction"

Michael Wiegand and Dietrich Klakow
Spoken Language Systems, Saarland University, D-66123 Saarbrücken, Germany

Abstract

In this paper, we compare three different generalization methods for in-domain and cross-domain opinion holder extraction, namely simple unsupervised word clustering, an induction method inspired by distant supervision, and the usage of lexical resources. The generalization methods are incorporated into diverse classifiers. We show that generalization yields significant improvements and that the size of the improvement depends on the type of classifier and on how much training and test data differ from each other. We also address the less common case of opinion holders being realized in patient position and suggest approaches, including a novel linguistically informed extraction method, for detecting those opinion holders without labeled training data, as standard datasets contain too few instances of this type.

1 Introduction

Opinion holder extraction is one of the most important subtasks in sentiment analysis. The extraction of sources of opinions is an essential component for complex real-life applications, such as opinion question answering systems or opinion summarization systems (Stoyanov and Cardie, 2011). Common approaches designed to extract opinion holders are based on data-driven methods, in particular supervised learning.
In this paper, we examine the role of generalization for opinion holder extraction in both in-domain and cross-domain classification. Generalization may not only help to compensate for the limited availability of labeled training data but also reconcile domain mismatches. To illustrate this, compare, for instance, (1) and (2).

(1) Malaysia did not agree to such treatment of Al-Qaeda soldiers as they were prisoners-of-war and should be accorded treatment as provided for under the Geneva Convention.

(2) Japan wishes to
