Scientific report: "Improving the Use of Pseudo-Words for Evaluating Selectional Preferences"
Improving the Use of Pseudo-Words for Evaluating Selectional Preferences

Nathanael Chambers and Dan Jurafsky
Department of Computer Science, Stanford University
{natec,jurafsky}@stanford.edu

Abstract

This paper improves the use of pseudo-words as an evaluation framework for selectional preferences. While pseudo-words originally evaluated word sense disambiguation, they are now commonly used to evaluate selectional preferences. A selectional preference model ranks a set of possible arguments for a verb by their semantic fit to the verb. Pseudo-words serve as a proxy evaluation for these decisions. The evaluation takes an argument of a verb like drive (e.g., car), pairs it with an alternative word (e.g., car/rock), and asks a model to identify the original. This paper studies two main aspects of pseudo-word creation that affect performance results. (1) Pseudo-word evaluations often evaluate only a subset of the words. We show that selectional preferences should instead be evaluated on the data in its entirety. (2) Different approaches to selecting partner words can produce overly optimistic evaluations. We offer suggestions to address these factors and present a simple baseline that outperforms the state-of-the-art by 13% absolute on a newspaper domain.

1 Introduction

For many natural language processing (NLP) tasks, particularly those involving meaning, creating labeled test data is difficult or expensive. One way to mitigate this problem is with pseudo-words, a method for automatically creating test corpora without human labeling, originally proposed for word sense disambiguation (Gale et al., 1992; Schütze, 1992). While pseudo-words are now less often used for word sense disambiguation, they are a common way to evaluate selectional preferences, models that measure the strength of association between a predicate and its argument filler, e.g., that the noun lunch is a likely object of eat. Selectional preferences are useful for NLP tasks such as parsing and semantic role labeling (Zapirain et al., 2009