Generalizing over Lexical Features: Selectional Preferences for Semantic Role Classification

Benat Zapirain, Eneko Agirre
Ixa Taldea, University of the Basque Country
Donostia, Basque Country

Lluis Marquez
TALP Research Center, Technical University of Catalonia
Barcelona, Catalonia

Abstract

This paper explores methods to alleviate the effect of lexical sparseness in the classification of verbal arguments. We show how automatically generated selectional preferences are able to generalize and perform better than lexical features on a large dataset for semantic role classification. The best results are obtained with a novel second-order distributional similarity measure, and the positive effect is especially relevant for out-of-domain data. Our findings suggest that selectional preferences have potential for improving a full system for Semantic Role Labeling.

1 Introduction

Semantic Role Labeling (SRL) systems usually approach the problem as a sequence of two subtasks: argument identification and classification. While the former is mostly a syntactic task, the latter requires semantic knowledge to be taken into account. Current systems capture semantics through lexicalized features on the predicate and the head word of the argument to be classified.
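To make the notion of lexicalized features concrete, the following is a minimal sketch, not the feature set of any actual SRL system. The function names, the feature templates, and the toy training instances are all hypothetical and chosen only for illustration. It builds string-valued features from a predicate and an argument head word, then checks which features of a new instance were never observed in training.

```python
def lexical_features(predicate, head_word):
    """Return lexicalized features for one (predicate, argument head) pair.

    The three templates below are illustrative assumptions, not the
    templates used in the paper.
    """
    return {
        f"pred={predicate}",
        f"head={head_word}",
        f"pred+head={predicate}|{head_word}",
    }


# Toy training instances (hypothetical): (predicate, head word, role).
train = [
    ("assassinate", "Dallas", "Location"),
    ("assassinate", "November", "Temporal"),
    ("arrive", "Monday", "Temporal"),
]

# Collect every feature string observed in training.
seen = set()
for predicate, head, _role in train:
    seen |= lexical_features(predicate, head)


def unseen_features(predicate, head_word):
    """Features of a new instance that never occurred in training."""
    return lexical_features(predicate, head_word) - seen


known = unseen_features("assassinate", "Dallas")    # head seen in training
novel = unseen_features("assassinate", "Houston")   # head not seen in training
```

In this sketch, an instance whose head word occurred in training has no unseen features, whereas a new head word leaves only the predicate feature for a classifier to rely on.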
Since lexical features tend to be sparse (especially when the training corpus is small), SRL systems are prone to overfit the training data and generalize poorly to new corpora. This work explores the usefulness of selectional preferences to alleviate the lexical dependence of SRL systems. Selectional preferences introduce semantic generalizations on the type of arguments preferred by the predicates. Therefore, they are expected to improve generalization on infrequent and unknown words and to increase the discriminative power of the argument classifiers. For instance, consider these two sentences:

JFK was assassinated [in Dallas] Location
JFK was assassinated [in November] Temporal

Both share .