Scientific report: "Semantic Classification of Noun Phrases Using Web Counts and Learning Algorithms"

Paul Nulty
School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4, Ireland

Abstract

This paper investigates the use of machine learning algorithms to label modifier-noun compounds with a semantic relation. The attributes used as input to the learning algorithms are the web frequencies for phrases containing the modifier, noun, and a prepositional joining term. We compare and evaluate different algorithms and different joining phrases on Nastase and Szpakowicz’s (2003) dataset of 600 modifier-noun compounds. We find that by using a Support Vector Machine classifier we can obtain better performance on this dataset than a current state-of-the-art system, even with a relatively small set of prepositional joining terms.

1 Introduction

Noun-modifier word pairs occur frequently in many languages, and the problem of semantic disambiguation of these phrases has many potential applications in areas such as question-answering and machine translation. One very common approach to this problem is to define a set of semantic relations which capture the interaction between the modifier and the head noun, and then attempt to assign one of these semantic relations to each noun-modifier pair.
For example, the phrase flu virus could be assigned the semantic relation causal (the virus causes the flu); the relation for desert storm could be location (the storm is located in the desert). There is no consensus as to which set of semantic relations best captures the differences in meaning of various noun phrases. Work in theoretical linguistics has suggested that noun-noun compounds may be formed by the deletion of a predicate verb or preposition (Levi 1978). However, whether the set of possible predicates numbers 5 or 50, there are likely to be some examples of noun phrases that fit into none of the categories, and some that fit in multiple categories. Modifier-noun phrases .
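
The feature representation described in the abstract (one web frequency per phrase built from the modifier, the noun, and a joining term) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the list of joining terms, the "noun + joining term + modifier" phrase template, the log scaling of counts, and the toy counts that stand in for real web hit counts.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical joining terms; the set actually used in the paper is not given here.
JOINING_TERMS: Sequence[str] = ("of", "in", "for", "from", "with")


def build_queries(modifier: str, noun: str,
                  joining_terms: Sequence[str] = JOINING_TERMS) -> List[str]:
    """Build one phrase per joining term, e.g. 'storm in desert' (assumed template)."""
    return [f"{noun} {term} {modifier}" for term in joining_terms]


def feature_vector(modifier: str, noun: str,
                   count_lookup: Callable[[str], int],
                   joining_terms: Sequence[str] = JOINING_TERMS) -> List[float]:
    """Return one log-scaled frequency per phrase; log(1 + c) keeps zero counts finite."""
    queries = build_queries(modifier, noun, joining_terms)
    return [math.log1p(count_lookup(query)) for query in queries]


# Invented counts standing in for web frequencies (illustration only).
TOY_COUNTS: Dict[str, int] = {
    "storm in desert": 1200,
    "storm of desert": 40,
    "virus of flu": 300,
}


def toy_lookup(phrase: str) -> int:
    """Look up a phrase in the toy table, returning 0 for unseen phrases."""
    return TOY_COUNTS.get(phrase, 0)


desert_storm = feature_vector("desert", "storm", toy_lookup)
flu_virus = feature_vector("flu", "virus", toy_lookup)
```

Each compound becomes a fixed-length numeric vector with one position per joining term. Paired with its semantic relation label (for example location for desert storm), such a vector is the kind of input a standard classifier such as a Support Vector Machine can be trained on.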