tailieunhanh - Báo cáo khoa học: "AUTOMATIC NOUN CLASSIFICATION BY USING "
Focusing on nouns, the idea claims that each noun is characterized by Verbs with which it occurs, and also that nouns are similar to the extent that they s h a r e verbs. These a u t o m a t i c classification approaches are also p a r t i a l l y successful. However, Hindle says that there is a number of issues to be confronted. The most important issue is that of "polysemy". In Hindle's experiment, two senses of"table", that is to say "table under which one can hide" and "table which. | AUTOMATIC NOUN CLASSIFICATION BY USING JAPANESE-ENGLISH WORD PAIRS Naomi Inoue KDD R D Laboratories 2-1-5 Ohara Kamifukuoka-shi Saitama 356 Japan inoue@ ABSTRACT This paper describes a method of classifying semantically similar nouns. The approach is based on the distributional hypothesis . Our approach is characterized by distinguishing among senses of the same word in order to resolve the polysemy issue. The classification result demonstrates that our approach is successful. 1. INTRODUCTION Sets of semantically similar words are very useful in natural language processing. The general approach toward classifying words is to use semantic categories for example the thesaurus. The is-a relation is connected between words and categories. However it is not easy to acquire the is-a connection by hand and it becomes expensive. Approaches toward automatically classifying words using existing dictionaries were therefore attempted Chodorow Tsurumaru Nakamura . These approaches are partially successful. However there is a fatal problem in these approaches namely existing dictionaries particularly Japanese dictionaries are not assembled on the basis of semantic hierarchy. On the other hand approaches toward automatically classifying words by using a large-scale corpus have also been attempted Shirai Hindle They seem to be based on the idea that semantically similar words appear in similar environments. This idea is derived from Harris s distributional hypothesis Harris in linguistics. Focusing on nouns the idea claims that each noun is characterized by verbs with which it occurs and also that nouns are similar to the extent that they share verbs. These automatic classification approaches are also partially successful. However Hindle says that there is a number of issues to be confronted. The most important issue is that of polysemy . In Hindle s experiment two senses of table that is to say table under which one can hide and table which can be commuted or
đang nạp các trang xem trước