Scientific report: "Exploring the Sense Distributions of Homographs"
Reinhard Rapp
University of Mainz, FASK
76711 Germersheim, Germany
rrapp@

Abstract

This paper quantitatively investigates to what extent local context is useful for disambiguating the senses of an ambiguous word. This is done by comparing the co-occurrence frequencies of particular context words. First, one context word representing a certain sense is chosen, and then its co-occurrence frequencies with two other context words, one of the same sense and one of another sense, are compared. As expected, it turns out that context words belonging to the same sense have considerably higher co-occurrence frequencies than words belonging to different senses. In our study, the sense inventory is taken from the University of South Florida homograph norms, and the co-occurrence counts are based on the British National Corpus.

1 Introduction

Word sense induction and disambiguation is of importance for many tasks in speech and language processing, such as speech recognition, machine translation, natural language understanding, question answering, and information retrieval. As evidenced by several SENSEVAL sense disambiguation competitions (Kilgarriff & Palmer, 2000), statistical methods are dominant in this field.
However, none of the published algorithms comes close to human performance in word sense disambiguation, and it is therefore unclear to what extent the statistical regularities exploited in these algorithms are a solid basis for eventually solving the problem. Although this is a difficult question, in this study we try to give at least a partial answer. Our starting point is the observation that ambiguous words can usually be disambiguated by their context, and that certain context words can be seen as indicators of certain senses. For example, context words such as "finger" and "arm" are typical of the hand meaning of "palm", whereas "coconut" and "oil" are typical of its tree meaning. The essence behind many algorithms for word sense disambiguation is ...
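
The comparison outlined in the abstract (pick one context word for a sense, then compare its co-occurrence frequency with a same-sense word and with an other-sense word) can be illustrated on a toy scale. The Python sketch below rests on several assumptions that are not from the paper: the four sentences are invented for illustration (the study itself counts over the British National Corpus), the window size is arbitrary, and the helper names `cooccurrence_counts` and `cooc` are hypothetical.

```python
from collections import Counter


def cooccurrence_counts(sentences, window=12):
    """Count unordered word pairs that occur within `window` tokens of each other.

    `sentences` is a list of token lists. The window size is an arbitrary
    choice for this sketch, not a value taken from the paper.
    """
    counts = Counter()
    for tokens in sentences:
        for i, left in enumerate(tokens):
            # Look only to the right so each token pair is counted once.
            for right in tokens[i + 1 : i + 1 + window]:
                if left != right:
                    counts[frozenset((left, right))] += 1
    return counts


def cooc(counts, a, b):
    """Co-occurrence frequency of words a and b (0 if never seen together)."""
    return counts[frozenset((a, b))]


# Toy corpus invented for illustration; NOT data from the study.
TOY_CORPUS = [
    "he held the coin in the palm of his hand between finger and thumb".split(),
    "her arm ached and each finger of her palm was numb".split(),
    "coconut oil is pressed from the fruit of the palm".split(),
    "the palm tree gave coconut and oil to the village".split(),
]

counts = cooccurrence_counts(TOY_CORPUS)

# Anchor word "finger" (hand sense of "palm"), compared against a
# same-sense word ("arm") and an other-sense word ("coconut").
same_sense = cooc(counts, "finger", "arm")
other_sense = cooc(counts, "finger", "coconut")

print("finger/arm:", same_sense)
print("finger/coconut:", other_sense)
```

On this toy input the same-sense pair co-occurs more often than the cross-sense pair, which mirrors the direction of the effect the abstract reports. It says nothing about the size of the effect on real corpus data.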