tailieunhanh - Scientific report: "Aligning Features with Sense Distinction Dimensions"
Aligning Features with Sense Distinction Dimensions

1 Nianwen Xue, 2 Jinying Chen, 3 Martha Palmer
1 CSLR and 3 Department of Linguistics, University of Colorado, Boulder, CO 80309 @
2 Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104 jinying@

Abstract

In this paper we present word sense disambiguation (WSD) experiments on ten highly polysemous verbs in Chinese, where significant performance improvements are achieved using rich linguistic features. Our system performs significantly better, and in some cases substantially better, than the baseline on all ten verbs. Our results also demonstrate that features extracted from the output of an automatic Chinese semantic role labeling system in general benefited the WSD system, even though the amount of improvement was not consistent across the verbs. For a few verbs, semantic role information actually hurt WSD performance. The inconsistency of feature performance is a general characteristic of the WSD task, as has been observed by others. We argue that this result can be explained by the fact that word senses are partitioned along different dimensions for different verbs, and the features therefore need to be tailored to particular verbs in order to achieve adequate accuracy on verb sense disambiguation.
1 Introduction

Word sense disambiguation, the determination of the correct sense of a polysemous word from a number of possible senses based on the context in which it occurs, is a continuing obstacle to high-performance natural language processing applications. There are several well-documented factors that make accurate WSD particularly challenging. The first has to do with how senses are defined. The English data used for the SENSEVAL exercises, arguably the most widely used data to train and test WSD systems, are annotated based on very fine-grained distinctions defined in WordNet (Fellbaum, 1998), with human inter-annotator agreement at a .