Scientific report: "Identification of Domain-Specific Senses in a Machine-Readable Dictionary"

Fumiyo Fukumoto
Interdisciplinary Graduate School of Medicine and Engineering, Univ. of Yamanashi
fukumoto@

Yoshimi Suzuki
Interdisciplinary Graduate School of Medicine and Engineering, Univ. of Yamanashi
ysuzuki@

Abstract

This paper focuses on domain-specific senses and presents a method for assigning a category/domain label to each sense of the words in a dictionary. The method first assigns each sense of a word in the dictionary to its corresponding category, using a text classification technique to select the appropriate senses for each domain. The senses are then scored by computing rank scores with a Markov Random Walk (MRW) model. The method was tested on English and Japanese resources: WordNet and the EDR Japanese dictionary. To evaluate the method, we compared the English results with the Subject Field Codes (SFC) resources. We also compared the English and Japanese results with the first-sense heuristic in the WSD task. These results suggest that identification of domain-specific senses (IDSS) may actually be of benefit.

1 Introduction

The domain-specific sense of a word is crucial information for many NLP tasks and their applications, such as Word Sense Disambiguation (WSD) and Information Retrieval (IR). For example, in the WSD task, McCarthy et al. presented a method to find predominant noun senses automatically using a thesaurus acquired from raw textual corpora and the WordNet similarity package (McCarthy et al., 2004; McCarthy et al., 2007). They used parsed data to find words with a distribution similar to that of the target word. Unlike the approach of Buitelaar and Sacaleanu (2001), they evaluated their method using publicly available resources, namely SemCor (Miller et al., 1998) and the SENSEVAL-2 English all-words task. The major motivation for their work was similar to ours, i.e., to try to capture changes in ranking of senses for documents from different .
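
The abstract names a Markov Random Walk (MRW) model for scoring senses, but this excerpt does not give its formulation. As a rough sketch only, the ranking step might look like the Python below. It assumes a PageRank-style walk over a graph whose nodes are sense glosses and whose edge weights are bag-of-words cosine similarities; both choices are assumptions for illustration, not details taken from the paper. The function names, the damping value of 0.85, and the toy glosses are likewise hypothetical.

```python
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mrw_scores(glosses, damping=0.85, tol=1e-8, max_iter=200):
    """Rank sense glosses with a PageRank-style random walk (assumed form).

    glosses: dict mapping a sense id to its gloss text (hypothetical input).
    Returns a dict mapping each sense id to a score; the scores sum to 1.
    """
    ids = list(glosses)
    n = len(ids)
    if n == 0:
        return {}
    vecs = [Counter(glosses[i].lower().split()) for i in ids]
    # Edge weights: gloss-to-gloss cosine similarity, with no self-loops.
    w = [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    # Row-normalise into transition probabilities.
    # A sense with no similar neighbour jumps uniformly to any node.
    p = []
    for row in w:
        s = sum(row)
        p.append([x / s for x in row] if s > 0 else [1.0 / n] * n)
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new = [(1 - damping) / n
               + damping * sum(scores[i] * p[i][j] for i in range(n))
               for j in range(n)]
        delta = sum(abs(x - y) for x, y in zip(new, scores))
        scores = new
        if delta < tol:  # stop once the scores have stabilised
            break
    return dict(zip(ids, scores))


# Toy glosses invented for illustration; they are not dictionary entries.
toy = {
    "s1": "judge hears legal case",
    "s2": "legal case before judge and jury",
    "s3": "tennis match played outdoors",
}
ranked = sorted(mrw_scores(toy).items(), key=lambda kv: kv[1], reverse=True)
```

In this toy graph the two glosses that share vocabulary reinforce each other, so the unrelated third gloss ends up with the lowest score. A real system would build the graph only over the senses already selected for a domain by the text classification step.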
