tailieunhanh - Scientific report: "A Corpus-Based Approach to Deriving Lexical Mappings"

Proceedings of EACL '99

A Corpus-Based Approach to Deriving Lexical Mappings

Mark Stevenson
Department of Computer Science, University of Sheffield
Regent Court, 211 Portobello Street, Sheffield S1 4DP, United Kingdom
marks@

Abstract

This paper proposes a novel corpus-based method for producing mappings between lexical resources. Results from a preliminary experiment using part of speech tags suggest this is a promising area for future research.

1 Introduction

Dictionaries are now commonly used resources in NLP systems. However, different lexical resources are not uniform; they contain different types of information and do not assign words the same number of senses. One way in which this problem might be tackled is by producing mappings between the senses of different resources, the "dictionary mapping problem". However, this is a non-trivial problem, as examination of existing lexical resources demonstrates. Lexicographers have been divided between "lumpers", or those who prefer a few general senses, and "splitters", who create a larger number of more specific senses, so there is no guarantee that a word will have the same number of senses in different resources.

Previous attempts to create lexical mappings have concentrated on aligning the senses in pairs of lexical resources and have based the mapping decision on information in the entries. For example, Knight and Luk (1994) merged WordNet and LDOCE using information in the hierarchies and textual definitions of each resource.

Thus far we have mentioned only mappings between dictionary senses. However, it is possible to create mappings between any pair of linguistic annotation tag-sets, for example part of speech tags. We dub the more general class "lexical mappings": mappings between two sets of lexical annotations. One example, which we shall consider further, is that of mappings between part of speech tag sets.

This paper shall propose a method for producing lexical mappings based on corpus evidence. It is based on the .
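The excerpt breaks off before the method itself is described, so the following is only a minimal sketch of what a corpus-based mapping between two part of speech tag sets could look like, not the author's procedure. It assumes a corpus in which every token carries a tag from each of two schemes, and maps each tag in the first scheme to the tag in the second scheme it co-occurs with most often. The function name `derive_mapping`, the tag labels, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def derive_mapping(tags_a, tags_b):
    """Map each scheme-A tag to its most frequent co-occurring scheme-B tag.

    tags_a and tags_b are parallel lists: position i holds the two tags
    assigned to the same token. Returns {tag_a: (tag_b, relative_frequency)}.
    This is an illustrative assumption, not the method from the paper.
    """
    if len(tags_a) != len(tags_b):
        raise ValueError("both annotations must cover the same tokens")

    # Count how often each pair of tags is assigned to the same token.
    cooccurrence = defaultdict(Counter)
    for a, b in zip(tags_a, tags_b):
        cooccurrence[a][b] += 1

    mapping = {}
    for a, counts in cooccurrence.items():
        best_tag, best_count = counts.most_common(1)[0]
        # Relative frequency shows how strongly the corpus supports the mapping.
        mapping[a] = (best_tag, best_count / sum(counts.values()))
    return mapping


# Toy, made-up annotations of six tokens under two hypothetical tag sets.
tags_a = ["NN", "NN", "VB", "NN", "JJ", "VB"]
tags_b = ["noun", "noun", "verb", "adj", "adj", "verb"]

mapping = derive_mapping(tags_a, tags_b)
for tag, (target, support) in sorted(mapping.items()):
    print(f"{tag} -> {target} (support {support:.2f})")
```

Keeping the relative frequency alongside each mapped tag makes weakly supported mappings visible, which matters when one scheme splits a category that the other lumps together.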
