Scientific report: "Lexical Disambiguation Using Constraint Handling In Prolog (CHIP)"

George C. Demetriou
Centre for Computer Analysis of Language And Speech (CCALAS), Artificial Intelligence Division, School of Computer Studies, University of Leeds, Leeds LS2 9JT, United Kingdom

1 Introduction

Automatic sense disambiguation has been recognised by the research community as very important for a number of natural language processing applications, such as information retrieval, machine translation, and speech recognition. This paper describes experiments with an algorithm for lexical sense disambiguation, that is, predicting which of many possible senses of a word is intended in a given sentence. The definitions of the senses of a given word are those used in LDOCE, the Longman Dictionary of Contemporary English (Procter et al. 1978).

The algorithm first assigns a set of meanings, or senses, drawn from LDOCE to each word in the given sentence. It then chooses the combination of word-senses, one for each word in the sentence, that yields the maximum semantic overlap. The metric of semantic overlap is based on the fact that LDOCE sense definitions are written in terms of the Longman Defining Vocabulary, effectively a large set of semantic primitives. Since the problem of finding the word-sense chain with maximum overlap can be viewed as a specialised example of the class of constraint-based optimisation problems for which Constraint Handling In Prolog (CHIP) was designed, we have chosen to implement our algorithm in CHIP.
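The two steps described above (assign candidate senses, then pick the combination with maximum overlap) can be illustrated with a minimal Python sketch. This is not the paper's CHIP implementation: it uses exhaustive search rather than constraint-based optimisation, the toy sense inventory is invented for illustration and is not taken from LDOCE, and the overlap metric (the number of shared defining words summed over all pairs of chosen senses) is an assumption, since this section does not define the metric precisely.

```python
from itertools import combinations, product

# Hypothetical toy sense inventory: each sense is a set of defining-vocabulary
# words. These entries are invented for illustration, NOT real LDOCE definitions.
SENSES = {
    "bank": {
        "bank_1": {"land", "along", "side", "river", "water"},
        "bank_2": {"place", "money", "keep", "pay"},
    },
    "deposit": {
        "deposit_1": {"put", "money", "place", "keep"},
        "deposit_2": {"leave", "layer", "land", "water"},
    },
    "cash": {
        "cash_1": {"money", "coin", "pay"},
    },
}


def overlap(chain):
    """Assumed metric: shared primitives summed over all pairs of senses."""
    return sum(len(a & b) for a, b in combinations(chain, 2))


def disambiguate(words, senses=SENSES):
    """Return the word-sense chain (one sense per word) with maximum overlap.

    Exhaustive search over every combination of senses; the search space
    grows multiplicatively with the number of senses per word.
    """
    candidates = [sorted(senses[w].items()) for w in words]
    best_score, best_chain = -1, None
    for combo in product(*candidates):
        score = overlap([primitives for _, primitives in combo])
        if score > best_score:
            best_score = score
            best_chain = [name for name, _ in combo]
    return best_chain, best_score


if __name__ == "__main__":
    print(disambiguate(["deposit", "cash", "bank"]))
```

Because the number of combinations is the product of the sense counts of all words in the sentence, brute force quickly becomes impractical for realistic sentences, which is consistent with the paper's motivation for treating the task as a constraint-based optimisation problem.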
2 Background: LDOCE, Word Sense Disambiguation and related work

LDOCE's important feature is that its definitions and examples are written in a controlled vocabulary of 2187 words. A definition is therefore always written in simpler terms than the word it describes. These 2187 words effectively constitute semantic primitives, and any particular word-sense is defined by a set of these primitives. Several researchers have experimented with lexical disambiguation using machine-readable dictionaries (MRDs), including Lesk (1986) and Wilks et al.
