tailieunhanh - Scientific report: "Target Word Selection as Proximity in Semantic Space"
Target Word Selection as Proximity in Semantic Space

Scott McDonald
Centre for Cognitive Science, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, Scotland
scottm@

Abstract

Lexical selection is a significant problem for wide-coverage machine translation: depending on the context, a given source language word can often be translated into different target language words. In this paper I propose a method for target word selection that assumes the appropriate translation is more similar to the translated context than are the alternatives. Similarity of a word to a context is estimated using a proximity measure in corpus-derived "semantic space". The method is evaluated using an English-Spanish parallel corpus of colloquial dialogue.

1 Introduction

When should Spanish detener translate to English arrest, and when to stop? This paper explores the problem of lexical selection in machine translation (MT): a given source language (SL) word can often be translated into different target language (TL) words, depending on the context.

Translation is difficult because the conceptual mapping between languages is generally not one-to-one; for example, Spanish reloj maps to both watch and clock. A SL word might be translatable by more than one TL option, where the choice is based on stylistic or pragmatic rather than semantic criteria. Alternative TL choices also exist for SL words that are ambiguous from the monolingual point of view; for example, English firm can be translated by Spanish firme, estricto, sólido or compañía.
Semantic Space Models

In this paper I take a statistical approach to lexical selection, under the working assumption that the translated linguistic context can provide sufficient information for choosing the appropriate target. I define the appropriate target as the candidate closest in meaning to the local TL context, where local context refers to a window of words centered on the missing TL item. To estimate the similarity in meaning between a word and the bag of .
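The selection rule described above (choose the candidate closest in meaning to a window of TL words centered on the missing item) can be illustrated with a minimal sketch. It is not the paper's implementation: this excerpt does not specify the proximity measure, so cosine similarity against the summed context vectors is assumed here. The function names, the `<gap>` placeholder and the three-dimensional toy vectors are all hypothetical, invented only for illustration; a real semantic space would be derived from corpus co-occurrence counts.

```python
import math
from collections import Counter

# Toy "semantic space": each word maps to invented co-occurrence weights.
# These numbers are made up for illustration, not derived from any corpus.
TOY_SPACE = {
    "arrest": {"crime": 5.0, "law": 4.0},
    "stop":   {"crime": 1.0, "law": 1.0, "traffic": 5.0},
    "police": {"crime": 4.0, "law": 5.0, "traffic": 2.0},
    "thief":  {"crime": 5.0, "law": 2.0},
    "driver": {"law": 1.0, "traffic": 4.0},
    "bus":    {"traffic": 5.0},
}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(dim, 0.0) for dim, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def context_window(tokens, gap_index, half_width):
    """Words within half_width positions of the gap, excluding the gap."""
    left = tokens[max(0, gap_index - half_width):gap_index]
    right = tokens[gap_index + 1:gap_index + 1 + half_width]
    return left + right

def context_vector(words, space):
    """Sum the vectors of the context words; unknown words are skipped."""
    total = Counter()
    for word in words:
        for dim, weight in space.get(word, {}).items():
            total[dim] += weight
    return dict(total)

def select_target(candidates, tokens, gap_index, space, half_width=3):
    """Return the candidate whose vector is closest to the context vector."""
    ctx = context_vector(context_window(tokens, gap_index, half_width), space)
    return max(candidates, key=lambda c: cosine(space.get(c, {}), ctx))

if __name__ == "__main__":
    candidates = ["arrest", "stop"]  # possible translations of Spanish detener
    s1 = "the police will <gap> the thief".split()
    s2 = "the driver will <gap> the bus".split()
    print(select_target(candidates, s1, 3, TOY_SPACE))  # arrest
    print(select_target(candidates, s2, 3, TOY_SPACE))  # stop
```

Summing the context vectors treats the window as an unordered bag of words, so only which words surround the gap matters and not their order. Words absent from the toy space, such as function words, contribute nothing to the context vector.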