Scientific report: "Idiomatic object usage and support verbs"

Pasi Tapanainen, Jussi Piitulainen and Timo Järvinen
Research Unit for Multilingual Language Technology
Box 4, FIN-00014 University of Helsinki, Finland

1 Introduction

Every language contains complex expressions that are language-specific. The general problem when building automated translation systems or human-readable dictionaries is first to detect expressions that can be used idiomatically, and then to decide whether such an expression is used idiomatically in a particular text or whether a literal translation would be preferred. It follows from the definition of an idiomatic expression that when a complex expression is used idiomatically, it contains at least one element that is semantically "out of context".

In this paper we discuss a method that finds idiomatic collocations in a text corpus. The method detects semantic asymmetry by taking advantage of differences in syntactic distributions. We demonstrate the method on a specific linguistic phenomenon, verb-object collocations. The asymmetry between a verb and its object is the focus of our work, and it distinguishes the approach from methods that use, e.g., mutual information, which is a symmetric measure. Our novel approach differs from mutual information and the so-called t-value measures that have been widely used for similar tasks, e.g., Church et al. (1994) and Breidt (1993) for German.
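The contrast between a symmetric and an asymmetric measure can be made concrete with a small sketch. The counts below are hypothetical toy data, and the conditional probabilities serve only to illustrate what asymmetry between a verb and its object can look like; they are not the measure proposed in this paper.

```python
import math
from collections import Counter

# Hypothetical toy verb-object pair counts (illustration only, not from the paper).
pairs = Counter({
    ("take", "toll"): 8,
    ("take", "book"): 40,
    ("take", "car"): 50,
    ("pay", "toll"): 2,
    ("read", "book"): 60,
})

total = sum(pairs.values())

# Marginal counts for verbs and for objects.
verb_counts = Counter()
object_counts = Counter()
for (verb, obj), n in pairs.items():
    verb_counts[verb] += n
    object_counts[obj] += n


def pmi(x_count, y_count, joint_count):
    """Pointwise mutual information (in bits) computed from raw counts."""
    return math.log2(joint_count * total / (x_count * y_count))


joint = pairs[("take", "toll")]

# Mutual information gives the same value whichever element comes first.
pmi_verb_object = pmi(verb_counts["take"], object_counts["toll"], joint)
pmi_object_verb = pmi(object_counts["toll"], verb_counts["take"], joint)

# Conditional probabilities depend on the direction of the relation.
p_verb_given_object = joint / object_counts["toll"]  # how strongly "toll" selects "take"
p_object_given_verb = joint / verb_counts["take"]    # how strongly "take" selects "toll"

print(pmi_verb_object, pmi_object_verb)
print(p_verb_given_object, p_object_given_verb)
```

In this toy data the object "toll" occurs mostly with "take", while "take" occurs with many other objects, so the two conditional probabilities differ although the mutual information score cannot tell the two directions apart.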
The tasks to which mutual information can be applied are very different in nature, as we show in the short comparison at the end of this paper. The work reported in Grefenstette and Teufel (1995) on finding empty support verbs used in nominalisations is also related to the present work.

2 Semantic asymmetry

The linguistic hypothesis that syntactic relations such as subject-verb and object-verb relations are semantically ...