Scientific report: "Robust Word Sense Translation by EM Learning of Frame Semantics"

Pascale Fung and Benfeng Chen
Human Language Technology Center, Department of Electrical & Electronic Engineering, University of Science & Technology (HKUST), Clear Water Bay, Hong Kong
pascale bfchen @

Abstract

We propose a robust method of automatically constructing a bilingual word sense dictionary from readily available monolingual ontologies by using expectation-maximization (EM), without any annotated training data or manual tuning. We demonstrate our method on the English FrameNet and Chinese HowNet structures. Owing to the robustness of EM iterations in improving translation likelihoods, our word sense translation accuracies are very high, at 82% on average, for the 11 most ambiguous words in the English FrameNet with 5 senses or more. We also carried out a pilot study on using this automatically generated bilingual word sense dictionary to choose the best translation candidates, and show the first significant evidence that frame semantics are useful for translation disambiguation. Translation disambiguation accuracy using frame semantics is 75%, compared to 15% by using dictionary glossing only. These results demonstrate the great potential for future application of bilingual frame semantics to machine translation tasks.

1 Introduction

As early as the 1950s, semantic nets were invented as an interlingua for machine translation. The semantic net, or semantic map, that humans possess in the cognitive process is a structure of concept classes and lexicon (Illes and Francis 1999). In addition, the frame-semantic representation of predicate-argument relations has gained much attention in the research community. The Berkeley FrameNet (Baker et al. 1998) is one such example. We suggest that, in addition to dictionaries, a bilingual frame-semantic word sense dictionary is a useful resource for lexical selection in the translation process of a statistical machine translation system. Manual inspection of the contrastive error
