Automating the Acquisition of Bilingual Terminology

Pim van der Eijk
Digital Equipment Corporation
Kabelweg 21, 1014 BA Amsterdam, The Netherlands
eijk@ .

Abstract

As the acquisition problem of bilingual lists of terminological expressions is formidable, it is worthwhile to investigate methods to compile such lists as automatically as possible. In this paper we discuss experimental results for a number of methods which operate on corpora of previously translated texts.

Keywords: parallel corpora, tagging, terminology acquisition.

1 Introduction

In the past several years, many researchers have started looking at bilingual corpora, as they implicitly contain much information needed for various purposes that would otherwise have to be compiled manually. Some applications using information extracted from bilingual corpora are statistical MT (Brown et al., 1990), bilingual lexicography (Catizone et al., 1989), word sense disambiguation (Gale et al., 1992), and multilingual information retrieval (Landauer and Littmann, 1990).

The goal of the research discussed in this paper is to automate, as much as possible, the generation of bilingual term lists from previously translated texts. These lists are used by terminologists and translators, e.g. in documentation departments. Manual compilation of bilingual term lists is an expensive and laborious effort, hence the relative rarity of specialized, up-to-date, and manageable terminological data collections. However, organizations interested in terminology and translation are likely to have archives of previously translated documents, which represent a considerable investment.
Automatic or semi-automatic extraction of the information contained in these documents would therefore be an attractive prospect. A bilingual term list associates each source language term with a ranked list of target language terms. The methods to extract bilingual terminology from parallel texts were developed and evaluated experimentally using a bilingual .
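
The definition of a bilingual term list given in the introduction can be illustrated as a simple data structure. The following is a minimal sketch in Python, not the method of the paper: the function name build_term_list, the example terms, and the scores are all hypothetical, and the sketch assumes that candidate pairs have already been scored by some extraction method.

```python
from collections import defaultdict


def build_term_list(scored_pairs):
    """Group (source, target, score) triples into a bilingual term list.

    Returns a dict that maps each source language term to a list of
    (target term, score) pairs, ranked from highest to lowest score.
    """
    candidates = defaultdict(list)
    for source, target, score in scored_pairs:
        candidates[source].append((target, score))
    return {
        source: sorted(targets, key=lambda pair: pair[1], reverse=True)
        for source, targets in candidates.items()
    }


# Hypothetical scored candidates; terms and values are illustrative only.
example_pairs = [
    ("keyboard", "klavier", 0.4),
    ("keyboard", "toetsenbord", 0.9),
    ("disk drive", "schijfeenheid", 0.7),
]

term_list = build_term_list(example_pairs)
for source_term, ranked_targets in term_list.items():
    print(source_term, ranked_targets)
```

Keeping the score next to each target term lets a terminologist or translator see how strongly each candidate is supported, instead of only its rank.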
