Scientific report: "A Word-to-Word Model of Translational Equivalence"
I. Dan Melamed, Dept. of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104

Abstract

Many multilingual NLP applications need to translate words between different languages, but cannot afford the computational expense of inducing or applying a full translation model. For these applications, we have designed a fast algorithm for estimating a partial translation model, which accounts for translational equivalence only at the word level. The model's precision/recall trade-off can be directly controlled via one threshold parameter. This feature makes the model more suitable for applications that are not fully statistical. The model's hidden parameters can be easily conditioned on information extrinsic to the model, providing an easy way to integrate pre-existing knowledge such as part-of-speech, dictionaries, word order, etc. Our model can link word tokens in parallel texts as well as other translation models in the literature. Unlike other translation models, it can automatically produce dictionary-sized translation lexicons, and it can do so with over 99% accuracy.
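The single-threshold control over the precision/recall trade-off mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon, its scores, and the helper names `filter_lexicon` and `translate` are hypothetical and do not come from the paper, whose actual scoring and estimation procedure is not given in this excerpt.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: (source word, target word) -> confidence score.
# Entries and scores are invented for illustration only.
LEXICON: Dict[Tuple[str, str], float] = {
    ("chat", "cat"): 0.95,
    ("chat", "chat"): 0.30,
    ("maison", "house"): 0.90,
    ("maison", "home"): 0.55,
}


def filter_lexicon(
    lexicon: Dict[Tuple[str, str], float], threshold: float
) -> Dict[Tuple[str, str], float]:
    """Keep only entries whose score is at least `threshold`.

    A higher threshold keeps fewer, more reliable entries (favoring
    precision); a lower one keeps more entries (favoring recall).
    """
    return {pair: score for pair, score in lexicon.items() if score >= threshold}


def translate(word: str, lexicon: Dict[Tuple[str, str], float]) -> List[str]:
    """Look up `word` and return its candidate translations, best first."""
    candidates = [(score, tgt) for (src, tgt), score in lexicon.items() if src == word]
    return [tgt for _, tgt in sorted(candidates, reverse=True)]


if __name__ == "__main__":
    strict = filter_lexicon(LEXICON, 0.8)   # fewer entries, higher confidence
    lenient = filter_lexicon(LEXICON, 0.5)  # more entries, lower confidence
    print(translate("maison", strict))
    print(translate("maison", lenient))
```

Raising the threshold shrinks the lexicon to its most confident entries, and lowering it admits more candidate translations, so one number is enough to tune the filtered lexicon for a given application.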
1 Introduction

Over the past decade, researchers at IBM have developed a series of increasingly sophisticated statistical models for machine translation (Brown et al., 1988; Brown et al., 1990; Brown et al., 1993a). However, the IBM models, which attempt to capture a broad range of translation phenomena, are computationally expensive to apply. Table look-up using an explicit translation lexicon is sufficient and preferable for many multilingual NLP applications, including "crummy" MT on the World Wide Web (Church & Hovy, 1993), certain machine-assisted translation tools (Macklovitch, 1994; Melamed, 1996b), concordancing for bilingual lexicography (Catizone et al., 1993; Gale & Church, 1991), computer-assisted language learning, corpus linguistics (Melby, 1981), and cross-lingual information retrieval (Oard & Dorr, 1996). In this paper we