Scientific report: "Practical Glossing by Prioritised Tiling"

Victor Poznanski, Pete Whitelock, Jan IJdens, Steffan Corley
Sharp Laboratories of Europe Ltd.
Oxford Science Park, Oxford OX4 4GA, United Kingdom
vp pete jan steffan @

Abstract

We present the design of a practical context-sensitive glosser, incorporating current techniques for lightweight linguistic analysis based on large-scale lexical resources. We outline a general model for ranking the possible translations of the words and expressions that make up a text. This information can be used by a simple resource-bounded algorithm, of complexity O(n log n) in sentence length, that determines a consistent gloss of best translations. We then describe how the results of the general ranking model may be approximated using a simple heuristic prioritisation scheme. Finally, we present a preliminary evaluation of the glosser's performance.

1 Introduction

In a lexicalist MT framework such as Shake-and-Bake (Whitelock, 1994), translation equivalence is defined between collections of suitably constrained lexical material in the two languages. Such an approach has been shown to be effective in the description of many types of complex bilingual equivalence. However, the complexity of the associated parsing and generation phases leaves a system of this type some way from commercial exploitation.
The parsing phase that is needed to establish adequate constraints on the words is of cubic complexity, while the most general generation algorithm, needed to order the words in the target text, is O(n^4) (Poznanski et al., 1996). In this paper we show how a novel application domain, glossing, can be explored within such a framework by omitting generation entirely and replacing syntactic parsing by a simple combination of morphological analysis and tagging. The poverty of constraints established in this way, and the consequent inaccuracy in translation, is mitigated by providing a menu of alternatives for each gloss. The gloss is automatically updated in the light
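The idea of choosing a consistent gloss from prioritised candidates can be illustrated with a minimal sketch. This is not the authors' algorithm, which the text above does not specify. It assumes that every candidate translation (a "tile") covers a contiguous span of tokens and carries a numeric priority, and it greedily accepts tiles in priority order while rejecting any tile that overlaps one already accepted. The names `Tile` and `tile_sentence`, the scores, and the placeholder glosses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """A hypothetical candidate gloss covering tokens [start, end)."""
    start: int        # index of the first token covered (inclusive)
    end: int          # index one past the last token covered (exclusive)
    gloss: str        # candidate translation (placeholder text here)
    priority: float   # assumed score; higher means preferred


def tile_sentence(n_tokens, candidates):
    """Greedily pick non-overlapping tiles, highest priority first.

    Ties are broken in favour of longer spans, then leftmost position.
    Returns the accepted tiles in left-to-right order.
    """
    covered = [False] * n_tokens
    chosen = []
    ordered = sorted(
        candidates,
        key=lambda t: (-t.priority, -(t.end - t.start), t.start),
    )
    for tile in ordered:
        # Accept the tile only if none of its tokens is already glossed.
        if not any(covered[tile.start:tile.end]):
            for i in range(tile.start, tile.end):
                covered[i] = True
            chosen.append(tile)
    return sorted(chosen, key=lambda t: t.start)


# Tokens: 0 "he", 1 "kicked", 2 "the", 3 "bucket"
candidates = [
    Tile(0, 1, "<he>", 1.0),
    Tile(1, 2, "<kick>", 1.0),
    Tile(2, 3, "<the>", 0.5),
    Tile(3, 4, "<bucket>", 1.0),
    Tile(1, 4, "<die>", 3.0),   # multi-word expression, scored highest
]
result = tile_sentence(4, candidates)
print([t.gloss for t in result])
```

In this sketch the sort dominates the cost. If each token contributes only a bounded number of bounded-length candidates, the total work is O(n log n) in sentence length, which is consistent with the bound quoted in the abstract, although the source does not confirm that the glosser works this way.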
