tailieunhanh - Scientific report: "STRUCTURAL MATCHING OF PARALLEL TEXTS"

STRUCTURAL MATCHING OF PARALLEL TEXTS

Yuji Matsumoto
Graduate School of Information Science, Advanced Institute of Science and Technology, Nara
Takayama-cho, Ikoma-shi, Nara 630-01 Japan
matsu@

Hiroyuki Ishimoto, Takehito Utsuro
Department of Electrical Engineering, Kyoto University
Sakyo-ku, Kyoto 606 Japan
ishimoto utsuro @

Abstract

This paper describes a method for finding structural matching between parallel sentences of two languages (such as Japanese and English). Parallel sentences are analyzed based on unification grammars, and structural matching is performed by making use of a similarity measure of word pairs in the two languages. Syntactic ambiguities are resolved simultaneously in the matching process. The results serve as a useful source for extracting linguistic and lexical knowledge.

INTRODUCTION

Bilingual (or parallel) texts are useful resources for the acquisition of linguistic knowledge as well as for applications such as machine translation. Intensive research has been done on aligning bilingual texts at the sentence level using statistical techniques that measure sentence lengths in words or in characters [Brown 91] [Gale 91a]. Those works are quite successful in that far more than 90% of sentences in bilingual corpora are aligned correctly.
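As a rough illustration of the length-based sentence alignment mentioned above, the following Python sketch aligns two sentence lists by dynamic programming over character lengths. It is not the method of this paper, and it does not reproduce the probabilistic models of the cited works: the cost function, the `BEAD_PENALTY` values, and the function names `bead_cost` and `align` are assumptions made only for this example.

```python
from typing import List, Tuple

# Hypothetical penalties per alignment bead (source count, target count).
# The values are illustrative assumptions, not taken from the cited works.
BEAD_PENALTY = {(1, 1): 0.0, (1, 0): 4.0, (0, 1): 4.0, (2, 1): 1.0, (1, 2): 1.0}


def bead_cost(src_len: int, tgt_len: int, bead: Tuple[int, int]) -> float:
    """Cost of pairing src_len characters with tgt_len characters."""
    total = src_len + tgt_len
    mismatch = abs(src_len - tgt_len) / total if total else 0.0
    return 10.0 * mismatch + BEAD_PENALTY[bead]


def align(src: List[str], tgt: List[str]) -> List[Tuple[List[int], List[int]]]:
    """Return beads as (source indices, target indices) with minimal total cost."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in BEAD_PENALTY:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                c = cost[i][j] + bead_cost(s_len, t_len, (di, dj))
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end to recover the chosen beads.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    beads.reverse()
    return beads
```

Calling `align(source_sentences, target_sentences)` returns a list of index groups, so a long source sentence whose translation was split in two appears as one source index paired with two target indices. Character length is used here because it needs no tokenizer; word counts could be substituted by changing the two `len` sums.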
Although such parallel texts have been shown to be useful in real applications such as machine translation [Brown 90] and word sense disambiguation [Dagan 91], structured bilingual sentences are undoubtedly more informative and important for future natural language research. Structured bilingual or multilingual corpora serve as richer sources for extracting linguistic knowledge [Kaji 92] [Klavans 90] [Sadler 91] [Utsuro 92]. Phrase-level or word-level alignment has also been done by several researchers. The Textual Knowledge Bank Project [Sadler 91] is building monolingual and multilingual text bases structured by linking the elements with grammatical dependency, referential, and bilingual relations. Kaji
