Scientific report: "Memory-Based Learning of Morphology with Stochastic Transducers"


Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 513-520.

Memory-Based Learning of Morphology with Stochastic Transducers

Alexander Clark
ISSCO / TIM, University of Geneva
UNI-MAIL, Boulevard du Pont-d'Arve, CH-1211 Genève 4, Switzerland
Alex.Clark@issco.unige.ch

Abstract

This paper discusses the supervised learning of morphology using stochastic transducers, trained using the Expectation-Maximization (EM) algorithm. Two approaches are presented: first, using the transducers directly to model the process, and secondly using them to define a similarity measure, related to the Fisher kernel method (Jaakkola and Haussler, 1998), and then using a Memory-Based Learning (MBL) technique. These are evaluated and compared on data sets from English, German, Slovene and Arabic.

1 Introduction

Finite-state methods are in large part adequate to model morphological processes in many languages. A standard methodology is that of two-level morphology (Koskenniemi, 1983), which is capable of handling the complexity of Finnish, though it needs substantial extensions to handle non-concatenative languages such as Arabic (Kiraz, 1994). These models are primarily concerned with the mapping from deep lexical strings to surface strings, and within this framework learning is in general difficult (Itai, 1994). In this paper I present algorithms for learning the finite-state transduction between pairs of uninflected and inflected words, that is, supervised learning of morphology. The techniques presented here are, however, applicable to learning other types of string transductions.

Memory-based techniques, based on principles of non-parametric density estimation, are a powerful form of machine learning well-suited to natural language tasks. A particular strength is their ability to model both general rules and specific exceptions in a single framework (van den Bosch and Daelemans, 1999). However, they have generally only been used in supervised learning techniques.
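
The claim that a memory-based learner can hold general rules and specific exceptions in one framework can be illustrated with a minimal sketch. This is not the method of the paper: the similarity used here is simply the length of the common suffix, an assumption standing in for the transducer-derived similarity measure mentioned in the abstract. The names `MemoryBasedInflector`, `suffix_rule`, `common_suffix_len` and the toy word pairs are all hypothetical.

```python
from typing import Iterable, List, Optional, Tuple


def common_suffix_len(a: str, b: str) -> int:
    """Length of the longest common suffix of a and b (toy similarity, an assumption)."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def suffix_rule(lemma: str, inflected: str) -> Tuple[str, str]:
    """Return (strip, add): drop `strip` from the end of the lemma, then append `add`."""
    p = 0
    while p < min(len(lemma), len(inflected)) and lemma[p] == inflected[p]:
        p += 1
    return lemma[p:], inflected[p:]


class MemoryBasedInflector:
    """Stores every training pair and inflects new words by analogy to the nearest one."""

    def __init__(self, pairs: Iterable[Tuple[str, str]]) -> None:
        self.memory: List[Tuple[str, str]] = list(pairs)

    def inflect(self, lemma: str) -> str:
        # A stored word is answered from memory: this is how exceptions survive.
        for stored, inflected in self.memory:
            if stored == lemma:
                return inflected
        # Otherwise reuse the suffix rule of the most similar stored word
        # whose rule can actually be applied to the new lemma.
        best: Optional[Tuple[str, str]] = None
        best_score = -1
        for stored, inflected in self.memory:
            strip, add = suffix_rule(stored, inflected)
            if not lemma.endswith(strip):
                continue
            score = common_suffix_len(lemma, stored)
            if score > best_score:
                best, best_score = (strip, add), score
        if best is None:
            return lemma  # nothing applicable in memory: leave the word unchanged
        strip, add = best
        return lemma[: len(lemma) - len(strip)] + add


# Toy English past-tense data (illustrative only, not from the paper's data sets).
TOY_PAIRS = [("walk", "walked"), ("love", "loved"), ("go", "went"), ("try", "tried")]
```

With `model = MemoryBasedInflector(TOY_PAIRS)`, a call such as `model.inflect("go")` is answered directly from the stored exception, while `model.inflect("talk")` borrows the rule of its nearest neighbour `walk`. Replacing `common_suffix_len` with a learned similarity is the step that the second approach in the abstract addresses.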
