Scientific report: "Elliphant: Improved Automatic Detection of Zero Subjects and Impersonal Constructions in Spanish"

Elliphant: Improved Automatic Detection of Zero Subjects and Impersonal Constructions in Spanish

Luz Rello, NLP and Web Research Groups, Univ. Pompeu Fabra, Barcelona, Spain
Ricardo Baeza-Yates, Yahoo Research, Barcelona, Spain
Ruslan Mitkov, Research Group in Computational Linguistics, Univ. of Wolverhampton, UK

This work was partially funded by a La Caixa grant for master students.

Abstract

In pro-drop languages, the detection of explicit subjects, zero subjects and non-referential impersonal constructions is crucial for anaphora and co-reference resolution. While the identification of explicit and zero subjects has attracted the attention of researchers in the past, the automatic identification of impersonal constructions in Spanish has not been addressed yet, and this work is the first such study. In this paper we present a corpus to underpin research on the automatic detection of these linguistic phenomena in Spanish and a novel machine learning-based methodology for their computational treatment. This study also provides an analysis of the features, discusses performance across two different genres and offers error analysis. The evaluation results show that our system performs better in detecting explicit subjects than alternative systems.

1 Introduction

Subject ellipsis is the omission of the subject in a sentence. We consider not only missing referential subjects (zero subjects) as a manifestation of ellipsis, but also non-referential impersonal constructions. Various natural language processing (NLP) tasks benefit from the identification of elliptical subjects, primarily anaphora resolution (Mitkov, 2002) and co-reference resolution (Ng and Cardie, 2002). The difficulty in detecting missing subjects and non-referential pronouns has been acknowledged since the first studies on the computational treatment of anaphora (Hobbs, 1977; Hirst, 1981). However, this task is of crucial importance when processing pro-drop languages, since subject ellipsis is a pervasive phenomenon in these languages.
