Scientific report: "A Comparison of Chinese Parsers for Stanford Dependencies"

Wanxiang Che†, Valentin I. Spitkovsky‡, Ting Liu†
car@ vals@ tliu@
† School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China, 150001
‡ Computer Science Department, Stanford University, Stanford, CA, 94305

Abstract

Stanford dependencies are widely used in natural language processing as a semantically-oriented representation, commonly generated either by (i) converting the output of a constituent parser, or (ii) predicting dependencies directly. Previous comparisons of the two approaches for English suggest that starting from constituents yields higher accuracies. In this paper, we re-evaluate both methods for Chinese, using more accurate dependency parsers than in previous work. Our comparison of performance and efficiency across seven popular open-source parsers (four constituent and three dependency) shows, by contrast, that recent higher-order graph-based techniques can be more accurate, though somewhat slower, than constituent parsers. We also demonstrate that n-way jackknifing is a useful technique for producing automatic (rather than gold) part-of-speech tags to train Chinese dependency parsers. Finally, we analyze the relations produced by both kinds of parsing and suggest which specific parsers to use in practice.

1 Introduction

Stanford dependencies (de Marneffe and Manning, 2008) provide a simple description of relations between pairs of words in a sentence. This semantically-oriented representation is intuitive and easy to apply, requiring little linguistic expertise.

Consequently, Stanford dependencies are widely used: in biomedical text mining (Kim et al., 2009), as well as in textual entailment (Androutsopoulos and Malakasiotis, 2010), information extraction (Wu and Weld, 2010; Banko et al., 2007) and sentiment analysis (Meena and Prabhakar, 2007). In addition to English, there is a Chinese version of Stanford dependencies (Chang et al., 2009).

[Figure: (a) A constituent parse tree.]
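The abstract mentions n-way jackknifing as a way to obtain automatic part-of-speech tags for parser training data. The excerpt does not spell out the procedure, so the following is only a minimal sketch of the general idea: split the training sentences into n folds, tag each fold with a tagger trained on the other n-1 folds, and concatenate the results. The names `train_tagger` and `jackknife`, the round-robin fold assignment, and the toy most-frequent-tag tagger are all illustrative assumptions, not the authors' setup.

```python
from collections import Counter, defaultdict


def train_tagger(sentences):
    """Toy stand-in tagger (an assumption): most frequent tag per word,
    falling back to the most frequent tag overall for unseen words."""
    per_word = defaultdict(Counter)
    overall = Counter()
    for sent in sentences:
        for word, tag in sent:
            per_word[word][tag] += 1
            overall[tag] += 1
    default = overall.most_common(1)[0][0]
    lexicon = {w: c.most_common(1)[0][0] for w, c in per_word.items()}
    return lambda words: [lexicon.get(w, default) for w in words]


def jackknife(sentences, n):
    """Re-tag every sentence with a tagger that never saw it in training.

    sentences: list of [(word, gold_tag), ...]; n: number of folds (n >= 2).
    Returns a list of [(word, automatic_tag), ...] in the original order.
    """
    if n < 2:
        raise ValueError("n-way jackknifing needs at least two folds")
    auto = [None] * len(sentences)
    for k in range(n):
        # Round-robin fold assignment (hypothetical choice): sentence i is in fold i % n.
        train = [s for i, s in enumerate(sentences) if i % n != k]
        tagger = train_tagger(train)
        for i in range(k, len(sentences), n):
            words = [w for w, _ in sentences[i]]
            auto[i] = list(zip(words, tagger(words)))
    return auto
```

A parser trained on the output of `jackknife` sees tags with roughly the same kind of errors it will meet at test time, which is the motivation for preferring automatic over gold tags. In practice the toy tagger would be replaced by a real part-of-speech tagger.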
