Scientific report: "A Statistical Parser for Czech*"

Michael Collins, AT&T Labs-Research, Shannon Laboratory, 180 Park Avenue, Florham Park, NJ 07932, mcollins@
Jan Hajic, Institute of Formal and Applied Linguistics, Charles University, Prague, Czech Republic, haj
Lance Ramshaw, BBN Technologies, 70 Fawcett St., Cambridge, MA 02138, lramshaw@
Christoph Tillmann, Lehrstuhl für Informatik VI, RWTH Aachen, D-52056 Aachen, Germany, tillmann@

Abstract

This paper considers statistical parsing of Czech, which differs radically from English in at least two respects: (1) it is a highly inflected language, and (2) it has relatively free word order. These differences are likely to pose new problems for techniques that have been developed on English. We describe our experience in building on the parsing model of (Collins 97). Our final results - 80% dependency accuracy - represent good progress towards the 91% accuracy of the parser on English (Wall Street Journal) text.

1 Introduction

Much of the recent research on statistical parsing has focused on English; languages other than English are likely to pose new problems for statistical methods. This paper considers statistical parsing of Czech, using the Prague Dependency Treebank (PDT) (Hajic 1998) as a source of training and test data. The PDT contains around 480,000 words of general news, business news, and science articles.

* This material is based upon work supported by the National Science Foundation under Grant No. IIS-9732388, and was carried out at the 1998 Workshop on Language Engineering, Center for Language and Speech Processing, Johns Hopkins University. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or The Johns Hopkins University. The project has also had support at various levels from the following grants and programs: Grant Agency of the Czech Republic grants No. 405/96/0198 and
