Scientific report: "Improving Machine Learning Approaches to Coreference Resolution"

Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 104-111.

Improving Machine Learning Approaches to Coreference Resolution

Vincent Ng and Claire Cardie
Department of Computer Science, Cornell University, Ithaca, NY 14853-7501
yung cardie @

Abstract

We present a noun phrase coreference system that extends the work of Soon et al. (2001) and, to our knowledge, produces the best results to date on the MUC-6 and MUC-7 coreference resolution data sets: F-measures of and , respectively. Improvements arise from two sources: extra-linguistic changes to the learning framework and a large-scale expansion of the feature set to include more sophisticated linguistic knowledge.

1 Introduction

Noun phrase coreference resolution refers to the problem of determining which noun phrases (NPs) refer to each real-world entity mentioned in a document. Machine learning approaches to this problem have been reasonably successful, operating primarily by recasting the problem as a classification task (Aone and Bennett, 1995; McCarthy and Lehnert, 1995). Specifically, a pair of NPs is classified as co-referring or not based on constraints that are learned from an annotated corpus. A separate clustering mechanism then coordinates the possibly contradictory pairwise classifications and constructs a partition on the set of NPs. Soon et al. (2001), for example, apply an NP coreference system based on decision tree induction to two standard coreference resolution data sets (MUC-6, 1995; MUC-7, 1998), achieving performance comparable to the best-performing knowledge-based coreference engines. Perhaps surprisingly, this was accomplished in a decidedly knowledge-lean manner: the learning algorithm has access to just 12 surface-level features.

This paper presents an NP coreference system that investigates two types of extensions to the Soon et al. corpus-based approach. First, we propose and evaluate three extra-linguistic modifications
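
The two-stage pipeline described in the introduction (pairwise classification, then clustering into a partition) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: `toy_pair_classifier` is a hypothetical stand-in for a learned classifier (it only compares the last word of each NP), and `cluster_closest_first` links each NP to the nearest preceding NP judged coreferent, which is one simple clustering choice and not necessarily the mechanism evaluated in the paper.

```python
from typing import Callable, Dict, List


def toy_pair_classifier(antecedent: str, anaphor: str) -> bool:
    """Hypothetical stand-in for a learned pairwise classifier.

    Assumption: two NPs corefer when their last words match (case-insensitive).
    A real system would use a model trained on an annotated corpus.
    """
    return antecedent.lower().split()[-1] == anaphor.lower().split()[-1]


def cluster_closest_first(
    nps: List[str], is_coreferent: Callable[[str, str], bool]
) -> List[List[int]]:
    """Turn pairwise decisions into a partition of NP indices.

    Each NP is linked to the closest preceding NP that the classifier
    accepts; union-find then merges links into disjoint clusters.
    """
    parent = list(range(len(nps)))

    def find(i: int) -> int:
        # Follow parent pointers to the cluster representative,
        # shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for j in range(1, len(nps)):
        # Scan candidate antecedents from right to left (closest first).
        for i in range(j - 1, -1, -1):
            if is_coreferent(nps[i], nps[j]):
                parent[find(j)] = find(i)
                break  # stop at the first accepted antecedent

    clusters: Dict[int, List[int]] = {}
    for idx in range(len(nps)):
        clusters.setdefault(find(idx), []).append(idx)
    return sorted(clusters.values())


# Invented example NPs, listed in document order.
nps = ["Vincent Ng", "the system", "Ng", "a new system", "Cornell University"]
partition = cluster_closest_first(nps, toy_pair_classifier)
print(partition)  # [[0, 2], [1, 3], [4]]
```

Because the clusters come from union-find, every NP lands in exactly one cluster, so the result is a partition even when individual pairwise decisions disagree with one another.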
