Scientific report: "Machine Learning for Coreference Resolution: From Local Classification to Global Ranking"
Vincent Ng
Human Language Technology Research Institute
University of Texas at Dallas
Richardson, TX 75083-0688
vince@hlt.utdallas.edu

Abstract

In this paper, we view coreference resolution as a problem of ranking candidate partitions generated by different coreference systems. We propose a set of partition-based features to learn a ranking model for distinguishing good and bad partitions. Our approach compares favorably to two state-of-the-art coreference systems when evaluated on three standard coreference data sets.

1 Introduction

Recent research in coreference resolution, the problem of determining which noun phrases (NPs) in a text or dialogue refer to which real-world entity, has exhibited a shift from knowledge-based approaches to data-driven approaches, yielding learning-based coreference systems that rival their hand-crafted counterparts in performance (e.g., Soon et al. (2001), Ng and Cardie (2002b), Strube et al. (2002), Yang et al. (2003), Luo et al. (2004)). The central idea behind the majority of these learning-based approaches is to recast coreference resolution as a binary classification task. Specifically, a classifier is first trained to determine whether two NPs in a document are co-referring or not. A separate clustering mechanism then coordinates the possibly contradictory pairwise coreference classification decisions and constructs a partition on the given set of NPs, with one cluster for each set of coreferent NPs. Though reasonably successful, this standard approach is not as robust as one may think.
First, design decisions such as the choice of the learning algorithm and the clustering procedure are apparently critical to system performance, but are often made in an ad-hoc and unprincipled manner that may be suboptimal from an empirical point of view. Second, this approach makes no attempt to search through the space of possible partitions when given a set of NPs to be clustered.
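
To make the standard two-step pipeline concrete, the following is a minimal sketch of how pairwise coreference decisions can be turned into a partition. It is an illustration under stated assumptions, not the method of any system cited above: the clustering step here is plain transitive closure (implemented with union-find), and `toy_classifier`, `POSITIVE_PAIRS`, and the example mentions are hypothetical stand-ins for a trained classifier and real data. Note that the procedure commits to a single partition and never compares alternatives, which is the second limitation described above.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence


def cluster_mentions(
    mentions: Sequence[str],
    is_coreferent: Callable[[str, str], bool],
) -> List[List[str]]:
    """Build a partition of `mentions` from pairwise decisions.

    Clustering is done by transitive closure: if the classifier links
    A-B and B-C, then A, B and C end up in one cluster, even if the
    classifier said "no" for the pair A-C.
    """
    # Union-find forest: every mention starts as its own cluster root.
    parent: Dict[int, int] = {i: i for i in range(len(mentions))}

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Step 1: ask the (hypothetical) classifier about every pair of NPs.
    # Step 2: merge the clusters of every pair classified as coreferent.
    for i, j in combinations(range(len(mentions)), 2):
        if is_coreferent(mentions[i], mentions[j]):
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                # Keep the earlier mention as root so output order is stable.
                parent[max(root_i, root_j)] = min(root_i, root_j)

    # Collect mentions by root to obtain one cluster per entity.
    clusters: Dict[int, List[str]] = {}
    for i, mention in enumerate(mentions):
        clusters.setdefault(find(i), []).append(mention)
    return [clusters[root] for root in sorted(clusters)]


# Hypothetical stand-in for a trained pairwise classifier. Its decisions
# are contradictory on purpose: "Barack Obama"-"he" is NOT linked
# directly, although both are linked to "the president".
POSITIVE_PAIRS = {
    ("Barack Obama", "the president"),
    ("the president", "he"),
}


def toy_classifier(a: str, b: str) -> bool:
    return (a, b) in POSITIVE_PAIRS or (b, a) in POSITIVE_PAIRS


mentions = ["Barack Obama", "the president", "he", "Dallas"]
partition = cluster_mentions(mentions, toy_classifier)
print(partition)
```

Swapping in a different clustering rule (for example, linking each NP only to its closest or its highest-scoring antecedent) would generally yield a different partition from the same pairwise decisions, which illustrates why the choice of clustering procedure matters for system performance.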