Đang chuẩn bị liên kết để tải về tài liệu:
Báo cáo khoa học: "k-best Spanning Tree Parsing"

Đang chuẩn bị nút TẢI XUỐNG, xin hãy chờ

This paper introduces a Maximum Entropy dependency parser based on an efficient kbest Maximum Spanning Tree (MST) algorithm. Although recent work suggests that the edge-factored constraints of the MST algorithm significantly inhibit parsing accuracy, we show that generating the 50-best parses according to an edge-factored model has an oracle performance well above the 1-best performance of the best dependency parsers. This motivates our parsing approach, which is based on reranking the kbest parses generated by an edge-factored model. . | k-best Spanning Tree Parsing Keith Hall Center for Language and Speech Processing Johns Hopkins University Baltimore MD 21218 keith_hall@jhu.edu Abstract This paper introduces a Maximum Entropy dependency parser based on an efficient k-best Maximum Spanning Tree MST algorithm. Although recent work suggests that the edge-factored constraints of the MST algorithm significantly inhibit parsing accuracy we show that generating the 50-best parses according to an edge-factored model has an oracle performance well above the 1-best performance of the best dependency parsers. This motivates our parsing approach which is based on reranking the k-best parses generated by an edge-factored model. Oracle parse accuracy results are presented for the edge-factored model and 1-best results for the reranker on eight languages seven from CoNLL-X and English . 1 Introduction The Maximum Spanning Tree algorithm1 was recently introduced as a viable solution for non-projective dependency parsing McDonald et al. 2005b . The dependency parsing problem is naturally a spanning tree problem however efficient spanning-tree optimization algorithms assume a cost function which assigns scores independently to edges of the graph. In dependency parsing this effectively constrains the set of models to those which independently generate parent-child pairs 1In this paper we deal only with MSTs on directed graphs. These are often referred to in the graph-theory literature as Maximum Spanning Arborescences. 392 these are known as edge-factored models. These models are limited to relatively simple features which exclude linguistic constructs such as verb sub-categorization valency lexical selectional preferences etc.2 In order to explore a rich set of syntactic features in the MST framework we can either approximate the optimal non-projective solution as in McDonald and Pereira 2006 or we can use the constrained MST model to select a subset of the set of dependency parses to which we then apply .