Scientific report: "Coordination Structure Analysis using Dual Decomposition"

Coordination disambiguation remains a difficult sub-problem in parsing despite the frequency and importance of coordination structures. We propose a method for disambiguating coordination structures. In this method, dual decomposition is used as a framework to take advantage of both HPSG parsing and coordinate structure analysis with alignment-based local features. We evaluate the performance of the proposed method on the Genia corpus and the Wall Street Journal portion of the Penn Treebank. The results show that it increases the percentage of sentences in which coordination structures are detected correctly, compared with each of the two algorithms alone.

Atsushi Hanamoto (1), Takuya Matsuzaki (1), Jun'ichi Tsujii (2)
1. Department of Computer Science, University of Tokyo, Japan
2. Web Search Mining Group, Microsoft Research Asia, China
hanamoto matuzaki @ jtsujii@

1 Introduction

Coordination structures often cause syntactic ambiguity in natural language. Although a wrong analysis of a coordination structure often leads to a totally garbled parsing result, coordination disambiguation remains a difficult sub-problem in parsing, even for state-of-the-art parsers. One approach to solving this problem is a grammatical approach.
This approach, however, often fails on noun and adjective coordinations, because these coordinations admit many structures that are all grammatically correct. For example, a noun sequence of the form "n0 n1 and n2 n3" has as many as five possible structures (Resnik, 1999). Therefore, a grammatical approach alone is not sufficient to disambiguate coordination structures. In fact, the Stanford parser (Klein and Manning, 2003) and Enju (Miyao and Tsujii, 2004) both fail to disambiguate the sentence "I am a freshman advertising and marketing major." Table 1 shows their output and the correct coordination structure. The coordination structure above is obvious to humans because there is a .
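
The abstract names dual decomposition as the framework that lets two analyzers agree on a coordination structure. The following is a minimal sketch of that general idea in Python, not the authors' implementation. It assumes each sub-model simply assigns one score to each of a small set of candidate analyses; the function names and the toy scores are hypothetical and chosen only for illustration. Each sub-model decodes on its own, and Lagrange multipliers are adjusted by subgradient steps until both pick the same candidate.

```python
from typing import List, Optional


def argmax(scores: List[float]) -> int:
    """Index of the highest score (the first index wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def dual_decompose(scores_a: List[float], scores_b: List[float],
                   max_iters: int = 50) -> Optional[int]:
    """Return a candidate index both sub-models agree on, or None.

    Hypothetical toy setting: each model scores whole candidates, and the
    goal is the candidate maximizing scores_a[i] + scores_b[i].
    """
    u = [0.0] * len(scores_a)  # one Lagrange multiplier per candidate
    for t in range(max_iters):
        # Each sub-model decodes independently under its adjusted scores.
        y = argmax([a + ui for a, ui in zip(scores_a, u)])
        z = argmax([b - ui for b, ui in zip(scores_b, u)])
        if y == z:
            # Agreement certifies that y maximizes the combined score.
            return y
        step = 1.0 / (t + 1)  # decaying step size (an assumed schedule)
        u[y] -= step  # make model A's current choice less attractive
        u[z] += step  # make model B's current choice less attractive to B
    return None  # no agreement within max_iters


# Made-up scores for three candidate analyses of one sentence.
TOY_SCORES_A = [3.0, 1.0, 2.0]  # stand-in for a grammar-based parser
TOY_SCORES_B = [0.0, 2.5, 2.0]  # stand-in for a local-feature model
AGREED = dual_decompose(TOY_SCORES_A, TOY_SCORES_B)
```

With these toy scores the two models disagree when run alone (the first prefers candidate 0, the second candidate 1), while the agreed candidate is the one with the highest combined score. In a real system the sub-models would be full parsers sharing variables over coordination spans, which this sketch does not attempt to model.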
