tailieunhanh - Scientific report: "Strong Lexicalization of Tree Adjoining Grammars"

Andreas Maletti
IMS, Universität Stuttgart
Pfaffenwaldring 5b, 70569 Stuttgart, Germany
maletti@

Joost Engelfriet
LIACS, Leiden University
P.O. Box 9512, 2300 RA Leiden, The Netherlands
engelfri@

Abstract

Recently, it was shown (Kuhlmann, Satta: Tree-adjoining grammars are not closed under strong lexicalization. Comput. Linguist., 2012) that finitely ambiguous tree adjoining grammars cannot be transformed into a normal form (preserving the generated tree language), in which each production contains a lexical symbol. A more powerful model, the simple context-free tree grammar, admits such a normal form. It can be effectively constructed, and the maximal rank of the nonterminals only increases by 1. Thus, simple context-free tree grammars strongly lexicalize tree adjoining grammars and themselves.

1 Introduction

Tree adjoining grammars (TAG) (Joshi et al., 1969; Joshi et al., 1975) are a mildly context-sensitive grammar formalism that can handle certain nonlocal dependencies (Kuhlmann and Möhl, 2006), which occur in several natural languages. A good overview on TAG, their formal properties, their linguistic motivation, and their applications is presented by Joshi and Schabes (1992) and Joshi and Schabes (1997), in which also strong lexicalization is discussed. In general, lexicalization is the process of transforming a grammar into an equivalent one (potentially expressed in another formalism) such that each production contains a lexical item (or anchor). Each production can then be viewed as lexical information on its anchor. It demonstrates a syntactical construction in which the anchor can occur.
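To make the notion of a lexicalized grammar concrete, the following minimal Python sketch tests whether every production of a toy tree grammar contains a lexical symbol. The `Tree` type, the function names, the symbol names and the two example productions are illustrative assumptions and are not notation taken from the paper; the sketch only checks the defining property and does not perform the lexicalization construction itself.

```python
from typing import Iterator, NamedTuple


class Tree(NamedTuple):
    """A node label with an ordered tuple of subtrees (hypothetical encoding)."""
    label: str
    children: tuple = ()


def labels(tree: Tree) -> Iterator[str]:
    """Yield every node label of the tree in preorder."""
    yield tree.label
    for child in tree.children:
        yield from labels(child)


def is_lexicalized(productions, lexical_symbols) -> bool:
    """Return True if each right-hand side contains at least one lexical symbol."""
    return all(
        any(label in lexical_symbols for label in labels(rhs))
        for _lhs, rhs in productions
    )


# Assumed toy data: "a" and "b" are lexical symbols, "S" and "A" are nonterminals.
LEXICAL = {"a", "b"}
productions = [
    ("S", Tree("sigma", (Tree("a"), Tree("A")))),  # anchored by "a"
    ("A", Tree("sigma", (Tree("A"), Tree("A")))),  # no anchor
]

lexicalized_all = is_lexicalized(productions, LEXICAL)
lexicalized_first = is_lexicalized(productions[:1], LEXICAL)
```

Under these assumptions the second production has no anchor, so the grammar as a whole fails the check, while the first production on its own passes it.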
Since a lexical item is a letter of the string alphabet, each production of a lexicalized grammar produces at least one letter of the generated string. Consequently, lexicalized grammars offer significant parsing benefits (Schabes et al., 1988) as the number of

* Financially supported by the German Research Foundation (DFG) grant MA 4959/1-1.
