Scientific report: "Structure Sharing with Binary Trees"

Lauri Karttunen, SRI International, CSLI Stanford
Martin Kay, Xerox PARC, CSLI Stanford

Many current interfaces for natural language represent syntactic and semantic information in the form of directed graphs where attributes correspond to vectors and values to nodes. There is a simple correspondence between such graphs and the matrix notation linguists traditionally use for feature sets.

[Figure 1, panel (b), matrix notation: [cat: np, agr: [number: sg, person: 3rd]]]

The standard operation for working with such graphs is unification. The unification operation succeeds only on a pair of compatible graphs, and its result is a graph containing the information in both contributors. When a parser applies a syntactic rule, it unifies selected features of input constituents to check constraints and to build a representation for the output constituent.

Problem: proliferation of copies

When words are combined to form phrases, unification is not applied to lexical representations directly, because it would result in the lexicon being changed. When a word is encountered in a text, a copy is made of its entry, and unification is applied to the copied graph, not the original one. In fact, unification in a typical parser is always preceded by a copying operation. Because of nondeterminism in parsing, it is, in general, necessary to preserve every representation that gets built. The same graph may be needed again when the parser comes back to pursue some yet unexplored option.
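The copy-then-unify pattern described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes feature sets are plain nested Python dictionaries (trees without shared substructure, a simplification of the directed graphs in the text), and the names `unify`, `unify_into`, `UnificationFailure` and the sample lexicon entry are hypothetical, chosen only for illustration.

```python
import copy


class UnificationFailure(Exception):
    """Raised when two feature sets carry incompatible values."""


def unify_into(target, source):
    """Destructively merge `source` into `target` (both nested dicts).

    Assumption: atomic values are compared with ==, and a clash between
    two atoms, or between an atom and a sub-structure, is a failure.
    """
    for attr, value in source.items():
        if attr not in target:
            # New information: copy it in so `source` is never shared.
            target[attr] = copy.deepcopy(value)
        elif isinstance(target[attr], dict) and isinstance(value, dict):
            unify_into(target[attr], value)
        elif target[attr] != value:
            raise UnificationFailure(f"clash on attribute {attr!r}")
    return target


def unify(a, b):
    """Non-destructive unification: copy `a` first, then merge `b` into the copy.

    Returns the unified structure, or None if the inputs are incompatible.
    The full copy is made even when unification fails immediately, which is
    the wasted effort the text describes.
    """
    result = copy.deepcopy(a)
    try:
        return unify_into(result, b)
    except UnificationFailure:
        return None


# Hypothetical lexicon entry mirroring the feature set of Figure 1.
entry = {"cat": "np", "agr": {"number": "sg", "person": "3rd"}}

ok = unify(entry, {"agr": {"number": "sg"}, "case": "nom"})
clash = unify(entry, {"agr": {"number": "pl"}})
```

Here `ok` holds the merged structure, `clash` is `None`, and `entry` is left untouched in both cases, because every call pays for a full copy before any compatibility check is made.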
Our experience suggests that the amount of computational effort that goes into producing these copies is much greater than the cost of unification itself. It accounts for a significant amount of the total parsing time. In a sense, most of the copying effort is wasted. Unifications that fail typically fail for a simple reason. If it were known in advance which aspects of the structures are relevant in a particular case, some effort could be saved by first considering only the crucial features of the input.

Solution: structure sharing
