tailieunhanh - Báo cáo khoa học: "The Structure of Shared Forests in Ambiguous Parsing"

The Context-Free backbone of some natural language analyzers produces all possible CF parses as some kind of shared forest, from which a single tree is to be chosen by a disambiguation process that may be based on the finer features of the language. We study the structure of these forests with respect to optimality of sharing, and in relation with the parsing schema used to produce them. In addition to a theoretical and experimental framework for studying these issues, the main results presented are: sophistication in chart parsing schemata (. use of look-ahead) may reduce time and space efficiency. | The Structure of Shared Forests in Ambiguous Parsing Sylvie Billot Bernard Lang INRIA and Université d Orleans billot@ lang@ Abstract The Context-Free backbone of some natural language analyzers produces all possible CF parses as some kind of shared forest from which a single tree is to be chosen by a disambiguation process that may be based on the liner features of the language. We study the structure of these forests with respect to optimality of sharing and in relation with the parsing schema used to produce them. In addition to a theoretical and experimental framework for studying these issues the main results presented are - sophistication in chart parsing schemata . use of look-ahead may reduce time and space efficiency instead of improving it - there is a shared forest structure with at most cubic size for any CF grammar - when O n3 complexity is required the shape of a shared forest is dependent on the parsing schema used. Though analyzed on CF grammars for simplicity these results extend to more complex formalisms such as unification based grammars. Key words Context-Free Parsing Ambiguity Dynamic Programming Earley Parsing Chart Parsing Parsing Strategies Parsing Schemata Parse Tree Parse Forest. 1 Introduction Several natural language parser start with a pure Context-Free CF backbone that makes a first sketch of the structure of the analyzed sentence before it is handed to a more elaborate analyzer possibly a coroutine that takes into account the finer grammatical structure to filter out undesirable parses see for example 24 28 . In 28 Shieber surveys existing variants to this approach before giving his own tunable approach based on restrictions that split up the infinite nonterminal domain into a finite set of equivalence classes that can be used for parsing . The basic motivation for this approach is to benefit from the CF parsing technology whose development over 30 years has lead to powerful and efficient parsers 1 7 .

TỪ KHÓA LIÊN QUAN