Scientific report: "A Cascaded Finite-State Parser for German"

Michael Schiehlen
Institute for Computational Linguistics, University of Stuttgart
Azenbergstr. 12, D-70174 Stuttgart
mike@

Abstract

The paper presents two approaches to partial parsing of German: a tagger trained on dependency tuples, and a cascaded finite-state parser (Abney, 1997). For the tagging approach, the effects of choosing different representations of dependency tuples are investigated. Performance of the finite-state parser is boosted by delaying syntactically unsolvable disambiguation problems via underspecification. Both approaches are evaluated on a 340,000-token corpus.

1 Introduction

Traditional parsers are often quite brittle and optimize precision over recall. It is therefore important to also look at shallow approaches that come at virtually no cost in manual labour but can potentially supplement more knowledge-prone approaches. The paper discusses one such approach, which gets by with a tree bank and a tagger.

Another issue in parsing is speed, which can only be gained by deterministic processing. Deterministic parsers return exactly one syntactic reading, which forces them to solve many locally unsolvable puzzles. Abney (1997) suggests a way out of this dilemma: the parser leaves ambiguities unresolved if they are contained in a local domain. So at least ambiguities of this kind can conceivably be handed over to some expert disambiguation module. The paper fleshes out this idea and shows its impact on overall performance.
2 Evaluation Method

Instead of using the prevalent PARSEVAL measures, we opted for a dependency-based evaluation (Lin, 1995), which is arguably (Srinivas et al., 1996; Kübler and Telljohann, 2002) fairer to partial parsers. In a dependency structure, every word token (the dependent) is related to another token (the head) over a grammatical role, but for one word token, which is called the root. Thus a parser constructing a dependency structure needs to associate every word token either with a
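
As an illustration only, the dependency structure described in this section can be sketched as a set of (dependent, head, role) tuples, with the root marked by a missing head. The tuple layout, the role labels (`DET`, `SUBJ`, `ROOT`), the toy sentence, and the scoring function `precision_recall` are assumptions made for this sketch and are not taken from the paper; the paper's actual scoring scheme may differ.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class Dependency(NamedTuple):
    dependent: int        # index of the dependent token in the sentence
    head: Optional[int]   # index of the head token; None marks the root
    role: str             # grammatical role label (hypothetical label set)


def precision_recall(gold: Iterable[Dependency],
                     predicted: Iterable[Dependency]) -> Tuple[float, float]:
    """Compare predicted tuples with gold tuples by exact match."""
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    return precision, recall


# Toy sentence "Der Hund schläft" with tokens 0=Der, 1=Hund, 2=schläft.
gold = [
    Dependency(0, 1, "DET"),
    Dependency(1, 2, "SUBJ"),
    Dependency(2, None, "ROOT"),
]

# A partial parser may leave a token unattached: here "Hund" gets no head.
predicted = [
    Dependency(0, 1, "DET"),
    Dependency(2, None, "ROOT"),
]

precision, recall = precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case every tuple the parser commits to is correct, so precision stays at 1.0 while recall drops to 2/3. A tuple-based comparison of this kind penalizes a partial parser only for what it omits or gets wrong, rather than for failing to build a complete tree.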
