Scientific report: "Integrated Shallow and Deep Parsing"

Integrated Shallow and Deep Parsing: TopP meets HPSG

Anette Frank, Markus Becker, Berthold Crysmann, Bernd Kiefer and Ulrich Schäfer
DFKI GmbH, 66123 Saarbrücken, Germany
School of Informatics, University of Edinburgh, UK

1 This work was in part supported by a BMBF grant to the DFKI project WHITEBOARD (FKZ 01 IW 002).

Abstract

We present a novel, data-driven method for integrated shallow and deep parsing. Mediated by an XML-based multi-layer annotation architecture, we interleave a robust but accurate stochastic topological field parser of German with a constraint-based HPSG parser. Our annotation-based method for dovetailing shallow and deep phrasal constraints is highly flexible, allowing targeted and fine-grained guidance of constraint-based parsing. We conduct systematic experiments that demonstrate substantial performance gains.

1 Introduction

One of the strong points of deep processing (DNLP) technology such as HPSG or LFG parsers certainly lies in the high degree of precision and the detailed linguistic analysis these systems are able to deliver. Although considerable progress has been made in the area of processing speed, DNLP systems still cannot rival shallow and medium-depth technologies in terms of throughput and robustness. As a net effect, the impact of deep parsing technology on application-oriented NLP is still fairly limited.

With the advent of XML-based hybrid shallow-deep architectures as presented in (Grover and Lascarides, 2001; Crysmann et al., 2002; Uszkoreit, 2002), it has become possible to integrate the added value of deep processing with the performance and robustness of shallow processing. So far, integration has largely focused on the lexical level, addressing the most urgent need in increasing the robustness and coverage of deep parsing systems, namely lexical coverage. While integration in (Grover and Lascarides, 2001) was still restricted to morphological and PoS information, (Crysmann et al., 2002) extended .