tailieunhanh - Scientific report: "An Implementation of Combined Partial Parser and Morphosyntactic Disambiguator"
An Implementation of Combined Partial Parser and Morphosyntactic Disambiguator

Aleksander Buczynski
Institute of Computer Science, Polish Academy of Sciences
Ordona 21, 01-237 Warszawa, Poland
olekb@

Abstract

The aim of this paper is to present a simple yet efficient implementation of a tool for a formalism that combines rule-based morphosyntactic tagging with partial parsing. The parser is currently used for creating a treebank of partial parses in a valency acquisition project over the IPI PAN Corpus of Polish.

1 Introduction

Motivation

Usually tagging and partial parsing are done separately, with the input to a parser assumed to be a morphosyntactically fully disambiguated text. Some approaches (Karlsson et al., 1995; Schiehlen, 2002; Muller, 2006) interweave tagging and parsing. Karlsson et al. (1995) actually use the same formalism for both tasks. This is possible because all words in this dependency-based approach come with all possible syntactic tags, so partial parsing is reduced to rejecting wrong hypotheses, just as in the case of morphosyntactic tagging.

Rules used in rule-based tagging often implicitly identify syntactic constructs, but do not mark such constructs in texts. A typical rule of this kind may say that when an unambiguous dative-taking preposition is followed by a number of possibly dative adjectives and a noun ambiguous between dative and some other case, then the noun should be disambiguated to dative. Obviously, such a rule actually identifies a PP and some of its structure.
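The dative rule described above can be made concrete with a minimal sketch in Python. It is not the formalism or the parser presented in the paper: the `Token` class, the `apply_dative_pp_rule` function, and the tag labels (`prep`, `adj`, `subst`, `dat`) are hypothetical names chosen for illustration only. The sketch shows how a single rule can both disambiguate the noun and record the span of the PP it has matched. Following the rule as stated, only the noun is disambiguated; the adjectives are left untouched.

```python
from dataclasses import dataclass


@dataclass
class Token:
    orth: str
    # Set of (part_of_speech, case) pairs; more than one pair means ambiguity.
    # The labels are illustrative assumptions, not the paper's tagset.
    readings: set


def apply_dative_pp_rule(tokens):
    """Disambiguate nouns to dative in place and return the matched PP spans.

    Pattern: an unambiguous dative-taking preposition, then zero or more
    possibly dative adjectives, then a noun with a dative reading.
    Spans are (start, end) token indices with an exclusive end.
    """
    spans = []
    i = 0
    while i < len(tokens):
        if tokens[i].readings == {("prep", "dat")}:
            j = i + 1
            # Skip over adjectives that have at least one dative reading.
            while j < len(tokens) and ("adj", "dat") in tokens[j].readings:
                j += 1
            # The next token must be a noun that can be dative.
            if j < len(tokens) and ("subst", "dat") in tokens[j].readings:
                tokens[j].readings = {("subst", "dat")}  # tagging step
                spans.append((i, j + 1))                 # parsing step: mark the PP
                i = j + 1
                continue
        i += 1
    return spans


# Polish "ku wielkiej radości" ("to great joy"): both the adjective and the
# noun are ambiguous between genitive, dative and locative.
sentence = [
    Token("ku", {("prep", "dat")}),
    Token("wielkiej", {("adj", "gen"), ("adj", "dat"), ("adj", "loc")}),
    Token("radości", {("subst", "gen"), ("subst", "dat"), ("subst", "loc")}),
]
pp_spans = apply_dative_pp_rule(sentence)
print(pp_spans)              # [(0, 3)]
print(sentence[2].readings)  # {('subst', 'dat')}
```

In this sketch the tagging decision and the constituent span come out of the same pattern match, which is the observation that motivates treating the two tasks together.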
Following the observation that both tasks, morphosyntactic tagging and partial constituency parsing, involve similar linguistic knowledge, a formalism for simultaneous tagging and parsing was proposed in (Przepiorkowski, 2007). This paper presents a revised version of the formalism and a simple implementation of a parser that understands rules written according to it. The input to the rules is a tokenised and morphosyntactically annotated XML text. The output contains disambiguation .