Scientific report: "Parsing without lexicon: the MorP system"
Gunnel Källgren
University of Stockholm, Department of Computational Linguistics
S-106 91 Stockholm, Sweden
gunnel@

Abstract

MorP is a system for automatic word class assignment on the basis of surface features. It has a very small lexicon of form words (%o entries), and for the rest works entirely on morphological and configurational patterns. This makes it robust and fast, and in spite of the (deliberate) restrictedness of the system, its performance reaches an average accuracy level above 91% when run on unrestricted Swedish text.

Keywords: parsing, morphology.

The development of the parser to be presented has been supported by the Swedish Research Council for the Humanities.

The parser is called MorP, for morphology based parser, and the hypotheses behind it can be formulated thus:

a) It is to a large extent possible to decide the word class of words in running text from pure surface criteria, such as the morphology of the words together with the configurations that they appear in.

b) These surface criteria can be described so clearly that an automatic identification of word class will be possible.

c) Surface criteria give signals that will suffice to give a word class identification with a level of around or above 90% correctness, at least for a language with as much inflectional morphology as Swedish.
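The idea behind hypotheses a) to c) can be illustrated with a minimal sketch. It is not MorP's actual rule set or implementation: the form-word table, the suffix list, the single configurational rule, and the function name `tag` are all hypothetical, chosen only to show how a tiny lexicon, morphological endings, and context might combine to guess a word class.

```python
# Hypothetical sketch: none of these tables or rules are taken from MorP.

# A tiny lexicon of form words (function words), looked up first.
FORM_WORDS = {
    "och": "conjunction",
    "i": "preposition",
    "på": "preposition",
    "en": "determiner",
    "ett": "determiner",
    "det": "pronoun",
}

# (suffix, word class) pairs, checked in order; longer endings come first.
SUFFIX_RULES = [
    ("ningen", "noun"),
    ("ning", "noun"),
    ("het", "noun"),
    ("ade", "verb"),
    ("lig", "adjective"),
]


def tag(tokens):
    """Assign a guessed word class to each token using surface cues only."""
    tags = []
    for token in tokens:
        word = token.lower()
        # 1. Form-word lexicon.
        if word in FORM_WORDS:
            tags.append(FORM_WORDS[word])
            continue
        # 2. Morphological pattern: first matching suffix wins.
        guess = next(
            (cls for suffix, cls in SUFFIX_RULES if word.endswith(suffix)),
            None,
        )
        # 3. Configurational pattern (assumed rule): an otherwise
        #    unresolved word directly after a determiner is taken as a noun.
        if guess is None and tags and tags[-1] == "determiner":
            guess = "noun"
        tags.append(guess or "unknown")
    return tags
```

For example, `tag("Det regnade i en stad".split())` resolves "det", "i" and "en" from the form-word table, "regnade" from its ending, and "stad" from its position after a determiner. A real system would need far more patterns and a way to resolve conflicts between them.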
A parser was constructed along these lines, which were first presented in Brodda (1982), and the predictions of the hypotheses were found to hold fairly well. The project is reported in publications in Swedish (Källgren 1984a) and English (Källgren 1984b, 1985, 1991a), and the parser has been tested in a practical application in connection with information retrieval (Källgren 1984c, 1991a). We also plan to use the parser in a project aimed at building a large tagged corpus of Swedish, the SUC corpus (Källgren 1990, 1991b). The MorP parser is implemented in a high-level string manipulating language developed at Stockholm University by