Scientific report: ON REPRESENTING GOVERNED PREPOSITIONS AND HANDLING "INCORRECT" AND NOVEL PREPOSITIONS

Hatte R. Blejer, Sharon Flank, and Andrew Kehler
SRA Corporation, 2000 15th St. North, Arlington, VA 22201, USA

ABSTRACT

NLP systems, in order to be robust, must handle novel and ill-formed input. One common type of error involves the use of non-standard prepositions to mark arguments. In this paper, we argue that such errors can be handled in a systematic fashion, and that a system designed to handle them offers other advantages. We offer a classification scheme for preposition usage errors. Further, we show how the knowledge representation employed in the SRA NLP system facilitates handling these data.

INTRODUCTION

It is well known that NLP systems, in order to be robust, must handle ill-formed input. One common type of error involves the use of non-standard prepositions to mark arguments. In this paper, we argue that such errors can be handled in a systematic fashion, and that a system designed to handle them offers other advantages.

The examples of non-standard prepositions we present in the paper are taken from colloquial language, both written and oral. The type of error these examples represent is quite frequent in colloquial written language. The frequency of such examples rises sharply in evolving sublanguages and in oral colloquial language. In developing an NLP system to be used by various government customers, we have been sensitized to the need to handle variation and innovation in preposition usage.
Handling this type of variation or innovation is part of our overall capability to handle novel predicates, which are frequent in sublanguage. Novel predicates created for sublanguages are less stable in how they mark arguments than general English core predicates, which speakers learn as children. It can be expected that the eventual advent of successful speech understanding systems will further emphasize the need to handle this and other variation.

The NLP system
