Scientific report: "HOW TO DETECT GRAMMATICAL ERRORS IN A TEXT WITHOUT PARSING IT"

Eric Steven Atwell
Artificial Intelligence Group, Department of Computer Studies, Leeds University, Leeds LS2 9JT
EARN BITNET eric

ABSTRACT

The Constituent Likelihood Automatic Word-tagging System (CLAWS) was originally designed for the low-level grammatical analysis of the million-word LOB Corpus of English text samples. CLAWS does not attempt a full parse, but uses a first-order Markov model of language to assign word-class labels to words. CLAWS can be modified to detect grammatical errors, essentially by flagging unlikely word-class transitions in the input text. This may seem to be an intuitively implausible and theoretically inadequate model of natural language syntax, but nevertheless it can successfully pinpoint most grammatical errors in a text. Several modifications to CLAWS have been explored. The resulting system cannot detect all errors in typed documents, but then neither do far more complex systems, which attempt a full parse and require much greater computation.
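The idea of flagging unlikely word-class transitions can be illustrated with a minimal sketch. This is not the CLAWS implementation: it assumes the input is already tagged (CLAWS assigns the tags itself), uses a toy tagset (DET, ADJ, NOUN, VERB) rather than the LOB tagset, and picks an arbitrary threshold of 0.05. The function names `train_transitions` and `flag_unlikely` are hypothetical.

```python
from collections import Counter

START = "<s>"  # hypothetical sentence-start pseudo-tag

def train_transitions(tagged_sentences):
    """Estimate P(tag | previous tag) from sentences given as lists of (word, tag)."""
    pair_counts = Counter()
    prev_counts = Counter()
    for sentence in tagged_sentences:
        tags = [START] + [tag for _, tag in sentence]
        for prev, cur in zip(tags, tags[1:]):
            pair_counts[(prev, cur)] += 1
            prev_counts[prev] += 1
    return {pair: n / prev_counts[pair[0]] for pair, n in pair_counts.items()}

def flag_unlikely(tagged_sentence, transitions, threshold=0.05):
    """Return (position, word) for each word whose incoming transition is below threshold."""
    flags = []
    prev = START
    for i, (word, tag) in enumerate(tagged_sentence):
        # Unseen transitions get probability 0.0 (no smoothing in this sketch).
        if transitions.get((prev, tag), 0.0) < threshold:
            flags.append((i, word))
        prev = tag
    return flags

# Toy training data with an illustrative tagset (assumption, not the LOB tagset).
training = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("old", "ADJ"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
transitions = train_transitions(training)

# A repeated determiner yields a DET -> DET transition never seen in training.
suspect = [("the", "DET"), ("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")]
flags = flag_unlikely(suspect, transitions)
print(flags)  # [(1, 'the')]
```

The sketch only looks at adjacent tag pairs, which is what makes the approach cheap compared with a full parse; it also shows the limitation the abstract concedes, since an error that produces only plausible adjacent transitions would pass unflagged.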
Checking Grammar in Texts

A number of researchers have experimented with ways to cope with grammatically ill-formed English input, for example (Carbonell and Hayes 83), (Charniak 83), (Granger 83), (Hayes and Mouradian 81), (Heidorn et al 82), (Jensen et al 83), (Kwasny and Sondheimer 81), (Weischedel and Black 80), (Weischedel and Sondheimer 83). However, the majority of these systems are designed for Natural Language interfaces to software systems, and so can assume a restricted vocabulary and syntax; for example, the system discussed by (Fass 83) had a vocabulary of fewer than 50 words. This may be justifiable for an NL front-end to a computer system such as a Database Query system, since even an artificial subset of English may be more acceptable to users than a formal command or query language. However, for automated text-checking in Word Processing, we cannot reasonably ask the WP user to restrict their English text in this way. This means that WP ...
