Scientific report: "Judging Grammaticality with Tree Substitution Grammar Derivations"

Matt Post
Human Language Technology Center of Excellence
Johns Hopkins University
Baltimore, MD 21211

Abstract

In this paper, we show that local features computed from the derivations of tree substitution grammars, such as the identity of particular fragments and a count of large and small fragments, are useful in binary grammatical classification tasks. Such features outperform n-gram features and various model scores by a wide margin. Although they fall short of the performance of the hand-crafted feature set of Charniak and Johnson (2005) developed for parse tree reranking, they do so with an order of magnitude fewer features. Furthermore, since the TSGs employed are learned in a Bayesian setting, the use of their derivations can be viewed as the automatic discovery of tree patterns useful for classification. On the BLLIP dataset, we achieve an accuracy of in discriminating between grammatical text and samples from an n-gram language model.

1 Introduction

The task of a language model is to provide a measure of the grammaticality of a sentence. Language models are useful in a variety of settings, for both human and machine output; for example, in the automatic grading of essays or in guiding search in a machine translation system. Language modeling has proved to be quite difficult.
The simplest models, n-grams, are self-evidently poor models of language, unable to easily capture or enforce long-distance linguistic phenomena. However, they are easy to train, are long-studied and well understood, and can be efficiently incorporated into search procedures, such as for machine translation. As a result, the output of such text generation systems is often very poor grammatically, even if it is understandable. Since grammaticality judgments are a matter of the syntax of a language, the obvious approach for modeling grammaticality is to start with the extensive work produced over the past two decades in the field of .
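
The local derivation features named in the abstract (the identity of particular fragments, plus counts of large and small fragments) might be computed roughly as in the sketch below. This is an illustration under stated assumptions, not the paper's implementation: the bracketed-string representation of fragments, the names `fragment_height` and `derivation_features`, and the height threshold that separates "large" from "small" fragments are all hypothetical choices.

```python
from collections import Counter
from typing import Dict, List


def fragment_height(fragment: str) -> int:
    """Return the maximum bracket nesting depth of a bracketed fragment.

    Assumption: a fragment is written like "(S NP (VP (VBD said) SBAR))",
    where bare symbols are frontier nodes and each "(" opens an internal node.
    """
    depth = 0
    max_depth = 0
    for ch in fragment:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
    return max_depth


def derivation_features(fragments: List[str],
                        large_min_height: int = 2) -> Dict[str, int]:
    """Map a TSG derivation (a list of fragments) to sparse count features.

    Assumption: a fragment counts as "large" when its height is at least
    large_min_height; height-1 fragments are treated as "small".
    """
    features: Counter = Counter()
    for frag in fragments:
        # Identity feature: one count per distinct fragment used.
        features["frag=" + frag] += 1
        # Aggregate size features.
        if fragment_height(frag) >= large_min_height:
            features["num_large"] += 1
        else:
            features["num_small"] += 1
    return dict(features)


# Hypothetical derivation made of three fragments.
derivation = [
    "(S NP (VP (VBD said) SBAR))",
    "(NP (PRP she))",
    "(SBAR S)",
]
features = derivation_features(derivation)
```

A feature dictionary of this kind could then be passed to any standard binary classifier to separate grammatical sentences from ungrammatical ones; the choice of classifier is left open here.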
