Scientific report: "A Meta Learning Approach to Grammatical Error Correction"

Hongsuck Seo¹, Jonghoon Lee¹, Seokhwan Kim², Kyusong Lee¹, Sechun Kang¹, Gary Geunbae Lee¹
¹Pohang University of Science and Technology
²Institute for Infocomm Research
hsseo jh21983 @ kims@ kyusonglee freshboy gblee @

Abstract

We introduce a novel method for grammatical error correction with a number of small corpora. To make the best use of several corpora with different characteristics, we employ meta-learning with several base classifiers trained on different corpora. This research focuses on a grammatical error correction task for article errors. A series of experiments is presented to show the effectiveness of the proposed approach on two different grammatical error tagged corpora.

1. Introduction

As language learning has drawn significant attention in the community, grammatical error correction (GEC) has consequently attracted a fair amount of attention. Several organizations have built diverse resources, including grammatical error (GE) tagged corpora. Although some GE tagged corpora have been publicly released, it is still challenging to train a good GEC model because no large GE tagged learner corpus is available. The available GE tagged corpora are mostly small datasets whose characteristics differ depending on how they were developed, e.g. spoken corpus vs. written corpus. This situation has forced researchers to utilize native corpora rather than GE tagged learner corpora for the GEC task. The native corpus approach consists of learning a model that predicts the correct form of an article given the surrounding context.
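The native corpus idea can be illustrated with a minimal sketch. It is not the classifier used in this paper or in the cited work: a small naive Bayes model with add-one smoothing stands in for the classifier, and the names `extract_examples` and `ArticleModel`, the two context features, and the toy corpus are all assumptions made for illustration. For simplicity every non-article token is treated as a candidate slot, whereas a real system would restrict the slots to noun phrases.

```python
import math
from collections import Counter, defaultdict

LABELS = ("a", "the", "none")  # "a" stands for both "a" and "an"


def extract_examples(tokens):
    """Turn one tokenized native sentence into (features, label) pairs.

    Each non-article token is a candidate slot. The label is the article
    that immediately precedes it in the native text, or "none".
    """
    examples = []
    prev, pending = "<s>", "none"
    for tok in tokens:
        low = tok.lower()
        if low in ("a", "an"):
            pending = "a"
            continue
        if low == "the":
            pending = "the"
            continue
        examples.append((["prev=" + prev, "next=" + low], pending))
        prev, pending = low, "none"
    return examples


class ArticleModel:
    """Hypothetical naive Bayes article predictor (illustration only)."""

    def __init__(self):
        self.label_counts = Counter()
        self.feat_counts = defaultdict(Counter)  # label -> feature -> count
        self.vocab = set()

    def train(self, sentences):
        for tokens in sentences:
            for feats, label in extract_examples(tokens):
                self.label_counts[label] += 1
                for f in feats:
                    self.feat_counts[label][f] += 1
                    self.vocab.add(f)

    def predict(self, prev_word, next_word):
        feats = ["prev=" + prev_word.lower(), "next=" + next_word.lower()]
        total = sum(self.label_counts.values())
        best, best_score = None, -math.inf
        for label in LABELS:
            # Add-one smoothing on both the prior and the feature likelihoods.
            score = math.log((self.label_counts[label] + 1) / (total + len(LABELS)))
            denom = sum(self.feat_counts[label].values()) + len(self.vocab) + 1
            for f in feats:
                score += math.log((self.feat_counts[label][f] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


# Toy "native" corpus, invented for this example.
corpus = [
    "I saw the moon".split(),
    "She saw the moon".split(),
    "He bought a book".split(),
    "They bought a book".split(),
]
model = ArticleModel()
model.train(corpus)
print(model.predict("saw", "moon"))
print(model.predict("bought", "book"))
```

To correct a learner sentence under this scheme, the model's prediction for each slot would be compared with the article the learner actually wrote, and a mismatch would be flagged as a candidate article error.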
Some researchers focused on mining better features from linguistic and pedagogic knowledge, whereas others focused on testing different classification methods (Knight and Chandler, 1994; Minnen et al., 2000; Lee, 2004; Nagata et al., 2006; Han et al., 2006; De Felice, 2008). Recently, a group of researchers introduced methods utilizing a GE tagged learner corpus
