Scientific report: "Pointwise Prediction for Robust, Adaptable Japanese Morphological Analysis"
Pointwise Prediction for Robust, Adaptable Japanese Morphological Analysis

Graham Neubig, Yosuke Nakata, Shinsuke Mori
Graduate School of Informatics, Kyoto University
Yoshida Honmachi, Sakyo-ku, Kyoto, Japan

Abstract

We present a pointwise approach to Japanese morphological analysis (MA) that ignores structure information during learning and tagging. Despite the lack of structure, it is able to outperform the current state-of-the-art structured approach for Japanese MA, and achieves accuracy similar to that of structured predictors using the same feature set. We also find that the method is both robust to out-of-domain data and can be easily adapted through the use of a combination of partial annotation and active learning.

1 Introduction

Japanese morphological analysis (MA) takes an unsegmented string of Japanese text as input and outputs a string of morphemes annotated with parts of speech (POSs). As MA is the first step in Japanese NLP, its accuracy directly affects the accuracy of NLP systems as a whole. In addition, with the proliferation of text in various domains, there is increasing need for methods that are both robust and adaptable to out-of-domain data (Escudero et al., 2000). Previous approaches have used structured predictors such as hidden Markov models (HMMs) or conditional random fields (CRFs), which consider the interactions between neighboring words and parts of speech (Nagata, 1994; Asahara and Matsumoto, 2000; Kudo et al., 2004).
However, while structure does provide valuable information, Liang et al. (2008) have shown that gains provided by structured prediction can be largely recovered by using a richer feature set. This approach has also been called pointwise prediction, as it makes a single independent decision at each point (Neubig and Mori, 2010). While Liang et al. (2008) focus on the speed benefits of pointwise prediction, we demonstrate that it also allows for more robust and adaptable MA. We find experimental evidence that pointwise MA can exceed the accuracy of a .
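To make the idea of "a single independent decision at each point" concrete, the following is a minimal sketch of pointwise word segmentation in Python. It is an illustration under stated assumptions, not the method described in the paper: the feature templates in `boundary_features`, the hand-set `TOY_WEIGHTS`, and the bias value are all hypothetical, and a real system would learn its weights from annotated data with a classifier. The one property it is meant to show is that the decision at each character boundary depends only on the surrounding characters and never on decisions made at other boundaries.

```python
from typing import Dict, List


def boundary_features(text: str, i: int, window: int = 2) -> List[str]:
    """Features for the boundary between text[i-1] and text[i].

    Only nearby characters are used; no previously predicted boundary or
    tag is consulted, which is what makes the prediction pointwise.
    These feature templates are hypothetical, chosen for illustration.
    """
    feats = []
    for offset in range(-window, window):
        j = i + offset
        ch = text[j] if 0 <= j < len(text) else "<pad>"
        feats.append(f"char[{offset}]={ch}")
    # Character bigram straddling the candidate boundary.
    feats.append(f"bigram={text[i - 1]}{text[i]}")
    return feats


def segment(text: str, weights: Dict[str, float], bias: float = 0.0) -> List[str]:
    """Split text by scoring every candidate boundary independently."""
    if not text:
        return []
    words, start = [], 0
    for i in range(1, len(text)):
        # Linear score over local features; a positive score means "boundary here".
        score = bias + sum(weights.get(f, 0.0) for f in boundary_features(text, i))
        if score > 0:
            words.append(text[start:i])
            start = i
    words.append(text[start:])
    return words


# Hand-set toy weights (an assumption for this example, not learned values).
TOY_WEIGHTS = {
    "bigram=れは": 1.0,
    "bigram=は本": 1.0,
    "bigram=本で": 1.0,
}

print(segment("これは本です", TOY_WEIGHTS, bias=-0.5))
```

Because each boundary is scored on its own, the loop body could be evaluated in any order, and a single boundary could be annotated or corrected without labeling the rest of the sentence.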