Scientific report: "A Cost Sensitive Part-of-Speech Tagging: Differentiating Serious Errors from Minor Errors"

A Cost Sensitive Part-of-Speech Tagging: Differentiating Serious Errors from Minor Errors

Hyun-Je Song1, Jeong-Woo Son1, Tae-Gil Noh2, Seong-Bae Park1,3, Sang-Jo Lee1
1 School of Computer Sci. & Eng., Kyungpook Nat'l Univ., Daegu, Korea
2 Computational Linguistics, Heidelberg University, Heidelberg, Germany
3 NLP Lab., Dept. of Computer Science, University of Illinois at Chicago
hj song jwson tgnoh @sej sbpark@ sjlee@

Abstract

All types of part-of-speech (POS) tagging errors have been equally treated by existing taggers. However, the errors are not equally important, since some errors affect the performance of subsequent natural language processing (NLP) tasks seriously while others do not. This paper aims to minimize these serious errors while retaining the overall performance of POS tagging. Two gradient loss functions are proposed to reflect the different types of errors. They are designed to assign a larger cost to serious errors and a smaller one to minor errors. Through a set of POS tagging experiments, it is shown that the classifier trained with the proposed loss functions reduces serious errors compared to state-of-the-art POS taggers. In addition, the experimental result on text chunking shows that fewer serious errors help to improve the performance of subsequent NLP tasks.

1 Introduction

Part-of-speech (POS) tagging is needed as a preprocessor for various natural language processing (NLP) tasks such as parsing, named entity recognition (NER), and text chunking. Since POS tagging is normally performed in the early step of NLP tasks, the errors in POS tagging are critical in that they affect subsequent steps and often lower the overall performance of NLP tasks.
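To make the idea of unequal error costs concrete, the following is a minimal sketch, not the loss functions proposed in the paper (those are not reproduced in this excerpt). It assumes that an error is "serious" when the predicted tag falls in a different coarse category from the gold tag, and "minor" when only the fine-grained tag differs. The tag grouping `COARSE`, the cost values, and the function names are all hypothetical and chosen only for illustration.

```python
# Hypothetical coarse grouping of a few Penn Treebank tags (illustration only;
# the paper's own tag categories are not given in this excerpt).
COARSE = {
    "NN": "noun", "NNS": "noun", "NNP": "noun",
    "VB": "verb", "VBD": "verb", "VBZ": "verb",
    "JJ": "adjective", "RB": "adverb",
}

SERIOUS_COST = 1.0  # assumed cost when the coarse category is wrong
MINOR_COST = 0.5    # assumed cost when only the fine-grained tag is wrong


def error_cost(gold: str, predicted: str) -> float:
    """Return the assumed cost of predicting `predicted` when `gold` is correct."""
    if gold == predicted:
        return 0.0
    gold_group = COARSE.get(gold)
    # Tags missing from COARSE are treated as serious errors by default.
    if gold_group is not None and gold_group == COARSE.get(predicted):
        return MINOR_COST
    return SERIOUS_COST


def total_cost(gold_tags, predicted_tags) -> float:
    """Sum the per-token costs over one tagged sentence."""
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("tag sequences must have the same length")
    return sum(error_cost(g, p) for g, p in zip(gold_tags, predicted_tags))
```

Under these assumptions, confusing NN with NNS costs less than confusing NN with VB, so two taggers with the same accuracy can receive different total costs. A training objective built on such costs would penalize the second kind of mistake more heavily.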
Previous studies on POS tagging have shown high performance with machine learning techniques (Ratnaparkhi, 1996; Brants, 2000; Lafferty et al., 2001). Among the types of machine learning approaches, supervised machine learning techniques were commonly used in early studies on POS tagging. With the .
