Confidence-Weighted Learning of Factored Discriminative Language Models
Viet Ha-Thuc
Computer Science Department, The University of Iowa, Iowa City, IA 52241, USA
hviet@

Nicola Cancedda
Xerox Research Centre Europe, 6 chemin de Maupertuis, 38240 Meylan, France

Abstract

Language models based on word surface forms only are unable to benefit from available linguistic knowledge, and tend to suffer from poor estimates for rare features. We propose an approach to overcome these two limitations. We use factored features that can flexibly capture linguistic regularities, and we adopt confidence-weighted learning, a form of discriminative online learning that can better take advantage of a heavy tail of rare features. Finally, we extend confidence-weighted learning to deal with label noise in training data, a common case with discriminative language modeling.

1 Introduction

Language Models (LMs) are key components in most statistical machine translation systems, where they play a crucial role in promoting output fluency. Standard n-gram generative language models have been extended in several ways. Generative factored language models (Bilmes and Kirchhoff, 2003) represent each token by multiple factors, such as part-of-speech, lemma, and surface form, and capture linguistic patterns in the target language at the appropriate level of abstraction. Instead of estimating likelihood, discriminative language models (Roark et al., 2004; Roark et al., 2007; Li and Khudanpur, 2008) directly model fluency by casting the task as a binary classification or a ranking problem.

The method we propose combines the advantages of both directions mentioned above. We use factored features to capture linguistic patterns, and discriminative learning for directly modeling fluency. We define highly overlapping and correlated factored features, and extend a robust learning algorithm to handle them and cope with a high rate of label noise. For discriminatively learning language models, we
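To make the first ingredient concrete, the following is a minimal sketch of how overlapping factored n-gram features can be generated from a factored token sequence. The factor names, the feature templates, and the factored_ngram_features helper are illustrative assumptions for this sketch, not the paper's implementation.

```python
# Illustrative sketch of factored n-gram feature extraction (hypothetical,
# not the paper's code). Each token carries three factors: surface form,
# lemma, and part-of-speech. Emitting one feature per combination of factors
# yields the highly overlapping, correlated feature sets the paper describes.

from itertools import product

# A token is a dict of factors; this tagged sentence is invented example data.
sentence = [
    {"surface": "the",   "lemma": "the",   "pos": "DT"},
    {"surface": "cats",  "lemma": "cat",   "pos": "NNS"},
    {"surface": "sleep", "lemma": "sleep", "pos": "VBP"},
]

FACTORS = ("surface", "lemma", "pos")

def factored_ngram_features(tokens, n=3):
    """Emit one feature per n-gram and per choice of one factor per position,
    so mixed-abstraction patterns like (DT, cat, VBP) are captured alongside
    plain surface n-grams."""
    features = []
    for i in range(len(tokens) - n + 1):
        ngram = tokens[i:i + n]
        for combo in product(FACTORS, repeat=n):
            values = tuple(tok[f] for tok, f in zip(ngram, combo))
            features.append("|".join(combo) + "=" + "_".join(values))
    return features

# A single trigram already yields 3^3 = 27 overlapping features.
for feat in factored_ngram_features(sentence)[:5]:
    print(feat)
```

Mixing factors in one template is what lets a single feature back off to the appropriate level of abstraction, rather than committing to surface forms everywhere.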
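The confidence-weighted learning the abstract refers to is the online algorithm of Dredze, Crammer, and Pereira (2008), which maintains a Gaussian distribution N(mu, Sigma) over weight vectors and updates it so that each example is classified correctly with probability at least eta. Below is a minimal sketch of the diagonal-covariance variant using the closed-form "variance" update from that paper; the DiagonalCW class is a hypothetical reconstruction for illustration and does not include the label-noise extension this paper proposes.

```python
# Sketch of a diagonal confidence-weighted (CW) update in the style of
# Dredze, Crammer, and Pereira (2008). Illustrative reconstruction only.

import math
from statistics import NormalDist

class DiagonalCW:
    def __init__(self, dim, eta=0.9, init_var=1.0):
        self.mu = [0.0] * dim            # mean of the weight distribution
        self.sigma = [init_var] * dim    # per-feature variance (diagonal Sigma)
        self.phi = NormalDist().inv_cdf(eta)  # confidence parameter Phi^{-1}(eta)

    def update(self, x, y):
        """One online step on a sparse example x = {index: value}, y in {-1,+1}."""
        M = y * sum(self.mu[j] * v for j, v in x.items())     # margin mean
        V = sum(self.sigma[j] * v * v for j, v in x.items())  # margin variance
        if V == 0.0:
            return
        phi = self.phi
        # Closed-form step size (the "variance" update of Dredze et al., 2008).
        root = (1 + 2 * phi * M) ** 2 - 8 * phi * (M - phi * V)
        alpha = max(0.0, (-(1 + 2 * phi * M) + math.sqrt(max(root, 0.0)))
                    / (4 * phi * V))
        if alpha == 0.0:
            return  # probabilistic margin constraint already satisfied
        for j, v in x.items():
            self.mu[j] += alpha * y * self.sigma[j] * v
            # Variance shrinks only for features that actually fire.
            self.sigma[j] = 1.0 / (1.0 / self.sigma[j]
                                   + 2 * alpha * phi * v * v)

# Toy usage on invented data: two sparse examples over a 4-feature space.
clf = DiagonalCW(dim=4)
clf.update({0: 1.0, 2: 1.0}, +1)
clf.update({1: 1.0, 3: 1.0}, -1)
```

Because a weight's variance shrinks only when its feature fires, rare features keep large variance and receive comparatively aggressive updates when they do occur; this is the property the abstract appeals to for exploiting a heavy tail of rare factored features.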