Scientific report: "Exploring Asymmetric Clustering for Statistical Language Modeling"

Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 183-190.

Exploring Asymmetric Clustering for Statistical Language Modeling

Jianfeng Gao, Microsoft Research, Asia, Beijing 100080, jfgao@
Joshua T. Goodman, Microsoft Research, Redmond, Washington 98052, USA, joshuago@
Guihong Cao, Department of Computer Science and Engineering of Tianjin University, China
Hang Li, Microsoft Research, Asia, Beijing 100080, hangli@

Abstract

The n-gram model is a stochastic model which predicts the next word (the predicted word) given the previous words (the conditional words) in a word sequence. The cluster n-gram model is a variant of the n-gram model in which similar words are classified in the same cluster. It has been demonstrated that using different clusters for predicted and conditional words leads to cluster models that are superior to classical cluster models, which use the same clusters for both words. This is the basis of the asymmetric cluster model (ACM) discussed in our study. In this paper, we first present a formal definition of the ACM. We then describe in detail the methodology of constructing the ACM. The effectiveness of the ACM is evaluated on a realistic application, namely Japanese Kana-Kanji conversion. Experimental results show substantial improvements of the ACM over classical cluster models and word n-gram models at the same model size. Our analysis shows that the high performance of the ACM lies in the asymmetry of the model.

1 Introduction

The n-gram model has been widely applied in many applications such as speech recognition, machine translation, and Asian language text input (Jelinek, 1990; Brown et al., 1990; Gao et al., 2002). It is a stochastic model which predicts the next word (the predicted word) given the previous n-1 words (the conditional words) in a word sequence.

The cluster n-gram model is a variant of the word n-gram model in which similar words are ...
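To make the idea of separate clusters for predicted and conditional words concrete, here is a minimal Python sketch of a cluster bigram. It rests on several assumptions that are not stated in this excerpt: the factorization P(w | prev) = P(PW(w) | CW(prev)) * P(w | CW(prev), PW(w)) is one common way to write such a model, the cluster maps `predicted_cluster` and `conditional_cluster` are hand-written toy assignments, the function names are hypothetical, and the estimates are plain relative frequencies with no smoothing. It is an illustration only, not the construction method the paper describes.

```python
from collections import Counter

def train_acm_bigram(tokens, predicted_cluster, conditional_cluster):
    """Collect the counts needed for a two-factor cluster bigram (toy sketch)."""
    ctx = Counter()        # count of conditional cluster CW(prev)
    ctx_pc = Counter()     # count of (CW(prev), PW(word))
    ctx_pc_w = Counter()   # count of (CW(prev), PW(word), word)
    for prev, word in zip(tokens, tokens[1:]):
        c = conditional_cluster[prev]
        p = predicted_cluster[word]
        ctx[c] += 1
        ctx_pc[(c, p)] += 1
        ctx_pc_w[(c, p, word)] += 1
    return ctx, ctx_pc, ctx_pc_w

def acm_bigram_prob(word, prev, counts, predicted_cluster, conditional_cluster):
    """Unsmoothed estimate of P(PW(word) | CW(prev)) * P(word | CW(prev), PW(word))."""
    ctx, ctx_pc, ctx_pc_w = counts
    c = conditional_cluster[prev]
    p = predicted_cluster[word]
    if ctx[c] == 0 or ctx_pc[(c, p)] == 0:
        return 0.0  # unseen event; a real model would back off or smooth here
    p_cluster = ctx_pc[(c, p)] / ctx[c]
    p_word = ctx_pc_w[(c, p, word)] / ctx_pc[(c, p)]
    return p_cluster * p_word

# Toy corpus and hand-made cluster maps (assumptions for illustration only).
tokens = "a cat sat the dog sat a dog ran".split()

# Clusters used when a word is being predicted.
predicted_cluster = {
    "a": "P_FUNC", "the": "P_FUNC",
    "cat": "P_NOUN", "dog": "P_NOUN",
    "sat": "P_SAT", "ran": "P_RAN",
}
# A different clustering used when a word is part of the history.
conditional_cluster = {
    "a": "C_DET", "the": "C_DET",
    "cat": "C_ANIMAL", "dog": "C_ANIMAL",
    "sat": "C_VERB", "ran": "C_VERB",
}

counts = train_acm_bigram(tokens, predicted_cluster, conditional_cluster)

# "the cat" never occurs in the toy corpus, but "a cat" does, and "a" and
# "the" share a conditional cluster, so the pair still gets probability mass.
p_the_cat = acm_bigram_prob("cat", "the", counts, predicted_cluster, conditional_cluster)
print(p_the_cat)
```

With unsmoothed relative frequencies the two factors multiply out to a single ratio of counts, so the factored form only starts to matter once each factor is smoothed or backed off separately. That part is left out of this sketch.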
