Analyzing the Errors of Unsupervised Learning

Percy Liang   Dan Klein
Computer Science Division, EECS Department
University of California at Berkeley
Berkeley, CA 94720
{pliang, klein}@...

Abstract

We identify four types of errors that unsupervised induction systems make and study each one in turn. Our contributions include (1) using a meta-model to analyze the incorrect biases of a model in a systematic way, (2) providing an efficient and robust method of measuring the distance between two parameter settings of a model, and (3) showing that the local optima issues which typically plague EM can be somewhat alleviated by increasing the number of training examples. We conduct our analyses on three models: the HMM, the PCFG, and a simple dependency model.

1 Introduction

The unsupervised induction of linguistic structure from raw text is an important problem, both for understanding language acquisition and for building language processing systems, such as parsers, from limited resources. Early work on inducing grammars via EM encountered two serious obstacles: the inappropriateness of the likelihood objective and the tendency of EM to get stuck in local optima. Without additional constraints on bracketing (Pereira and Schabes, 1992) or on allowable rewrite rules (Carroll and Charniak, 1992), unsupervised grammar learning was ineffective. Since then, there has been a large body of work addressing the flaws of the EM-based approach. Syntactic models that are empirically more learnable than PCFGs have been developed (Clark, 2001; Klein and Manning, 2004). Smith and Eisner (2005) proposed a new objective function; Smith and Eisner (2006) introduced a new training procedure. Bayesian approaches can also improve performance (Goldwater and Griffiths, 2007; Johnson, 2007; Kurihara and Sato, 2006). Though these methods have improved induction accuracy, at the core they all still involve optimizing non-convex objective functions related to the likelihood of some model, and thus are not completely immune to the difficulties ...
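To make the local-optima difficulty concrete, the following is a minimal sketch, not from the paper: it runs Baum-Welch EM on a toy two-state HMM with discrete emissions from several random initializations and prints the converged log-likelihoods. The model sizes, parameter values, and the helper functions sample_hmm and em are all invented for illustration; the point is simply that restarts of EM on the same data typically settle at different likelihoods because the objective is non-convex.

import numpy as np

rng = np.random.default_rng(0)

def sample_hmm(T, pi, A, B):
    # Draw one observation sequence of length T from an HMM (toy generator, illustrative only).
    obs = []
    s = rng.choice(len(pi), p=pi)
    for _ in range(T):
        obs.append(rng.choice(B.shape[1], p=B[s]))
        s = rng.choice(len(pi), p=A[s])
    return np.array(obs)

def em(obs_seqs, n_states, n_symbols, n_iters=50):
    # Baum-Welch EM from one random initialization; returns the final log-likelihood.
    pi = rng.dirichlet(np.ones(n_states))
    A = rng.dirichlet(np.ones(n_states), size=n_states)
    B = rng.dirichlet(np.ones(n_symbols), size=n_states)
    ll = -np.inf
    for _ in range(n_iters):
        pi_c = np.zeros(n_states)
        A_c = np.zeros((n_states, n_states))
        B_c = np.zeros((n_states, n_symbols))
        ll = 0.0
        for x in obs_seqs:
            T = len(x)
            # E-step: scaled forward-backward.
            alpha = np.zeros((T, n_states)); scale = np.zeros(T)
            alpha[0] = pi * B[:, x[0]]; scale[0] = alpha[0].sum(); alpha[0] /= scale[0]
            for t in range(1, T):
                alpha[t] = (alpha[t - 1] @ A) * B[:, x[t]]
                scale[t] = alpha[t].sum(); alpha[t] /= scale[t]
            beta = np.ones((T, n_states))
            for t in range(T - 2, -1, -1):
                beta[t] = (A @ (B[:, x[t + 1]] * beta[t + 1])) / scale[t + 1]
            ll += np.log(scale).sum()
            gamma = alpha * beta
            gamma /= gamma.sum(axis=1, keepdims=True)
            pi_c += gamma[0]
            for t in range(T - 1):
                A_c += alpha[t][:, None] * A * B[:, x[t + 1]] * beta[t + 1] / scale[t + 1]
            for t in range(T):
                B_c[:, x[t]] += gamma[t]
        # M-step: renormalize expected counts.
        pi = pi_c / pi_c.sum()
        A = A_c / A_c.sum(axis=1, keepdims=True)
        B = B_c / B_c.sum(axis=1, keepdims=True)
    return ll

# Generate data from a known two-state HMM over three symbols (arbitrary toy parameters).
true_pi = np.array([0.6, 0.4])
true_A = np.array([[0.8, 0.2], [0.3, 0.7]])
true_B = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
data = [sample_hmm(40, true_pi, true_A, true_B) for _ in range(50)]

# Ten random restarts usually converge to noticeably different log-likelihoods.
lls = sorted(em(data, n_states=2, n_symbols=3) for _ in range(10))
print(["%.1f" % v for v in lls])

In this sketch, increasing the number of training sequences tends to narrow the gap between the restarts, which is the kind of effect the abstract's third contribution refers to; the sketch is only a stand-in for the paper's actual experiments on the HMM, PCFG, and dependency models.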