
Spectral Learning of Latent-Variable PCFGs

Shay B. Cohen1, Karl Stratos1, Michael Collins1, Dean P. Foster2, and Lyle Ungar3
1Dept. of Computer Science, Columbia University
2Dept. of Statistics, 3Dept. of Computer and Information Science, University of Pennsylvania
{scohen, stratos, mcollins}@..., foster@..., ungar@...

Abstract

We introduce a spectral learning algorithm for latent-variable PCFGs (Petrov et al., 2006). Under a separability (singular value) condition, we prove that the method provides consistent parameter estimates.

1 Introduction

Statistical models with hidden or latent variables are of great importance in natural language processing, speech, and many other fields. The EM algorithm is a remarkably successful method for parameter estimation within these models: it is simple, it is often relatively efficient, and it has well-understood formal properties. It does, however, have a major limitation: it has no guarantee of finding the global optimum of the likelihood function. From a theoretical perspective, this means that the EM algorithm is not guaranteed to give consistent parameter estimates. From a practical perspective, problems with local optima can be difficult to deal with.

Recent work has introduced polynomial-time learning algorithms and consistent estimation methods for two important cases of hidden-variable models: Gaussian mixture models (Dasgupta, 1999; Vempala and Wang, 2004) and hidden Markov models (Hsu et al., 2009). These algorithms use spectral methods: that is, algorithms based on eigenvector decompositions of linear systems, in particular singular value decomposition (SVD). In the general case, learning of HMMs or GMMs is intractable (see, e.g., Terwijn, 2002). Spectral methods finesse the problem of intractability by assuming separability conditions. For example, the algorithm of Hsu et al. (2009) has a sample complexity that is polynomial in 1/σ, where σ is the minimum singular value of an underlying decomposition. These methods are not ...
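The separability (singular value) condition can be made concrete with a small numerical sketch. The snippet below is a minimal illustration, not the paper's algorithm: the matrix cooccurrence, the latent dimension m, and the threshold are hypothetical stand-ins. It estimates an SVD of an observed statistics matrix and inspects its m-th singular value, the quantity σ whose inverse enters the sample-complexity bound of Hsu et al. (2009).

import numpy as np

# Hypothetical example: a co-occurrence (moment) matrix of observable
# statistics, assumed here to have an underlying rank-m structure.
rng = np.random.default_rng(0)
m = 4                       # assumed number of latent states (hypothetical)
A = rng.random((50, m))     # stand-ins for the unknown factor matrices
B = rng.random((m, 60))
cooccurrence = A @ B        # rank-m matrix of "observable" statistics

# Singular value decomposition of the observed matrix.
U, s, Vt = np.linalg.svd(cooccurrence, full_matrices=False)

# Separability condition: the m-th singular value sigma must be bounded
# away from zero; sample complexity scales polynomially in 1/sigma.
sigma = s[m - 1]
print(f"m-th singular value: {sigma:.4f}")
if sigma < 1e-8:
    print("Near-singular: the separability condition fails here, so "
          "spectral estimates would be unreliable.")
else:
    # Rank-m projection of the kind spectral algorithms use to map
    # observable statistics into an m-dimensional space.
    U_m = U[:, :m]
    projected = U_m.T @ cooccurrence
    print("Projected statistics shape:", projected.shape)

In this sketch a well-conditioned rank-m matrix yields a clearly nonzero σ; as σ approaches zero, more samples would be needed for the same estimation accuracy, which is exactly the dependence the 1/σ term expresses.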
