Scaling Conditional Random Fields Using Error-Correcting Codes

Trevor Cohn
Department of Computer Science and Software Engineering
University of Melbourne, Australia
tacohn@

Andrew Smith
Division of Informatics
University of Edinburgh, United Kingdom

Miles Osborne
Division of Informatics
University of Edinburgh, United Kingdom
miles@

Abstract

Conditional Random Fields (CRFs) have been applied with considerable success to a number of natural language processing tasks. However, these tasks have mostly involved very small label sets. When deployed on tasks with larger label sets, the requirements for computational resources mean that training becomes intractable. This paper describes a method for training CRFs on such tasks using error-correcting output codes (ECOC). A number of CRFs are independently trained on the separate binary labelling tasks of distinguishing between a subset of the labels and its complement. During decoding, these models are combined to produce a predicted label sequence which is resilient to errors by individual models. Error-correcting CRF training is much less resource intensive and has a much faster training time than a standardly formulated CRF, while decoding performance remains quite comparable. This allows us to scale CRFs to previously impossible tasks, as demonstrated by our experiments with large label sets.

1 Introduction

Conditional random fields (CRFs) (Lafferty et al., 2001) are probabilistic models for labelling sequential data. CRFs are undirected graphical models that define a conditional distribution over label sequences given an observation sequence. They allow the use of arbitrary, overlapping, non-independent features as a result of their global conditioning. This allows us to avoid making unwarranted independence assumptions over the observation sequence, such as those required by typical generative models. Efficient inference and training methods exist when the graphical structure of the model forms a chain.
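For concreteness, the model class at issue is the linear-chain CRF of Lafferty et al. (2001); the following is the standard formulation, stated here for reference rather than quoted from this excerpt:

  \[
  p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})}
    \exp\Bigg( \sum_{t=1}^{T} \sum_{k} \lambda_k f_k(y_{t-1}, y_t, \mathbf{x}, t) \Bigg),
  \qquad
  Z(\mathbf{x}) = \sum_{\mathbf{y}'}
    \exp\Bigg( \sum_{t=1}^{T} \sum_{k} \lambda_k f_k(y'_{t-1}, y'_t, \mathbf{x}, t) \Bigg)
  \]

where the f_k are feature functions over adjacent label pairs and the observation sequence, the \lambda_k are their weights, and Z(x) is the partition function. Computing Z(x) with the forward-backward algorithm takes O(T|Y|^2) time for a sequence of length T over label set Y, so training cost grows quadratically with the number of labels; this is the bottleneck that the ECOC construction addresses by replacing one |Y|-class model with a set of binary ones.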
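The relabelling and decoding steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the code matrix is drawn at random, decoding is hard Hamming-distance decoding of each model's 0/1 outputs, and the binary sequence models themselves (independently trained binary CRFs in the paper) are abstracted away. All function names are illustrative.

import random

def make_code_matrix(num_labels, num_bits, seed=0):
    """One codeword (row of bits) per label. Column j splits the label
    set into a subset (bit 1) and its complement (bit 0), defining the
    binary labelling task that the j-th model is trained on."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(num_bits)]
            for _ in range(num_labels)]

def relabel(label_seq, code, column):
    """Project a multiclass label sequence onto one binary task; used to
    build the training data for the corresponding binary model."""
    return [code[y][column] for y in label_seq]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def decode(bit_predictions, code):
    """Combine the binary models' outputs into a label sequence.
    bit_predictions[j][t] is model j's 0/1 prediction at position t;
    each position gets the label whose codeword is nearest in Hamming
    distance, so a few erroneous bits can be corrected."""
    num_positions = len(bit_predictions[0])
    labels = []
    for t in range(num_positions):
        bits = [preds[t] for preds in bit_predictions]
        labels.append(min(range(len(code)),
                          key=lambda y: hamming(code[y], bits)))
    return labels

if __name__ == "__main__":
    code = make_code_matrix(num_labels=5, num_bits=10)
    true_labels = [0, 3, 2]
    # Simulate the binary models' outputs: the true bits, with one model
    # making an error at the second position.
    bits = [[code[y][j] for y in true_labels] for j in range(10)]
    bits[4][1] ^= 1
    print(decode(bits, code))  # with a well-separated code: [0, 3, 2]

A code whose rows differ in at least d positions corrects up to floor((d-1)/2) erroneous bits per position, which is where the resilience to individual model errors comes from; the random matrix above is used only for brevity, and in practice the code would be chosen so that its rows (and columns) are well separated.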