tailieunhanh - Scientific report: "Logarithmic Opinion Pools for Conditional Random Fields"

Andrew Smith, Division of Informatics, University of Edinburgh, United Kingdom
Trevor Cohn, Department of Computer Science and Software Engineering, University of Melbourne, Australia
Miles Osborne, Division of Informatics, University of Edinburgh, United Kingdom
tacohn@ miles@

Abstract

Recent work on Conditional Random Fields (CRFs) has demonstrated the need for regularisation to counter the tendency of these models to overfit. The standard approach to regularising CRFs involves a prior distribution over the model parameters, typically requiring search over a hyperparameter space. In this paper we address the overfitting problem from a different perspective, by factoring the CRF distribution into a weighted product of individual "expert" CRF distributions. We call this model a logarithmic opinion pool (LOP) of CRFs (LOP-CRFs). We apply the LOP-CRF to two sequencing tasks. Our results show that unregularised expert CRFs with an unregularised CRF under a LOP can outperform the unregularised CRF, and attain a performance level close to the regularised CRF. LOP-CRFs therefore provide a viable alternative to CRF regularisation without the need for hyperparameter search.

1 Introduction

In recent years, conditional random fields (CRFs) (Lafferty et al., 2001) have shown success on a number of natural language processing (NLP) tasks, including shallow parsing (Sha and Pereira, 2003), named entity recognition (McCallum and Li, 2003) and information extraction from research papers (Peng and McCallum, 2004). In general, this work has demonstrated the susceptibility of CRFs to overfit the training data during parameter estimation. As a consequence, it is now standard to use some form of overfitting reduction in CRF training.

Recently, there have been a number of sophisticated approaches to reducing overfitting in CRFs, including automatic feature induction (McCallum, 2003) and a full Bayesian approach to .
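
To make the "weighted product of expert distributions" from the abstract concrete, the following is a minimal sketch of a logarithmic opinion pool over a small discrete label set. It is an illustration under stated assumptions, not the authors' implementation: the function name `lop_combine`, the toy labels and the probabilities are all hypothetical, the weights are assumed to be non-negative and to sum to one, and every expert is assumed to assign strictly positive probability to every outcome. In an actual LOP-CRF the outcomes would be whole label sequences and the normaliser would be computed by dynamic programming rather than by enumeration.

```python
import math


def lop_combine(expert_dists, weights):
    """Pool expert distributions as a normalised weighted product.

    expert_dists: list of dicts mapping outcome -> probability (all > 0),
                  every dict defined over the same outcomes.
    weights:      one weight per expert (assumed non-negative, summing to 1).
    """
    if len(expert_dists) != len(weights):
        raise ValueError("need exactly one weight per expert")

    outcomes = list(expert_dists[0])

    # A weighted sum of log-probabilities is the log of the weighted product.
    log_scores = {
        y: sum(w * math.log(p[y]) for p, w in zip(expert_dists, weights))
        for y in outcomes
    }

    # Normalise with log-sum-exp so the pooled values sum to one.
    m = max(log_scores.values())
    log_z = m + math.log(sum(math.exp(s - m) for s in log_scores.values()))
    return {y: math.exp(s - log_z) for y, s in log_scores.items()}


# Hypothetical toy experts over three chunk labels.
experts = [
    {"B": 0.7, "I": 0.2, "O": 0.1},
    {"B": 0.5, "I": 0.4, "O": 0.1},
]
pooled = lop_combine(experts, [0.5, 0.5])
```

Because the pool multiplies the experts' opinions, an outcome that any single expert considers very unlikely receives a low pooled probability, which is the behaviour that distinguishes a logarithmic pool from a simple weighted average of the distributions.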