Generalized Expectation Criteria for Semi-Supervised Learning of Conditional Random Fields

Gideon S. Mann
Google Inc., 76 Ninth Avenue, New York, NY 10011

Andrew McCallum
Department of Computer Science, University of Massachusetts, 140 Governors Drive, Amherst, MA 01003

Abstract

This paper presents a semi-supervised training method for linear-chain conditional random fields that makes use of labeled features rather than labeled instances. This is accomplished by using generalized expectation criteria to express a preference for parameter settings in which the model's distribution on unlabeled data matches a target distribution. We induce target conditional probability distributions of labels given features from both annotated feature occurrences in context and ad-hoc feature majority label assignment. The use of generalized expectation criteria allows for a dramatic reduction in annotation time by shifting from traditional instance-labeling to feature-labeling, and the methods presented outperform traditional CRF training and other semi-supervised methods when limited human effort is available.

1 Introduction

A significant barrier to applying machine learning to new real-world domains is the cost of obtaining the necessary training data. To address this problem, work over the past several years has explored semi-supervised or unsupervised approaches to the same problems, seeking to improve accuracy with the addition of lower-cost unlabeled data.
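The abstract's idea of preferring parameter settings whose predictions on unlabeled data match a target distribution can be illustrated with a minimal sketch. It rests on assumptions this section does not state: KL divergence is used as the mismatch measure, the majority label receives a fixed probability mass of 0.9 with the remainder spread uniformly, and the per-token marginals are made-up numbers standing in for the output of a trained CRF. All function names are hypothetical.

```python
import math

# Label set taken from the paper's running example.
LABELS = ["ADDRESS", "RENT"]


def majority_label_target(majority, mass=0.9):
    """Target p(label | feature): `mass` on the majority label, rest uniform.

    The value 0.9 is an illustrative assumption, not a figure from the paper.
    """
    rest = (1.0 - mass) / (len(LABELS) - 1)
    return {y: (mass if y == majority else rest) for y in LABELS}


def model_expectation(tokens, marginals, feature):
    """Average the model's label marginals over positions where `feature` occurs."""
    totals = {y: 0.0 for y in LABELS}
    count = 0
    for token, dist in zip(tokens, marginals):
        if token == feature:
            count += 1
            for y in LABELS:
                totals[y] += dist[y]
    if count == 0:
        raise ValueError(f"feature {feature!r} does not occur in the data")
    return {y: totals[y] / count for y in LABELS}


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted); zero only when the two distributions agree."""
    return sum(
        p * math.log(p / max(predicted[y], eps))
        for y, p in target.items()
        if p > 0
    )


# Unlabeled tokens and hypothetical per-token marginals from some model.
tokens = ["address", "number", "oak", "avenue", "rent", "avenue"]
marginals = [
    {"ADDRESS": 0.9, "RENT": 0.1},
    {"ADDRESS": 0.7, "RENT": 0.3},
    {"ADDRESS": 0.6, "RENT": 0.4},
    {"ADDRESS": 0.8, "RENT": 0.2},
    {"ADDRESS": 0.2, "RENT": 0.8},
    {"ADDRESS": 0.6, "RENT": 0.4},
]

target = majority_label_target("ADDRESS")
expected = model_expectation(tokens, marginals, "avenue")
penalty = kl_divergence(target, expected)
print(expected, penalty)
```

In a training loop, a penalty of this kind would be added to the objective so that parameter updates pull the model's expected label distribution for "avenue" toward the target. Only the single feature and its majority label are supplied by the annotator; no token sequence is labeled in full.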
Traditional approaches to semi-supervised learning are applied to cases in which there is a small amount of fully labeled data and a much larger amount of unlabeled data, presumably from the same data source. For example, EM (Nigam et al., 1998), transductive SVMs (Joachims, 1999), entropy regularization (Grandvalet and Bengio, 2004), and graph-based

[Figure: two annotation schemes shown on the tokens "address number oak avenue rent". Traditional Full Instance Labeling: every token carries a label (ADDRESS or RENT). Feature Labeling: only selected tokens carry a label (ADDRESS).]