Scientific report: "Semi-supervised Relation Extraction with Large-scale Word Clustering"
Ang Sun, Ralph Grishman, Satoshi Sekine
Computer Science Department, New York University
asun grishman sekine @

Abstract

We present a simple semi-supervised relation extraction system with large-scale word clustering. We focus on systematically exploring the effectiveness of different cluster-based features. We also propose several statistical methods for selecting clusters at an appropriate level of granularity. When training on different sizes of data, our semi-supervised approach consistently outperformed a state-of-the-art supervised baseline system.

1 Introduction

Relation extraction is an important information extraction task in natural language processing (NLP), with many practical applications. The goal of relation extraction is to detect and characterize semantic relations between pairs of entities in text. For example, a relation extraction system needs to be able to extract an Employment relation between the entities "US soldier" and "US" in the phrase "US soldier".

Current supervised approaches for tackling this problem, in general, fall into two categories: feature based and kernel based. Given an entity pair and a sentence containing the pair, both approaches usually start with multiple level analyses of the sentence, such as tokenization, partial or full syntactic parsing, and dependency parsing.

Then the feature based method explicitly extracts a variety of lexical, syntactic and semantic features for statistical learning, either generative or discriminative (Miller et al., 2000; Kambhatla, 2004; Boschee et al., 2005; Grishman et al., 2005; Zhou et al., 2005; Jiang and Zhai, 2007). In contrast, the kernel based method does not explicitly extract features; it designs kernel functions over the structured sentence representations (sequence, dependency or parse tree) to capture the similarities between different relation instances (Zelenko et al., 2003; Bunescu and Mooney, 2005a; Bunescu and Mooney, 2005b; Zhao and .
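
As a rough illustration of the feature based approach described above, the following minimal sketch builds a sparse feature dictionary for one entity pair and adds cluster-based features for the head words. Everything in it is an assumption made for illustration: the function name `extract_features`, the `TOY_CLUSTERS` table and its bit-string values, the choice of the last token of a mention as its head, and the idea that coarser clusters can be obtained by truncating a bit string to `prefix_len` characters. None of this is taken from the authors' system.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive

# Hypothetical word-to-cluster table. The bit strings are made-up values that
# imitate a hierarchical clustering, where a shorter prefix is a coarser cluster.
TOY_CLUSTERS: Dict[str, str] = {
    "soldier": "110100",
    "officer": "110101",
    "US": "0111",
}


def extract_features(
    tokens: List[str],
    arg1: Span,
    arg2: Span,
    clusters: Dict[str, str],
    prefix_len: int = 4,
) -> Dict[str, float]:
    """Build binary lexical and cluster features for one entity pair."""
    # Assumption: the last token of each mention is treated as its head word.
    head1 = tokens[arg1[1] - 1]
    head2 = tokens[arg2[1] - 1]
    # Tokens strictly between the two mentions (empty if they overlap or touch).
    between = tokens[min(arg1[1], arg2[1]):max(arg1[0], arg2[0])]

    feats: Dict[str, float] = {
        f"head1={head1}": 1.0,
        f"head2={head2}": 1.0,
        f"head_pair={head1}|{head2}": 1.0,
        f"num_between={len(between)}": 1.0,
    }
    # Cluster features: replace each head word by a prefix of its cluster id,
    # so that words sharing the prefix fire the same feature.
    for name, head in (("head1", head1), ("head2", head2)):
        bits = clusters.get(head)
        if bits is not None:
            feats[f"{name}_cluster={bits[:prefix_len]}"] = 1.0
    return feats


# The "US soldier" example: the mention "US soldier" contains the mention "US".
features = extract_features(["US", "soldier"], (0, 2), (0, 1), TOY_CLUSTERS)
```

With these toy values, "soldier" and "officer" share the four-character prefix, so a classifier trained on one could generalize to the other through the cluster feature even if the word itself was never seen in the labeled data. A shorter or longer `prefix_len` would give coarser or finer clusters, which is one simple way to think about cluster granularity.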