Scientific report: "Counter-Training in Discovery of Semantic Patterns"
Counter-Training in Discovery of Semantic Patterns

Roman Yangarber
Courant Institute of Mathematical Sciences
New York University
roman@cs.nyu.edu

Abstract

This paper presents a method for unsupervised discovery of semantic patterns. Semantic patterns are useful for a variety of text understanding tasks, in particular for locating events in text for information extraction. The method builds upon previously described approaches to iterative unsupervised pattern acquisition. One common characteristic of prior approaches is that the output of the algorithm is a continuous stream of patterns, with gradually degrading precision.

Our method differs from the previous pattern acquisition algorithms in that it introduces competition among several scenarios simultaneously. This provides natural stopping criteria for the unsupervised learners, while maintaining good precision levels at termination. We discuss the results of experiments with several scenarios, and examine different aspects of the new procedure.

1 Introduction

The work described in this paper is motivated by research into automatic pattern acquisition. Pattern acquisition is considered important for a variety of text understanding tasks, though our particular reference will be to Information Extraction (IE). In IE, the objective is to search through text for entities and events of a particular kind, corresponding to the user's interest. Many current systems achieve this by pattern matching.
The problem of recall, or coverage, in IE can then be restated, to a large extent, as a problem of acquiring a comprehensive set of good patterns which are relevant to the scenario of interest, i.e., which describe events occurring in this scenario. Among the approaches to pattern acquisition recently proposed, unsupervised methods¹ have gained some popularity, due to the substantial reduction in the amount of manual labor they require. We build upon these approaches for learning IE patterns. The focus of this paper is on the problem of .