Scientific report: "Using Predicate-Argument Structures for Information Extraction"

Mihai Surdeanu, Sanda Harabagiu, John Williams and Paul Aarseth
Language Computer Corp., Richardson, Texas 75080, USA
mihai sanda@

Abstract

In this paper we present a novel, customizable IE paradigm that takes advantage of predicate-argument structures. We also introduce a new way of automatically identifying predicate-argument structures, which is central to our IE paradigm. It is based on: (1) an extended set of features; and (2) inductive decision tree learning. The experimental results prove our claim that accurate predicate-argument structures enable high quality IE results.

1 Introduction

The goal of recent Information Extraction (IE) tasks was to provide event-level indexing into news stories, including news wire, radio and television sources. In this context, the purpose of the HUB Event-99 evaluations (Hirschman et al., 1999) was to capture information on some newsworthy classes of events, such as natural disasters, deaths, bombings, elections, financial fluctuations or illness outbreaks. The identification and selective extraction of relevant information is dictated by templettes. Event templettes are frame-like structures with slots representing the basic event information, such as main event participants, event outcome, time and location. For each type of event, a separate templette is defined. The slot fills consist of excerpts from text with pointers back into the original source material. Templettes are designed to support event-based browsing and search.
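The templette description above (a frame with named slots, each filled by a text excerpt that points back into the source) can be sketched as a small data structure. This is only an illustrative sketch, not the authors' implementation: the class names `SlotFill` and `Templette`, the use of character offsets as the "pointer", and the sample sentence are all assumptions introduced here. The slot names follow the market-change templette.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class SlotFill:
    """Hypothetical slot fill: an excerpt plus character offsets into the source."""
    text: str
    start: int  # inclusive offset into the source text
    end: int    # exclusive offset into the source text


@dataclass
class Templette:
    """Hypothetical frame-like structure: one templette per event type."""
    event_type: str
    source: str
    slots: Dict[str, Optional[SlotFill]] = field(default_factory=dict)

    def fill(self, slot: str, excerpt: str) -> SlotFill:
        """Fill a slot with the first occurrence of `excerpt` in the source."""
        if slot not in self.slots:
            raise KeyError(f"unknown slot: {slot}")
        start = self.source.find(excerpt)
        if start < 0:
            raise ValueError(f"excerpt not found in source: {excerpt!r}")
        filled = SlotFill(excerpt, start, start + len(excerpt))
        self.slots[slot] = filled
        return filled


# Assumed sample text, loosely modeled on a market report; not from the paper.
source = "Time for our daily market report. London gold fell today."
slot_names = ("INSTRUMENT", "AMOUNT_CHANGE", "CURRENT_VALUE", "LOCATION", "DATE")
templette = Templette("market_change", source, {name: None for name in slot_names})

templette.fill("INSTRUMENT", "London gold")
templette.fill("LOCATION", "London")
templette.fill("DATE", "daily")

for name, value in templette.slots.items():
    print(name, "->", value)
```

Storing offsets rather than copies alone keeps each fill traceable to its position in the source, which is what event-based browsing needs; slots the text does not support simply stay empty.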
Figure 1 illustrates a templette defined for market changes as well as the source of the slot fillers.

INSTRUMENT: London gold
AMOUNT_CHANGE: fell cents
CURRENT_VALUE:
LOCATION: London
DATE: daily

Source text: Time for our daily market report from NASDAQ. London gold fell cents to

Figure 1: Templette filled with information about a market change event.

To date some of
