Scientific report: "Using Predicate-Argument Structures for Information Extraction"


Using Predicate-Argument Structures for Information Extraction

Mihai Surdeanu and Sanda Harabagiu and John Williams and Paul Aarseth
Language Computer Corp.
Richardson, Texas 75080, USA
{mihai,sanda}@languagecomputer.com

Abstract

In this paper we present a novel, customizable IE paradigm that takes advantage of predicate-argument structures. We also introduce a new way of automatically identifying predicate-argument structures, which is central to our IE paradigm. It is based on: (1) an extended set of features; and (2) inductive decision tree learning. The experimental results prove our claim that accurate predicate-argument structures enable high quality IE results.

1 Introduction

The goal of recent Information Extraction (IE) tasks was to provide event-level indexing into news stories, including news wire, radio and television sources. In this context, the purpose of the HUB Event-99 evaluations (Hirschman et al., 1999) was to capture information on some newsworthy classes of events, e.g. natural disasters, deaths, bombings, elections, financial fluctuations or illness outbreaks. The identification and selective extraction of relevant information is dictated by templettes. Event templettes are frame-like structures with slots representing the basic event information, such as main event participants, event outcome, time and location. For each type of event, a separate templette is defined. The slot fills consist of excerpts from text with pointers back into the original source material.
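As a rough illustration of the templette structure described above, the following Python sketch models a templette as an event type plus named slots, where each slot fill is a text excerpt with offsets into the source. The names `SlotFill`, `Templette` and `fill` are hypothetical, and representing the "pointers back into the original source material" as character offsets is an assumption; the paper does not specify the representation. The slot names and sentence are borrowed from the market change example in Figure 1.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class SlotFill:
    """A text excerpt plus character offsets into the source (assumed pointer format)."""
    text: str
    start: int  # inclusive character offset in the source text
    end: int    # exclusive character offset in the source text


@dataclass
class Templette:
    """A frame-like event record: one event type and a set of named slots."""
    event_type: str
    slots: Dict[str, SlotFill] = field(default_factory=dict)

    def fill(self, slot: str, source: str, excerpt: str) -> None:
        """Fill `slot` with the first occurrence of `excerpt` in `source`."""
        start = source.find(excerpt)
        if start < 0:
            raise ValueError(f"excerpt {excerpt!r} not found in source")
        self.slots[slot] = SlotFill(excerpt, start, start + len(excerpt))


# Hypothetical usage with the sentence shown in Figure 1.
source = "London gold fell 4.70 cents to 308.35"
t = Templette("MARKET_CHANGE")
t.fill("INSTRUMENT", source, "London gold")
t.fill("AMOUNT_CHANGE", source, "fell 4.70 cents")
t.fill("LOCATION", source, "London")
```

Because each fill keeps its offsets, slicing the source with `start` and `end` recovers the excerpt exactly, which is what makes browsing from a slot back to the original story possible.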
Templettes are designed to support event-based browsing and search. Figure 1 illustrates a templette defined for market changes as well as the source of the slot fillers.

MARKET_CHANGE_PRI199804281700.1717-1
  INSTRUMENT: London gold
  AMOUNT_CHANGE: fell 4.70 cents
  CURRENT_VALUE: 308.45
  LOCATION: London
  DATE: daily

Source text: Time for our daily market report from NASDAQ. London gold fell 4.70 cents to 308.35

Figure 1: Templette filled with information about a market change event.

To date some of