Scientific report: "An Entity-Level Approach to Information Extraction"

Aria Haghighi, UC Berkeley, CS Division, aria42@cs.berkeley.edu
Dan Klein, UC Berkeley, CS Division, klein@cs.berkeley.edu

Abstract

We present a generative model of template-filling in which coreference resolution and role assignment are jointly determined. Underlying template roles first generate abstract entities, which in turn generate concrete textual mentions. On the standard corporate acquisitions dataset, joint resolution in our entity-level model reduces error over a mention-level discriminative approach by up to 20%.

1 Introduction

Template-filling information extraction (IE) systems must merge information across multiple sentences to identify all role fillers of interest. For instance, in the MUC4 terrorism event extraction task, the entity filling the individual perpetrator role often occurs multiple times, variously as proper, nominal, or pronominal mentions. However, most template-filling systems (Freitag and McCallum, 2000; Patwardhan and Riloff, 2007) assign roles to individual textual mentions using only local context as evidence, leaving aggregation for post-processing. While prior work has acknowledged that coreference resolution and discourse analysis are integral to accurate role identification, to our knowledge no model has been proposed which jointly models these phenomena.

In this work, we describe an entity-centered approach to template-filling IE problems. Our model jointly merges surface mentions into underlying entities (coreference resolution) and assigns roles to those discovered entities.
In the generative process proposed here, document entities are generated for each template role, along with a set of non-template entities. These entities then generate mentions in a process sensitive to both lexical and structural properties of the mention. Our model outperforms a discriminative mention-level baseline. Moreover, since our model is generative, it

[Figure: panels (a) and (b); template with roles SELLER, BUSINESS, ACQUIRED, PURCHASER; example text "CSR Limited", "Oil and"]
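To make the generative story above concrete, the following is a minimal toy sketch in Python, not the model described in the paper. It assumes uniform random choices in place of learned distributions, covers only the lexical side of mention generation (no structural properties), and uses invented vocabularies: the names `PROPER_NAMES`, `NOMINAL_HEADS`, `Entity`, `Mention`, `generate_entities`, `generate_mention`, and `generate_document`, as well as all company names, are hypothetical.

```python
import random
from dataclasses import dataclass

# Hypothetical role inventory and vocabularies, invented for illustration.
ROLES = ["SELLER", "PURCHASER", "ACQUIRED"]
NO_ROLE = "NONE"  # label for non-template entities

PROPER_NAMES = {
    "SELLER": ["Acme Holdings", "Borealis Group"],
    "PURCHASER": ["Cobalt Partners", "Dunmore Inc."],
    "ACQUIRED": ["Elm Street Foods", "Fathom Labs"],
    NO_ROLE: ["Reuters", "Wall Street"],
}
NOMINAL_HEADS = {
    "SELLER": ["seller", "parent company"],
    "PURCHASER": ["buyer", "investor group"],
    "ACQUIRED": ["unit", "subsidiary"],
    NO_ROLE: ["analyst", "market"],
}


@dataclass(frozen=True)
class Entity:
    role: str      # template role, or NO_ROLE
    proper: str    # proper-name realization
    nominal: str   # nominal head realization


@dataclass(frozen=True)
class Mention:
    entity: Entity
    kind: str      # "proper", "nominal", or "pronominal"
    text: str


def generate_entities(rng, num_other=1):
    """Step 1: one entity per template role, plus non-template entities."""
    entities = [
        Entity(role, rng.choice(PROPER_NAMES[role]), rng.choice(NOMINAL_HEADS[role]))
        for role in ROLES
    ]
    for _ in range(num_other):
        entities.append(
            Entity(NO_ROLE, rng.choice(PROPER_NAMES[NO_ROLE]),
                   rng.choice(NOMINAL_HEADS[NO_ROLE]))
        )
    return entities


def generate_mention(rng, entity):
    """Step 2: an entity generates one surface mention."""
    kind = rng.choice(["proper", "nominal", "pronominal"])
    if kind == "proper":
        text = entity.proper
    elif kind == "nominal":
        text = "the " + entity.nominal
    else:
        text = "it"
    return Mention(entity, kind, text)


def generate_document(seed=0, num_mentions=6):
    """Sample entities, then sample mentions that each refer to one entity."""
    rng = random.Random(seed)
    entities = generate_entities(rng)
    mentions = [generate_mention(rng, rng.choice(entities))
                for _ in range(num_mentions)]
    return entities, mentions


if __name__ == "__main__":
    entities, mentions = generate_document(seed=0)
    for m in mentions:
        print(f"{m.text!r:25} kind={m.kind:11} role={m.entity.role}")
```

In this sketch, inference would run the process in reverse: given only the mention texts, recover which mentions share an entity and which role that entity fills. The sampler shows only the direction of generation, from roles to entities to mentions.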