Scientific report: "Collective Classification for Fine-grained Information Status"

Katja Markert (1, 2), Yufang Hou (2), Michael Strube (2)
(1) School of Computing, University of Leeds, UK, scskm@leeds .
(2) Heidelberg Institute for Theoretical Studies gGmbH, Heidelberg, Germany

Abstract

Previous work on classifying information status (Nissim, 2006; Rahman and Ng, 2011) is restricted to coarse-grained classification and focuses on conversational dialogue. We here introduce the task of classifying fine-grained information status and work on written text. We add a fine-grained information status layer to the Wall Street Journal portion of the OntoNotes corpus. We claim that the information status of a mention depends not only on the mention itself but also on other mentions in the vicinity, and solve the task by collectively classifying the information status of all mentions. Our approach strongly outperforms reimplementations of previous work.

1 Introduction

Speakers present already known and yet to be established information according to principles referred to as information structure (Prince, 1981; Lambrecht, 1994; Kruijff-Korbayova and Steedman, 2003, inter alia). While information structure affects all kinds of constituents in a sentence, we here adopt the more restricted notion of information status, which concerns only discourse entities realized as noun phrases, i.e. mentions [1]. Information status (IS henceforth) describes the degree to which a discourse entity is available to the hearer with regard to the speaker's assumptions about the hearer's knowledge and beliefs (Nissim et al., 2004).

Old mentions are known to the hearer and have been referred to previously. Mediated mentions have not been mentioned before but are also not autonomous, i.e. they can only be correctly interpreted by reference to another mention or to prior world knowledge. All other mentions are new. IS

[1] Since not all noun phrases are referential, we call noun phrases which carry information status mentions.
