Scientific report: "Learning the Fine-Grained Information Status of Discourse Entities"

Altaf Rahman and Vincent Ng
Human Language Technology Research Institute
University of Texas at Dallas
Richardson, TX 75083-0688
altaf, vince @ [...]

Abstract

While information status (IS) plays a crucial role in discourse processing, there have only been a handful of attempts to automatically determine the IS of discourse entities. We examine a related but more challenging task, fine-grained IS determination, which involves classifying a discourse entity as one of 16 IS subtypes. We investigate the use of rich knowledge sources for this task in combination with a rule-based approach and a learning-based approach. In experiments with a set of Switchboard dialogues, the learning-based approach achieves an accuracy of [...], outperforming the rule-based approach by [...].

1 Introduction

A linguistic notion central to discourse processing is information status (IS). It describes the extent to which a discourse entity, which is typically referred to by noun phrases (NPs) in a dialogue, is available to the hearer. Different definitions of IS have been proposed over the years. In this paper, we adopt Nissim et al.'s (2004) proposal, since it is primarily built upon Prince's (1992) and Eckert and Strube's (2001) well-known definitions and is empirically shown by Nissim et al. to yield an annotation scheme for IS in dialogue that has good [...]. (It is worth noting that several IS annotation schemes have been proposed more recently. See Götze et al. (2007) and Riester et al. (2010) for details.)

Specifically, Nissim et al. (2004) adopt a three-way classification scheme for IS, defining a discourse entity as (1) old to the hearer if it is known to the hearer and has previously been referred to in the dialogue; (2) new if it is unknown to her and has not been previously referred to; and (3) mediated (henceforth med) if it is newly mentioned in the dialogue but she can infer its identity from a previously-mentioned entity. To [...]
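The three-way scheme described above can be written out as a small decision function. This is only an illustrative sketch of the definitions as quoted here, not the rule-based or learning-based system of the paper. The names `CoarseIS` and `coarse_is` and the three boolean flags are hypothetical, and the flags are assumed to be supplied by a human annotator or some upstream component. Combinations that the quoted definitions do not cover return `None` rather than a guessed label.

```python
from enum import Enum
from typing import Optional


class CoarseIS(Enum):
    """Coarse-grained IS labels from the three-way scheme (names are illustrative)."""
    OLD = "old"
    MED = "med"
    NEW = "new"


def coarse_is(known_to_hearer: bool,
              previously_mentioned: bool,
              inferable_from_prior_entity: bool) -> Optional[CoarseIS]:
    """Map three hypothetical boolean judgments to a coarse IS label.

    Assumption: the flags are given; deciding them is the hard part
    and is not modeled here.
    """
    # old: known to the hearer and already referred to in the dialogue
    if known_to_hearer and previously_mentioned:
        return CoarseIS.OLD
    # med: newly mentioned, but identity inferable from an earlier entity
    if not previously_mentioned and inferable_from_prior_entity:
        return CoarseIS.MED
    # new: unknown to the hearer and not previously referred to
    if not known_to_hearer and not previously_mentioned:
        return CoarseIS.NEW
    # Combination not covered by the definitions quoted in the text
    return None


if __name__ == "__main__":
    print(coarse_is(True, True, False))
    print(coarse_is(False, False, True))
    print(coarse_is(False, False, False))
```

The fine-grained task studied in the paper refines this picture into 16 subtypes, which a lookup of this kind does not attempt to capture.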
