Scientific report: "Discourse Generation Using Utility-Trained Coherence Models"

Radu Soricut
Information Sciences Institute, University of Southern California
4676 Admiralty Way, Suite 1001, Marina del Rey, CA 90292
radu@

Daniel Marcu
Information Sciences Institute, University of Southern California
4676 Admiralty Way, Suite 1001, Marina del Rey, CA 90292
marcu@

Abstract

We describe a generic framework for integrating various stochastic models of discourse coherence in a manner that takes advantage of their individual strengths. An integral part of this framework are algorithms for searching and training these stochastic coherence models. We evaluate the performance of our models and algorithms and show empirically that utility-trained log-linear coherence models outperform each of the individual coherence models considered.

1 Introduction

Various theories of discourse coherence (Mann and Thompson, 1988; Grosz et al., 1995) have been applied successfully in discourse analysis (Marcu, 2000; Forbes et al., 2001) and discourse generation (Scott and de Souza, 1990; Kibble and Power, 2004). Most of these efforts, however, have limited applicability. Those that use manually written rules model only the most visible discourse constraints (e.g., the discourse connective "although" marks a CONCESSION relation), while being oblivious to fine-grained lexical indicators. And the methods that utilize manually annotated corpora (Carlson et al., 2003; Karamanis et al., 2004) and supervised learning algorithms have high costs associated with the annotation procedure, and cannot be easily adapted to different domains and genres.

In contrast, more recent research has focused on stochastic approaches that model discourse coherence at the local lexical (Lapata, 2003) and global levels (Barzilay and Lee, 2004), while preserving regularities recognized by classic discourse theories (Barzilay and Lapata, 2005). These stochastic coherence models use simple, non-hierarchical representations of discourse, and can be trained with minimal human
