Scientific report: "The Rhetorical Parsing of Natural Language Texts"

Daniel Marcu
Department of Computer Science, University of Toronto, Toronto, Ontario, Canada M5S 3G4

Abstract

We derive the rhetorical structures of texts by means of two new, surface-form-based algorithms: one that identifies discourse usages of cue phrases and breaks sentences into clauses, and one that produces valid rhetorical structure trees for unrestricted natural language texts. The algorithms use information that was derived from a corpus analysis of cue phrases.

1 Introduction

Researchers of natural language have repeatedly acknowledged that texts are not just a sequence of words, nor even a sequence of clauses and sentences. However, despite the impressive number of discourse-related theories that have been proposed so far, there have emerged no algorithms capable of deriving the discourse structure of an unrestricted text. On one hand, efforts such as those described by Asher (1993), Lascarides, Asher, and Oberlander (1992), Kamp and Reyle (1993), Grover et al. (1994), and Prüst, Scha, and van den Berg (1994) take the position that discourse structures can be built only in conjunction with fully specified clause and sentence structures. And Hobbs's theory (1990) assumes that sophisticated knowledge bases and inference mechanisms are needed for determining the relations between discourse units. Despite the formal elegance of these approaches, they are very domain dependent and therefore unable to handle more than a few restricted examples.
On the other hand, although the theories described by Grosz and Sidner (1986), Polanyi (1988), and Mann and Thompson (1988) are successfully applied manually, they are too informal to support an automatic approach to discourse analysis. In contrast with this previous work, the rhetorical parser that we present builds discourse trees for unrestricted texts. We first discuss the key concepts on which our approach relies (section 2) and the corpus analysis (section 3) that provides the empirical ...
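
To make the idea of surface-form-based clause breaking more concrete, the following is a minimal sketch in Python, not the algorithm of the paper. It assumes a small hand-written cue-phrase list (`CUE_PHRASES`) and a hypothetical helper `split_on_cue_phrases`; the paper instead derives its cue-phrase information from a corpus analysis and separates discourse usages of a cue phrase from other usages, which this sketch does not attempt.

```python
import re

# Hypothetical cue-phrase list, for illustration only. The paper derives
# its cue-phrase information from a corpus analysis instead.
CUE_PHRASES = ["although", "because", "but", "however", "in contrast"]


def split_on_cue_phrases(sentence, cues=CUE_PHRASES):
    """Split a sentence into clause-like units at every cue-phrase match.

    Simplifying assumption: every match is treated as a discourse usage.
    """
    # Try longer cues first so multiword phrases are matched as a whole.
    ordered = sorted(cues, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(c) for c in ordered) + r")\b"

    units = []
    start = 0
    for match in re.finditer(pattern, sentence, flags=re.IGNORECASE):
        # Close the unit that ends just before this cue phrase.
        chunk = sentence[start:match.start()].strip(" ,;")
        if chunk:
            units.append(chunk)
        # The cue phrase opens the next unit.
        start = match.start()

    tail = sentence[start:].strip(" ,;")
    if tail:
        units.append(tail)
    return units


if __name__ == "__main__":
    text = "The parser is fast but it is too informal because it ignores syntax."
    for unit in split_on_cue_phrases(text):
        print(unit)
```

Each unit after the first starts with the cue phrase that introduced it, so a later step could use that phrase as a hint about the rhetorical relation the unit takes part in. A sentence with no cue phrase comes back as a single unit.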
