Scientific report: "Clustering Clauses for High-Level Relation Detection: An Information-theoretic Approach"
Samuel Brody
School of Informatics, University of Edinburgh

Abstract

Recently, there has been a rise of interest in unsupervised detection of high-level semantic relations involving complex units, such as phrases and whole sentences. Typically, such approaches are faced with two main obstacles: data sparseness and correctly generalizing from the examples. In this work, we describe the Clustered Clause representation, which utilizes information-based clustering and inter-sentence dependencies to create a simplified and generalized representation of the grammatical clause. We implement an algorithm which uses this representation to detect a predefined set of high-level relations, and demonstrate our model's effectiveness in overcoming both the problems mentioned.

1 Introduction

The semantic relationship between words, and the extraction of meaning from syntactic data, has been one of the main points of research in the field of computational linguistics (see Section 5 and references therein). Until recently, the focus has remained largely either at the single word or sentence level (for instance, dependency extraction, word-to-word semantic similarity from syntax, etc.) or on relations between units at a very high context level, such as the entire paragraph or document (e.g., categorizing documents by topic).
Recently, there have been several attempts to define frameworks for detecting and studying interactions at an intermediate context level involving whole clauses or sentences. Dagan et al. (2005) have emphasized the importance of detecting textual entailment (implication) between two sentences, and its place as a key component in many real-world applications, such as Information Retrieval and Question Answering. When designing such a framework, one is faced with several obstacles. As we approach higher levels of complexity, the problem of defining the basic units we study ...