tailieunhanh - Scientific report: "Unsupervised Relation Discovery with Sense Disambiguation"

Unsupervised Relation Discovery with Sense Disambiguation

Limin Yao, Sebastian Riedel, Andrew McCallum
Department of Computer Science, University of Massachusetts, Amherst
lmyao riedel mccallum @

Abstract

To discover relation types from text, most methods cluster shallow or syntactic patterns of relation mentions, but consider only one possible sense per pattern. In practice this assumption is often violated. In this paper we overcome this issue by inducing clusters of pattern senses from feature representations of patterns. In particular, we employ a topic model to partition entity pairs associated with patterns into sense clusters using local and global features. We merge these sense clusters into semantic relations using hierarchical agglomerative clustering. We compare against several baselines: a generative latent-variable model, a clustering method that does not disambiguate between path senses, and our own approach but with only local features. Experimental results show our proposed approach discovers dramatically more accurate clusters than models without sense disambiguation, and that incorporating global features, such as the document theme, is crucial.

1 Introduction

Relation extraction (RE) is the task of determining semantic relations between entities mentioned in text. RE is an essential part of information extraction and is useful for question answering (Ravichandran and Hovy, 2002), textual entailment (Szpektor et al., 2004) and many other applications.

A common approach to RE is to assume that relations to be extracted are part of a predefined ontology. For example, the relations are given in knowledge bases such as Freebase (Bollacker et al., 2008) or DBpedia (Bizer et al., 2009). However, in many applications, ontologies do not yet exist or have low coverage. Even when they do exist, their maintenance and extension are considered to be a substantial bottleneck. This has led to considerable interest in unsupervised relation discovery (Hasegawa et al., 2004; Banko .
