Scientific report: "Semantic Taxonomy Induction from Heterogenous Evidence"
Rion Snow, Computer Science Department, Stanford University, Stanford, CA 94305, rion@
Daniel Jurafsky, Linguistics Department, Stanford University, Stanford, CA 94305, jurafsky@
Andrew Y. Ng, Computer Science Department, Stanford University, Stanford, CA 94305, ang@

Abstract

We propose a novel algorithm for inducing semantic taxonomies. Previous algorithms for taxonomy induction have typically focused on independent classifiers for discovering new single relationships based on hand-constructed or automatically discovered textual patterns. By contrast, our algorithm flexibly incorporates evidence from multiple classifiers over heterogenous relationships to optimize the entire structure of the taxonomy, using knowledge of a word's coordinate terms to help in determining its hypernyms, and vice versa. We apply our algorithm to the problem of sense-disambiguated noun hyponym acquisition, where we combine the predictions of hypernym and coordinate term classifiers with the knowledge in a preexisting semantic taxonomy (WordNet). We add 10,000 novel synsets to WordNet at 84% precision, a relative error reduction of 70% over a non-joint algorithm using the same component classifiers. Finally, we show that a taxonomy built using our algorithm shows a 23% relative F-score improvement over WordNet on an independent test set of hypernym pairs.
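The idea that coordinate-term evidence can help choose a hypernym, as described in the abstract, can be illustrated with a small toy sketch. This is not the authors' algorithm: the scoring rule (a sum of log-odds), the function names `log_odds`, `score_attachment`, and `best_parent`, and all probabilities and words below are hypothetical, chosen only to show how two kinds of classifier output might be combined under a simple independence assumption.

```python
import math


def log_odds(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def score_attachment(word, parent, hypernym_prob, coordinate_prob, children):
    """Score attaching `word` under `parent` (hypothetical scoring rule).

    The score adds the log-odds of the direct hypernym prediction to the
    log-odds of each coordinate-term prediction between `word` and the
    existing children of `parent`. Missing predictions contribute nothing.
    """
    # A missing hypernym prediction defaults to 0.5, i.e. log-odds of 0.
    score = log_odds(hypernym_prob.get((word, parent), 0.5))
    for sibling in children.get(parent, ()):
        p = coordinate_prob.get((word, sibling))
        if p is not None:
            score += log_odds(p)
    return score


def best_parent(word, candidates, hypernym_prob, coordinate_prob, children):
    """Return the candidate parent with the highest combined score."""
    return max(
        candidates,
        key=lambda parent: score_attachment(
            word, parent, hypernym_prob, coordinate_prob, children
        ),
    )


# Hypothetical classifier outputs: the hypernym classifier alone slightly
# prefers the wrong parent, while coordinate evidence points to the right one.
hypernym_prob = {("terrier", "dog"): 0.60, ("terrier", "vehicle"): 0.65}
coordinate_prob = {("terrier", "poodle"): 0.90, ("terrier", "truck"): 0.10}
children = {"dog": ["poodle"], "vehicle": ["truck"]}  # toy existing taxonomy
candidates = ["dog", "vehicle"]

# Joint decision: uses both hypernym and coordinate-term evidence.
best_joint = best_parent(
    "terrier", candidates, hypernym_prob, coordinate_prob, children
)

# Non-joint decision: ignores coordinate-term evidence entirely.
best_hypernym_only = best_parent("terrier", candidates, hypernym_prob, {}, children)

print(best_joint, best_hypernym_only)
```

In this toy setting the two decisions differ, which is the kind of correction a joint model over several relationship types is meant to make possible; the actual objective and search procedure are those given in the paper itself.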
1 Introduction

The goal of capturing structured relational knowledge about lexical terms has been the motivating force underlying many projects in lexical acquisition, information extraction, and the construction of semantic taxonomies. Broad-coverage semantic taxonomies such as WordNet (Fellbaum, 1998) and CYC (Lenat, 1995) have been constructed by hand at great cost; while a crucial source of knowledge about the relations between words, these taxonomies still suffer from sparse coverage. Many algorithms with the potential for automatically extending lexical .