Scientific report: "Learning Word-Class Lattices for Definition and Hypernym Extraction"
Roberto Navigli and Paola Velardi
Dipartimento di Informatica, Sapienza Università di Roma
navigli velardi @

Abstract

Definition extraction is the task of automatically identifying definitional sentences within texts. The task has proven useful in many research areas, including ontology learning, relation extraction and question answering. However, current approaches, mostly focused on lexico-syntactic patterns, suffer from both low recall and low precision, as definitional sentences occur in highly variable syntactic structures. In this paper, we propose Word-Class Lattices (WCLs), a generalization of word lattices that we use to model textual definitions. Lattices are learned from a dataset of definitions from Wikipedia. Our method is applied to the task of definition and hypernym extraction and compares favorably to other pattern generalization methods proposed in the literature.

1 Introduction

Textual definitions constitute a fundamental source to look up when the meaning of a term is sought. Definitions are usually collected in dictionaries and domain glossaries for consultation purposes. However, manually constructing and updating glossaries requires the cooperative effort of a team of domain experts. Further, in the presence of new words or usages, and, even worse, new domains, such resources are of no help.
Nonetheless, terms are attested in texts, and some (usually few) of the sentences in which a term occurs are typically definitional, that is, they provide a formal explanation for the term of interest. While it is not feasible to manually search texts for definitions, this task can be automated by means of Machine Learning (ML) and Natural Language Processing (NLP) techniques. Automatic definition extraction is useful not only in the construction of glossaries, but also in many other NLP tasks. In ontology learning, definitions are used to create and enrich concepts with textual information Gangemi