Scientific report: "Word Sense Induction for Novel Sense Detection"

Jey Han Lau, Paul Cook, Diana McCarthy, David Newman and Timothy Baldwin

NICTA Victoria Research Laboratory; Dept of Computer Science and Software Engineering, University of Melbourne; Dept of Computer Science, University of California Irvine; Lexical Computing

jhlau@ paulcook@ diana@ newman@ tb@

Abstract

We apply topic modelling to automatically induce word senses of a target word, and demonstrate that our word sense induction method can be used to automatically detect words with emergent novel senses, as well as token occurrences of those senses. We start by exploring the utility of standard topic models for word sense induction (WSI), with a pre-determined number of topics (=senses). We next demonstrate that a non-parametric formulation that learns an appropriate number of senses per word actually performs better at the WSI task. We go on to establish state-of-the-art results over two WSI datasets, and apply the proposed model to a novel sense detection task.

1 Introduction

Word sense induction (WSI) is the task of automatically inducing the different senses of a given word, generally in the form of an unsupervised learning task with senses represented as clusters of token instances. It contrasts with word sense disambiguation (WSD), where a fixed sense inventory is assumed to exist, and token instances of a given word are disambiguated relative to the sense inventory.
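The idea of treating topics as senses, with a pre-determined number of topics, can be illustrated with a small sketch. This is not the authors' implementation: it is a toy collapsed Gibbs sampler for a standard LDA-style topic model, written under the assumption that each usage of the target word is represented by its bag of context words and is assigned to the topic that dominates that context. The function name `induce_senses`, the hyperparameter values, and the example contexts for the word "bank" are all hypothetical.

```python
import random
from collections import Counter

def induce_senses(contexts, num_senses=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for an LDA-style model.

    contexts: list of token lists, one per usage of the target word.
    Returns one induced sense id (topic index) per context.
    """
    rng = random.Random(seed)
    vocab_size = len({w for ctx in contexts for w in ctx})
    # z[d][i] is the topic currently assigned to token i of context d.
    z = [[rng.randrange(num_senses) for _ in ctx] for ctx in contexts]
    doc_topic = [Counter(zd) for zd in z]
    topic_word = [Counter() for _ in range(num_senses)]
    topic_total = [0] * num_senses
    for ctx, zd in zip(contexts, z):
        for w, k in zip(ctx, zd):
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(iters):
        for d, ctx in enumerate(contexts):
            for i, w in enumerate(ctx):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + vocab_size * beta)
                    for j in range(num_senses)
                ]
                k = rng.choices(range(num_senses), weights=weights)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # The induced sense of each usage is its most frequent topic.
    return [
        max(range(num_senses), key=lambda j: doc_topic[d][j])
        for d in range(len(contexts))
    ]

# Hypothetical context words around six usages of "bank".
contexts = [
    ["river", "water", "shore"],
    ["money", "loan", "account"],
    ["river", "shore", "fishing"],
    ["loan", "interest", "money"],
    ["water", "fishing", "river"],
    ["account", "interest", "loan"],
]
senses = induce_senses(contexts, num_senses=2)
print(senses)
```

The sense labels are arbitrary topic indices, so only the grouping of usages is meaningful, not the numbers themselves. A non-parametric formulation of the kind the abstract mentions would learn the number of senses from the data instead of taking `num_senses` as an input.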
While WSI is intuitively appealing as a task, there have been no real examples of WSI being successfully deployed in end-user applications, other than work by Schütze (1998) and Navigli and Crisafulli (2010) in an information retrieval context. A key contribution of this paper is the successful application of WSI to the lexicographical task of novel sense detection, i.e. identifying words which have taken on new senses over time. One of the key challenges in WSI is learning the appropriate sense granularity
