Scientific report: "RAISINS, SULTANAS, AND CURRANTS: LEXICAL CLASSIFICATION AND ABSTRACTION VIA CONTEXT PRIMING"

RAISINS, SULTANAS, AND CURRANTS: LEXICAL CLASSIFICATION AND ABSTRACTION VIA CONTEXT PRIMING

David J. Hutches
Department of Computer Science and Engineering, Mail Code 0114
University of California, San Diego
La Jolla, CA 92093-0114
dhutches@

Abstract

In this paper we discuss the results of experiments which use a context, essentially an ordered set of lexical items, as the seed from which to build a network representing statistically important relationships among lexical items in some corpus. A metric is then applied to the nodes in the network in order to discover those pairs of items related by high indices of similarity. The goal of this research is to instantiate a class of items corresponding to each item in the priming context. We believe that this instantiation process is ultimately a special case of abstraction over the entire network; in this abstraction, similar nodes are collapsed into metanodes which may then function as if they were single lexical items.

I. Motivation and Background

With respect to the processing of language, one of the tasks at which human beings seem relatively adept is determining when it is appropriate to make generalizations and when it is appropriate to preserve distinctions. The process of abstraction, and knowing when it might reasonably be used, is a necessary tool in reducing the complexity of the task of processing natural language.
Part of our current research is an investigation into how the process of abstraction might be realized using relatively low-level statistical information extracted from large textual corpora. Our experiments are an attempt to discover a method by which class information about the members of some sequence of lexical items may be obtained using strictly statistical methods. For our purposes, the class to which a lexical item belongs is defined by its instantiation. Given some context, such as "he walked across the room", we would like to be able to instantiate classes of items corresponding to .
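To make the priming idea concrete, the following Python sketch grows a small network outward from a seed context and then groups nodes by similarity. It is an illustration under stated assumptions, not the method used in the experiments: the excerpt does not specify which statistics define an "important" relationship or which similarity metric is applied, so windowed co-occurrence counts and cosine similarity stand in for both. The names `build_network`, `cosine`, `instantiate_classes`, and the parameters `window`, `min_count`, `depth`, and `threshold` are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_network(sentences, seed, window=2, min_count=1, depth=2):
    """Grow a weighted co-occurrence network outward from the seed items.

    Assumption: edge weight = number of times two items appear within
    `window` positions of each other. The paper's statistic may differ.
    """
    cooc = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    cooc[word][tokens[j]] += 1

    network = {}
    frontier = [w for w in seed if w in cooc]
    for _ in range(depth):
        next_frontier = []
        for word in frontier:
            if word in network:
                continue  # already expanded
            # Keep only edges that meet the (assumed) count threshold.
            network[word] = {n: c for n, c in cooc[word].items() if c >= min_count}
            next_frontier.extend(network[word])
        frontier = next_frontier
    return network


def cosine(a, b):
    """Cosine similarity between two sparse neighbor-weight dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def instantiate_classes(network, seed, threshold=0.95):
    """For each seed item, collect the nodes whose similarity meets the threshold."""
    classes = {}
    for item in seed:
        if item not in network:
            classes[item] = []
            continue
        classes[item] = sorted(
            other
            for other in network
            if other != item and cosine(network[item], network[other]) >= threshold
        )
    return classes


# Toy corpus, invented for illustration only.
sentences = [
    "he walked across the room".split(),
    "she walked across the room".split(),
    "he ran across the street".split(),
    "she ran across the street".split(),
]
seed = "he walked across the room".split()
network = build_network(sentences, seed)
classes = instantiate_classes(network, seed)
print(classes)
```

On this toy corpus, items with identical neighbor profiles, such as `he` and `she`, fall into the same class. Collapsing each such group into a single metanode would be the next step toward the abstraction described in the abstract; that step is not sketched here.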
