tailieunhanh - Scientific report: "Inferring Selectional Preferences from Part-Of-Speech N-grams"
Hyeju Jang and Jack Mostow
Project LISTEN listen, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
hyejuj@ mostow@

Abstract

We present the PONG method to compute selectional preferences using part-of-speech (POS) N-grams. From a corpus labeled with grammatical dependencies, PONG learns the distribution of word relations for each POS N-gram. From the much larger but unlabeled Google N-grams corpus, PONG learns the distribution of POS N-grams for a given pair of words. We derive the probability that one word has a given grammatical relation to the other. PONG estimates this probability by combining both distributions, whether or not either word occurs in the labeled corpus. PONG achieves higher average precision on 16 relations than a state-of-the-art baseline in a pseudo-disambiguation task, but lower coverage and recall.

1 Introduction

Selectional preferences specify plausible fillers for the arguments of a predicate, e.g., celebrate. Can you celebrate a birthday? Sure. Can you celebrate a pencil? Arguably yes: "Today the Acme Pencil Factory celebrated its one-billionth pencil." However, such a contrived example is unnatural because, unlike birthday, pencil lacks a strong association with celebrate.
How can we compute the degree to which birthday or pencil is a plausible and typical object of celebrate? Formally, we are interested in computing the probability Pr(r | t, R), where (as Table 1 specifies) t is a target word such as celebrate, r is a word possibly related to it, such as birthday or pencil, and R is a possible relation between them, whether a semantic role such as the agent of an action or a grammatical dependency such as the object of a verb. We call t the "target" because originally it referred to a vocabulary word targeted for instruction, and r its "relative."

Notation    Description
R           a relation between words
t           a target word
r, r'       possible relatives of t
g           a word .
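To make the idea of "combining both distributions" concrete, here is a minimal Python sketch. It is not the paper's derivation. It assumes the simplest possible combination, a sum over POS N-grams p of Pr(R | p) · Pr(p | t, r), and every number, tag sequence, and name (`relation_given_pos`, `pos_given_pair`, `relation_given_pair`) is invented for illustration. It yields a relation distribution for a word pair; turning that into Pr(r | t, R) would take a further step that is not shown here.

```python
from collections import defaultdict

# Hypothetical stand-in for the dependency-labeled corpus:
# Pr(R | p), the distribution of relations R for each POS N-gram p.
# All probabilities below are invented toy values.
relation_given_pos = {
    ("VB", "NN"): {"dobj": 0.8, "nsubj": 0.2},
    ("VB", "DT", "NN"): {"dobj": 0.9, "nsubj": 0.1},
    ("NN", "VB"): {"nsubj": 0.7, "dobj": 0.3},
}

# Hypothetical stand-in for the unlabeled N-gram corpus:
# Pr(p | t, r), the distribution of POS N-grams for a word pair (t, r).
pos_given_pair = {
    ("celebrate", "birthday"): {
        ("VB", "NN"): 0.3,
        ("VB", "DT", "NN"): 0.6,
        ("NN", "VB"): 0.1,
    },
}


def relation_given_pair(t, r):
    """Combine the two toy distributions by marginalizing over POS N-grams.

    Assumed formula (an illustration, not the paper's exact estimate):
        Pr(R | t, r) ~= sum over p of Pr(R | p) * Pr(p | t, r)
    Returns an empty dict when the pair was never seen in the N-gram table.
    """
    scores = defaultdict(float)
    for p, pr_p in pos_given_pair.get((t, r), {}).items():
        for rel, pr_rel in relation_given_pos.get(p, {}).items():
            scores[rel] += pr_rel * pr_p
    return dict(scores)


if __name__ == "__main__":
    for rel, prob in sorted(relation_given_pair("celebrate", "birthday").items()):
        print(f"{rel}: {prob:.2f}")
```

Note that the word pair never needs to appear in the labeled table: only its POS N-grams do, which mirrors the abstract's point that the estimate works whether or not either word occurs in the labeled corpus.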