Scientific report: "Extracting Lexical Reference Rules from Wikipedia"

Eyal Shnarch, Computer Science Department, Bar-Ilan University, Ramat-Gan 52900, Israel, shey@
Libby Barak, Dept. of Computer Science, University of Toronto, Toronto, Canada M5S 1A4, libbyb@
Ido Dagan, Computer Science Department, Bar-Ilan University, Ramat-Gan 52900, Israel, dagan@

Abstract

This paper describes the extraction from Wikipedia of lexical reference rules, identifying references to term meanings triggered by other terms. We present extraction methods geared to cover the broad range of the lexical reference relation and analyze them extensively. Most extraction methods yield high precision levels, and our rule-base is shown to perform better than other automatically constructed baselines in a couple of lexical expansion and matching tasks. Our rule-base yields comparable performance to WordNet while providing largely complementary information.

1 Introduction

A most common need in applied semantic inference is to infer the meaning of a target term from other terms in a text. For example, a Question Answering system may infer the answer to a question regarding luxury cars from a text mentioning Bentley, which provides a concrete reference to the sought meaning. Aiming to capture such lexical inferences, we followed Glickman et al. (2006), who coined the term lexical reference (LR) to denote references in text to the specific meaning of a target term.
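The Bentley example can be made concrete with a small sketch. It is only an illustration of how a lexical reference lookup might be used for matching, not the extraction method of this paper: the `RULES` table, its single entry, and the function names are hypothetical, and triggering terms are assumed to be single tokens for simplicity.

```python
import re

# Hypothetical toy rule-base: each triggering term maps to the target terms
# whose meaning it may reference. The entry below is illustrative only.
RULES = {
    "bentley": {"luxury car"},
}


def referenced_targets(text, rules=RULES):
    """Return the target terms referenced by any triggering term in the text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    found = set()
    for token in tokens:
        found |= rules.get(token, set())
    return found


def refers_to(text, target, rules=RULES):
    """True if the text mentions the target literally or via a triggering term."""
    target = target.lower()
    return target in text.lower() or target in referenced_targets(text, rules)


print(refers_to("He arrived in a new Bentley.", "luxury car"))
print(refers_to("He arrived by bus.", "luxury car"))
```

A real rule-base would also need multi-word triggering terms and some handling of ambiguity, since a term references a given meaning only in some texts.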
Glickman et al. further analyzed the dataset of the First Recognizing Textual Entailment Challenge (Dagan et al., 2006), which includes examples drawn from seven different application scenarios. It was found that an entailing text indeed includes a concrete reference to practically every term in the entailed (inferred) sentence. The lexical reference relation between two terms may be viewed as a lexical inference rule, denoted LHS ⇒ RHS. Such a rule indicates that the left-hand-side term would generate a reference, in some texts, to a possible meaning of the right .