Scientific report: "Automatic Extraction of Lexico-Syntactic Patterns for Detection of Negation and Speculation Scopes"
Emilia Apostolova, DePaul University, Chicago, IL, USA
Noriko Tomuro, DePaul University, Chicago, IL, USA, tomuro@
Dina Demner-Fushman, National Library of Medicine, Bethesda, MD, USA, ddemner@

Abstract

Detecting the linguistic scope of negated and speculated information in text is an important Information Extraction task. This paper presents ScopeFinder, a linguistically motivated rule-based system for the detection of negation and speculation scopes. The system rule set consists of lexico-syntactic patterns automatically extracted from a corpus annotated with negation/speculation cues and their scopes (the BioScope corpus). The system performs on par with state-of-the-art machine learning systems. Additionally, the intuitive and linguistically motivated rules will allow for manual adaptation of the rule set to new domains and corpora.

1 Motivation

Information Extraction (IE) systems often face the problem of distinguishing between affirmed, negated, and speculative information in text. For example, sentiment analysis systems need to detect negation for accurate polarity classification. Similarly, medical IE systems need to differentiate between affirmed, negated, and speculated (possible) medical conditions.

The importance of the task of negation and speculation (hedge) detection is attested by a number of research initiatives. The creation of the BioScope corpus (Vincze et al., 2008) assisted in the development and evaluation of several negation/hedge scope detection systems. The corpus consists of medical and biological texts annotated for negation, speculation, and their linguistic scope. The 2010 i2b2 NLP Shared Task included a track for detection of the assertion status of medical problems (e.g. affirmed, negated, hypothesized, etc.). The CoNLL-2010 Shared Task (Farkas et al., 2010) focused on detecting hedges and their scopes in Wikipedia.