Scientific report: "Bootstrapping Events and Relations from Text"

Ting Liu, ILS, University at Albany, USA, tliu@
Tomek Strzalkowski, ILS, University at Albany, USA; Polish .

Abstract

In this paper, we describe a new approach to semi-supervised adaptive learning of event extraction from text. Given a set of examples and an un-annotated text corpus, the BEAR system (Bootstrapping Events And Relations) will automatically learn how to recognize and understand descriptions of complex semantic relationships in text, such as events involving multiple entities and their roles. For example, given a series of descriptions of bombing and shooting incidents (e.g., in newswire), the system will learn to extract, with a high degree of accuracy, other attack-type events mentioned elsewhere in text, irrespective of the form of description. A series of evaluations using the ACE data and event set shows a significant performance improvement over our baseline system.

1 Introduction

We constructed a semi-supervised machine learning process that effectively exploits statistical and structural properties of natural language discourse in order to rapidly acquire rules to detect mentions of events and other complex relationships in text, extract their key attributes, and construct template-like representations. The learning process exploits descriptive and structural redundancy, which is common in language; it is often critical for achieving successful communication despite distractions, different contexts, or incompatible semantic models between a speaker/writer and a hearer/reader. We also take advantage of the high degree of referential consistency in discourse (e.g., as observed in word sense distribution by Gale et al., 1992, and arguably applicable to larger linguistic units), which enables the reader to efficiently correlate different forms of description across coherent spans of text.

The method we describe here consists of two steps: (1) supervised acquisition of initial extraction rules from an annotated training corpus, and
