Scientific report: "Knowledge Base Population: Successful Approaches and Challenges"
Heng Ji, Computer Science Department, Queens College and Graduate Center, City University of New York, New York, NY 11367, USA, hengj i@
Ralph Grishman, Computer Science Department, New York University, New York, NY 10003, USA, grishman@

Abstract

In this paper we give an overview of the Knowledge Base Population (KBP) track at the 2010 Text Analysis Conference. The main goal of KBP is to promote research in discovering facts about entities and augmenting a knowledge base (KB) with these facts. This is done through two tasks: Entity Linking (linking names in context to entities in the KB) and Slot Filling (adding information about an entity to the KB). A large source collection of newswire and web documents is provided, from which systems are to discover information. Attributes ("slots") derived from Wikipedia infoboxes are used to create the reference KB. In this paper we provide an overview of the techniques which can serve as a basis for a good KBP system, lay out the remaining challenges by comparison with traditional Information Extraction (IE) and Question Answering (QA) tasks, and provide some suggestions to address these challenges.

1 Introduction

Traditional information extraction (IE) evaluations, such as the Message Understanding Conferences (MUC) and Automatic Content Extraction (ACE), assess the ability to extract information from individual documents in isolation. In practice, however, we may need to gather information about a person or organization that is scattered among the documents of a large collection.
This requires the ability to identify the relevant documents and to integrate facts, possibly redundant, possibly complementary, possibly in conflict, coming from these documents. Furthermore, we may want to use the extracted information to augment an existing data base. This requires the ability to link individuals mentioned in a document, and information about these individuals, to entries in the data base.
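
To make the two tasks concrete, the following is a minimal toy sketch, not the method of any KBP system. It assumes a KB entry is just an identifier, a name, and a dictionary of slots; the names `KBEntry`, `link_entity`, `fill_slot`, the slot `headquarters`, and the sample entity are all hypothetical. Linking here is a plain case-insensitive name match, whereas a real system would also have to use the surrounding context to disambiguate names.

```python
from dataclasses import dataclass, field


@dataclass
class KBEntry:
    """One reference-KB entity (hypothetical layout); slots mimic infobox attributes."""
    entry_id: str
    name: str
    slots: dict = field(default_factory=dict)  # slot name -> set of values


def link_entity(name, kb):
    """Toy Entity Linking: case-insensitive exact name match, None if not in the KB."""
    for entry in kb:
        if entry.name.lower() == name.lower():
            return entry
    return None


def fill_slot(entry, slot, value):
    """Toy Slot Filling: add a value to a slot; return False if it was redundant."""
    values = entry.slots.setdefault(slot, set())
    if value in values:
        return False
    values.add(value)
    return True


kb = [KBEntry("E1", "Example Corp")]  # hypothetical KB with a single entry
entry = link_entity("example corp", kb)  # name mention found in some document
added = fill_slot(entry, "headquarters", "Springfield")
again = fill_slot(entry, "headquarters", "Springfield")  # same fact, second document
```

In this sketch a redundant fact from a second document is simply ignored; complementary facts would accumulate in the slot's value set, and conflicting facts are not resolved at all, which is one of the integration problems described above.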