Scientific report: "Generating Templates of Entity Summaries with an Entity-Aspect Model and Pattern Mining"

Peng Li¹, Jing Jiang² and Yinglin Wang¹
¹Department of Computer Science and Engineering, Shanghai Jiao Tong University
²School of Information Systems, Singapore Management University
lipeng ylwang @ jingjiang@

Abstract

In this paper, we propose a novel approach to the automatic generation of summary templates from given collections of summary articles. Such summary templates can be useful in various applications. We first develop an entity-aspect LDA model to simultaneously cluster both sentences and words into aspects. We then apply frequent subtree pattern mining to the dependency parse trees of the clustered and labeled sentences to discover sentence patterns that represent the aspects well. Key features of our method include automatic grouping of semantically related sentence patterns and automatic identification of template slots that need to be filled in. We apply our method to five Wikipedia entity categories and compare it with two baseline methods. Both quantitative evaluation based on human judgment and qualitative comparison demonstrate the effectiveness and advantages of our method.

1 Introduction

In this paper, we study the task of automatically generating templates for entity summaries.
An entity summary is a short document that gives the most important facts about an entity. In Wikipedia, for instance, most articles have an introduction section that summarizes the subject entity before the table of contents and the more detailed sections. These introduction sections are examples of the entity summaries we consider. Summaries of entities from the same category usually share some common structure. For example, biographies of physicists usually contain facts about the nationality, educational background, affiliation and major contributions of the physicist, whereas introductions of companies usually list information such as the industry, founder and
