Scientific report: "Generating a Table-of-Contents"

Generating a Table-of-Contents

Branavan, Pawan Deshpande and Regina Barzilay
Massachusetts Institute of Technology
branavan, pawand, regina @

Abstract

This paper presents a method for the automatic generation of a table-of-contents. This type of summary could serve as an effective navigation tool for accessing information in long texts, such as books. To generate a coherent table-of-contents, we need to capture both global dependencies across different titles in the table and local constraints within sections. Our algorithm effectively handles these complex dependencies by factoring the model into local and global components, and incrementally constructing the model's output. The results of automatic evaluation and manual assessment confirm the benefits of this design: our system is consistently ranked higher than non-hierarchical baselines.

1 Introduction

Current research in summarization focuses on processing short articles, primarily in the news domain. While in practice the existing summarization methods are not limited to this material, they are not universal: texts in many domains and genres cannot be summarized using these techniques. A particularly significant challenge is the summarization of longer texts, such as books. The requirement for high compression rates and the increased need for the preservation of contextual dependencies between summary sentences place summarization of such texts beyond the scope of current methods.
In this paper, we investigate the automatic generation of tables-of-contents, a type of indicative summary particularly suited for accessing information in long texts. A typical table-of-contents lists topics described in the source text and provides information about their location in the text. The hierarchical organization of information in the table further refines information access by specifying the relations between different topics and providing rich contextual information during browsing. Commonly found in books