Scientific report: "A Risk Minimization Framework for Extractive Speech Summarization"
Shih-Hsiang Lin and Berlin Chen
National Taiwan Normal University, Taipei, Taiwan
berlin @ shlin

Abstract

In this paper, we formulate extractive summarization as a risk minimization problem and propose a unified probabilistic framework that naturally combines supervised and unsupervised summarization models to inherit their individual merits as well as to overcome their inherent limitations. In addition, the introduction of various loss functions provides the summarization framework with a flexible but systematic way to render the redundancy and coherence relationships among sentences and between sentences and the whole document, respectively. Experiments on speech summarization show that the methods deduced from our framework are very competitive with existing summarization approaches.

1 Introduction

Automated summarization systems, which enable users to quickly digest the important information conveyed by either a single document or a cluster of documents, are indispensable for managing the rapidly growing amount of textual information and multimedia content (Mani and Maybury, 1999). On the other hand, due to the maturity of text summarization, the research paradigm has been extended to speech summarization over the years (Furui et al., 2004; McKeown et al., 2005).
Speech summarization is expected to distill important information and to remove redundant and incorrect information caused by recognition errors from spoken documents, enabling users to efficiently review spoken documents and understand the associated topics quickly. It would also be useful for improving the efficiency of a number of potential applications, such as retrieval and mining of large volumes of spoken documents. A summary can be either abstractive or extractive. In abstractive summarization, a fluent and concise abstract that reflects the key concepts of a document is generated, whereas in extractive summarization the summary is usually .
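
To make the risk minimization view from the abstract concrete, the following is a minimal sketch of expected-risk sentence ranking. It is not the authors' implementation. The function name `rank_by_expected_risk`, the loss (one minus bag-of-words cosine similarity), and the `posteriors` input (one weight per sentence, standing in for whatever supervised or unsupervised model would supply sentence probabilities) are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_expected_risk(sentences, posteriors):
    """Return sentence indices sorted by increasing expected risk.

    Assumed risk: risk(i) = sum_j loss(i, j) * posteriors[j],
    with the illustrative loss(i, j) = 1 - cosine(sentence_i, sentence_j).
    """
    bows = [Counter(s.lower().split()) for s in sentences]
    risks = []
    for bow_i in bows:
        risk = sum((1.0 - cosine(bow_i, bow_j)) * p
                   for bow_j, p in zip(bows, posteriors))
        risks.append(risk)
    return sorted(range(len(sentences)), key=lambda i: risks[i])


# Hypothetical toy document with uniform sentence weights.
sentences = [
    "the budget vote passed today",
    "the budget vote was delayed",
    "weather was sunny",
]
ranking = rank_by_expected_risk(sentences, [1 / 3] * 3)
```

Under these assumptions, a sentence that overlaps with many highly weighted sentences receives a low expected risk and is ranked first for extraction. Swapping in a different loss function is the hook through which relationships such as redundancy or coherence could be expressed.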