
Jointly Learning to Extract and Compress

Taylor Berg-Kirkpatrick, Dan Gillick, Dan Klein
Computer Science Division, University of California at Berkeley
{tberg, dgillick, klein}@cs.berkeley.edu

Abstract

We learn a joint model of sentence extraction and compression for multi-document summarization. Our model scores candidate summaries according to a combined linear model whose features factor over (1) the n-gram types in the summary and (2) the compressions used. We train the model using a margin-based objective whose loss captures end summary quality. Because of the exponentially large set of candidate summaries, we use a cutting-plane algorithm to incrementally detect and add active constraints efficiently. Inference in our model can be cast as an ILP and thereby solved in reasonable time; we also present a fast approximation scheme which achieves similar performance. Our jointly extracted and compressed summaries outperform both unlearned baselines and our learned extraction-only system on both ROUGE and Pyramid, without a drop in judged linguistic quality. We achieve the highest published ROUGE results to date on the TAC 2008 data set.

1 Introduction

Applications of machine learning to automatic summarization have met with limited success, and as a result many top-performing systems remain largely ad hoc. One reason learning may have provided limited gains is that typical models do not learn to optimize end summary quality directly, but rather learn intermediate quantities in isolation. For example, many models learn to score each input sentence independently (Teufel and Moens, 1997; Shen et al., 2007; Schilder and Kondadadi, 2008), and then assemble extractive summaries from the top-ranked sentences in a way not incorporated into the learning process. This extraction is often done in the presence of a heuristic that limits redundancy. As another example, Yih et al. (2007) learn predictors of individual words' appearance in the references, but in isolation from the sentence selection process.
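To make the factored scorer in the abstract concrete, one schematic way to write it is below. The notation is ours, not necessarily the paper's: B(y) denotes the bigram types appearing in candidate summary y, C(y) the compression decisions it uses, w the learned weights, and L the length budget.

```latex
% Schematic combined linear scorer; notation is illustrative.
s(y) \;=\; \sum_{b \in B(y)} w^{\top} f_{\mathrm{bgr}}(b)
      \;+\; \sum_{c \in C(y)} w^{\top} f_{\mathrm{cut}}(c)

% Inference selects the best valid summary under the length budget:
\hat{y} \;=\; \arg\max_{y \,:\, \mathrm{len}(y) \le L} \; s(y)
```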
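Since the abstract notes that inference can be cast as an ILP, a minimal extraction-only sketch may help fix ideas. This uses the open-source PuLP modeler; the function name, data layout, and exact constraint set are assumptions for illustration, and the paper's full model additionally scores compressions, which this sketch omits.

```python
# Sketch of an extraction-only ILP in the spirit of the paper's inference
# step, using the PuLP modeler. All names and the exact constraint set are
# illustrative assumptions, not the paper's code.
from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum

def extract_summary(sentences, bigram_weight, max_words):
    """sentences: list of (word_count, set_of_bigrams) pairs.
    bigram_weight: dict mapping bigram -> its learned score w^T f(b).
    Returns indices of the selected sentences."""
    bigrams = sorted({b for _, bs in sentences for b in bs})
    prob = LpProblem("summary_sketch", LpMaximize)
    x = [LpVariable(f"x_{i}", cat=LpBinary) for i in range(len(sentences))]
    y = {b: LpVariable(f"y_{j}", cat=LpBinary) for j, b in enumerate(bigrams)}

    # Objective: total weight of the bigram *types* present in the summary.
    prob += lpSum(bigram_weight.get(b, 0.0) * y[b] for b in bigrams)

    # Length budget on the extracted summary.
    prob += lpSum(n * x[i] for i, (n, _) in enumerate(sentences)) <= max_words

    # Consistency: a bigram is on iff some selected sentence contains it.
    for i, (_, bs) in enumerate(sentences):
        for b in bs:
            prob += x[i] <= y[b]  # selecting a sentence activates its bigrams
    for b in bigrams:
        prob += y[b] <= lpSum(
            x[i] for i, (_, bs) in enumerate(sentences) if b in bs)

    prob.solve()
    return [i for i, var in enumerate(x) if var.value() == 1]
```

Counting each bigram type once, rather than each token, is what keeps the objective factored over n-gram types as the abstract describes, and it implicitly penalizes redundancy: repeating a sentence's content adds length but no new objective mass.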
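The cutting-plane training loop the abstract mentions can likewise be sketched. Everything below that is not stated in the abstract is an assumption: the callables standing in for the paper's feature map, loss, loss-augmented ILP, and QP solver, the standard structured-SVM margin violation formula (slack terms omitted for brevity), and the stopping rule.

```python
# Sketch of margin-based training with cutting planes. The four callables
# are hypothetical stand-ins, not the paper's actual components.
import numpy as np

def cutting_plane_train(problems, features, loss, argmax_augmented,
                        solve_qp, dim, epsilon=1e-3, max_rounds=50):
    """problems: list of (document, gold_summary) pairs.
    features(doc, y) -> np.ndarray; loss(y, gold) -> float.
    argmax_augmented(w, doc, gold) -> most violated summary y_hat,
        i.e. argmax_y w . f(y) + loss(y, gold), found via the ILP.
    solve_qp(pool) -> weights re-fit on the active constraint pool."""
    w = np.zeros(dim)
    pool = []  # active constraints: (doc, gold, y_hat) triples
    for _ in range(max_rounds):
        added = 0
        for doc, gold in problems:
            y_hat = argmax_augmented(w, doc, gold)  # loss-augmented inference
            # Violation of the margin constraint
            #   w . (f(gold) - f(y_hat)) >= loss(y_hat, gold):
            violation = (loss(y_hat, gold)
                         - w @ (features(doc, gold) - features(doc, y_hat)))
            if violation > epsilon:
                pool.append((doc, gold, y_hat))
                added += 1
        if added == 0:
            break  # no sufficiently violated constraint remains
        w = solve_qp(pool)  # re-solve the margin QP on the enlarged pool
    return w
```

The point of the incremental pool is exactly the one the abstract makes: the full constraint set ranges over exponentially many candidate summaries, so only constraints found to be actively violated are ever materialized.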
