Scientific report: "Extending the Entity-based Coherence Model with Multiple Ranks"
Vanessa Wei Feng
Department of Computer Science, University of Toronto
Toronto, ON M5S 3G4, Canada
weifeng@

Graeme Hirst
Department of Computer Science, University of Toronto
Toronto, ON M5S 3G4, Canada
gh@

Abstract

We extend the original entity-based coherence model (Barzilay and Lapata, 2008) by learning from more fine-grained coherence preferences in training data. We associate multiple ranks with the set of permutations originating from the same source document, as opposed to the original pairwise rankings. We also study the effect of the permutations used in training, and the effect of the coreference component used in entity extraction. With no additional manual annotations required, our extended model is able to outperform the original model on two tasks: sentence ordering and summary coherence rating.

1 Introduction

Coherence is important in a well-written document: it helps make the text semantically meaningful and interpretable. Automatic evaluation of coherence is an essential component of various natural language applications. Therefore, the study of coherence models has recently become an active research area. A particularly popular coherence model is the entity-based local coherence model of Barzilay and Lapata (B&L) (2005, 2008).
This model represents local coherence by transitions, from one sentence to the next, in the grammatical role of references to entities. It learns a pairwise ranking preference between alternative renderings of a document, based on the probability distribution of those transitions. In particular, B&L associated a lower rank with automatically created permutations of a source document, and learned a model to discriminate an original text from its permutations (see Section below). However, coherence is a matter of degree rather than a binary distinction, so a model based only on such pairwise rankings is insufficiently fine-grained and cannot capture the subtle ...
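
To make the transition-based representation described above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical entity grid in which each entity is mapped to one grammatical-role label per sentence, with labels S (subject), O (object), X (other) and - (absent); the label set, the function name `transition_probabilities`, and the toy grid are all assumptions made for illustration. The sketch computes the probability of each length-2 role transition, which could then serve as a feature vector for a ranking learner.

```python
from collections import Counter
from itertools import product

# Assumed role labels: subject, object, other, absent.
ROLES = ("S", "O", "X", "-")


def transition_probabilities(grid, n=2):
    """Return the probability of every length-n role transition.

    grid maps each entity to a list of role labels, one per sentence
    (a hypothetical representation chosen for this sketch).
    """
    counts = Counter()
    total = 0
    for roles in grid.values():
        # Slide a window of n consecutive sentences over this entity's roles.
        for i in range(len(roles) - n + 1):
            counts[tuple(roles[i:i + n])] += 1
            total += 1
    # Include unseen transitions with probability 0 so that every
    # document yields a feature vector of the same length.
    return {
        t: (counts[t] / total if total else 0.0)
        for t in product(ROLES, repeat=n)
    }


# Toy grid: two entities across three sentences.
grid = {
    "model": ["S", "O", "-"],
    "document": ["X", "-", "S"],
}
features = transition_probabilities(grid)
```

In a pairwise setting, such a vector would be computed both for a source document and for each of its permutations, and the learner would be trained to score the source above the permutation.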