Scientific report: "Computing weakest readings"
Computing weakest readings

Alexander Koller
Cluster of Excellence, Saarland University
koller@mmci.uni-saarland.de

Stefan Thater
Dept. of Computational Linguistics, Saarland University
stth@coli.uni-saarland.de

Abstract

We present an efficient algorithm for computing the weakest readings of semantically ambiguous sentences. A corpus-based evaluation with a large-scale grammar shows that our algorithm reduces over 80% of sentences to one or two readings, in negligible runtime, and thus makes it possible to work with semantic representations derived by deep large-scale grammars.

1 Introduction

Over the past few years, there has been considerable progress in the ability of manually created large-scale grammars, such as the English Resource Grammar (ERG; Copestake and Flickinger, 2000) or the ParGram grammars (Butt et al., 2002), to parse wide-coverage text and assign it deep semantic representations. While applications should benefit from these very precise semantic representations, their usefulness is limited by the presence of semantic ambiguity: on the Rondane Treebank (Oepen et al., 2002), the ERG computes an average of several million semantic representations for each sentence, even when the syntactic analysis is fixed. The problem of appropriately selecting one of them to work with would ideally be solved by statistical methods (Higgins and Sadock, 2003) or knowledge-based inferences. However, no such approach has been worked out in sufficient detail to support the disambiguation of treebank sentences.

As an alternative, Bos (2008) proposes to compute the weakest reading of each sentence and then use it instead of the true reading of the sentence. This is based on the observation that the readings of a semantically ambiguous sentence are partially ordered with respect to logical entailment, and the weakest readings (the minimal, least informative readings with respect to this order) only express safe information that is common to all other readings as well. However, when a sentence has .
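
To make the notion of a weakest reading concrete, the following is a minimal sketch in Python. It is not the algorithm presented in this paper. It is a naive illustration that enumerates all readings explicitly and compares them pairwise, which is exactly what becomes infeasible when a sentence has millions of readings. The reading names, the `ENTAILMENTS` table, and the functions `entails` and `weakest_readings` are all hypothetical and chosen only for illustration. The toy sentence is "Every student reads a book", whose exists-forall reading entails its forall-exists reading.

```python
# Hypothetical toy data: the two scope readings of "Every student reads a book".
# The pair (a, b) means "reading a logically entails reading b".
# In practice entailment would have to be established by a theorem prover
# or some other procedure; here it is simply assumed as given.
ENTAILMENTS = {
    ("exists_forall", "forall_exists"),
}


def entails(a, b):
    """Reflexive entailment check based on the assumed table above."""
    return a == b or (a, b) in ENTAILMENTS


def weakest_readings(readings, entails):
    """Return the readings that do not strictly entail any other reading.

    A reading r strictly entails another reading s if r entails s
    but s does not entail r. The weakest readings are those for which
    no such s exists, i.e. the least informative elements of the
    entailment order.
    """
    weakest = []
    for r in readings:
        strictly_entails_other = any(
            entails(r, other) and not entails(other, r)
            for other in readings
            if other != r
        )
        if not strictly_entails_other:
            weakest.append(r)
    return weakest


readings = ["exists_forall", "forall_exists"]
print(weakest_readings(readings, entails))  # ['forall_exists']
```

The pairwise comparison needs a number of entailment checks that is quadratic in the number of readings, so this sketch is only useful for understanding the definition, not as a practical method.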