Scientific report: "Kernels on Linguistic Structures for Answer Extraction"
Alessandro Moschitti and Silvia Quarteroni
DISI, University of Trento
Via Sommarive 14, 38100 Povo (TN), Italy
moschitti silviaq @

Abstract

Natural Language Processing (NLP) for Information Retrieval has always been an interesting and challenging research area. Despite the high expectations, most of the results indicate that successfully using NLP is very complex. In this paper, we show how Support Vector Machines along with kernel functions can effectively represent syntax and semantics. Our experiments on question/answer classification show that the above models improve substantially on bag-of-words on a TREC dataset.

1 Introduction

Question Answering (QA) is an IR task where the major complexity resides in question processing and answer extraction (Chen et al., 2006; Collins-Thompson et al., 2004) rather than document retrieval (a step usually carried out by off-the-shelf IR engines). In question processing, useful information is gathered from the question and a query is created. This is submitted to an IR module, which provides a ranked list of relevant documents. From these, the QA system extracts one or more candidate answers, which can then be re-ranked following various criteria. Although typical methods are based exclusively on word similarity between query and answer, recent work (Shen and Lapata, 2007) has shown that shallow semantic information in the form of predicate argument structures (PASs) improves the automatic detection of correct answers to a target question. In (Moschitti et al., 2007), we proposed the Shallow Semantic Tree Kernel (SSTK), designed to encode PASs (in PropBank format, www.cis.upenn.edu/~ace) in SVMs.

In this paper, similarly to our previous approach, we design an SVM-based answer extractor that selects the correct answers from those provided by a basic QA system by applying tree kernel technology. However, we also provide (i) a new kernel to process PASs based on the partial tree kernel algorithm
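To make the idea of a tree kernel concrete, the following is a minimal sketch of the classic subset-tree kernel of Collins and Duffy, which counts the tree fragments two parse trees share. It is not the SSTK or the partial tree kernel discussed in the paper, and the tuple-based tree encoding, the function names, and the decay parameter `lam` are all assumptions made for illustration.

```python
from itertools import product

# Assumed encoding: a tree is a nested tuple (label, child, child, ...);
# a child that is a plain string is a word (leaf).

def production(node):
    """Label of the node plus its ordered child symbols.
    Words are tagged so they can never clash with a category label."""
    return (node[0],
            tuple(c[0] if isinstance(c, tuple) else ("word", c) for c in node[1:]))

def is_preterminal(node):
    """True when every child of the node is a word."""
    return all(isinstance(c, str) for c in node[1:])

def internal_nodes(tree):
    """Yield every non-leaf node of the tree, root first."""
    yield tree
    for child in tree[1:]:
        if isinstance(child, tuple):
            yield from internal_nodes(child)

def delta(n1, n2, lam):
    """Weighted number of common fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    if is_preterminal(n1):
        return lam
    result = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        # Equal productions guarantee c1 and c2 are both subtrees or both words.
        if isinstance(c1, tuple):
            result *= 1.0 + delta(c1, c2, lam)
    return result

def tree_kernel(t1, t2, lam=1.0):
    """Sum delta over all pairs of internal nodes of the two trees."""
    return sum(delta(a, b, lam)
               for a, b in product(internal_nodes(t1), internal_nodes(t2)))

# Two toy parse trees that differ in a single noun (hypothetical examples).
t_dog = ("S", ("NP", ("D", "the"), ("N", "dog")), ("VP", ("V", "barks")))
t_cat = ("S", ("NP", ("D", "the"), ("N", "cat")), ("VP", ("V", "barks")))

print(tree_kernel(t_dog, t_dog))  # fragments shared by a tree with itself
print(tree_kernel(t_dog, t_cat))  # fewer shared fragments: "dog" and "cat" differ
```

A kernel of this kind can be evaluated on every pair of training examples to build a Gram matrix for an SVM, so the learner compares structures directly instead of relying on a bag-of-words vector. This version recomputes `delta` recursively for each node pair and is meant for small trees only.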