Scientific report: "Exploiting Syntactic and Shallow Semantic Kernels for Question/Answer Classification"

We study the impact of syntactic and shallow semantic information in automatic classification of questions and answers and in answer re-ranking. We define (a) new tree structures based on shallow semantics encoded in Predicate Argument Structures (PASs) and (b) new kernel functions to exploit the representational power of such structures with Support Vector Machines. Our experiments suggest that syntactic information helps tasks such as question/answer classification and that shallow semantics makes a remarkable contribution when a reliable set of PASs can be extracted, e.g. from answers.

Exploiting Syntactic and Shallow Semantic Kernels for Question/Answer Classification
Alessandro Moschitti, University of Trento, 38050 Povo di Trento, Italy
Silvia Quarteroni, The University of York, York YO10 5DD, United Kingdom, silvia@
Roberto Basili, Tor Vergata University, Via del Politecnico 1, 00133 Rome, Italy, basili@
Suresh Manandhar, The University of York, York YO10 5DD, United Kingdom

1 Introduction

Question answering (QA) is a form of information retrieval in which one or more answers, in the form of sentences or phrases, are returned to a question posed in natural language. The typical QA system architecture consists of three phases: question processing, document retrieval and answer extraction (Kwok et al., 2001).
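The three-phase architecture described above can be sketched in a few lines of Python. This is only an illustrative toy, not the system studied in the paper: the function names (`process_question`, `retrieve_documents`, `extract_answers`, `answer`), the question-word lookup table and the word-overlap scoring are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def process_question(question):
    """Phase 1 (toy): guess an expected answer class from the first word."""
    # Hypothetical lookup table; a real system would use a trained classifier.
    classes = {"who": "PERSON", "where": "LOCATION", "when": "DATE"}
    tokens = tokenize(question)
    first = tokens[0] if tokens else ""
    return {"tokens": tokens, "answer_class": classes.get(first, "OTHER")}


def retrieve_documents(query_tokens, corpus, top_n=2):
    """Phase 2 (toy): keep the top_n documents sharing most tokens with the query."""
    query = Counter(query_tokens)

    def overlap(doc):
        return sum((query & Counter(tokenize(doc))).values())

    return sorted(corpus, key=overlap, reverse=True)[:top_n]


def extract_answers(query_tokens, documents):
    """Phase 3 (toy): split documents into sentences, best word overlap first."""
    query = set(query_tokens)
    sentences = [s.strip() for d in documents for s in d.split(".") if s.strip()]
    return sorted(sentences,
                  key=lambda s: len(query & set(tokenize(s))),
                  reverse=True)


def answer(question, corpus):
    """Run the three phases in order and return (answer class, candidates)."""
    processed = process_question(question)
    docs = retrieve_documents(processed["tokens"], corpus)
    return processed["answer_class"], extract_answers(processed["tokens"], docs)
```

Calling `answer(question, corpus)` with a question string and a list of document strings returns the guessed answer class together with candidate sentences ordered by word overlap. Each phase is a separate function so that any one of them could be swapped for a learned component without touching the others.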
Question processing is often centered on question classification, which selects one of k expected answer classes. The most accurate models apply supervised machine learning techniques, e.g. SNoW (Li and Roth, 2005), where questions are encoded using various lexical, syntactic and semantic features. The retrieval and answer extraction phases consist of retrieving relevant documents (Collins-Thompson et al., 2004) and selecting candidate answer passages from them. A further answer re-ranking phase is optionally applied. Here too, the syntactic structure of a sentence appears to provide more useful information than a bag of .