Scientific report: "Syntactic and Semantic Kernels for Short Text Pair Categorization"

Alessandro Moschitti
Department of Computer Science and Engineering, University of Trento
Via Sommarive 14, 38100 Povo (TN), Italy
moschitti@

Abstract

Automatic detection of general relations between short texts is a complex task that cannot be carried out by relying only on language models and bag-of-words. Therefore, learning methods to exploit syntax and semantics are required. In this paper, we present a new kernel for the representation of shallow semantic information, along with a comprehensive study on kernel methods for the exploitation of syntactic/semantic structures for short text pair categorization. Our experiments with Support Vector Machines on question/answer classification show that our kernels can be used to greatly improve system accuracy.

1 Introduction

Previous work on Text Categorization (TC) has shown that advanced linguistic processing for document representation is often ineffective for this task, e.g. (Lewis, 1992; Fürnkranz et al., 1998; Allan, 2000; Moschitti and Basili, 2004). In contrast, work in question answering suggests that syntactic and semantic structures help in solving TC (Voorhees, 2004; Hickl et al., 2006). From these studies, it emerges that when the categorization task is linguistically complex, syntax and semantics may play a relevant role.
In this perspective, the study of the automatic detection of relationships between short texts is particularly interesting. Typical examples of such relations are given in (Giampiccolo et al., 2007) or those holding between question and answer, e.g. (Hovy et al., 2002; Punyakanok et al., 2004; Lin and Katz, 2003), i.e. whether a text fragment correctly responds to a question. In Question Answering, the latter problem is mostly tackled by using different heuristics and classifiers, which aim at extracting the best answers (Chen et al., 2006; Collins-Thompson et al., 2004). However, for definitional questions, a more effective approach would be to test if a .
