Scientific report: "State-of-the-Art Kernels for Natural Language Processing"

Alessandro Moschitti
Department of Computer Science and Information Engineering, University of Trento, Via Sommarive 5, 38123 Povo (TN), Italy
moschitti@

Introduction

In recent years, machine learning (ML) has been used more and more to solve complex tasks in different disciplines, ranging from Data Mining to Information Retrieval or Natural Language Processing (NLP). These tasks often require the processing of structured input, e.g., the ability to extract salient features from syntactic/semantic structures is critical to many NLP systems. Mapping such structured data into explicit feature vectors for ML algorithms requires considerable expertise, intuition and deep knowledge about the target linguistic phenomena. Kernel Methods (KM) are powerful ML tools (see, e.g., (Shawe-Taylor and Cristianini, 2004)), which can alleviate the data representation problem. They substitute feature-based similarities with similarity functions, i.e., kernels, directly defined between training/test instances, e.g., syntactic trees. Hence, feature vectors are not needed any longer. Additionally, kernel engineering, i.e., the composition or adaptation of several prototype kernels, facilitates the design of effective similarities required for new tasks, e.g., (Moschitti, 2004; Moschitti, 2008).
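
As a rough illustration of a kernel defined directly between instances, and of composing prototype kernels, the following Python sketch may help. It is not taken from the tutorial: the n-gram kernel over token lists stands in for the richer tree kernels the text mentions, and the names `ngram_kernel`, `normalized`, `kernel_sum` and `gram_matrix` are hypothetical helpers introduced only for this example.

```python
from collections import Counter
from math import sqrt


def ngram_kernel(x, y, n=2):
    """Spectrum-style kernel: weighted count of token n-grams shared by x and y."""
    cx = Counter(tuple(x[i:i + n]) for i in range(len(x) - n + 1))
    cy = Counter(tuple(y[i:i + n]) for i in range(len(y) - n + 1))
    # Implicit dot product in n-gram space; no feature vector is ever built.
    return float(sum(cx[g] * cy[g] for g in cx))


def normalized(kernel):
    """Wrap a kernel so that k(x, x) == 1 for any x with non-zero self-similarity."""
    def k(x, y):
        denom = sqrt(kernel(x, x) * kernel(y, y))
        return kernel(x, y) / denom if denom > 0 else 0.0
    return k


def kernel_sum(k1, k2, w1=1.0, w2=1.0):
    """Compose two kernels; a non-negative weighted sum of kernels is a kernel."""
    return lambda x, y: w1 * k1(x, y) + w2 * k2(x, y)


def gram_matrix(kernel, xs, ys):
    """Pairwise similarities between two lists of instances."""
    return [[kernel(x, y) for y in ys] for x in xs]


# Two prototype kernels (unigram and bigram overlap) combined into one.
unigram = normalized(lambda x, y: ngram_kernel(x, y, n=1))
bigram = normalized(lambda x, y: ngram_kernel(x, y, n=2))
combined = kernel_sum(unigram, bigram)

# Toy instances, assumed for illustration only.
sentences = [
    "the cat chased the mouse".split(),
    "the dog chased the cat".split(),
    "kernels avoid explicit feature vectors".split(),
]
gram = gram_matrix(combined, sentences, sentences)
```

A matrix such as `gram` could then be handed to a learner that accepts precomputed kernels, for instance scikit-learn's `SVC(kernel="precomputed")`. Swapping the n-gram kernel for a tree kernel would change only the similarity function, not the learning procedure.
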
Tutorial Content

The tutorial aims at addressing the problems above: firstly, it will introduce essential and simplified theory of Support Vector Machines and KM with the only aim of motivating practical procedures and interpreting the results. Secondly, it will simply describe the current best practices for designing applications based on effective kernels. For this purpose, it will survey state-of-the-art kernels for diverse NLP applications, reconciling the different approaches with a uniform and global notation/theory. Such survey will benefit from practical expertise acquired from directly working on many natural language applications, ranging from Text Categorization to
