Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 95

Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. To be able to discover and to extract knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. There is a lot of hidden knowledge waiting to be discovered – this is the challenge created by today’s abundance of data. Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

920 Johannes Fürnkranz

... DOM-tree. Kushmerick (2000) first studied the problem of inducing such wrappers from a set of training examples in which the information to extract is marked. He studies a variety of types of wrapper algorithms with different expressiveness. The simplest class, LR wrappers, assumes a highly regular source page whose content can be mapped into a database table by learning delimiters for each attribute. LR wrappers were able to wrap 53% of the pages in an experimental study; more expressive classes were able to wrap up to 70%. Moreover, it was shown that all studied wrapper classes are PAC-learnable. Grieser, Jantke, Lange, and Thomas (2000) extend this work with a study of theoretical properties and learnability results for island wrappers, a generalization of the wrapper types studied by Kushmerick (2000). SoftMealy (Hsu and Dung 1998) addresses several of the shortcomings of the framework of Kushmerick (2000), most notably the restriction to single sequences of features, by learning a finite-state transducer that can encode all occurring sequences of features. Lerman, Minton, and Knoblock (2003) discuss learning approaches for supporting the maintenance of existing wrappers. The field has also seen numerous commercial efforts, such as the Lixto project (Gottlob et al. 2004) or IBM's Andes project (Myllymaki 2001). The most notable applications of information extraction techniques are comparison shopping agents (Doorenbos et al. 1997).

The Semantic Web

The Semantic Web is a term coined by Tim Berners-Lee for the vision of making the information on the Web machine-processable (Berners-Lee et al. 2001). The basic idea is to enrich web pages with machine-processable knowledge that is represented in the form of ontologies (Staab and Studer 2004; Fensel 2001). Ontologies define certain types of objects and the relations between them. As ontologies are readily accessible like other web documents, a computer program can use them to draw inferences about the information.
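
Looking back at the LR wrappers described in the wrapper-induction passage above, the idea of "learning delimiters for each attribute" can be illustrated with a small Python sketch. It is not Kushmerick's published algorithm: as a simplifying assumption, it takes the left delimiter of an attribute to be the longest common suffix of the text preceding every marked value, and the right delimiter to be the longest common prefix of the text following it, without the additional validity checks a full learner would need. The function names (`learn_lr`, `apply_lr`, `locate`) and the sample pages are hypothetical and chosen only for illustration.

```python
from os.path import commonprefix  # character-wise longest common prefix


def common_suffix(strings):
    """Return the longest string that every input ends with."""
    return commonprefix([s[::-1] for s in strings])[::-1]


def locate(page, rows):
    """Find character spans of the marked values, scanning left to right."""
    spans, pos = [], 0
    for row in rows:
        row_spans = []
        for value in row:
            start = page.find(value, pos)
            if start < 0:
                raise ValueError(f"marked value {value!r} not found in page")
            pos = start + len(value)
            row_spans.append((start, pos))
        spans.append(row_spans)
    return spans


def learn_lr(examples):
    """Learn one (left, right) delimiter pair per attribute.

    Assumption: longest common suffix/prefix is used as the delimiter,
    which is a simplification of a real LR wrapper learner.
    """
    num_attributes = len(examples[0][1][0])
    wrapper = []
    for k in range(num_attributes):
        before, after = [], []
        for page, rows in examples:
            for row_spans in locate(page, rows):
                start, end = row_spans[k]
                before.append(page[:start])
                after.append(page[end:])
        left, right = common_suffix(before), commonprefix(after)
        if not left or not right:
            raise ValueError(f"no shared delimiters for attribute {k}")
        wrapper.append((left, right))
    return wrapper


def apply_lr(wrapper, page):
    """Extract rows by scanning for each delimiter pair in turn."""
    rows, pos = [], 0
    while True:
        row = []
        for left, right in wrapper:
            start = page.find(left, pos)
            if start < 0:
                return rows
            start += len(left)
            end = page.find(right, start)
            if end < 0:
                return rows
            row.append(page[start:end])
            # Resume at the right delimiter, since it may overlap
            # the left delimiter of the next attribute.
            pos = end
        rows.append(tuple(row))


# Hypothetical training page with the information to extract marked as rows.
train_page = ("<HTML><BODY><B>Congo</B> <I>242</I><BR>"
              "<B>Spain</B> <I>34</I><BR></BODY></HTML>")
train_rows = [("Congo", "242"), ("Spain", "34")]
wrapper = learn_lr([(train_page, train_rows)])

# A new page with the same regular layout.
new_page = ("<HTML><BODY><B>Belize</B> <I>501</I><BR>"
            "<B>Egypt</B> <I>20</I><BR></BODY></HTML>")
print(apply_lr(wrapper, new_page))  # [('Belize', '501'), ('Egypt', '20')]
```

The sketch only works because the page is highly regular, which is exactly the assumption the text attributes to LR wrappers. Delimiters learned this way can also overfit the training data (for example, when all marked values happen to start with the same character), which is one reason more examples or more expressive wrapper classes are needed in practice.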
