Measuring semantic similarity between words using page counts and snippets
ISSN: 2249-5789
Manasa Ch et al., International Journal of Computer Science & Communication Networks, Vol 2(4), 553-558

Computer Science & Engineering, SR Engineering College, Warangal, Andhra Pradesh, India. Email:
Assistant Professor, CSE, SR Engineering College, Warangal, Andhra Pradesh, India. Email: naikramana@
Ananda Raj, Sr. Assistant Professor, CSE, SR Engineering College, Warangal, Andhra Pradesh, India. Email: anandsofttech@

Abstract

Web mining involves activities such as document clustering and community mining performed on the web. Such tasks require measuring the semantic similarity between words, which makes web mining activities easier to carry out in many applications. However, accurately measuring the semantic similarity between two words is a difficult task. In this paper, a new approach is proposed to measure similarity between words. The approach is based on text snippets and page counts, both taken from the results of a search engine such as Google. Lexical patterns are extracted from the text snippets, word co-occurrence measures are defined using the page counts, and the results of the two are combined. Moreover, we propose pattern extraction and pattern clustering algorithms to find the various relationships between any two given words. Support Vector Machines, a data mining technique, are used to optimize the results. The empirical results indicate that the proposed techniques produce results that compare well with human ratings and with the accuracy needed in web mining activities.
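As an illustration of the two signals named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes the page counts have already been obtained from a search engine, and it uses two common co-occurrence measures (the Jaccard coefficient and pointwise mutual information) as stand-ins, since the abstract does not say which measures are used. The function names, the `total_pages` parameter, and the single-token matching in `extract_pattern` are all hypothetical choices made for this example.

```python
import math
import re


def web_jaccard(count_p, count_q, count_pq):
    """Jaccard coefficient from page counts (assumed measure, for illustration).

    count_p, count_q: page counts for the queries P and Q alone.
    count_pq: page count for the conjunctive query "P AND Q".
    """
    union = count_p + count_q - count_pq
    if count_pq <= 0 or union <= 0:
        return 0.0
    return count_pq / union


def web_pmi(count_p, count_q, count_pq, total_pages):
    """Pointwise mutual information (base 2) from page counts.

    total_pages is an assumed estimate of the number of indexed pages.
    """
    if min(count_p, count_q, count_pq) <= 0:
        return 0.0
    joint = count_pq / total_pages
    independent = (count_p / total_pages) * (count_q / total_pages)
    return math.log2(joint / independent)


def extract_pattern(snippet, word_p, word_q):
    """Return the lexical pattern linking two single-token words, or None.

    The first occurrence of word_p becomes X and of word_q becomes Y;
    the pattern is the token span from one marker to the other.
    """
    tokens = re.findall(r"\w+", snippet.lower())
    marked = []
    for token in tokens:
        if token == word_p.lower():
            marked.append("X")
        elif token == word_q.lower():
            marked.append("Y")
        else:
            marked.append(token)
    if "X" not in marked or "Y" not in marked:
        return None
    start, end = sorted((marked.index("X"), marked.index("Y")))
    return " ".join(marked[start:end + 1])
```

For example, with the made-up snippet "Ostrich is a large bird found in Africa", `extract_pattern(snippet, "ostrich", "bird")` returns `"X is a large Y"`. In a full system, the frequencies of such patterns and the page-count scores would form the feature vector passed to a classifier such as an SVM; that combination step is not shown here.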
Key Words - Text snippets, word count, semantic similarity, web mining, lexical patterns

1. INTRODUCTION

Web mining has gained popularity as a huge amount of information is being made available over the web, and the automated processing of such data or information is the need of