An Empirical Study of Similarity Search in Stock Data
Financial data are conventionally represented in numeric format for data mining purposes. However, recent works have demonstrated promising results from representing financial data symbolically. For instance, Kovalerchuk et al. (2002) argue that symbolic relational data mining is more suitable for incorporating background knowledge. Their proposed methodology outperforms approaches based on numeric financial data in generating IF-THEN rules. In (Ting et al. 2006), sequential and non-sequential association rule mining (ARM) were used to perform intra- and inter-stock pattern mining, where each stock is represented symbolically based on its performance with respect to a user-defined threshold. Similarly, we.

Lay-Ki Soon, Sang Ho Lee
School of Information Technology, Soongsil University, 1-1 Sangdo-dong, Dongjak-gu, Seoul 156-743, Korea
laykisoon@ shlee199@

Abstract

Using certain artificial intelligence techniques, stock data mining has given encouraging results in both trend analysis and similarity search. However, representing stock data effectively is a key issue in ensuring the success of a data mining process. In this paper, we aim to compare the performance of numeric and symbolic data representations of a stock dataset in terms of similarity search. Given a properly normalized dataset, our empirical study suggests that the results produced by numeric stock data are more consistent than those produced by symbolic stock data.

Keywords: financial data mining, similarity search, data normalization, computational finance.

1 Introduction

Stock data mining plays an important role in realizing the vision of autonomous financial market analysis, or computational finance. The Efficient Market Theory asserts that it is impossible to infer a consistent and global forecasting model for the stock market by using any information that the market already knows.
Stock data mining neither accepts nor rejects this theory; instead, it aims to discover subtle short-term conditional patterns and trends in a wide range of financial data (Kovalerchuk and Vityaev 2000). As one of the key applications of time series data mining, stock data mining generally focuses on trend modeling and forecasting (Han and Kamber 2006). In this paper, we analyze the stock dataset by performing similarity search. By identifying stocks that share similar behavior, we can gain insight into the underlying patterns, which is helpful in further analysis such as stock market forecasting. Nevertheless, one of the main challenges in stock data mining is to find an effective knowledge representation of the stock dataset (Kovalerchuk and Vityaev 2000; Kovalerchuk et al.).
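
To make the contrast between numeric and symbolic representations concrete, the following is a minimal sketch of similarity search under both. It is not the procedure used in this study: the z-normalization, the Euclidean and Hamming distances, the U/D/S alphabet, the 1% threshold, and the toy price series are all assumptions chosen purely for illustration.

```python
import math

def z_normalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    if std == 0:
        return [0.0] * len(series)
    return [(x - mean) / std for x in series]

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def symbolize(prices, threshold=0.01):
    """Map each day-to-day relative change to 'U' (up), 'D' (down) or 'S' (steady).

    The threshold plays the role of a user-defined performance threshold;
    the 1% default is an arbitrary illustrative value.
    """
    symbols = []
    for prev, curr in zip(prices, prices[1:]):
        change = (curr - prev) / prev
        if change > threshold:
            symbols.append("U")
        elif change < -threshold:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)

def hamming(a, b):
    """Number of positions at which two equal-length symbol strings differ."""
    return sum(x != y for x, y in zip(a, b))

def most_similar(query, candidates, distance):
    """Return the name of the candidate closest to the query under `distance`."""
    return min(candidates, key=lambda name: distance(query, candidates[name]))

# Hypothetical closing prices; B follows the same shape as A at ten times the price.
stocks = {
    "A": [10.0, 10.5, 11.0, 10.8, 11.5],
    "B": [100.0, 105.0, 110.0, 108.0, 115.0],
    "C": [50.0, 49.0, 48.5, 49.0, 47.0],
}

numeric = {name: z_normalize(p) for name, p in stocks.items()}
symbolic = {name: symbolize(p) for name, p in stocks.items()}

others_numeric = {n: v for n, v in numeric.items() if n != "A"}
others_symbolic = {n: v for n, v in symbolic.items() if n != "A"}

print(most_similar(numeric["A"], others_numeric, euclidean))
print(most_similar(symbolic["A"], others_symbolic, hamming))
```

In this toy case both representations pick the same nearest neighbour. The symbolic form discards the magnitude of each move, so its results depend on the chosen threshold, while the numeric form depends on how the series are normalized.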