Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 90

Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. To be able to discover and to extract knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. There is a lot of hidden knowledge waiting to be discovered; this is the challenge created by today's abundance of data. Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

Slava Kisilevich, Florian Mansmann, Mirco Nanni, Salvatore Rinzivillo

44.4 Open Issues

Spatio-temporal properties of the data introduce additional complexity to the data mining process, and to clustering in particular. We can differentiate between two types of issues that the analyst should deal with or take into consideration during analysis: general and application-dependent. The general issues involve such aspects as data quality, precision, and uncertainty (Miller and Han, 2009). Scalability, spatial resolution, and time granularity can be regarded as application-dependent issues.

Data quality (spatial and temporal) and precision depend on the way the data is generated. Movement data is usually collected using GPS-enabled devices attached to an object. For example, when a person enters a building, the GPS signal can be lost, or the positioning may be inaccurate due to a weak connection to the satellites. As in the general data preprocessing step, the analyst should decide how to handle missing or inaccurate parts of the data: should they be ignored, tolerated, or interpolated?

Computational power does not keep pace with the rate at which large amounts of data are being generated and stored.
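The choice between tolerating and interpolating missing parts of a GPS track, mentioned above, can be illustrated with a minimal sketch. It is not taken from the handbook. The names `Fix`, `fill_gaps`, `step`, and `max_gap` are hypothetical, and the sketch assumes planar (already projected) coordinates and straight-line movement between fixes, which is only plausible for short gaps.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fix:
    """One position sample; x and y are assumed to be planar coordinates."""
    t: float                    # timestamp in seconds
    x: float
    y: float
    interpolated: bool = False  # True for points synthesized by fill_gaps


def fill_gaps(track: List[Fix], step: float, max_gap: float) -> List[Fix]:
    """Linearly interpolate short gaps in a time-ordered track.

    A gap longer than `step` but not longer than `max_gap` is filled with
    synthetic points every `step` seconds. Longer gaps are left as they are
    (tolerated), because a straight line is unlikely to describe the
    movement, for example while the object was inside a building.
    """
    if not track:
        return []
    out = [track[0]]
    for prev, cur in zip(track, track[1:]):
        gap = cur.t - prev.t
        if step < gap <= max_gap:
            for k in range(1, int(gap // step) + 1):
                t = prev.t + k * step
                if t >= cur.t:
                    break  # never duplicate the real fix that ends the gap
                w = (t - prev.t) / gap  # fraction of the gap already covered
                out.append(Fix(t,
                               prev.x + w * (cur.x - prev.x),
                               prev.y + w * (cur.y - prev.y),
                               interpolated=True))
        out.append(cur)
    return out


if __name__ == "__main__":
    # Hypothetical track: a 30 s gap (filled) and a 960 s gap (left as is).
    track = [Fix(0, 0.0, 0.0), Fix(10, 1.0, 1.0),
             Fix(40, 4.0, 1.0), Fix(1000, 9.0, 9.0)]
    for fix in fill_gaps(track, step=10, max_gap=60):
        print(fix)
```

Marking synthetic points with a flag keeps the decision reversible: a later analysis step can still ignore interpolated positions or weight them lower than measured ones.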
As data volumes outgrow the available computational power, scalability becomes a significant issue for the analysis and demands new algorithmic solutions or approaches for handling the data.

Spatial resolution and time granularity can be regarded as the most crucial issues in spatio-temporal clustering, since a change in the size of the area over which an attribute is distributed, or a change in the time interval, can lead to the discovery of completely different clusters and therefore to an incorrect explanation of the phenomena under investigation. There are still no general guidelines for the proper selection of spatial and temporal resolution, and it is rather unlikely that such guidelines will be proposed. Instead, ad hoc approaches are proposed to handle the problem in specific domains (see, for example, Nanni and Pedreschi, 2006). Due to this, the involvement