Microsoft Data Mining: Integrated Business Intelligence for e-Commerce and Knowledge, Part 4

As might be expected, financial companies are heavy users of data mining to improve profitability and reduce risk. Home Savings of America FSB, Irwindale, CA, the nation's largest savings and loan company, analyzes mortgage delinquencies, foreclosures, sales activity, and even geological trends ...

3.6 Data mining tasks supported by SQL Server 2000 Analysis Services

The goal of cluster analysis is to identify groups of cases that are as similar as possible with respect to a number of variables in the data set, yet as different as possible with respect to those variables when compared with any other cluster in the grouping. Records that have similar purchasing or spending patterns, for example, form easily identified segments for targeting different products. In terms of personalized interaction, different clusters can provide strong cues to suggest different treatments. Clustering is very often used to define market segments.

A number of techniques have evolved over time to carry out clustering tasks. One of the oldest is K-means clustering. In K-means clustering, the user specifies the number of means (K) that will serve as bins, or clusters, to hold the observations in the data set. Observations are then allocated to each of the bins or clusters according to their similarity.

Another technique is expectation maximization (EM). EM differs from K-means in that each observation has a propensity to be in any one bin or cluster, based on a probability weight. In this way, observations actually belong to multiple clusters, except that the probability of being in each cluster rises or falls depending on how strong the weight is.
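The difference between the two kinds of assignment can be illustrated with a small sketch. This is not the algorithm used by Analysis Services; it is a minimal, pure-Python illustration under stated assumptions. The function names (k_means, soft_weights), the spread parameter, and the sample data are all hypothetical. The soft_weights function shows only the weighting step of an EM-style method, assuming equally likely clusters of equal, spherical spread; a full EM procedure would also re-estimate the cluster parameters from those weights.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=20, seed=0):
    """Hard assignment: each observation goes to exactly one of k bins."""
    rng = random.Random(seed)
    means = rng.sample(points, k)  # random initial means drawn from the data
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Allocate the observation to the bin with the closest mean.
            nearest = min(range(k), key=lambda j: squared_distance(p, means[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty bin keeps its previous mean
                means[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return means


def soft_weights(point, means, spread=1.0):
    """EM-style weighting (assumed simplification): one probability per cluster."""
    scores = [
        math.exp(-squared_distance(point, m) / (2 * spread ** 2)) for m in means
    ]
    total = sum(scores)
    return [s / total for s in scores]


if __name__ == "__main__":
    # Hypothetical data: two spending patterns (low and high).
    data = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
    means = k_means(data, k=2)
    print("cluster means:", means)
    # An observation between the two groups receives a weight for each cluster.
    print("weights for (4.0, 4.0):", soft_weights((4.0, 4.0), means, spread=3.0))
```

With K-means, an observation lying between two groups is placed in exactly one bin. With the weighted approach, it keeps a share of membership in every cluster, and that share grows as the observation moves closer to a cluster's mean.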
Microsoft has experimented with both of these approaches, and also with the idea of taking many different starting points in the computation of the bins or clusters so that cluster results are identified more consistently (the traditional approach is simply to choose the initial K means by random assignment). The current Analysis Server in SQL Server 2000 employs a tried-and-true, randomly assigned K-means nearest-neighbor clustering approach. If we examine a targeted marketing application which looks at the attributes of various people in terms of their propensity to respond to different conference .