tailieunhanh - John Wiley & Sons: Data Mining Techniques for Marketing, Sales (part 4)

Chapter 3

When missing values must be replaced, the best approach is to impute them by creating a model that has the missing value as its target variable.

Values with Meanings That Change over Time

When data comes from several different points in history, it is not uncommon for the same value in the same field to have changed its meaning over time. Credit class "A" may always be the best, but the exact range of credit scores that get classed as an "A" may change from time to time. Dealing with this properly requires a well-designed data warehouse where such changes in meaning are recorded, so that a new variable can be defined that has a constant meaning over time.

Inconsistent Data Encoding

When information on the same topic is collected from multiple sources, the various sources often represent the same data in different ways. If these differences are not caught, they add spurious distinctions that can lead to erroneous conclusions. In one call-detail analysis project, each of the markets studied had a different way of indicating a call to check one's own voice mail. In one city, a call to voice mail from the phone line associated with that mailbox was recorded as having the same origin and destination numbers. In another city, the same situation was represented by the presence of a specific nonexistent number as the call destination. In yet another city, the actual number dialed to reach voice mail was recorded. Understanding apparent differences in voice mail habits between cities required putting the data in a common form. The same data set contained multiple abbreviations for some states, and in some cases a particular city was counted separately from the rest of the state.
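
The voice mail and state-abbreviation examples above can be sketched in code. The following is a minimal illustration, not the method used in the project described: the market names, the placeholder and access numbers, and the alias table are all hypothetical. It shows one way to map three different per-market conventions onto a single voice-mail flag, and to fold state spellings into one code.

```python
from dataclasses import dataclass

# Hypothetical conventions, one per market, modeled on the three cases in the text.
SAME_NUMBER = "same_number"      # origin == destination marks a voice mail check
DUMMY_NUMBER = "dummy_number"    # a reserved, nonexistent destination number
ACCESS_NUMBER = "access_number"  # the real voice mail access number is logged

# Market names and marker numbers are illustrative assumptions, not real data.
MARKET_RULES = {
    "city_a": (SAME_NUMBER, None),
    "city_b": (DUMMY_NUMBER, "0000000000"),
    "city_c": (ACCESS_NUMBER, "5550100"),
}


@dataclass(frozen=True)
class CallRecord:
    market: str
    origin: str
    destination: str


def is_voice_mail_check(call: CallRecord) -> bool:
    """Return True if the record represents a voice mail check,
    according to the convention of the market it came from."""
    convention, marker = MARKET_RULES[call.market]
    if convention == SAME_NUMBER:
        return call.origin == call.destination
    # Both remaining conventions compare the destination to a known number.
    return call.destination == marker


# Hypothetical alias table: several spellings of one state, plus a city
# that one source reported separately from the rest of its state.
STATE_ALIASES = {
    "ca": "CA",
    "calif": "CA",
    "california": "CA",
    "los angeles": "CA",
}


def normalize_state(raw: str) -> str:
    """Map a raw state field to a single code; unknown values are upper-cased."""
    key = raw.strip().lower().rstrip(".")
    return STATE_ALIASES.get(key, raw.strip().upper())
```

Once every record carries the same flag and the same state code, comparisons between markets reflect customer behavior and not recording conventions. Keeping the rules in lookup tables makes each convention explicit and easy to extend when a new source is added.
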
If issues like this are not resolved, you may find yourself building a model of calling patterns to California based on data that excludes calls to Los Angeles.

Step Six: Transform Data to Bring Information to the Surface

Once the data has been assembled and major data problems fixed, the data must still be .
