Class Notes in Statistics and Econometrics Part 17

CHAPTER 33 Regression Graphics

The "regression" referred to in the title of this chapter is not necessarily linear regression. The population regression can be defined as follows: the random scalar y and the random vector x have a joint distribution, and we want to know how the conditional distribution of y | x = x depends on the value x taken by the random vector. The distributions themselves are not known, but we have datasets, and we use graphical means to estimate the distributions from these datasets.

PROBLEM 384. Someone said on an email list about statistics: "If you cannot see an effect in the data, then there is no use trying to estimate it." Right or wrong?

ANSWER. One argument one might give against this claim is the curse of dimensionality. Also, higher moments of the distribution (kurtosis etc.) cannot be seen very clearly with the naked eye.

Scatterplot Matrices

One common graphical method to explore a dataset is to make a scatter plot of each data series against each of the others and to arrange these plots in a matrix. In R, the pairs function does this. Scatterplot matrices should be produced in the preliminary stages of the investigation, but the researcher should not think he or she is done after having looked at them.

In the construction of scatterplot matrices, it is good practice to change the signs of some of the variables so that all correlations become positive, if this is possible.

BT99, pp. 17-20, gives a good example of the kinds of things that can be seen from looking at scatterplot matrices. The data for this book are available at http biometric

PROBLEM 385.
(5 points) Which inferences about the datasets can you draw from looking at the scatterplot matrix in BT99 Exhibit p. 14?

ANSWER. The discussion on BT99 p. 19 distinguishes three categories. First, the univariate phenomena: yield is more concentrated for local genotypes than for imports; the converse is true for protein, but not as pronounced; oil and seed size are lower for local genotypes; regarding seed size