Class Notes in Statistics and Econometrics Part 35

CHAPTER 69

Binary Choice Models

Fisher's Scoring and Iteratively Reweighted Least Squares

This section draws on chapter 55 about Numerical Minimization. Another important "natural" choice for the positive definite matrix $R_i$ in the gradient method is available if one maximizes a likelihood function: then $R_i$ can be the inverse of the information matrix evaluated at the current parameter values $\beta_i$. This is called Fisher's Scoring method. It is closely related to the Newton-Raphson method: Newton-Raphson uses the Hessian matrix, and the information matrix is minus the expected value of the Hessian. Apparently Fisher first used the information matrix as a computational simplification of the Newton-Raphson method. Today IRLS is used in the GLIM program for generalized linear models.

As in chapter 56 discussing nonlinear least squares, $\beta$ is the vector of parameters of interest, and we will work with an intermediate vector $\eta(\beta)$ of predictors whose dimension is comparable to that of the observations. The likelihood function therefore has the form $L = L(y, \eta(\beta))$. By the chain rule one can write the Jacobian of the likelihood function as

$$\frac{\partial L}{\partial\beta^{\top}}(\beta_i) = u^{\top}X,$$

where $u^{\top} = \frac{\partial L}{\partial\eta^{\top}}\bigl(\eta(\beta_i)\bigr)$ is the Jacobian of $L$ as a function of $\eta$, evaluated at $\eta(\beta_i)$, and $X = \frac{\partial\eta}{\partial\beta^{\top}}(\beta_i)$ is the Jacobian of $\eta(\beta)$. This is the same notation as in the discussion of the Gauss-Newton regression. Define $A = \operatorname{E}[uu^{\top}]$. Since $X$ does not depend on the random variables, the information matrix of $y$ with respect to $\beta$ is then $\operatorname{E}[X^{\top}uu^{\top}X] = X^{\top}AX$. If one uses the inverse of this information matrix as the $R$-matrix in the gradient algorithm, one gets the update

$$\beta_{i+1} = \beta_i + (X^{\top}AX)^{-1}X^{\top}u.$$

The Iteratively Reweighted Least Squares interpretation of this comes from rewriting the update as

$$\beta_{i+1} - \beta_i = (X^{\top}AX)^{-1}X^{\top}A\,A^{-1}u,$$

i.e., one obtains the step by regressing $A^{-1}u$ on $X$ with weighting matrix $A$.

Among the justifications of IRLS: the information matrix is usually analytically simpler than the Hessian of the likelihood function.
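To make the iteration concrete, the following is a minimal numerical sketch of Fisher scoring / IRLS for the simplest binary choice model, a logit with linear predictor $\eta = X\beta$. For the logit link the score with respect to $\eta$ has components $u_i = y_i - p_i$ and $A = \operatorname{E}[uu^{\top}] = \operatorname{diag}\bigl(p_i(1-p_i)\bigr)$, so the scoring step is exactly the weighted regression described above. The function name irls_logit, the convergence tolerance, and the iteration cap are illustrative choices, not something prescribed in these notes.

    import numpy as np

    def irls_logit(X, y, max_iter=25, tol=1e-8):
        """Fisher scoring / IRLS for a logit model.

        X: (n, k) matrix of predictors, y: (n,) vector of 0/1 responses.
        Returns the maximum likelihood estimate of beta.
        """
        beta = np.zeros(X.shape[1])
        for _ in range(max_iter):
            eta = X @ beta                    # linear predictor eta = X beta
            p = 1.0 / (1.0 + np.exp(-eta))    # fitted probabilities
            u = y - p                         # score dL/d eta, one entry per observation
            a = p * (1.0 - p)                 # diagonal of A = E[u u^T] under the logit link
            # Fisher scoring step (X^T A X)^{-1} X^T u, i.e. the coefficient
            # vector from regressing A^{-1} u on X with weighting matrix A.
            step = np.linalg.solve(X.T @ (a[:, None] * X), X.T @ u)
            beta = beta + step
            if np.max(np.abs(step)) < tol:
                break
        return beta

Note that regressing $A^{-1}u$ on $X$ with weighting matrix $A$ amounts to solving the normal equations $(X^{\top}AX)\,\delta = X^{\top}u$, which is what the solve call does directly, so $A^{-1}$ never has to be formed explicitly. For the logit link the Hessian equals its expected value, so this Fisher scoring iteration coincides with Newton-Raphson.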