Econometric Analysis of Cross Section and Panel Data by Wooldridge - Chapter 19

Count Data and Related Models

Why Count Data Models?

A count variable is a variable that takes on nonnegative integer values. Many variables that we would like to explain in terms of covariates come as counts. A few examples include the number of times someone is arrested during a given year, the number of emergency room drug episodes during a given week, the number of cigarettes smoked per day, and the number of patents applied for by a firm during a year. These examples have two important characteristics in common: there is no natural a priori upper bound, and the outcome will be zero for at least some members of the population. Other count variables do have an upper bound. For example, for the number of children in a family who are high school graduates, the upper bound is the number of children in the family.

If $y$ is the count variable and $x$ is a vector of explanatory variables, we are often interested in the population regression $E(y \mid x)$. Throughout this book we have discussed various models for conditional expectations, and we have discussed different methods of estimation. The most straightforward approach is a linear model, $E(y \mid x) = x\beta$, estimated by OLS. For count data, linear models have shortcomings very similar to those for binary responses or corner solution responses: because $y \ge 0$, we know that $E(y \mid x)$ should be nonnegative for all $x$. If $\hat{\beta}$ is the OLS estimator, there usually will be values of $x$ such that $x\hat{\beta} < 0$, so that the predicted value of $y$ is negative.

For strictly positive variables we often use the natural log transformation, $\log(y)$, and use a linear model. This approach is not possible in interesting count data applications, where $y$ takes on the value zero for a nontrivial fraction of the population. Transformations could be applied that are defined for all $y \ge 0$, for example $\log(1 + y)$, but $\log(1 + y)$ itself is nonnegative, and it is not obvious how to recover $E(y \mid x)$ from a linear model for $E[\log(1 + y) \mid x]$. With count data it is better to model $E(y \mid x)$ directly and to choose functional forms that ...
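The two difficulties described above can be illustrated with a small simulation. The sketch below is not from the book. It assumes a hypothetical data-generating process (Poisson counts with conditional mean exp(-1 + 1.5x), with x uniform on [-2, 2]), chosen only for illustration, and uses NumPy. It fits a linear model by OLS and reports the share of negative fitted values. It then compares the sample mean of y with the naive back-transformation exp(mean(log(1 + y))) - 1, which by Jensen's inequality falls below the mean of y whenever y is not constant.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Hypothetical data-generating process (an assumption for illustration):
# a count outcome whose conditional mean is exp(-1 + 1.5 * x).
n = 1000
x = rng.uniform(-2.0, 2.0, size=n)
y = rng.poisson(np.exp(-1.0 + 1.5 * x)).astype(float)

# OLS fit of the linear model E(y | x) = b0 + b1 * x.
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta_hat

# Share of observations whose OLS prediction is negative,
# even though every observed y is a nonnegative count.
share_negative = np.mean(fitted < 0)
print("smallest observed y:", y.min())
print("share of negative OLS fitted values:", share_negative)

# Naive back-transformation from the mean of log(1 + y) to the mean of y.
# This understates the mean of y, so it does not recover E(y).
naive_mean = np.expm1(np.mean(np.log1p(y)))
print("sample mean of y:", y.mean())
print("exp(mean(log(1 + y))) - 1:", naive_mean)
```

In this simulated setting the fitted line dips below zero at low values of x, and the back-transformed mean is smaller than the sample mean, which is consistent with the argument for modeling the conditional mean directly.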
