The approach to heteroskedasticity described so far is the one usually found in introductory econometrics textbooks. It is somewhat old-fashioned, however. Researchers today tend to prefer a more convenient approach: rather than investigating the form of the heteroskedasticity and then correcting for it under an assumed variance structure, they use an estimator of the standard errors that is robust to heteroskedasticity.
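We know what the variance of the OLS slope estimator $b_1$ looks like in the simple linear regression model when the error variance $\sigma_i^2$ is allowed to differ across observations:

$$V(b_1) = \frac{\sum_{i=1}^{n}(X_i-\bar{X})^2\,\sigma_i^2}{\left[\sum_{i=1}^{n}(X_i-\bar{X})^2\right]^2}$$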
Halbert White, an econometrician, showed that the unknown population variance $\sigma_i^2$ of each observation can be replaced by the corresponding squared least squares residual $e_i^2$. Doing so gives consistent estimates of the true standard errors, which provide a basis for inference in large samples. Hence, a heteroskedasticity-consistent variance estimator for the OLS slope estimator $b_1$ can be computed using the following formula:

$$\hat{V}(b_1) = \frac{\sum_{i=1}^{n}(X_i-\bar{X})^2\,e_i^2}{\left[\sum_{i=1}^{n}(X_i-\bar{X})^2\right]^2} \qquad (9.24)$$
Since (9.24) is a large-sample estimator, it is valid only asymptotically: tests based on it are not exact, and in small samples the precision of the estimator may be poor. Fortunately, there is a small-sample adjustment that can improve the precision considerably: multiply the variance estimator by n/(n − k), where n is the sample size and k is the number of estimated parameters. More importantly, the formula can be generalized to the multiple regression case, even though it becomes slightly more complicated. Most econometric software, such as Stata and SAS, includes an option to report robust standard errors together with the parameter estimates when running the regression. Hence, in your own practical work you should always use robust standard errors when running regression models.
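To make the formula concrete, the following is a minimal sketch of how the conventional and the heteroskedasticity-consistent standard errors of the slope could be computed by hand for a simple regression. The data are simulated purely for illustration (the true slope of 2 and the error standard deviation of 0.5 times X are assumptions, not taken from the text), and the function names are hypothetical. It uses k = 2 because a simple regression estimates an intercept and a slope.

```python
import math
import random


def ols_simple(x, y):
    """Return intercept, slope and residuals of a simple OLS regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    return b0, b1, resid


def slope_standard_errors(x, y, k=2):
    """Return slope, conventional SE, White SE and small-sample adjusted White SE."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    _, b1, e = ols_simple(x, y)

    # Conventional OLS variance: assumes one common error variance s^2.
    s2 = sum(ei ** 2 for ei in e) / (n - k)
    se_ols = math.sqrt(s2 / sxx)

    # White's estimator: each sigma_i^2 is replaced by the squared residual e_i^2.
    var_white = sum((xi - x_bar) ** 2 * ei ** 2 for xi, ei in zip(x, e)) / sxx ** 2
    se_white = math.sqrt(var_white)

    # Small-sample adjustment: multiply the variance by n / (n - k).
    se_white_adj = math.sqrt(var_white * n / (n - k))
    return b1, se_ols, se_white, se_white_adj


# Simulated heteroskedastic data (illustrative assumption):
# the error standard deviation grows with x.
random.seed(0)
n = 500
x = [random.uniform(1.0, 10.0) for _ in range(n)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in x]

b1, se_ols, se_white, se_white_adj = slope_standard_errors(x, y)
print(f"slope            = {b1:.4f}")
print(f"OLS SE           = {se_ols:.4f}")
print(f"White SE         = {se_white:.4f}")
print(f"White SE (adj.)  = {se_white_adj:.4f}")
```

In practice one would rely on the robust option of the statistical package rather than hand-written code; the sketch only shows which quantities enter the calculation.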
Example 9.6
In this example we use a random sample of 1483 individuals to estimate the population parameters of the following regression function:

$$Y = \beta_0 + \beta_1\,ED + \beta_2\,ED^2 + \beta_3\,Male + \beta_4\,year + U$$
where Y represents the log of hourly wages, ED the number of years of schooling, Male a dummy variable that indicates whether the individual is a man, and year the number of years of work experience. We are not sure whether we have a heteroskedasticity problem, so we estimate the parameters with and without robust standard errors to see how the estimated standard errors change. We obtained the following results:
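| Variables | OLS P.E. | OLS S.E. | Robust Estimation I P.E. | Robust Estimation I R.S.E. | Robust Estimation II P.E. | Robust Estimation II R.S.E. |
|---|---|---|---|---|---|---|
| Intercept | 3.646 | 0.087 | 3.646 | 0.105 | 3.815 | 0.041 |
| Years of Education | 0.063 | 0.012 | 0.063 | 0.016 | 0.037 | 0.003 |
| Years of Education squared | 0.001 | 0.0004 | 0.001 | 0.0006 | | |
| Male (dummy) | 0.123 | 0.017 | 0.123 | 0.017 | 0.124 | 0.0167 |
| Years of Work exp. | 0.008 | 0.001 | 0.008 | 0.001 | 0.008 | 0.001 |
| RMSE | 0.3079 | | 0.3079 | | 0.3083 | |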
Table 9.2 Regression results
Note: P.E. stands for Parameter Estimates; S.E. stands for Standard Errors; R.S.E. stands for Robust Standard Errors. RMSE stands for Root Mean Square Error which is the standard deviation of the estimated residual.
Table 9.2 contains three regressions. The first, labeled OLS, shows the results from the standard OLS regression assuming homoskedasticity. These should be compared with the second, Robust Estimation I, which reports the same parameter estimates with robust, that is heteroskedasticity-consistent, standard errors. Compared with the OLS case, the robust standard errors are somewhat larger, and this affects the significance of the parameter on the squared education term: it is no longer significant. Including irrelevant variables in a regression makes the estimates less efficient, so it makes no sense to keep the squared term. In the third regression, Robust Estimation II, we re-estimate the model without the squared term, again using robust standard errors.
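The change in significance can be checked with a quick calculation of the t-ratios for the squared education term. The sketch below uses the rounded values printed in Table 9.2, so the ratios are only approximate, and it assumes a two-sided test at the 5 percent level with the large-sample critical value 1.96; the text does not state which significance level was used.

```python
# Values copied from Table 9.2 for the squared education term (rounded as printed).
estimate = 0.001
se_ols = 0.0004      # conventional OLS standard error
se_robust = 0.0006   # heteroskedasticity-consistent standard error

# Assumption: two-sided test at the 5% level, normal critical value.
CRITICAL_5PCT = 1.96

t_ols = estimate / se_ols
t_robust = estimate / se_robust

significant_ols = abs(t_ols) > CRITICAL_5PCT
significant_robust = abs(t_robust) > CRITICAL_5PCT

print(f"t (OLS SE)    = {t_ols:.2f}, significant: {significant_ols}")
print(f"t (robust SE) = {t_robust:.2f}, significant: {significant_robust}")
```

With the conventional standard error the ratio is above the critical value, while with the robust standard error it falls below it, which is the pattern described above.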
Because we chose to use robust standard errors, we ended up with a more parsimonious model that includes only relevant terms. Had we kept the squared education term, the estimated marginal effect of education on earnings would have been different, and wrong. The RMSE, which is the estimated standard deviation of the error term, changes very little across the specifications in Table 9.2. We therefore conclude that, with this specification, the earnings model is not very sensitive to heteroskedasticity.