In the previous discussion we concluded that our error term was homoskedastic, or that any trace of heteroskedasticity was too weak to worry about. Had we instead followed the suggestion of the graphical inspection and the GQ-test, we would have believed that the heteroskedasticity was driven by one of the explanatory variables, which is one example of what heteroskedasticity can look like. It could also be the case that our model contains two different subgroups with different variances, so that there are two distinct variances to deal with. There are, of course, a number of different ways in which heteroskedasticity can be expressed. Below we look at some examples where we correct for heteroskedasticity under the assumption of a specific form of heteroskedasticity.

When the nature of the heteroskedasticity is known, one can use Generalized Least Squares (GLS) to estimate the unknown population parameters. Below we look at three different cases of how to transform the model so that GLS can be applied.

Running a regression using GLS instead of OLS is in practical terms the same thing, but we call it GLS when we have transformed the variables of the model so that the error term becomes homoskedastic. Below we go through three cases under the assumption of the following population regression model:

$$Y_i = B_0 + B_1 X_{1i} + B_2 X_{2i} + U_i$$

Case 1: The variance is proportional to an explanatory variable

The first thing to note is that we still assume that the expected value of the error term equals zero,

$$E[U_i] = 0,$$

which means that the variance of the error term may be expressed as $E[U_i^2] = \sigma^2 X_{1i}$. The objective of the transformation of the variables is to make this expectation equal to $\sigma^2$ and nothing more. If we accomplish this with nothing but transformations of the involved variables, we are home free. Let us see how to do that in this particular case.

We know that the variance of the error term is $V[U_i] = \sigma^2 X_{1i}$. Hence, if we divide every term in the model by the square root of the variable that the variance is proportional to, we end up with a homoskedastic error term. To see this:

$$V\left[\frac{U_i}{\sqrt{X_{1i}}}\right] = \frac{1}{X_{1i}} V[U_i] = \frac{1}{X_{1i}} \sigma^2 X_{1i} = \sigma^2$$

In practice this is carried out by transforming $Y$, $X_1$, and $X_2$ and creating a new constant equal to $1/\sqrt{X_{1i}}$ instead of the 1 that used to sit next to $B_0$. Hence, when running this specification in a computer you have to ask the software to run the regression through the origin, since we now have an observation-specific constant that moves with $X_1$. All computer software has that option, and once you have found how to do that, you simply regress $Y_i/\sqrt{X_{1i}}$ on $1/\sqrt{X_{1i}}$, $X_{1i}/\sqrt{X_{1i}}$, and $X_{2i}/\sqrt{X_{1i}}$. When you transform the variables in this way, you automatically transform the error term, which is now divided by the square root of $X_1$. Once that is done, we have a homoskedastic error term. That is

$$\frac{Y_i}{\sqrt{X_{1i}}} = B_0 \frac{1}{\sqrt{X_{1i}}} + B_1 \frac{X_{1i}}{\sqrt{X_{1i}}} + B_2 \frac{X_{2i}}{\sqrt{X_{1i}}} + \frac{U_i}{\sqrt{X_{1i}}}$$

Observe that nothing happens to the parameter estimates themselves. The only thing that happens is that the error term is transformed into one with constant variance, which corrects the standard errors of the parameters.
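The Case 1 transformation described above can be sketched with simulated data. This is a minimal illustration, not the book's data: the variable names, true coefficients, and the uniform designs are assumptions made here purely for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Simulated data whose error variance is proportional to X1 (Case 1).
x1 = rng.uniform(1.0, 10.0, n)
x2 = rng.uniform(0.0, 5.0, n)
u = rng.normal(0.0, 1.0, n) * np.sqrt(x1)      # V(U_i) = sigma^2 * X1_i
y = 2.0 + 1.5 * x1 - 0.8 * x2 + u              # assumed true coefficients

# Divide every term, including the constant, by sqrt(X1).
w = np.sqrt(x1)
X_gls = np.column_stack([1.0 / w, x1 / w, x2 / w])  # note: no ordinary intercept
y_gls = y / w

# Regression through the origin on the transformed variables = GLS.
beta_gls, *_ = np.linalg.lstsq(X_gls, y_gls, rcond=None)
print(beta_gls)  # estimates of B0, B1, B2
```

Because the constant column has been replaced by $1/\sqrt{X_{1i}}$, the regression must be run without an additional intercept, exactly as described above.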

Case 2: The variance is proportional to the square of an explanatory variable

This case is very similar to the previous one, with the exception that the variable $X_1$ is squared, so that the variance assumption is $V[U_i] = \sigma^2 X_{1i}^2$ and the variance grows quadratically with $X_1$. The argument is the same as above, and the objective is still to receive a constant error variance. Hence, instead of dividing by the square root of $X_1$ we simply divide by $X_1$ itself. If we do that we receive:

$$\frac{Y_i}{X_{1i}} = B_0 \frac{1}{X_{1i}} + B_1 + B_2 \frac{X_{2i}}{X_{1i}} + \frac{U_i}{X_{1i}}, \qquad V\left[\frac{U_i}{X_{1i}}\right] = \frac{\sigma^2 X_{1i}^2}{X_{1i}^2} = \sigma^2$$

Note that in the transformed model $B_1$ plays the role of the intercept, while $B_0$ is now the slope coefficient of $1/X_{1i}$.
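The Case 2 transformation can be sketched the same way; again the data, names, and true coefficients are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Simulated data whose error variance is proportional to X1 squared (Case 2).
x1 = rng.uniform(1.0, 10.0, n)
x2 = rng.uniform(0.0, 5.0, n)
u = rng.normal(0.0, 1.0, n) * x1               # V(U_i) = sigma^2 * X1_i^2
y = 2.0 + 1.5 * x1 - 0.8 * x2 + u              # assumed true coefficients

# Divide every term by X1 itself; B1's regressor becomes the constant.
X_gls = np.column_stack([1.0 / x1, np.ones(n), x2 / x1])
y_gls = y / x1

beta_gls, *_ = np.linalg.lstsq(X_gls, y_gls, rcond=None)
print(beta_gls)  # estimates of B0, B1, B2
```

The column of ones now belongs to $B_1$, so in software that reports an "intercept" for this regression, that intercept is the estimate of $B_1$, not $B_0$.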

Case 3: Two different variances

In this case the error term has two different variances depending on the observation. Hence, our sample could include two groups with intrinsic differences in their variation. If these two groups are known, we can sort the data set with respect to them. For the first $n_1$ observations, which contain the first group, the error term has the variance $\sigma_1^2$, and for the remaining $n_2$ observations, corresponding to the second group, the error term has the variance $\sigma_2^2$. In order to solve the heteroskedasticity problem here, we need to estimate the two variances by splitting the sample in two parts and estimating the regression variance separately for the two groups. Once that is done we proceed and transform as follows:

Step 1: Split the data set into two parts and estimate the model separately for the two sets of data:

Step 2: Transform each section of the data set with the relevant standard deviation, and run the regression on the full sample of n observations using the transformed variables:

By scaling the error term of each group by that group's standard deviation, the transformed error term has a variance equal to 1 in both subsamples. When the samples are merged, the variance of the error term for the full model using all observations together is therefore constant and equal to 1. To see this:

$$V\left[\frac{U_i}{\sigma_1}\right] = \frac{\sigma_1^2}{\sigma_1^2} = 1 \ \text{ for } i = 1, \ldots, n_1, \qquad V\left[\frac{U_i}{\sigma_2}\right] = \frac{\sigma_2^2}{\sigma_2^2} = 1 \ \text{ for } i = n_1 + 1, \ldots, n_1 + n_2$$
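The two steps above can be sketched as follows. This is a simplified illustration with a single regressor and simulated groups; the group sizes, variances, and coefficients are assumptions made here, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2 = 500, 500

# Simulated data: two known groups with different error variances (Case 3).
x = rng.uniform(0.0, 10.0, n1 + n2)
group = np.repeat([0, 1], [n1, n2])
sigma = np.where(group == 0, 1.0, 3.0)         # assumed true sigma_1, sigma_2
y = 2.0 + 1.5 * x + rng.normal(0.0, 1.0, n1 + n2) * sigma

X = np.column_stack([np.ones(n1 + n2), x])

# Step 1: estimate the model separately on each group and use the residual
# standard deviation as that group's estimated sigma.
s_hat = np.empty(2)
for g in (0, 1):
    m = group == g
    b, *_ = np.linalg.lstsq(X[m], y[m], rcond=None)
    resid = y[m] - X[m] @ b
    s_hat[g] = np.sqrt(resid @ resid / (m.sum() - X.shape[1]))

# Step 2: divide each observation by its group's estimated sigma and run one
# regression on the full transformed sample of n1 + n2 observations.
w = s_hat[group]
beta_gls, *_ = np.linalg.lstsq(X / w[:, None], y / w, rcond=None)
print(s_hat, beta_gls)
```

After the transformation both subsamples have an error variance close to 1, so the pooled regression is (approximately) homoskedastic.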

Example 9.5

Assume that we would like to estimate the parameters of the following model

and we know that the error variance is proportional to $X_1$ in the following way:

$$V[U_i] = \sigma^2 X_{1i}$$

We would like to estimate the model using both OLS and GLS and compare the results. Since we know the structure of the heteroskedasticity, we apply GLS according to case 1.

Table 9.1 OLS and GLS estimates using 2000 observations

In Figure 9.3 we compare the residual plots before and after correcting for heteroskedasticity to see whether the problem is fully solved. Judging from Figure 9.3b, the correction looks satisfactory.

Figure 9.3 Estimated residual plots before and after correction for heteroscedasticity

Table 9.1 shows the results from OLS and GLS applied to a heteroskedastic model. As can be seen, the estimated coefficients do not deviate much from each other, which is what we expected, since heteroskedasticity has no effect on the unbiasedness and consistency of the OLS estimator. When comparing the standard errors of the two estimations, however, large differences appear: the standard errors of the OLS-estimated slope coefficients are twice as large as those of the corrected model. Nevertheless, the conclusions from the two estimations are the same, which is due to the relatively large sample. Had the sample been smaller, the corresponding t-values could have been much smaller and the risk of drawing the wrong conclusion greater. These results are sample specific, however. When the error term is heteroskedastic the standard errors are wrong and could be either smaller or larger than the correct ones, so it is impossible to say anything in advance without knowing the exact nature of the heteroskedasticity.
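A comparison of this kind can be reproduced with a small simulation. The design below mirrors case 1 with 2000 observations, but the data, coefficients, and seed are assumptions made for illustration; the numbers will not match Table 9.1.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000                                       # same sample size as Table 9.1

# Simulated case 1 data (assumed design and coefficients).
x1 = rng.uniform(1.0, 10.0, n)
x2 = rng.uniform(0.0, 5.0, n)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(0.0, 1.0, n) * np.sqrt(x1)

def fit(X, y):
    """Coefficients and conventional (homoskedastic-formula) standard errors."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    s2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
    return b, se

X = np.column_stack([np.ones(n), x1, x2])
b_ols, se_ols = fit(X, y)                      # OLS, standard errors wrong

w = np.sqrt(x1)                                # case 1 transformation
b_gls, se_gls = fit(X / w[:, None], y / w)     # GLS on transformed variables

# The point estimates stay close to each other (both are unbiased); the
# reported standard errors differ because the OLS formula ignores the
# heteroskedasticity.
print(b_ols, se_ols)
print(b_gls, se_gls)
```

In this particular simulated design the GLS standard error of the $X_1$ slope comes out smaller than the OLS one, in line with the discussion above, but as the text notes the direction of the difference depends on the form of the heteroskedasticity.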

Another important observation is related to the coefficient of determination, which increased substantially. That does not mean that the fit of the model improved that much. After a transformation of the variables of this kind, the coefficient of determination is of no use: it now measures the explained variation of the transformed dependent variable $Y_i/\sqrt{X_{1i}}$ rather than of $Y_i$ itself, so it is simply not comparable with the OLS figure.
