Another important use of the regression model is to predict the size of the dependent variable for different values of X. Let us start with a definition:

Prediction and Forecasting

To make a statement about an event before the event occurs. In econometrics, a statement made in advance about the value of a dependent variable, based on regression results.

The words prediction and forecasting will be used interchangeably. However, the word prediction is often used for models that cover cross-sectional analysis, while predictions about future events made with time series models are called forecasts. Since the literature shows no consensus on this point, we treat the two terms as synonyms in this text.

Assume the following population regression equation:

$$Y_t = \beta_0 + \beta_1 X_t + U_t, \qquad t = 1, \dots, T$$

and we would like to make predictions about the future, that is, we would like to know the value of Y in period T+1. There are two important cases to consider: the value of X is known, or the value of X is itself uncertain. Whether we have exact information about X affects the variance of the predicted value. We start the discussion assuming that the value of X is known, and later relax this assumption to explore the difference. The exact values of the population parameters are never known, so they have to be estimated.

The predicted value of the dependent variable is therefore given by the estimated conditional expectation of the dependent variable and is denoted in the following way:

$$\hat{Y}_t = \hat{\beta}_0 + \hat{\beta}_1 X_t$$

where the population parameters have been replaced by the sample estimators. Since the sample estimators are the same for all t, it is the value of $X_t$ that generates the forecast for $Y_t$. The forecast value of Y in period T+1 is therefore given by:

$$\hat{Y}_{T+1} = \hat{\beta}_0 + \hat{\beta}_1 X_{T+1}$$

This is often called a point prediction. In order to make inferences about the future, we need an interval prediction as well, which requires the forecast error and its variance. The forecast error helps us say something about how good the prediction is. It is the difference between the predicted value and the actual value and may be expressed in the following way:

$$e_{T+1} = Y_{T+1} - \hat{Y}_{T+1} = (\beta_0 - \hat{\beta}_0) + (\beta_1 - \hat{\beta}_1) X_{T+1} + U_{T+1}$$

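Now, what is the expected value of the forecast error?

$$E[Y_{T+1} - \hat{Y}_{T+1}] = E[\beta_0 - \hat{\beta}_0] + E[\beta_1 - \hat{\beta}_1] X_{T+1} + E[U_{T+1}] = 0$$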

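Since the expected value of the forecast error is zero, we have an unbiased forecast. Assuming that X is known, and writing $\sigma^2$ for the variance of the error term, the variance of the forecast error is given by:

$$\sigma_f^2 = V(Y_{T+1} - \hat{Y}_{T+1}) = V(\hat{\beta}_0) + X_{T+1}^2 V(\hat{\beta}_1) + 2 X_{T+1} \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1) + \sigma^2$$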

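assuming that X is constant in repeated sampling. Replacing the variances and the covariance with their expressions for the OLS estimators and rearranging, we end up with the following expression:

$$\sigma_f^2 = \sigma^2 \left[ 1 + \frac{1}{T} + \frac{(X_{T+1} - \bar{X})^2}{\sum_{t=1}^{T} (X_t - \bar{X})^2} \right] \tag{4.12}$$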

Observe that the forecast error variance is smallest when the future value of X equals the mean value of X. This formula holds if the future value of X is known. That is often not the case, and the formula then has to be extended accordingly. One way to deal with the uncertainty is to impose a distribution on X with a component of uncertainty. That is, assume that the value of X used in the forecast, $X^*_{T+1}$, equals the true value plus a random error:

$$X^*_{T+1} = X_{T+1} + \varepsilon, \qquad \varepsilon \sim N(0, \sigma_\varepsilon^2)$$

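With this assumption we may form an expression for the error variance that takes the extra variation from the uncertainty, with variance $\sigma_\varepsilon^2$, into account:

$$\sigma_f^{2*} = \sigma^2 \left[ 1 + \frac{1}{T} + \frac{(X_{T+1} - \bar{X})^2 + \sigma_\varepsilon^2}{\sum_{t=1}^{T} (X_t - \bar{X})^2} \right] + \beta_1^2 \sigma_\varepsilon^2 \tag{4.13}$$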

The important point to notice here is that this variance cannot be estimated unless we know the exact value of the variance of the uncertainty, which is of course not possible. Furthermore, the expression involves the population parameter, which is itself unknown, multiplied by the variance of the uncertainty. Hence, in practice (4.12) is often used, but one should keep in mind that it most likely understates the true forecast error variance.
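
The two variance expressions can be compared numerically. The following is a minimal sketch, not a definitive implementation: the data, the helper names `ols_fit` and `forecast_variance`, and the value 0.25 for the variance of the uncertainty in X are all hypothetical and chosen only for illustration. The error variance is replaced by its usual sample estimate, and the version with uncertainty plugs the estimated slope in for the unknown population parameter, which is an additional assumption.

```python
from math import sqrt


def ols_fit(x, y):
    """Simple OLS of y on x; returns (b0, b1, s2, x_bar, sxx)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Estimate of the error variance with n - 2 degrees of freedom.
    s2 = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return b0, b1, s2, x_bar, sxx


def forecast_variance(x_new, n, s2, x_bar, sxx, b1=0.0, var_x=0.0):
    """Estimated forecast error variance.

    var_x = 0 gives the known-X formula (4.12).
    var_x > 0 gives a plug-in version of (4.13): b1 stands in for the
    population slope and var_x is an ASSUMED variance for the error in X.
    """
    bracket = 1 + 1 / n + ((x_new - x_bar) ** 2 + var_x) / sxx
    return s2 * bracket + b1 ** 2 * var_x


# Hypothetical sample of T = 10 observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

b0, b1, s2, x_bar, sxx = ols_fit(x, y)
n = len(x)

x_next = 11.0                    # value of X in period T+1
y_next_hat = b0 + b1 * x_next    # point prediction

var_known = forecast_variance(x_next, n, s2, x_bar, sxx)
var_uncertain = forecast_variance(x_next, n, s2, x_bar, sxx, b1=b1, var_x=0.25)

print("point prediction:", y_next_hat)
print("std. error, X known:", sqrt(var_known))
print("std. error, X uncertain (assumed var_x):", sqrt(var_uncertain))
```

Calling `forecast_variance` with `x_new` equal to `x_bar` gives the smallest value of the known-X variance, and any positive `var_x` makes the result larger than the known-X figure, which is the sense in which (4.12) understates the forecast error variance.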

Taking the square root of the variance in (4.12) or (4.13) gives us the standard error of the forecast. With this standard error it is possible to calculate a confidence interval around the predicted value using the usual formula for a confidence interval, that is:

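Confidence interval of a forecast

$$\hat{Y}_{T+1} \pm t_c \, \hat{\sigma}_f$$

where $t_c$ is the critical value from the t distribution with T-2 degrees of freedom and $\hat{\sigma}_f$ is the estimated standard error of the forecast.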
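
As an illustration, the interval can be computed directly from the regression summary quantities. This is a sketch under stated assumptions: the function name `forecast_interval` and all the numbers passed to it are hypothetical, the known-X variance is used, and the critical value 2.306 is the two-sided 95 percent t value for 8 degrees of freedom, supplied by hand rather than computed.

```python
from math import sqrt


def forecast_interval(b0, b1, x_new, s2, n, x_bar, sxx, t_crit):
    """Return (lower, point, upper) for a forecast with known X.

    s2 is the estimated error variance, sxx the sum of squared
    deviations of X from its mean, t_crit the chosen t critical value.
    """
    y_hat = b0 + b1 * x_new
    # Standard error of the forecast: square root of the known-X variance.
    se_f = sqrt(s2 * (1 + 1 / n + (x_new - x_bar) ** 2 / sxx))
    half_width = t_crit * se_f
    return y_hat - half_width, y_hat, y_hat + half_width


# Hypothetical regression summary for a sample of T = 10 observations.
T_CRIT_95_DF8 = 2.306  # two-sided 95% critical value, 10 - 2 = 8 d.f.

lower, y_hat, upper = forecast_interval(
    b0=0.06, b1=1.99, x_new=11.0,
    s2=0.02, n=10, x_bar=5.5, sxx=82.5,
    t_crit=T_CRIT_95_DF8,
)
print("forecast:", y_hat, "interval:", (lower, upper))
```

The interval is symmetric around the point prediction and widens as `x_new` moves away from `x_bar`, reflecting the behaviour of the forecast error variance.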
