
STAT 804: Lecture 17 Notes

Forecasting: an introduction

Given data $X_0, \ldots, X_{T-1}$ our goal is to guess, or forecast, $X_T$ or, more generally, $X_{T+r}$. There are a variety of ad hoc methods as well as a variety of statistically derived methods. I illustrate the ad hoc methods with the exponentially weighted moving average (EWMA). In this case we simply take

$$\hat X_T = c(a,T) \sum_{k=0}^{T-1} a^k X_{T-1-k}$$

where $c(a,T)$ makes it a weighted average: $c(a,T) = 1/\sum_{k=0}^{T-1} a^k = (1-a)/(1-a^T)$. If we take $a$ near 1 we are almost using the sample mean, while if we take $a$ near 0 we are virtually using $X_{T-1}$. You are supposed to choose $a$ to trade off the desire to use lots of data against the possibility that the structure of the series has changed over time.
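To see the two extremes numerically, here is a minimal sketch of the EWMA forecast in Python. The function name `ewma_forecast` and the toy data are my own illustration, and the weights are assumed to be $a^k$ on the observation $k$ steps before the most recent one, normalized to sum to 1.

```python
def ewma_forecast(x, a):
    """EWMA forecast of the next value: normalized weights a**k on x[T-1-k], 0 < a < 1."""
    T = len(x)
    weights = [a ** k for k in range(T)]      # weight a**k on the value k steps back
    c = 1.0 / sum(weights)                    # normalizing constant c(a, T)
    return c * sum(w * x[T - 1 - k] for k, w in enumerate(weights))


data = [2.0, 4.0, 6.0, 8.0]                   # made-up series, most recent value last
near_mean = ewma_forecast(data, 0.999)        # a near 1: close to the sample mean 5
near_last = ewma_forecast(data, 0.001)        # a near 0: close to the last value 8
```

Values of `a` in between give a compromise between the two.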

Statistically based methods concentrate on some measure of the size of $X_T - \hat X_T$; the mean squared prediction error $E[(X_T - \hat X_T)^2]$ is the most common.

In general $\hat X_T$ must be some function $f(X_0, \ldots, X_{T-1})$ of the data. Conditioning on the data shows that the mean squared prediction error is minimized by

$$\hat X_T = E(X_T \mid X_0, \ldots, X_{T-1})$$

For most distributions of the X's this would be hard to compute but for Gaussian processes the solution is the usual linear regression of $X_T$ on the data, namely

$$\hat X_T = \mu_T + \sum_{t=0}^{T-1} a_t (X_t - \mu_t)$$

where $\mu_t = E(X_t)$ and the coefficient vector $a = (a_0, \ldots, a_{T-1})^T$ is given by

$$a = \Sigma^{-1} \gamma, \qquad \Sigma = \mathrm{Var}\big((X_0, \ldots, X_{T-1})^T\big), \qquad \gamma = \mathrm{Cov}\big((X_0, \ldots, X_{T-1})^T, X_T\big)$$
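As a check on the regression formula, the following sketch solves the linear system (variance matrix of the data) times (coefficient vector) = (covariances of the data with the value being forecast) for a stationary mean-zero series. The helper names, the small Gaussian elimination routine and the AR(1) example values are my own assumptions for illustration; in practice one would use a linear algebra library.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]    # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):                        # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def predictor_coefficients(acov, T):
    """Coefficients of X_0..X_{T-1} in the best linear predictor of X_T.

    acov(h) is the autocovariance at lag h >= 0 of a stationary series.
    """
    Sigma = [[acov(abs(s - t)) for t in range(T)] for s in range(T)]
    gamma = [acov(T - t) for t in range(T)]             # Cov(X_t, X_T)
    return solve(Sigma, gamma)


ar_coef, sigma2 = 0.6, 1.0                              # illustrative AR(1) values
ar1_acov = lambda h: sigma2 * ar_coef ** h / (1 - ar_coef ** 2)
coef = predictor_coefficients(ar1_acov, 5)
```

For the AR(1) autocovariance used here, all the weight falls on the most recent observation, which anticipates the shortcut for AR processes in the next section. Note the cost: a T by T system must be solved, which is why large T is awkward in general.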

When T is large the computation of these forecasts is difficult in general. There are some shortcuts, however.

Forecasting AR(p) processes

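When the process is an AR(p), say $X_t = \sum_{i=1}^p a_i X_{t-i} + \epsilon_t$ with mean 0, the computation of the conditional expectation is easier:

$$\begin{aligned}
\hat X_T &= E(X_T \mid X_0, \ldots, X_{T-1}) \\
&= E\Big(\sum_{i=1}^p a_i X_{T-i} + \epsilon_T \,\Big|\, X_0, \ldots, X_{T-1}\Big) \\
&= \sum_{i=1}^p a_i X_{T-i}
\end{aligned}$$

For r > 0 we have the recursion

$$\begin{aligned}
\hat X_{T+r} &= E(X_{T+r} \mid X_0, \ldots, X_{T-1}) \\
&= \sum_{i=1}^p a_i E(X_{T+r-i} \mid X_0, \ldots, X_{T-1}) \\
&= \sum_{i=1}^p a_i \hat X_{T+r-i}
\end{aligned}$$

where $\hat X_t = X_t$ for $t \le T-1$. Notice that the forecast into the future uses current values where these are available and forecasts already calculated for the other X's.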
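The recursion is short enough to write out. This is a sketch only, assuming a mean-zero AR(p) with known coefficients and at least p observations; the name `ar_forecast` and the numbers are made up for illustration.

```python
def ar_forecast(x, a, steps):
    """Forecasts of X_T, ..., X_{T+steps-1} for a mean-zero AR(p).

    x holds X_0..X_{T-1}; a[i] is the coefficient of the value i+1 steps back.
    """
    p = len(a)
    history = list(x)                  # observed values, then forecasts as they are made
    for _ in range(steps):
        # use observed values where available and earlier forecasts otherwise
        nxt = sum(a[i] * history[-1 - i] for i in range(p))
        history.append(nxt)
    return history[len(x):]


forecasts = ar_forecast([1.0, 2.0], [0.5, 0.25], 3)
# first forecast: 0.5 * 2.0 + 0.25 * 1.0 = 1.25
```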

Forecasting ARMA(p,q) processes

An invertible ARMA(p,q) can be rewritten as an infinite order AR process. We could then use the method just given for the AR except that now the formula actually mentions values of $X_t$ for t < 0. In practice we simply truncate the series and ignore the missing terms in the forecast, assuming that the coefficients of these omitted terms are very small. Remember each term is built up out of a geometric series for $(1-\alpha B)^{-1}$ with $|\alpha| < 1$, where $B$ is the backshift operator, so the coefficients decay geometrically.

A more direct method goes like this. Writing the process as $X_t = \sum_{i=1}^p a_i X_{t-i} + \epsilon_t - \sum_{j=1}^q b_j \epsilon_{t-j}$, we have

$$\begin{aligned}
\hat X_{T+r} &= E(X_{T+r} \mid X) \\
&= \sum_{i=1}^p a_i E(X_{T+r-i} \mid X) + E(\epsilon_{T+r} \mid X) - \sum_{j=1}^q b_j E(\epsilon_{T+r-j} \mid X)
\end{aligned}$$

where now the conditioning "$\mid X$" means given the observed data.

Whenever the time index on an epsilon is T or more the conditional expectation is 0. For T+r-j < T we need to guess the value of $\epsilon_{T+r-j}$. The same recursion can be rearranged to help compute $\hat\epsilon_t = E(\epsilon_t \mid X)$ for $0 \le t \le T-1$, at least approximately:

$$\hat\epsilon_t = X_t - \sum_{i=1}^p a_i X_{t-i} + \sum_{j=1}^q b_j \hat\epsilon_{t-j}$$

This recursion gives each $\hat\epsilon_t$ in terms of earlier values, so you have to get it started. Generally we start the recursion by putting

$$\hat\epsilon_t = 0$$

for negative t and then using the recursion. The coefficients b are such that the effect of getting these starting values wrong is damped out at a geometric rate as we increase t. So if we have enough data and the smallest root (in modulus) of the characteristic polynomial for the MA part is not too close to 1, then we will have accurate values of $\hat\epsilon_t$ for t near T.
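Here is a rough sketch of the direct method. It assumes the sign convention in which the MA terms are subtracted, that is, each X equals the AR part plus the current epsilon minus the b-weighted past epsilons (flip the signs of `b` for the other convention), and it sets all pre-sample X's and epsilons to 0. The function name and numbers are hypothetical.

```python
def arma_forecast(x, a, b, steps):
    """Approximate forecasts for a mean-zero ARMA(p,q) with known coefficients.

    Convention assumed: X_t = sum_i a_i X_{t-i} + eps_t - sum_j b_j eps_{t-j}.
    Pre-sample X's and epsilons are replaced by 0.
    """
    p, q, T = len(a), len(b), len(x)
    eps = []
    for t in range(T):                                   # rebuild the epsilons
        ar = sum(a[i] * x[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(b[j] * eps[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        eps.append(x[t] - ar + ma)
    xs, es = list(x), list(eps)
    for _ in range(steps):                               # forecast recursion
        ar = sum(a[i] * xs[-1 - i] for i in range(p) if i < len(xs))
        ma = sum(b[j] * es[-1 - j] for j in range(q) if j < len(es))
        xs.append(ar - ma)
        es.append(0.0)                                   # future epsilons forecast as 0
    return xs[T:], eps


forecasts, residuals = arma_forecast([1.0, 0.5, 0.25], [0.5], [-0.5], 3)
```

With only three observations the starting values matter a great deal; the geometric damping described above only helps when the series is long.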

As we discussed in the section on estimation, these computed estimates of the epsilons can be improved by backcasting the values of $\epsilon_t$ for negative t, then forecasting and backcasting again, and so on.

Forecasting ARIMA(p,d,q) series

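If $Z = (I-B)^d X$, where $B$ is the backshift operator, and X is ARIMA(p,d,q) then we compute Z, forecast Z and reconstruct X by undoing the differencing. For d=1, for example, we just have

$$\hat X_{T+r} = \hat X_{T+r-1} + \hat Z_{T+r}$$

where $\hat X_{T-1}$ is the observed value $X_{T-1}$.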
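Undoing a single difference is just a cumulative sum started at the last observed value. A small sketch, with made-up numbers standing in for forecasts of the differenced series:

```python
def undifference(last_x, z_forecasts):
    """Convert forecasts of Z_t = X_t - X_{t-1} into forecasts of X.

    last_x is the final observed X; each forecast adds the next forecast difference.
    """
    out, level = [], last_x
    for z in z_forecasts:
        level = level + z
        out.append(level)
    return out


x = [10.0, 12.0, 11.0, 13.0]                       # observed series
z = [x[t] - x[t - 1] for t in range(1, len(x))]    # differenced series to be modelled
x_hat = undifference(x[-1], [0.5, 0.25])           # hypothetical forecasts of Z
```

For d > 1 the same step would be applied d times, once for each difference taken.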

Forecast standard errors

You should remind yourself that the computations of conditional expectations we have just made used the fact that the a's and b's are constants: the true parameter values. In practice we then replace the parameter values with estimates. The quality of our forecasts will be summarized by the forecast standard error:

$$\sqrt{E\big[(X_{T+r} - \hat X_{T+r})^2\big]}$$

We will compute this ignoring the estimation of the parameters and then discuss how much that might have cost us.

If $\hat X_{T+r} = E(X_{T+r} \mid X)$ then $E(X_{T+r} - \hat X_{T+r}) = 0$, so that our forecast standard error is just the standard deviation of $X_{T+r} - \hat X_{T+r}$.

Consider first the case of an AR(1), $X_t = a X_{t-1} + \epsilon_t$, and one step ahead forecasting:

$$X_T - \hat X_T = X_T - a X_{T-1} = \epsilon_T$$

The variance of this forecast error is $\sigma_\epsilon^2$, the variance of $\epsilon_T$, so that the forecast standard error is just $\sigma_\epsilon$.

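For forecasts further ahead in time we have

$$\hat X_{T+r} = a \hat X_{T+r-1}$$

and

$$X_{T+r} = a X_{T+r-1} + \epsilon_{T+r}$$

Subtracting we see that

$$X_{T+r} - \hat X_{T+r} = a (X_{T+r-1} - \hat X_{T+r-1}) + \epsilon_{T+r}$$

Since $\epsilon_{T+r}$ is independent of the earlier forecast error, the forecast variance satisfies $\mathrm{Var}(X_{T+r} - \hat X_{T+r}) = a^2 \mathrm{Var}(X_{T+r-1} - \hat X_{T+r-1}) + \sigma_\epsilon^2$, so that we may calculate forecast standard errors recursively. As $r \to \infty$ we can check that the forecast variance converges to

$$\frac{\sigma_\epsilon^2}{1 - a^2}$$

which is simply the variance of the individual X's. When you forecast a stationary series far into the future the forecast standard error is just the standard deviation of the series.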
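The recursive calculation can be sketched as follows. Because each new epsilon is independent of the earlier forecast error, each variance is the squared AR coefficient times the previous variance plus the epsilon variance. The function name and the values 0.8 and 1.0 are illustrative assumptions.

```python
import math


def ar1_forecast_se(a, sigma, steps):
    """Forecast standard errors for an AR(1), one step ahead up to `steps` ahead.

    a is the AR coefficient (|a| < 1); sigma is the standard deviation of epsilon.
    """
    v, out = 0.0, []
    for _ in range(steps):
        v = a * a * v + sigma ** 2       # variance recursion for the forecast error
        out.append(math.sqrt(v))
    return out


se = ar1_forecast_se(0.8, 1.0, 200)
limit = 1.0 / math.sqrt(1 - 0.8 ** 2)    # standard deviation of the series itself
```

The first entry of `se` is the epsilon standard deviation, and the entries increase toward `limit`.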

Turn now to a general ARMA(p,q). Rewrite the process as the infinite order AR

$$X_t = \sum_{s=1}^\infty c_s X_{t-s} + \epsilon_t$$

to see that again, ignoring the truncation of the infinite sum in the forecast, we have

$$X_T - \hat X_T = \epsilon_T$$

so that the one step ahead forecast standard error is again $\sigma_\epsilon$.

Parallel to the AR(1) argument we see that

$$X_{T+r} - \hat X_{T+r} = \sum_{s=1}^{r} c_s (X_{T+r-s} - \hat X_{T+r-s}) + \epsilon_{T+r}$$

The errors on the right hand side are not independent of one another, so computing the variance requires either computing the covariances or recognizing that the right hand side is a linear combination of $\epsilon_T, \ldots, \epsilon_{T+r}$.

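A simpler approach is to write the process as an infinite order MA:

$$X_t = \sum_{s=0}^\infty \psi_s \epsilon_{t-s}$$

for suitable coefficients $\psi_s$, with $\psi_0 = 1$. Now if we treat conditioning on the data as being effectively equivalent to conditioning on all $X_t$ for t < T, we are effectively conditioning on $\epsilon_t$ for all t < T. This means that

$$\begin{aligned}
\hat X_{T+r} &= E(X_{T+r} \mid \epsilon_t, t < T) \\
&= \sum_{s=0}^\infty \psi_s E(\epsilon_{T+r-s} \mid \epsilon_t, t < T) \\
&= \sum_{s=r+1}^\infty \psi_s \epsilon_{T+r-s}
\end{aligned}$$

and the forecast error is just

$$X_{T+r} - \hat X_{T+r} = \sum_{s=0}^{r} \psi_s \epsilon_{T+r-s}$$

so that the forecast standard error is

$$\sigma_\epsilon \sqrt{\sum_{s=0}^{r} \psi_s^2}$$

Again as $r \to \infty$ this converges to $\sigma_\epsilon \sqrt{\sum_{s=0}^\infty \psi_s^2}$, the standard deviation of the series.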
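The MA coefficients can be computed by a short recursion, after which the standard errors are running sums of squares. This sketch assumes the convention in which the MA terms are subtracted; under it the leading MA coefficient is 1 and each later one is the a-weighted sum of the preceding ones minus the matching b (taken as 0 beyond lag q). The names `psi_weights` and `forecast_se` and the numbers are mine.

```python
import math


def psi_weights(a, b, n):
    """MA(infinity) coefficients psi_0..psi_n of an ARMA(p,q).

    Convention assumed: X_t = sum_i a_i X_{t-i} + eps_t - sum_j b_j eps_{t-j}.
    """
    psi = [1.0]
    for s in range(1, n + 1):
        val = sum(a[i] * psi[s - 1 - i] for i in range(min(s, len(a))))
        if s <= len(b):
            val -= b[s - 1]
        psi.append(val)
    return psi


def forecast_se(a, b, sigma, steps):
    """Standard errors for forecasting X_T, ..., X_{T+steps-1}, parameters known."""
    psi = psi_weights(a, b, steps - 1)
    out, total = [], 0.0
    for r in range(steps):
        total += psi[r] ** 2              # add the square of the next psi
        out.append(sigma * math.sqrt(total))
    return out


ar1_psi = psi_weights([0.8], [], 3)        # powers of 0.8 for an AR(1)
ma1_se = forecast_se([], [0.5], 1.0, 3)    # constant after the first step for an MA(1)
```

For an MA(q) the standard error stops growing after q+1 steps, since the remaining coefficients are 0.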

Finally consider forecasting the ARIMA(p,d,q) process $X$ with $(I-B)^d X = W$ where W is ARMA(p,q). The forecast errors in X can clearly be written as a linear combination of forecast errors for W, permitting the forecast error in X to be written as a linear combination of the underlying errors $\epsilon_t$. As an example consider first the ARIMA(0,1,0) process $X_t = X_{t-1} + \epsilon_t$. The forecast of $\epsilon_{T+r}$ is just 0 and so the forecast of $X_{T+r}$ is just

$$\hat X_{T+r} = \hat X_{T+r-1} = \cdots = X_{T-1}$$

The forecast error is

$$X_{T+r} - \hat X_{T+r} = \epsilon_T + \epsilon_{T+1} + \cdots + \epsilon_{T+r}$$

whose standard deviation is $\sigma_\epsilon \sqrt{r+1}$. Notice that the forecast standard error grows to infinity as $r \to \infty$. For a general ARIMA(p,1,q) we have

$$\hat X_{T+r} = \hat X_{T+r-1} + \hat W_{T+r}$$

and

$$X_{T+r} - \hat X_{T+r} = (X_{T+r-1} - \hat X_{T+r-1}) + (W_{T+r} - \hat W_{T+r})$$

which can be combined with the expression above for the forecast error for an ARMA(p,q) to compute standard errors.
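One way to carry out that combination for d=1: the forecast error in X is the sum of the forecast errors in the differenced series, so the coefficient on each future epsilon is a partial sum of the MA(infinity) coefficients of the differenced series. The sketch below takes those coefficients as given (the list name `psi` is my notation) and accumulates the squared partial sums.

```python
import math


def integrated_forecast_se(psi, sigma):
    """Forecast standard errors for an ARIMA(p,1,q), parameters treated as known.

    psi[0..n] are MA(infinity) coefficients of the differenced series, psi[0] = 1.
    Entry r of the result is the standard error for forecasting X_{T+r}.
    """
    out, cum, total = [], 0.0, 0.0
    for w in psi:
        cum += w                  # partial sum of psi: coefficient on one epsilon
        total += cum ** 2
        out.append(sigma * math.sqrt(total))
    return out


# Random walk: the differenced series is white noise, so psi = 1, 0, 0, ...
rw_se = integrated_forecast_se([1.0, 0.0, 0.0, 0.0], 2.0)
```

For the random walk every partial sum is 1, which reproduces the square-root growth found above.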

Software

The S-Plus function arima.forecast can do the forecasting.

Comments

I have ignored the effects of parameter estimation throughout. In ordinary least squares when we predict the Y corresponding to a new x we get a forecast standard error of

$$\sqrt{\mathrm{Var}(Y - x^T \hat\beta)}$$

which is

$$\sigma \sqrt{1 + x^T (\mathbf{X}^T \mathbf{X})^{-1} x}$$

where $\mathbf{X}$ is the design matrix. The procedure used here corresponds to ignoring the term $x^T (\mathbf{X}^T \mathbf{X})^{-1} x$, which is the variance of the fitted value in units of $\sigma^2$. Typically this value is rather smaller than the 1 to which it is added. In a one sample problem, for instance, it is simply 1/n. Generally the major component of forecast error is the standard error of the noise and the effect of parameter estimation is unimportant.
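To get a feel for the size of the ignored term, here is a small sketch for a straight-line fit with intercept, where the term reduces to the standard expression 1/n plus the squared distance of the new x from the mean divided by the sum of squares of the x's about their mean. The design points are made up.

```python
import math


def leverage(xs, x_new):
    """The term x'(X'X)^{-1}x for a straight-line fit with intercept."""
    n = len(xs)
    mean = sum(xs) / n
    sxx = sum((x - mean) ** 2 for x in xs)
    return 1.0 / n + (x_new - mean) ** 2 / sxx


xs = [float(i) for i in range(20)]             # made-up design points, n = 20
h_centre = leverage(xs, 9.5)                   # at the mean of the x's: exactly 1/n
h_edge = leverage(xs, 25.0)                    # extrapolation makes the term larger
inflation_centre = math.sqrt(1 + h_centre)     # factor multiplying sigma
inflation_edge = math.sqrt(1 + h_edge)
```

Near the centre of the data, ignoring the term changes the standard error by only a few percent; the approximation is worse when extrapolating.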





Richard Lockhart
Wed Oct 29 11:11:17 PST 1997