Forecasting is required in many situations: deciding whether to build another power generation plant in the next five years requires forecasts of future demand; scheduling staff in a call centre next week requires forecasts of call volume; stocking an inventory requires forecasts of stock requirements. Forecasts can be required several years in advance (as in the case of capital investments) or only a few minutes beforehand (as in telecommunication routing). Whatever the circumstances or time horizons involved, forecasting is an important aid to effective and efficient planning.
Before the 1920s, forecasting largely meant drawing lines through clouds of data values. In 1927 Yule introduced the autoregressive technique in order to predict the annual number of sunspots. His model was linear, and the basic approach was to assume a linear underlying process modified by noise. Models of this kind are still commonly used, for example in marketing (e.g., what will my sales of wheat be next month?).
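Yule's idea can be reproduced in a few lines of modern R. The sketch below uses the built-in annual sunspot series (`sunspot.year`) and base R's `ar()`, which fits an autoregressive model by Yule-Walker estimation and picks the order by AIC:

```r
# Fit an autoregressive model to the annual sunspot numbers (built-in dataset)
fit <- ar(sunspot.year, method = "yule-walker")  # AR order chosen by AIC
fit$order                                        # the selected lag order

# Predict the next 10 years from the fitted linear recursion
pred <- predict(fit, n.ahead = 10)
pred$pred  # point forecasts; pred$se gives the corresponding standard errors
```

The fitted model is exactly the "linear process plus noise" described above: each year's value is a weighted sum of the previous `fit$order` values plus white noise.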
Forecast accuracy measures such as the mean squared error (MSE) can be used to select a model for a given data set, provided the errors are computed from a hold-out set and not from the same data that were used for model estimation. In practice, however, there are often too few out-of-sample errors to draw reliable conclusions, so a penalized measure based on the in-sample fit is usually preferable.
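As a minimal sketch of the hold-out idea (the split point is illustrative, and the forecast package is assumed to be available), one can withhold the last two years of a series, fit on the remainder, and compute the MSE only on the withheld observations:

```r
library(forecast)  # assumed available, for ets() and forecast()

y     <- AirPassengers                   # built-in monthly series, 1949-1960
train <- window(y, end = c(1958, 12))    # estimation sample
test  <- window(y, start = c(1959, 1))   # hold-out sample (24 observations)

fit <- ets(train)                        # model fitted on the training data only
fc  <- forecast(fit, h = length(test))

mse <- mean((test - fc$mean)^2)          # out-of-sample MSE, not in-sample fit
```

With only 24 hold-out errors here, the MSE estimate is noisy, which is exactly the limitation the text points out.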
Complex models tend to fit the training data well but fail to generalize, a phenomenon known as overfitting. The usual statistical response is model selection, in which candidate models are compared according to an estimate of their generalization error. Several sophisticated estimators have been developed (e.g., bootstrapping), but they are computationally burdensome. A practical alternative is to use simple statistics that add a penalty that is a function of the model's complexity, such as the Bayesian Information Criterion (BIC).
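As a base-R sketch of how such a penalty works in practice: BIC = -2 log L + k log n, so each extra parameter costs log n, and the penalty can overturn a better in-sample fit. The two candidate orders below are purely illustrative:

```r
# Two candidate ARMA models for the built-in annual lynx-trappings series
fit_small <- arima(lynx, order = c(1, 0, 0))  # AR(1): few parameters
fit_big   <- arima(lynx, order = c(2, 0, 2))  # ARMA(2,2): more parameters

# BIC = -2*logLik + k*log(n); the lower value wins
c(BIC(fit_small), BIC(fit_big))
```

No hold-out set is needed: the complexity penalty substitutes for the scarce out-of-sample errors discussed above.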
Another penalized-likelihood approach is Akaike's Information Criterion (AIC). Because it is based on the likelihood rather than on one-step forecasts, the AIC is able to select between the error types: we choose the model that minimizes the AIC among all models appropriate for the data, and the AIC thereby also provides a method for selecting between additive and multiplicative error models. Models with multiplicative errors are useful when the data are strictly positive, but they are not numerically stable when the data contain zeros or negative values. Multiplicative error models are therefore not considered when the time series is not strictly positive; in that case only the six fully additive models are applied.
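A short sketch with the forecast package: `ets()` searches the candidate exponential smoothing state-space models and returns the one minimizing the information criterion (the corrected AICc by default), and its `additive.only` argument reproduces the restriction to additive-error models described above:

```r
library(forecast)  # assumed available, for ets()

# Full search over admissible models, including multiplicative errors
# (AirPassengers is strictly positive, so all error types are allowed)
fit_all <- ets(AirPassengers)

# Restricted search, as would be forced by zeros or negative values
fit_add <- ets(AirPassengers, additive.only = TRUE)

fit_all$method                 # label of the selected model, e.g. "ETS(M,Ad,M)"
c(fit_all$aic, fit_add$aic)    # likelihood-based criteria for each winner
```

The comparison of the two fitted objects shows the cost, in AIC terms, of ruling out multiplicative errors.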
An R code example for forecasting customer service requests over the next 20 days:
library(forecast)  # for ets(), auto.arima() and forecast()

# Daily timestamps (at 03:00) between startDate and endDate, working days only
wrkdys <- seq(from = as.POSIXct(paste(startDate, "3"), format = "%d.%m.%Y %H"),
              to = as.POSIXct(paste(endDate, "3"), format = "%d.%m.%Y %H"), by = "day")
wrkdys <- wrkdys[!(as.Date(wrkdys) %in% holidays)]

ahead <- 20  # forecast horizon: next 20 days
for (pr in levels(df$Prioritet)) {
  # ETS with a Box-Cox transformation and bootstrapped simulated prediction intervals
  fc <- forecast(ets(xts, lambda = lmbd), h = ahead, simulate = TRUE, bootstrap = TRUE)
  # ARIMA alternative: exhaustive model search instead of the stepwise shortcut
  fc <- forecast(auto.arima(ts(xts), stepwise = FALSE, approximation = FALSE,
                            lambda = lmbd), h = ahead, simulate = TRUE, bootstrap = TRUE)
}