<- aus_livestock %>%
aus_pigs filter(str_detect(Animal, "Pigs") & str_detect(State, "Victoria")) %>%
select(Month, Count)
autoplot(aus_pigs, .vars = Count) +
scale_y_continuous(labels = scales::label_comma(1, scale = 1/1e3, suffix = "K"),
breaks = seq(0, 200e3, 20000)) +
labs(y = "Count of slaughtered pigs", x = "",
title = "Number of pigs slaughtered in Victoria")
1 Introduction
As I’ve said in one of my previous posts, Rob J. Hyndman has excellent book on forecasting that is full of exercises by the end of each chapter. One of those chapters - to be more precise, Chapter 8 - Exponential Smoothing - has tasks that require the implementation of ETS. This really helped me to understand ETS. But, before I go into that part, let’s see what theory says about ETS.
2 Theory
Hyndman says the following on ETS:
Exponential smoothing was proposed in the late 1950s (Brown, 1959; Holt, 1957; Winters, 1960), and has motivated some of the most successful forecasting methods. Forecasts produced using exponential smoothing methods are weighted averages of past observations, with the weights decaying exponentially as the observations get older. In other words, the more recent the observation the higher the associated weight. This framework generates reliable forecasts quickly and for a wide range of time series, which is a great advantage and of major importance to applications in industry.
He also goes on to say that ETS (to be more precise, SES or simple exponential smoothing) is most suitable for forecasting data with no clear trend or seasonal pattern. The math behind the ETS is presented in 2 ways: through weighted average form and through component form. From the programming standpoint, I like component form better, but let’s present both approaches.
2.1 Weighted form
Meaning of these mathematical symbols is as follows:
: first (forecasted) future value ( ), given last historical value ( ). : last historical value. : last forecasted value at index . Spoiler alert: this is where magic happens. : smoothing parameter.
Hold your horses, you’re probably having questions about
2.2 Component form
The component form will get us to the same results, but it’s actually easier to see what is really going on. It consists of one simple equation, which we (surprise, surprise) call the forecast equation:
This is identical to the first formula in the weighted form, only the symbols are different. That’s it.
Okay, enough with the theory. Let’s implement this.
3 Implementation
Exercise 1 is as follows:
Consider the the number of pigs slaughtered in Victoria, available in the
aus_livestock
dataset. Use theETS()
function to estimate the equivalent model for simple exponential smoothing. Find the optimal values ofand , and generate forecasts for the next four months.
tidyverse
and fpp3
is loaded, so we can go on with the solution.
Let’s forecast with ETS
using tools from fable
:
<- aus_pigs %>%
fit model(ETS(Count ~ error("A") + trend("N") + season("N")))
report(fit)
Series: Count
Model: ETS(A,N,N)
Smoothing parameters:
alpha = 0.3221247
Initial states:
l[0]
100646.6
sigma^2: 87480760
AIC AICc BIC
13737.10 13737.14 13750.07
Remember those
Let’s see fit on the training data:
augment(fit) %>%
select(Month, Count, .fitted) %>%
rename(Training = Count, Fitted = .fitted) %>%
pivot_longer(-Month, names_to = "Category", values_to = "Count") %>%
mutate(Year = year(Month),
MonthInt = month(Month)) %>%
filter(Year >= 2003) %>%
ggplot(aes(x = MonthInt, y = Count, color = Category)) +
geom_line() +
scale_y_continuous(labels = scales::label_comma(1, scale = 1/1e3, suffix = "K"),
breaks = seq(0, 200e3, 30000)) +
scale_x_continuous(breaks = seq(0, 12, 2)) +
labs(y = "Count of slaughtered pigs", x = "",
title = "Number of pigs slaughtered in Victoria") +
scale_color_brewer(palette = "Set1") +
facet_wrap(. ~ Year, scales = "free_x")
Nomen est omen, and in ETS we can see the smoothed curve for each1 historical year.
What is the actual forecast?
<- fit %>%
fc forecast(h = 4)
fc
# A fable: 4 x 4 [1M]
# Key: .model [1]
.model Month Count .mean
<chr> <mth> <dist> <dbl>
1 "ETS(Count ~ error(\"A\") + trend(\"N\") + ~ 2019 Jan N(95187, 8.7e+07) 95187.
2 "ETS(Count ~ error(\"A\") + trend(\"N\") + ~ 2019 Feb N(95187, 9.7e+07) 95187.
3 "ETS(Count ~ error(\"A\") + trend(\"N\") + ~ 2019 Mar N(95187, 1.1e+08) 95187.
4 "ETS(Count ~ error(\"A\") + trend(\"N\") + ~ 2019 Apr N(95187, 1.1e+08) 95187.
So, around 95k. Let’s plot this:
<- fc %>%
plot_all autoplot(aus_pigs) +
scale_y_continuous(labels = scales::label_comma(1, scale = 1/1e3, suffix = "K"),
breaks = seq(0, 300e3, 20000)) +
labs(y = "Count of slaughtered pigs",
title = "Number of pigs slaughtered in Victoria",
x = "")
<- fc %>%
plot_zoom autoplot(aus_pigs %>% filter(lubridate::year(Month) >= 2010)) +
scale_y_continuous(labels = scales::label_comma(1, scale = 1/1e3, suffix = "K"),
breaks = seq(0, 300e3, 20000)) +
labs(y = "Count of slaughtered pigs",
title = "Zoomed plot (2010 onwards)",
x = "")
::plot_grid(plot_all, plot_zoom,
cowplotnrow = 1)
Seems reasonable. The 95% confidence interval accounts for historical variation that is clearly visible in the past 5 years.
Let’s try to implement ETS on our own. Our target is:
coef(fit) %>%
select(term, estimate)
# A tibble: 2 x 2
term estimate
<chr> <dbl>
1 alpha 0.322
2 l[0] 100647.
4 Coding the ETS method
Solution is made out of few components, so bear with me.
4.1 exp_smoothing_manual()
<- function(y,
exp_smoothing_manual
arg_alpha,
arg_ell_zero,bool_forecast_h1 = FALSE) {
if (bool_forecast_h1) {
<- length(y) + 1
total_len else {
} <- length(y)
total_len
}
<- (1 - arg_alpha)
anti_alpha
<- rep(NA, total_len)
store_predictions 1] <- arg_ell_zero
store_predictions[
for (i in seq_along(store_predictions)) {
if (i == 1) {
<- store_predictions[i]
last_y <- store_predictions[i]
last_yhat else {
} <- y[i - 1]
last_y <- store_predictions[i - 1]
last_yhat
}
<- arg_alpha * last_y
term_01
<- anti_alpha * last_yhat
term_02
<- term_01 + term_02
yhat
<- yhat
store_predictions[i]
}
<- yhat[length(yhat)]
out
return(out)
}
This is nothing else than function2 that implements the formula as given in theory
So, does this function give the same result as ETS model from fable
?
<- coef(fit) %>%
tbl_coef select(term, estimate)
<- exp_smoothing_manual(
forecast_josip y = aus_pigs$Count,
arg_alpha = tbl_coef$estimate[tbl_coef$term == "alpha"],
arg_ell_zero = tbl_coef$estimate[tbl_coef$term == "l[0]"],
bool_forecast_h1 = TRUE
)
<- fc$.mean[1]
forecast_fable
# Moment of truth:
== forecast_fable forecast_josip
[1] TRUE
Splendid!
But, now we want to build our own ETS model.
We will need the implementation of exp_smoothing_manual()
that returns SSE.
4.2 optimize_exp_smoothing()
<- function(pars = c(arg_alpha, arg_ell_zero), y) {
optimize_exp_smoothing
<- rep(NA, length(y))
out_predicted
for (i in seq_along(out_predicted)) {
<- exp_smoothing_manual(
out_predicted[i] y = y[1:i],
arg_alpha = pars[1],
arg_ell_zero = pars[2]
)
}
<- (out_predicted - y) ^ 2
squared_errors <- sum(squared_errors)
out return(out)
}
So, we will need to find
Let’s optimize it.
4.3 optim()
R has optimization function called optim()
(info)
The optim()
requires starting values for it’s algorithms. For
<- optim(
optim_pigs par = c(0.5, aus_pigs$Count[1]),
y = aus_pigs$Count,
fn = optimize_exp_smoothing,
method = "Nelder-Mead"
)
optim_pigs
$par
[1] 3.221274e-01 9.922308e+04
$value
[1] 48642449532
$counts
function gradient
99 NA
$convergence
[1] 0
$message
NULL
Let’s compare these results with parameters from ETS (fable
):
coef(fit) %>%
select(term, estimate) %>%
mutate(estimate_me = optim_pigs$par,
diff_abs = abs(estimate_me - estimate),
diff_rel = diff_abs / estimate) %>%
mutate(across(where(is.numeric), function(x) round(x, 4)))
# A tibble: 2 x 5
term estimate estimate_me diff_abs diff_rel
<chr> <dbl> <dbl> <dbl> <dbl>
1 alpha 0.322 0.322 0 0
2 l[0] 100647. 99223. 1424. 0.0141
So, I’ve got fable
ETS for differences or choose different optimization algorithm from optim()
. But, this is good enough for me, at least for the purpose of solving this exercise.