Example: Hidden Markov Model

This vignette1 describes the workflow of the {ino} package for the likelihood optimization of an hidden Markov model (HMM). For more technical details about HMMs see, e.g., Zucchini et al. (2016).

Data

The example data set considered throughout this vignette covers a series of earthquake counts. The data set is also included in the {ino} package.

data("earthquakes")
ggplot(earthquakes, aes(x = year, y = obs)) +
  geom_point() + geom_line() + ylab("count")

As the observations are unbounded counts, we consider a Poisson-HMM. The (log-)likelihood of a Poisson-HMM is given by the function f_ll_hmm().

Model fitting

We consider an 2-state HMM (N = 2) with four parameters (npar = 4) to be estimated. Two parameters are required for the transition probability matrix (tpm), and another two correspond to the state-dependent Poisson means. neg = TRUE indicates that we minimize the negative log-likelihood.

hmm_ino <- setup_ino(
  f = f_ll_hmm,
  npar = 4,
  data = ino::earthquakes,
  N = 2,
  neg = TRUE,
  opt = set_optimizer_nlm()
)

Random initialization

We first use randomly chosen starting values.

for(i in 1:3)
  hmm_ino <- random_initialization(hmm_ino)

Fixed initialization

For choosing fixed starting values, we set the two values for the tpm to -2, corresponding to probabilities to stay in state 1 and 2 of about 0.88 (note that we employ a multinomial logit link here to ensure that the probabilities are between 0 and 1). For the Poisson means, we consider values close to 15 and 25, as most counts fall in this region (note here that we exponentiate the means in the likelihood as they are constrained to be positive). We consider the following three sets of starting values:

starting_values <- list(c(-2, -2, log(20), log(40)), 
                        c(-2, -2, log(15), log(25)),
                        c(-2, -2, log(10), log(20)))
for(val in starting_values)
  hmm_ino <- fixed_initialization(hmm_ino, at = val)

Subset initialization

To illustrate the subset initialization strategy, we fit our HMM to the first 50% of the observation. The starting values for these subsets are chosen randomly. The function subset_initialization() then fits the HMM again to the entire sample using the estimates obtained from the subsets as initial values.

for(i in 1:3)
  hmm_ino <- subset_initialization(hmm_ino, arg = "data", how = "first", prop = 0.5)

Evaluating the optimization runs

Local optima

Selecting the starting values for HMMs is a well-known issue, as poor starting values may likely result in local maxima. Other R packages designed to fit HMMs discuss this topic in more detail (see, e.g., https://cran.r-project.org/package=moveHMM). We thus first evaluate the optimizations by comparing the likelihood values at convergence, which can be displayed using overview_optima():

overview_optima(hmm_ino)
#>   optimum frequency
#> 1  342.32         4
#> 2  391.92         5

The frequency table indicates that 4 runs converged to the same likelihood value (342.32), which is the global maximum (note these are the negative log-likelihood values).

Using summary, we can investigate all optimization runs:

summary(hmm_ino)
#>       .strategy           .time .optimum code iterations
#> 1        random 0.15281987 secs 391.9189    1         34
#> 2        random 0.10261297 secs 391.9189    1         24
#> 3        random 0.20910716 secs 342.3183    1         40
#> 4         fixed 0.08521390 secs 342.3183    1         26
#> 5         fixed 0.07301402 secs 342.3183    1         20
#> 6         fixed 0.08365798 secs 342.3183    1         26
#> 7 subset>random 0.06353283 secs 391.9189    1          5
#> 8 subset>random 0.07892013 secs 391.9189    1          4
#> 9 subset>random 0.06573987 secs 391.9189    1          4

The table further indicates that (e.g.) the fourth optimization converged to the global optimum. We can extract the corresponding initial values and estimates as follows:

hmm_ino$runs$pars[[4]]
#> $.init
#> [1] -2.000000 -2.000000  2.995732  3.688879
#> 
#> $.estimate
#> [1] -1.914219 -2.650478  2.739047  3.262906

Optimization time

We use the plot() function to investigate the computation time. Intuitively, the optimization runs with fixed starting values should be faster than with random starting values, since we have carefully chosen the fixed starting values by investigating the data.The boxplots confirm this intuition:

plot(hmm_ino, var = ".time", by = ".strategy")


  1. The vignette was build using R .4 with the {ino} 0.1.0 package.↩︎