The package uGMAR contains tools to estimate and analyze univariate Gaussian mixture autoregressive (GMAR), Student’s t mixture autoregressive (StMAR) and Gaussian and Student’s t mixture autoregressive (G-StMAR) models. We refer to these three models as the GSMAR models. This vignette does not explain details about the models and it’s assumed that the reader is familiar with the cited articles introducing the models. More thorough vignette will be probably added in somewhere (relatively) near future.
The models in uGMAR are defined as class gsmar S3 objects whose can be created with the estimation function fitGSMAR or with the constructor function GSMAR. The created gsmar objects can then be conveniently used as main arguments in several other functions, allowing one for example to perform quantile residual based model diagnostics, simulate from the processes, and to forecast. It’s thus easy to carry out further analyses after the model has been estimated. Some tasks, however, such as setting up initial population for the genetic algorithm, applying linear constraints or building gsmar models with pre-specified parameter values, require knowledge of the model details such as the form of the parameter vector. These details will be therefore explained in this vignette.
The rest of this vignette is organized as follows. In the second section notations for the parameter vector are described in detail, and it is also shown how to apply constraints to autoregressive parameters of the models. In the third section, some useful functions provided by uGMAR are briefly described.
Building a GSMAR model requires the user to specify the autoregressive (AR) order of the model p and the number of mixture components M. For the G-StMAR model one has to define the number of GMAR type components M1 and the number of StMAR type components M2. If one wished to build a model with pre-specified parameter values rather than estimating them, knowledge of the exact form of the parameter vector is obviously necessary. In uGMAR, the form of the parameter vector depends on specifics of the model: is GMAR, StMAR or G-StMAR model considered, are all the AR coefficients restricted to be the same for all regimes and/or are linear constraints applied to the AR-parameters? It’s vital to use the correct type of parameter vector accordingly.
In the following, the intercept parametrization with intercept parameters \(\phi_{m,0}\) is considered. One may alternatively use the mean parametrization; in that case, one simply needs to replace each intercept parameter with the corresponding mean parameter \(\mu_m=\phi_{m,0}/(1-\sum_{i=1}^p\phi_{i,m}),\enspace m=1,...,M.\)
The parameter vector for unconstrained GMAR model is a size (M(p+3)-1)x1 vector of the form \[\boldsymbol{\theta}=(\boldsymbol{\upsilon_{1}},...,\boldsymbol{\upsilon_{M}}, \alpha_{1},...,\alpha_{M-1}),\quad where\] \[\boldsymbol{\upsilon_{m}}=(\phi_{m,0},\boldsymbol{\phi_{m}}, \sigma_{m}^2) \enspace and \enspace \boldsymbol{\phi_{m}}=(\phi_{m,1},...,\phi_{m,p}) ,\quad m=1,...,M.\] Symbol \(\phi_{m,i}\) denotes an AR coefficient, \(\sigma_m^2\) is a variance parameter and \(\alpha_m\) a mixing weight parameter.
For the StMAR model, the parameter vector has to be expanded to include the degrees of freedom parameters. The parameter vector for unconstrained StMAR model is thus a size (M(p+4)-1)x1 vector of the form \[(\boldsymbol{\theta}, \boldsymbol{\nu}),\quad where \quad \boldsymbol{\nu}=(\nu_{1},...,\nu_{M})\] contains the degrees of freedom parameters and the parameter \(\boldsymbol{\theta}\) is as in the case of the GMAR model. To ensure existence of finite second moments, the degrees of freedom parameters \(\nu_{m}\) are assumed to be larger than \(2\).
In the G-StMAR model the first M1 components are GMAR type and the rest M2 components are StMAR type. Parameter vector of the G-StMAR model is similar to the one of the StMAR model but with M2 degrees of freedom parameters for the StMAR components. That is, a size (M(p+3)+M2-1)x1 vector of the form \[(\boldsymbol{\theta}, \boldsymbol{\nu}),\quad where \quad \boldsymbol{\nu}=(\nu_{M1+1},...,\nu_{M})\] contains the degrees of freedom parameters and the parameter \(\boldsymbol{\theta}\) is as in the case of the GMAR model. As in the StMAR case, the degrees of freedom parameters are assumed to be larger than two.
In addition to unconstrained GSMAR models, uGMAR gives an option to analyze restricted models whose AR coefficients \(\phi_{m,1},...,\phi_{m,p}\) are restricted to be the same for all regimes \(m=1,..,M\). Structure of the parameter vector is different for restricted and non-restricted models.
Parameter vector of the restricted GMAR model is a size (3M-p+1)x1 vector of the form \[\boldsymbol{\theta}=(\phi_{1,0},...,\phi_{M,0},\boldsymbol{\phi},\sigma_{1}^2,...,\sigma_{M}^2,\alpha_{1},...,\alpha_{M-1}),  \quad where \quad \boldsymbol{\phi}=(\phi_{1},...,\phi_{p}).\]
Parameter vector of the restricted StMAR model is then defined by adding the degrees of freedom parameters, yielding a size (4M-p+1)x1 vector of the form \[(\boldsymbol{\theta}, \boldsymbol{\nu}),\quad where \quad \boldsymbol{\nu}=(\nu_{1},...,\nu_{M})\] again contains the degrees of freedom parameters and parameter \(\boldsymbol{\theta}\) is as in the case of the GMAR model.
Parameter vector of the restricted G-StMAR model is similar to the StMAR model’s one but with M2 degrees of freedom parameters for the StMAR type components.
So one will have to work with different kind of parameter vectors depending on whether you work with restricted or non-restricted model. In order to restrict the AR parameters or to implicate that the parameter vector is restricted, one needs to supply the considered function with the argument restricted=TRUE.
uGMAR makes it easy to apply linear constraints to the autoregressive parameters of GSMAR models. Considering non-restricted models, each mixture component has its own constraint matrix. uGMAR considers constraints of the form \[\boldsymbol{\phi_{m}}=\boldsymbol{C_{m}\psi_{m}}, \enspace m=1,...,M,\] where \(\boldsymbol{C_{m}}\) is a known size \((pxq_{m})\) constraint matrix of full column rank and \(\boldsymbol{\psi_{m}}\) is a size \((q_{m}x1)\) parameter vector. Observe that this particular specification for linear constraints is not the most general one. However, it keeps the constraint matrices small and simple, and it’s convenient for applying the most typical constraints such as constraining some of the AR coefficients to be zero. For instance, in order to constraint the second AR coefficient of the second regime to zero in a model with p=2 and M=2, the constraint matrix for the first regime is diag(2) implying no constraints, while the constraint matrix for the second regime is simply matrix(1:0).
For further illustration, consider the following special case of linear constraints. We obtain a mixture version of the Heterogeneous Autoregressive model (see Corsi 2009 for the original version) by setting \[\boldsymbol{C_{m}}=\left[{\begin{array}{ccc}
   \boldsymbol{\iota}_{5} & \frac{1}{5}\boldsymbol{1}_{5} & \frac{1}{22}\boldsymbol{1}_{5} \\
   0_{17} & 0_{17} & \frac{1}{22}\boldsymbol{1}_{17} \\
  \end{array}}\right],\] where \(\boldsymbol{\iota}_{5}=[1,0,0,0,0]'\) for all regimes \(m=1,...,M\) and applying the constraints to the GMAR(22,M) model.
In order to apply the linear constraints in uGMAR, one simply needs to parametrize the model with vectors \(\boldsymbol{\psi_{m}}\) instead of \(\boldsymbol{\phi_{m}}\) and provide the constraint matrices \(\boldsymbol{C_{m}}\) in the argument constraints (or if one estimates the parameters, only the constraint matrices need to be provided). Note that despite the lengths of \(\boldsymbol{\psi_{m}}\), the nominal order of AR coefficients is always \(p\) for all regimes.
Similarly to the case of unconstrained GMAR model, parameter vector for the constrained GMAR model is of the form \[\boldsymbol{\theta}=(\boldsymbol{\upsilon_{1}},...,\boldsymbol{\upsilon_{M}}, \alpha_{1},...,\alpha_{M-1}),\] but now the vectors \(\boldsymbol{\upsilon_{m}}\) are defined by using the vectors \(\boldsymbol{\psi_{m}}\), that is, \[\boldsymbol{\upsilon_{m}}=(\phi_{m,0},\boldsymbol{\psi_{m}}, \sigma_{m}^2) \enspace and \enspace \boldsymbol{\psi_{m}}=(\psi_{m,1},...,\psi_{m,q_{m}}), \enspace m=1,...,M.\] The user has to also provide a list of constraint matrices \(\boldsymbol{R_{m}}\) that satisfy \(\boldsymbol{\phi_{m}}=\boldsymbol{R_{m}\psi_{m}}\) for all \(m=1,...,M.\)
Parameter vector for the constrained StMAR model is defined by simply adding the degrees of freedom parameters to the GMAR’s parameter vector, that is, \[(\boldsymbol{\theta}, \boldsymbol{\nu}),\quad where \quad \boldsymbol{\nu}=(\nu_{1},...,\nu_{M}),\] and \(\boldsymbol{\theta}\) is as in the case of constrained GMAR model.
Parameter vector for the constrained G-StMAR model is similar to the one of the constrained StMAR model, but with degrees of freedom parameters for the StMAR components only.
Analogously non-restricted models, the parameter vectors for constrained versions of restricted GSMAR models are defined by simply replacing vector \(\boldsymbol{\phi}\) with vector \(\boldsymbol{\psi}\). Hence the parameter vector for restricted and constrained GMAR model has the form \[\boldsymbol{\theta}=(\phi_{1,0},...,\phi_{M,0},\boldsymbol{\psi},\sigma_{1}^2,...,\sigma_{M}^2,\alpha_{1},...,\alpha_{M-1}), \quad where \quad \boldsymbol{\psi}=(\psi_{1},...,\psi_{p}).\] The constraint matrix \(\boldsymbol{C}\) needs to be provided and it’s assumed to satisfy \(\boldsymbol{\phi}=\boldsymbol{R\psi}.\)
Parameter vector for the restricted and constrained StMAR model is then again defined by adding the degrees of freedom parameters, that is \((\boldsymbol{\theta}, \boldsymbol{\nu})\) where \(\boldsymbol{\nu}=(\nu_{1},...,\nu_{M}).\) For the restricted and constrained G-StMAR model, the parameter vector is similar to the one of restricted and constrained StMAR model but with degrees of freedom parameters for the StMAR components only.
The function used to estimate models in uGMAR is fitGSMAR. It estimates the model parameters using the method of maximum likelihood and employs a hybrid estimation scheme that is performed in two phases. In the first phase fitGSMAR uses a genetic algorithm to find starting values for the gradient based variable metric algorithm, which it then uses in the second phase for finalize the estimation. It’s important to note that it’s not guaranteed that the numerical estimation algorithms end up in the global maximum point rather than a local one or a saddle point. Because of multimodality and challenging surface of the log-likelihood function, it’s actually expected that many of the estimation rounds won’t find the global maximum point. For this reason one should always perform multiple estimation rounds since more estimation rounds yield more reliable result. The number of estimation rounds can be chosen with the argument ncalls but multiple estimation rounds is also performed default. To shorten the estimation time, uGMAR uses parallel computing to run multiple estimation rounds in parallel. The number of cores used can be set with the argument ncores.
There is also an option to perform some quantile residual tests for the estimated model to get a quick sense on how the model fits to the data.
If the model estimates poorly, it’s often because the number of mixture components is chosen too large. One may also adjust settings of the genetic algorithm employed, or set up an initial population with guesses for the estimates. This can by done by passing arguments in fitGSMAR to the (non-exported) function GAfit which implements the genetic algorithm. To check the available settings, read the documentation ?GAfit. If the iteration limit is reached when estimating the model, the function iterate_more can be used to finish the estimation.
The parameters of the estimated model are printed in an illustrative and easy to read form. In order to easily compare approximate standard errors to certain estimates, it’s advisable to use the summary method, which prints the errors inside brackets next to the estimates. Numerical approximation of the gradient and Hessian matrix of the log-likelihood function at the estimates can be obtained conveniently with the functions get_gradient and get_hessian. The estimated objects also have their own plot method.
Use the function ‘stmar_to_gstmar’ in order to conveniently switch from a StMAR model with large degrees of freedom estimates to the corresponding G-StMAR model.
uGMAR considers model diagnostics based on quantile residuals (see Kalliovirta 2012). Quantile residuals are asymptotically standard normal distributed if the model is correctly specified, and they can be hence used for graphical diagnostics and testing.
The function quantileResidualTests performs the quantile residual tests introduced by Kalliovirta (2012), testing for normality, autocorrelation and conditional heteroscedasticity. For graphical diagnostics, one may use the functions diagnosticPlot and quantileResidualPlot.
Consider installing the suggested package gsl for much faster evaluations of quantile residuals in the cases of StMAR and G-StMAR models. If the model and data are both large, performing quantile residuals tests may take significantly long time for StMAR and G-StMAR models without the package gsl because numerical integration is used. It’s not imported because, in our experience, it might not install to some platforms directly when installing uGMAR.
One may wish to construct an arbitrary model without estimating the parameters, for example in order to simulate from the particular process of interest. An arbitrary model can be created with the function GSMAR. If one wants to add or update data to the model afterwards, it’s advisable to use the function add_data.
The function simulateGSMAR is the one for the job. As the main argument it uses a gsmar object created with fitGSMAR or GSMAR.
We advice to directly use the function simulateGSMAR for quantile based forecasting. However, uGMAR contains the predict method predict.gsmar for forecasting GSMAR processes. For one step predictions using the exact formula for conditional mean is supported, but the forecasts further than that are based on independent simulations. The predictions are either sample means or medians and the confidence intervals are based on sample quantiles. The objects generated by predict.gsmar have their own plot method.
Test linear hypotheses using Wald test with the function and using likelihood ratio test with the function .
For analysing multivariate versions of the model, you are welcome to try the package gmvarkit. It currently supports the GMVAR model which is the multivariate extension of the GMAR model.