ao

Lennart Oelschläger

2022-01-04

Theory

What is alternating optimization?

Alternating optimization is an iterative procedure for optimizing some function jointly over all parameters by alternating restricted optimization over individual parameter subsets.

More precisely, consider minimizing or maximizing \(f:\mathbb{R}^n \to \mathbb{R}\) over the set of feasible points \(x \in X \subseteq \mathbb{R}^n\). The underlying algorithm of alternating optimization is as follows:

  1. Assign an initial value for \(x\).

  2. Optimize \(f\) with respect to a subset of parameters \(\tilde{x}\) while holding the other parameters constant1. This can be done explicitly or implicitly (by numerical optimization, which is the implemented method in this package).

  3. Replace the values in \(x\) by the optimal values for \(\tilde{x}\) found in step 2.

  4. Repeat from step 2 with another parameter subset.

  5. Stop when the process has converged or reached an iteration limit.

When is alternating optimization a good idea?

What are the properties of alternating optimization?

Alternating optimization, under certain conditions on \(f\), can convergence to the global optimum. However, the set of possible solutions also contains saddle points of \(f\), see for example (James C. Bezdek et al. 1987).

(J. Bezdek and Hathaway 2003) shows that alternating optimization under reasonable assumptions is locally q-linearly convergent.

Application

As an application, we consider minimizing the Himmelblau’s function in \(n = 2\) dimensions with parameter constraints \(-5 \leq x_1, x_2 \leq 5\):

himmelblau <- function(x) (x[1]^2+x[2]-11)^2+(x[1]+x[2]^2-7)^2

Problem specification

The ao package requires that the function first is introduced via the ao::set_f() function in the following way:

f <- ao::set_f(f = himmelblau, npar = 2, lower = -5, upper = 5, check = TRUE)
#> Configuration checked.

The output f is an object of class ao_f which can be passed to the implementation of the alternating optimization algorithm ao::ao() in the following.

ao::set_f() has the following arguments:

Alternating Optimization

Next, we pass f to ao::ao() as follows to perform alternating optimization:

ao::ao(f = f, partition = list(1, 2), initial = 0, iterations = 10, 
       tolerance = 1e-6, minimize = TRUE, progress = FALSE, plot = FALSE)
#> Optimum value: 1.940035e-12 
#> Optimum at: 3.584428 -1.848126 
#> Optimization time: 0.33 seconds

This does the following: It minimizes f by alternating optimizing with respect to each parameter separately, where the parameters all are initialized at the value 0. The algorithm terminates after 10 iterations or prematurely if the euclidean distance between the current solution and the one from the last iteration is smaller than tolerance = 1e-6.2

Let’s also review the arguments of ao::ao():

References

Bezdek, James C., and Richard J. Hathaway. 2002. “Some Notes on Alternating Optimization.” Proceedings of the 2002 AFSS International Conference on Fuzzy Systems. Calcutta: Advances in Soft Computing.
Bezdek, James C, Richard J Hathaway, Michael J Sabin, and William T Tucker. 1987. “Convergence Theory for Fuzzy c-Means: Counterexamples and Repairs.” IEEE Transactions on Systems, Man, and Cybernetics 17 (5): 873–77.
Bezdek, James, and Richard Hathaway. 2003. “Convergence of Alternating Optimization.” Neural, Parallel and Scientific Computations 11 (December): 351–68.
Hu, Yingkang, and Richard J Hathaway. 2002. “On Efficiency of Optimization in Fuzzy c-Means.” Neural, Parallel and Scientific Computations 10.

  1. Note that alternating optimization is a generalization of joint optimization, where the only parameter subset would be the whole set of parameters.↩︎

  2. We set progress = FALSE and plot = FALSE here because setting these parameters to TRUE would heavily expand the output. However, these options for printing and plotting the progress are useful for analyzing the behaviour of the alternating optimization.↩︎