
## Constrained and Unconstrained Optimization

Input description: A function f(x_1,...,x_n).

Problem description: What point maximizes (or minimizes) the function f?

Discussion: Most of this book concerns algorithms that optimize one thing or another. This section considers the general problem of optimizing functions where, due to lack of structure or knowledge, we are unable to exploit   the problem-specific algorithms seen elsewhere in this book.

Optimization arises whenever there is an objective function that must be tuned for optimal performance. Suppose we are building a program to identify good stocks to invest in. We have available certain financial data to analyze, such as the price-earnings ratio, the interest and inflation rates, and the stock price, all as a function of time t. The key question is how much weight we should give to each of these factors, where these weights correspond to coefficients of a formula such as goodness(t) = c_1 * price-earnings(t) + c_2 * interest(t) + c_3 * price(t). We seek the numerical values c_1, c_2, c_3 whose stock-goodness function does the best job of evaluating stocks. Similar issues arise in tuning evaluation functions for game-playing programs such as chess.

Unconstrained optimization problems also arise in scientific computation. Physical systems from protein structures to particles naturally seek to minimize their "energy functions." Thus programs that attempt to simulate nature often define energy potential functions for the possible configurations of objects and then take as the ultimate configuration the one that minimizes this potential.

Global optimization problems tend to be hard, and there are lots of ways to go about them.   Ask the following questions to steer yourself in the right direction:

• Am I doing constrained or unconstrained optimization? - In unconstrained optimization, there are no limitations on the values of the parameters other than that they maximize the value of f. Often, however, there are costs or constraints on these parameters. These constraints make certain points illegal, points that might otherwise be the global optimum. Constrained optimization problems typically require mathematical programming approaches like linear programming, discussed in the next section.
• Is the function I am trying to optimize described by a formula or data? - If the function that you seek to optimize is presented as an algebraic formula, the solution is to take its derivative f'(x) analytically and see for which points p we have f'(p) = 0. These points are either local maxima or minima, which can be distinguished by taking a second derivative or just plugging them back into f and seeing what happens. Symbolic computation systems such as Mathematica and Maple are fairly effective at computing such derivatives, although using computer algebra systems effectively is somewhat of a black art. They are definitely worth a try, however, and you can always use them to plot a picture of your function to get a better idea of what you are dealing with.
• How expensive is it to compute the function at a given point? - If the function f is not presented as a formula, what to do depends upon what is given. Typically, we have a program or subroutine that evaluates f at a given point, and so can request the value of any given point on demand. By calling this function, we can poke around and try to guess the maxima. Our freedom to search in such a situation depends upon how efficiently we can evaluate f. If f is just a complicated formula, evaluation will be very fast. But suppose that f represents the effect of the coefficients on the performance of the board evaluation function in a computer chess program, such that c_1 is how much a pawn is worth, c_2 is how much a bishop is worth, and so forth. To evaluate a set of coefficients as a board evaluator, we must play a bunch of games with it or test it on a library of known positions. Clearly, this is time-consuming, so we must be frugal in the number of evaluations of f we use.
• How many dimensions do we have? How many do we need? - The difficulty in finding a global maximum increases rapidly with the number of dimensions (or parameters). For this reason, it often pays to reduce the dimension by ignoring some of the parameters. This runs counter to intuition, for the naive programmer is likely to incorporate as many variables as possible into their evaluation function. It is just too hard to tweak such a complicated function. Much better is to start with the 3 to 5 seemingly most important variables and do a good job optimizing the coefficients for these.
• How smooth is my function? - The main difficulty of global optimization is getting trapped in local optima. Consider the problem of finding the highest point in a mountain range. If there is only one mountain and it is nicely shaped, we can find the top by just walking in whatever direction is up. However, if there are many false summits or other mountains in the area, it is difficult to convince ourselves whether we are really at the highest point. Smoothness is the property that enables us to quickly find the local optimum from a given point. We assume smoothness in seeking the peak of the mountain by walking up. If the height at any given point were a completely random function, there would be no way to find the optimum height short of sampling every single point.
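The derivative recipe in the second bullet can be carried out by hand for simple formulas. Below is a minimal sketch for a made-up polynomial (not one from the text): solve f'(x) = 0 analytically, then use the second derivative to confirm the critical point is a minimum.

```python
def f(x):
    return x**2 - 6*x + 10             # invented sample polynomial

def f_prime(x):
    return 2*x - 6                     # derivative taken analytically

# Solving f'(x) = 0 by hand: 2x - 6 = 0  =>  x = 3.
critical_point = 3.0

# The second derivative f''(x) = 2 is positive, so this critical
# point is a local (here, global) minimum rather than a maximum.
assert f_prime(critical_point) == 0.0
print(critical_point, f(critical_point))   # 3.0 1.0
```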
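The local-optima trap in the last bullet is easy to demonstrate. The sketch below hill-climbs a small invented landscape with a false summit and a higher true summit; the function, step size, and starting points are all illustrative assumptions.

```python
def f(x):
    # An invented bumpy landscape: a false summit on the left and a
    # higher true summit on the right.
    return -x**4 + 4*x**2 + x

def hill_climb(x, step=1e-3, iters=10_000):
    """Walk uphill in small steps until neither neighbor is higher."""
    for _ in range(iters):
        if f(x + step) > f(x):
            x += step
        elif f(x - step) > f(x):
            x -= step
        else:
            break
    return x

left = hill_climb(-2.0)    # trapped on the false summit near x = -1.35
right = hill_climb(2.0)    # reaches the true summit near x = 1.47
# f(right) > f(left): the answer depends entirely on the starting point
```

Starting from the left, the walk stops on the lower peak and never sees the higher one, which is exactly why smoothness alone is not enough to guarantee a global optimum.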

Efficient algorithms for unconstrained global optimization use derivatives and partial derivatives to find local optima, since these point out the direction in which moving from the current point does the most to increase or decrease the function. Such derivatives can sometimes be computed analytically, or they can be estimated numerically by taking the difference between the values of nearby points. A variety of steepest descent and conjugate gradient methods to find local optima have been developed, similar in many ways to numerical root-finding algorithms.
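As a sketch of the numerical-estimation idea, here is a bare-bones steepest descent that approximates each partial derivative by a central difference; the test function, step rate, and iteration count are arbitrary choices for illustration.

```python
def f(p):
    x, y = p
    return (x - 1)**2 + 2*(y + 2)**2   # smooth bowl, minimum at (1, -2)

def numeric_gradient(func, p, h=1e-6):
    """Estimate each partial derivative by a central difference."""
    grad = []
    for i in range(len(p)):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        grad.append((func(hi) - func(lo)) / (2 * h))
    return grad

def steepest_descent(func, p, rate=0.1, iters=500):
    """Repeatedly step against the (estimated) gradient."""
    for _ in range(iters):
        g = numeric_gradient(func, p)
        p = [pi - rate * gi for pi, gi in zip(p, g)]
    return p

x, y = steepest_descent(f, [5.0, 5.0])
# converges close to the true minimum (1, -2)
```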

It is a good idea to try out several different methods on any given optimization problem. For this reason, we recommend experimenting with the implementations below before attempting to implement your own method. Clear descriptions of these algorithms are provided in several numerical algorithms books, in particular [PFTV86].

For constrained optimization, finding points that satisfy all the constraints is often the difficult problem. One approach is to use a method for unconstrained optimization, but add a penalty according to how many constraints are violated.     Determining the right penalty function is problem-specific, but it often makes sense to vary the penalties as optimization proceeds. At the end, the penalties should be very high to ensure that all constraints are satisfied.
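The escalating-penalty idea can be sketched in one dimension. Here the (invented) problem is to minimize x^2 subject to x >= 1; a quadratic penalty for violating the constraint is increased over several rounds, driving the minimizer toward the constrained optimum.

```python
def objective(x):
    return x**2                        # unconstrained minimum is at x = 0

def penalized(x, mu):
    """Objective plus a quadratic penalty, weighted by mu, for violating x >= 1."""
    violation = max(0.0, 1.0 - x)
    return objective(x) + mu * violation**2

def minimize_1d(g, x, step=1e-3, iters=200_000):
    """Crude derivative-free descent: step toward a lower neighbor."""
    for _ in range(iters):
        if g(x + step) < g(x):
            x += step
        elif g(x - step) < g(x):
            x -= step
        else:
            break
    return x

x = 3.0
for mu in [1.0, 10.0, 100.0, 1000.0]:  # escalate the penalty each round
    x = minimize_1d(lambda t, m=mu: penalized(t, m), x)
# the minimizer of x^2 + mu*(1 - x)^2 is mu/(1 + mu), approaching 1 as mu grows
```

Note how each round warm-starts from the previous round's answer, and how only the final, heavily weighted round forces the constraint to be (nearly) satisfied.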

Simulated annealing is a fairly robust and simple approach to constrained optimization, particularly when we are optimizing over combinatorial structures (permutations, graphs, subsets) instead of continuous functions. Techniques for simulated annealing are described in their own section.
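Here is a toy sketch of annealing over a combinatorial structure, a small invented subset-sum instance; the instance, cooling schedule, and random seed are all arbitrary assumptions for illustration.

```python
import math
import random

random.seed(7)                         # reproducible toy run
numbers = [8, 5, 12, 3, 7, 10, 6]      # invented instance
target = 20                            # want a subset summing to 20

def cost(mask):
    """Distance of the chosen subset's sum from the target (0 is perfect)."""
    return abs(sum(n for n, bit in zip(numbers, mask) if bit) - target)

state = [random.randint(0, 1) for _ in numbers]
start_cost = cost(state)
best, best_cost = state[:], start_cost

temp = 10.0
while temp > 0.01:
    i = random.randrange(len(numbers))
    neighbor = state[:]
    neighbor[i] ^= 1                   # flip one element in or out of the subset
    delta = cost(neighbor) - cost(state)
    # Always accept improvements; accept uphill moves with probability
    # e^(-delta/temp), which shrinks as the temperature cools.
    if delta <= 0 or random.random() < math.exp(-delta / temp):
        state = neighbor
        if cost(state) < best_cost:
            best, best_cost = state[:], cost(state)
    temp *= 0.999                      # geometric cooling schedule

print(best, best_cost)                 # usually finds a subset summing exactly to 20
```

The occasional acceptance of uphill moves at high temperature is what lets the search escape the local optima that plain hill climbing gets stuck in.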

Implementations: Several of the Collected Algorithms of the ACM are Fortran codes for unconstrained optimization, most notably Algorithm 566 [MGH81], Algorithm 702 [SF92], and Algorithm 734 [Buc94]. Algorithm 744 [Rab95] does unconstrained optimization in Lisp. They are available from Netlib. Also check out the selection at GAMS, the NIST Guide to Available Mathematical Software, at http://gams.nist.gov.

NEOS (Network-Enabled Optimization System) provides a unique service, the opportunity to solve your problem on computers and software at Argonne National Laboratory, over the WWW.     Linear programming and unconstrained optimization are both supported. This is worth checking out at http://www.mcs.anl.gov/home/otc/Server/ when you need a solution instead of a program.

General purpose simulated annealing implementations are available and probably are the best place to start experimenting with this technique for constrained optimization.     Particularly popular is Adaptive Simulated Annealing (ASA), written in C and retrievable via anonymous ftp from ftp.alumni.caltech.edu [131.215.139.234] in the /pub/ingber directory.   To get on the ASA mailing list send e-mail to asa-request@alumni.caltech.edu.

Genocop, by Zbigniew Michalewicz [Mic92], is a genetic algorithm-based program for constrained and unconstrained optimization, written in C. I tend to be quite skeptical of genetic algorithms, but many people find them irresistible. Genocop is available from ftp://ftp.uncc.edu/coe/evol/ for noncommercial purposes.

Notes: Steepest-descent methods for unconstrained optimization are discussed in most books on numerical methods, including [PFTV86, BT92]. Unconstrained optimization is the topic of several books, including [Bre73, Fle80].

Simulated annealing was devised by Kirkpatrick et al. [KGV83] as a modern variation of the Metropolis algorithm [MRRT53]. Both use Monte Carlo techniques to compute the minimum energy state of a system. Good expositions on simulated annealing include [AK89].

Genetic algorithms were developed and popularized by Holland [Hol75, Hol92].   Expositions on genetic algorithms include [Gol89, Koz92, Mic92]. Tabu search [Glo89a, Glo89b, Glo90]   is yet another heuristic search procedure with a devoted following.

Related Problems: Linear programming, satisfiability.

Algorithms
Mon Jun 2 23:33:50 EDT 1997