Nonlinear Programming: One-Variable Unconstrained Optimization

We now begin discussing how to solve some of the types of problems just described by considering the simplest case: unconstrained optimization with just a single variable x (n = 1), where the differentiable function f(x) to be maximized is concave. Thus, the necessary and sufficient condition for a particular solution x = x* to be optimal (a global maximum) is

$$\frac{df(x)}{dx} = 0 \qquad \text{at } x = x^*,$$
as depicted in Fig. 13.12. If this equation can be solved directly for x*, you are done. However, if f(x) is not a particularly simple function, so that the derivative is not just a linear or quadratic function, you may not be able to solve the equation analytically. If not, a number of search procedures are available for solving the problem numerically.

The approach with any of these search procedures is to find a sequence of trial solutions that leads toward an optimal solution. At each iteration, you begin at the current trial solution and conduct a systematic search that culminates in identifying a new improved trial solution. The procedure is continued until the trial solutions have converged to an optimal solution, assuming that one exists.

We now will describe two common search procedures. The first one (the bisection method) was chosen because it is such an intuitive and straightforward procedure. The second one (Newton’s method) is included because it plays a fundamental role in nonlinear programming in general.

The Bisection Method

This search procedure can always be applied when f(x) is concave (so that the second derivative is negative or zero for all x), as depicted in Fig. 13.12. It can also be used for certain other functions. In particular, if x* denotes the optimal solution, all that is needed8 is that

$$\frac{df(x)}{dx} > 0 \quad \text{if } x < x^*, \qquad \frac{df(x)}{dx} = 0 \quad \text{if } x = x^*, \qquad \frac{df(x)}{dx} < 0 \quad \text{if } x > x^*.$$

These conditions automatically hold when f(x) is concave, but they also can hold when the second derivative is positive for some (but not all) values of x.
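For example (an illustration chosen here, not from the original text), $f(x) = e^{-x^2}$ satisfies all three conditions with $x^* = 0$, even though its second derivative $(4x^2 - 2)e^{-x^2}$ is positive whenever $|x| > 1/\sqrt{2}$, so the bisection method applies although f(x) is not concave.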

The idea behind the bisection method is a very intuitive one, namely, that whether the slope (derivative) is positive or negative at a trial solution definitely indicates whether improvement lies immediately to the right or left, respectively. Thus, if the derivative evaluated at a particular value of x is positive, then x* must be larger than this x (see Fig. 13.12), so this x becomes a lower bound on the trial solutions that need to be considered thereafter. Conversely, if the derivative is negative, then x* must be smaller than this x, so x would become an upper bound. Therefore, after both types of bounds have been identified, each new trial solution selected between the current bounds provides a new tighter bound of one type, thereby narrowing the search further. As long as a reasonable rule is used to select each trial solution in this way, the resulting sequence of trial solutions must converge to x*. In practice, this means continuing the sequence until the distance between the bounds is sufficiently small that the next trial solution must be within a prespecified error tolerance of x*.

This entire process is summarized next, given the notation

$$\begin{aligned}
x' &= \text{current trial solution},\\
\underline{x} &= \text{current lower bound on } x^*,\\
\bar{x} &= \text{current upper bound on } x^*,\\
\varepsilon &= \text{error tolerance for } x^*.
\end{aligned}$$

Although there are several reasonable rules for selecting each new trial solution, the one used in the bisection method is the midpoint rule (traditionally called the Bolzano search plan), which says simply to select the midpoint between the two current bounds.

8Another possibility is that the graph of f(x) is flat at the top so that x is optimal over some interval [a, b]. In this case, the procedure still will converge to one of these optimal solutions as long as the derivative is positive for x < a and negative for x > b.

Summary of the Bisection Method
Initialization: Select an error tolerance ε. Find an initial lower bound $\underline{x}$ and upper bound $\bar{x}$ by inspection (or by respectively finding any value of x at which the derivative is positive and then negative). Select an initial trial solution

$$x' = \frac{\underline{x} + \bar{x}}{2}.$$

Iteration:

1. Evaluate df(x)/dx at x = x′.
2. If df(x)/dx ≥ 0, reset $\underline{x}$ = x′.
3. If df(x)/dx ≤ 0, reset $\bar{x}$ = x′.
4. Select a new

$$x' = \frac{\underline{x} + \bar{x}}{2}.$$

Stopping rule: If $\bar{x} - \underline{x} \le 2\varepsilon$, so that the new x′ must be within ε of x*, stop; x′ is the final trial solution. Otherwise, perform another iteration.
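As a concrete rendering of this summary, here is a minimal Python sketch. The derivative function df, the initial bounds, and the tolerance eps must be supplied by the user; the usage lines show hypothetical inputs (the sample derivative corresponds to f(x) = 12x − 3x⁴ − 2x⁶, assumed here only for illustration).

```python
def bisection(df, lower, upper, eps):
    """Maximize a concave f(x) by bisection, given its derivative df.

    Requires df(lower) > 0 and df(upper) < 0, so that the
    maximizer x* lies between the two bounds.
    """
    # Stopping rule: once upper - lower <= 2*eps, the midpoint
    # is guaranteed to be within eps of x*.
    while upper - lower > 2 * eps:
        x = (lower + upper) / 2        # midpoint (Bolzano) rule
        if df(x) >= 0:                 # x* lies to the right of x,
            lower = x                  # so x becomes the new lower bound
        else:                          # x* lies to the left of x,
            upper = x                  # so x becomes the new upper bound
    return (lower + upper) / 2         # final trial solution x'


# Hypothetical usage with the assumed example f(x) = 12x - 3x^4 - 2x^6:
df = lambda x: 12 - 12 * x**3 - 12 * x**5   # f'(x)
print(bisection(df, 0.0, 2.0, 0.01))        # about 0.836
```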

Your IOR Tutorial includes an interactive procedure for executing the bisection method.

Newton’s Method

Although the bisection method is an intuitive and straightforward procedure, it has the disadvantage of converging relatively slowly toward an optimal solution. Each iteration decreases the difference between the bounds by only one-half. Therefore, even with the fairly simple function being considered in Table 13.1, seven iterations were required to reduce the error tolerance for x* to less than 0.01. Another seven iterations would be needed to reduce this error tolerance to less than 0.0001.
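To make this rate concrete (a quick check based on the stopping rule $\bar{x} - \underline{x} \le 2\varepsilon$, with the initial interval length being an assumed value): after k iterations, an initial interval of length $\bar{x} - \underline{x}$ has been reduced to $(\bar{x} - \underline{x})/2^k$, so the method stops once

$$\frac{\bar{x} - \underline{x}}{2^k} \le 2\varepsilon, \qquad \text{that is,} \qquad k \ge \log_2 \frac{\bar{x} - \underline{x}}{2\varepsilon}.$$

With an initial interval of length 2, for instance, ε = 0.01 gives k = 7 and ε = 0.0001 gives k = 14, consistent with the iteration counts just cited.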

The basic reason for this slow convergence is that the only information about f(x) being used is the value of the first derivative f′(x) at the respective trial values of x. Additional helpful information can be obtained by considering the second derivative f″(x) as well. This is what Newton's method9 does.

9This method is due to the great 17th-century mathematician and physicist, Sir Isaac Newton. While a young student at the University of Cambridge (England), Newton took advantage of the university being closed for two years (due to the bubonic plague that devastated Europe in 1664–65) to discover the law of universal gravitation and invent calculus (among other achievements). His development of calculus led to this method.

The basic idea behind Newton's method is to approximate f(x) within the neighborhood of the current trial solution by a quadratic function and then to maximize (or minimize) the approximate function exactly to obtain the new trial solution to start the next iteration. (This idea of working with a quadratic approximation of the objective function has since been made a key feature of many algorithms for more general kinds of nonlinear programming problems.) This approximating quadratic function is obtained by truncating the Taylor series after the second-derivative term. In particular, by letting xi+1 be the trial solution generated at iteration i to start iteration i + 1 (so x1 is the initial trial solution provided by the user to begin iteration 1), the truncated Taylor series for xi+1 is

$$f(x_{i+1}) \approx f(x_i) + f'(x_i)(x_{i+1} - x_i) + \frac{f''(x_i)}{2}(x_{i+1} - x_i)^2.$$

Having fixed xi at the beginning of iteration i, note that f(xi), f′(xi), and f″(xi) also are fixed constants in this approximating function on the right. Thus, this approximating function is just a quadratic function of xi+1. Furthermore, this quadratic function is such a good approximation of f(xi+1) in the neighborhood of xi that their values and their first and second derivatives are exactly the same when xi+1 = xi.

This quadratic function now can be maximized in the usual way by setting its first derivative to zero and solving for xi+1. (Remember that we are assuming that f(x) is concave, which implies that this quadratic function is concave, so the solution obtained by setting the first derivative to zero will be a global maximum.) This first derivative is

$$f'(x_i) + f''(x_i)(x_{i+1} - x_i),$$

so setting it equal to zero and solving for $x_{i+1}$ yields

$$x_{i+1} = x_i - \frac{f'(x_i)}{f''(x_i)}.$$

This is the key formula that is used at each iteration i to calculate the next trial solution xi+1 after obtaining the trial solution xi to begin iteration i and then calculating the first and second derivatives at xi. (The same formula is used when minimizing a convex function.)
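As a quick illustration (an example chosen here, not from the original text), suppose f(x) = 4x − x², which is concave with maximizer x* = 2. Then f′(x) = 4 − 2x and f″(x) = −2, so from any initial trial solution x1,

$$x_2 = x_1 - \frac{4 - 2x_1}{-2} = x_1 + (2 - x_1) = 2 = x^*,$$

and Newton's method terminates in a single iteration: when f(x) itself is quadratic, the quadratic approximation is exact.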

Iterations generating new trial solutions in this way would continue until these solutions have essentially converged. One criterion for convergence is that |xi+1 − xi| has become sufficiently small. Another is that f′(x) is sufficiently close to zero. Still another is that |f(xi+1) − f(xi)| is sufficiently small. Choosing the first criterion, define ε as the value such that the algorithm is stopped when |xi+1 − xi| < ε.

Here is a complete description of the algorithm.

Summary of Newton’s Method

Initialization: Select an error tolerance ε. Find an initial trial solution x1 by inspection. Set i = 1.

Iteration i:

1. Calculate f′(xi) and f″(xi). [Calculating f(xi) is optional.]

2. Set

$$x_{i+1} = x_i - \frac{f'(x_i)}{f''(x_i)}.$$

Stopping rule: If |xi+1 − xi| < ε, stop; xi+1 is essentially the optimal solution. Otherwise, reset i = i + 1 and perform another iteration.
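A minimal Python sketch of this summary follows; the derivative functions, starting point, and tolerance in the usage lines are assumed values for illustration (the derivatives shown correspond to the example function f(x) = 12x − 3x⁴ − 2x⁶ discussed next).

```python
def newton(df, d2f, x, eps, max_iter=100):
    """Maximize a concave f(x) by Newton's method.

    df and d2f are the first and second derivatives of f;
    x is the initial trial solution x1.
    """
    for _ in range(max_iter):
        x_next = x - df(x) / d2f(x)   # key formula: maximize the quadratic approximation
        if abs(x_next - x) < eps:     # stopping rule
            return x_next             # essentially the optimal solution
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")


# Assumed example inputs: f(x) = 12x - 3x^4 - 2x^6 with x1 = 1.
df  = lambda x: 12 - 12 * x**3 - 12 * x**5   # f'(x)
d2f = lambda x: -36 * x**2 - 60 * x**4       # f''(x)
print(newton(df, d2f, x=1.0, eps=1e-5))      # about 0.83762
```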

To illustrate, consider once again the example of maximizing $f(x) = 12x - 3x^4 - 2x^6$ (the same function to which the bisection method was applied), so that $f'(x) = 12 - 12x^3 - 12x^5$ and $f''(x) = -36x^2 - 60x^4$.

After selecting ε = 0.00001 and choosing x1 = 1 as the initial trial solution, Table 13.2 shows the results from applying Newton's method to this example. After just four iterations, this method has converged to x = 0.83762 as the optimal solution with a very high degree of precision.

A comparison of this table with Table 13.1 illustrates how much more rapidly Newton’s method converges than the bisection method. Nearly 20 iterations would be required for the bisection method to converge with the same degree of precision that Newton’s method achieved after only four iterations.

Although this rapid convergence is fairly typical of Newton’s method, its performance does vary from problem to problem. Since the method is based on using a quadratic approximation of f(x), its performance is affected by the degree of accuracy of the approximation.
