METAHEURISTICS:GENETIC ALGORITHMS

GENETIC ALGORITHMS

Genetic algorithms provide a third type of metaheuristic that is quite different from the first two. This type tends to be particularly effective at exploring various parts of the feasible region and gradually evolving toward the best feasible solutions.

After introducing the basic concepts for this type of metaheuristic, we will apply a basic genetic algorithm to the same nonlinear programming example just considered above with the additional constraint that the variable is restricted to integer values. We then will apply this approach to the same traveling salesman problem example considered in each of the preceding sections.

Basic Concepts

Just as simulated annealing is based on an analogy to a natural phenomenon (the physical annealing process), genetic algorithms are greatly influenced by another form of a natural phenomenon. In this case, the analogy is to the biological theory of evolution formulated by Charles Darwin in the mid-19th century. Each species of plants and animals has great individual variation. Darwin observed that those individuals with variations that impart a survival advantage through improved adaptation to the environment are most likely to survive to the next generation. This phenomenon has since been referred to as survival of the fittest.

The modern field of genetics provides a further explanation of this process of evo- lution and the natural selection involved in the survival of the fittest. In any species that

An Application Vignette

Intel Corporation is the world’s largest semiconductor chip maker. With well over 80,000 employees and annual revenues over $53 billion, it has over 5000 prod- ucts serving a wide variety of markets.

With so many products, one key to the continuing success of the company is an effective system for con- tinually updating the design and scheduling of its prod- uct line. It can maximize its revenues only by introducing products into markets with the right fea- tures, at the right price, and at the right time. Therefore, a major operations research study was undertaken to optimize how this is done. The resulting model incorpo- rated market requirements and financials, design- engineering capabilities, manufacturing costs, and multiple-time dynamics. This model then was embedded in a decision support system that soon was used by hun- dreds of Intel employees representing most major Intel groups and many distinct job functions.

The algorithmic heart of this decision support system is a genetic algorithm that handles resource constraints, scheduling, and financial optimization. This algorithm uses a fitness function to evaluate candidate solutions and then performs the usual genetic operators of mutation and crossover. It also calls on a combination of heuristic methods and mathematical optimization techniques to optimize product composition. This algo- rithm and its associated database enabled a new busi- ness process that is shifting Intel divisions to a unified focus on global profit maximization.

This dramatic application of operations research revolving around a genetic algorithm led to OR profes- sionals from Intel winning the prestigious 2011 Daniel H. Wagner Prize for Excellence in Operations Research Practice.

Source: Rash, E., and K. Kempf, “Product Line Design and Scheduling at Intel,” Interfaces, 42(5): 425–436, September– October 2012. (A link to this article is provided on our website, www.mhhe.com/hillier.)

reproduces by sexual reproduction, each offspring inherits some of the chromosomes from each of the two parents, where the genes within the chromosomes determine the indi- vidual features of the child. A child who happens to inherit the better features of the par- ents is slightly more likely to survive into adulthood and then become a parent who passes on some of these features to the next generation. The population tends to improve slowly over time by this process. A second factor that contributes to this process is a random, low-level mutation rate in the DNA of the chromosomes. Thus, a mutation occasionally occurs that changes the features of a chromosome that a child inherits from a parent. Although most mutations have no effect or are disadvantageous, some mutations provide desirable improvements. Children with desirable mutations are slightly more likely to survive and contribute to the future gene pool of the species.

These ideas transfer over to dealing with optimization problems in a rather natural way. Feasible solutions for a particular problem correspond to members of a particular species, where the fitness of each member now is measured by the value of the objective function. Rather than processing a single trial solution at a time (as with basic forms of tabu search and simulated annealing), we now work with an entire population of trial solutions.1 For each iteration (generation) of a genetic algorithm, the current population consists of the set of trial solutions currently under consideration. These trial solutions are thought of as the currently living members of the species. Some of the youngest members of the population (including especially the fittest members) survive into adulthood and become parents (paired at random) who then have children (new trial solutions) who share some of the features (genes) of both parents. Since the fittest members of the population are more likely to be- come parents than others, a genetic algorithm tends to generate improving populations of trial solutions as it proceeds. Mutations occasionally occur so that certain children also can acquire features (sometimes desirable features) that are not possessed by either parent. This helps a genetic algorithm to explore a new, perhaps better part of the feasible region than previously considered. Eventually, survival of the fittest should tend to lead a genetic algorithm to a trial solution (the best of any considered) that is at least nearly optimal.

1One of the intensification strategies of tabu search also maintains a population of best solutions. The population is used to create linking paths between its members and to relaunch the search along these paths.

Although the analogy of the process of biological evolution defines the core of any genetic algorithm, it is not necessary to adhere rigidly to this analogy in every detail. For example, some genetic algorithms (including the one outlined below) allow the same trial solution to be a parent repeatedly over multiple generations (iterations). Thus, the anal- ogy needs to be only a starting point for defining the details of the algorithm to best fit the problem under consideration.

Here is a rather typical outline of a genetic algorithm that we will employ for the two examples.

Outline of a Basic Genetic Algorithm

Initialization. Start with an initial population of feasible trial solutions, perhaps by generating them randomly. Evaluate the fitness (the value of the objective function) for each member of this current population.

Iteration. Use a random process that is biased toward the more fit members of the cur- rent population to select some of the members (an even number) to become parents. Pair up the parents randomly and then have each pair of parents give birth to two children (new feasible trial solutions) whose features (genes) are a random mixture of the features of the par- ents, except for occasional mutations. (Whenever the random mixture of features and any mutations result in an infeasible solution, this is a miscarriage, so the process of attempting to give birth then is repeated until a child is born that corresponds to a feasible solution.) Retain the children and enough of the best members of the current population to form the new population of the same size for the next iteration. (Discard the other members of the current population.) Evaluate the fitness for each new member (the children) in the new population.

Stopping rule. Use some stopping rule, such as a fixed number of iterations, a fixed amount of CPU time, or a fixed number of consecutive iterations without any improvement in the best trial solution found so far. Use the best trial solution found on any iteration as the final solution.

Before this algorithm can be implemented the following questions need to be answered:

1. What should the population size be?

2. How should the members of the current population be selected to become parents?

3. How should the features of the children be derived from the features of the parents?

4. How should mutations be injected into the features of the children?

5. Which stopping rule should be used?

The answers to these questions depend greatly on the structure of the specific problem being addressed. The metaheuristics area in the IOR Tutorial does include two versions of the algorithm. One is for very small integer nonlinear programming problems like the example considered next. The other is for small traveling salesman problems. Both versions answer some of the questions in the same way, as described below:

1. Population size: Ten. (This size is reasonable for the small problems for which this soft- ware is designed, but much larger populations commonly are used for large problems.)

2. Selection of parents: From among the five most fit members of the population (ac- cording to the value of the objective function), select four randomly to become par- ents. From among the five least fit members, select two randomly to become parents. Pair up the six parents randomly to form three couples.

3. Passage of features (genes) from parents to children: This process is highly problem dependent and so differs for the two versions of the algorithm in the software, as described later for the two examples.

4. Mutation rate: The probability that an inherited feature of a child mutates into an opposite feature is set at 0.1 in the software. (Much smaller mutation rates commonly are used for large problems.)

5. Stopping rule: Stop after five consecutive iterations without any improvement in the best trial solution found so far.

Now we are ready to apply the algorithm to the two examples.

The Integer Version of the Nonlinear Programming Example

We return again to the small nonlinear programming problem that was introduced in Sec. 14.1 (see Fig. 14.1) and then addressed using a simulated annealing algorithm at the end of the pre- ceding section. However, we now add the additional constraint that the problem’s single vari- able x must have an integer value. Because the problem already has the constraint that 0 x31, this means that the problem has 32 feasible solutions, x = 0, 1, 2, . . . , 31. (Having such bounds is very important for a genetic algorithm, since it reduces the search space to the relevant region.) Thus, we now are dealing with an integer nonlinear programming problem.

When applying a genetic algorithm, strings of binary digits often are used to represent the solutions of the problem. Such an encoding of the solutions is a particularly convenient one for the various steps of a genetic algorithm, including the process of parents giving birth to children. This encoding is easy to do for our particular problem because we simply can write each value of x in base 2. Since 31 is the maximum feasible value of x, only five binary digits are required to write any feasible value. We always will include all five binary digits even when the leading digit or digits are zeroes. Thus, for example,

INTRODUCTION TO OPERATIONS RESEARCH-0259

This particular method of generating the children from the parents is known as uni- form crossover. It is perhaps the most intuitive of the various alternative methods that have been proposed.

We now need to consider the possibility of mutations that would affect the genetic makeup of the children.

Since the probability of a mutation in any gene (flipping the binary digit to the opposite value) has been set at 0.1 for our algorithm, we can let the random numbers 0.0000–0.0999 correspond to a mutation, 0.1000–0.9999 correspond to no mutation.

For example, suppose that in the next 10 random numbers generated, only the eighth one is less than 0.1000. This indicates that no mutation occurs in the first child, but the third gene (digit) in the second child flips its value. Therefore, the final conclusion is that the two children are

C1: 01011 and

C2: 00110.

Returning to base 10, the two parents correspond to the solutions, x = 3 and x = 10, whereas their children would have been (barring mutations) x = 11 and x = 2. However, because of the mutation, the children become x = 11 and x = 6.

For this particular example, any integer value of x such that 0 x 31 (in base 10) is a feasible solution, so every 5-digit number in base 2 also is a feasible solution. Therefore, the above process of creating children never results in a miscarriage (an infeasible solution). However, if the upper bound on x were, say, x 25 instead, then miscarriages would oc- cur occasionally. Whenever a miscarriage occurs, the solution is discarded and the entire process of creating a child is repeated until a feasible solution is obtained.

This example includes only a single variable. For a nonlinear programming problem with multiple variables, each member of the population again would use base 2 to show the value of each variable. The above process of generating children from parents then would be done in the same way one variable at a time.

Table 14.7 shows the application of the complete algorithm to this example through both the initialization step (part a of the table) and iteration 1 (part b of the table). In the initialization step, each of the members of the initial population were generated by gen- erating five random numbers and using the correspondence between a random number and a binary digit given earlier to obtain the five binary digits in turn. The corresponding value of x in base 10 then is plugged into the objective function given at the beginning of Sec. 14.1 to evaluate the fitness of that member of the population.

The five members of the initial population that have the highest degree of fitness (in order) are members 10, 8, 4, 1, and 7. To randomly select four of these members to become parents, a random number is used to select one member to be rejected, where 0.0000– 0.1999 corresponds to ejecting the first member listed (member 10), 0.2000–0.3999 corresponds to rejecting the second member, and so forth. In this case, the random number was 0.9665, so the fifth member listed (member 7) does not become a parent.

From among the five less fit members of the initial population (members 2, 1, 6, 5, and 9), random numbers now are used to select which two of these members will become parents. In this case, the random numbers were 0.5634 and 0.1270. For the first random number, 0.0000–0.1999 corresponds to selecting the first member listed (member 2), 0.2000–0.3999 corresponds to selecting the second member, and so forth, so the third member listed (member 6) is the one selected in this case. Since only four members (2, 1, 5, and 9) now remain for selecting the last parent, the corresponding intervals for the second random number are 0.0000–0.2499, 0.2500–0.4999, 0.5000–0.7499, and

INTRODUCTION TO OPERATIONS RESEARCH-0261

The next step is to pair up the six parents—members 10, 8, 4, 1, 6, and 2. Let us begin by using a random number to determine the mate of the first member listed (member 10). The random number 0.8204 indicated that it should be paired up with the fifth of the other five parents listed (member 2). To pair up the next member listed (member 8), the next ran- dom number was 0.0198, which is in the interval 0.0000–0.3333, so the first of the three remaining parents listed (member 4) is chosen to be the mate of member 8. This then leaves the two remaining parents (members 1 and 6) to become the last couple.

Part (b) of Table 14.7 shows the children that were reproduced by these parents by using the process illustrated earlier in this subsection. Note that mutations occurred in the third gene of the second child and the fourth gene of the fourth child. By and large, the six children have a relatively high degree of fitness. In fact, for each pair of parents, both of the children turned out to be more fit than one of the parents. This does not always occur but is fairly common. In the case of the second pair of parents, both of the children happen to be more fit than both parents. Fortuitously, both of these children (x = 19 and x = 20) actually are su- perior to any of the members of the preceding population given in part (a) of the table. To form the new population for the next iteration, all six children are retained along with the four most fit members of the preceding population (members 10, 8, 4, and 1).

Subsequent iterations would proceed in a similar fashion. Since we know from the discussion in Sec. 14.1 (see Fig. 14.1) that x = 20 (the best trial solution generated in iteration 1) actually is the optimal solution for this example, subsequent iterations would not provide any further improvement. Therefore, the stopping rule would terminate the algorithm after five more iterations and provide x = 20 as the final solution.

Your IOR Tutorial includes a procedure for applying this same genetic algorithm to other very small integer nonlinear programming problems. (The form and size restrictions are the same as specified in Sec. 14.3 for nonlinear programming problems.)

You might find it interesting to apply this procedure in IOR Tutorial to this same ex- ample. Because of the randomness inherent in the algorithm, different intermediate results are obtained each time that it is applied. (Problem 14.4-3 asks you to apply the algorithm to this example several times.)

Although this was a discrete example, genetic algorithms can also be applied to continuous problems such as a nonlinear programming problem without an integer constraint. In this case, the value of a continuous variable would be represented (or closely approximated) by a decimal number in base 2. For example, x = 235 is 10111.10100 in base 2, and x = 23.66 is closely approximated by 10111.10101 in base 2. All the binary digits on both sides of the decimal point can be treated just as before to have parents reproduce children, and so forth.

The Traveling Salesman Problem Example

Sections 14.2 and 14.3 illustrated how a tabu search algorithm and a simulated annealing algorithm would be applied to the particular traveling salesman problem introduced in Sec. 14.1 (see Fig. 14.4). Now let us see how our genetic algorithm can be applied using this same example.

Rather than using binary digits in this case, we will continue to represent each so- lution (tour) in the natural way as a sequence of cities visited. For example, the first solution considered in Sec. 14.1 is the tour of the cities in the following order: 1-2-3-4-5-6- 7-1, where city 1 is the home base where the tour must begin and end. We should point out, however, that genetic algorithms for traveling salesman problems frequently use other meth- ods for encoding solutions. In general, clever methods of representing solutions (often by using strings of binary digits) can make it easier to generate children, create mutations, main- tain feasibility, and so forth, in a natural way. The development of an appropriate encoding scheme is a key part of developing an effective genetic algorithm for any application.

A complication with this particular example is that, in a sense, it is too easy. Because of the rather limited number of links between pairs of cities in Fig. 14.4, this problem barely has 10 distinct feasible solutions if we rule out a tour that is simply a previously considered tour in the reverse direction. Therefore, it is not possible to have an initial population with 10 distinct trial solutions such that the resulting six parents then reproduce distinct children that also are distinct from the members of the initial population (including the parents).

Fortunately, a genetic algorithm can still operate reasonably well when there is a modest amount of duplication in the trial solutions in a population or in two consecutive populations. For example, even when both parents in a couple are identical, it still is possible for their children to differ from the parents because of mutations.

The genetic algorithm for traveling salesman problems in your IOR Tutorial does not do anything to avoid duplication in the trial solutions considered. Each of the 10 trial solutions in the initial population is generated in turn as follows. Starting from the home base city, random numbers are used to select the next city from among those that have a link to the home base city (cities 2, 3, and 7 in Fig. 14.4). Random numbers then are used to select the third city from among the remaining cities that have a link to the second city. This process is continued until either every city is included once in the tour (plus a return to the home base city from the last city) or a dead end is reached because there is no link from the current city to any of the remaining cities that still need to be visited. In the latter case, the entire process for generating a trial solution is restarted from the beginning with new random numbers.

Random numbers are also used to reproduce children from a pair of parents. To il- lustrate this process, consider the following pair of parents:

INTRODUCTION TO OPERATIONS RESEARCH-0262INTRODUCTION TO OPERATIONS RESEARCH-0263

Ignoring the possibility of mutations for the time being, here is the main idea for how to generate a child.

Inheriting Links: Genes correspond to the links in a tour. Therefore, each of the links (genes) inherited by a child should come from one parent or the other (or both). (One other possibility described later is that a parent also can pass down a sub-tour reversal.) These links being inherited are randomly selected one at a time until a complete tour (the child) has been generated.

To start this process with the above parents, since a tour must begin in city 1, a child’s initial link must come from one of the parent’s links that connect city 1 to another city. For parent P1, these are links 1-2 and 1-7. (Link 1-7 qualifies since it is equivalent to take the tour in either direction.) For parent P2, the corresponding links are 1-2 (again) and 1-3. The fact that both parents have link 1-2 doubles the probability that it will be inherited by a child. Therefore, when using a random number to determine which link the child will inherit, the interval 0.0000–0.4999 (or any interval of this size) corresponds to inheriting link 1-2 whereas the intervals 0.50000–0.7499 and 0.7500–0.9999 then would correspond to the choice of link 1-7 and link 1-3, respectively. Suppose 1-2 is selected, as shown in the first row of Table 14.8. After 1-2, one parent next uses link 2-3 whereas the other uses 2-4. Therefore, in generating the child, a random choice should be made between these two options. Suppose 2-4 is selected. (See the second row of Table 14.8.) There now are three options for the link to follow 1-2-4 because the first parent uses two links (4-3 and 4-5) to connect city 4 in its tour and the second parent uses link 4-6 (link 4-2 is ignored because city 2 already is in the child’s tour). When randomly selecting one of these op- tions, suppose 4-3 is chosen to form 1-2-4-3 as the beginning of the child’s tour thus far, as shown in the third row of Table 14.8.

We now come to an additional feature of this process for generating a child’s tour, namely, using a sub-tour reversal from a parent.

Inheriting a Sub-Tour Reversal: One other possibility for a link inherited by a child is a link that is needed to complete a sub-tour reversal that the child’s tour is making in a portion of a parent’s tour.

To illustrate how this possibility can arise, note that the next city beyond 1-2-4-3 needs to be one of the cities not yet visited (city 5, 6, or 7), but the first parent does not have a link from city 3 to any of these other cities. The reason is that the child is using a sub- tour reversal (reversing 3-4) of this parent’s tour, 1-2-3-4-5-6-7-1. Completing this sub- tour reversal requires adding the link 3-5, so this becomes one of the options for the next

link in the child’s tour. The other option is link 3-7 provided by the second parent (link 3-1 is not an option because city 1 must come at the very end of the tour). One of these two op- tions is selected randomly. Suppose the choice is link 3-5, which provides 1-2-4-3-5 as the child’s tour thus far, as shown in the fourth row of Table 14.8.

To continue this tour, the options for the next link are 5-6 (provided by both parents) and 5-7 (provided by the second parent). Suppose that the random choice among 5-6, 5-6, and 5-7 is 5-6, so that the tour thus far is 1-2-4-3-5-6. (See the fifth row of Table 14.8.) Since the only city not yet visited is city 7, link 6-7 is automatically added next, followed by link 7-1 to return to home base. Thus, as shown in the last row of Table 14.8, the com- plete tour for the child is

C1: 1-2-4-3-5-6-7-1

Figure 14.5 in Sec. 14.1 displays how closely this child resembles the first parent, since the only difference is the sub-tour reversal obtained by reversing 3-4 in the parent.

If link 5-7 had been chosen instead to follow 1-2-4-3-5, the tour would have been com- pleted automatically as 1-2-4-3-5-7-6-1. However, there is no link 6-1 (see Fig. 14.4), so a dead end is reached at city 6. When this happens, a miscarriage occurs and the entire process needs to be restarted from the beginning with new random numbers until a child with a complete tour is obtained. Then this process is repeated to obtain the second child.

We now need to add one more feature—the possibility of mutations—to complete the description of the process of generating children.

Mutations of Inherited Links: Whenever a particular link normally would be inherited from a parent of a child, there is a small possibility that a mutation will occur that will reject that link and instead randomly select one of the other links from the current city to another city not already on the tour, regardless of whether that link is used by either parent.

Our genetic algorithm for traveling salesman problems implemented in your IOR Tutor- ial uses a probability of 0.1 that a mutation will occur each time the next link in the child’s tour needs to be selected. Thus, whenever the corresponding random number is less than 0.1000, the choice of the link made in the normal manner described above is rejected (if any other possible choice exists). Instead, all the other links from the current city to a city not already in the tour (including links not provided by either parent) are identified, and one of these links is randomly selected to be the next link in the tour. For example, sup- pose that a mutation occurs when generating the very first link for the child. Even though 1-2 had been the random choice as the first link, this link now would be rejected because of the mutation. Since city 1 also has links to cities 3 and 7 (see Fig. 14.4), either link 1-3 or link 1-7 would be randomly selected to be the first tour. (Since the parents end their tours by using one or the other of these links, this can be viewed in this case as start- ing the child’s tour by reversing the direction of one of the parents’ tours.)

We now can outline the general procedure for generating a child from a pair of parents.

Procedure for Generating a Child

1. Initialization: To start, designate the home base city as the current city.

2. Options for the next link: Identify all the links from the current city to another city not already in the child’s tour that are used by either parent in either direction. Also, add any link that is needed to complete a sub-tour reversal that the child’s tour is mak- ing in a portion of a parent’s tour.

3. Selection of the next link: Use a random number to randomly select one of the options identified in step 2.

4. Check for a mutation: If the next random number is less than 0.1000, a mutation occurs and the link selected in step 3 is rejected (unless there is no other link from the current

city to another city not already in the tour). If the link is rejected, identify all the other links from the current city to another city not already in the tour (including links not used by either parent). Use a random number to randomly select one of these other links.

5. Continuation: Add the link selected in step 3 (if no mutation occurs) or in step 4 (if a mutation occurs) to the end of the child’s current incomplete tour and redesignate the city at the end of this link as the current city. If there still remains more than one city not included on the tour (plus the return to the home base city), return to steps 2–4 to select the next link. Otherwise, go to step 6.

6. Completion: With only one city remaining that has not yet been added to the child’s tour, add the link from the current city to this remaining city. Then add the link from this last city back to the home base city to complete the tour for the child. However, if the needed link does not exist, a miscarriage occurs and the procedure must restart again from step 1.

This procedure is applied for each pair of parents to obtain each of their two children.

The genetic algorithm for traveling salesman problems in your IOR Tutorial incorpo- rates this procedure for generating children as part of the overall algorithm outlined near the beginning of this section. Table 14.9 shows the results from applying this algorithm to the example through the initialization step and the first iteration of the overall algorithm. Be- cause of the randomness built into the algorithm, its intermediate results (and perhaps the final best solution as well) will vary each time the algorithm is run to its completion. (To explore this further, Prob. 14.4-7 asks you to use your IOR Tutorial to apply the complete algorithm to this example several times.)

The fact that the example has only a relatively small number of distinct feasible so- lutions is reflected in the results shown in Table 14.9. Members 1, 4, 6, and 10 are iden- tical, as are members 2, 7, and 9 (except that member 2 takes its tour in the reverse direction). Therefore, the random generation of the 10 members of the initial population resulted in only five distinct feasible solutions. Similarly, four of the six children gener- ated (members 12, 14, 15, and 16) are identical to one of its parents (except that member 14 takes its tour in the opposite direction of its first parent). Two of the children (members

INTRODUCTION TO OPERATIONS RESEARCH-0264

12 and 15) have a better fitness (shorter distance) than one of its parents, but neither improved upon both of its parents. None of these children provide an optimal solution (which has a distance of 63). This illustrates the fact that a genetic algorithm may require many generations (iterations) on some problems before the survival-of-the-fittest phenomenon results in clearly superior populations.

The Solved Examples section of the book’s website provides another example of applying this genetic algorithm to a traveling salesman problem. This problem has a somewhat larger number of distinct feasible solutions than the above example, so there is a greater diversity in its initial population, the resulting parents, and their children.

Genetic algorithms are well suited for dealing with the traveling salesman problem and good progress has been made on developing considerably more sophisticated versions than the one described above. In fact, at the time of this writing, a particularly powerful version that has successfully obtained high-quality solutions for problems with up to 200,000 cities (!) has just been announced.2

Comments

Popular posts from this blog

DUALITY THEORY:THE ESSENCE OF DUALITY THEORY

NETWORK OPTIMIZATION MODELS:THE MINIMUM SPANNING TREE PROBLEM

INTEGER PROGRAMMING:THE BRANCH-AND-CUT APPROACH TO SOLVING BIP PROBLEMS