Genetic Algorithms - Sam HsiungContribution concerning Genetic Algorithms (this is just a test sample) [hsiung@cs.uchicago.edu] submitted on August 29, 1998 at 22:39:10:
Create a Random Initial State An initial population is created from a random selection of solutions (which are analagous to chromosomes). This is unlike the situation for Symbolic AI systems, where the initial state in a problem is already given instead.
Evaluate Fitness A value for fitness is assigned to each solution (chromosome) depending on how close it actually is to solving the problem (thus arriving to the answer of the desired problem). (These "solutions" are not to be confused with "answers" to the problem, think of them as possible characteristics that the system would employ in order to reach the answer.)
Reproduce (& Children Mutate) Those chromosomes with a higher fitness value are more likely to reproduce offspring (which can mutate after reproduction). The offspring is a product of the father and mother, whose composition consists of a combination of genes from them (this process is known as "crossing over".
Next Generation If the new generation contains a solution that produces an output that is close enough or equal to the desired answer then the problem has been solved. If this is not the case, then the new generation will go through the same process as their parents did. This will continue until a solution is reached.
Let us consider a diophantine (only integer solutions) equation: a+2b+3c+4d=30,
where a,b,c,d are positive integers.
Using a genetic algorithm, all that is needed is a little time to reach a solution (a,b,c,d). Of course
you could ask, why not just use a brute force method (plug in every possible value for a,b,c,d given
the constraints 1 =< a,b,c,d =< 30)? The architecture of GA systems allow for a solution to be reached quicker
since "better" solutions have a better chance of surviving and procreating, as opposed to randomly throwing
out solutions and seeing which ones work.
Let's start from the beginning. First we will choose 5 random initial solution sets, with constraints
1 =< a,b,c,d =< 30. (Note that we can choose smaller constraints for b,c,d, but for the sake of simplicity
we shall use 30)
To calculate the fitness values, plug each solution set into the expression a+2b+3c+4d. Then, calculate the absolute value of the difference of each expression with 30, this will be our fitness value.
Since values that are lower are closer to the desired answer (30), these values are more desirable. In this case, higher fitness values are not desirable, while lower ones are. In order to create a system where chromosomes with more desirable fitness values are more likely to be chosen as parents, we must first calculate the percentages that each chromosome has of being picked. One solution is to take the sum of the multiplicative inverses of the fitness values (0.135266), and calculate the percentages from there. (Note: ALL simulations were created using a random number generator)
In order to pick our 5 pairs of parents (each of which will have 1 offspring, and thus 5 new solutions sets total), imagine that we had a 10000 sided die, and on 880 of those sides, chromosome 1 was labeled, and on 3080 of those sides, chromosome 2 was labeled, and on 2640 of those sides, chromosome 3 was labeled, and on 556 of those sides, chromosome 4 was labeled, and on 2640 of those sides, chromosome 5 was labeled. To choose our first pair we roll the die twice, and take those chromosomes to be our first two parents. Continuing in this fashion we have the following parents:
The offspring of each of these parents contains the genetic information of both father and mother. How this can be determined is very arbitrary. However for this case, we could use something called a "cross-over". Let us say a mother has the solution set a1,b1,c1,d1, and a father has the solution set a2,b2,c2,d2, then there can be six possible cross overs (| = dividing line):
There are many other ways in which parents can trade genetic information to create an offspring, crossing over is just one way. Where the dividing line would be located is completely arbitrary, and so is whether or not the father or mother will contribute to the left or right of the dividing line. Now let's apply this to our offspring:
Now we can calculate the fitness values for the new generation of offsprings.
The average fitness value for the offspring chromosomes were 38.8, while the average fitness value for the parent chromosomes were 59.4. Of course, the next generation (the offspring) are supposed to mutate, that is, for example we can change one of the values in the ordered quadruple of each chromosome to some random integer between 1 and 30. Progressing at this rate, one chromosome should eventually reach a fitness level of 0 eventually, that is when a solution is found. If you tried and simulated this yourself, depending on the mutations you may actually get a fitness average that is higher, but on the long-run, the fitness levels will decrease. For systems where the population is larger (say 50, instead of 5), the fitness levels should more steadily and stabily approach the desired level (0).
|