Generation 5: Artificial Intelligence Repository - Genetic Algorithms: Diophantine Equation Solver

Genetic Algorithms: Diophantine Equation Solver

This is a C++ program that solves a diophantine equation using genetic algorithms. This is the first program is a new set of programs popping up on Generation5 - case studies. Therefore this page is a huge break down of the code and what it does, how it does it, and how to use it. Without further ado, here is the download link:

diophantine.zip (3.12k).

This code is the accompanying code to the genetic algorithms example, so please read that if you don't know anything about GAs before attempting to look at this code. Trust me, Sam wrote the essay, and I learnt and coded the program straight from what I learnt from that essay - its great! Now for the code:

CDiophantine

Firstly the class header (note for formatting reasons, a lot of the documentation is taken out):
```
#include <stdlib.h>
#include <time.h>

#define MAXPOP 25

struct gene {
int alleles[4];
int fitness;
float likelihood;

// Test for equality.
operator==(gene gn) {
for (int i=0;i<4;i++) {
if (gn.alleles[i] != alleles[i]) return false;
}

return true;
}
};

class CDiophantine {
public:
CDiophantine(int, int, int, int, int);
int Solve();

// Returns a given gene.
gene GetGene(int i) { return population[i];}

protected:
int ca,cb,cc,cd;
int result;
gene population[MAXPOP];

int Fitness(gene &);
void GenerateLikelihoods();
float MultInv();inverse.
int CreateFitnesses();
void CreateNewPopulation();
int GetIndex(float val);

gene Breed(int p1, int p2);

};
```
Firstly you notice that there are two structures, the gene structure and the actual CDiophantine class. The gene structure is used to keep track of the different solution sets. The population generated is a population of genes. The gene structure keeps track of its own fitness and likelihood values itself. I also coded a small function to test for equality, this just made some other code a lot more concise. Now onto the functions.

Fitness function
The fitness functions calculate the fitness of each gene. In our case the fitness function is the difference between the calculated value of the gene and the result we want. This class uses two functions, one that calculates all the fitnesses and another smaller one (you should probably make the function inline) to calculate it per gene.

```
int CDiophantine::Fitness(gene &gn) {
int total = ca * gn.alleles[0] + cb * gn.alleles[1]
+ cc * gn.alleles[2] + cd * gn.alleles[3];

return gn.fitness = abs(total - result);
}

int CDiophantine::CreateFitnesses() {
float avgfit = 0;
int fitness = 0;
for(int i=0;i<MAXPOP;i++) {
fitness = Fitness(population[i]);
avgfit += fitness;
if (fitness == 0) {
return i;
}
}

return 0;
}
```
Note that if the fitness is zero, then a solution has been found - so return the index. After we calculate the fitnesses, we then have to calculate the likelihoods of the gene's being chosen.

Likelihood functions
Just like Sam explained, we calculate the likelihood by first calculating the sum of the multiplicative inverses, then by dividing the inverse of the fitnesses by that value. When the likelihoods are calculated though, the likelihoods are cumulative - this makes calculating the parents easy. For example, in his example, Sam wrote the likelihoods like this:

 Chromosome Likelihood 1 (1/84)/0.135266 = 8.80% 2 (1/24)/0.135266 = 30.8% 3 (1/26)/0.135266 = 28.4% 4 (1/133)/0.135266 = 5.56% 5 (1/28)/0.135266 = 26.4%

In my program though, with the same initial values, the likelihoods would be cumulatively added - imagine it like the percentage of a pie chart. The first gene is from 0 to 8.80%, the next goes to 39.6% (because it starts from 8.8). The likelihood table would look like this:

 Chromosome Likelihood (smi = 0.135266) 1 (1/84)/smi = 8.80% 2 (1/24)/smi = 39.6% (30.6+8.8) 3 (1/26)/smi = 68% (28.4+39.6) 4 (1/133)/smi = 73.56% (5.56+68) 5 (1/28)/smi = 99.96% (26.4+73.56)

The last value will always be 100 (in the above example, it isn't due to rounding). Now with the theory of it, let's look at the code. The code is very simple - note though that a type cast to float is required on the fitness value so that integer division doesn't occur. There are two functions, one to calculate the smi, and another to generate all the likelihoods.

```
float CDiophantine::MultInv() {
float sum = 0;

for(int i=0;i<MAXPOP;i++) {
sum += 1/((float)population[i].fitness);
}

return sum;
}

void CDiophantine::GenerateLikelihoods() {
float multinv = MultInv();

float last = 0;
for(int i=0;i<MAXPOP;i++) {
population[i].likelihood = last
= last + ((1/((float)population[i].fitness) / multinv) * 100);
}
}
```
Ok, so now that we have the fitness and likelihoods values all set up, it's time to do some breeding!

Breeding Functions
The breeding functions are composed of three functions, one to get the gene index corresponding to a randomly generated number between 0 and 100, a function to actually calculate the crossover of two genes, and a main function to create the new population. We'll take the functions one at a time, seeing how they call each other. Here is the main breeding function:

```
void CDiophantine::CreateNewPopulation() {
gene temppop[MAXPOP];

for(int i=0;i<MAXPOP;i++) {
int parent1 = 0, parent2 = 0, iterations = 0;
while(parent1 == parent2 || population[parent1]
== population[parent2]) {
parent1 = GetIndex((float)(rand() % 101));
parent2 = GetIndex((float)(rand() % 101));
if (++iterations > (MAXPOP * MAXPOP)) break;
}

temppop[i] = Breed(parent1, parent2);  // Create a child.
}

for(i=0;i<MAXPOP;i++) population[i] = temppop[i];
}
```
So, we firstly create a temporary population of genes. Then we loop through all the genes. Now, when choosing genes we don't want the genes to be the same (no point mating with yourself :), and we don't need the genes to be the same either (that is where the operator== from the gene structure comes in handy). In choosing a parent, we generate a random number, then call the GetIndex function. GetIndex uses the idea of the cumulative likelihoods and merely iterates through the genes until it find the gene that contains that number:
```
int CDiophantine::GetIndex(float val) {
float last = 0;
for(int i=0;i<MAXPOP;i++) {
if (last <= val && val <= population[i].likelihood) return i;
else last = population[i].likelihood;
}

return 4;
}
```
Returning to the CreateNewPopulation() function, you can also see that if the number of iterations exceeds MAXPOP squared, it will take any parents. After parents are chosen, we breed them, by passing the indices up to the Breed function. The Breed function returns a gene, which is put in the temporary population. Here's the code:
```
gene CDiophantine::Breed(int p1, int p2) {
int crossover = rand() % 3+1;
int first = rand() % 100;

gene child = population[p1];

int initial = 0, final = 3;
if (first < 50) initial = crossover;
else final = crossover+1;

for(int i=initial;i<final;i++) {
child.alleles[i] = population[p2].alleles[i];
if (rand() % 101 < 5) child.alleles[i] = rand() % (result + 1);
}

return child;
}
```
Firstly we determine the crossover point. Now remember, we don't want the crossover to be the first or the last, because that entails copying over all or none of the second parent - pointless. We then create a random number that will determine when the first parents takes the initial crossover or not. The rest is self-explanatory - you can see that I've added a tiny mutation factor to the breeding. There's a 5% chance that a new number will occur.

And finally...
Now we can look at the Solve() function. It merely calls the above functions iteratively. Note that we test whether the function managed to find a result on the initial population - this is unlikely, but I thought I'd put it in.

```
int CDiophantine::Solve() {
int fitness = -1;

// Generate initial population.
srand((unsigned)time(NULL));

for(int i=0;i<MAXPOP;i++) {
for (int j=0;j<4;j++) {
population[i].alleles[j] = rand() % (result + 1);
}
}

if (fitness = CreateFitnesses()) {
return fitness;
}

int iterations = 0;
while (fitness != 0 || iterations < 50) {
GenerateLikelihoods();
CreateNewPopulation();
if (fitness = CreateFitnesses()) {
return fitness;
}

iterations++;
}

return -1;
}
```
So there you have it - the full explanation. An example of how to use the class can be found at the bottom of the example essay. I'm sure this code is 100% ANSI C++ - if it is not, and you have trouble compiling it, please contact me.
• Interview with Steve Grand - Lead Programmer of 'Creatures.'
• Interview with Al Biles - GenJam programmer.
• An Introduction to Genetic Algorithms.
• Genetic Algorithm Example: Diophantine Equation.
• Genetic Algorithm Example: Diophantine Equation (CODE).
• How Do Genetic Algorithms Work?
• Introduction To Genetic Algorithms - Book review.