Generation 5: Artificial Intelligence Repository - Back-Propagation: CBPNet

#### Back-Propagation: CBPNet

I was always very much intimidated by back-propagation (BP), so it took me a long time to get it in my head. Not only did many books skim over the topic, the examples (if any) were often obfuscated and complicated. Also, I found very little example code to work from. I found a good book that explained the concept well, but had no code. So, I wrote some of my own!

I use BP to train a small 3-layer network (two inputs, two hidden neurons and one output neuron) to learn the XOR binary function. The class itself can take any floating point numbers, but I cannot guarantee success due to the limited architecture of the net. The class is surprisingly small:

```
class CBPNet {
public:
    CBPNet();                    // Constructor.
    ~CBPNet() {}                 // Destructor.

    // These two functions train or run the net. Both return
    // the result that the net gives. The final variable that
    // Train() takes is the expected value.
    float Train(float, float, float);
    float Run(float, float);

private:
    float m_fWeights[3][3];      // Weights for the 3 neurons.
    float Sigmoid(float);        // The sigmoid function.
};
```
Obviously, the most important function here is Train() (Run() is simply a cut-down version of Train()), so let's break it down.

Firstly, we want to calculate the weighted sum, net, for each of the hidden layer neurons. Remember that the first input is a bias of 1, multiplied by its weight, which is then added to the two inputs passed to the function, each multiplied by its own weight.

```
net1 = 1 * m_fWeights[0][0] + i1 * m_fWeights[1][0] +
       i2 * m_fWeights[2][0];
net2 = 1 * m_fWeights[0][1] + i1 * m_fWeights[1][1] +
       i2 * m_fWeights[2][1];
```
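Two quick notes: First, on the first pass of training the weights will be completely random (see the constructor code). Second, I am quite a visual thinker, so I program 'visually'. I found it easier to index the array as m_fWeights[input][neuron]: the second index picks the neuron (0 and 1 are the hidden neurons, 2 is the output neuron) and the first index picks which of its three inputs the weight belongs to, with 0 being the bias.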
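To make those two notes concrete, here is a minimal sketch of how the weights might be filled and how one neuron's net value falls out of the indexing. The constructor is not shown in this article, so the helper names (RandomWeights, NetInput), the [-1, 1] range and the choice of generator are all my assumptions for illustration, not the class's actual code.

```cpp
#include <array>
#include <cassert>
#include <random>

// Hypothetical helpers; the real constructor is not reproduced in the article.
using Weights = std::array<std::array<float, 3>, 3>;  // [input][neuron], input 0 = bias

// Fill every weight with a uniform random value between -1 and 1.
// Both the range and the generator are assumptions.
Weights RandomWeights(unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
    Weights w{};
    for (auto& row : w)
        for (float& weight : row)
            weight = dist(rng);
    return w;
}

// Weighted sum feeding one neuron: the neuron index selects the column,
// and the three rows hold its bias, first-input and second-input weights.
float NetInput(const Weights& w, int neuron, float in1, float in2) {
    assert(neuron >= 0 && neuron < 3);
    return 1.0f * w[0][neuron] + in1 * w[1][neuron] + in2 * w[2][neuron];
}
```

Calling NetInput(w, 0, i1, i2) and NetInput(w, 1, i1, i2) gives the same sums as net1 and net2 above.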

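Now that we have the net values, we have to run them through a squashing function that keeps each neuron's output between 0 and 1. A hard-limiter would do that with an abrupt step, but BP needs a smooth curve, so I've created a simple function to return the Sigmoid value of a number. The function simply looks like: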

```
float CBPNet::Sigmoid(float num) {
    return (float)(1/(1+exp(-num)));
}
```
So, the output layer takes the sigmoid values of the two hidden neurons (i3 and i4 in the code) as its inputs. It then calculates its own net and sigmoid values, and that final value (out) is the result that the net outputs. Hopefully, after the net has been trained several thousand times, it will be close to 0 or 1 (for our training set). It will only get close to those values if we adjust the weights, and this is what the function does next. Here d is the expected value passed to Train().
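Since Run() is essentially just this forward pass, it may help to see the whole thing in one place. The following is a rough sketch only: the free functions and the ForwardResult struct are names I made up for illustration, standing in for the class members.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Illustrative stand-ins for the class members; all names are assumptions.
using Weights = std::array<std::array<float, 3>, 3>;  // [input][neuron], input 0 = bias

float Sigmoid(float num) { return 1.0f / (1.0f + std::exp(-num)); }

// Intermediate values that the training step needs later on.
struct ForwardResult {
    float i3;   // output of hidden neuron 0
    float i4;   // output of hidden neuron 1
    float out;  // output of the network
};

// Squashed weighted sum for one neuron (the bias input is fixed at 1).
float Activate(const Weights& w, int neuron, float in1, float in2) {
    assert(neuron >= 0 && neuron < 3);
    return Sigmoid(w[0][neuron] + in1 * w[1][neuron] + in2 * w[2][neuron]);
}

ForwardResult Forward(const Weights& w, float i1, float i2) {
    ForwardResult r{};
    r.i3 = Activate(w, 0, i1, i2);       // hidden layer sees the raw inputs
    r.i4 = Activate(w, 1, i1, i2);
    r.out = Activate(w, 2, r.i3, r.i4);  // output neuron sees the hidden outputs
    return r;
}
```

With every weight at zero, each neuron's net is 0 and its sigmoid is 0.5, which is a handy sanity check before any training happens.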

```
// We have to calculate the deltas for the two layers.
// Remember, we have to calculate the errors backwards
// from the output layer to the hidden layer (thus the
// name 'BACK-propagation').
float deltas[3];

deltas[2] = out*(1-out)*(d-out);
deltas[1] = i4*(1-i4)*(m_fWeights[2][2])*(deltas[2]);
deltas[0] = i3*(1-i3)*(m_fWeights[1][2])*(deltas[2]);
```
Notice that I put the values into an array; this is simply to make the job of adjusting the weights easier. As I loop through the 3 neurons, each neuron's delta sits at the neuron's own index, e.g. the output neuron (index 2) uses deltas[2] as its delta value. If the delta formulas don't look familiar to you, go over the back propagation essay.
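If you would rather convince yourself numerically, one option is a finite-difference check. The sketch below assumes the error for one pattern is half the squared difference, E = 0.5*(d-out)^2, which is the usual choice that produces these delta formulas; under that assumption, delta times the weight's input should match the downhill slope -dE/dw. All names here are hypothetical, and I use double rather than float only so the estimate is precise enough to compare.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical names throughout; double replaces the article's float
// purely for the precision of the finite-difference estimate.
using Weights = std::array<std::array<double, 3>, 3>;  // [input][neuron], input 0 = bias

double Sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

struct Pass { double i3, i4, out; };

Pass Forward(const Weights& w, double i1, double i2) {
    Pass p{};
    p.i3 = Sigmoid(w[0][0] + i1 * w[1][0] + i2 * w[2][0]);
    p.i4 = Sigmoid(w[0][1] + i1 * w[1][1] + i2 * w[2][1]);
    p.out = Sigmoid(w[0][2] + p.i3 * w[1][2] + p.i4 * w[2][2]);
    return p;
}

// The article's delta formulas, indexed by neuron as in the text.
std::array<double, 3> Deltas(const Weights& w, const Pass& p, double d) {
    std::array<double, 3> deltas{};
    deltas[2] = p.out * (1 - p.out) * (d - p.out);
    deltas[1] = p.i4 * (1 - p.i4) * w[2][2] * deltas[2];
    deltas[0] = p.i3 * (1 - p.i3) * w[1][2] * deltas[2];
    return deltas;
}

// Assumed error measure: half the squared difference for one pattern.
double Error(const Weights& w, double i1, double i2, double d) {
    const double diff = d - Forward(w, i1, i2).out;
    return 0.5 * diff * diff;
}

// Central-difference estimate of -dE/dw for the weight w[input][neuron].
// The weights are taken by value so the caller's copy is left untouched.
double NumericDownhill(Weights w, int input, int neuron,
                       double i1, double i2, double d) {
    assert(input >= 0 && input < 3 && neuron >= 0 && neuron < 3);
    const double eps = 1e-5;
    const double saved = w[input][neuron];
    w[input][neuron] = saved + eps;
    const double up = Error(w, i1, i2, d);
    w[input][neuron] = saved - eps;
    const double down = Error(w, i1, i2, d);
    return -(up - down) / (2 * eps);
}
```

For example, deltas[2] * p.i4 should agree closely with NumericDownhill(w, 2, 2, i1, i2, d), because i4 is the input that m_fWeights[2][2] multiplies.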

Now, we have to adjust the weights accordingly:

```
// Now, alter the weights accordingly.
float v1 = i1, v2 = i2;
for(int i=0;i<3;i++) {
    // Change the values for the output layer, if necessary.
    if (i == 2) {
        v1 = i3;
        v2 = i4;
    }

    m_fWeights[0][i] += BP_LEARNING*1*deltas[i];
    m_fWeights[1][i] += BP_LEARNING*v1*deltas[i];
    m_fWeights[2][i] += BP_LEARNING*v2*deltas[i];
}
```
BP_LEARNING is the learning coefficient and is defined at the top of the file. Feel free to adjust it and see how it affects the learning.

So, to use the class, simply call Train() with the different values you want to train. Note that the class does not decide when to stop training; that has to be handled outside the class. After training has been completed, you can call the Run() function with two values to see what the network outputs. I supplied a very simple main() function that trains the network with the entire XOR data set 2500 times, then outputs what the network has learnt. Notice how the network hasn't learnt it exactly, but with rounding it is sufficient for most purposes. Here is main():

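```
#include <iostream>
// The CBPNet declaration and BPM_ITER must also be visible here.

int main() {
    CBPNet bp;

    // Present the whole XOR data set BPM_ITER times.
    for (int i=0;i<BPM_ITER;i++) {
        bp.Train(0,0,0);
        bp.Train(0,1,1);
        bp.Train(1,0,1);
        bp.Train(1,1,0);
    }

    std::cout << "0,0 = " << bp.Run(0,0) << std::endl;
    std::cout << "0,1 = " << bp.Run(0,1) << std::endl;
    std::cout << "1,0 = " << bp.Run(1,0) << std::endl;
    std::cout << "1,1 = " << bp.Run(1,1) << std::endl;
    return 0;
}
```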
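Because the stopping decision lives outside the class, a fixed iteration count is only one option. Here is a rough sketch of another: keep training until every output is within some tolerance of its target, with a cap on the number of passes. TinyNet is a compact stand-in I wrote so the example compiles on its own; its constructor arguments, the weight range, TrainUntil() and ToBit() are all assumptions rather than part of CBPNet.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>

// Compact stand-in for CBPNet; the learning rate and seed parameters are assumptions.
class TinyNet {
public:
    TinyNet(float learningRate, unsigned seed) : m_fRate(learningRate) {
        assert(learningRate > 0.0f);
        std::mt19937 rng(seed);
        std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
        for (auto& row : m_fWeights)
            for (float& w : row) w = dist(rng);
    }

    float Run(float i1, float i2) const {
        float i3, i4;
        return Forward(i1, i2, i3, i4);
    }

    // Same steps as the article: forward pass, deltas, weight update.
    float Train(float i1, float i2, float d) {
        float i3, i4;
        const float out = Forward(i1, i2, i3, i4);
        float deltas[3];
        deltas[2] = out * (1 - out) * (d - out);
        deltas[1] = i4 * (1 - i4) * m_fWeights[2][2] * deltas[2];
        deltas[0] = i3 * (1 - i3) * m_fWeights[1][2] * deltas[2];
        for (int n = 0; n < 3; n++) {
            const float v1 = (n == 2) ? i3 : i1;
            const float v2 = (n == 2) ? i4 : i2;
            m_fWeights[0][n] += m_fRate * deltas[n];
            m_fWeights[1][n] += m_fRate * v1 * deltas[n];
            m_fWeights[2][n] += m_fRate * v2 * deltas[n];
        }
        return out;  // the output computed before this update
    }

private:
    float Forward(float i1, float i2, float& i3, float& i4) const {
        i3 = Sigmoid(m_fWeights[0][0] + i1 * m_fWeights[1][0] + i2 * m_fWeights[2][0]);
        i4 = Sigmoid(m_fWeights[0][1] + i1 * m_fWeights[1][1] + i2 * m_fWeights[2][1]);
        return Sigmoid(m_fWeights[0][2] + i3 * m_fWeights[1][2] + i4 * m_fWeights[2][2]);
    }
    static float Sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

    float m_fRate;
    float m_fWeights[3][3];
};

struct TrainResult {
    int epochs;      // passes over the four patterns actually made
    bool converged;  // true if the tolerance was reached before the cap
};

// Repeats the XOR patterns until every output is within `tolerance`
// of its target, or until `maxEpochs` passes have been made.
TrainResult TrainUntil(TinyNet& net, float tolerance, int maxEpochs) {
    static const float kXor[4][3] = {{0, 0, 0}, {0, 1, 1}, {1, 0, 1}, {1, 1, 0}};
    for (int epoch = 1; epoch <= maxEpochs; epoch++) {
        float worst = 0.0f;
        for (const auto& p : kXor) {
            const float out = net.Train(p[0], p[1], p[2]);
            worst = std::max(worst, std::fabs(p[2] - out));
        }
        if (worst < tolerance) return TrainResult{epoch, true};
    }
    return TrainResult{maxEpochs, false};
}

// Rounds a sigmoid output to the nearest binary answer.
int ToBit(float out) { return out >= 0.5f ? 1 : 0; }
```

A caller might write TinyNet net(0.5f, 1u); followed by TrainUntil(net, 0.1f, 10000); and then print ToBit(net.Run(0, 1)). Always check the converged flag: with only two hidden neurons, some starting weights may never reach the tolerance, in which case trying a different seed or learning rate is the simplest remedy.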