Practical Backpropagation

Article from www.avaye.com


WHAT IS BACKPROPAGATION?

Backpropagation is simply a way to determine the error values in hidden layers. This needs to be done in order to update the weights.

The best example of where backpropagation can be used is the (classic) XOR problem. I'll explain it here for people who are not familiar with it.

If you are reading this tutorial, you should already have read some other, basic material on neural networks (NNs). It should have given you the simplest example: a binary network with 1 neuron. When such a network is used, its output can be presented in a simple graph.

  AND              XOR
  x1  x2  y        x1  x2  y
  1   1   1        1   1   0
  1   0   0        1   0   1
  0   1   0        0   1   1
  0   0   0        0   0   0

In such a graph, a single straight line divides the input points: on one side of the line the output of the neuron should be positive, on the other side negative. From the graph, you can make a simple table of inputs and outputs, as shown above.

Training a network to operate as an AND switch can be done easily through only one neuron.
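
To make the single-neuron case concrete, here is a minimal sketch of a binary neuron that behaves as an AND switch. The weights (1.0 and 1.0) and the threshold (1.5) are hand-picked assumptions for illustration, not trained values and not taken from this article. The separating line is then x1 + x2 = 1.5, and only the input (1, 1) lies on its positive side.

```pascal
program AndNeuron;
{$APPTYPE CONSOLE}

{ A single binary neuron: it outputs 1 when the weighted sum of its
  inputs exceeds the threshold, and 0 otherwise.
  W1, W2 and Threshold are hand-picked assumptions for this sketch. }
function Fire(X1, X2: Integer): Integer;
const
  W1 = 1.0;
  W2 = 1.0;
  Threshold = 1.5;
begin
  if (W1 * X1 + W2 * X2) > Threshold then
    Fire := 1
  else
    Fire := 0;
end;

var
  X1, X2: Integer;
begin
  { Walk through the four rows of the AND table }
  for X1 := 1 downto 0 do
    for X2 := 1 downto 0 do
      WriteLn(X1, ' ', X2, ' -> ', Fire(X1, X2));
end.
```

No choice of W1, W2 and Threshold makes this same neuron reproduce the XOR table, which is why more neurons are needed there.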

But what if you want to train an XOR? You can't draw a single straight line that separates the positive outputs of the XOR table from the negative ones. Therefore, an XOR problem can't be solved using only one neuron. You'll need 3 neurons, fully connected in a feedforward network: two hidden neurons, a and b, that both feed one output neuron.

So, you say: 'Ok, I get it, but for a network so simple I don't need backpropagation!' That is true, but remember, this is just an example. In this XOR example, backpropagation is used to update the weights of neurons a and b. Now, on to the maths!



SOME MATHS

First, we need to determine whether we should update a and b's weights at all. That is done by specifying a so-called target. A target is simply the output you desire. When you specify a target, you are training your network in a supervised way.

Note: there are other ways to train an XOR, but I use the XOR only as an example for this tutorial.

In this tutorial I use fully-connected sigmoid neurons. The activation function I use is:

  1 / (1 + exp(-net))
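
As a small sketch, the activation function can be written as a Delphi function. The name Sigmoid and the sample inputs are assumptions made for this example. The output always lies strictly between 0 and 1, and it is exactly 0.5 when net is 0.

```pascal
program SigmoidDemo;
{$APPTYPE CONSOLE}

{ The activation from the text: 1 / (1 + exp(-net)).
  The function name is an assumption for this sketch. }
function Sigmoid(Net: Double): Double;
begin
  Sigmoid := 1.0 / (1.0 + Exp(-Net));
end;

begin
  WriteLn(Sigmoid(-4.0):0:4);  { strongly negative net: close to 0 }
  WriteLn(Sigmoid(0.0):0:4);   { net = 0: exactly 0.5 }
  WriteLn(Sigmoid(4.0):0:4);   { strongly positive net: close to 1 }
end.
```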

When the network has been iterated (the inputs have been fed forward to produce an output), the output has to be compared to the target. If the output is not within an acceptable range of the target, backpropagation should be used. To update the weights, we must first calculate every unit's error. Unfortunately, the error formulas differ for the output and hidden units. For every output unit, the error is the following:

  outuniterror := (target - output) * output * (1 - output)
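
The factor output * (1 - output) is the derivative of the sigmoid, so the error is largest when the unit is undecided (output near 0.5) and shrinks as the output approaches 0 or 1. Here is a short worked sketch of the formula; the function name and the sample numbers are made up for illustration.

```pascal
program OutputErrorDemo;
{$APPTYPE CONSOLE}

{ Error of a sigmoid output unit, following the formula in the text.
  Actual is the unit's output; the names are assumptions for this sketch. }
function OutputUnitError(Target, Actual: Double): Double;
begin
  OutputUnitError := (Target - Actual) * Actual * (1.0 - Actual);
end;

begin
  { Output too low for target 1: (1 - 0.6) * 0.6 * 0.4 = 0.096 }
  WriteLn(OutputUnitError(1.0, 0.6):0:4);
  { Output too high for target 0: (0 - 0.6) * 0.6 * 0.4 = -0.144 }
  WriteLn(OutputUnitError(0.0, 0.6):0:4);
end.
```

The sign of the error tells you in which direction the output has to move.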

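Then, we must calculate the error for the hidden units. For every hidden neuron, take the derivative of its sigmoid output and multiply it by the output unit's error, passed back through the weight that connects the hidden neuron to the output neuron. This assumes that weight number CurrentNeuron of the output neuron belongs to the connection coming from hidden neuron CurrentNeuron:

  For CurrentNeuron := 1 to NumberOfNeurons - 1 do
    Begin
      { Derivative of the sigmoid at this neuron's output }
      HiddenNeuronsError[CurrentNeuron] :=
        Neurons[CurrentNeuron].Output *
        (1 - Neurons[CurrentNeuron].Output);

      { Scale by the output error, weighted by the connection to the
        output neuron. Inc only works on ordinal types, so the
        floating-point value is multiplied explicitly. }
      HiddenNeuronsError[CurrentNeuron] :=
        HiddenNeuronsError[CurrentNeuron] *
        outuniterror *
        Neurons[NumberOfNeurons].Weights[CurrentNeuron];
    End;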

As you can see, I have kept this tutorial programmer-oriented. The above code is in Delphi and shouldn't be hard to understand.

In the above example, Neurons is an array of records, which contains the data of all the neurons; the output unit is stored last, at index NumberOfNeurons. Since the output unit error is calculated in a different way, CurrentNeuron only grows to NumberOfNeurons minus 1.
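
For a network with a single output unit, the standard backpropagation rule for a hidden unit's error is: the sigmoid derivative of the hidden output, times the output unit's error, times the weight that connects the two. Here is a minimal worked sketch of that rule for one hidden neuron. The function name, parameter names and numbers are all assumptions made for this example.

```pascal
program HiddenErrorDemo;
{$APPTYPE CONSOLE}

{ Standard backpropagation error for one hidden sigmoid unit that feeds
  a single output unit. All names and numbers are illustrative. }
function HiddenUnitError(HiddenOutput, OutUnitError,
  WeightToOutput: Double): Double;
begin
  HiddenUnitError := HiddenOutput * (1.0 - HiddenOutput) *
                     OutUnitError * WeightToOutput;
end;

begin
  { Hidden output 0.5, output-unit error 0.096, connecting weight 2.0:
    0.5 * 0.5 * 0.096 * 2.0 = 0.048 }
  WriteLn(HiddenUnitError(0.5, 0.096, 2.0):0:4);
end.
```

With more than one output unit, the last two factors would become a sum over all output units.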

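After the errors are calculated, it is time to update the weights. Here, we encounter something new: the learning rate. The learning rate tells the network how fast it should learn. Don't use too large a number, because then the weights overshoot and you'll get faulty results. I usually use 0.25 and it works fine.

Every weight changes by the learning rate, times the error of the unit it belongs to, times the input that arrives on that weight. For the output unit, the input on weight CurrentWeight is the output of hidden neuron CurrentWeight:

    For CurrentWeight := 1 to NumberOfNeurons - 1 do
      Neurons[NumberOfNeurons].Weights[CurrentWeight] :=
        Neurons[NumberOfNeurons].Weights[CurrentWeight] +
        LearningRate * outuniterror * Neurons[CurrentWeight].Output;

Note that the weights are not all updated with the same delta weight, because each weight sees a different input. Inc is not used here, since it only works on ordinal types.

After this, the new weights for the hidden units should be calculated. Here, the input on each weight is one of the network's inputs. Inputs is assumed to be an array that holds the current input pattern:

  For CurrentNeuron := 1 to NumberOfNeurons - 1 do
    For CurrentWeight := 1 to NumberOfWeights do
      Neurons[CurrentNeuron].Weights[CurrentWeight] :=
        Neurons[CurrentNeuron].Weights[CurrentWeight] +
        LearningRate * HiddenNeuronsError[CurrentNeuron] *
        Inputs[CurrentWeight];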

When all weights are updated, the network should be iterated again. If the output is still not within the acceptable range, use backpropagation again, and so on, until it is.
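
Putting the steps together, here is a minimal sketch of a complete XOR training loop. It rests on several assumptions that are not from this article: every neuron gets an extra Bias weight (a weight on a constant input of 1), the starting weights are arbitrary small values, the loop runs for a fixed number of epochs instead of checking an acceptable range, and the weight updates follow the standard rule (learning rate times the unit's error times the input on that weight). Convergence is not guaranteed: with unlucky starting weights, a network this small can get stuck.

```pascal
program XorBackprop;
{$APPTYPE CONSOLE}

const
  LearningRate = 0.25;   { the value suggested in the text }
  Epochs = 20000;        { assumed; not from the article }
  Inputs: array[1..4, 1..2] of Double = ((1, 1), (1, 0), (0, 1), (0, 0));
  Targets: array[1..4] of Double = (0, 1, 1, 0);

type
  TNeuron = record
    Weights: array[1..2] of Double;
    Bias: Double;        { assumed extra weight on a constant input of 1 }
    Output: Double;
  end;

var
  Neurons: array[1..3] of TNeuron;   { 1 and 2: hidden a and b; 3: output }
  HiddenError: array[1..2] of Double;
  OutError: Double;
  Epoch, P, N, W: Integer;

function Sigmoid(Net: Double): Double;
begin
  Sigmoid := 1.0 / (1.0 + Exp(-Net));
end;

procedure SetNeuron(Idx: Integer; W1, W2, B: Double);
begin
  Neurons[Idx].Weights[1] := W1;
  Neurons[Idx].Weights[2] := W2;
  Neurons[Idx].Bias := B;
  Neurons[Idx].Output := 0.0;
end;

{ Iterate the network for one input pattern (forward pass) }
procedure Forward(Pattern: Integer);
var
  H: Integer;
begin
  for H := 1 to 2 do
    Neurons[H].Output := Sigmoid(
      Neurons[H].Weights[1] * Inputs[Pattern, 1] +
      Neurons[H].Weights[2] * Inputs[Pattern, 2] + Neurons[H].Bias);
  Neurons[3].Output := Sigmoid(
    Neurons[3].Weights[1] * Neurons[1].Output +
    Neurons[3].Weights[2] * Neurons[2].Output + Neurons[3].Bias);
end;

begin
  { Arbitrary, asymmetric starting weights }
  SetNeuron(1, 0.5, -0.4, 0.1);
  SetNeuron(2, -0.3, 0.6, -0.2);
  SetNeuron(3, 0.4, -0.5, 0.3);

  for Epoch := 1 to Epochs do
    for P := 1 to 4 do
    begin
      Forward(P);
      { All errors are computed first, using the weights as they are now }
      OutError := (Targets[P] - Neurons[3].Output) *
                  Neurons[3].Output * (1 - Neurons[3].Output);
      for N := 1 to 2 do
        HiddenError[N] := Neurons[N].Output * (1 - Neurons[N].Output) *
                          OutError * Neurons[3].Weights[N];
      { Update the output unit: its inputs are the hidden outputs }
      for W := 1 to 2 do
        Neurons[3].Weights[W] := Neurons[3].Weights[W] +
          LearningRate * OutError * Neurons[W].Output;
      Neurons[3].Bias := Neurons[3].Bias + LearningRate * OutError;
      { Update the hidden units: their inputs are the network inputs }
      for N := 1 to 2 do
      begin
        for W := 1 to 2 do
          Neurons[N].Weights[W] := Neurons[N].Weights[W] +
            LearningRate * HiddenError[N] * Inputs[P, W];
        Neurons[N].Bias := Neurons[N].Bias + LearningRate * HiddenError[N];
      end;
    end;

  { Show what the trained network answers for each XOR row }
  for P := 1 to 4 do
  begin
    Forward(P);
    WriteLn(Inputs[P, 1]:0:0, ' ', Inputs[P, 2]:0:0, ' -> ',
            Neurons[3].Output:0:3);
  end;
end.
```

If the printed outputs do not approach the XOR targets, try different starting weights or more epochs before changing the learning rate.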





All content copyrighted by Avaye.com