why doesnt my simple XOR gate using backpropagation work?

11 views (last 30 days)
Hello everyone, I have been trying to create a simple neural network for solving XOR problems, without any success for the couple of days now. I have tried different flavors, with biases, without biases, with biases as weights , not a single one worked! Here is my first try:
% XOR calculator using BackPropagation
clear all;clc;
%specify input samples (creating trainings set)
pattern = [1 1;
0 0;
1 0;
0 1];
[patternRows,patternCols] = size(pattern);
%creating targets or desired answers for each set
target = [0,0,1,1];
%initializing weights with random values
w1 = [-1 1 ; %neuron 1
-1 2]; %neuron two
w2 = [-2,0.5]; %outputneuron
%bias
b12 = [1 -1]; % for neuron one and two respectively
b3 = -1.5; % for neuron three (output neuron)
lr = 0.01; %learning rate
a12 = [2 1]; %layer 1 outputs (neuron one and twos output respectively)
a3 = 1; %layer 2 output (neuron 3 output)
delta = 0;
i=1;
while (i<1000 || error ~=0)
for row_number = 1: patternRows
for neuron = 1:2
%calculate the output of each neuron
a12(neuron) = logsig( pattern(row_number,:) * w1(neuron,:)' + b12(neuron) );
end
%calculate the output
a3 = hardlim(w2(:)'*a12(:)+ b3 );
%calculate the error
error = target(row_number) - a3;
%calculate the local gradient (sensivity)
%which is f'(n) * e(n) for output layer
SM = a3*error;
%calculating delta which is learningrate * localgradient*inputs
delta = lr * SM * a12;
%updating the weights of last layer
w2 = w2 + delta;
%updating the last layers bias
b3 = b3 + lr * SM;
%calculating local gradients(sensivities) for hiddent layers
%which is f(n)*SumOfNextLayers(localgradient*weights)
Sm = [a12(1)*(1-a12(1)) 0;
0 a12(2)*(1-a12(2)) ] * w2' * SM ;
%calculating delta which is the same as before
delta = (lr .* Sm)* pattern(row_number,:);
%updating the weights and biases of hidden layer
w1 = w1 + delta;
b12 = b12 + (lr * Sm)';
sss = sprintf('%d ) a3= %d # e= %d # w1=%d,%d # w2 =%d,%d \n',i,a3 ,error,w1(1),w1(2),w2(1),w2(2));
fprintf(' %s ',sss);
end
i= i+1;
end
------------------- This is another implementation (without any biases)
%XOR without any biases
% training set
ts=[0 0;
1 1;
1 0;
0 1];
d=[0 0 1 1];
% weights for hidden layers (each row represents wiegths for the corrosponding neuron
wh=[rand() rand();
rand() rand()];
% wiegths from hidden layer to output layer
wo=[rand() rand()];
% output for neuron 1 and 2 and bias for neuron3
a=[1 1];
% output for neuron 3
a3=0;
% learning rate
n=0.1;
iteration=10;
i=0;
while (i<iteration || e~=0)
for tindex=1:4
for neuron=1:2
a(neuron)= logsig(wh(neuron,:)*ts(neuron,:)');
end
a3 = hardlim(wo(1)*a(1)+wo(2)*a(2))
e = d(tindex) - a3
grad_out = a3*e;
deltaW = n*grad_out*a;
wo = wo+deltaW;
grad_h = a(1)*(1-a(1)) * grad_out*wo(1)'
deltaW = n*grad_h*wh(1,:);
wh(1,:)= wh(1,:) +deltaW
grad_h = a(2)*(1-a(2)) * grad_out*wo(2)'
deltaW = n*grad_h*wh(2,:)
wh(2,:) = wh(2,:) +deltaW
end
i=i+1;
end
---------------------------- And this is the last implementation which treats biases like weights
%XOR calculator ( biases as weights )
% training set
% (the last number is bias coefficent which is always 1
ts=[0 0 1;1 1 1;
1 0 1;0 1 1];
% targets or desired outputs
d=[0 0 1 1];
% weights for hidden layers
% bias ,weight1, weigth2
wh=[rand() rand() rand() ;
rand() rand() rand()];
% wiegths from hidden layer to output layer
% bias, weight1, weigth2
wo=[rand() rand() rand()];
%output for neuron 1 and 2 and bias for neuron3
a=[1 1];
%output for neuron 3
a3=0;
%learning rate
n=0.1;
iteration=100;
i=0;
while (i<iteration || e~=0)
for tindex=1:4
for neuron=1:2
a(neuron)= logsig(wh(neuron,:)*ts(neuron,:)');
end
output_neuronsInput = wo(1)*a(1)+wo(2)*a(2)+wo(3)*1;
a3 = hardlim(output_neuronsInput);
e = d(tindex) - a3
% neuron 3(output)
localgrad_out = a3*e;
deltaW = n * localgrad_out * output_neuronsInput;
wo = wo + deltaW;
% neuron 1
grad_h = a(1)*(1-a(1)) * localgrad_out * wo(1)'
deltaW = n * grad_h * wh(1,:);
wh(1,:) = wh(1,:) + deltaW
% neuron 2
grad_h = a(2)*(1-a(2)) * localgrad_out*wo(2)'
deltaW = n * grad_h * wh(2,:)
wh(2,:) = wh(2,:) + deltaW
end
i=i+1;
end
I would be grateful if anyone could help me. Thanks in advance

Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!