I have a problem manually programming a neural network
I need to build a shallow neural network from scratch, without using ready-made functions. The hidden-layer activation function is sigmoid, the output-layer activation function is softmax, and the loss function is least squares. The code is not classifying as intended. I believe the problem is in the backpropagation derivatives, which I calculated with the chain rule, but it could be elsewhere. The data set has 300 observations with two inputs each. Any help would be much appreciated.
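For clarity, the forward pass and loss I am computing for one observation x with one-hot label y (my own notation, matching the code below) are:
a = W_1^\top x + b_1, \qquad h = \sigma(a) = \frac{1}{1 + e^{-a}},
c = W_2^\top h + b_2, \qquad q_k = \frac{e^{c_k}}{\sum_j e^{c_j}}, \qquad L = \sum_k (q_k - y_k)^2.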
%% defining the network
inputs = 2;
layerneurons = 18;
outputneurons = 3;
stepsize = 0.001;
requirederror = 1/(10e4);
W1 = zeros(inputs, layerneurons); % will be updated each iteration
W2 = zeros(layerneurons, outputneurons); % will be updated each iteration
loss = zeros(1,300); % will be kept from each iteration
b1 = (zeros(1,layerneurons))'; b2 = [0;0;0]; % will be updated each iteration
q = zeros(300,outputneurons);
for e = 200
for i = 1: 300
dldb2 = (zeros(1,outputneurons))';
dldb1 = (zeros(1,layerneurons))';
dldw2 = zeros(layerneurons, outputneurons);
dldw1 = zeros(inputs,layerneurons);
a = zeros(1,layerneurons); % reset each iteration
h = zeros(1,layerneurons); % reset each iteration
c = zeros(1,outputneurons); % reset each iteration
etotal = 0;
for j = 1: layerneurons
%% forward pass
a(j) = W1(:,j)'* observations(i,:)'+b1(j);
h(j) = 1/(1+exp(-a(j)));
end
for k = 1:outputneurons
c(k) = W2(:,k)'*h'+b2(k);
end
for k = 1:outputneurons
q(i,k) = exp(c(k))/sum(exp(c));
end
for k = 1:outputneurons
loss(i) = loss(i) + (q(i,k) - labels(k,i))^2;
end
%% backward pass
for m = 1:outputneurons
dldb2(m) = (q(i,m)-labels(m,i))*q(i,m)*(1-q(i,m));
for j = 1: layerneurons
dldw2(j,m) = dldb2(m)*h(j);
end
end
for j = 1: layerneurons
for m = 1: outputneurons
dldb1(j) = dldb1(j)+ dldb2(m)*W2(j,m);
end
dldb1(j) = dldb1(j)*h(j)*(1-h(j));
end
for j = 1:layerneurons
w = dldb1(j)*(observations(i,:))';
dldw1(1,j) = w(1); dldw1(2,j) = w(2);
end
%% update
W1 = W1 - stepsize*dldw1;
W2 = W2 - stepsize*dldw2;
b1 = b1 -stepsize*dldb1; b2 = b2 + stepsize*dldb2;
%% check the size of the gradient
gradientsize = sumabs(dldw1) + sumabs(dldw2) + sumabs(dldb1) + sumabs(dldb2);
if gradientsize < requirederror
break;
end
end
end
Answers (1)
Krishna
on 5 Feb 2024
Hello Shimon,
I've identified a number of problems in the code you shared. Let me outline the necessary amendments (a short sketch follows below).
Initialization: W1 and W2 should be initialized with small random numbers rather than zeros, so that the neurons break symmetry and learn different features.
Epoch loop: the outer loop should iterate over e, but for e = 200 executes only once. It should be for e = 1:epochs, where epochs is the number of passes over the data set.
Loss reset: loss(i) must be set back to zero at the start of the inner loop, not only once outside it, since it is recomputed for every data point in every epoch.
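A minimal sketch of those three fixes, reusing the variable names from your code (the value of epochs is just a placeholder):
epochs = 200; % placeholder: number of passes over the data set
W1 = 0.01*randn(inputs, layerneurons); % small random values break symmetry
W2 = 0.01*randn(layerneurons, outputneurons);
for e = 1:epochs % loop over every epoch, not "for e = 200"
    for i = 1:300
        loss(i) = 0; % reset this observation's loss each epoch
        % ... forward pass, backward pass and update as in your code ...
    end
end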
There are also significant errors in your backpropagation that need to be addressed. Please go through this article to learn more about the mathematics behind the backpropagation algorithm: https://towardsdatascience.com/an-introduction-to-gradient-descent-and-backpropagation-81648bdb19b2
Please also go through this documentation to learn more about how to code neural networks from scratch.
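Here is one possible way to write the backward pass for a single observation i, using your variable names (a rough sketch, not tested on your data). With a softmax output and a sum-of-squares loss, dL/dc(k) = sum_m 2*(q(m)-y(m))*q(m)*((m==k)-q(k)); the softmax Jacobian couples all outputs, whereas your current formula keeps only the diagonal term q*(1-q) and drops the factor of 2:
y     = labels(:,i);               % 3x1 one-hot target for observation i
qi    = q(i,:)';                   % 3x1 softmax output
dLdq  = 2*(qi - y);                % derivative of the sum-of-squares loss
J     = diag(qi) - qi*qi';         % 3x3 softmax Jacobian dq/dc
dLdc  = J*dLdq;                    % 3x1 gradient w.r.t. the pre-softmax outputs c
dldb2 = dLdc;
dldw2 = h'*dLdc';                  % 18x3, since h is stored as 1x18
dLdh  = W2*dLdc;                   % 18x1
dLda  = dLdh .* (h.*(1-h))';       % 18x1, sigmoid derivative
dldb1 = dLda;
dldw1 = observations(i,:)'*dLda';  % 2x18
These lines would replace the loops that fill dldb2, dldw2, dldb1 and dldw1 in your code.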
Hope this helps.