Perceptron learns to reproduce just one pattern in batch learning

Good day, dear forum members and dear MathWorks team.
I am having a hard time implementing batch learning of a multi-layer perceptron by myself. For certain reasons (including wanting to learn how it works), I am not using the Neural Network Toolbox.
A similar problem is described in several places on the Internet, but nowhere is a clear answer given. Basically, in batch learning the net learns only to reproduce the average of the targets, even though backpropagation itself seems to work just fine.
I'd really appreciate it if someone here who knows neural nets well took a little time to look at my code. I'm completely stuck and have no idea how to proceed.
I've tried everything:
  1. Introduced bias weights
  2. Tried with and without updating the input weights
  3. Shuffled the patterns in batch learning
  4. Tried both updating after each pattern and accumulating updates over the whole batch
  5. Initialized the weights in several different ways
  6. Double-checked the code 10 times
  7. Normalized accumulated updates by the number of patterns
  8. Tried different numbers of layers and neurons
  9. Tried different activation functions
  10. Tried different learning rates
  11. Tried different numbers of epochs, from 50 to 10,000
  12. Tried to normalize the data
And nothing helps. Even for simple scalar function approximation, the output I get is essentially flat at the mean of the targets.
In one epoch of batch learning, I compute the weight updates for all patterns, accumulate them, and then apply the accumulated deltas (sketched below). I have also tried rearranging the loops in every way I could think of, and nothing helped.
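To make the scheme concrete, here is a minimal sketch of what one epoch of the accumulate-then-update procedure looks like for a single hidden layer. The architecture, variable names, target function and constants here are simplified for illustration; this is not my actual code.

% Minimal sketch: batch backprop for a 1-hidden-layer MLP approximating
% a scalar function. All names and constants are illustrative only.
rng(0);
X = linspace(-1, 1, 50);                 % 1 x P inputs
T = sin(pi * X);                         % 1 x P targets (example function)
P = numel(X);
H = 10;  eta = 0.05;                     % hidden units, learning rate
W1 = 0.5*randn(H,1);  b1 = zeros(H,1);   % input  -> hidden weights and biases
W2 = 0.5*randn(1,H);  b2 = 0;            % hidden -> output (linear unit)
for epoch = 1:5000
    dW1 = zeros(size(W1));  db1 = zeros(size(b1));
    dW2 = zeros(size(W2));  db2 = 0;
    for p = 1:P                                  % accumulate over all patterns
        a1 = tanh(W1*X(p) + b1);                 % hidden activations
        y  = W2*a1 + b2;                         % linear output
        delta2 = y - T(p);                       % output delta
        delta1 = (W2.'*delta2) .* (1 - a1.^2);   % backprop through tanh
        dW2 = dW2 + delta2*a1.';  db2 = db2 + delta2;
        dW1 = dW1 + delta1*X(p);  db1 = db1 + delta1;
    end
    W2 = W2 - eta*dW2/P;  b2 = b2 - eta*db2/P;   % one update per epoch,
    W1 = W1 - eta*dW1/P;  b1 = b1 - eta*db1/P;   % normalized by pattern count
end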
Here is my code:

Accepted Answer

Valery Saharov on 5 May 2015
Edited: Valery Saharov on 5 May 2015
So, I finally got it working, more or less. I used 18 neurons for that. Even with this many neurons, the learning sometimes converges to reproduce the target and sometimes does not (I initialize the weights at random each time). I am a bit confused that it works so unreliably; I was expecting something better.
It seems that the average over the target patterns always corresponds to a local minimum of the error function. There is a clear possibility that the output neurons get "decoupled" from the input altogether, for example when the hidden units saturate and produce nearly the same activations for every pattern. In that case the network effectively outputs a constant, and the accumulated gradient simply drives that constant towards the average of the target outputs. Quite an unpleasant phenomenon.
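To see why the target mean is such an attractive point, assume the output neuron produces (nearly) the same value c for every pattern. Then the batch sum-of-squares error depends only on c, and its gradient vanishes exactly at the mean of the targets:

\[
E(c) = \tfrac{1}{2}\sum_{p=1}^{P}\bigl(c - t_p\bigr)^2,
\qquad
\frac{dE}{dc} = \sum_{p=1}^{P}\bigl(c - t_p\bigr) = 0
\quad\Longrightarrow\quad
c = \frac{1}{P}\sum_{p=1}^{P} t_p ,
\]

so once the output is decoupled from the input, gradient descent settles on the constant equal to the target average.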
