I have a problem understanding the actual architecture of this network. Usually, LeCun et al. have used different weights for the connections from different feature maps of the previous layer (something that looks like a 3D kernel). Therefore, the number of weights in a convolution layer (assuming a full connection map) is kernelHeight*kernelWidth*numFeatMapsLayer(k)*numFeatMapsLayer(k-1). I am not sure, but it seems that only kernelHeight*kernelWidth*numFeatMapsLayer(k) distinct weights are used in this program. Does that mean the connections from different feature maps of the previous layer to a particular feature map of the next layer share the same weights? Or am I misunderstanding something?
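To make the two counts concrete, here is a minimal sketch of the arithmetic in the question. The layer sizes (5x5 kernel, 6 input maps, 16 output maps) are hypothetical, chosen only for illustration; they are not taken from this toolbox:

```python
# Hypothetical layer sizes, chosen for illustration only.
kH, kW = 5, 5               # kernel height and width
in_maps, out_maps = 6, 16   # feature maps in layer k-1 and layer k

# LeCun-style convolution with a full connection map: a separate 2D
# kernel for every (input map, output map) pair, i.e. one 3D kernel
# per output feature map.
weights_full = kH * kW * in_maps * out_maps

# The shared-kernel variant suspected in the question: a single 2D
# kernel per output map, reused across all input maps.
weights_shared = kH * kW * out_maps

print(weights_full, weights_shared)  # 2400 400
```

If the program's weight array really has only `weights_shared` entries, then the connections into each output map do share one kernel across all input maps.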
I want to use this toolbox on my own sound data set. Each spectrogram, equivalent to one image, has dimensions 13x500, whereas each image in MNIST has dimensions 28x28. I want to change the input width and height to 13 and 500 respectively. How do I go about doing so?
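Whatever the toolbox-specific changes turn out to be, nothing in the convolution math itself requires a square input, only that the kernel fits inside it. A minimal NumPy sketch (with an assumed 5x5 kernel, not a value from this toolbox) of a 'valid' convolution over a 13x500 input:

```python
import numpy as np

# Hypothetical spectrogram input: 13 frequency bins x 500 time frames.
H, W = 13, 500
kH, kW = 5, 5  # assumed kernel size; only needs kH <= H and kW <= W

x = np.random.randn(H, W)
k = np.random.randn(kH, kW)

# 'valid' correlation: the output shrinks by (kernel - 1) in each axis.
out_h, out_w = H - kH + 1, W - kW + 1
out = np.empty((out_h, out_w))
for i in range(out_h):
    for j in range(out_w):
        out[i, j] = np.sum(x[i:i + kH, j:j + kW] * k)

print(out.shape)  # (9, 496)
```

So the change is likely confined to wherever the input height and width (and any downstream layer sizes derived from them) are hard-coded, since the per-layer shapes follow mechanically as above.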
I applied it to traffic sign recognition, but it classified all inputs into the same class. Have you had any experience with this, and if so, what parameters might you suggest I change? Also, how can I train the CNN on features extracted from the images instead of the raw images themselves?
Where in the code should I make these changes?