I have a problem understanding the actual architecture of this network. Usually, LeCun et al. use different weights for the connections coming from different feature maps of the previous layer (something like a 3D kernel). With a full connection map between layers, the number of weights in a convolutional layer is therefore kernelHeight*kernelWidth*numFeatMapsLayer(k)*numFeatMapsLayer(k-1). However, if I am not mistaken, this program only uses kernelHeight*kernelWidth*numFeatMapsLayer(k) distinct weights. Does this mean that the connections from different feature maps of the previous layer to a particular feature map of the next layer share the same weights? Or am I misunderstanding something?
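To make the difference between the two counting schemes concrete, here is a quick sketch. The layer sizes (6 maps in layer k-1, 16 maps in layer k, 5x5 kernels) are just an illustrative assumption, borrowed from LeNet-5's C3 layer; they are not taken from this program:

```python
# Compare the two weight-counting schemes described above.
# Layer sizes are hypothetical (LeNet-5 C3-style), for illustration only.

kernel_h, kernel_w = 5, 5
maps_prev = 6    # feature maps in layer k-1
maps_next = 16   # feature maps in layer k

# LeCun-style "3D kernel": a separate 2D kernel for every
# (input map, output map) pair, assuming full connectivity
weights_3d = kernel_h * kernel_w * maps_prev * maps_next

# What this program appears to do: one 2D kernel per output map,
# shared across all input maps
weights_shared = kernel_h * kernel_w * maps_next

print(weights_3d)      # 2400
print(weights_shared)  # 400
```

If the program really stores only the second count, then each feature map of layer k must be applying the same 5x5 kernel to every feature map of layer k-1, which is exactly the weight sharing my question is about.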