Can't improve a near-zero training accuracy for sequence-to-label 1D CNN over a signal datastore

I have a signal datastore with 41 unique classes. The datastore consists of two channels, with 101 observations per class per channel. I'm attempting to develop a deep learning model (sequence-to-label) to classify test signals into one of these classes and observe the accuracy. However, I've tried many different network architectures, and I get a near-zero training accuracy after every attempt.
I followed the steps outlined in Chapter 8 of the Deep Learning with MATLAB online course to prepare the datastore for training. Upon processing the datastore, I have a 4141x1 cell array, each cell containing a 2x4000 double, along with a categorical vector of labels. The code for the train/validation/test split is as follows:
% Split indices 80/10/10 per class, then index into the signals and labels
idx = splitlabels(labels,[0.8,0.1,0.1]);
trainidx = idx{1};
validx = idx{2};
testidx = idx{3};

traindata = sigdata(trainidx);
trainlabels = labels(trainidx);
valdata = sigdata(validx);
vallabels = labels(validx);
testdata = sigdata(testidx);
testlabels = labels(testidx);
The training options are defined as:
opts = trainingOptions("adam", ...
"Plots","training-progress", ...
"ValidationData",{valdata vallabels}, ...
"InitialLearnRate",0.0001, ...
"MaxEpochs",15, ...
"Shuffle","every-epoch", ...
"ValidationFrequency",20, ...
"MiniBatchSize",32)
This just leaves the network architecture. I've tried many different networks with varying layers and properties, none of which have been successful. For instance, here's the latest one I tried:
1 'input' Sequence Input Sequence input with 2 dimensions
2 'conv1d' 1-D Convolution 512 7 convolutions with stride 1 and padding 'same'
3 'relu' ReLU ReLU
4 'batchnorm' Batch Normalization Batch normalization
5 'maxpool1d' 1-D Max Pooling Max pooling with pool size 7, stride 1, and padding 'same'
6 'conv1d_1' 1-D Convolution 256 7 convolutions with stride 1 and padding 'same'
7 'relu_1' ReLU ReLU
8 'batchnorm_1' Batch Normalization Batch normalization
9 'maxpool1d_1' 1-D Max Pooling Max pooling with pool size 7, stride 1, and padding 'same'
10 'conv1d_2' 1-D Convolution 128 5 convolutions with stride 1 and padding 'same'
11 'relu_2' ReLU ReLU
12 'batchnorm_2' Batch Normalization Batch normalization
13 'maxpool1d_2' 1-D Max Pooling Max pooling with pool size 7, stride 1, and padding 'same'
14 'conv1d_3' 1-D Convolution 64 5 convolutions with stride 1 and padding 'same'
15 'relu_3' ReLU ReLU
16 'batchnorm_3' Batch Normalization Batch normalization
17 'maxpool1d_3' 1-D Max Pooling Max pooling with pool size 7, stride 1, and padding 'same'
18 'conv1d_4' 1-D Convolution 32 3 convolutions with stride 1 and padding 'same'
19 'relu_4' ReLU ReLU
20 'batchnorm_4' Batch Normalization Batch normalization
21 'maxpool1d_4' 1-D Max Pooling Max pooling with pool size 7, stride 1, and padding 'same'
22 'conv1d_5' 1-D Convolution 16 3 convolutions with stride 1 and padding 'same'
23 'relu_5' ReLU ReLU
24 'layernorm' Layer Normalization Layer normalization
25 'gmpool1d' 1-D Global Max Pooling 1-D global max pooling
26 'fc' Fully Connected 41 fully connected layer
27 'softmax' Softmax softmax
28 'classification' Classification Output crossentropyex
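In code, that layer array looks roughly like this (reproduced from memory, so treat the exact name-value arguments as a sketch of what I defined rather than a verbatim copy; layer names are omitted):
% Layer array matching the printed table (2-channel input, 41 classes)
layers = [
    sequenceInputLayer(2)

    convolution1dLayer(7,512,"Padding","same")
    reluLayer
    batchNormalizationLayer
    maxPooling1dLayer(7,"Stride",1,"Padding","same")

    convolution1dLayer(7,256,"Padding","same")
    reluLayer
    batchNormalizationLayer
    maxPooling1dLayer(7,"Stride",1,"Padding","same")

    convolution1dLayer(5,128,"Padding","same")
    reluLayer
    batchNormalizationLayer
    maxPooling1dLayer(7,"Stride",1,"Padding","same")

    convolution1dLayer(5,64,"Padding","same")
    reluLayer
    batchNormalizationLayer
    maxPooling1dLayer(7,"Stride",1,"Padding","same")

    convolution1dLayer(3,32,"Padding","same")
    reluLayer
    batchNormalizationLayer
    maxPooling1dLayer(7,"Stride",1,"Padding","same")

    convolution1dLayer(3,16,"Padding","same")
    reluLayer
    layerNormalizationLayer

    globalMaxPooling1dLayer
    fullyConnectedLayer(41)
    softmaxLayer
    classificationLayer];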
All the other networks I've tried used variations of the filter size and number of filters in each conv1d layer, and of the max pooling window size. I've also tried average pooling layers, varying the number of conv1d and pooling layers, adding an LSTM layer after all the convolution layers, and, in one attempt, a cwt layer right after the input followed by conv2d and maxpool2d layers. After every attempt, I end up with a training-progress plot where the accuracy stays near zero.
Obviously, this is unacceptable. I'm not necessarily expecting >90% accuracy, but I at least want something I can work with. I also considered that the problem may lie in the training options rather than the layer architecture, so I varied the mini-batch size and learning rate a bit, but to no avail.
Closing thoughts: I think the main culprit is the global pooling layer before the fc layer. It doesn't make sense to me that an entire 4000 sample length signal is condensed into a single value after only a few convolution and pooling operations. So rather than doing a global pooling, I think I should try adding more fc layers. However, I can't do this without MATLAB treating the problem as sequence-to-sequence classification, and I don't know how to get around that.
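One alternative I've been sketching (the hidden-unit and fc sizes below are placeholders I haven't validated) is to replace the global pooling layer with an LSTM layer set to OutputMode "last". That also collapses the time dimension to a single vector per signal, after which extra fc layers are allowed without the network becoming sequence-to-sequence:
% Sketch: end the sequence with an LSTM instead of global max pooling,
% then stack additional fully connected layers (sizes are placeholders)
tailLayers = [
    lstmLayer(128,"OutputMode","last")   % one feature vector per signal
    fullyConnectedLayer(64)
    reluLayer
    fullyConnectedLayer(41)
    softmaxLayer
    classificationLayer];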
I'm also not sharing too many details about my training data because I don't want it to be public; if you want to look at it, I'd prefer that we message directly. But this dataset was successfully classified with >95% accuracy using a shallow SVM, so I don't see why deep learning shouldn't work on it as well.
Much kudos to anyone who's read this far. Thanks in advance for any and all help, and I look forward to your responses!
  3 Comments
Said El-Hawwat on 11 Jan 2024
Hello Debraj,
Thanks for your response. Unfortunately, I can't find examples with data similar to mine, as I'm working with response signals from excitation wave packets for defect detection, and I don't see any MATLAB deep learning examples with that kind of data. Would it be possible to reach out to you directly so I can share my exact data privately?
Best,
Said
David Ho on 18 Jan 2024
Regarding your comment that "an entire 4000 sample length signal is condensed into a single value": if you want to downsample in time before the global pooling, you could add stride to your intermediate pooling layers or convolution1dLayers.
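For instance, something along these lines (the stride of 2 is only illustrative, and the filter counts are copied from your first block):
% A stride of 2 in each intermediate max pooling layer halves the time
% dimension; repeated across the five blocks, 4000 samples shrink to ~125
% before the global pooling layer.
block1 = [
    convolution1dLayer(7,512,"Padding","same")
    reluLayer
    batchNormalizationLayer
    maxPooling1dLayer(7,"Stride",2,"Padding","same")];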


Answers (1)

Matt J on 8 Jan 2024 (edited)
As far as we can tell, you've only tried a single combination of training hyperparameters. You need to explore different combinations, possibly using the Experiment Manager.
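For example, a minimal programmatic sweep might look like the sketch below; it assumes the layers, traindata/trainlabels, and valdata/vallabels variables from your question, and the value grids are just starting points:
% Sketch of a small grid search over learning rate and mini-batch size
learnRates = [1e-4 3e-4 1e-3];
batchSizes = [32 64 128];

for lr = learnRates
    for bs = batchSizes
        opts = trainingOptions("adam", ...
            "InitialLearnRate",lr, ...
            "MiniBatchSize",bs, ...
            "MaxEpochs",15, ...
            "Shuffle","every-epoch", ...
            "ValidationData",{valdata vallabels}, ...
            "Verbose",false);
        [~,info] = trainNetwork(traindata,trainlabels,layers,opts);
        fprintf("lr = %g, batch = %d, final validation accuracy = %.1f%%\n", ...
            lr, bs, info.FinalValidationAccuracy);
    end
end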
