Unable to prepare ROI-labeled audio training dataset for training a deep learning network - training accuracy is constant and loss is not plotted

%% Load Data
filepath = 'C:\Users\HP\Documents\Stuttering Project\Solution\Labels\';
% Pass the extension filter as a name-value pair to audioDatastore.
ads = audioDatastore(filepath,'FileExtensions','.wav');
dss=load('F_0558_09y3m_1.mat');
ds1=dss.labelData.Labels.Pause;
dss=load('F_0101_15y2m_1.mat');
ds2=dss.labelData.Labels.Pause;
dss=load('F_0101_13y1m_1.mat');
ds3=dss.labelData.Labels.Pause;
dss=load('F_0101_10y4m_1.mat');
ds4=dss.labelData.Labels.Pause;
dss=load('F_0050_10y9m_1.mat');
ds5=dss.labelData.Labels.Pause;
ld=vertcat(ds1(1),ds2(4),ds3(3),ds4(2),ds5(1));
%% Prepare YTrain
YTrain=cell(size(ld));
for i=1:length(ld)
    ld{i}.ROILimits = sortrows(ld{i}.ROILimits);
    % Columns 1-2: ROI start/end in frames (40 frames per second); column 3: label value.
    Current=[round((ld{i}.ROILimits).*40),double(cell2mat(ld{i}.Value))]; % To change sampling to 44100, change 40 to 44100
    k=zeros(Current(end,2),2);
    k(:,1)=(1:Current(end,2));
    for j=1:Current(end,2)
        % Loop over ROI rows; length(Current) would return 3 when there are fewer than 3 ROIs.
        for l=1:size(Current,1)
            if j>=Current(l,1) && j<=Current(l,2)
                k(j,2)=1;
            end
        end
    end
    YTrain{i}=k(:,2);
end
%% Prepare XTrain
XTrain=cell(size(ld));
Par=cell(size(ld));
for i=1:length(ld)
    [x,adsinfo]=read(ads);
    % Trim the signal so it splits evenly into one chunk per label frame.
    r=rem(length(x),length(YTrain{i}));
    x=x(1:end-r);
    q=(length(x)/length(YTrain{i}));
    P=reshape(x,[q,length(YTrain{i})]); % each column holds the audio for one frame
    Par{i}=P;
end
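%% Extract Features
% Create the extractor once: 13 GTCC + 1 pitch + 1 spectral centroid + 13 MFCC = 28 features.
aFE = audioFeatureExtractor('SampleRate',44100, ...
    'gtcc',true,...
    'pitch',true,...
    'spectralCentroid',true,...
    'mfcc',true);
for i=1:length(Par)
    store=Par{i};
    n=size(store,2);
    feat=zeros(n,28); % renamed from "eval", which shadows the built-in function
    for j=1:n
        e=store(:,j);
        features = extract(aFE,e); % analysis windows x 28
        % Average over analysis windows (dimension 1). The original expression,
        % mean(features - mean(features)./std(features)), did not standardize the
        % features because ./ binds tighter than -.
        feat(j,:)=mean(features,1);
    end
    % Standardize each feature over the frames of this file (an assumed intent).
    % A feature that is constant over the file gives std = 0 and NaN here.
    feat=(feat-mean(feat,1))./std(feat,0,1);
    XTrain{i}=feat;
end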
%% Prepare Training model
layers = [ ...
    sequenceInputLayer(28)
    lstmLayer(200)
    fullyConnectedLayer(2)
    softmaxLayer
    classificationLayer];
options = trainingOptions('adam', ...
    'MaxEpochs',100, ...
    'MiniBatchSize',50, ...
    'InitialLearnRate',0.1, ...
    'LearnRateDropPeriod',3, ...
    'LearnRateSchedule','piecewise', ...
    'GradientThreshold',1, ...
    'Plots','training-progress',...
    'Shuffle','every-epoch',...
    'Verbose',1,...
    'DispatchInBackground',true);
%% Preparing Training Data
for i=1:length(Par)
    XTrain{i}=(XTrain{i})'; % features x time steps
    YTrain{i}=(YTrain{i})'; % 1 x time steps
    YTrain{i}=categorical(YTrain{i});
end
%% Train Network
net = trainNetwork(XTrain,YTrain,layers,options);
Here I am training a network on an ROI-labeled audio dataset to distinguish speech from pauses. After labeling the data, I divided each recording into segments and extracted features from each file. I then prepared the XTrain and YTrain datasets, and the network now trains. The problem is that the accuracy stays constant and no loss is plotted, which suggests that something is wrong with the code. How do I resolve this?
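The YTrain step above turns ROI limits (in seconds) into one 0/1 label per frame. A minimal vectorized sketch of that conversion follows, assuming the same 40 frames per second as the posted code. The names `roiLimits` and `frameRate` and their values are hypothetical examples, not taken from the labeled files.

```matlab
% Hypothetical ROI limits in seconds: each row is [start end] of one labeled region.
roiLimits = [0.10 0.30; 0.50 0.80];
frameRate = 40;                          % frames per second (assumed, as in the posted code)

roiFrames = round(sortrows(roiLimits) .* frameRate);   % ROI limits as frame indices
numFrames = roiFrames(end,2);            % last labeled frame, as in the original loop
frameIdx  = (1:numFrames)';              % column vector of frame indices

% A frame is labeled 1 when it lies inside any ROI (limits inclusive).
inRoi  = frameIdx >= roiFrames(:,1)' & frameIdx <= roiFrames(:,2)';
labels = double(any(inRoi, 2));

% Fraction of frames labeled 1; a heavily skewed value is worth knowing before training.
fractionLabeled = mean(labels);
```

The comparison relies on implicit expansion (a column against a row), so it needs R2016b or later. It replaces the two nested loops but is intended to produce the same labels.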
[Image: Training Progress plot]

11 Comments

Hi, can you provide some details on the feature extraction process for the files? What kind of features did you extract?
I extracted features such as MFCC, GTCC, pitch, harmonics, and spectral density. Extraction is not the issue. The problem I am facing is preparing the XTrain dataset for training.
I don't think this should be very hard then. You have the labels corresponding to the frames (pauses / non-pauses) and you also have the features associated with each frame. Are you facing problems with writing the code?
Yes. I am unable to get the data into the required form. Can you help me?
You can edit the question to include your current MATLAB code, for better understanding, and indicate exactly where you are facing the problem.
The code is in a raw state and pretty long, but I have posted it. If you can, please take a look. Thank you.
These are the sizes and classes of the data. I also changed the code somewhat. Now it shows the following error.
size(XTrain)=
5 1
size(Ytrain) =
5 1
class(Ytrain{1}) =
'double'
size(YTrain{1}) =
134 1
size(XTrain{1}) =
134 50
class(XTrain{1}) =
'double'
Error using trainNetwork (line 170)
Invalid training data. Responses must be a vector of categorical responses, or a cell array of categorical response
sequences.
Error in BasicTrain (line 91)
net = trainNetwork(XT,YT,layers,options);size(XTrain)
>>
When you use a sequence layer, your YTrain needs to have the same number of entries as your XTrain, and each YTrain entry needs to be the class information associated with the entire XTrain entry.
So for XTrain{1} being 134 x 50, the corresponding YTrain entry should be a scalar: the class associated with the entire XTrain{1} sequence. There should not be one YTrain entry for every row in XTrain{1}.
Imagine, for example, that you had the hypothesis that you could distinguish between different kinds of music being listened to by following some physiological measurements. The first row of every XTrain{K} entry might be blood pressure, and the second row might be neck muscle tension, and for any one entry, the two rows of XTrain{K} might reflect sampling every second for one minute, so XTrain{K} would be 2 x 60. But you do not have one class label associated with each of the rows: for any one test, you have only one class label, like YTrain(K) = "polka".
The particular configuration you are using requires that the class information be passed to it as a vector of categorical objects, or as a cell array in which each entry is a scalar categorical object.
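The two label layouts can be sketched as follows. This is an illustration under assumptions, not a confirmed fix for this thread: it assumes features are stored as numFeatures-by-numTimeSteps matrices, and it relies on the error message quoted above, which accepts either a vector of categorical responses (one label per sequence) or a cell array of categorical response sequences (one label per time step). All variable names, sizes, and label values are hypothetical.

```matlab
numFeatures = 28;
numSteps    = 134;                       % hypothetical sizes
X = {randn(numFeatures, numSteps)};      % one sequence: features x time steps

% Layout A: one label for the whole sequence (sequence-to-label).
% This pairs with lstmLayer(...,'OutputMode','last').
YLabel = categorical("pause", ["speech" "pause"]);

% Layout B: one label per time step (sequence-to-sequence).
% This pairs with lstmLayer(...,'OutputMode','sequence'), which is the default.
stepLabels = zeros(1, numSteps);
stepLabels(40:60) = 1;                   % hypothetical pause region
YSeq = {categorical(stepLabels, [0 1], {'speech','pause'})};

% For layout B, each label sequence must have as many entries as X has time steps.
assert(size(X{1}, 2) == numel(YSeq{1}));
```

Which layout applies depends on whether the goal is one decision per recording or one decision per frame; frame-level speech/pause detection corresponds to layout B.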
Okay. I changed the code and now the network trains. But I am facing another issue. I have updated the code here and posted a picture of the training progress. The accuracy stays constant at one value, and no loss is shown at all. How do I rectify this?
Hello, can anyone help me with my deep learning model's accuracy staying constant?
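One possible explanation for a constant accuracy with no loss curve is that the loss has become NaN, which the plot cannot draw, or that the network predicts a single class for every frame. A few checks that could narrow this down are sketched below on stand-in data. `XCheck` and `YCheck` are hypothetical placeholders for the real XTrain and YTrain cell arrays, laid out as features x time steps and 1 x time steps.

```matlab
% Hypothetical stand-ins for XTrain and YTrain.
XCheck = {randn(28,100), [randn(28,49), nan(28,1)]};
YCheck = {categorical([zeros(1,90) ones(1,10)]), categorical(zeros(1,50))};

% 1) Non-finite feature values (NaN or Inf) propagate into a NaN loss.
hasBadValues = cellfun(@(x) any(~isfinite(x(:))), XCheck);

% 2) Feature rows that never change over time carry no information.
numConstantRows = cellfun(@(x) sum(std(x,0,2) == 0), XCheck);

% 3) Class balance: if one class dominates, always predicting it
%    gives a high but constant accuracy.
allLabels     = [YCheck{:}];
classFraction = countcats(allLabels) ./ numel(allLabels);
```

In the posted options, 'LearnRateDropPeriod',3 combined with the default 'LearnRateDropFactor' of 0.1 shrinks the learning rate tenfold every three epochs, so it may also be worth trying a smaller 'InitialLearnRate' (for example 1e-3) and a longer drop period. These values are suggestions and have not been tested on this dataset.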


Answers (0)

Release: R2020a
Asked: 16 Aug 2020
Commented: 27 Aug 2020
