Semantic Segmentation with Deeplabv3+ Training Jumping Around After a Few Iterations

I'm pretty new to deep learning, so I've been following tutorials to train my model. The code I started from is the example here, the tutorial for using deeplabv3+ with a ResNet18 base. I got the basic tutorial running fine and edited the training options to include a training plot. The results look like below.
After I got this working, I brought in my own data using the code below. The partitionData function came from renaming the partitionCamVidData function in this example. I've adjusted the split to an 80/20 split between training and validation, with no testing data set aside since I want to run tests separately.
dataFolderImages = fullfile("RGB Images Folder Path");
dataFolderLabels = fullfile("Classified Images Folder Path");
imds = imageDatastore(dataFolderImages);
classNames = ["List of names of my 19 classes for my application"];
labelIDs = [0; 1; 3; 5; 8; 10; 11; 13; 14; 15; 16; 17; 18; 19; 20; 21; 23; 28; 31]; % Values in my classified images corresponding to my 19 classes
pxds = pixelLabelDatastore(dataFolderLabels, classNames, labelIDs); % pixel label datastore consumed by partitionData below
imageSize = [512 512 3];
numClasses = numel(classNames);
[imdsTrain, imdsVal, imdsTest, pxdsTrain, pxdsVal, pxdsTest] = partitionData(imds,pxds);
% Extract fixed-size 512x512 patches so the model always receives images of the same size.
% PatchesPerImage has been changed around a bit since then (it may have been higher when I
% first ran this code), but it was around this value at the time of testing.
patchds = randomPatchExtractionDatastore(imdsTrain,pxdsTrain,[512 512],'PatchesPerImage',320);
valpatchds = randomPatchExtractionDatastore(imdsVal,pxdsVal,[512 512],'PatchesPerImage',320);
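For reference, the 80/20 split inside my renamed partitionData is roughly the sketch below (a simplified sketch; the real helper is the partitionCamVidData function from the example with the test portion removed and my label IDs substituted in):
function [imdsTrain, imdsVal, imdsTest, pxdsTrain, pxdsVal, pxdsTest] = partitionData(imds, pxds)
% Simplified sketch: shuffle the files, keep 80% for training and 20% for
% validation, and return empty placeholders for the unused test outputs.
rng(0); % reproducible shuffle
numFiles = numel(imds.Files);
shuffledIdx = randperm(numFiles);
numTrain = round(0.80 * numFiles);
trainIdx = shuffledIdx(1:numTrain);
valIdx = shuffledIdx(numTrain+1:end);
imdsTrain = imageDatastore(imds.Files(trainIdx));
imdsVal = imageDatastore(imds.Files(valIdx));
imdsTest = []; % no test data set aside
classes = pxds.ClassNames;
labelIDs = [0; 1; 3; 5; 8; 10; 11; 13; 14; 15; 16; 17; 18; 19; 20; 21; 23; 28; 31];
pxdsTrain = pixelLabelDatastore(pxds.Files(trainIdx), classes, labelIDs);
pxdsVal = pixelLabelDatastore(pxds.Files(valIdx), classes, labelIDs);
pxdsTest = []; % no test data set aside
end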
After doing this, my training progress looked like the image below. I stopped training early because of the strange jumping in the training lines. The plots all seem to start with the shape typical of training progress before jumping around sporadically. I know training isn't necessarily a smooth line, but jumps down to such extreme values felt off.
I then tried lowering the initial learning rate, adding shuffling after every epoch, and switching the learn rate schedule to piecewise, as a way to check whether I needed to change the training options. Doing so produced the image below, and I once again stopped training early due to the strange jumps.
This run had stretches where the loss stayed near the expected range before jumping again, and the jumps to extreme values seemed more regular, roughly every 10 iterations.
I guess my question is: does anyone know why this is happening and what I can do to fix it? I've already tried adding class weights in the way shown in the code below, using the loss function from here, but that doesn't seem to fix it.
tbl = countEachLabel(pxdsTrain) % pixel counts per class in the training set
imageFreq = tbl.PixelCount ./ tbl.ImagePixelCount; % frequency of each class relative to the images it appears in
classWeights = median(imageFreq) ./ imageFreq; % median frequency balancing: rarer classes get larger weights
net = trainnet(patchds,net,@(Y,T) modelLoss(Y,T,classWeights),opts);
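For completeness, my understanding is that the modelLoss from that link is essentially a class-weighted cross-entropy, roughly like the sketch below (simplified, and assuming Y and T arrive as same-sized arrays with the classes along the channel dimension):
function loss = modelLoss(Y,T,classWeights)
% Rough sketch of a class-weighted cross-entropy loss: Y is the network output,
% T is the target of the same size, and classWeights (computed above) is applied
% per class along the channel dimension via WeightsFormat="C".
loss = crossentropy(Y,T,classWeights,WeightsFormat="C");
end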

Answers (1)

Gayathri on 6 Feb 2025
I understand that you are experiencing instability during the training of your model, which can manifest as erratic jumps in your loss or accuracy plots. Below are some potential reasons for this behavior and suggestions on how to address it.
  • A learning rate that's too high can cause the training process to overshoot, leading to erratic loss behavior. You mentioned that you already tried lowering the initial learning rate, which is a good step. Consider also using a learning rate schedule that decreases the learning rate gradually over epochs; specifically, you can use the "LearnRateSchedule", "LearnRateDropFactor", and "LearnRateDropPeriod" parameters in the "trainingOptions" function.
  • A very small mini-batch size can cause high variance in the gradient estimates, leading to instability. Try increasing the "MiniBatchSize" if memory allows; more reliable gradient estimates help stabilize training.
  • Large gradients can produce updates that are too big, which also destabilizes training. Implement gradient clipping by setting the "GradientThreshold" parameter in the "trainingOptions" function (see the example after the help command below).
You can try exploring the other parameters in the "trainingOptions" function to stabilise the training process. Use the "help" command to see the parameters that can be set:
help trainingOptions
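For example, the suggestions above map onto "trainingOptions" roughly as follows; the numeric values are only placeholders and will need to be tuned for your data and hardware:
opts = trainingOptions("adam", ...
    InitialLearnRate=1e-4, ...          % lower starting rate to reduce overshooting
    LearnRateSchedule="piecewise", ...  % decay the learning rate during training
    LearnRateDropFactor=0.3, ...
    LearnRateDropPeriod=10, ...         % drop the rate every 10 epochs
    MiniBatchSize=16, ...               % larger batches give steadier gradient estimates
    GradientThreshold=1, ...            % clip large gradients
    Shuffle="every-epoch", ...
    MaxEpochs=50, ...
    ValidationData=valpatchds, ...
    Plots="training-progress");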
Hope you find this information helpful!
