Multimodal deep learning: resizing a layer output

Hello everyone, here is my problem:
I am currently studying the combination of numeric features and images as inputs for a neural-network-based classifier. So far, I have based my code on the following example: https://nl.mathworks.com/help/deeplearning/ug/train-network-on-image-and-feature-data.html
My data is a mix of RGB images and radar data (position, speed & radar cross section).
The twist is that I wish to use the pretrained networks that MATLAB offers (GoogLeNet, AlexNet, ...) for the image-processing part. This gives me a network that looks like the following (here with the pretrained GoogLeNet):
As you can see in the picture, I have to concatenate along dimension 3, since the output of the layer "pool5-drop_7x7_s1" has size 1x1x1024.
When I then want to train the network (still following the MathWorks example), I get an error message saying that all data must have the same dimension labels, which can be SSCB or CB (see below).
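For reference, here is a minimal illustration of the mismatch (hypothetical values, not from my actual pipeline):
% The image branch carries spatial dimensions while the feature inputs do not,
% so the concatenation layer sees two incompatible formats.
imgAct  = dlarray(rand(1, 1, 1024, 8), 'SSCB'); % pool5-drop_7x7_s1 output for a batch of 8
featAct = dlarray(rand(1, 8), 'CB');            % one scalar radar feature per observation
% cat(3, imgAct, featAct) errors here, because the labels SSCB and CB differ.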
Because of that, I tried to concatenate along the first dimension instead, which means resizing the previous output to size 1024, as in the MathWorks example. To do so, I added a fully connected layer between the layer "pool5-drop_7x7_s1" and the concatenation layer. This worked, but when I trained the whole network, I ended up with NaN values in my loss array, making the training useless. I suspect this comes from the extra layer mentioned above.
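As a side note, here is a simple guard that can be dropped into the training loop shown below, to stop at the first bad iteration (a sketch only):
% Abort as soon as the loss degenerates so the offending mini-batch can be
% inspected; unnormalised features or a too-aggressive learn rate factor on
% the new layer are typical suspects.
if isnan(extractdata(loss))
    warning("Loss became NaN at iteration %d.", iteration);
    break
end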
Has anyone experienced this kind of problem and found a solution?
Please find my code below:
close all; clear; clc;
addpath("pretrained networks", "pretrained networks\networks");
%% Importing the data
path = 'M:\00_dataset\data5';
radarData = getRadarData(path);
radarData.Properties.VariableNames = ["ID", "x", "y", "z", "vx", "vy", "vz", "rcs"];
eoData = imageDatastore(path, 'IncludeSubfolders', true, 'LabelSource', 'foldernames');
nbrClasses = numel(categories(unique(eoData.Labels)));
%% Pretrained network importation
pretrainedNet = googlenet();
analyzeNetwork(pretrainedNet);
layer1 = pretrainedNet.Layers(141); % "pool5-drop_7x7_s1"
layer2 = pretrainedNet.Layers(142); % "loss3-classifier"
%% Preprocessing the data
% Shuffling the data
eoData = shuffle(eoData);
% Splitting the EO Data (train, test & validation sets)
[trainEOData, testEOData] = splitEachLabel(eoData, 0.85);
[trainEOData, valEOData] = splitEachLabel(trainEOData, 0.7/0.85);
yTrain = trainEOData.Labels; yTest = testEOData.Labels; yVal = valEOData.Labels;
classes = categories(yTrain);
% Preprocessing the radar data
[m1, ~] = size(radarData); % number of radar records
% Recover the class label by stripping the trailing 3-character index from each ID
labels = strings(m1, 1);
for k = 1:m1
    label = char(radarData.ID(k));
    labels(k) = string(label(1:end-3));
end
radarData.Label = labels;
radarData = convertvars(radarData, "Label", 'categorical'); % Making labels categorical
% Building the radar data sets (train, test & validation sets)
trainRadarData = []; testRadarData = []; valRadarData = [];
for idx = 1:length(eoData.Files)
    file = eoData.Files{idx};
    tmp = erase(file, path);
    tmp = split(tmp, "\");
    id = tmp{3, 1};
    l = length(char(findLabelInID(id)));
    id = id(1:l+3);
    lineIdx = find(strcmp(radarData.ID, id));
    if any(strcmp(trainEOData.Files, file))
        trainRadarData = [trainRadarData; radarData(lineIdx, 2:end-1)];
    elseif any(strcmp(testEOData.Files, file))
        testRadarData = [testRadarData; radarData(lineIdx, 2:end-1)];
    else
        valRadarData = [valRadarData; radarData(lineIdx, 2:end-1)];
    end
end
% Preprocessing the EO data
trainEOData = transform(trainEOData, @(x) imresize(x, pretrainedNet.Layers(1).InputSize(1:2)));
testEOData = transform(testEOData, @(x) imresize(x, pretrainedNet.Layers(1).InputSize(1:2)));
valEOData = transform(valEOData, @(x) imresize(x, pretrainedNet.Layers(1).InputSize(1:2)));
xTrain = readall(trainEOData); xTest = readall(testEOData); xVal = readall(valEOData);
% readall stacks the resized images vertically, so split the resulting
% (n*numImages)-by-n-by-p arrays back into single images and stack those
% along dimension 4 instead.
[m1, n, p] = size(xTrain); [m2, ~] = size(xTest); [m3, ~] = size(xVal);
splitsTrain = {n*ones(1, m1/n), n, p};
splitsTest = {n*ones(1, m2/n), n, p};
splitsVal = {n*ones(1, m3/n), n, p};
splitCellTrain = mat2cell(xTrain, splitsTrain{:});
splitCellTest = mat2cell(xTest, splitsTest{:});
splitCellVal = mat2cell(xVal, splitsVal{:});
xTrain = cat(4, splitCellTrain{:}); xTest = cat(4, splitCellTest{:});
xVal = cat(4, splitCellVal{:});
% Combining into one datastore
dsTrain = arrayDatastore(xTrain, 'IterationDimension', 4);
dsTest = arrayDatastore(xTest, 'IterationDimension', 4);
dsVal = arrayDatastore(xVal, 'IterationDimension', 4);
[~, nbrFeatures] = size(trainRadarData);
for k = 1:nbrFeatures
    trainFeat = table2array(trainRadarData(:, k)); valFeat = table2array(valRadarData(:, k));
    testFeat = table2array(testRadarData(:, k));
    dsTrain = combine(dsTrain, arrayDatastore(trainFeat));
    dsVal = combine(dsVal, arrayDatastore(valFeat));
    dsTest = combine(dsTest, arrayDatastore(testFeat));
end
dsTrain = combine(dsTrain, arrayDatastore(yTrain));
dsVal = combine(dsVal, arrayDatastore(yVal));
dsTest = combine(dsTest, arrayDatastore(yTest));
%% Network adaptation
layers = layerGraph(pretrainedNet);
layers = disconnectLayers(layers, layer1.Name, layer2.Name);
fullyConnectedLayerOld = layers.Layers(142);
outputClassifierOld = layers.Layers(144);
% Uncomment to add extra layer
% fc15 = fullyConnectedLayer(fullyConnectedLayerOld.InputSize, 'Name', 'fc1.5', ...
%     'WeightLearnRateFactor', 10, 'BiasLearnRateFactor', 10);
% layers = addLayers(layers, fc15);
% layers = connectLayers(layers, layer1.Name, fc15.Name);
dim = 3; % dim = 1 if extra layer is added
concatLayer = concatenationLayer(dim, nbrFeatures+1, 'Name', 'concat');
layers = addLayers(layers, concatLayer);
% layers = connectLayers(layers, fc15.Name, strcat(concatLayer.Name,"/in1")); % uncomment to add extra layer
layers = connectLayers(layers, layer1.Name, strcat(concatLayer.Name,"/in1")); % comment out if extra layer is added
for k = 1:nbrFeatures
    featInputLayer = featureInputLayer(1,'Name',strcat('featuresInput', int2str(k+1)));
    layers = addLayers(layers, featInputLayer);
    layers = connectLayers(layers, featInputLayer.Name, strcat(concatLayer.Name,"/in", int2str(k+1)));
end
layers = connectLayers(layers, concatLayer.Name, layer2.Name);
fullyConnectedLayerNew = fullyConnectedLayer(nbrClasses, ...
    'Name', 'Fully Connected New', ...
    'WeightLearnRateFactor', 10, ...
    'BiasLearnRateFactor', 10);
layers = replaceLayer(layers,fullyConnectedLayerOld.Name, fullyConnectedLayerNew);
layers = removeLayers(layers,outputClassifierOld.Name);
net = dlnetwork(layers);
analyzeNetwork(net);
%% Training options
numEpochs = 15;
miniBatchSize = 128;
learnRate = 1e-5;
decay = 0.01;
momentum = 0.9;
plots = "training-progress";
%% Model training
velocity = [];
mbq = minibatchqueue(dsTrain, ...
    'MiniBatchSize', miniBatchSize, ...
    'MiniBatchFcn', @preprocessMiniBatch, ...
    'MiniBatchFormat', {'SSCB','CB','CB', 'CB', 'CB', 'CB', 'CB', 'CB', ''});
if plots == "training-progress"
    figure
    lineLossTrain = animatedline('Color',[0.85 0.325 0.098]);
    ylim([0 inf])
    xlabel("Iteration")
    ylabel("Loss")
    grid on
end
iteration = 0;
start = tic;
% Loop over epochs.
for epoch = 1:numEpochs
    % Shuffle data.
    shuffle(mbq)
    % Loop over mini-batches.
    while hasdata(mbq)
        iteration = iteration + 1;
        % Read mini-batch of data.
        [dlX1, dlX2, dlX3, dlX4, dlX5, dlX6, dlX7, dlX8, dlY] = next(mbq);
        % Evaluate the model gradients, state, and loss using dlfeval and the
        % modelGradients function and update the network state.
        [gradients, state, loss] = ...
            dlfeval(@modelGradients, net, dlX1, dlX2, dlX3, dlX4, dlX5, dlX6, dlX7, dlX8, dlY);
        net.State = state;
        % Update the network parameters using the SGDM optimizer.
        [net, velocity] = sgdmupdate(net, gradients, velocity, learnRate, momentum);
        if plots == "training-progress"
            % Display the training progress.
            D = duration(0,0,toc(start),'Format','hh:mm:ss');
            % completionPercentage = round(iteration/numIterations*100,0);
            title("Epoch: " + epoch + ", Elapsed: " + string(D));
            % extractdata(loss) % debug output
            addpoints(lineLossTrain,iteration,double(gather(extractdata(loss))))
            % drawnow
        end
    end
end
%% Model testing
mbqTest = minibatchqueue(dsTest, ...
    'MiniBatchSize', miniBatchSize, ...
    'MiniBatchFcn', @preprocessMiniBatch, ...
    'MiniBatchFormat', ["SSCB", "CB", "CB", "CB", "CB", "CB", "CB", "CB", ""]);
[predictions, predCorr] = modelPredictions(net, mbqTest, categories(eoData.Labels));
accuracy = mean(predCorr);
idx = randperm(size(xTest,4), 9);
figure
for i = 1:9
    subplot(3,3,i)
    I = xTest(:,:,:,idx(i));
    imshow(I)
    label = string(predictions(idx(i)));
    title("Predicted Label: " + label)
end
%% Utilities
function data = getRadarData(address)
% Retrieves and vertically concatenates all radar data tables found under the given address.
data = [];
folder = dir(fullfile(address,'*'));
subfolders = setdiff({folder([folder.isdir]).name},{'.','..'}); % list of subfolders
for ii = 1:numel(subfolders)
    txtFiles = dir(fullfile(address,subfolders{ii},'*.txt')); % only the .txt radar files
    files = {txtFiles(~[txtFiles.isdir]).name}; % files in subfolder
    for jj = 1:numel(files)
        file = fullfile(address,subfolders{ii},files{jj});
        tbl = readtable(file, 'TextType', 'string');
        data = [data; tbl];
    end
end
end
function s2 = findLabelInID(s1)
% Returns the first known class label contained in the identifier s1
% (assumes every ID contains exactly one of the labels below).
labels = ["aircraftCarrier", "tanker", "fighter", "liner", "smallPlane", "tank", "car", "truck"];
idx = 1;
while ~contains(s1, labels(idx))
    idx = idx + 1;
end
s2 = labels(idx);
end
function [gradients,state,loss] = modelGradients(dlnet, dlX1, dlX2, dlX3, dlX4, dlX5, dlX6, dlX7, dlX8, Y)
[dlYPred,state] = forward(dlnet, dlX1, dlX2, dlX3, dlX4, dlX5, dlX6, dlX7, dlX8);
loss = crossentropy(dlYPred, Y);
gradients = dlgradient(loss, dlnet.Learnables);
end
function [classesPredictions,classCorr] = modelPredictions(dlnet, mbq, classes)
classesPredictions = [];
classCorr = [];
% Loop over mini-batches.
while hasdata(mbq)
    [dlX1, dlX2, dlX3, dlX4, dlX5, dlX6, dlX7, dlX8, dlY] = next(mbq);
    % Make prediction.
    dlYPred = predict(dlnet, dlX1, dlX2, dlX3, dlX4, dlX5, dlX6, dlX7, dlX8);
    % dlYPred(1:4) % debug output
    % Determine predicted classes.
    YPredBatch = onehotdecode(dlYPred, classes, 1);
    classesPredictions = [classesPredictions, YPredBatch];
    % Compare predicted and true classes.
    Y = onehotdecode(dlY, classes, 1);
    classCorr = [classCorr YPredBatch == Y];
end
end
function [X1, X2, X3, X4, X5, X6, X7, X8, T] = ...
    preprocessMiniBatch(dataX1, dataX2, dataX3, dataX4, dataX5, dataX6, dataX7, dataX8, dataT)
% Preprocess predictors: stack images along dimension 4 and the scalar
% radar features along dimension 2 (one column per observation).
X1 = cat(4, dataX1{:});
X2 = cat(2, dataX2{:});
X3 = cat(2, dataX3{:});
X4 = cat(2, dataX4{:});
X5 = cat(2, dataX5{:});
X6 = cat(2, dataX6{:});
X7 = cat(2, dataX7{:});
X8 = cat(2, dataX8{:});
% Extract label data from cell and concatenate.
T = cat(2,dataT{1:end});
% One-hot encode labels.
T = onehotencode(T,1);
% whos X1 X2 X3 X4 X5 X6 X7 X8 T
end
Thank you very much for your help!

Answers (1)

Milan Bansal on 5 Oct 2023
Hi Arthur CASSOU,
As per my understanding, you are facing an error while training a neural network that feeds multiple inputs into a pretrained model through a concatenation layer.
The output of the layer "pool5-drop_7x7_s1" of GoogLeNet has size 1x1x1024 with dimension labels "SSC", while the new feature input layers have size 1 with dimension label "C", so the two cannot be concatenated.
To resolve this issue, kindly add a "flatten" layer after the "pool5-drop_7x7_s1" layer. The output of the flatten layer will be a 1024-element vector with dimension label "C", the same as the new feature input layers, so the branches can then be concatenated along dimension 1.
Please refer to the following code to add the "flatten" layer and the "concatenationLayer" to the network.
net = googlenet();
layers = layerGraph(net); % disconnectLayers and addLayers expect a layerGraph, not a DAGNetwork
layer1 = layers.Layers(141); % "pool5-drop_7x7_s1"
layer2 = layers.Layers(142); % "loss3-classifier"
layers = disconnectLayers(layers, layer1.Name, layer2.Name);
% Connecting the flatten layer
flatLayer = flattenLayer('Name','flatten1');
layers = addLayers(layers, flatLayer);
layers = connectLayers(layers, layer1.Name, flatLayer.Name);
nbrFeatures = 7; % x, y, z, vx, vy, vz and rcs
concatLayer = concatenationLayer(1, nbrFeatures+1, 'Name', 'concat');
layers = addLayers(layers, concatLayer);
layers = connectLayers(layers, flatLayer.Name, strcat(concatLayer.Name,"/in1"));
for k = 1:nbrFeatures
    featInputLayer = featureInputLayer(1,'Name',strcat('featuresInput', int2str(k+1)));
    layers = addLayers(layers, featInputLayer);
    layers = connectLayers(layers, featInputLayer.Name, strcat(concatLayer.Name,"/in", int2str(k+1)));
end
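After rebuilding the graph, a quick sanity check (a sketch, reusing the variables above) is to reconnect the concatenation output and inspect the network:
% Reconnect the concatenated features to "loss3-classifier" and inspect the
% graph; analyzeNetwork should now report the flatten1 output as 1024 (C),
% matching the "C" format of the new feature inputs.
layers = connectLayers(layers, concatLayer.Name, layer2.Name);
analyzeNetwork(layers);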
Please refer to the documentation of "flattenLayer" to learn more about the "flatten" layer.
Hope it helps!
