Size of the input layer is different from the expected input size

Hi everyone,
I want to combine a feedforward net with 3 features (3x1) with an RNN with 2 time-varying features (each having 252 observations). Say I want to concatenate both networks into a single feedforward layer. No matter which dimension I specify in the concatenation layer (4, 3, 2, or 1), I always get the error message "Size of the input layer is different from the expected input size" in the Deep Network Designer app. I also tried to add another feedforward layer after the GRU layer, but nothing worked. The network structure I have set up looks like this:
% Create layer graph
lgraph = layerGraph();
% Add layer branches
tempLayers = [
    sequenceInputLayer(2,"Name","sequence")
    gruLayer(200,"Name","gru_1")
    gruLayer(200,"Name","gru_2")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    featureInputLayer(3,"Name","featureinput")
    fullyConnectedLayer(128,"Name","fc_1")
    fullyConnectedLayer(200,"Name","fc_4")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    concatenationLayer(4,2,"Name","concat")
    fullyConnectedLayer(200,"Name","fc_2")
    fullyConnectedLayer(10,"Name","fc_3")];
lgraph = addLayers(lgraph,tempLayers);
% Clean up helper variable
clear tempLayers;
% Connect all the branches of the network to create the network graph
lgraph = connectLayers(lgraph,"gru_2","concat/in2");
lgraph = connectLayers(lgraph,"fc_4","concat/in1");
% Plot layers
plot(lgraph);
Any comment or feedback is highly appreciated.

 Accepted Answer

You're trying to concatenate the output of "gru_2" with the output of "fc_4". However, "gru_2" outputs sequence data and "fc_4" doesn't. There are two things to try, depending on your task:
  1. If your target data are not sequences, you can set the OutputMode of gruLayer to "last" so that it only outputs the last hidden state. The result can then be concatenated with the output of "fc_4" along dimension 1.
  2. If your target data are sequences, you can concatenate the output of "fc_4" to each sequence element of the output of "gru_2".
An example of 1:
layers = [
    sequenceInputLayer(2)
    gruLayer(200,OutputMode="last")
    concatenationLayer(1,2,"Name","cat")
    fullyConnectedLayer(10,"Name","output")];
lgraph = layerGraph(layers);
lgraph = lgraph.addLayers([featureInputLayer(3); fullyConnectedLayer(200,"Name","fc")]);
lgraph = lgraph.connectLayers("fc","cat/in2");
% This is fine for dlnetwork. For trainNetwork you will need an output layer:
analyzeNetwork(lgraph,"TargetUsage","dlnetwork");
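If you do go the trainNetwork route for this variant, appending a regression output could look like this minimal sketch (layer names reused from above; note that, as it turns out in the comments below, trainNetwork with a sequence input plus a second input layer needs R2022a or newer):
lgraph = lgraph.addLayers(regressionLayer("Name","regression"));
lgraph = lgraph.connectLayers("output","regression");
analyzeNetwork(lgraph); % default TargetUsage is "trainNetwork"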
For 2, it's a little harder: you need a custom layer to repmat the "fc_4" output over the sequence dimension. The shortest way to do this is probably with functionLayer, as follows:
concatLayer = functionLayer(@concatToSequence,"Formattable",true,"Name","cat");
layers = [
    sequenceInputLayer(2)
    gruLayer(200)
    concatLayer
    fullyConnectedLayer(10,"Name","Output")];
lgraph = layerGraph(layers);
lgraph = lgraph.addLayers([featureInputLayer(3); fullyConnectedLayer(200,"Name","fc")]);
lgraph = lgraph.connectLayers("fc","cat/in2");
analyzeNetwork(lgraph,"TargetUsage","dlnetwork");
function z = concatToSequence(x,y)
% Assume x is the sequence and y is not: x is "CBT" and y is "CB".
y = repmat(y,[1,1,size(x,3)]);
% Re-apply labels to y, i.e. add the "T" label.
y = dlarray(y,"CBT");
z = cat(1,x,y);
end
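As a quick sanity check, you can call concatToSequence directly on formatted dlarray dummy inputs (shapes here are just illustrative, with batch size 1):
x = dlarray(randn(200,1,252),"CBT"); % sequence branch: 200 channels, 252 time steps
y = dlarray(randn(200,1),"CB");      % feature branch: 200 channels
z = concatToSequence(x,y);
size(z) % 400 x 1 x 252
dims(z) % 'CBT'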

6 Comments

Hi Ben, thank you so much for your answer! This already helped me. I actually want a sequence as my output, so your second proposed solution is my preferred way. However, I still do not know how to structure my training data so that I can pass it as input into the network. I am happy to accept your answer and open a new question if you think both problems are unrelated or should be differentiated. However, I think they are in a nutshell the same question.
% Data (numeric)
ff_input  % double of size 3x1
rnn_input % double of size 2x252
output    % double of size 1x252
The above input data is one training sample/instance out of my training set. I can put everything in a cell array
XTrain{1,1} = {rnn_input};
XTrain{1,2} = {ff_input};
XTrain{1,3} = {output}; % My understanding is that the last column gets recognized as output
% Convert to datastore
XTrain_ds = arrayDatastore(XTrain,"OutputType","same");
My variable XTrain now only has 1 row and 3 columns, which is one training instance. In reality, the cell array will have dimensions 10000x3 (or any arbitrary number of rows) once I have more training data. So I can define a custom training loop.
% Define net as your second solution suggests
net = dlnetwork(lgraph);
% Set custom training parameter
miniBatchSize = 1; % (for simplicity)
% Specify to plot training progress
plots = "training-progress";
% Create minibatchqueue for regression task
mbq = minibatchqueue(XTrain_ds,...
MiniBatchSize=miniBatchSize);
My problem is now that no matter how I structure XTrain (e.g. as a table instead of a cell array, or as a dlarray), I always get an error message (either that my input dimensions do not match, or that 'Input data must be formatted dlarray objects.'). A 'simple' check running
forward(net,rnn_input,ff_input)
does not work either. Again, I tested rnn_input and ff_input formatted as datastore, cell array, dlarray, and single. Do you have any idea how I should pass my training data so that it works with the dlnetwork in your proposed solution? In a nutshell, I want the mbq variable to be passed into dlfeval later, which returns my loss and gradient.
Thank you again for your time and effort. I am happy to post this as a separate question.
It's no problem to follow up here. For dlnetwork your inputs should be dlarrays with dimension labels. I was able to make this work with the network from your original question and the sequence concatenation layer:
lgraph = layerGraph();
concatToSequenceLayer = functionLayer(@concatToSequence,"Formattable",true,"Name","concat");
tempLayers = [
    sequenceInputLayer(2,"Name","sequence")
    gruLayer(200,"Name","gru_1")
    gruLayer(200,"Name","gru_2")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    featureInputLayer(3,"Name","featureinput")
    fullyConnectedLayer(128,"Name","fc_1")
    fullyConnectedLayer(200,"Name","fc_4")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    concatToSequenceLayer
    fullyConnectedLayer(200,"Name","fc_2")
    fullyConnectedLayer(10,"Name","fc_3")];
lgraph = addLayers(lgraph,tempLayers);
clear tempLayers;
lgraph = connectLayers(lgraph,"gru_2","concat/in1");
lgraph = connectLayers(lgraph,"fc_4","concat/in2");
analyzeNetwork(lgraph,"TargetUsage","dlnetwork");
net = dlnetwork(lgraph);
% Single-observation check with formatted dlarray inputs:
rnn_input = dlarray(randn(2,252),"CTB");
ff_input = dlarray(randn(3,1),"CB");
output = forward(net,rnn_input,ff_input);
% If you have many observations in memory:
numObs = 100;
rnn_input = randn(2,252,numObs);
ff_input = randn(3,numObs);
% Note - this network outputs 10 x numObs x 252 dlarrays in format "CBT".
% If you need 1 x numObs x 252 as in the question, you will need to change
% the final fullyConnectedLayer size to 1, or choose some other way to
% reduce the channel dimension from 10 -> 1.
output = randn(10,numObs,252);
% My preferred way to handle data for training is separate datastores for
% each data stream, combined into one:
rnn_ds = arrayDatastore(rnn_input,"IterationDimension",3);
ff_ds = arrayDatastore(ff_input,"IterationDimension",2);
output_ds = arrayDatastore(output,"IterationDimension",3);
ds = combine(rnn_ds,ff_ds,output_ds);
% Then set up a minibatchqueue:
mbq = minibatchqueue(ds,3, ...
    "MiniBatchFormat",["CTB","CB","CBT"]);
[rnn_batch,ff_batch,output_batch] = mbq.next;
net_output = forward(net,rnn_batch,ff_batch);
function z = concatToSequence(x,y)
% Assume x is the sequence and y is not: x is "CBT" and y is "CB".
y = repmat(y,[1,1,size(x,3)]);
% Re-apply labels to y, i.e. add the "T" label.
y = dlarray(y,"CBT");
z = cat(1,x,y);
end
If you need to use trainNetwork, then you need to create a datastore that reads out cells for the sequence input and the sequence output, and double or single for the ff_input:
lgraph = addLayers(lgraph,regressionLayer("Name","output"));
lgraph = connectLayers(lgraph,"fc_3","output");
% fake data with 2 observations
rnn_input = {randn(2,252);randn(2,252)};
rnn_ds = arrayDatastore(rnn_input,"OutputType","same");
ff_input = randn(3,2);
ff_ds = arrayDatastore(ff_input,"IterationDimension",2);
output = {randn(10,252); randn(10,252)};
output_ds = arrayDatastore(output,"OutputType","same");
cds = combine(rnn_ds,ff_ds,output_ds);
opts = trainingOptions("adam");
net = trainNetwork(cds,lgraph,opts);
Again, thank you so much for this detailed answer.
  • I actually think that if I work with trainNetwork, it will be much easier for me to iterate over the different hyperparameters etc. later. However, if I follow your example from above for trainNetwork, I again get an error:
net = trainNetwork(cds,lgraph,opts);
Caused by:
Network: Invalid input layers. If the network has a sequence input layer, then it must not have any other input layers.
This leads me to believe that I have to specify a dlnetwork and cannot work with trainNetwork. But maybe I am just missing something here. The code that I am running, which produces the error message:
lgraph = layerGraph();
concatToSequenceLayer = functionLayer(@concatToSequence,"Formattable",true,"Name","concat");
tempLayers = [
    sequenceInputLayer(2,"Name","sequence")
    gruLayer(200,"Name","gru_1")
    gruLayer(200,"Name","gru_2")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    featureInputLayer(3,"Name","featureinput")
    fullyConnectedLayer(128,"Name","fc_1")
    fullyConnectedLayer(200,"Name","fc_4")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    concatToSequenceLayer
    fullyConnectedLayer(200,"Name","fc_2")
    fullyConnectedLayer(10,"Name","fc_3")];
lgraph = addLayers(lgraph,tempLayers);
clear tempLayers;
lgraph = connectLayers(lgraph,"gru_2","concat/in1");
lgraph = connectLayers(lgraph,"fc_4","concat/in2");
lgraph = addLayers(lgraph,regressionLayer("Name","output"));
lgraph = connectLayers(lgraph,"fc_3","output");
% Fake data with 2 observations
rnn_input = {randn(2,252);randn(2,252)};
rnn_ds = arrayDatastore(rnn_input,"OutputType","same");
ff_input = randn(3,2);
ff_ds = arrayDatastore(ff_input,"IterationDimension",2);
output = {randn(10,252); randn(10,252)};
output_ds = arrayDatastore(output,"OutputType","same");
cds = combine(rnn_ds,ff_ds,output_ds);
opts = trainingOptions("adam");
net = trainNetwork(cds,lgraph,opts);
function z = concatToSequence(x,y)
% Assume x is the sequence and y is not: x is "CBT" and y is "CB".
y = repmat(y,[1,1,size(x,3)]);
% Re-apply labels to y, i.e. add the "T" label.
y = dlarray(y,"CBT");
z = cat(1,x,y);
end
  • Given my first comment, I adjusted your suggested code for the dlnetwork slightly so that it works for 10 x numObs x 252 dlarrays. And this works. Then I tried to change the 10 to a 1, and the dimensions in minibatchqueue do not stay consistent anymore. Any ideas why this is the case? Again, any help/comment is highly appreciated!
%% Set up network to train
lgraph = layerGraph();
concatToSequenceLayer = functionLayer(@concatToSequence,"Formattable",true,"Name","concat");
tempLayers = [
    sequenceInputLayer(2,"Name","sequence")
    gruLayer(200,"Name","gru_1")
    gruLayer(200,"Name","gru_2")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    featureInputLayer(3,"Name","featureinput")
    fullyConnectedLayer(128,"Name","fc_1")
    fullyConnectedLayer(200,"Name","fc_4")];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    concatToSequenceLayer
    fullyConnectedLayer(200,"Name","fc_2")
    fullyConnectedLayer(1,"Name","fc_3")];
lgraph = addLayers(lgraph,tempLayers);
clear tempLayers;
lgraph = connectLayers(lgraph,"gru_2","concat/in1");
lgraph = connectLayers(lgraph,"fc_4","concat/in2");
%analyzeNetwork(lgraph,"TargetUsage","dlnetwork");
net = dlnetwork(lgraph);
% If you have many observations in memory:
numObs = 1000;
rnn_input = randn(2,252,numObs);
ff_input = randn(3,numObs);
% Note - with fc_3 of size 1, this network now outputs 1 x numObs x 252
% dlarrays in format "CBT", so the targets are 1 x 252 per observation.
output = randn(1,252,numObs);
% Separate datastores for each data stream, combined into one:
rnn_ds = arrayDatastore(rnn_input,"IterationDimension",3);
ff_ds = arrayDatastore(ff_input,"IterationDimension",2);
output_ds = arrayDatastore(output,"IterationDimension",3);
ds = combine(rnn_ds,ff_ds,output_ds);
% Then set up a minibatchqueue:
mbq = minibatchqueue(ds,3, ...
    "MiniBatchFormat",["CTB","CB","CTB"]);
% In a next step I will loop over my epochs, but that is not necessary for
% this example. In a nutshell, in the same fashion as in the MATLAB
% documentation here: https://de.mathworks.com/help/deeplearning/ug/train-network-using-custom-training-loop.html
% in section 'Train Model'.
[rnn_batch,ff_batch,output_batch] = mbq.next;
% As a next step I would call dlfeval(@modelLoss,net,rnn_batch,ff_batch,output_batch).
% In my modelLoss function I have mse as loss. The problem is again in the
% forward function, which gets called in modelLoss.
% Forward data through network.
[Y,state] = forward(net,rnn_batch,ff_batch);
% The dimensions of Y are now 1x128x252, which is exactly what I need. The
% problem is that the variable output_batch has dimensions 128x1x252, which
% produces an error in the later-called mse function due to the dimension mismatch.
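For reference, the loop and loss function I have in mind are essentially this minimal sketch (following the linked documentation; adamupdate and the mse loss are my own choices here, and numEpochs is hypothetical):
numEpochs = 10;
averageGrad = []; averageSqGrad = []; iteration = 0;
for epoch = 1:numEpochs
    shuffle(mbq);
    while hasdata(mbq)
        iteration = iteration + 1;
        [rnn_batch,ff_batch,output_batch] = next(mbq);
        [loss,gradients,state] = dlfeval(@modelLoss,net,rnn_batch,ff_batch,output_batch);
        net.State = state;
        [net,averageGrad,averageSqGrad] = adamupdate(net,gradients, ...
            averageGrad,averageSqGrad,iteration);
    end
end
function [loss,gradients,state] = modelLoss(net,rnn_batch,ff_batch,target_batch)
% Forward data through the network and compute half-mean-squared-error loss.
[Y,state] = forward(net,rnn_batch,ff_batch);
loss = mse(Y,target_batch);
gradients = dlgradient(loss,net.Learnables);
end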
What version of MATLAB and Deep Learning Toolbox do you have? I'm using R2022a, and I see that trainNetwork with multiple input layers, including one sequenceInputLayer, was only released in R2022a: https://www.mathworks.com/help/deeplearning/release-notes.html
Sorry about that, I lose track of when we release features.
As for the dlnetwork approach - the remaining issue is that output_ds reads out a cell containing a 1 x 252 array, and the default minibatching in minibatchqueue then thinks it should concatenate along the first dimension. I should have noticed that in the code I sent; it surprised me too!
One option to fix this is to write a minibatch function:
mbq = minibatchqueue(ds,3, ...
    "MiniBatchFormat",["CTB","CB","CTB"], ...
    "MiniBatchFcn",@minibatch);
function [rnn_batch,ff_batch,output_batch] = minibatch(rnn_cell,ff_cell,output_cell)
% Concatenate each data stream along its batch dimension:
% 2x252 sequences along dim 3, 3x1 features along dim 2, 1x252 targets along dim 3.
rnn_batch = cat(3,rnn_cell{:});
ff_batch = cat(2,ff_cell{:});
output_batch = cat(3,output_cell{:});
end
Another option would be to set the MiniBatchFormat for output_batch to "BTC": since by default the minibatching uses dimension 1 for batch, you can simply label that dimension as batch and tell minibatchqueue that there's an extra singleton "C" dimension at the end:
mbq = minibatchqueue(ds,3, ...
    "MiniBatchFormat",["CTB","CB","BTC"]);
Hopefully that works now!
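A quick check with the data from above: dlarray sorts the labels into canonical "CBT" order internally, so either option should hand back a target batch whose size and format match the network output.
[rnn_batch,ff_batch,output_batch] = next(mbq);
size(output_batch) % 1 x miniBatchSize x 252
dims(output_batch) % 'CBT' - same layout as forward(net,rnn_batch,ff_batch)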
Then that is the issue and why trainNetwork does not work: I am running MATLAB R2021b. Good to know, thanks for the pointer. I could have looked at the release notes earlier.
Now everything works for the dlnetwork approach and both ways for the minibatchqueue work for me. Again, thank you so much Ben for your outstanding help!
Hi Ben, I just had a chance to check your first approach on R2022a via trainNetwork and subsequently via the Deep Learning App. It worked perfectly fine. Thank you again, I really appreciate your help.


Asked: MR on 17 Jun 2022
Commented: MR on 24 Jun 2022