neural network hyperparameter tuning

Hello,
Since there is no hyperparameter tuning function for neural networks, I wanted to try the bayesopt function. I tried to adapt the example here: https://de.mathworks.com/help/stats/bayesian-optimization-case-study.html, but it does not work. Is it possible to tune the number of hidden neurons? This is my code, which fails:
[m,n] = size(Daten) ;
P = 0.7 ;
Training = Daten(1:round(P*m),:) ;
Testing = Daten(round(P*m)+1:end,:);
XTrain=Training(:,1:n-1);
YTrain=Training(:,n);
XTest=Testing(:,1:n-1);
YTest=Testing(:,n);
c = cvpartition(YTrain,'KFold',10);
hiddenLayerSize=optimizableVariable('hiddenLayerSize',[0,20]);
minfn = @(z)kfoldLoss(fitnet(XTrain,YTrain,'CVPartition',c,...
'hiddenLayerSize',z.hiddenLayerSize));
results = bayesopt(minfn,hiddenLayerSize,'IsObjectiveDeterministic',true,...
'AcquisitionFunctionName','expected-improvement-plus');

Accepted Answer

Don Mathis on 17 Nov 2018
If you want a more complete workflow that also optimizes the learning rate, and tests the final model on your test set, you could try this:
% Make some data
Daten = rand(100, 3);
Daten(:,3) = Daten(:,1) + Daten(:,2) + .1*randn(100, 1); % Minimum asymptotic error is .1
[m,n] = size(Daten) ;
% Split into train and test
P = 0.7 ;
Training = Daten(1:round(P*m),:) ;
Testing = Daten(round(P*m)+1:end,:);
XTrain = Training(:,1:n-1);
YTrain = Training(:,n);
XTest = Testing(:,1:n-1);
YTest = Testing(:,n);
% Define a train/validation split to use inside the objective function
cv = cvpartition(numel(YTrain), 'Holdout', 1/3);
% Define hyperparameters to optimize
vars = [optimizableVariable('hiddenLayerSize', [1,20], 'Type', 'integer');
optimizableVariable('lr', [1e-3 1], 'Transform', 'log')];
% Optimize
minfn = @(T)kfoldLoss(XTrain', YTrain', cv, T.hiddenLayerSize, T.lr);
results = bayesopt(minfn, vars,'IsObjectiveDeterministic', false,...
'AcquisitionFunctionName', 'expected-improvement-plus');
T = bestPoint(results)
% Train final model on full training set using the best hyperparameters
net = feedforwardnet(T.hiddenLayerSize, 'traingd');
net.trainParam.lr = T.lr;
net = train(net, XTrain', YTrain');
% Evaluate on test set and compute final rmse
ypred = net(XTest');
finalrmse = sqrt(mean((ypred - YTest').^2))
function rmse = kfoldLoss(x, y, cv, numHid, lr)
% Train net.
net = feedforwardnet(numHid, 'traingd');
net.trainParam.lr = lr;
net = train(net, x(:,cv.training), y(:,cv.training));
% Evaluate on validation set and compute rmse
ypred = net(x(:, cv.test));
rmse = sqrt(mean((ypred - y(cv.test)).^2));
end
  6 Comments
SAIF MEHDI on 10 Aug 2022
Most of these solvers are single-objective. For your problem, you need a multiobjective solver. I know of two: the multiobjective GA (gamultiobj) and Pareto search (paretosearch). You will have to go through their documentation to understand the syntax.


More Answers (2)

Sean de Wolski on 6 Nov 2018
Edited: Sean de Wolski on 6 Nov 2018
This is nowhere near as easy as it should be. First, the shallow neural net infrastructure is old and expects variables in rows and observations in columns. This needs to be accounted for, and you'll see it below as a ton of .' transposes. Second, you'll need to write a wrapper around fitnet, because it doesn't accept all of its options as name-value pairs the way the modern fit* functions in the Statistics and Machine Learning Toolbox do. Third, the training is non-deterministic unless you seed the rng yourself.
I don't understand the math behind using k-fold cross-validation with a neural net. Hence, I'll use holdout below, which will reliably train and evaluate the network on an independent test set.
Daten = rand(100, 3);
[m,n] = size(Daten) ;
P = 0.7 ;
Training = Daten(1:round(P*m),:) ;
Testing = Daten(round(P*m)+1:end,:);
XTrain=Training(:,1:n-1).'; % Note transposes
YTrain=Training(:,n).';
XTest=Testing(:,1:n-1).';
YTest=Testing(:,n).';
c = cvpartition(numel(YTrain),'Holdout', 0.25);
hiddenLayerSize=optimizableVariable('hiddenLayerSize',[1,20], 'Type', 'integer');
minfn = @(z)wrapFitNet(XTrain,YTrain, 'CVPartition', c, ...
'hiddenLayerSize',z.hiddenLayerSize);
results = bayesopt(minfn,hiddenLayerSize,'IsObjectiveDeterministic',false,...
'AcquisitionFunctionName','expected-improvement-plus');
Wrapper function
function cvrmse = wrapFitNet(x, y, varargin)
% Handle variable inputs
ip = inputParser;
ip.addParameter('hiddenLayerSize', 20);
ip.addParameter('CVPartition', cvpartition(numel(y),'Holdout', 0.10));
parse(ip, varargin{:});
cv = ip.Results.CVPartition;
hiddensz = ip.Results.hiddenLayerSize;
% Train net. You would adjust other hyper parameters here.
net = fitnet(hiddensz);
nets = train(net, x(:, cv.training.'), y(:, cv.training.'));
% Evaluate on test set and compute rmse
ypred = nets(x(:, cv.test.'));
cvrmse = sqrt(sum((ypred - y(cv.test.')).^2)/numel(y(cv.test))); % square the residuals, not the targets
end
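On the third point above (non-determinism), here is a minimal sketch of seeding the generator before creating and training a network. The data, the seed value 42, and the layer size 5 are illustrative assumptions, not part of the original workflow. With the same seed, the initial weights and the internal train/validation/test division are drawn identically, so the two runs should agree when training on a single CPU.

```matlab
% Sketch: the same seed gives the same initial weights and data division.
% Assumes Deep Learning Toolbox (fitnet, train). Data are made up.
x = rand(2, 60);                       % 2 inputs, 60 observations in columns
y = x(1,:) + x(2,:);

rng(42);                               % seed before creating and training net1
net1 = fitnet(5);
net1.trainParam.showWindow = false;    % suppress the training GUI
net1 = train(net1, x, y);

rng(42);                               % same seed again for net2
net2 = fitnet(5);
net2.trainParam.showWindow = false;
net2 = train(net2, x, y);

maxDiff = max(abs(net1(x) - net2(x))); % difference between the two trained nets
disp(maxDiff)
```

Inside a bayesopt objective you could call rng with a fixed seed at the top of the wrapper if you want repeated evaluations of the same point to return the same loss.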
Finally, if the only thing you want to optimize is the hidden layer size, it may be easiest to just loop over 1:20 and try them all. Bayesian optimization really helps when you have many different parameters (trainFcn, etc.).
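A minimal sketch of that brute-force loop, assuming the same kind of synthetic data and a single holdout split. The variable names (sizes, valRmse, bestK) and the 25% holdout fraction are illustrative choices, and the result will vary with the seed.

```matlab
% Sketch: try every hidden layer size from 1 to 20 on one holdout split.
% Assumes Deep Learning Toolbox (fitnet, train) and Statistics and
% Machine Learning Toolbox (cvpartition). Data are made up.
rng(0);                                    % reproducible split and initial weights
Daten = rand(100, 3);
Daten(:,3) = Daten(:,1) + Daten(:,2) + 0.1*randn(100, 1);
X = Daten(:,1:2).';                        % observations in columns
Y = Daten(:,3).';

cv = cvpartition(numel(Y), 'Holdout', 0.25);
trainIdx = training(cv).';                 % logical row vectors
valIdx = test(cv).';

sizes = 1:20;
valRmse = zeros(size(sizes));
for k = 1:numel(sizes)
    net = fitnet(sizes(k));
    net.trainParam.showWindow = false;     % suppress the training GUI
    net = train(net, X(:, trainIdx), Y(:, trainIdx));
    ypred = net(X(:, valIdx));
    valRmse(k) = sqrt(mean((ypred - Y(valIdx)).^2));
end

[bestRmse, bestK] = min(valRmse);
fprintf('Best hidden layer size: %d (validation RMSE %.4f)\n', sizes(bestK), bestRmse);
```

Because each training run is noisy, neighbouring sizes often give similar errors. Repeating the loop over a few seeds and averaging valRmse gives a more stable picture.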


Dimitri on 10 Nov 2018
I'm sorry to bother you again, but I'm having trouble with your code. When I run the code, I get the following output:
Additionally, no curves are plotted during the Bayesian optimization, which probably has to do with the error. I didn't change anything in your code. Can you help me again, please?
Dimitri
  6 Comments
Madushan Rathnayaka on 22 Feb 2022
How do we extend this to other parameters?
