fmincon user-supplied hessian inconsistency

Phil on 11 Jul 2013
I'm using fmincon to optimize a scalar function with this abridged set of options:
options = optimset('Display', 'iter-detailed','MaxFunEvals',10000,...
'Maxiter',1000,'TolX',1e-18,'TolFun',1e-18, 'GradObj', 'on',...
'Algorithm', 'trust-region-reflective', 'Hessian',...
'user-supplied');
The following is the call to fmincon, the objective function, and the constraint function. Note that all are nested functions, which allows extra parameters to be passed, and that unnecessary details have been left out.
[X,fval, exitflag, output, lambda, grad, hessian] = fmincon(@nestedfcn, min, [], [], [], [], lb, [], @cfcn, options);
function [c, ceq] = cfcn(min)
    for i = 1:length(min)
        if i ~= 9
            c(i) = -25000 + abs(min(i));
        else
            c(i) = -75000 + abs(min(i));
        end
    end
    ceq = [];
end
function [y, grady, H] = nestedfcn(min)
    a = F ./ min;
    [u,sig,vol] = truss2d(nnod,nel,e,a,conn,x,bc,f);
    y = vol;
    for i = 1:length(min)
        grady(i) = -F(i)*L / (sig(i)^2);
        for j = 1:length(min)
            if i == j
                H(i,j) = F(i)*L / (sig(i)^3);
            else
                H(i,j) = 0;
            end
        end
    end
end
The answer does not come out as it is supposed to. The Hessian I supply is a diagonal matrix at every point, yet fmincon outputs a non-diagonal matrix. For brevity, I've left out the specific matrices, but I can post them if that helps (I think the particular values are irrelevant at the moment, as I suspect an implementation issue).

Answers (1)

Matt J on 11 Jul 2013
Edited: Matt J on 11 Jul 2013
Well, I, for one, can't see from what you've posted why FMINCON would output a non-diagonal Hessian. The reason for that would have to be in details that you've omitted. Note, however, that the matrix returned by FMINCON will not be the Hessian purely of your objective function. As explained in the documentation, it will be the Hessian of the entire Lagrangian (i.e., it includes the Hessian of your constraints). So, if there are any other constraints you've omitted in an effort to simplify presentation, that could be an explanation.
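For reference, the Hessian of the Lagrangian can be written as below. This is a sketch in notation chosen here for illustration: f stands for the objective y, c_i and ceq_j for the inequality and equality constraints, and \lambda for the Lagrange multipliers returned in lambda.

```latex
\nabla^2_{xx} L(x,\lambda) = \nabla^2 f(x)
  + \sum_i \lambda_i \, \nabla^2 c_i(x)
  + \sum_j \lambda_j \, \nabla^2 ceq_j(x)
```

Each posted c(i) depends only on min(i), so wherever its second derivative exists it can only contribute to the diagonal entry (i,i). Off-diagonal entries would therefore have to come from something not shown, or possibly from a Hessian approximation built by the solver itself.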
Other than that, I see one red flag issue. The use of expressions abs(min(i)) in cfcn makes your constraints non-differentiable, violating smoothness assumptions of FMINCON. If you weren't aware of differentiability requirements, I assume you could have violated them in truss2d() as well. This could explain why you're not converging to the required point, especially (but not necessarily) if some of the desired min(i) lie near zero.
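The kink in abs() can be seen numerically. The following is a minimal sketch, with the step size h and the variable names chosen here purely for illustration, comparing one-sided finite-difference slopes of the posted constraint at zero:

```matlab
% Constraint from cfcn for a single variable (any element other than 9).
c = @(x) -25000 + abs(x);

h = 1e-3;                             % illustrative step size (assumption)
slopeRight = (c(0 + h) - c(0)) / h;   % slope approaching from x > 0
slopeLeft  = (c(0) - c(0 - h)) / h;   % slope approaching from x < 0

% The one-sided slopes have opposite signs, so no derivative exists at x = 0.
fprintf('right: %g, left: %g\n', slopeRight, slopeLeft);
```

A gradient-based solver that samples the constraint on both sides of zero would see inconsistent slope information there.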
If the lb(i) you haven't shown are all >=0 then you can replace abs(min(i)) with just min(i) and that would solve the issue. A better solution though would be to rewrite the constraints you've shown as ub,lb bounds. For example
c(i) = -25000 + abs(min(i));
is equivalent to -25000<=min(i)<=25000 and you can use ub(i),lb(i) to express this constraint instead, merging with any lb that you already have.
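A minimal sketch of that conversion follows. The number of variables n and the existing lower bounds lbOld are assumptions, since neither is shown in the question; the limits 25000 and 75000 are taken from cfcn.

```matlab
% Express |x(i)| <= limit(i) as simple bounds instead of nonlinear constraints.
n = 10;                        % assumed number of design variables
limit = 25000 * ones(n, 1);    % limit for most elements, from cfcn
limit(9) = 75000;              % element 9 has the larger limit

lbOld = zeros(n, 1);           % assumed existing lower bounds (not shown in the post)

lb = max(lbOld, -limit);       % tighter of the old lower bound and -limit
ub = limit;                    % upper bound from the same inequality

% The nonlinear constraint argument is then left empty, e.g.:
% [X, fval] = fmincon(@nestedfcn, x0, [], [], [], [], lb, ub, [], options);
```

A side benefit, as far as the fmincon documentation describes it: the 'trust-region-reflective' algorithm accepts only bounds or linear equality constraints, so with a nonlinear constraint function such as @cfcn present, fmincon may switch to a different algorithm that does not use the user-supplied Hessian.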
  4 Comments
Phil on 17 Jul 2013
Thanks again. I have done what you suggested, with no success in narrowing down the problem. The problem I am solving is well documented with regard to the solution, and I am using the traditional initial guess.
When I run fmincon using minFmincon as the initial guess, there is little to no change in the output, which may indicate a local minimum. However, to satisfy the Karush-Kuhn-Tucker (KKT) conditions, fmincon should drive the gradient of the Lagrangian to zero (<http://www.mathworks.com/help/optim/ug/first-order-optimality-measure.html#brh0y76>). As you pointed out earlier, that gradient is either supplied by the user or calculated using finite differences, yet my final outputted gradient is not equal to zero. That is curious, unless the gradient that fmincon outputs is no longer the gradient of the Lagrangian once the constraints are taken into account.
Regardless of the non-zero gradient, my guess is that I have an issue with updating intermediate parameters used within the nestedfcn, i.e., the inputs to truss2d.
Thank you for all of your help; you have definitely clarified many of my issues and errors. While the problem is not solved yet, I don't believe that the remaining error is in the implementation of the built-in MATLAB functions (thanks to you), and I will rethink my intermediate updates and rewrite accordingly. Thanks again.
Matt J on 17 Jul 2013
Edited: Matt J on 17 Jul 2013
"but my final outputted gradient is not equal to zero."
I'm not sure whether the gradient output is the gradient of the objective function or of the Lagrangian. You can experiment with a few simple problems to check. Either way, even at an unconstrained minimum, the gradient wouldn't be exactly zero. There are other stopping criteria besides the first order optimality measure that FMINCON uses to decide whether to stop iterating. There's TolX, TolFun, etc... It certainly won't wait until you land smack dab on top of an optimal solution.
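One such experiment might look like the sketch below. It uses a hypothetical one-variable problem invented here for illustration (minimize (x-2)^2 subject to x <= 1), where the upper bound is active at the solution, so the objective gradient and the Lagrangian gradient differ clearly. It assumes the Optimization Toolbox is available.

```matlab
% Toy problem: minimize (x - 2)^2 subject to x <= 1 (bound active at the solution).
fun = @(x) (x - 2)^2;
opts = optimset('Display', 'off', 'Algorithm', 'interior-point');

% Argument order: fun, x0, A, b, Aeq, beq, lb, ub, nonlcon, options
[x, fval, exitflag, output, lambda, grad] = ...
    fmincon(fun, 0, [], [], [], [], [], 1, [], opts);

gradObjective  = 2*(x - 2);                     % analytic objective gradient at x
gradLagrangian = gradObjective + lambda.upper;  % adds the upper-bound multiplier term

% Compare the returned grad against both candidates.
fprintf('grad = %g, objective = %g, Lagrangian = %g\n', ...
    grad, gradObjective, gradLagrangian);
fprintf('first-order optimality = %g, exitflag = %d\n', ...
    output.firstorderopt, exitflag);
```

If grad matches gradObjective rather than gradLagrangian, the sixth output is the gradient of the objective alone, in which case a nonzero value at a constrained solution is expected. The printed output.firstorderopt and exitflag indicate how close the run came to satisfying the optimality conditions and which stopping criterion ended it.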
