FMINUNC CHECK GRADIENT FAILS

Hello everyone,
I'm trying to minimize the function below with fminunc, by running:
[ygrad, cost] = tvd_sim_grad(x, lam, Nit, t);
where x is a 4096x1 double and lam, Nit, t are 1x1 doubles.
function [xden,fval] = tvd_sim_grad(y, lam, Nit, t)
rng default % for reproducibility
ycut = double(abs(y)-t > 0); % flag outliers (t = variance from robust covariance estimation)
yind = find(ycut == 1);
y(yind) = t;                 % clip outliers to t
y = y + 1;                   % shift away from the neighborhood of zero
y0 = y;
ObjectiveFunction = @(y) tvd_sim2(y, y0, lam);
options = optimoptions('fminunc','MaxIter',Nit,'ObjectiveLimit',0,'MaxFunEvals',Inf,'TolFun',1e-20,...
    'TolX',1e-20,'UseParallel',false,'SpecifyObjectiveGradient',true,'CheckGradients',true,...
    'FinDiffRelStep',1e-10,'DiffMinChange',0,'DiffMaxChange',Inf,'Diagnostics','off','Algorithm','quasi-newton',...
    'HessUpdate','bfgs','FinDiffType','central','HessianFcn',[],...
    'PlotFcns','optimplotfval','Display','final-detailed');
[xden,fval] = fminunc(ObjectiveFunction, y, options);
xden = xden - 1;             % shift back (zero realignment)
end
function [TVD,mygrad] = tvd_sim2(x, y, lam)
TVD = 1/2.*sum(abs((y-x).^2)) + lam.*sum(abs(diff(diff(-y./(1-x.*y-x.^2)))));
f = @(x) 1/2.*sum(abs((y-x).^2)) + lam.*sum(abs(diff(diff(-y./(1-x.*y-x.^2)))));
mygrad = gradient(f(x')); mygrad = mygrad';
end
This is a modification of total variation denoising that I created to make the objective itself differentiable (the original is not differentiable in its second term). The function is differentiable everywhere except at 0, so, as you can see, I modified the dataset to avoid zeros; the data is now concentrated around the value 1.
When I use:
'SpecifyObjectiveGradient',false
I obtain this great result (red line is xden):
But when I use:
'SpecifyObjectiveGradient',true
it performs 0 iterations and fails, returning:
Optimization stopped because the objective function cannot be decreased in the
current search direction. Either the predicted change in the objective function,
or the line search interval is less than eps.
'CheckGradients',true
gives me:
Objective function derivatives:
Maximum relative difference between supplied
and finite-difference derivatives = 33382.1.
Supplied derivative element (1012,1): 0.480282
Finite-difference derivative element (1012,1): -33381.7
CheckGradients failed.
____________________________________________________________
Error using validateFirstDerivatives (line 102)
CheckGradients failed:
Supplied and finite-difference derivatives not within 1e-06.
How can I get the results above while supplying the gradient, and why doesn't it work?
Thanks!

13 Comments

Matt J on 19 Dec 2022 (edited)
Your post doesn't show how you ran your code, but just for starters, the gradient() command does not take the analytical gradient of an anonymous function. You imagined somehow that it does.
Hi!
I simply run:
[ygrad, cost] = tvd_sim_grad(x, lam, Nit, t);
where x is a 4096x1 double and lam, Nit, t are 1x1 doubles.
I think syms calculates the analytical gradient, and I've used it before. But why do you say I need it?
Why can't I use gradient() on an anonymous function to obtain a numerical gradient?
Is that wrong?
This is the reason why I avoided using syms:
Thanks!
What is size(mygrad) in tvd_sim2?
Matt J on 19 Dec 2022 (edited)
"where x is a 4096x1 double and lam, Nit, t are 1x1 doubles."
I suggest posting them in a .mat file.
"Why can't I use gradient() on an anonymous function to obtain a numerical gradient?"
Because that's not one of its capabilities, as you should see from the documentation.
"gradient" in your case computes
[f(2)-f(1);0.5*(f(3)-f(1));...;0.5*(f(end)-f(end-2));f(end)-f(end-1)]
where
f = f(x')
I doubt this is what you want.
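For example, with plain numeric data (a quick illustration; the sample vector is made up):
v = [1 4 9 16];    % sample numeric vector
g = gradient(v)    % returns [3 4 6 7]: one-sided differences at the ends,
                   % centered differences inside -- no symbolic math involved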
Emiliano Rosso on 19 Dec 2022 (edited)
mygrad is 4096x1 and fminunc accepts it.
I added the .mat workspace after the process finished.
Thanks!
I think I understand the problem.
This is new code I wrote to compute the gradient:
TVD = 1/2.*sum(abs((y-x).^2)) + lam.*sum(abs(diff(diff(-y./(1-x.*y-x.^2)))));
f = @(x) 1/2.*sum(abs((y-x).^2)) + lam.*sum(abs(diff(diff(-y./(1-x.*y-x.^2)))));
delta = 1e-6;
mygrad(1:4096,1) = 0;
for i = 1:4096
    xm = x; xm(i,1) = xm(i,1) - delta;
    xp = x; xp(i,1) = xp(i,1) + delta;
    mygrad(i,1) = (f(xp)-f(xm))./2*delta;
end
It works, but it's slower than fminunc's own finite differences!
It really works, although it should be
mygrad(i,1) = (f(xp)-f(xm))/(2*delta);
instead of
mygrad(i,1) = (f(xp)-f(xm))./2*delta;
? (The second form computes (f(xp)-f(xm))*delta/2, which rescales every gradient component by a factor of delta^2 = 1e-12 but leaves the direction unchanged, so it can still appear to work.)
But not well, I guess.
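For reference, the corrected loop wrapped back into tvd_sim2 as the user-supplied gradient might look like this (a minimal sketch based on the code above, not the poster's exact file):
function [TVD, mygrad] = tvd_sim2(x, y, lam)
f = @(x) 1/2.*sum(abs((y-x).^2)) + lam.*sum(abs(diff(diff(-y./(1-x.*y-x.^2)))));
TVD = f(x);                        % objective value at the current point
delta = 1e-6;                      % finite-difference step
n = numel(x);
mygrad = zeros(n,1);
for i = 1:n
    xp = x; xp(i) = xp(i) + delta; % forward-perturbed point
    xm = x; xm(i) = xm(i) - delta; % backward-perturbed point
    mygrad(i) = (f(xp) - f(xm)) / (2*delta); % central difference, correctly scaled
end
end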
Emiliano Rosso on 20 Dec 2022 (edited)
Sure!
I think what I made is quite similar to a finite difference...
But what if I wanted to calculate the analytic gradient?
syms gives me strange results, as you can see here...
Your function is not differentiable everywhere since it contains "abs" expressions.
Did you change
mygrad(i,1) = (f(xp)-f(xm))./2*delta;
to
mygrad(i,1) = (f(xp)-f(xm))/(2*delta);
?
Does it run better with the corrected derivative?
Yes, I changed it; the time is the same.
Thanks!
A one-sided finite-difference approximation of the derivative instead of a centered one will halve the number of function calls...
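For instance, reusing f, x, and delta from the loop above (an untested sketch):
f0 = f(x);                          % base value: computed once, reused for every coordinate
n = numel(x);
mygrad = zeros(n,1);
for i = 1:n
    xp = x; xp(i) = xp(i) + delta;
    mygrad(i) = (f(xp) - f0) / delta;   % forward difference: one call per coordinate
end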
Thanks, I save 5 seconds!
I discovered I can use:
TVD = 1/2.*sum((y-x).^2) + lam.*sum((diff(diff(-y./(1-x.*y-x.^2)))).^2);
instead of using abs, and it works the same... I think it's now differentiable and I can extract the analytic function!
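One way to extract that analytic gradient is the Symbolic Math Toolbox. A hypothetical sketch (note that diff applied to a symbolic vector means differentiation, so the successive differences are written out by hand; building the expression for n = 4096 may be slow, but it only has to be done once):
n = 6;                                  % small n for illustration
x = sym('x', [n 1], 'real');
y = sym('y', [n 1], 'real');
lam = sym('lam', 'real');
u = -y./(1 - x.*y - x.^2);
d1 = u(2:end) - u(1:end-1);             % first successive difference (numeric diff equivalent)
d2 = d1(2:end) - d1(1:end-1);           % second successive difference
TVD = 1/2*sum((y - x).^2) + lam*sum(d2.^2);
g = gradient(TVD, x);                   % analytic gradient with respect to x
gfun = matlabFunction(g, 'Vars', {x, y, lam});  % fast numeric handle usable in tvd_sim2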
In any case, what I have done up to now is to supply the finite differences myself, so the computation times are the same as fminunc's internal differencing; indeed, mine takes a few seconds longer, and I would be surprised otherwise. Without an analytic function I can't cut the times down.
Alternatively, I'm trying to optimize the gradient-computation code, which MATLAB normally can't do because it receives many different functions.


Answers (0)

Release: R2020b
Asked on 19 Dec 2022; edited on 21 Dec 2022.
