fminsearch for a function f(x,y) = x^4 -3xy+2y^2
Star Strider on 13 Oct 2022
It depends how you set up the function, and whether you use the function by itself or the norm of the function.
The easiest way to find out is to do the experiment —
f = @(x,y) x.^4 -3*x.*y + 2*y.^2;
[B1,fval1] = fminsearch(@(b)f(b(1),b(2)), [0 0])
[B2,fval2] = fminsearch(@(b)norm(f(b(1),b(2))), [0 0])
[B3,fval3] = fminsearch(@(b)norm(f(b(1),b(2))), [1 1]*100)
[B4,fval4] = fminsearch(@(b)norm(f(b(1),b(2))), [1 1]*1E+12)
fsurf(f, [-1 1 -1 1]*2)
fcontour(f, [-1 1 -1 1]*2.5)
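A small check of what wrapping the function in norm does, as a sketch only. For a scalar value, norm returns the absolute value, so minimizing norm(f) looks for zeros of f rather than for its minima. The test point [3/4 9/16] is an assumption taken from setting the gradient to zero by hand.

```matlab
f = @(x,y) x.^4 - 3*x.*y + 2*y.^2;

% Stationary point obtained by hand from grad f = 0 (assumed here, not computed)
xmin = [3/4 9/16];

fAtMin    = f(xmin(1), xmin(2))          % negative: -81/256
normAtMin = norm(f(xmin(1), xmin(2)))    % norm of a scalar is abs(): +81/256

% The origin gives norm(f) = 0, which is smaller than normAtMin,
% so the norm-wrapped objective prefers a zero of f over the true minimum.
normAtOrigin = norm(f(0, 0))
```

So the calls that use norm and the call that uses f directly are solving different problems whenever f can go negative.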
John D'Errico on 13 Oct 2022
Edited: John D'Errico on 13 Oct 2022
This MUST be a homework question. But by now you have multiple answers that, I think, don't explain in any depth what you asked. (Star discussed some of it, but there is no need at all to use norm here, since your function returns a scalar.)
What is the shape of that function near the origin? First, learn to write using operators! MATLAB does not have implicit multiplication.
fun = @(x,y) x.^4 -3*x.*y+2*y.^2
Your function has a point of interest at the origin, but it is not a min or max. As you can see, the zero contours cross at that point. There are two local minima away from the origin, but the origin is a saddle point. We can see that here:
So there is a valley in one direction, where the function actually goes through a local max at the origin. But in the other direction, the function is minimized at the origin. This would be called a saddle point. You can identify it from the contours themselves, if you recognize that characteristic shape, where the zero contour wants to cross itself.
(To go into way more depth than this problem merits, I would point out that the Hessian matrix at that point for this function is indefinite. It is [0 -3; -3 4], with one negative and one positive eigenvalue, which is exactly what characterizes a saddle point.)
syms X Y
Gradxy = gradient(fun(X,Y))
xystat = solve(Gradxy == 0)
So there are three stationary points, one of them at the origin, and the other two live in the first and third quadrants.
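One way to classify those three stationary points is to look at the eigenvalues of the Hessian at each of them. This is only a sketch: the Hessian below is written out by hand (f_xx = 12*x^2, f_xy = -3, f_yy = 4), the name hessf is made up for illustration, and the two nonzero points are the closed-form solutions of the gradient equations.

```matlab
% Hand-derived Hessian of f(x,y) = x^4 - 3*x*y + 2*y^2
% (y does not appear in any second derivative, but is kept for a uniform call)
hessf = @(x,y) [12*x.^2, -3; -3, 4];

ev0 = eig(hessf(0, 0))          % mixed signs            -> saddle point
ev1 = eig(hessf(3/4, 9/16))     % both eigenvalues > 0   -> local minimum
ev2 = eig(hessf(-3/4, -9/16))   % same Hessian as ev1, since only x^2 enters
```

Mixed signs at the origin and two positive eigenvalues at the other two points agree with what the contour plot suggests.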
Now, try using fminsearch on this function of two variables.
opts.Display = 'iter';
funxy = @(xy) fun(xy(1),xy(2)); % make it a function of a vector of length 2
[xysol,fval] = fminsearch(funxy,[0 0],opts)
As you can see, it returns a solution that is not a local min. But at the same time, it does not return the point [0,0]. So what happened?
fminsearch starts out by choosing a small initial simplex, with one vertex at the start point. But in this case, all of the function values at the vertices of that initial simplex were small, and close to each other. fminsearch decided to then look INSIDE the simplex, rather than look further afield, where it might have found one of the global minima. After only a few such iterations, fminsearch decided that where you started it out was actually a solution. As you can see, it reports that the point it has does satisfy all the convergence criteria.
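To see how flat the function looks on that starting simplex, here is a rough reconstruction. It assumes the construction described in the fminsearch documentation (a step of 0.00025 along each coordinate when that component of the start point is zero) and the default TolFun of 1e-4. The names delta, V and spread are only for illustration, and the solver's internal bookkeeping may differ in detail.

```matlab
fun = @(x,y) x.^4 - 3*x.*y + 2*y.^2;

x0    = [0 0];
delta = 0.00025;                      % documented step for zero components of x0

% Three vertices of the assumed starting simplex, one per row
V  = [x0; x0 + [delta 0]; x0 + [0 delta]];
fv = fun(V(:,1), V(:,2));             % function value at each vertex

spread        = max(fv) - min(fv)     % about 1.25e-7
tolFunDefault = 1e-4;                 % default TolFun for fminsearch
looksFlat     = spread < tolFunDefault
```

The spread of function values is roughly three orders of magnitude below the default TolFun, so the function-value test is already satisfied before the simplex has moved at all.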
When instead, you start the solver out somewhere else, it can then converge to a valid solution.
[xysol,fval] = fminsearch(funxy,[-2,3],opts)
So here it does manage to stay clear of the stationary point at [0,0].
Essentially, fminsearch has a problem because the point [0,0] is a stationary point on the surface, where the gradient is exactly zero. So your function is locally flat there, and fminsearch becomes confused, because the initial simplex is small enough that the function appears close enough to constant that it gives up.
We can test that claim, by changing the parameters for fminsearch.
Do you see that the default tolerances for fminsearch are actually pretty large in this context? I'll change TolFun, so that fminsearch does not think it can terminate so easily.
opts.TolFun = 1e-12;
[xysol,fval] = fminsearch(funxy,[0 0],opts)
Indeed now, fminsearch finds one of the global solutions. Understanding the general algorithm for fminsearch is a valuable thing, as it can help you to see what fminsearch is doing, and to then diagnose strange behavior like this.
David Goodmanson on 13 Oct 2022
Edited: David Goodmanson on 14 Oct 2022
While it's useful to find out about how to get fminsearch to cooperate, that's hardly necessary here. For the minimum, you can calculate the gradient and set it to zero:
f(x,y) = x^4 -3xy +2y^2
grad f(x,y) = (4x^3 - 3y) ux + (4y -3x) uy
where ux and uy are unit vectors along x and y respectively:
grad_y = 0 --> y = (3/4)x.
plug that into grad_x.
grad_x = 0 --> 4x^3 - (9/4)x = 0 --> x^2 = 9/16 (also x = 0)
x = 3/4 y = 9/16
x = -3/4 y = -9/16
as is verified by the posted plots.
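If you want a numerical check of that hand calculation, something like the following should do. It just evaluates the gradient written above and the function at the three candidate points. The names gradf and pts are illustrative.

```matlab
f     = @(x,y) x.^4 - 3*x.*y + 2*y.^2;
gradf = @(x,y) [4*x.^3 - 3*y, 4*y - 3*x];   % gradient components from the derivation

% Candidate stationary points, one per row: origin plus the two minima
pts = [0 0; 3/4 9/16; -3/4 -9/16];

gnorm = zeros(size(pts,1), 1);
fvals = zeros(size(pts,1), 1);
for k = 1:size(pts,1)
    gnorm(k) = norm(gradf(pts(k,1), pts(k,2)));  % should vanish at every row
    fvals(k) = f(pts(k,1), pts(k,2));            % 0 at the origin, -81/256 elsewhere
end
table(pts, gnorm, fvals)
```

Both nonzero points give the same value, -81/256, which is below f(0,0) = 0, so the origin cannot be the minimum.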