When is minimum p-norm solution independent of p?

I have created three optimization models with the same form of objective function but different norms (L1, L2, L∞), subject to the same constraints, as shown below.
‖F1*w + F0‖_L1 = min. (first objective function)
‖F1*w + F0‖_L2 = min. (second objective function)
‖F1*w + F0‖_L∞ = min. (third objective function)
subject to the same constraints in each case.
The results of the first two models (L1 and L2) are identical, while the result of the third model (L∞) is very close to them.
My question is: is there an explanation for the identical L1 and L2 results, and for the near-identical L∞ result?
Thanks

 Accepted Answer

Yes, there are choices of F0 that can make this happen, e.g. one whose median, mean, and midrange all coincide:
F0=[1 2 3 4 5];
fminsearch(@(w) norm(w-F0,1),3)
ans = 3
fminsearch(@(w) norm(w-F0,2),3)
ans = 3
fminsearch(@(w) norm(w-F0,inf),3)
ans = 3

8 Comments

Do you mean that this can happen when F0 is a vector? Please explain, and if you have a reference, please provide it.
I think everyone who reads your post will assume F0 is a vector. You haven't said otherwise.
Another case where this can happen is when an exact solution to F_1*w0+F0=0 exists (and satisfies the constraints). Such a w0 will be optimal with respect to any norm objective, with an optimal objective value of zero.
That happens, but it's a rare coincidence. For a known vector v and an unknown scalar x:
argmin_x ‖x-v‖_2 = mean(v)
argmin_x ‖x-v‖_1 = median(v)
argmin_x ‖x-v‖_inf = (min(v)+max(v))/2
Matt provides an example v for which these three quantities are the same. In general they are NOT.
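A quick numerical check of these three closed-form minimizers (a sketch in Python/NumPy, with a made-up example vector v, chosen so that the three minimizers visibly differ):

```python
import numpy as np

# Known data vector v; we minimize ||x - v||_p over a scalar x.
v = np.array([1.0, 2.0, 7.0, 9.0, 100.0])

x_l2 = v.mean()                    # argmin of ||x - v||_2
x_l1 = np.median(v)                # argmin of ||x - v||_1
x_linf = (v.min() + v.max()) / 2   # argmin of ||x - v||_inf

# Brute-force verification on a fine grid of candidate x values.
grid = np.linspace(v.min(), v.max(), 200001)

def argmin_norm(p):
    costs = np.linalg.norm(grid[:, None] - v[None, :], ord=p, axis=1)
    return grid[np.argmin(costs)]

# The three minimizers differ: 7.0 (L1), 23.8 (L2), 50.5 (Linf).
print(x_l1, x_l2, x_linf)
```

For Matt's F0 = [1 2 3 4 5], all three formulas give 3, which is exactly why his three fminsearch calls agree.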
There is no reason why these solutions would miraculously merge for problems with a linear operator in the cost function. It might be caused by the constraints: they are so tight that the solution is driven by the constraints alone and becomes a vertex of the admissible set, and that vertex can coincide for the different objective norms.
There is no reason why these solutions would miraculously merge for problems with a linear operator in the cost function.
Here's a case with a linear operator,
F1 =[
0.7094 0.6797 0.1190
0.7547 0.6551 0.4984
0.2760 0.1626 0.9597];
F0=[ -0.7665
-0.9862
-0.7344];
w0=[0.9,0.3,0.5].';
opts=optimset('MaxIter',5e4,'MaxFunEvals',5e6,'TolFun',1e-12,'TolX',1e-12);
fminsearch(@(w) norm(F1*w+F0,1), w0, opts)
ans = 3×1
0.7504 0.2559 0.5061
fminsearch(@(w) norm(F1*w+F0,2),w0, opts)
ans = 3×1
0.7503 0.2560 0.5061
fminsearch(@(w) norm(F1*w+F0,inf),w0, opts)
ans = 3×1
0.7503 0.2560 0.5061
This example has a unique trivial solution (w = -inv(F1)*F0) that makes all the norms equal to 0.
You can even check with argmin norm(F1*w+F0,p) for any p >= 1.
Why is that "trivial"? Because the solution is unique? The point of the question was to find reasons why the solutions for all norms might be the same. Here is a case where the solution is not unique:
F1 =[ 0.8143 0.3500 0.8143
0.2435 0.1966 0.2435
0.9293 0.2511 0.9293];
F0=-sum(F1,2);
opts=optimset('MaxIter',5e4,'MaxFunEvals',5e6,'TolFun',1e-12,'TolX',1e-12);
w0=[2,1,0].';
fminsearch(@(w) norm(F1*w+F0,1), w0, opts)
ans = 3×1
2.0000 1.0000 0.0000
fminsearch(@(w) norm(F1*w+F0,2),w0, opts)
ans = 3×1
2.0000 1.0000 -0.0000
fminsearch(@(w) norm(F1*w+F0,inf),w0, opts)
ans = 3×1
2.0000 1.0000 0.0000
w0=[1,1,1].';
fminsearch(@(w) norm(F1*w+F0,1), w0, opts)
ans = 3×1
1 1 1
fminsearch(@(w) norm(F1*w+F0,2),w0, opts)
ans = 3×1
1 1 1
fminsearch(@(w) norm(F1*w+F0,inf),w0, opts)
ans = 3×1
1 1 1
In my case I get the same solution for w with the different norms. But when I change w0, I get a new solution, and again the same values for the different norms. Every time: a new solution, and the same values for L1, L2, and L∞.
F0=[50 50 50 5 50 50 50 ];
F1 is a 7×35 matrix
w is a 35×1 vector
the constraints are the same for the three models (L1, L2, and L∞)


More Answers (2)

This code is a 2×2 toy example with random f0. It shows that most of the time the solution is driven by the linear constraints, and all three norms give [1;1], which is a sharp vertex of the feasible region; the solver converges there regardless of the selected norm.
Some other times they give different results (not at a vertex of the feasible region). As long as the dimension of the span of the active constraints is less than the dimension of the problem (the size of x), the solution still has some degrees of freedom to move, and the objective norm effectively matters (cf. the simple mean/median/(min+max)/2 example).
x = optimvar('x',2,1,'LowerBound',0);
A=[-7 8;
8 -7];
b=[1; 1];
f0 = 2*rand(2,1);
L1prob = optimproblem();
L1prob.Constraints.ineq = A*x <= b;
L1prob.Objective = sum(f0-x); % linear stand-in for the L1 norm (exact when x <= f0 on the feasible set)
L1sol = solve(L1prob);
L1sol.x
L2prob = L1prob;
L2prob.Objective = sum((f0-x).^2);
L2sol = solve(L2prob);
L2sol.x
L10prob = L1prob;
L10prob.Objective = sum((f0-x).^10); % ~Linf; norm(...,inf) cannot be translated by the problem-based framework
L10sol = solve(L10prob, struct('x',[0;0]));
L10sol.x
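The same vertex effect can be sketched with the true p-norms as objectives (a hypothetical Python/SciPy version of the toy problem above, with a fixed f0 chosen outside the feasible set; the matrix A, bound x >= 0, and vertex [1;1] are taken from the MATLAB example):

```python
import numpy as np
from scipy.optimize import minimize

# Toy feasible set from the example: x >= 0, A*x <= b, sharp vertex at [1, 1].
A = np.array([[-7.0, 8.0], [8.0, -7.0]])
b = np.array([1.0, 1.0])
f0 = np.array([1.8, 1.9])   # one possible draw of 2*rand(2,1), outside the feasible set

# A*x <= b expressed in SciPy's inequality convention fun(x) >= 0.
con = {'type': 'ineq', 'fun': lambda x: b - A @ x}
bounds = [(0.0, None), (0.0, None)]

sols = {}
for p in (1, 2, np.inf):
    res = minimize(lambda x, p=p: np.linalg.norm(f0 - x, ord=p),
                   x0=[0.0, 0.0], bounds=bounds, constraints=[con])
    sols[p] = res.x

# All three norms are driven to the same sharp vertex [1, 1].
for p, x in sols.items():
    print(p, np.round(x, 4))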
So yes, it's possible. Whether this explanation is suitable for YOUR problem is hard to tell with so few details.

13 Comments

But I have linear and nonlinear constraints
Can I send the complete code?
But I have linear and nonlinear constraints
But are any of the nonlinear constraints active at the solution you're getting?
What do you mean by "active at the solution"?
Are the nonlinear equalities satisfied with equality, within the ConstraintTolerance?
Whether the constraints are linear or nonlinear is irrelevant to the reason I suggest. The important thing is that the solution might get stuck at a vertex. The sharpness is more relevant, meaning the linearized (active) constraints are strongly mutually anti-correlated.
Yes, they are satisfied, but as inequalities, not equalities.
Then they are not active, and may as well not be there. The solution you're getting will remain the solution and satisfy the nonlinear constraints even if you tell the solver to ignore them.
But Bruno is right. Whether your constraints are linear or nonlinear, they can still determine the solution independently of the objective norm.
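Matt's point about inactive constraints can be checked numerically (a hypothetical Python/SciPy sketch, not the original problem: a squared-L2 objective with one nonlinear inequality that is slack at the optimum):

```python
import numpy as np
from scipy.optimize import minimize

# Toy objective: squared L2 distance to a target f0.
f0 = np.array([0.2, 0.3])
obj = lambda x: np.sum((x - f0) ** 2)

# Nonlinear inequality x1^2 + x2^2 <= 4, in SciPy's fun(x) >= 0 convention.
# The unconstrained minimizer f0 lies well inside the disk, so the
# constraint is inactive (strict inequality) at the solution.
con = {'type': 'ineq', 'fun': lambda x: 4.0 - x[0]**2 - x[1]**2}

x_with = minimize(obj, x0=[1.0, 1.0], constraints=[con]).x
x_without = minimize(obj, x0=[1.0, 1.0]).x

# The inactive constraint does not influence the solution at all.
print(np.round(x_with, 4), np.round(x_without, 4))
```

An active constraint, by contrast, would pin the solution to its boundary and could make it independent of the norm, which is exactly the scenario discussed above.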
Comment restored (please do not delete a comment that has been addressed):
Ahmed Galal: "Does this mean that the norm in the objective function is useless?"
NOOOOOO, sometimes it does, sometimes it does not.
From what we have shown you, nothing allows such a black-and-white conclusion.
It depends on the constraints you impose, which of them are active, their sharpness, and the location of F0. It can give the same solution or not.
What I understood is that the results are identical because the constraints are strongly mutually anti-correlated.
So I can say that any one of these models can be used, but I cannot prefer one of them.
If two models have the identical result (w), does this mean that their objective functions must have the same min. value?
1. If you have a nice (convex) objective function f(x), define g(x) := 2*f(x), and search for
xminf = argmin f(x)
xming = argmin g(x)
by whatever method. Is xminf = xming? What are the values of f and g at their respective argmins?
2. Do you know that you can check the min value using the second output when calling a MATLAB minimizer? (fmincon is given as an example, but any MATLAB solver has the same output convention.)
[xmin, fmin, ...] = fmincon(...)
These two points should give you hints to answer your question.
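Bruno's two hints, sketched in Python with SciPy as a stand-in for fmincon (the convex f here is made up for illustration): scaling the objective leaves the argmin unchanged but scales the reported minimum value, which you can read from the solver's output.

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 3.0) ** 2 + 1.0   # convex, argmin at x = 3, f(3) = 1
g = lambda x: 2.0 * f(x)                # g := 2*f, same argmin, doubled value

resf = minimize(f, x0=[0.0])
resg = minimize(g, x0=[0.0])

# Same minimizer...
print(resf.x, resg.x)        # both ~[3.]
# ...but different minimum values (res.fun plays the role of fmincon's
# second output fmin):
print(resf.fun, resg.fun)    # ~1.0 vs ~2.0
```

So identical minimizers do not imply identical minimum objective values, and p-norm objectives with different p are not even scalings of each other.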


Bruno Luong on 7 Dec 2020 (Edited: 7 Dec 2020)
The optimizers may fail to initialize (i.e., fail to find a first feasible point with their gradient steps) and simply return the initial guess, which is then identical across the three models.

6 Comments

I changed the initial data and got a different F0, and also got slight differences in the results of w for the different norms. Moreover, I found that the standard deviation of the results of w based on L2 is relatively greater than that of L1.
How can the standard deviation of L2 not be less than the standard deviation of L1?
The standard deviation for the L2 norm must always be less than the STD for the L1 norm! These are the basic principles of these methods!
Hence you can conclude that the L2 minimization is not working: it failed, did not converge, or converged to a local minimum.
Does this mean that L2 is not suitable for this objective function?
The standard deviation for the L2 norm must always be less than the STD for the L1 norm!
That might be true if the elements F0(i) are uniformly distributed, if F1 = eye(N), and if you dropped the constraints, but I don't see why it would be true otherwise. The L2 norm you are using is not weighted by the distribution of F0, as far as you've told us.
So the standard deviation for the L2 norm must be less than the STD for the L1 norm only if one of these three conditions holds, or all of them hold at the same time?
And what if the std of L∞ is the lowest? Is that also possible?
I'm just saying I'm struggling to see a statistical argument that would guarantee that STD for the L2 norm would be less than for the L1 norm. Assuming F0 is supposed to be Gaussian distributed N(-F1*wtrue,sig*I), then it is known that a minimum variance unbiased estimator of wtrue is a linear function of F0, but the min. L2 norm estimator is only a linear function of F0 in the unconstrained case, so I don't know how we extend things to the constrained case. Also, F1 is rank deficient in this case, so I don't immediately see how that will affect things either.


Asked: on 5 Dec 2020
Edited: on 8 Dec 2020
