# Can I pass these nonlinear constraints to lsqnonlin?

### Accepted Answer

##### 44 Comments

**That seems to be the same as Laplacian(f)=0.**

But f is not a second order tensor in the Laplacian.

**You've restated here that f should be convex, but still haven't explained why. You also haven't explained why this convexity needs to be enforced at all iterations.**

f(C) should be polyconvex, a requirement that is often weakened to requiring f(x,y) to be convex, where x and y are the first two principal invariants of C. I solve the PDE with finite elements. Since it is nonlinear in the displacement field U, a Newton scheme is applied in which the increments are obtained by solving the linear system K*deltaU = -R.

If f(x,y) is convex, it can be guaranteed that the PDE has a unique solution and that K is PSD. We cannot guarantee that if f(x,y) is not convex. If convexity is not enforced at all iterations, I often see that K is rank deficient, i.e. I cannot solve the PDE.
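To make the Newton step concrete, here is a minimal scalar sketch. The energy f(u) = cosh(u) - load*u is a made-up convex example, not the material model from this thread; its residual is R = f'(u) and its tangent is K = f''(u) = cosh(u) > 0, so every step K*deltaU = -R is solvable.

```python
import math

def newton_solve(residual, tangent, u0, tol=1e-12, max_iter=50):
    """Solve residual(u) = 0 with Newton steps K * delta_u = -R (scalar case)."""
    u = u0
    for _ in range(max_iter):
        R = residual(u)
        if abs(R) < tol:
            return u
        K = tangent(u)
        if K <= 0.0:
            # For a non-convex energy the tangent can lose positivity.
            raise ValueError("tangent K is not positive; Newton step is unreliable")
        u += -R / K
    raise RuntimeError("Newton iteration did not converge")

# Hypothetical convex energy: f(u) = cosh(u) - load * u.
load = 2.0
u_star = newton_solve(lambda u: math.sinh(u) - load,   # R = f'(u)
                      lambda u: math.cosh(u),          # K = f''(u) > 0
                      u0=0.0)
```

In the scalar case "K is PSD" reduces to K >= 0. With a non-convex energy the same loop can hit K <= 0, which is the one-dimensional analogue of the rank-deficient K mentioned above.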

**Strengthened, I think you mean.**

Yes.

**Ah well. We've already discussed what I think about solving PDEs inside objective functions...
One question. Is the converse true? If you have an f that solves the PDEs, will it necessarily be convex or polyconvex?**

It is on my list for the future to implement the PDE as a nonlinear constraint.

No. If f solves the PDE, f need not be convex/polyconvex. Does this help?

Also, was the information that I provided helpful for suggesting a different set of basis functions or a different approximation of f? You mentioned that there are also convex basis functions or semi-local approximation strategies.

You often have great ideas, and I really hope you have one here once again :)

**You would have to solve a potentially non-sparse linear system to find coefficients E_ij giving a certain set of f(xi,yj) values. That means that the number of basis functions you could have might have to be somewhat modest.**

I thought the approximation f = sum N_ij implies that we have as many basis functions as there are nodal points. So what do you mean by the "number of basis functions" having to be modest?

Let me finally clarify a few points regarding

N_i(x,y) = E_i*log( 1 + exp(a_i*(x-x_i)) + exp(b_i*(y-y_i)) )/log(3)

1. Is the purpose of the factor 1/log(3) to make sure that N_i(x_i,y_i)=E_i? Is that necessary at all, given that the basis is not interpolating?

2. What is the purpose of log(...), i.e. why not simply

N_i(x,y) = E_i*( 1 + exp(a_i*(x-x_i)) + exp(b_i*(y-y_i)) )
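Both questions can be checked numerically. The sketch below uses made-up parameter values and two hypothetical helper names (`n_log`, `n_plain`). It evaluates the log form at its own node, and it compares the mixed difference N(x1,y1) - N(x1,y0) - N(x0,y1) + N(x0,y0) of the two forms, which is zero exactly when a function splits as h(x)+g(y) on those points.

```python
import math

def n_log(x, y, E_i=1.0, a_i=1.0, b_i=1.0, x_i=0.0, y_i=0.0):
    """N_i = E_i * log(1 + exp(a_i*(x-x_i)) + exp(b_i*(y-y_i))) / log(3)."""
    s = 1.0 + math.exp(a_i * (x - x_i)) + math.exp(b_i * (y - y_i))
    return E_i * math.log(s) / math.log(3.0)

def n_plain(x, y, E_i=1.0, a_i=1.0, b_i=1.0, x_i=0.0, y_i=0.0):
    """Same expression without the log and without the 1/log(3) factor."""
    return E_i * (1.0 + math.exp(a_i * (x - x_i)) + math.exp(b_i * (y - y_i)))

def mixed_difference(f, x0, x1, y0, y1):
    """Vanishes for every rectangle if f(x, y) = h(x) + g(y)."""
    return f(x1, y1) - f(x1, y0) - f(x0, y1) + f(x0, y0)

# At the node both exponentials equal 1, so log(3)/log(3) = 1 and N_i = E_i.
node_value = n_log(3.0, 4.0, E_i=2.5, a_i=1.3, b_i=0.7, x_i=3.0, y_i=4.0)

mixed_log = mixed_difference(n_log, 0.0, 1.0, 0.0, 1.0)      # clearly non-zero
mixed_plain = mixed_difference(n_plain, 0.0, 1.0, 0.0, 1.0)  # zero up to rounding
```

So the 1/log(3) factor only normalises the value at the node, while dropping the log turns the basis function into a sum of a function of x and a function of y.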

**(1)That would make the basis function separable in x,y. Not sure that would be a good basis unless f(x,y) is also separable in x and y**

Most of the f(x,y) that I am working on are indeed of the form f(x,y)=h(x)+g(y). Given that, do you think

N_i(x,y) = E_i*log( 1 + exp(a_i*(x-x_i)) + exp(b_i*(y-y_i)) )

is appropriate?

**you will want to use log1p for small values of the operands, a_i\*(x-x_i)<=33 and b_i\*(y-y_i)<=33, and you will need to use an appropriate Taylor approximation when either of the exponential terms exp(a_i\*(x-x_i)), exp(b_i\*(y-y_i)) overflows.**

I think x_i,y_i,E_i<=10 are reasonable upper bounds, so overflow should not be a concern. Why 33 in particular (a_i*(x-x_i)<=33)? The examples on your link use log1p(x) for quite small values of x.
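For reference, one common way to evaluate log(1 + exp(u) + exp(v)) safely is sketched below, with u = a_i*(x-x_i) and v = b_i*(y-y_i). Note that it uses the usual max-shift trick combined with log1p, not the Taylor approximation mentioned in the quote, and the function name is hypothetical.

```python
import math

def log1p_sum_exp(u, v):
    """Evaluate log(1 + exp(u) + exp(v)) without overflow."""
    terms = [0.0, u, v]      # exponents of 1 = exp(0), exp(u), exp(v)
    m = max(terms)
    terms.remove(m)          # drop one copy of the largest exponent
    # Every remaining argument is <= 0, so exp cannot overflow.
    rest = sum(math.exp(t - m) for t in terms)
    # log(exp(m) * (1 + rest)) = m + log1p(rest); log1p keeps tiny `rest` accurate.
    return m + math.log1p(rest)

moderate = log1p_sum_exp(1.0, 2.0)      # agrees with the naive formula
huge = log1p_sum_exp(1000.0, -1000.0)   # naive math.exp(1000.0) would overflow
tiny = log1p_sum_exp(-40.0, -40.0)      # naive log(1 + 2*exp(-40)) rounds to 0
```

The shift handles large exponents and log1p handles the case where both exponentials are far below machine precision relative to 1, so no case distinction on a threshold is needed in this variant.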

### More Answers (1)

##### 89 Comments

**the convexity conditions E(i+1)-2\*E(i)+E(i-1)>=0 do not have to be satisfied at all i. It is sufficient for this to hold for i=2,...,n-1**

But if we just have E(n)>=E(n-1), convexity at the boundary cannot be guaranteed. Why do you think the constraints for i=2,...,n-1 are sufficient?

**So there is reason to think that all 3\*n parameters could be estimated, assuming you have 3\*n sample points**

So I should have at least 3n sample points to determine the 3n parameters? Did you mean that?

**There is no other constraint involving E(n) that I've proposed.**

Indeed, you did not. But I thought E(n)-E(n-1)>=0 would be necessary in addition to ensure convexity. So I do not need this constraint?

Could it be that E(n)-E(n-1)>=0 along with E(n)-2*E(n-1)+E(n-2)>=0 makes f non-convex?
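A small check of the two kinds of conditions, assuming equally spaced nodes and reading the nodal values E as a piecewise-linear interpolant (both are assumptions of this sketch, and the helper names are hypothetical). The interior second differences test convexity, whereas E(n)>=E(n-1) only says that the last segment does not decrease.

```python
def second_differences(E):
    """E[i+1] - 2*E[i] + E[i-1] at the interior nodes (uniform spacing assumed)."""
    return [E[i + 1] - 2 * E[i] + E[i - 1] for i in range(1, len(E) - 1)]

def is_discretely_convex(E, tol=0.0):
    """True if all interior second differences are >= -tol."""
    return all(d >= -tol for d in second_differences(E))

# Samples of (x - 3)^2 at x = 0, 1, 2, 3: convex although E(n) < E(n-1).
decreasing_convex = [9.0, 4.0, 1.0, 0.0]

# E(n) >= E(n-1) holds, but the slope drops from 3 to 1, so it is not convex.
increasing_nonconvex = [0.0, 3.0, 4.0, 5.0]

print(second_differences(decreasing_convex), is_discretely_convex(decreasing_convex))
print(second_differences(increasing_nonconvex), is_discretely_convex(increasing_nonconvex))
```

Under these assumptions the first-difference condition at the boundary is neither necessary nor sufficient for convexity. Adding it to the second-difference constraints does not destroy convexity; it additionally forces the last segment to be non-decreasing.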

**I can modify my example as below to get poor accuracy again, even with your actual tolerances.**

As you said, setting the tolerances to 1e-12 is probably a good choice. I think I read in a different chat that one should not go below 1e-14. Of course, as your example shows, one could create cases where this may still not be sufficient.

**Even in my example with (x-1)^4, the objective is not perfectly flat anywhere, but you can still see that it is flat enough to cause early termination.**

The fact that the optimizer pulls away from ground truth, along with exitflag=2, suggests that my objective is "quite" flat around ground truth, right?
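The (x-1)^4 example can be quantified with plain arithmetic. The sketch below asks how far x may sit from the minimiser x=1 while the objective value, or its gradient 4*(x-1)^3, is already below a given tolerance. This is a simplified absolute test chosen for illustration; it is an assumption of the sketch and not the exact stopping rule that lsqnonlin applies.

```python
def flat_radius_objective(tol):
    """Largest |x - 1| with (x - 1)**4 <= tol."""
    return tol ** 0.25

def flat_radius_gradient(tol):
    """Largest |x - 1| with |4 * (x - 1)**3| <= tol."""
    return (tol / 4.0) ** (1.0 / 3.0)

for tol in (1e-6, 1e-12):
    print(f"tol={tol:g}: objective below tol within {flat_radius_objective(tol):.3g}, "
          f"gradient below tol within {flat_radius_gradient(tol):.3g}")
```

Even with tol = 1e-12 the objective is below the tolerance everywhere within about 1e-3 of the minimiser, because (1e-3)^4 = 1e-12. A flat objective therefore lets a solver stop at a point that is still visibly away from the ground truth.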

**where D_g and D_h are the D parameters (from our earlier discussion) of g and h**

I discretized g and h with five parameters each. Do you mean that D_g and D_h are the solution that I got in the noisy case? Would the associated Jacobian not be zero then, if D_g and D_h are constant numbers?

What you say makes total sense. One point:

**so the second derivatives will merely be encouraged to be small in magnitude rather than required to have some exact value, as with constraints.**

My motivation for multiplying A by 1e4 was that some components of 1e3*A*E were still greater than zero, so I increased the exponent by one. What I want to say is that with a factor c in c*A*E, I influence the sign and the magnitude simultaneously, right?

And does this factor really set the derivatives to an exact value ("...required to have some exact value, as with constraints")? I mean, is there not also a merit function internally whose goal is to decrease the constraint violations?

**It's hard to know what's really happening, since I don't know which of lsqnonlin's algorithms you are running.**

I use lsqnonlin's interior-point algorithm which, if I am not mistaken, calls fmincon's interior-point algorithm internally. Given that, do you know what the scaling of the A matrix affects qualitatively? Just magnitude or sign, or both?
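A sketch of what a positive factor c does to a linear constraint c*A*E <= 0. It assumes that the solver accepts a point when the largest violation is below an absolute constraint tolerance; that acceptance rule and the value of `tol` are assumptions of this example, and the numbers are made up.

```python
def max_violation(A, E, c=1.0):
    """Largest entry of c*A*E, where the constraint reads c*A*E <= 0."""
    return max(c * sum(a * e for a, e in zip(row, E)) for row in A)

A = [[-1.0, 2.0, -1.0]]        # -(E1 - 2*E2 + E3) <= 0: convexity at the middle node
E = [1.0, 1.00000025, 1.0]     # slightly concave, violation of about 5e-7
tol = 1e-6                     # hypothetical absolute constraint tolerance

unscaled = max_violation(A, E)         # positive, but below tol
scaled = max_violation(A, E, c=1e4)    # same sign, now far above tol

print(unscaled <= tol, scaled <= tol)
```

A positive c never changes the sign of any component of A*E, so the exactly feasible set stays the same. What changes is the magnitude of a violation relative to the tolerance: a small positive component that passes unscaled is rejected once it is scaled up.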