MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

# Thread Subject: Question on the derivate /calculus of a 2-norm matrix . Thanks a lot

Subject: Question on the derivate /calculus of a 2-norm matrix . Thanks a lot

From: Antony

### Antony

Date: 25 Jul, 2010 13:30:08

Message: 1 of 9

Hi, all, I wonder how to solve the following problem:
Assuming g(x)=||KX-B||^2 where both K and X are matrices and B is a vector, please compute \partial{g}/\partial{x} .

I don't know what the exact answer is. Checking some related stuffs, I think the answer might be 2K^{T}(KX-B). Could you please show me how to compute \partial{g}/\partial{x} in this problem? Or is there some website discussing such a problem? Thanks a lot.

In addition, I wonder where to find the possible answer to such similar problems. May I call it multi variables calculus problem, or derivates of quadratic 2-norm matrix problem, or some other problem? I tried to search the web, but failed. Thanks a lot again!

I have read some related material, such as matrix calculus, but it seems that this type of problem is not discussed at all. I feel it is not a simple d(X^{T}AX)/dX problem which is rather simple.

Antony

Subject: Question on the derivate /calculus of a 2-norm matrix . Thanks a lot

From: Roger Stafford

### Roger Stafford

Date: 25 Jul, 2010 19:22:05

Message: 2 of 9

"Antony " <mutang.bing@gmail.com> wrote in message <i2he90$21a$1@fred.mathworks.com>...
> Hi, all, I wonder how to solve the following problem:
> Assuming g(x)=||KX-B||^2 where both K and X are matrices and B is a vector, please compute \partial{g}/\partial{x} .
>
> I don't know what the exact answer is. Checking some related stuffs, I think the answer might be 2K^{T}(KX-B). Could you please show me how to compute \partial{g}/\partial{x} in this problem? Or is there some website discussing such a problem? Thanks a lot.
>
> In addition, I wonder where to find the possible answer to such similar problems. May I call it multi variables calculus problem, or derivates of quadratic 2-norm matrix problem, or some other problem? I tried to search the web, but failed. Thanks a lot again!
>
> I have read some related material, such as matrix calculus, but it seems that this type of problem is not discussed at all. I feel it is not a simple d(X^{T}AX)/dX problem which is rather simple.
>
> Antony
- - - - - - - - - -
If you want the partial derivatives listed as a column vector, then the answer you gave is correct:

2*K'*(K*X-B)

where the apostrophe is used here to denote (complex) transpose as is done in matlab.

Here is a somewhat intuitive demonstration. Your L2 norm can be written as:

||K*X-B||^2 = (K*X-B)'*(K*X-B) = X'*K'*K*X - X'*K'*B - B'*K*X - B'*B

Since B'*K*X is a scalar, it is equal to its own transpose

B'*K*X = (B'*K*X)' = X'*K'*B

which gives

||K*X-B||^2 = X'*K'*K*X - 2*X'*K'*B - B'*B

If we have an n-element row vector whose elements are each a function of x1,x2,...,xn, we can produce a matrix in which for each of the functions there is an n-element column of the partial derivatives of that function taken with respect to x1,x2,...,xn. If the row vector is simply X' = [x1,x2,...,xn] then applying this partial derivative operator will clearly give just the identity matrix, I.

If we apply this operator to the above L2 norm we get

2*I*K'*K*X - 2*I*K'*B + 0 = 2*K'*(K*X-B)

as asserted above. In the case of X'*K'*K*X its derivative is to be taken first with respect to the x values in the left factor while holding the right hand set of x's fixed, plus the derivatives with respect to the right hand set while holding the left hand set fixed. But because the expression is a scalar it is equal to its own transpose, so that second term is the same as if we held the right hand set fixed and varied the left hand set in both terms. Hence the above operator applied to X'*K'*K*X gives

I*K'*K*X + I*K'*K*X = 2*K'*K*X.

If you are not convinced by this last argument, you can always show it rigorously using summation notion. It is just a problem in taking the derivatives of a homogeneous quadratic function of many variables.

Roger Stafford

Subject: Question on the derivate /calculus of a 2-norm matrix . Thanks a lot

From: Matt J

### Matt J

Date: 26 Jul, 2010 02:19:04

Message: 3 of 9

"Antony " <mutang.bing@gmail.com> wrote in message <i2he90$21a$1@fred.mathworks.com>...
> Hi, all, I wonder how to solve the following problem:
> Assuming g(x)=||KX-B||^2 where both K and X are matrices and B is a vector, please compute \partial{g}/\partial{x} .
>
> I don't know what the exact answer is. Checking some related stuffs, I think the answer might be 2K^{T}(KX-B). Could you please show me how to compute \partial{g}/\partial{x} in this problem? Or is there some website discussing such a problem? Thanks a lot.
============

You could also just use the multivariable chain rule, which says that for vector-valued functions g() and h() and the composition f(X)=g(h(X)), then

Now just apply this with h(X)=K*X-B and g(z)=||z||^2

and substituting z=K*X-B gives

Subject: Question on the derivate /calculus of a 2-norm matrix . Thanks a lot

From: Antony

### Antony

Date: 26 Jul, 2010 02:26:04

Message: 4 of 9

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <i2i2st$n4j$1@fred.mathworks.com>...
> "Antony " <mutang.bing@gmail.com> wrote in message <i2he90$21a$1@fred.mathworks.com>...
...
> I*K'*K*X + I*K'*K*X = 2*K'*K*X.
>
> If you are not convinced by this last argument, you can always show it rigorously using summation notion. It is just a problem in taking the derivatives of a homogeneous quadratic function of many variables.
>
> Roger Stafford

Thank you, Roger. The explanation is extremely clearly and helpful, especially on how to rewrite the 2-norm into matrix multiplication and how the identity matrix is generated. I think I fully understand this process now.

I have another problem. Maybe we can not directly solve it and I think the result might be more complxe than my former problem. The problem is:
if g(x) = ||KX-B||^0.6 with all the other settings as the former problem, what is \partial{g}/\partial{x}?

Do you have any advice on how to solve this equation? Thanks a lot again for your help!

Antony

Subject: Question on the derivate /calculus of a 2-norm matrix . Thanks a lot

From: Antony

### Antony

Date: 26 Jul, 2010 02:47:04

Message: 5 of 9

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i2iran$deb$1@fred.mathworks.com>...
> "Antony " <mutang.bing@gmail.com> wrote in message <i2he90$21a$1@fred.mathworks.com>...
> > Hi, all, I wonder how to solve the following problem:
> > Assuming g(x)=||KX-B||^2 where both K and X are matrices and B is a vector, please compute \partial{g}/\partial{x} .
> >
> > I don't know what the exact answer is. Checking some related stuffs, I think the answer might be 2K^{T}(KX-B). Could you please show me how to compute \partial{g}/\partial{x} in this problem? Or is there some website discussing such a problem? Thanks a lot.
> ============
>
> You could also just use the multivariable chain rule, which says that for vector-valued functions g() and h() and the composition f(X)=g(h(X)), then
>
>
> Now just apply this with h(X)=K*X-B and g(z)=||z||^2
>
>
> and substituting z=K*X-B gives
>

I see. Thanks a lot, Matt. Pretty simple solution!

But, according to the chain rule, I may apply it to f(X)=||KX-B||^0.6 and obtain the result of the derivate as 0.6*K.'*(K*X-B)^{-0.4}? This result seems rather complex for some numerical optimization.

Subject: Question on the derivate /calculus of a 2-norm matrix . Thanks a lot

From: Matt J

### Matt J

Date: 26 Jul, 2010 02:50:05

Message: 6 of 9

"Antony " <mutang.bing@gmail.com> wrote in message <i2irns$94i$1@fred.mathworks.com>...

> I have another problem. Maybe we can not directly solve it and I think the result might be more complxe than my former problem. The problem is:
> if g(x) = ||KX-B||^0.6 with all the other settings as the former problem, what is \partial{g}/\partial{x}?
========

This is equivalent to (||KX-B||^2)^0.3

So you can use your original result, with one more step of the chain rule leading to

Subject: Question on the derivate /calculus of a 2-norm matrix . Thanks a lot

From: Antony

### Antony

Date: 26 Jul, 2010 03:01:04

Message: 7 of 9

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i2it4t$73b$1@fred.mathworks.com>...
> "Antony " <mutang.bing@gmail.com> wrote in message <i2irns$94i$1@fred.mathworks.com>...
>
> > I have another problem. Maybe we can not directly solve it and I think the result might be more complxe than my former problem. The problem is:
> > if g(x) = ||KX-B||^0.6 with all the other settings as the former problem, what is \partial{g}/\partial{x}?
> ========
>
> This is equivalent to (||KX-B||^2)^0.3
>
> So you can use your original result, with one more step of the chain rule leading to
>
> Gradient = 0.3*(||KX-B||^2)^(-.7) * 2*K'*(K*X-B)

Why not write it as Gradient = 0.3 *2*K'*(K*X-B)*(||KX-B||^2)^(-.7) according to the chain rule? It is because K'*(K*X-B) is a scalar and there is no difference between them? Thank you!

Subject: Question on the derivate /calculus of a 2-norm matrix . Thanks a lot

From: Matt J

### Matt J

Date: 26 Jul, 2010 03:04:04

Message: 8 of 9

"Antony " <mutang.bing@gmail.com> wrote in message <i2isv7$pi7$1@fred.mathworks.com>...

> But, according to the chain rule, I may apply it to f(X)=||KX-B||^0.6 and obtain the result of the derivate as 0.6*K.'*(K*X-B)^{-0.4}?
======================

No, this wouldn't be the correct expression. From my last post, I get, after some simplification

>This result seems rather complex for some numerical optimization.
=======================

Well, your objective function f(X)=||KX-B||^0.6 is unusually complex...

For one thing, this function is not differentiable at points where K*X=B, which means that if the minimum lies there, you cannot use gradient-based approaches to find it.

Subject: Question on the derivate /calculus of a 2-norm matrix . Thanks a lot

From: Antony

### Antony

Date: 26 Jul, 2010 03:33:05

Message: 9 of 9

"Matt J " <mattjacREMOVE@THISieee.spam> wrote in message <i2itv4$rr8$1@fred.mathworks.com>...
> "Antony " <mutang.bing@gmail.com> wrote in message <i2isv7$pi7$1@fred.mathworks.com>...
>
> > But, according to the chain rule, I may apply it to f(X)=||KX-B||^0.6 and obtain the result of the derivate as 0.6*K.'*(K*X-B)^{-0.4}?
> ======================
>
> No, this wouldn't be the correct expression. From my last post, I get, after some simplification
>
>
>
> >This result seems rather complex for some numerical optimization.
> =======================
>
> Well, your objective function f(X)=||KX-B||^0.6 is unusually complex...
>
> For one thing, this function is not differentiable at points where K*X=B, which means that if the minimum lies there, you cannot use gradient-based approaches to find it.

Dear Matt, thank a lot for your time in my question. I appreciate your help! I understand the difficulties of such type of optimization problems now. This might be the reason that papers always figure out another efficient solutions to such type of non-convex problems. Thanks again!

Also, thanks a lot for all other guys' kind and patient helps, especially to Roger Stafford and Brian Borchers.

Antony