MATLAB Answers

1

Matrix inversion using "pinv" or any other technique

Asked by Harsha K on 23 Feb 2018
Latest activity Commented on by John D'Errico
on 23 Feb 2018
I am trying to solve a system of independent linear equations.
Ax = B for x.
Many functions within Matlab achieve this with different algorithms. mldivide or '\' operator, 'lsqminnorm' and 'pinv' are the ones I have tried using. For my purpose, pinv seems to be the fastest and relatively good in accuracy.
I am trying to understand which of the equations or part of the matrices are used for computing the inverse of A, i.e., (A^-1)
Let us take an example where A is a matrix of dimensions 500*250. And B is a vector of dimension 500*1. I have two prominent and probably very obvious questions:
1. Now when the inversion of A is performed which of the 500 equations is used to obtain solutions to the 250 variables? 2. How does the 'pinv' function decide which of these equations to choose from?
Kindly advice.

  0 Comments

Sign in to comment.

2 Answers

Answer by John D'Errico
on 23 Feb 2018
Edited by John D'Errico
on 23 Feb 2018
 Accepted Answer

The answer to this question is to take a recent course on numerical linear algebra. Literally so. OR, do a lot of reading, because writing a complete answer here will take the equivalent to writing a book on the subject. So at best I'll just try to point you in a good direction.
Interestingly, you would want to take a recent course on numerical linear algebra, or at least, read a rather recent book on the subject. :-) I say this because tools like lsqminnorm are relatively new. (For example, there was no discussion of the ideas behind LSQMINNORM when I was in grad school, only in the late 1980s.) In fact, it was only introduced in R2017b in MATLAB. For example, a pretty decent read on the topic of the underpinnings of lsqminnorm might be found here:
which dates to only 2009/2010. Personally, I'd suggest reading that reference carefully. It is a pretty decent read, at least to me, and it discusses the underpinnings of PINV, as well as LSQMINNORM. I was surprised to not even see a Wikipedia page for the COD, which forms the basis for LSQMINNORM.
The reason why I said to take an entire CURRENT course on numerical linear algebra, is because you might need that as background to appreciate why tools like inv, backslash, etc., fail on numerically singular problems.
PINV will not in general be the fastest tool. In fact, PINV will generally be the slowest of the available tools, because it relies on the SVD and SVD will be relatively slow for large matrices. Also, PINV does not apply to sparse matrices, while lsqminnorm is able to handle sparse matrices.
You ask which parts of A these tools use. In fact, the goal is to use all of the matrix, but in subtly different ways.
So, you are talking about a non-square matrix, of size 500x250. INV is not even an option, and we cannot compute the inverse of A ever. It does not exist for non-square matrices. At best, you can compute a generalized inverse of some sort. In the past (and, yes numerical linear algebra has changed over the last 10 to 40 years or so) this usually came down to tools that were based on the SVD, so PINV. LSQMINNORM uses the COD (Complete Orthogonal Decomposition), which you might think of as a QR decomposition on steroids. (I wonder if Christine smiles when she reads that.) But because it uses the COD, it should be faster than PINV, since you can compute the COD from just a fixed set of householder rotations on the result of QR.
Sadly, I recognize that I've not really said much here, but what I did say took half a page to write. Those lecture notes I provided a link to are a very good start. Happy reading. If you are very lucky, perhaps Christine will clear up some of what I have glossed over.

  3 Comments

Thank you so much, John. That is a lot to digest. Will read that more carefully and get back to you at the earliest. It still amazes me how can a technique select the best equations especially when the equations are more than the variables to solve. How would it find a unique or best solution? I guess the reading you have recommended would clarify this dilemma. Appreciate very much for the detailed post.
For an overdetermined system like yours (more equations than variables), the solver usually trys to find x such that norm(A*x-b) is minimized. Thus the "solution" x supplied does not solve A*x=b exactly, but only minimizes the distance between the column space of A and the vector b.
Best wishes
Torsten.
The idea is not that you select the best equations.
For example, given a 4x3 array A. If A is just some rand matrix, then in theory, you could choose any 3 of the 4 rows, then invert the 3x3. That is NOT what is done however.
The idea is you want to minimize the sum of the squares of the residuals for all 4 "equations". You only have 3 variables to play with however, so you don't expect to find a solution that gives an exact result. You look for the solution that trades off the squares of the residuals.
Some might think that minimizing the sum of absolute values of the residuals is sufficient, but that problem is not quite as well behaved, nor is it as trivially solved using linear algebra. The sum of squares penalizes the large residuals, so that is good.
As it turns out, on full rank problems all solvers that are used (NOT INV. I've often argued that you should never use INV unless you know why you should not use INV) will yield essentially the same least squares solution, thus pinv, backslash, lsqminnorm, LSQR, etc. Again, that is true if A has full rank.
The problem arises when A is rank deficient. Now there really are some choices one can make. But the choice is not in terms of which equations one will use. If A is less than full rank, then there are choices, so many choices. In fact, infinitely many choices. Personally, I like to explain things using an example, because I think it becomes easier to visualize, easier to understand what is happening. so...
A = rand(4,2)*rand(2,3)
A =
1.02431707010311 0.155141345679453 1.00283035050698
1.19496047597925 0.392311903556264 1.18025167611197
1.06619931243511 0.794911450018101 1.07487947203104
1.28734563705132 0.70126581872722 1.28515551876245
So the matrix A will have rank 2. Using a little toy I wrote recently, we see that, as well as some other information.
singularityReport(A)
The matrix A has size: [4 3]
Thus maximum possible rank of A: 3
Max and min singular values are: [3.4023 2.1479e-16]
Tolerance for inclusion of a vector into the nullspace: 1.7764e-15
Numerical rank: 2
Condition number: 1.584018822650977e+16
The dimension of the null space: 1
Relative size of the next singular value above the null tolerance: 253527845837652.4
Smallest rank 1 perturbation to A that would increase the nullspace: 0.45036
Now, what happens if we try to solve a problem of the form A*x=b?
b = rand(4,1)
b =
0.757740130578333
0.743132468124916
0.392227019534168
0.655477890177557
x = A\b
Warning: Rank deficient, rank = 2, tol = 2.039176e-15.
x =
0.830272727359056
-0.60989110569557
0
So backslash returns a solution with two non-zero elements. In effect, backslash DID choose two columns (NOT rows) of A to work with. It did that because we effectively have only two independent pieces of information in the matrix A.
norm(A*x-b)
ans =
0.0192610487566233
By way of comparison, see that PINV is no better in terms of the result:
xp = pinv(A)*b
xp =
0.442512431710242
-0.629451475697874
0.399094534646983
norm(A*xp-b)
ans =
0.0192610487566233
So in terms of the norm of residuals, pinv does exactly the same. LSQMINNORM gives us the same result.
xL = lsqminnorm(A,b)
xL =
0.442512431710241
-0.629451475697874
0.399094534646983
The difference between the solutions is that the PINV (& LSQMINNORM) solutions also minimize the norm(x).
We can add any amount of the vector null(A) to x, and not change the norm of the residuals. For example...
Anull = null(A)
Anull =
-0.696418839869398
-0.0351304925679089
0.716775172538013
norm(A*(x + 5*Anull)-b)
ans =
0.0192610487566244
So the backslash solution is one that makes one of the components of x zero. There is some good logic in how that zero element is chosen, but one or more of the elements of x will be zero. At the same time, norm(x) is not as small as we could possibly make it. That is gained from the pinv/lsqminnorm solution.
norm(x)
ans =
1.03020384516988
norm(xp)
ans =
0.866777284001741
Oh well. This can go on forever. In fact, there are books (even large books) written on the subject.

Sign in to comment.


Answer by Walter Roberson
on 23 Feb 2018

At the command prompt give the command
type pinv
You can see the entire code there. It is not a long routine.

  1 Comment

Thank you Walter, will do as recommended.

Sign in to comment.