LU decomposition algorithm change

For this particular example:
A=[ 1 2 3;
4 5 6;
7 8 9;
10 11 12];
I get two different results depending on MATLAB version. E.g.,
Online MATLAB R2023a:
[L,U,P] = lu(A)
L = 4×3
1.0000 0 0 0.1000 1.0000 0 0.7000 0.3333 1.0000 0.4000 0.6667 0
U = 3×3
10.0000 11.0000 12.0000 0 0.9000 1.8000 0 0 0
P = 4×4
0 0 0 1 1 0 0 0 0 0 1 0 0 1 0 0
You can see that the last row of U is exactly 0's, so the L(4,3) spot is arbitrary in that sense but in this case the algorithm produced exactly 0.
Now compare this result with my PCwin MATLAB R2021a:
>> A
A =
1 2 3
4 5 6
7 8 9
10 11 12
>> [L,U,P] = lu(A)
L =
1.0000 0 0
0.1000 1.0000 0
0.7000 0.3333 1.0000
0.4000 0.6667 0.4599
U =
10.0000 11.0000 12.0000
0 0.9000 1.8000
0 0 0.0000
P =
0 0 0 1
1 0 0 0
0 0 1 0
0 1 0 0
>> U(3,3)
ans =
6.9204e-16
>> L(4,3)
ans =
0.4599
You can see that in this case the U(3,3) spot is not exactly 0, and that "arbitrary" L(4,3) spot now has a non-zero value. Any L(4,3) value that is small enough would work equally as well. Slightly different LU algorithms might produce slightly different results due to floating point effects, so both answers are "correct" in that sense and that is not my concern. I am simply wondering when this change took place. I no longer have access to the many MATLAB versions I used to, so I am posing to the group. Which MATLAB version between R2021a and R2023a made the change to report exact 0's for U(3,3) and L(4,3) for this example? Can someone with access to R2021b, R2022a, R2022b, R2023a on desktops run the above code and post what you get? Thanks.

Answers (2)

This was introduce in R2022a and might be a promising candidate to answer your question.
Quoting Matlab realease notes:
inv Function: Improved performance when inverting large triangular matrices
The inv function shows improved performance when operating on large triangular matrices.
For example, inverting a 5,000-by-5,000 upper triangular matrix is about 3.7x faster than in the previous release.
function timingInv
rng default
A = randn(5e3);
[~,R] = lu(A);
tic
Y = inv(R);
toc
end
The approximate execution times are:
R2021b: 1.1 s
R2022a: 0.3 s
The code was timed on a Windows 10, Intel Xeon CPU W-2133 @ 3.60 GHz test system by calling the timingInv function.

7 Comments

I just checked on my copies of Matlab (also Windows 10), and indeed R2022a produdes L(end)=0, while R2021b produces L(end)=0.4599.
R2022a had many changes regarding matrix manipulation, not easy to point out the exact one.
James Tursa
James Tursa on 2 Jun 2023
Edited: James Tursa on 2 Jun 2023
Thanks! Do you have a C compiler installed? If so, would you be willing to run a short test on R2022a to see if it is the LAPACK library producing the change or the MATLAB interface to the LAPACK library that is producing the change? I can post the code to compile.
A few releases ago the installation of the compiler from the file exchange started to not work for me, but I can try. Does it matter that the compiler would be the same across the releases?
Here is the C-code for the mex routine:
/* File: dgetrf.c */
/* MATLAB mex routine, Calls LAPACK dgetrf routine for LU factoring */
/* Programmer: James Tursa */
/* B = dgetrf(A) */
/* Includes ----------------------------------------------------------- */
#include "mex.h"
#include "blas.h"
#include "lapack.h"
/* Gateway ------------------------------------------------------------ */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
mwSignedIndex M, N, LDA, INFO, MN;
mwSignedIndex *IPIV;
mxArray *A;
if( nlhs > 3 ) {
mexErrMsgTxt("Too many outputs.");
}
if( nrhs != 1 || !mxIsDouble(prhs[0]) || mxIsSparse(prhs[0]) || mxIsComplex(prhs[0])
|| mxGetNumberOfDimensions(prhs[0]) != 2 ) {
mexErrMsgTxt("Need one full real double 2D matrix input.");
}
M = mxGetM(prhs[0]);
N = mxGetN(prhs[0]);
LDA = M;
MN = M<N ? M : N;
IPIV = mxMalloc(MN * sizeof(*IPIV));
A = mxDuplicateArray(prhs[0]);
dgetrf( &M, &N, mxGetData(A), &LDA, IPIV, &INFO );
if( INFO ) {
mexErrMsgTxt("dgetrf returned an error code.");
}
if( nrhs <= 1 ) {
plhs[0] = A;
} else if( nrhs == 2 ) {
mexErrMsgTxt("2 outputs not yet implemented.");
} else {
mexErrMsgTxt("3 outputs not yet implemented.");
}
mxFree(IPIV);
}
Here is a compilation routine for PCwin:
function compile_lapack(varargin)
libdir = 'microsoft';
lib_blas = fullfile(matlabroot,'extern','lib',computer('arch'),libdir,'libmwblas.lib');
lib_lapack = fullfile(matlabroot,'extern','lib',computer('arch'),libdir,'libmwlapack.lib');
mex(varargin{:},lib_blas,lib_lapack,'-largeArrayDims');
end
And here is how to compile it:
>> compile_lapack('dgetrf.c')
Building with 'MinGW64 Compiler (C)'.
MEX completed successfully.
Then just run these two functions on both R2021b and R2022a and compare results:
B = lu(A)
C = dgetrf(A)
I don't think the specific compiler matters since the C gateway routine isn't doing any computations ... it is simply passing the matrix to LAPACK and returning the result back to MATLAB.
I used the A from your original post. isequal(B,C) returns true on both releases.
R2021b =
10.0000 11.0000 12.0000
0.1000 0.9000 1.8000
0.7000 0.3333 0.0000
0.4000 0.6667 0.4599
R2022a =
10.0000 11.0000 12.0000
0.1000 0.9000 1.8000
0.7000 0.3333 0.0000
0.4000 0.6667 0
@Rik Thanks. That confirms that the change was in the LAPACK library itself, and not MATLAB's interface code to LAPACK.

Sign in to comment.

Christine Tobler
Christine Tobler on 2 Jun 2023
Edited: Christine Tobler on 2 Jun 2023
As you point out, this kind of difference in result is to be expected and is well withing expected round-off behavior.
Another point is that the algorithm change here is most likely not anything deep: The algorithm used for LU has some tuning parameters (the block size) which can change from CPU to CPU for performance reasons. So the reason for the change here is as likely to be that MATLAB Online uses a different type of CPU as it is a change between MATLAB versions.
Round-off can also change based on what registers are available on any CPU, as a performance-oriented implementation will try to use these to best advantage.
If the reason isn't different CPUs, and also isn't different versions of the BLAS or LAPACK library used, the most likely reason is this R2022b release note:
mldivide and pagemldivide Functions: Improved performance with small matrices

Categories

Products

Release

R2021a

Tags

Asked:

on 2 Jun 2023

Commented:

on 6 Jun 2023

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!