All three of the matrix factorizations discussed in this section
make use of *triangular* matrices, where all the
elements either above or below the diagonal are zero. Systems of linear
equations involving triangular matrices are easily and quickly solved
using either *forward* or *back substitution*.

The Cholesky factorization expresses a symmetric matrix as the product of a triangular matrix and its transpose

*A* = *R*′*R*,

where *R* is an upper triangular matrix.

Not all symmetric matrices can be factored in this way; the
matrices that have such a factorization are said to be positive definite.
This implies that all the diagonal elements of *A* are
positive and that the off-diagonal elements are "not too big."
The Pascal matrices provide an interesting example. Throughout this
chapter, the example matrix `A`

has been the 3-by-3
Pascal matrix. Temporarily switch to the 6-by-6:

A = pascal(6) A = 1 1 1 1 1 1 1 2 3 4 5 6 1 3 6 10 15 21 1 4 10 20 35 56 1 5 15 35 70 126 1 6 21 56 126 252

The elements of `A`

are binomial coefficients.
Each element is the sum of its north and west neighbors. The Cholesky
factorization is

R = chol(A) R = 1 1 1 1 1 1 0 1 2 3 4 5 0 0 1 3 6 10 0 0 0 1 4 10 0 0 0 0 1 5 0 0 0 0 0 1

The elements are again binomial coefficients. The fact that `R'*R`

is
equal to `A`

demonstrates an identity involving sums
of products of binomial coefficients.

and
is said to be |

The Cholesky factorization allows the linear system

*Ax* = *b*

to be replaced by

*R*′*R**x* = *b*.

Because the backslash operator recognizes triangular systems,
this can be solved in the MATLAB^{®} environment quickly with

x = R\(R'\b)

If `A`

is *n*-by-*n*,
the computational complexity of `chol(A)`

is O(*n*^{3}),
but the complexity of the subsequent backslash solutions is only O(*n*^{2}).

LU factorization, or Gaussian elimination, expresses any square
matrix *A* as the product of a permutation of a lower
triangular matrix and an upper triangular matrix

*A* = *LU*,

where *L* is a permutation of a lower triangular
matrix with ones on its diagonal and *U* is an upper
triangular matrix.

The permutations are necessary for both theoretical and computational reasons. The matrix

$$\left[\begin{array}{cc}0& 1\\ 1& 0\end{array}\right]$$

cannot be expressed as the product of triangular matrices without interchanging its two rows. Although the matrix

$$\left[\begin{array}{cc}\epsilon & 1\\ 1& 0\end{array}\right]$$

can be expressed as the product of triangular matrices, when *ε* is
small, the elements in the factors are large and magnify errors, so
even though the permutations are not strictly necessary, they are
desirable. Partial pivoting ensures that the elements of *L* are
bounded by one in magnitude and that the elements of *U* are
not much larger than those of *A*.

For example:

[L,U] = lu(B) L = 1.0000 0 0 0.3750 0.5441 1.0000 0.5000 1.0000 0 U = 8.0000 1.0000 6.0000 0 8.5000 -1.0000 0 0 5.2941

The LU factorization of `A`

allows the linear
system

A*x = b

to be solved quickly with

x = U\(L\b)

Determinants and inverses are computed from the LU factorization using

det(A) = det(L)*det(U)

and

inv(A) = inv(U)*inv(L)

You can also compute the determinants using ```
det(A)
= prod(diag(U))
```

, though the signs of the determinants might
be reversed.

An *orthogonal* matrix, or a matrix with
orthonormal columns, is a real matrix whose columns all have unit
length and are perpendicular to each other. If *Q* is
orthogonal, then

*Q*′*Q* = 1.

The simplest orthogonal matrices are two-dimensional coordinate rotations:

$$\left[\begin{array}{cc}\mathrm{cos}(\theta )& \mathrm{sin}(\theta )\\ -\mathrm{sin}(\theta )& \mathrm{cos}(\theta )\end{array}\right].$$

For complex matrices, the corresponding term is *unitary*.
Orthogonal and unitary matrices are desirable for numerical computation
because they preserve length, preserve angles, and do not magnify
errors.

The orthogonal, or QR, factorization expresses any rectangular matrix as the product of an orthogonal or unitary matrix and an upper triangular matrix. A column permutation might also be involved:

*A* = *QR*

or

*AP* = *QR*,

where *Q* is orthogonal or unitary, *R* is
upper triangular, and *P* is a permutation.

There are four variants of the QR factorization—full or economy size, and with or without column permutation.

Overdetermined linear systems involve a rectangular matrix with
more rows than columns, that is *m*-by-*n* with *m* > *n*.
The full-size QR factorization produces a square, *m*-by-*m* orthogonal *Q* and
a rectangular *m*-by-*n* upper triangular *R*:

C=gallery('uniformdata',[5 4], 0); [Q,R] = qr(C) Q = 0.6191 0.1406 -0.1899 -0.5058 0.5522 0.1506 0.4084 0.5034 0.5974 0.4475 0.3954 -0.5564 0.6869 -0.1478 -0.2008 0.3167 0.6676 0.1351 -0.1729 -0.6370 0.5808 -0.2410 -0.4695 0.5792 -0.2207 R = 1.5346 1.0663 1.2010 1.4036 0 0.7245 0.3474 -0.0126 0 0 0.9320 0.6596 0 0 0 0.6648 0 0 0 0

In many cases, the last *m – n* columns
of *Q* are not needed because they are multiplied
by the zeros in the bottom portion of *R*. So the
economy-size QR factorization produces a rectangular, *m*-by-*n* *Q* with
orthonormal columns and a square *n*-by-*n* upper
triangular *R*. For the 5-by-4 example, this is not
much of a saving, but for larger, highly rectangular matrices, the
savings in both time and memory can be quite important:

[Q,R] = qr(C,0) Q = 0.6191 0.1406 -0.1899 -0.5058 0.1506 0.4084 0.5034 0.5974 0.3954 -0.5564 0.6869 -0.1478 0.3167 0.6676 0.1351 -0.1729 0.5808 -0.2410 -0.4695 0.5792 R = 1.5346 1.0663 1.2010 1.4036 0 0.7245 0.3474 -0.0126 0 0 0.9320 0.6596 0 0 0 0.6648

In contrast to the LU factorization, the QR factorization does
not require any pivoting or permutations. But an optional column permutation,
triggered by the presence of a third output argument, is useful for
detecting singularity or rank deficiency. At each step of the factorization,
the column of the remaining unfactored matrix with largest norm is
used as the basis for that step. This ensures that the diagonal elements
of *R* occur in decreasing order and that any linear
dependence among the columns is almost certainly be revealed by examining
these elements. For the small example given here, the second column
of `C`

has a larger norm than the first, so the two
columns are exchanged:

[Q,R,P] = qr(C) Q = -0.3522 0.8398 -0.4131 -0.7044 -0.5285 -0.4739 -0.6163 0.1241 0.7777 R = -11.3578 -8.2762 0 7.2460 0 0 P = 0 1 1 0

When the economy-size and column permutations are combined, the third output argument is a permutation vector, rather than a permutation matrix:

[Q,R,p] = qr(C,0) Q = -0.3522 0.8398 -0.7044 -0.5285 -0.6163 0.1241 R = -11.3578 -8.2762 0 7.2460 p = 2 1

The QR factorization transforms an overdetermined linear system into an equivalent triangular system. The expression

norm(A*x - b)

equals

norm(Q*R*x - b)

Multiplication by orthogonal matrices preserves the Euclidean norm, so this expression is also equal to

norm(R*x - y)

where `y = Q'*b`

. Since the last *m*-*n* rows
of *R* are zero, this expression breaks into two
pieces:

norm(R(1:n,1:n)*x - y(1:n))

and

norm(y(n+1:m))

When `A`

has full rank, it is possible to solve
for `x`

so that the first of these expressions is
zero. Then the second expression gives the norm of the residual.
When `A`

does not have full rank, the triangular
structure of `R`

makes it possible to find a basic
solution to the least-squares problem.

MATLAB software supports multithreaded computation for a number of linear algebra and element-wise numerical functions. These functions automatically execute on multiple threads. For a function or expression to execute faster on multiple CPUs, a number of conditions must be true:

The function performs operations that easily partition into sections that execute concurrently. These sections must be able to execute with little communication between processes. They should require few sequential operations.

The data size is large enough so that any advantages of concurrent execution outweigh the time required to partition the data and manage separate execution threads. For example, most functions speed up only when the array contains several thousand elements or more.

The operation is not memory-bound; processing time is not dominated by memory access time. As a general rule, complicated functions speed up more than simple functions.

`lu`

and `qr`

show significant increase in speed
on large double-precision arrays (on order of 10,000 elements).

Was this topic helpful?