One of the most important problems in technical computing is the solution of systems of simultaneous linear equations.
In matrix notation, the general problem takes the following form: Given two matrices A and b, does there exist a unique matrix x, so that Ax = b or xA = b?
It is instructive to consider a 1-by-1 example. For example, does the equation
7x = 21
have a unique solution?
The answer, of course, is yes. The equation has the unique solution x = 3. The solution is easily obtained by division:
x = 21/7 = 3.
The solution is not ordinarily obtained by computing the inverse of 7, that is 7^{–1} = 0.142857..., and then multiplying 7^{–1} by 21. This would be more work and, if 7^{–1} is represented to a finite number of digits, less accurate. Similar considerations apply to sets of linear equations with more than one unknown; the MATLAB^{®} software solves such equations without computing the inverse of the matrix.
Although it is not standard mathematical notation, MATLAB uses
the division terminology familiar in the scalar case to describe the
solution of a general system of simultaneous equations. The two division
symbols, slash, /, and backslash,
\, correspond to the two MATLAB functions mrdivide
and mldivide
. mrdivide
and mldivide
are used for the two situations where
the unknown matrix appears on the left or right of the coefficient
matrix:
| Denotes the solution to the matrix equation xA = b. |
| Denotes the solution to the matrix equation Ax = b. |
Think of "dividing" both sides of the equation Ax = b or xA = b by A.
The coefficient matrix A
is always in the "denominator."
The dimension compatibility conditions for x = A\b
require
the two matrices A
and b
to
have the same number of rows. The solution x
then
has the same number of columns as b
and its row
dimension is equal to the column dimension of A
.
For x = b/A
, the roles of rows and columns are
interchanged.
In practice, linear equations of the form Ax = b occur more frequently than those of the form xA = b. Consequently, the backslash is used far more frequently than the slash. The remainder of this section concentrates on the backslash operator; the corresponding properties of the slash operator can be inferred from the identity:
(b/A)' = (A'\b').
The coefficient matrix A
need not be square.
If A
is m-by-n,
there are three cases:
m = n | Square system. Seek an exact solution. |
m > n | Overdetermined system. Find a least-squares solution. |
m < n | Underdetermined system. Find a basic solution with at most m nonzero components. |
The mldivide
operator
employs different solvers to handle different kinds of coefficient
matrices. The various cases are diagnosed automatically by examining
the coefficient matrix. For more information, see the "Algorithms"
section of the mldivide
reference
page.
The general solution to a system of linear equations Ax = b describes all possible solutions. You can find the general solution by:
Solving the corresponding homogeneous
system Ax = 0.
Do this using the null
command,
by typing null(A)
. This returns a basis for the
solution space to Ax = 0. Any
solution is a linear combination of basis vectors.
Finding a particular solution to the nonhomogeneous system Ax = b.
You can then write any solution to Ax = b as the sum of the particular solution to Ax = b, from step 2, plus a linear combination of the basis vectors from step 1.
The rest of this section describes how to use MATLAB to find a particular solution to Ax = b, as in step 2.
The most common situation involves a square coefficient matrix A
and
a single right side column vector b
.
If the matrix A
is nonsingular, the solution, x
= A\b
, is then the same size as b
. For
example:
A = pascal(3); u = [3; 1; 4]; x = A\u x = 10 -12 5
It can be confirmed that A*x
is exactly equal
to u
.
If A
and b
are square
and the same size, x= A\b
is also that size:
b = magic(3); X = A\b X = 19 -3 -1 -17 4 13 6 0 -6
It can be confirmed that A*x
is exactly equal
to b
.
Both of these examples have exact, integer solutions. This is
because the coefficient matrix was chosen to be pascal(3)
,
which is a full rank matrix (nonsingular).
A square matrix A is singular if it does
not have linearly independent columns. If A is
singular, the solution to Ax = b either does not exist, or is not
unique. The backslash operator, A\b
, issues a warning
if A
is nearly singular and raises an error condition
if it detects exact singularity.
If A is singular and Ax = b has a solution, you can find a particular solution that is not unique, by typing
P = pinv(A)*b
P
is a pseudoinverse of A.
If Ax = b does not have an exact
solution, pinv(A)
returns a least-squares solution.
For example:
A = [ 1 3 7 -1 4 4 1 10 18 ]
is singular, as you can verify by typing
rank(A) ans = 2
Since A is not full rank, it has some singular values equal to zero.
Note:
For information about using |
Exact Solutions. For b =[5;2;12]
, the equation Ax = b has
an exact solution, given by
pinv(A)*b ans = 0.3850 -0.1103 0.7066
Verify that pinv(A)*b
is an exact solution
by typing
A*pinv(A)*b ans = 5.0000 2.0000 12.0000
Least-Squares Solutions. However, if b = [3;6;0]
, Ax = b does
not have an exact solution. In this case, pinv(A)*b
returns
a least-squares solution. If you type
A*pinv(A)*b ans = -1.0000 4.0000 2.0000
you do not get back the original vector b
.
You can determine whether Ax = b has an exact solution by finding
the row reduced echelon form of the augmented matrix [A b]
.
To do so for this example, enter
rref([A b]) ans = 1.0000 0 2.2857 0 0 1.0000 1.5714 0 0 0 0 1.0000
Since the bottom row contains all zeros except for the last
entry, the equation does not have a solution. In this case, pinv(A)
returns
a least-squares solution.
This example shows how overdetermined systems are often encountered in various kinds of curve fitting to experimental data.
A quantity, y
, is measured at several different values of time, t
, to produce the following observations. You can enter the data and view it in a table with the following statements.
t = [0 .3 .8 1.1 1.6 2.3]'; y = [.82 .72 .63 .60 .55 .50]'; B = table(t,y)
B = 6×2 table
t y
___ ____
0 0.82
0.3 0.72
0.8 0.63
1.1 0.6
1.6 0.55
2.3 0.5
Try modeling the data with a decaying exponential function
.
The preceding equation says that the vector y
should be approximated by a linear combination of two other vectors. One is a constant vector containing all ones and the other is the vector with components exp(-t)
. The unknown coefficients,
and
, can be computed by doing a least-squares fit, which minimizes the sum of the squares of the deviations of the data from the model. There are six equations in two unknowns, represented by a 6-by-2 matrix.
E = [ones(size(t)) exp(-t)]
E =
1.0000 1.0000
1.0000 0.7408
1.0000 0.4493
1.0000 0.3329
1.0000 0.2019
1.0000 0.1003
Use the backslash operator to get the least-squares solution.
c = E\y
c =
0.4760
0.3413
In other words, the least-squares fit to the data is
The following statements evaluate the model at regularly spaced increments in t
, and then plot the result together with the original data:
T = (0:0.1:2.5)'; Y = [ones(size(T)) exp(-T)]*c; plot(T,Y,'-',t,y,'o')
E*c
is not exactly equal to y
, but the difference might well be less than measurement errors in the original data.
A rectangular matrix A
is rank deficient if it does not have linearly independent columns. If A
is rank deficient, the least-squares solution to AX = B
is not unique. The backslash operator, A\B
, issues a warning if A
is rank deficient and produces a least-squares solution if the system has no solution and a basic solution if the system has infinitely many solutions.
This example shows how the solution to underdetermined systems
is not unique. Underdetermined linear systems involve more unknowns
than equations. The matrix left division operation in MATLAB finds
a basic solution, which has at most m
nonzero
components for an m
-by-n
coefficient
matrix.
Here is a small, random example:
R = [6 8 7 3; 3 5 4 1] rng(0); b = randi(8,2,1)
R = 6 8 7 3 3 5 4 1 b = 7 8
The linear system Rp = b
involves two equations
in four unknowns. Since the coefficient matrix contains small integers,
it is appropriate to use the format
command
to display the solution in rational format. The particular solution
is obtained with
format rat
p = R\b
p = 0 17/7 0 -29/7
One of the nonzero components is p(2)
because R(:,2)
is
the column of R
with largest norm. The other nonzero
component is p(4)
because R(:,4)
dominates
after R(:,2)
is eliminated.
The complete general solution to the underdetermined system
can be characterized by adding p
to an arbitrary
linear combination of the null space vectors, which can be found using
the null
function with an option requesting a rational
basis.
Z = null(R,'r')
Z = -1/2 -7/6 -1/2 1/2 1 0 0 1
It can be confirmed that R*Z
is zero and
that the residual R*x - b
is small for any vector x
,
where
x = p + Z*q.
Since the columns of Z
are the null space
vectors, the product Z*q
is a linear combination
of those vectors:
$$Z*q=\left(\begin{array}{cc}{\stackrel{\rightharpoonup}{x}}_{1}& {\stackrel{\rightharpoonup}{x}}_{2}\end{array}\right)\left(\begin{array}{c}u\\ w\end{array}\right)=u{\stackrel{\rightharpoonup}{x}}_{1}+w{\stackrel{\rightharpoonup}{x}}_{2}\text{\hspace{0.17em}}\text{\hspace{0.17em}}.$$
To illustrate, choose an arbitrary q
and
construct x
.
q = [-2; 1]; x = p + Z*q;
Calculate the norm of the residual.
format short
norm(R*x - b)
ans = 2.6645e-15
MATLAB software supports multithreaded computation for a number of linear algebra and element-wise numerical functions. These functions automatically execute on multiple threads. For a function or expression to execute faster on multiple CPUs, a number of conditions must be true:
The function performs operations that easily partition into sections that execute concurrently. These sections must be able to execute with little communication between processes. They should require few sequential operations.
The data size is large enough so that any advantages of concurrent execution outweigh the time required to partition the data and manage separate execution threads. For example, most functions speed up only when the array contains several thousand elements or more.
The operation is not memory-bound; processing time is not dominated by memory access time. As a general rule, complicated functions speed up more than simple functions.
inv
, lscov
, linsolve
,
and mldivide
show
significant increase in speed on large double-precision arrays (on
order of 10,000 elements or more) when multithreading is enabled.
If the coefficient matrix A is large and sparse, factorization methods are generally not efficient. Iterative methods generate a series of approximate solutions. MATLAB provides several iterative methods to handle large, sparse input matrices.
pcg
Preconditioned conjugate gradients method. This method is appropriate for Hermitian positive definite coefficient matrix A.
bicg
BiConjugate Gradients Method
bicgstab
BiConjugate Gradients Stabilized Method
bicgstabl
BiCGStab(l) Method
cgs
Conjugate Gradients Squared Method
gmres
Generalized Minimum Residual Method
lsqr
LSQR Method
minres
Minimum Residual Method. This method is appropriate for Hermitian coefficient matrix A.
qmr
Quasi-Minimal Residual Method
symmlq
Symmetric LQ Method
tfqmr
Transpose-Free QMR Method