Least-squares solution in presence of known covariance

`x = lscov(A,B)`

x = lscov(A,B,w)

x = lscov(A,B,V)

x = lscov(A,B,V,alg)

[x,stdx] = lscov(...)

[x,stdx,mse] = lscov(...)

[x,stdx,mse,S] = lscov(...)

`x = lscov(A,B)`

returns
the ordinary least squares solution to the linear system of equations```
A*x
= B
```

, i.e., `x`

is
the n-by-1 vector that minimizes the sum of squared errors ```
(B - A*x)'*(B -
A*x)
```

, where `A`

is m-by-n, and `B`

is m-by-1. `B`

can also be an m-by-k matrix,
and `lscov`

returns one solution for each column
of `B`

. When ```
rank(A)
< n
```

, `lscov`

sets the maximum possible
number of elements of `x`

to zero to obtain a "basic
solution".

`x = lscov(A,B,w)`

,
where `w`

is a vector length m of real positive weights,
returns the weighted least squares solution to the linear system ```
A*x
= B
```

, that is, `x`

minimizes ```
(B - A*x)'*diag(w)*(B -
A*x)
```

. `w`

typically contains either counts
or inverse variances.

`x = lscov(A,B,V)`

,
where `V`

is an m-by-m real symmetric positive definite
matrix, returns the generalized least squares solution to the linear
system `A*x = B`

with covariance
matrix proportional to `V`

, that is, `x`

minimizes ```
(B - A*x)'*inv(V)*(B -
A*x)
```

.

More generally, `V`

can be positive semidefinite,
and `lscov`

returns `x`

that minimizes `e'*e`

, subject to ```
A*x
+ T*e = B
```

, where the
minimization is over `x`

and `e`

,
and `T*T' = V`

. When `V`

is semidefinite,
this problem has a solution only if `B`

is
consistent with `A`

and `V`

(that
is, `B`

is in the
column space of `[A T]`

), otherwise `lscov`

returns
an error.

By default, `lscov`

computes the Cholesky decomposition
of `V`

and, in effect, inverts that factor to transform
the problem into ordinary least squares. However, if `lscov`

determines
that `V`

is semidefinite, it uses an orthogonal decomposition
algorithm that avoids inverting `V`

.

`x = lscov(A,B,V,alg)`

specifies
the algorithm used to compute `x`

when `V`

is
a matrix. `alg`

can have the following values:

`'chol'`

uses the Cholesky decomposition of`V`

.`'orth'`

uses orthogonal decompositions, and is more appropriate when`V`

is ill-conditioned or singular, but is computationally more expensive.

`[x,stdx] = lscov(...)`

returns
the estimated standard errors of `x`

. When `A`

is
rank deficient, `stdx`

contains zeros in the elements
corresponding to the necessarily zero elements of `x`

.

`[x,stdx,mse] = lscov(...)`

returns
the mean squared error. If `B`

is
assumed to have covariance matrix σ^{2}`V`

(or
(σ^{2})×`diag`

(1./`W`

)),
then `mse`

is an estimate of σ^{2}.

`[x,stdx,mse,S] = lscov(...)`

returns
the estimated covariance matrix of `x`

. When `A`

is
rank deficient, `S`

contains zeros in the rows and
columns corresponding to the necessarily zero elements of `x`

. `lscov`

cannot
return `S`

if it is called with multiple right-hand
sides, that is, if `size(B,2) > 1`

.

The standard formulas for these quantities, when `A`

and `V`

are
full rank, are

`x = inv(A'*inv(V)*A)*A'*inv(V)*B`

`mse = B'*(inv(V) - inv(V)*A*inv(A'*inv(V)*A)*A'*inv(V))*B./(m-n)`

`S = inv(A'*inv(V)*A)*mse`

`stdx = sqrt(diag(S))`

However, `lscov`

uses methods that are faster
and more stable, and are applicable to rank deficient cases.

`lscov`

assumes that the covariance matrix
of `B`

is known only up to a scale factor. `mse`

is
an estimate of that unknown scale factor, and `lscov`

scales
the outputs `S`

and `stdx`

appropriately.
However, if `V`

is known to be exactly the covariance
matrix of `B`

, then that scaling is unnecessary.
To get the appropriate estimates in this case, you should rescale `S`

and `stdx`

by `1/mse`

and `sqrt(1/mse)`

,
respectively.

The MATLAB^{®} backslash operator (\) enables you to perform
linear regression by computing ordinary least-squares (OLS) estimates
of the regression coefficients. You can also use `lscov`

to
compute the same OLS estimates. By using `lscov`

,
you can also compute estimates of the standard errors for those coefficients,
and an estimate of the standard deviation of the regression error
term:

x1 = [.2 .5 .6 .8 1.0 1.1]'; x2 = [.1 .3 .4 .9 1.1 1.4]'; X = [ones(size(x1)) x1 x2]; y = [.17 .26 .28 .23 .27 .34]'; a = X\y a = 0.1203 0.3284 -0.1312 [b,se_b,mse] = lscov(X,y) b = 0.1203 0.3284 -0.1312 se_b = 0.0643 0.2267 0.1488 mse = 0.0015

Use `lscov`

to compute a weighted least-squares
(WLS) fit by providing a vector of relative observation weights. For
example, you might want to downweight the influence of an unreliable
observation on the fit:

w = [1 1 1 1 1 .1]'; [bw,sew_b,msew] = lscov(X,y,w) bw = 0.1046 0.4614 -0.2621 sew_b = 0.0309 0.1152 0.0814 msew = 3.4741e-004

Use `lscov`

to compute a general least-squares
(GLS) fit by providing an observation covariance matrix. For example,
your data may not be independent:

V = .2*ones(length(x1)) + .8*diag(ones(size(x1))); [bg,sew_b,mseg] = lscov(X,y,V) bg = 0.1203 0.3284 -0.1312 sew_b = 0.0672 0.2267 0.1488 mseg = 0.0019

Compute an estimate of the coefficient covariance matrix for either OLS, WLS, or GLS fits. The coefficient standard errors are equal to the square roots of the values on the diagonal of this covariance matrix:

[b,se_b,mse,S] = lscov(X,y); S S = 0.0041 -0.0130 0.0075 -0.0130 0.0514 -0.0328 0.0075 -0.0328 0.0221 [se_b sqrt(diag(S))] ans = 0.0643 0.0643 0.2267 0.2267 0.1488 0.1488

The vector `x`

minimizes the quantity `(A*x-B)'*inv(V)*(A*x-B)`

.
The classical linear algebra solution to this problem is

x = inv(A'*inv(V)*A)*A'*inv(V)*B

but the `lscov`

function instead computes
the QR decomposition of `A`

and then modifies `Q`

by `V`

.

[1] Strang, G., *Introduction to
Applied Mathematics*, Wellesley-Cambridge, 1986, p. 398.