Multiple linear regression

`b = regress(y,X)`

`[b,bint] = regress(y,X)`

`[b,bint,r] = regress(y,X)`

`[b,bint,r,rint] = regress(y,X)`

`[b,bint,r,rint,stats] = regress(y,X)`

`[...] = regress(y,X,alpha)`

`b = regress(y,X)` returns a *p*-by-1 vector `b` of coefficient estimates for a multiple linear regression of the responses in `y` on the predictors in `X`. `X` is an *n*-by-*p* matrix of *p* predictors at each of *n* observations. `y` is an *n*-by-1 vector of observed responses.
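As a rough illustration of the underlying computation, the following NumPy sketch (Python rather than MATLAB, on made-up data) fits the same least-squares model that `regress` solves:

```python
import numpy as np

# Hypothetical data: n = 5 observations, p = 2 predictors
# (a column of ones for the constant term, plus one regressor).
X = np.column_stack([np.ones(5), np.arange(1.0, 6.0)])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# b minimizes ||y - X*b||^2, the same criterion regress uses.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # p-by-1 vector of coefficient estimates, here about [0.05, 1.99]
```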

`regress` treats `NaN` values in `X` or `y` as missing values, and ignores them.

If the columns of `X` are linearly dependent, `regress` obtains a basic solution by setting the maximum number of elements of `b` to zero.

`[b,bint] = regress(y,X)` returns a *p*-by-2 matrix `bint` of 95% confidence intervals for the coefficient estimates. The first column of `bint` contains lower confidence bounds for each of the *p* coefficient estimates; the second column contains upper confidence bounds.
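Under the standard normal-error assumptions, each interval has the form b ± t·se(b). A NumPy sketch on hypothetical data (the 95% critical value of the *t* distribution with 3 degrees of freedom is hard-coded rather than computed from an inverse CDF):

```python
import numpy as np

# Hypothetical data: constant term plus one regressor.
X = np.column_stack([np.ones(5), np.arange(1.0, 6.0)])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n, p = X.shape

b, *_ = np.linalg.lstsq(X, y, rcond=None)
r = y - X @ b
s2 = r @ r / (n - p)                               # error-variance estimate
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))  # standard errors of b

tcrit = 3.1824  # two-sided 95% point of t with n - p = 3 degrees of freedom
bint = np.column_stack([b - tcrit * se, b + tcrit * se])
print(bint)     # column 1: lower bounds; column 2: upper bounds
```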

If the columns of `X` are linearly dependent, `regress` returns zeros in elements of `bint` corresponding to the zero elements of `b`.

`[b,bint,r] = regress(y,X)` returns an *n*-by-1 vector `r` of residuals.

`[b,bint,r,rint] = regress(y,X)` returns an *n*-by-2 matrix `rint` of intervals that can be used to diagnose outliers. If the interval `rint(i,:)` for observation `i` does not contain zero, the corresponding residual is larger than expected in 95% of new observations, suggesting an outlier.

In a linear model, observed values of `y` are random variables, and so are their residuals. Residuals have normal distributions with zero mean but with different variances at different values of the predictors. To put residuals on a comparable scale, they are "Studentized," that is, they are divided by an estimate of their standard deviation that is independent of their value. Studentized residuals have *t* distributions with known degrees of freedom. The intervals returned in `rint` are shifts of the 95% confidence intervals of these *t* distributions, centered at the residuals.
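The studentization step can be sketched with the hat matrix: residual *i* has variance proportional to 1 − h_ii, where h_ii is the leverage of observation *i*. The NumPy sketch below computes internally studentized residuals on hypothetical data; MATLAB's `rint` construction additionally uses a leave-one-out variance estimate, which is omitted here for brevity:

```python
import numpy as np

# Hypothetical data: constant term plus one regressor.
X = np.column_stack([np.ones(5), np.arange(1.0, 6.0)])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n, p = X.shape

b, *_ = np.linalg.lstsq(X, y, rcond=None)
r = y - X @ b
s2 = r @ r / (n - p)

# Hat matrix H = X (X'X)^{-1} X'; its diagonal holds the leverages h_ii.
H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)

# Residual i has estimated variance s2 * (1 - h_ii); dividing by its
# standard deviation puts all residuals on a comparable scale.
studentized = r / np.sqrt(s2 * (1 - h))
print(studentized)
```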

`[b,bint,r,rint,stats] = regress(y,X)` returns a 1-by-4 vector `stats` that contains, in order, the *R*^{2} statistic, the *F* statistic and its *p* value, and an estimate of the error variance.
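The *R*^{2} statistic, the *F* statistic, and the error-variance estimate can be reproduced directly, as in this NumPy sketch on hypothetical data (the *p* value is omitted because it requires the *F* cumulative distribution function; the *F* formula below assumes the model contains a constant term):

```python
import numpy as np

# Hypothetical data: constant term plus one regressor.
X = np.column_stack([np.ones(5), np.arange(1.0, 6.0)])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n, p = X.shape

b, *_ = np.linalg.lstsq(X, y, rcond=None)
r = y - X @ b

rss = r @ r                                 # residual sum of squares
tss = ((y - y.mean()) ** 2).sum()           # total sum of squares
r2 = 1 - rss / tss                          # R^2 statistic
F = (r2 / (p - 1)) / ((1 - r2) / (n - p))   # F statistic
s2 = rss / (n - p)                          # error-variance estimate
print(r2, F, s2)
```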


`[...] = regress(y,X,alpha)` uses a `100*(1-alpha)`% confidence level to compute `bint` and `rint`.

[1] Chatterjee, S., and A. S. Hadi. "Influential
Observations, High Leverage Points, and Outliers in Linear Regression." *Statistical
Science*. Vol. 1, 1986, pp. 379–416.

`fitlm` | `LinearModel` | `mvregress` | `rcoplot` | `stepwiselm`
