# glmfit

Generalized linear model regression

## Syntax

```
b = glmfit(X,y,distr)
b = glmfit(X,y,distr,param1,val1,param2,val2,...)
[b,dev] = glmfit(...)
[b,dev,stats] = glmfit(...)
```

## Description

`b = glmfit(X,y,distr)` returns a (p + 1)-by-1 vector `b` of coefficient estimates for a generalized linear regression of the responses in `y` on the predictors in `X`, using the distribution `distr`. `X` is an n-by-p matrix of p predictors at each of n observations. `distr` can be any of the following: `'binomial'`, `'gamma'`, `'inverse gaussian'`, `'normal'` (the default), and `'poisson'`.

In most cases, `y` is an n-by-1 vector of observed responses. For the binomial distribution, `y` can be a binary vector indicating success or failure at each observation, or a two column matrix with the first column indicating the number of successes for each observation and the second column indicating the number of trials for each observation.

This syntax uses the canonical link (see below) to relate the distribution to the predictors.
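The two accepted forms of a binomial response can be sketched as follows. The data values are hypothetical and chosen only for illustration; both calls use the default (canonical) `'logit'` link.

```matlab
% Hypothetical toy data, for illustration only.
x = (1:8)';

% Form 1: binary vector, one success (1) or failure (0) per observation.
yBinary = [0 0 1 0 1 1 0 1]';
bBinary = glmfit(x, yBinary, 'binomial');

% Form 2: two-column matrix [successes trials], one row per observation.
successes = [1 2 4 5 7 8 9 9]';
trials    = 10 * ones(8, 1);
bCounts   = glmfit(x, [successes trials], 'binomial');

% Each result is (p + 1)-by-1: the constant term, then the slope for x.
disp([bBinary bCounts])
```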

### Note

By default, `glmfit` adds a first column of 1s to `X`, corresponding to a constant term in the model. Do not enter a column of 1s directly into `X`. You can change the default behavior of `glmfit` using the `'constant'` parameter, below.

`glmfit` treats `NaN`s in either `X` or `y` as missing values, and ignores them.
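As a minimal sketch of the two behaviors described in this note, the following fits the same model twice: once with the defaults, and once after removing incomplete rows by hand and supplying the column of 1s explicitly with `'constant','off'`. The data are hypothetical, and the two coefficient vectors should agree up to rounding.

```matlab
% Hypothetical data with missing values in both X and y.
X = [1 2; 2 1; 3 NaN; 4 3; 5 5; 6 4; 7 8; 8 6];
y = [2.1; 2.9; 4.2; NaN; 7.8; 8.1; 11.9; 12.2];

% Default call: glmfit adds the constant column and ignores rows 3 and 4.
bDefault = glmfit(X, y, 'normal');

% Equivalent fit done by hand: drop incomplete rows, then supply the
% column of 1s explicitly and switch off the automatic constant term.
complete = ~any(isnan([X y]), 2);
Xc = [ones(nnz(complete), 1) X(complete, :)];
bManual = glmfit(Xc, y(complete), 'normal', 'constant', 'off');

% Largest coefficient difference; expected to be at rounding level.
disp(max(abs(bDefault - bManual)))
```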

`b = glmfit(X,y,distr,param1,val1,param2,val2,...)` additionally allows you to specify optional parameter name/value pairs to control the model fit. Acceptable parameters are as follows.

| Parameter | Value | Description |
| --- | --- | --- |
| `'link'` | `'identity'`, default for the distribution `'normal'` | µ = Xb |
| | `'log'`, default for the distribution `'poisson'` | log(µ) = Xb |
| | `'logit'`, default for the distribution `'binomial'` | log(µ/(1 – µ)) = Xb |
| | `'probit'` | norminv(µ) = Xb |
| | `'comploglog'` | log(–log(1 – µ)) = Xb |
| | `'reciprocal'`, default for the distribution `'gamma'` | 1/µ = Xb |
| | `'loglog'` | log(–log(µ)) = Xb |
| | `p` (a number), default for the distribution `'inverse gaussian'` (with p = –2) | µ^p = Xb |
| | Cell array of the form `{FL FD FI}`, containing three function handles, created using `@`, that define the link (`FL`), the derivative of the link (`FD`), and the inverse link (`FI`) | Custom-defined link function. You must provide `FL(mu)`, `FD = dFL(mu)/dmu`, and `FI = FL^(-1)`. |
| | Structure array having the fields `'Link'` (link function), `'Derivative'` (derivative of the link function), and `'Inverse'` (inverse of the link function). The value of each field is a character vector corresponding to a function that is on the path, or a function handle (created using `@`). | Custom-defined link function, its derivative, and its inverse. |
| `'estdisp'` | `'on'` | Estimates a dispersion parameter for the binomial or Poisson distribution. |
| | `'off'` (default for the binomial or Poisson distribution) | Uses the theoretical value of 1.0 for those distributions. |
| `'offset'` | Vector | Used as an additional predictor variable, but with a coefficient value fixed at 1.0. |
| `'weights'` | Vector of prior weights, such as the inverses of the relative variance of each observation | |
| `'constant'` | `'on'` (default) | Includes a constant term in the model. The coefficient of the constant term is the first element of `b`. |
| | `'off'` | Omits the constant term. |
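The following sketch shows how several of these name-value pairs might be combined. It fits a Poisson rate model in which the log of an exposure variable enters as an offset (coefficient fixed at 1.0) and the dispersion parameter is estimated rather than fixed at 1.0. The data and the variable names `exposure` and `counts` are hypothetical.

```matlab
% Hypothetical count data: events observed over differing exposures.
x        = (1:8)';
exposure = [10 12 9 15 11 14 10 13]';
counts   = [2 4 3 8 7 12 10 17]';

% log(mu) = log(exposure) + b(1) + b(2)*x, with estimated dispersion.
[b, dev, stats] = glmfit(x, counts, 'poisson', ...
    'link',    'log', ...            % canonical link, stated explicitly
    'offset',  log(exposure), ...    % coefficient fixed at 1.0
    'estdisp', 'on');                % estimate the dispersion parameter

% With 'estdisp' set to 'on', stats.s holds the estimated dispersion.
fprintf('Estimated dispersion: %.3f\n', stats.s);
```

Passing `log(exposure)` as the offset is a common way to model rates with a log link; it is shown here as one possible use of `'offset'`, not the only one.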

`[b,dev] = glmfit(...)` returns `dev`, the deviance of the fit at the solution vector. The deviance is a generalization of the residual sum of squares. You can perform an analysis of deviance to compare a sequence of nested models, each a subset of the next, and to test whether the model with more terms fits significantly better than the model with fewer terms.
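An analysis of deviance for two nested binomial models can be sketched as follows. The example assumes the `fisheriris` sample data set and relies on the standard large-sample result that, for nested models with the dispersion fixed at 1.0, the drop in deviance is approximately chi-square distributed with degrees of freedom equal to the number of added terms. The choice of which predictors form the smaller model is arbitrary and made only for illustration.

```matlab
load fisheriris
X = meas(51:end, :);                          % versicolor and virginica rows
y = strcmp('versicolor', species(51:end));    % binary response

% Smaller model: first two predictors only. Larger model: all four.
[~, devSmall] = glmfit(X(:, 1:2), y, 'binomial');
[~, devFull]  = glmfit(X, y, 'binomial');

% Drop in deviance and its approximate chi-square p-value (2 added terms).
D = devSmall - devFull;
p = 1 - chi2cdf(D, 2);
fprintf('Deviance drop = %.3f, p = %.4g\n', D, p);
```

A small p-value suggests that the two additional predictors improve the fit by more than chance alone would explain.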

`[b,dev,stats] = glmfit(...)` returns `dev` and `stats`.

`stats` is a structure with the following fields:

• `beta` — Coefficient estimates `b`

• `dfe` — Degrees of freedom for error

• `sfit` — Estimated dispersion parameter

• `s` — Theoretical or estimated dispersion parameter

• `estdisp` — 0 when the `'estdisp'` name-value pair argument value is `'off'` and 1 when the `'estdisp'` name-value pair argument value is `'on'`.

• `covb` — Estimated covariance matrix for `b`

• `se` — Vector of standard errors of the coefficient estimates `b`

• `coeffcorr` — Correlation matrix for `b`

• `t` — t statistics for `b`

• `p` — p-values for `b`

• `resid` — Vector of residuals

• `residp` — Vector of Pearson residuals

• `residd` — Vector of deviance residuals

• `resida` — Vector of Anscombe residuals

If you estimate a dispersion parameter for the binomial or Poisson distribution, then `stats.s` is set equal to `stats.sfit`. Also, the elements of `stats.se` differ by the factor `stats.s` from their theoretical values.
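As a sketch of how the fields of `stats` might be used, the following computes approximate 95% Wald confidence intervals from `stats.beta` and `stats.se`. The normal-quantile interval is an assumption appropriate when the dispersion is fixed at its theoretical value; the data are the same as in the probit example on this page.

```matlab
x = [2100 2300 2500 2700 2900 3100 ...
     3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';

[b, dev, stats] = glmfit(x, [y n], 'binomial', 'link', 'probit');

% Approximate 95% Wald intervals: estimate +/- z * standard error.
z  = norminv(0.975);
ci = [stats.beta - z*stats.se, stats.beta + z*stats.se];

% Columns: estimate, standard error, t statistic, p-value, lower, upper.
disp([stats.beta stats.se stats.t stats.p ci])
```

Each entry of `stats.t` is the corresponding coefficient estimate divided by its standard error.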

## Examples


Enter the sample data.

```
x = [2100 2300 2500 2700 2900 3100 ...
     3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';
```

Each `y` value is the number of successes in the corresponding number of trials in `n`, and `x` contains the predictor variable values.

Fit a probit regression model for `y` on `x`.

```
b = glmfit(x,[y n],'binomial','link','probit');
```

Compute the estimated number of successes, then plot the observed and estimated proportions of successes against the `x` values.

```
yfit = glmval(b,x,'probit','size',n);
plot(x, y./n,'o',x,yfit./n,'-','LineWidth',2)
```
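To predict at new predictor values, one option is to pass the `stats` output of `glmfit` to `glmval`, which then also returns half-widths of 95% confidence bounds. The sketch below repeats the fit so that it runs on its own; the grid `xNew` is a hypothetical choice.

```matlab
x = [2100 2300 2500 2700 2900 3100 ...
     3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';

[b, ~, stats] = glmfit(x, [y n], 'binomial', 'link', 'probit');

% Predicted success probabilities on a hypothetical grid of new x values.
xNew = (2000:100:4400)';
[pHat, dLo, dHi] = glmval(b, xNew, 'probit', stats);

% The bounds are pHat - dLo (lower) and pHat + dHi (upper).
plot(x, y./n, 'o', xNew, pHat, '-', ...
     xNew, pHat - dLo, ':', xNew, pHat + dHi, ':', 'LineWidth', 2)
```

Without the `'size'` argument, `glmval` returns probabilities rather than expected counts, so no division by `n` is needed.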

Load the sample data.

```
load fisheriris
```

The column vector `species` contains the species name of each iris flower: setosa, versicolor, or virginica. The double matrix `meas` contains four measurements for each flower: sepal length, sepal width, petal length, and petal width, all in centimeters.

Define the response and predictor variables.

```
X = meas(51:end,:);
y = strcmp('versicolor',species(51:end));
```

Define three function handles, created using `@`, that define the link, the derivative of the link, and the inverse link for a logit link function, and store them in a cell array.

```
link = @(mu) log(mu ./ (1-mu));
derlink = @(mu) 1 ./ (mu .* (1-mu));
invlink = @(resp) 1 ./ (1 + exp(-resp));
F = {link, derlink, invlink};
```

Fit a logistic regression using `glmfit` with the link function you defined.

```
b = glmfit(X,y,'binomial','link',F)
```

```
b =

   42.6378
    2.4652
    6.6809
   -9.4294
  -18.2861
```

Now fit the same generalized linear model using the built-in `'logit'` link function and compare the results. The coefficient estimates match those from the custom link.

```
b = glmfit(X,y,'binomial','link','logit')
```

```
b =

   42.6378
    2.4652
    6.6809
   -9.4294
  -18.2861
```
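The structure form of a custom link, described under the `'link'` parameter, might be used as in the following sketch. The variable names `S`, `bStruct`, and `bLogit` are illustrative, and the final line reports how closely the custom fit agrees with the built-in `'logit'` fit.

```matlab
load fisheriris
X = meas(51:end, :);
y = strcmp('versicolor', species(51:end));

% Structure with the three required fields, each a function handle.
S = struct( ...
    'Link',       @(mu) log(mu ./ (1 - mu)), ...
    'Derivative', @(mu) 1 ./ (mu .* (1 - mu)), ...
    'Inverse',    @(eta) 1 ./ (1 + exp(-eta)));

bStruct = glmfit(X, y, 'binomial', 'link', S);
bLogit  = glmfit(X, y, 'binomial', 'link', 'logit');

% Largest absolute difference between the two coefficient vectors.
disp(max(abs(bStruct - bLogit)))
```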
