glmfit

Generalized linear model regression

Syntax

b = glmfit(X,y,distr)
b = glmfit(X,y,distr,param1,val1,param2,val2,...)
[b,dev] = glmfit(...)
[b,dev,stats] = glmfit(...)

Description

b = glmfit(X,y,distr) returns a (p + 1)-by-1 vector b of coefficient estimates for a generalized linear regression of the responses in y on the predictors in X, using the distribution distr. X is an n-by-p matrix of p predictors at each of n observations. distr can be any of the following strings: 'binomial', 'gamma', 'inverse gaussian', 'normal' (the default), and 'poisson'.

In most cases, y is an n-by-1 vector of observed responses. For the binomial distribution, y can be a binary vector indicating success or failure at each observation, or a two-column matrix in which the first column contains the number of successes for each observation and the second column contains the number of trials for each observation.

This syntax uses the canonical link (see below) to relate the distribution to the predictors.

    Note:   By default, glmfit adds a first column of 1s to X, corresponding to a constant term in the model. Do not enter a column of 1s directly into X. You can change the default behavior of glmfit using the 'constant' parameter, below.

glmfit treats NaNs in either X or y as missing values, and ignores them.
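
As a quick check of the conventions above, the following sketch fits the default 'normal' model to simulated data and compares the result with an ordinary least-squares fit that uses an explicit column of 1s. It then confirms that a NaN response is handled like a dropped observation. The simulated data, the seed, and the variable names bOLS, yMissing, bMissing, and bDropped are assumptions made for illustration only.

```matlab
rng(0);                              % fix the seed so the sketch is reproducible
n = 50;  p = 2;
X = randn(n,p);                      % simulated predictors (illustrative only)
y = 1 + X*[2; -3] + 0.1*randn(n,1);  % simulated responses

b = glmfit(X,y,'normal');            % glmfit adds the constant column itself
bOLS = [ones(n,1) X] \ y;            % least squares with an explicit column of 1s

disp(size(b))                        % (p + 1)-by-1, here 3-by-1
disp(max(abs(b - bOLS)))             % expected to be near zero

% A NaN response is treated as missing, so that observation is ignored.
yMissing = y;
yMissing(1) = NaN;
bMissing = glmfit(X,yMissing,'normal');
bDropped = glmfit(X(2:end,:),y(2:end),'normal');
disp(max(abs(bMissing - bDropped)))  % expected to be near zero
```

The first element of b is the constant term, which is why X itself should not contain a column of 1s.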

b = glmfit(X,y,distr,param1,val1,param2,val2,...) additionally allows you to specify optional parameter name/value pairs to control the model fit. Acceptable parameters are as follows.

Each parameter name below is followed by its allowed values and their descriptions.

'link'

  • 'identity' (default for the distribution 'normal'): µ = Xb

  • 'log' (default for the distribution 'poisson'): log(µ) = Xb

  • 'logit' (default for the distribution 'binomial'): log(µ/(1 - µ)) = Xb

  • 'probit': norminv(µ) = Xb

  • 'comploglog': log(-log(1 - µ)) = Xb

  • 'reciprocal' (default for the distribution 'gamma'): 1/µ = Xb

  • 'loglog': log(-log(µ)) = Xb

  • p, a number (default for the distribution 'inverse gaussian', with p = -2): µ^p = Xb

  • Cell array of the form {FL FD FI}, containing three function handles, created using @, that define the link (FL), the derivative of the link (FD), and the inverse link (FI): Custom-defined link function. You must provide FL(mu), FD = dFL(mu)/dmu, and FI = FL^(-1).

'estdisp'

  • 'on': Estimates a dispersion parameter for the binomial or Poisson distribution.

  • 'off' (default for the binomial or Poisson distribution): Uses the theoretical value of 1.0 for those distributions.

'offset'

  • Vector: Used as an additional predictor variable, but with a coefficient value fixed at 1.0.

'weights'

  • Vector of prior weights, such as the inverses of the relative variance of each observation.

'constant'

  • 'on' (default): Includes a constant term in the model. The coefficient of the constant term is the first element of b.

  • 'off': Omits the constant term.
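
A minimal sketch of how the name/value pairs combine, assuming simulated Poisson counts with a hypothetical per-observation exposure. The variable names exposure, bOffset, bManual, and bLog are illustrative and not part of glmfit. The three calls describe the same model, so their coefficient estimates should agree.

```matlab
rng(1);                                   % reproducible simulated data (illustrative only)
n = 200;
x = rand(n,1);
exposure = 1 + 9*rand(n,1);               % hypothetical exposure for each observation
y = poissrnd(exposure .* exp(0.5 + 1.2*x));

% Model: log(mu) = log(exposure) + b(1) + b(2)*x.
% The offset enters the linear predictor with its coefficient fixed at 1.
bOffset = glmfit(x,y,'poisson','offset',log(exposure));

% Same model, with the column of 1s supplied by hand and 'constant' set to 'off'.
bManual = glmfit([ones(n,1) x],y,'poisson', ...
                 'offset',log(exposure),'constant','off');

% Same model again, with the default log link named explicitly.
bLog = glmfit(x,y,'poisson','link','log','offset',log(exposure));

disp([bOffset bManual bLog])              % the three columns should agree
```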

[b,dev] = glmfit(...) returns dev, the deviance of the fit at the solution vector. The deviance is a generalization of the residual sum of squares. You can perform an analysis of deviance to compare several nested models, each a subset of the next, and test whether a model with more terms fits significantly better than a model with fewer terms.
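
One way to carry out such a comparison is sketched below for two nested Poisson models. It assumes simulated data and the usual large-sample approximation in which, with the dispersion fixed at 1, the drop in deviance is compared with a chi-square distribution whose degrees of freedom equal the number of added coefficients. The variable names are illustrative.

```matlab
rng(2);                                      % reproducible simulated data (illustrative only)
n = 100;
x1 = randn(n,1);
x2 = randn(n,1);
y = poissrnd(exp(0.3 + 0.5*x1 + 0.4*x2));    % x2 has a real effect in this simulation

[~,devSmall] = glmfit(x1,y,'poisson');       % model with x1 only
[~,devLarge] = glmfit([x1 x2],y,'poisson');  % model with x1 and x2

D = devSmall - devLarge;                     % drop in deviance from adding x2
pValue = 1 - chi2cdf(D,1);                   % 1 degree of freedom: one added coefficient
fprintf('Deviance difference = %.3f, p = %.4g\n',D,pValue);
```

A small p-value suggests that the larger model fits significantly better than the smaller one.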

[b,dev,stats] = glmfit(...) returns dev and stats.

stats is a structure with the following fields:

  • beta — Coefficient estimates b

  • dfe — Degrees of freedom for error

  • s — Theoretical or estimated dispersion parameter

  • sfit — Estimated dispersion parameter

  • se — Vector of standard errors of the coefficient estimates b

  • coeffcorr — Correlation matrix for b

  • covb — Estimated covariance matrix for b

  • t — t statistics for b

  • p — p-values for b

  • resid — Vector of residuals

  • residp — Vector of Pearson residuals

  • residd — Vector of deviance residuals

  • resida — Vector of Anscombe residuals

If you estimate a dispersion parameter for the binomial or Poisson distribution, then stats.s is set equal to stats.sfit. Also, the elements of stats.se differ by the factor stats.s from their theoretical values.
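
The relationship between stats.s, stats.sfit, and stats.se can be checked with a short sketch, assuming simulated Poisson data. The names statsOff and statsOn are illustrative.

```matlab
rng(3);                                   % reproducible simulated data (illustrative only)
n = 150;
x = randn(n,1);
y = poissrnd(exp(1 + 0.5*x));

[~,~,statsOff] = glmfit(x,y,'poisson');                 % 'estdisp' is 'off' by default
[~,~,statsOn]  = glmfit(x,y,'poisson','estdisp','on');  % estimate the dispersion

disp([statsOff.beta statsOff.se statsOff.t statsOff.p]) % coefficient summary
disp([statsOff.s statsOff.sfit])   % s is the theoretical value 1; sfit is the estimate
disp([statsOn.s statsOn.sfit])     % with 'estdisp' on, s equals sfit
disp(statsOn.se ./ statsOff.se)    % each ratio equals statsOn.s
```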

Examples

Fit Generalized Linear Model with Probit Link

Enter sample data.

x = [2100 2300 2500 2700 2900 3100 ...
     3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';

Each y value is the number of successes in the corresponding number of trials in n, and x contains the predictor variable values.

Fit a probit regression model for y on x.

b = glmfit(x,[y n],'binomial','link','probit');

Compute the estimated number of successes, then plot the observed and estimated proportions of successes versus the x values.

yfit = glmval(b,x,'probit','size',n);
plot(x, y./n,'o',x,yfit./n,'-','LineWidth',2)
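
To make the role of the probit link explicit, the following sketch repeats the fit and reproduces the glmval output by applying the inverse link, normcdf, to the linear predictor. The variable name pHat is illustrative.

```matlab
x = [2100 2300 2500 2700 2900 3100 ...
     3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';

b = glmfit(x,[y n],'binomial','link','probit');
yfit = glmval(b,x,'probit','size',n);

% Probit link: norminv(mu) = b(1) + b(2)*x, so mu = normcdf(b(1) + b(2)*x).
pHat = normcdf(b(1) + b(2)*x);     % fitted success probability at each x
disp(max(abs(yfit - n.*pHat)))     % expected to be near zero
```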

Use Custom-defined Link Function

Load the sample data.

load fisheriris

The column vector species contains the species name of each iris flower: setosa, versicolor, or virginica. The double matrix meas contains four measurements for each flower: sepal length, sepal width, petal length, and petal width, in centimeters.

Define the response and predictor variables.

X = meas(51:end,:);                        % rows 51 to 150: versicolor and virginica only
y = strcmp('versicolor',species(51:end));  % true for versicolor, false for virginica

Define three function handles, created using @, that define the link, the derivative of the link, and the inverse link for a logit link function, and store them in a cell array.

link = @(mu) log(mu ./ (1-mu));
derlink = @(mu) 1 ./ (mu .* (1-mu));
invlink = @(resp) 1 ./ (1 + exp(-resp));
F = {link, derlink, invlink};

Fit a logistic regression using glmfit with the link function you defined.

b = glmfit(X,y,'binomial','link',F)
b =

   42.6378
    2.4652
    6.6809
   -9.4294
  -18.2861

Now, fit a generalized linear model using the logit link function and compare the results.

b = glmfit(X,y,'binomial','link','logit')
b =

   42.6378
    2.4652
    6.6809
   -9.4294
  -18.2861
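
The same {FL FD FI} pattern applies to links other than the logit. As a further sketch, the following code defines a probit link by hand and compares it with the built-in 'probit' option, using the data from the probit example above. The handle names probitLink, probitDerLink, and probitInvLink are hypothetical.

```matlab
x = [2100 2300 2500 2700 2900 3100 ...
     3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';

% Hypothetical handle names; together they define a probit link.
probitLink    = @(mu) norminv(mu);                 % FL
probitDerLink = @(mu) 1 ./ normpdf(norminv(mu));   % FD = dFL(mu)/dmu
probitInvLink = @(eta) normcdf(eta);               % FI = FL^(-1)

bCustom  = glmfit(x,[y n],'binomial', ...
                  'link',{probitLink, probitDerLink, probitInvLink});
bBuiltIn = glmfit(x,[y n],'binomial','link','probit');
disp([bCustom bBuiltIn])                           % the two columns should agree
```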
