# testDeviance

Deviance test for multinomial regression model

Since R2023a

## Syntax

``p = testDeviance(mdl)``
``[p,testStat] = testDeviance(mdl)``

## Description


`p = testDeviance(mdl)` returns the p-value for a test that determines whether the fitted model in the `MultinomialRegression` model object `mdl` fits significantly better than an intercept-only model.


`[p,testStat] = testDeviance(mdl)` also returns the value of the test statistic used to generate the p-value.

## Examples


### Perform Deviance Test for Multinomial Regression Model

Load the `fisheriris` sample data set.

`load fisheriris`

The column vector `species` contains three iris flower species: setosa, versicolor, and virginica. The matrix `meas` contains four types of measurements for the flowers: the length and width of sepals and petals in centimeters.

Fit a multinomial regression model using `meas` as the predictor data and `species` as the response data.

`mdl = fitmnr(meas,species);`

`mdl` is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data.

Perform a chi-squared test with the null hypothesis that an intercept-only model performs as well as the model `mdl`.

`p = testDeviance(mdl)`
```
p = 7.0555e-64
```

The small p-value indicates that enough evidence exists to reject the null hypothesis and conclude that `mdl` performs better than the intercept-only model.

### Perform Deviance Test for Model with Estimated Dispersion

Load the `carbig` sample data set.

`load carbig`

The variables `MPG` and `Origin` contain data for car mileage and country of origin, respectively.

Fit a multinomial regression model with `MPG` as the predictor data and `Origin` as the response. Estimate the dispersion parameter during the fitting.

`mdl = fitmnr(MPG,Origin,EstimateDispersion=true);`

`mdl` is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data.

Perform an F-test with the null hypothesis that an intercept-only model fits the data as well as the model `mdl`. Display the p-value and the F-statistic.

`[p,testStat] = testDeviance(mdl)`
```
p = 1.2314e-45
testStat = 39.1789
```

The small p-value indicates that enough evidence exists to reject the null hypothesis and conclude that `mdl` performs better than the intercept-only model.

## Input Arguments


### `mdl` — Multinomial regression model

Multinomial regression model, specified as a `MultinomialRegression` model object created with the `fitmnr` function.

## Output Arguments


### `p` — Deviance test p-value

Deviance test p-value, returned as a numeric scalar in the range [0,1].

### `testStat` — Deviance test statistic

Deviance test statistic, returned as a numeric scalar. If `mdl.Dispersion` is estimated, `testDeviance` performs an F-test to determine whether the fitted model `mdl` fits better than an intercept-only model. If `mdl.Dispersion` is not estimated, `testDeviance` performs a chi-squared test instead.

## More About

### Deviance

Deviance is a generalization of the residual sum of squares. It measures the goodness of fit compared to a saturated model.

The deviance of a model M1 is twice the difference between the loglikelihood of the model M1 and the saturated model Ms. A saturated model is a model with the maximum number of parameters that you can estimate.

For example, if you have n observations (yi, i = 1, 2, ..., n) with potentially different values of the linear predictor `${x}_{i}^{T}\beta$`, then you can define a saturated model with n parameters. Let L(b,y) denote the maximum value of the likelihood function for a model with the parameters b. Then the deviance of the model M1 is

`$-2\left(\mathrm{log}L\left({b}_{1},y\right)-\mathrm{log}L\left({b}_{S},y\right)\right),$`

where b1 and bS contain the estimated parameters for the model M1 and the saturated model, respectively. The deviance has a chi-squared distribution with n – p degrees of freedom, where n is the number of parameters in the saturated model and p is the number of parameters in the model M1.
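As a concrete illustration of this definition (a Python/NumPy sketch, not part of the MATLAB workflow; the responses and fitted probabilities below are hypothetical), the deviance of a binomial model follows directly from the two log likelihoods:

```python
import numpy as np

# Hypothetical binary responses and fitted probabilities from some model M1.
y = np.array([1, 0, 1, 1, 0, 1, 0, 1])
p_fit = np.array([0.8, 0.3, 0.6, 0.9, 0.2, 0.7, 0.4, 0.85])

# Log likelihood of M1 at its fitted probabilities.
logL1 = np.sum(y * np.log(p_fit) + (1 - y) * np.log(1 - p_fit))

# Saturated model: one parameter per observation, so each fitted
# probability matches the observed response exactly and its log
# likelihood is 0 for binary data.
logLS = 0.0

# Deviance: twice the log-likelihood gap to the saturated model.
D = -2 * (logL1 - logLS)
print(D)  # ≈ 4.8983
```

A better-fitting model has a log likelihood closer to the saturated model's, so its deviance is smaller.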

Assume you have two different generalized linear regression models M1 and M2, where M2 contains a subset of the terms in M1. You can assess the fit of the models by comparing their deviances D1 and D2. The difference of the deviances is

`$\begin{array}{l}D={D}_{2}-{D}_{1}=-2\left(\mathrm{log}L\left({b}_{2},y\right)-\mathrm{log}L\left({b}_{S},y\right)\right)+2\left(\mathrm{log}L\left({b}_{1},y\right)-\mathrm{log}L\left({b}_{S},y\right)\right)\\ \text{ }\text{ }\text{ }\text{\hspace{0.17em}}\text{\hspace{0.17em}}=-2\left(\mathrm{log}L\left({b}_{2},y\right)-\mathrm{log}L\left({b}_{1},y\right)\right).\end{array}$`

Asymptotically, the difference D has a chi-squared distribution with degrees of freedom v equal to the difference in the number of parameters estimated in M1 and M2. You can obtain the p-value for this test by using `1 - chi2cdf(D,v)`.
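For instance, the p-value computation can be sketched in Python with SciPy (the deviance difference and degrees of freedom below are hypothetical values, not taken from the examples above):

```python
from scipy.stats import chi2

# Hypothetical deviance difference between two nested models.
D = 25.0
# Difference in the number of estimated parameters.
v = 3

# SciPy analog of MATLAB's 1 - chi2cdf(D,v); the survival function
# sf computes the upper tail probability directly.
p = chi2.sf(D, v)
print(p)
```

A small p-value indicates that the larger model fits significantly better than the smaller nested model.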

Typically, you examine D using a model M2 with a constant term and no predictors. Therefore, D has a chi-squared distribution with p – 1 degrees of freedom. If the dispersion is estimated, the difference divided by the estimated dispersion has an F distribution with p – 1 numerator degrees of freedom and n – p denominator degrees of freedom.
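When the dispersion is estimated, the p-value comes from the F distribution instead of the chi-squared distribution. A SciPy sketch of that lookup, with illustrative values for the statistic and the degrees of freedom (none of these numbers come from the examples above):

```python
from scipy.stats import f

# Hypothetical deviance-based F statistic and degrees of freedom.
F = 4.63     # test statistic after scaling by the estimated dispersion
df1 = 3      # numerator degrees of freedom: p - 1
df2 = 96     # denominator degrees of freedom: n - p

# SciPy analog of MATLAB's 1 - fcdf(F,df1,df2).
pval = f.sf(F, df1, df2)
print(pval)
```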

## Alternative Functionality

`coefTest` performs an F-test to determine whether the coefficient estimates in `mdl` are zero. If you do not specify coefficients to test, `coefTest` tests whether the model `mdl` is a better fit to the data than a model with no coefficients.

## Version History

Introduced in R2023a