## Multinomial Models for Hierarchical Responses

The outcome of a response variable is sometimes restricted to one of a limited set of
possible values. If there are only two possible outcomes, such as male and female for
gender, the responses are called binary responses. If there are more than two outcomes,
they are called polytomous responses. These responses are usually qualitative
rather than quantitative, such as the preferred district of a city to live in, the
severity level of a disease, or the species of a certain flower. Polytomous
responses might also have categories that are not independent of each other. Instead,
the response happens in a sequential manner, or one category is nested in the previous
one. These types of responses are called *hierarchical*, *sequential*, or *nested
multinomial responses*.

For example, if the response is the number of cigarettes a person smokes in a given
day, the first level is whether the person is a smoker or not. Given that he or she is a
smoker, the number of cigarettes he or she smokes can be from one to five or more than
five a day. Given that it is more than 5, this person might be smoking from 6 to 10 or
more than 10 cigarettes a day, and so on. The risk group at each level changes
accordingly. At level one, the risk group is all of the individuals of interest (smoker
or not), say *m*. If, out of *m* individuals, *y*_{1} of them are not smokers, then at
level two, the risk group is the number of all smoking individuals, *m* – *y*_{1}. If
*y*_{2} of these *m* – *y*_{1} individuals smoke from one to five cigarettes a day,
then at level three, the risk group is *m* – *y*_{1} – *y*_{2}. So, at each level, the
number of people in that category becomes a conditional binomial observation.
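
The shrinking risk groups in the smoking example can be traced with a few lines of arithmetic. The following is a minimal sketch in plain Python, not a call to `fitmnr`; the counts are hypothetical and chosen only for illustration.

```python
# Hypothetical counts for the smoking example, in category order:
# nonsmoker, 1-5 cigarettes, 6-10 cigarettes, more than 10 cigarettes.
counts = [60, 25, 10, 5]
m = sum(counts)  # level-one risk group: every individual of interest

risk_groups = []
remaining = m
for y in counts[:-1]:        # the last category needs no level of its own
    risk_groups.append(remaining)
    remaining -= y           # people who stop at this level leave the risk group

# Observed conditional proportion at each level: y_j out of its risk group.
conditional_props = [y / r for y, r in zip(counts, risk_groups)]

print(risk_groups)           # [100, 40, 15]
print(conditional_props)
```

Each entry of `conditional_props` is the observed success fraction of one conditional binomial observation, which is the quantity the model at that level describes.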

A hierarchical multinomial regression model is an extension of a binary regression
model based on conditional binary observations. The default is a model with separate
intercepts and slopes (coefficients) among categories. In this case, the `fitmnr`
function creates a `MultinomialRegression` model object with a sequence of conditional
binomial models. The name-value argument `IncludeClassInteractions=true` in `fitmnr`
specifies the default multinomial model. By default, `fitmnr` uses the `logit` link
function to create a `MultinomialRegression` model object. You can specify a different
link function using the `Link` name-value argument.

Suppose the probability that an individual is in category *j* given
that he or she is not in the previous categories is *π*_{j}, and the
cumulative probability that a response belongs to category *j* or a previous category
is P(*y* ≤ *c*_{j}). Then the hierarchical model with a logit link function and the
different slopes assumption is

$$\begin{array}{l}\mathrm{ln}\left(\frac{{\pi}_{1}}{1-P\left(y\le {c}_{1}\right)}\right)=\mathrm{ln}\left(\frac{{\pi}_{1}}{1-{\pi}_{1}}\right)={\alpha}_{1}+{\beta}_{11}{X}_{1}+{\beta}_{12}{X}_{2}+\cdots +{\beta}_{1p}{X}_{p},\\ \mathrm{ln}\left(\frac{{\pi}_{2}}{1-P\left(y\le {c}_{2}\right)}\right)=\mathrm{ln}\left(\frac{{\pi}_{2}}{1-\left({\pi}_{1}+{\pi}_{2}\right)}\right)={\alpha}_{2}+{\beta}_{21}{X}_{1}+{\beta}_{22}{X}_{2}+\cdots +{\beta}_{2p}{X}_{p},\\ \qquad \vdots \\ \mathrm{ln}\left(\frac{{\pi}_{k-1}}{1-P\left(y\le {c}_{k-1}\right)}\right)=\mathrm{ln}\left(\frac{{\pi}_{k-1}}{1-\left({\pi}_{1}+\cdots +{\pi}_{k-1}\right)}\right)={\alpha}_{k-1}+{\beta}_{(k-1)1}{X}_{1}+{\beta}_{(k-1)2}{X}_{2}+\cdots +{\beta}_{(k-1)p}{X}_{p}.\end{array}$$

For example, for a response variable with four sequential categories, there are 4 – 1 = 3 equations as follows:

$$\begin{array}{l}\mathrm{ln}\left(\frac{\pi {}_{1}}{\pi {}_{2}+\pi {}_{3}+\pi {}_{4}}\right)={\alpha}_{1}+{\beta}_{11}{X}_{1}+{\beta}_{12}{X}_{2}+\cdots +{\beta}_{1p}{X}_{p},\\ \mathrm{ln}\left(\frac{\pi {}_{2}}{\pi {}_{3}+\pi {}_{4}}\right)={\alpha}_{2}+{\beta}_{21}{X}_{1}+{\beta}_{22}{X}_{2}+\cdots +{\beta}_{2p}{X}_{p},\\ \mathrm{ln}\left(\frac{\pi {}_{3}}{\pi {}_{4}}\right)={\alpha}_{3}+{\beta}_{31}{X}_{1}+{\beta}_{32}{X}_{2}+\cdots +{\beta}_{3p}{X}_{p}.\end{array}$$
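
To see how the three equations fit together, the sketch below turns three level-wise linear predictors into four category probabilities and then recovers the log odds on the left side of each equation. It is plain Python rather than `fitmnr` output. The linear predictor values are hypothetical, and the sketch assumes that the ratios in the four-category equations are formed from category probabilities.

```python
import math


def sigmoid(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def category_probs_from_logits(etas):
    """Map level-wise linear predictors to category probabilities.

    etas[j] stands for alpha_j + beta_j1*X_1 + ... + beta_jp*X_p at level j.
    """
    probs = []
    still_at_risk = 1.0                   # probability of reaching this level
    for eta in etas:
        stop_here = sigmoid(eta)          # P(category j | not in an earlier category)
        probs.append(still_at_risk * stop_here)
        still_at_risk *= 1.0 - stop_here
    probs.append(still_at_risk)           # the last category takes what is left
    return probs


etas = [0.4, -0.2, 0.1]                   # hypothetical values, not fitted estimates
p = category_probs_from_logits(etas)

# Left sides of the three equations: ln(p_j / (p_{j+1} + ... + p_4)).
recovered = [math.log(p[j] / sum(p[j + 1:])) for j in range(len(etas))]

print(p)
print(recovered)
```

The recovered log odds agree with `etas` up to rounding, which is what makes each level a separate binary logit model on its own risk group.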

The coefficients *β*_{ij} are interpreted within each level. For example, for the
previous smoking example, *β*_{12} shows the impact of *X*_{2} on the log odds of a
person being a nonsmoker versus a smoker, provided that everything else is held constant.
Alternatively, *β*_{22} shows the impact of *X*_{2} on the log odds of a person smoking
one to five cigarettes versus more than five cigarettes a day, given that he or she is a
smoker, provided that everything else is held constant. Similarly, *β*_{32} shows the
effect of *X*_{2} on the log odds of a person smoking 6 to 10 cigarettes versus more
than 10 cigarettes a day, given that he or she smokes more than 5 cigarettes a day,
provided that everything else is held constant.
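
The "everything else held constant" reading can be checked numerically. In this minimal Python sketch the level-two coefficients are hypothetical, not fitted values. Raising *X*_{2} by one unit shifts the level-two log odds by *β*_{22}, so the odds are multiplied by exp(*β*_{22}).

```python
import math

# Hypothetical level-two coefficients: alpha_2 and [beta_21, beta_22].
alpha_2 = -0.5
beta_2 = [0.3, 0.8]


def level_log_odds(alpha, beta, x):
    """Linear predictor alpha + beta . x for one level of the hierarchy."""
    return alpha + sum(b * xi for b, xi in zip(beta, x))


x_before = [1.0, 2.0]
x_after = [1.0, 3.0]   # X_2 raised by one unit, X_1 unchanged

change = level_log_odds(alpha_2, beta_2, x_after) - level_log_odds(alpha_2, beta_2, x_before)
odds_ratio = math.exp(change)

print(change)      # equals beta_22 up to rounding
print(odds_ratio)  # multiplicative change in the level-two odds
```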

You can specify other link functions for hierarchical models. The `Link="probit"`
name-value argument uses the probit link function. With the separate slopes assumption,
the model becomes

$$\begin{array}{l}{\Phi}^{-1}\left({\pi}_{1}\right)={\alpha}_{1}+{\beta}_{11}{X}_{1}+\cdots +{\beta}_{1p}{X}_{p},\\ {\Phi}^{-1}\left({\pi}_{2}\right)={\alpha}_{2}+{\beta}_{21}{X}_{1}+\cdots +{\beta}_{2p}{X}_{p},\\ \qquad \vdots \\ {\Phi}^{-1}\left({\pi}_{k-1}\right)={\alpha}_{k-1}+{\beta}_{(k-1)1}{X}_{1}+\cdots +{\beta}_{(k-1)p}{X}_{p},\end{array}$$

where *π*_{j} is the conditional probability of being in category *j*, given that the
response is not in a category previous to category *j*, and Φ^{-1}(·) is the inverse of
the standard normal cumulative distribution function.
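
With the probit link, only the mapping from the linear predictor to the conditional probability changes. The sketch below uses `statistics.NormalDist` from the Python standard library for Φ and Φ^{-1}. The linear predictor values are hypothetical, and nothing here calls `fitmnr`.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def category_probs_from_probits(etas):
    """Map level-wise linear predictors to category probabilities via Phi."""
    probs = []
    still_at_risk = 1.0
    for eta in etas:
        conditional = std_normal.cdf(eta)   # pi_j = Phi(eta_j)
        probs.append(still_at_risk * conditional)
        still_at_risk *= 1.0 - conditional
    probs.append(still_at_risk)             # last category
    return probs


etas = [0.25, -0.4, 0.1]                    # hypothetical values
p = category_probs_from_probits(etas)

# Recover each conditional probability from the category probabilities,
# then apply the inverse CDF to get back the linear predictor.
recovered = []
for j in range(len(etas)):
    conditional = p[j] / sum(p[j:])
    recovered.append(std_normal.inv_cdf(conditional))

print(p)
print(recovered)
```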

After estimating the model coefficients by using `fitmnr` to create a
`MultinomialRegression` model object, you can estimate the conditional probabilities by
using `predict` with the name-value argument `ProbabilityType="conditional"`. `predict`
accepts the `MultinomialRegression` model object returned by `fitmnr`, and estimates the
category labels, categorical probabilities, and confidence bounds for each categorical
probability. You can specify whether `predict` returns category, cumulative, or
conditional probabilities using the `ProbabilityType` name-value argument.
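
The three probability types are different views of the same distribution. The following Python sketch shows the arithmetic relating them for one observation, taking "conditional" to mean the probability of category *j* given that the response is not in an earlier category, as defined in this section. The category probabilities are hypothetical. How `predict` lays out its output, and whether it reports a value for the last category, is not shown here.

```python
from itertools import accumulate

# Hypothetical category probabilities for one observation (four categories).
category = [0.60, 0.25, 0.10, 0.05]

# Cumulative: P(y <= c_j), a running sum over categories.
cumulative = list(accumulate(category))

# Conditional: P(category j | not in an earlier category).
conditional = []
at_risk = 1.0
for prob in category:
    conditional.append(prob / at_risk)
    at_risk -= prob

print(cumulative)
print(conditional)
```

In this sketch the conditional value for the last category is 1 up to rounding, because a response that reaches the final level has nowhere else to go.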

## See Also

`fitglm` | `fitmnr` | `predict` | `glmfit` | `glmval`