Cox proportional hazards regression is a semiparametric method
for adjusting survival rate estimates to quantify the effect of predictor
variables. The method represents the effects of explanatory variables
as a multiplier of a common baseline hazard function, *h*_{0}(*t*).
The baseline hazard function is the nonparametric part of the Cox proportional
hazards regression function, whereas the effect of the predictor variables
enters through a loglinear regression term. For a baseline at which all
predictor variables equal 0, this model corresponds to

$${h}_{X}(t)={h}_{0}(t){e}^{{\displaystyle \sum _{i}{X}_{i}{b}_{i}}},$$

where *h*_{X}(*t*)
is the hazard rate at *X* and *h*_{0}(*t*)
is the baseline hazard rate function.
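
As a minimal sketch of the formula above, the following Python snippet evaluates the hazard rate at a single time point. The function name `hazard_rate` and all numeric values are hypothetical and chosen only for illustration; they do not come from any fitted model.

```python
import math

def hazard_rate(baseline_hazard, x, b):
    """Return h_X(t) = h_0(t) * exp(sum_i X_i * b_i) at one time point."""
    linear_predictor = sum(xi * bi for xi, bi in zip(x, b))
    return baseline_hazard * math.exp(linear_predictor)

# Hypothetical values, for illustration only.
h0_t = 0.02        # baseline hazard h_0(t) at some fixed time t
x = [1.0, 0.5]     # predictor values X for one individual
b = [0.7, -0.3]    # regression coefficients

h_x_t = hazard_rate(h0_t, x, b)

# With all predictors at 0, the multiplier is exp(0) = 1,
# so the hazard equals the baseline hazard.
h_at_baseline = hazard_rate(h0_t, [0.0, 0.0], b)
```

The predictors act only through the multiplier `exp(sum_i X_i * b_i)`, so changing the time point changes `h0_t` but not the multiplier.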

The Cox proportional hazards model relates the hazard rate for
individuals or items at the value *X* to the hazard
rate for individuals or items at the baseline value. It produces an
estimate for the hazard ratio, *HR* = *h*_{X}(*t*)/*h*_{0}(*t*).
The model is based on the assumption that the baseline hazard function
depends on time, *t*, but the effects of the predictor variables
do not. This is also called the proportional hazards assumption, which
states that the hazard ratio between any two individuals does not change over time.
The hazard ratio represents the relative risk of instantaneous failure for
individuals or items having the predictor variable value *X* compared
to the ones having the baseline values. For example, if the predictor
variable is smoking status, where nonsmoking is the baseline category,
the hazard ratio shows the relative instantaneous failure rate of smokers
compared to the baseline category, that is, nonsmokers.

For a baseline relative to *X*^{*} and
the predictor variable value *X*, the hazard ratio
is

$$HR=\frac{{h}_{X}\left(t\right)}{{h}_{{X}^{*}}\left(t\right)}=\mathrm{exp}\left[{\displaystyle \sum _{i}\left({X}_{i}-{X}_{i}{}^{*}\right){b}_{i}}\right].$$
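
The hazard ratio formula can be sketched in a few lines of Python. The helper name `hazard_ratio`, the coefficients, and the predictor values below are assumptions made for illustration only.

```python
import math

def hazard_ratio(x, x_ref, b):
    """Return exp(sum_i (X_i - X*_i) * b_i), the hazard ratio of x relative to x_ref."""
    return math.exp(sum((xi - ri) * bi for xi, ri, bi in zip(x, x_ref, b)))

# Hypothetical coefficients and predictor values.
b = [0.7, -0.3]
x = [1.0, 0.5]
x_star = [0.2, 0.5]   # reference (baseline) predictor values

hr = hazard_ratio(x, x_star, b)

# An individual compared with itself has a hazard ratio of exactly 1.
hr_self = hazard_ratio(x, x, b)

# The baseline hazard cancels: the ratio of two hazards with the same h_0(t)
# does not depend on the value of h_0(t).
h0_t = 0.02
lin = lambda v: sum(vi * bi for vi, bi in zip(v, b))
ratio_of_hazards = (h0_t * math.exp(lin(x))) / (h0_t * math.exp(lin(x_star)))
```

Because the time-dependent factor cancels in the ratio, the result is the same at every time point, which is what the proportional hazards assumption requires.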

For example, if the baseline is the mean values
of the predictor variables (`mean(X)`), then the
hazard rate model becomes

$${h}_{X}\left(t\right)={h}_{\overline{X}}\left(t\right)\mathrm{exp}\left[{\displaystyle \sum _{i}\left({X}_{i}-{\overline{X}}_{i}\right){b}_{i}}\right].$$

Hazard rates are related to survival rates, such that the survival
rate at time *t* for an individual with the explanatory
variable value *X* is

$${S}_{X}\left(t\right)={S}_{0}{\left(t\right)}^{H{R}_{X}\left(t\right)},$$

where *S*_{0}(*t*)
is the survivor function with the baseline hazard rate function *h*_{0}(*t*),
and *HR*_{X}(*t*)
is the hazard ratio of the predictor variable value *X* relative
to the baseline value.
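
The relationship between the two survivor functions can be illustrated with a short Python sketch. The function name `survival_at` and the numbers are hypothetical. The check at the end relies on the standard identity that a survivor function equals exp(−*H*), where *H* is the cumulative hazard.

```python
import math

def survival_at(baseline_survival, hazard_ratio):
    """Return S_X(t) = S_0(t) ** HR for one time point."""
    return baseline_survival ** hazard_ratio

# Hypothetical values: 90% baseline survival at time t, hazard ratio of 2.
s0_t = 0.9
hr = 2.0
s_x_t = survival_at(s0_t, hr)

# Consistency check via the cumulative hazard H_0(t) = -log(S_0(t)):
# multiplying the hazard by HR multiplies the cumulative hazard by HR.
cum_hazard0 = -math.log(s0_t)
s_x_t_via_hazard = math.exp(-hr * cum_hazard0)
```

A hazard ratio greater than 1 lowers the survival rate relative to the baseline, and a hazard ratio of 1 leaves it unchanged.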

The point estimate of the effect of an explanatory variable is its
estimated hazard ratio, exp(*b*), for a one-unit increase in that
variable with all other variables held constant, where *b* is the
coefficient estimate for that variable. The coefficient estimates are found by maximizing
the likelihood function of the model. The likelihood function for
the proportional hazards regression model is based on the observed
order of events. It is the product, over the failure times, of the
likelihood that the observed failure occurs at that time among the
individuals still at risk. If there are *n* failures
at *n* distinct failure times, ordered so that
*t*_{1} < *t*_{2} < ... < *t*_{n}, then the likelihood
is

$$L=\left[\frac{h\left({t}_{1}\right)}{{\displaystyle {\sum}_{i=1}^{n}h\left({t}_{i}\right)}}\right]\times \left[\frac{h\left({t}_{2}\right)}{{\displaystyle {\sum}_{i=2}^{n}h\left({t}_{i}\right)}}\right]\times \cdots \times \left[\frac{h\left({t}_{n}\right)}{h\left({t}_{n}\right)}\right].$$
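
The following Python sketch evaluates the logarithm of this likelihood for a given coefficient vector. It assumes distinct failure times and no censoring, as in the formula above, and it interprets each *h*(*t*_{i}) as the hazard of the individual who fails at *t*_{i}. Under the proportional hazards model the baseline hazard cancels in every factor, so only the terms exp(Σ *X*_{i}*b*_{i}) remain. The function name and the data are hypothetical.

```python
import math

def log_partial_likelihood(times, X, b):
    """Log of the product-form likelihood for distinct, uncensored failure times.

    times : failure time of each individual
    X     : list of predictor rows, one per individual
    b     : coefficient vector
    """
    # Order individuals by failure time so the risk set at the j-th
    # failure is everyone from position j onward.
    order = sorted(range(len(times)), key=lambda i: times[i])
    eta = [sum(xi * bi for xi, bi in zip(X[i], b)) for i in order]

    loglik = 0.0
    for j in range(len(eta)):
        risk_sum = sum(math.exp(e) for e in eta[j:])
        loglik += eta[j] - math.log(risk_sum)
    return loglik

# Hypothetical data: three individuals, one binary predictor.
times = [5.0, 2.0, 9.0]
X = [[1.0], [0.0], [1.0]]

ll_null = log_partial_likelihood(times, X, [0.0])   # all coefficients zero
ll_half = log_partial_likelihood(times, X, [0.5])
```

With all coefficients equal to zero, every individual at risk is equally likely to fail next, so the likelihood reduces to 1/*n*!. Maximizing this function over `b` (for example, with a numerical optimizer) would give the coefficient estimates; that step is not shown here.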

You can use a likelihood ratio test to assess the
significance of adding a term or terms in a model. Consider two
nested models, where the first model has *p* predictor variables
and the second model has *p* + *r* predictor
variables. Then, comparing the two models, −2 log(*L*_{1}/*L*_{2}),
where *L*_{1} and *L*_{2} are the maximized likelihoods of the first
and second models, has approximately a chi-square distribution with
*r* degrees of freedom (the number of terms being tested) when the
additional terms have no effect.
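
As a hedged illustration of the test, the Python sketch below computes the likelihood ratio statistic from two maximized log-likelihoods and a *p*-value for the special case *r* = 1. The log-likelihood values are invented for the example. The *p*-value uses the closed-form tail probability of a chi-square distribution with one degree of freedom, erfc(√(*x*/2)); other values of *r* need a general chi-square distribution function.

```python
import math

def likelihood_ratio_statistic(loglik_reduced, loglik_full):
    """Return -2 * log(L1 / L2) computed from the two maximized log-likelihoods."""
    return -2.0 * (loglik_reduced - loglik_full)

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical maximized log-likelihoods (not from real data).
loglik_p = -105.3          # model with p predictors
loglik_p_plus_1 = -102.1   # model with p + 1 predictors

stat = likelihood_ratio_statistic(loglik_p, loglik_p_plus_1)
p_value = chi2_sf_1df(stat)
```

A small *p*-value suggests that the added term improves the fit by more than chance alone would explain.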


- Hazard and Survivor Functions for Different Groups
- Survivor Functions for Two Groups
- Cox Proportional Hazards Model for Censored Data
