Cox proportional hazards regression is a semiparametric method for adjusting survival rate estimates to quantify the effect of predictor variables. The method represents the effects of explanatory variables as a multiplier of a common baseline hazard function, h0(t). The baseline hazard function is the nonparametric part of the Cox proportional hazards regression function, whereas the impact of the predictor variables is a loglinear regression. For a baseline relative to 0, this model corresponds to
$$h(X_i,t) = h_0(t)\exp\left[\sum_{j=1}^{p} x_{ij} b_j\right],$$
where Xi = (xi1, xi2, ..., xip) is the vector of predictor variable values for the ith subject, bj is the coefficient of the jth predictor variable, h(Xi,t) is the hazard rate at time t for Xi, and h0(t) is the baseline hazard rate function.
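The Cox proportional hazards model relates the hazard rate for individuals or items at the value Xi to the hazard rate for individuals or items at the baseline value. It produces an estimate for the hazard ratio:
$$HR(X_i) = \frac{h(X_i,t)}{h_0(t)} = \exp\left[\sum_{j=1}^{p} x_{ij} b_j\right].$$
The ratio does not depend on the time t. This property is the proportional hazards (PH) assumption.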
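The hazard ratio represents the relative risk of instantaneous failure for individuals or items having the predictive variable value Xi compared to the ones having the baseline values. For example, if the predictive variable is smoking status, where nonsmoking is the baseline category, the hazard ratio shows the relative instantaneous failure rate of smokers compared to the baseline category, that is, nonsmokers. For a baseline relative to X* and the predictor variable value Xi, the hazard ratio is
$$HR(X_i) = \frac{h(X_i,t)}{h(X^*,t)} = \exp\left[\sum_{j=1}^{p} (x_{ij} - x_j^*) b_j\right].$$
For example, if the baseline is the mean values of the predictor variables (mean(X)), then the hazard ratio becomes
$$HR(X_i) = \frac{h(X_i,t)}{h(\bar{X},t)} = \exp\left[\sum_{j=1}^{p} (x_{ij} - \bar{x}_j) b_j\right].$$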
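The effect of the baseline choice can be illustrated with a short Python sketch. It is not part of coxphfit; the function name `hazard_ratio`, the coefficients, and the predictor values are hypothetical and chosen only for illustration.

```python
import math

def hazard_ratio(x, b, baseline=None):
    """Return exp(sum_j (x_j - baseline_j) * b_j) for one subject.

    With baseline=None the baseline is 0, as in the basic model.
    """
    if baseline is None:
        baseline = [0.0] * len(x)
    return math.exp(sum((xj - x0) * bj for xj, x0, bj in zip(x, baseline, b)))

# Hypothetical coefficient estimates and predictor rows (illustration only).
b = [0.7, -0.02]
X = [[1.0, 50.0], [0.0, 60.0], [1.0, 70.0]]

# Column means serve as the baseline X* = mean(X).
x_mean = [sum(col) / len(X) for col in zip(*X)]

hr_zero = [hazard_ratio(x, b) for x in X]          # baseline relative to 0
hr_mean = [hazard_ratio(x, b, x_mean) for x in X]  # baseline relative to mean(X)
```

Changing the baseline rescales every hazard ratio by the same factor, so the ratio between any two subjects is unchanged.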
Hazard rates are related to survival rates, such that the survival rate at time t for an individual with the explanatory variable value Xi is
$$S_{X_i}(t) = S_0(t)^{HR(X_i)},$$
where S0(t) is the survivor function with the baseline hazard rate function h0(t), and HR(Xi) is the hazard ratio of the predictor variable value Xi relative to the baseline value.
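As a small numeric illustration of this relation, the following Python sketch raises a baseline survival probability to the power of a hazard ratio. The exponential baseline with rate 0.1 and the hazard ratio of 2 are assumptions made only for the example; the Cox model itself leaves the baseline unspecified.

```python
import math

def baseline_survival(t, rate=0.1):
    """Hypothetical baseline survivor function S0(t) = exp(-rate * t)."""
    return math.exp(-rate * t)

def survival(s0_t, hr):
    """Survival probability for a subject with hazard ratio hr: S0(t) ** hr."""
    return s0_t ** hr

s0 = baseline_survival(5.0)   # baseline survival at t = 5
s_hr2 = survival(s0, 2.0)     # subject with twice the baseline hazard
```

A hazard ratio above 1 lowers the survival probability at every time, and a hazard ratio of exactly 1 reproduces the baseline curve.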
When you have variables that do not satisfy the proportional hazards (PH) assumption, you can consider using two extensions of the Cox proportional hazards model: the stratified Cox model and the Cox model with time-dependent variables.
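If the variables that do not satisfy the PH assumption are categorical or can be grouped into categories, use the stratified Cox model:
$$h_s(X_i,t) = h_{0s}(t)\exp\left[\sum_{j=1}^{p} x_{ij} b_j\right],$$
where the subscript s indicates the sth stratum. Each stratum has its own baseline hazard function, but all strata share the same coefficients. You can include stratification variables in coxphfit by using a name-value pair argument.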
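If the variables that do not satisfy the PH assumption are time-dependent variables, use the Cox model with time-dependent variables:
$$h(X_i,t) = h_0(t)\exp\left[\sum_{j=1}^{p_1} x_{ij} b_j + \sum_{k=1}^{p_2} x_{ik}(t) c_k\right],$$
where xij are the time-independent predictors with coefficients bj, and xik(t) are the time-dependent predictors with coefficients ck. For an example that uses time-dependent variables with coxphfit, see Cox Proportional Hazards Model with Time-Dependent Covariates.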
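The point estimate of the effect of each explanatory variable is the estimated hazard ratio exp(b), given that all other variables are held constant, where b is the coefficient estimate for that variable. The coefficient estimates are found by maximizing the partial likelihood function of the model. The partial likelihood function for the proportional hazards regression model is based on the observed order of events. It is the product of the partial likelihoods of failures estimated for each failure time. If there are n failures at n distinct failure times, t1 < t2 < ... < tn, then the partial likelihood is
$$L = \frac{HR(X_1)}{\sum_{j=1}^{n} HR(X_j)} \times \frac{HR(X_2)}{\sum_{j=2}^{n} HR(X_j)} \times \cdots \times \frac{HR(X_n)}{HR(X_n)} = \prod_{i=1}^{n} \frac{HR(X_i)}{\sum_{j=i}^{n} HR(X_j)}.$$
Equivalently, the denominator of the ith term is the sum of HR(Xj) over the risk set Ri, the set of subjects who are still under study and have not yet experienced an event just before time ti.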
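The product over risk sets can be sketched in a few lines of Python. This is an illustrative implementation for distinct event times, not the routine that coxphfit uses; the function name, the optional `event` flags for censored observations, and the input values are assumptions made for the example.

```python
import math

def log_partial_likelihood(times, hr, event=None):
    """Log partial likelihood for distinct event times.

    times : observed times, one per subject
    hr    : hazard ratio HR(X_i) of each subject
    event : optional flags; False marks a censored observation, which
            stays in the risk sets but contributes no term of its own
    """
    n = len(times)
    if event is None:
        event = [True] * n
    logl = 0.0
    for i in range(n):
        if not event[i]:
            continue
        # Risk set: every subject whose observed time is at least times[i].
        risk_sum = sum(hr[j] for j in range(n) if times[j] >= times[i])
        logl += math.log(hr[i]) - math.log(risk_sum)
    return logl

# Hypothetical data: three failures at distinct times.
logl = log_partial_likelihood([1.0, 2.0, 3.0], [2.0, 1.0, 1.0])
# The terms are 2/4, 1/2, and 1/1, so exp(logl) is 0.25 up to rounding.
```

Working on the log scale avoids underflow when the number of failures is large. An optimizer would maximize this quantity over the coefficients that determine `hr`.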
You can use a likelihood ratio test to assess the significance of adding a term or terms to a model. Consider two nested models, where the first model has p predictive variables and the second model has p + r predictive variables, and let L1 and L2 be their maximized partial likelihoods. Then, under the null hypothesis that the coefficients of the r added terms are zero, the statistic -2*log(L1/L2) = -2*(log L1 - log L2) has approximately a chi-square distribution with r degrees of freedom (the number of terms being tested).
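The test can be sketched in Python using only the standard library. The helper `chi2_sf` computes the chi-square upper-tail probability for integer degrees of freedom through a standard incomplete-gamma recurrence; both function names and the log-likelihood values are hypothetical, and a statistics library would normally supply the tail probability.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1   # closed form for df = 1
    else:
        q, k = math.exp(-half), 2              # closed form for df = 2
    # Recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def likelihood_ratio_test(logl_reduced, logl_full, r):
    """Return the statistic -2*(logL1 - logL2) and its chi-square p-value."""
    stat = -2.0 * (logl_reduced - logl_full)
    return stat, chi2_sf(stat, r)

# Hypothetical maximized log partial likelihoods; the full model adds r = 2 terms.
stat, p = likelihood_ratio_test(-105.2, -101.0, 2)
```

A small p-value suggests that the added terms improve the fit beyond what chance alone would explain.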
When you have tied events, coxphfit approximates the partial likelihood of the model by either Breslow’s method (default) or Efron’s method, instead of computing the exact partial likelihood. Computing the exact partial likelihood requires a large amount of computation, because it involves every permutation of the risk sets for the tied event times.
The simplest approximation method is Breslow’s method. This method uses the same denominator for each tied set.
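Efron’s method is more accurate than Breslow’s method, yet still simple. This method adjusts the denominator of the tied events as follows:
$$D_k = \sum_{j \in R} HR(X_j) - \frac{k-1}{d} \sum_{j \in D} HR(X_j),$$
where R is the risk set at the tied event time, D is the set of the d tied events, and Dk is the denominator used for the kth tied event, k = 1, ..., d.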
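For example, assume that the first two events are tied, that is, t1 = t2 and t2 < t3 < ... < tn. In Breslow’s method, the denominators of the first two terms are the same:
$$L = \frac{HR(X_1)}{\sum_{j=1}^{n} HR(X_j)} \times \frac{HR(X_2)}{\sum_{j=1}^{n} HR(X_j)} \times \frac{HR(X_3)}{\sum_{j=3}^{n} HR(X_j)} \times \cdots \times \frac{HR(X_n)}{HR(X_n)}.$$
Efron’s method adjusts the denominator of the second term:
$$L = \frac{HR(X_1)}{\sum_{j=1}^{n} HR(X_j)} \times \frac{HR(X_2)}{0.5\,HR(X_1) + 0.5\,HR(X_2) + \sum_{j=3}^{n} HR(X_j)} \times \frac{HR(X_3)}{\sum_{j=3}^{n} HR(X_j)} \times \cdots \times \frac{HR(X_n)}{HR(X_n)}.$$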
You can specify an approximation method by using a name-value pair argument of coxphfit.
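The difference between the two approximations can be seen in a short Python sketch. It assumes that every observation is an event (no censoring) and is not the coxphfit implementation; the function name and the data are hypothetical.

```python
import math
from collections import defaultdict

def log_partial_likelihood_ties(times, hr, method="breslow"):
    """Log partial likelihood with tied event times.

    Assumes every observation is an event (no censoring).
    method is "breslow" or "efron".
    """
    tied = defaultdict(list)
    for i, t in enumerate(times):
        tied[t].append(i)
    logl = 0.0
    for t in sorted(tied):
        events = tied[t]
        d = len(events)
        risk_sum = sum(h for tj, h in zip(times, hr) if tj >= t)
        tied_sum = sum(hr[i] for i in events)
        for k, i in enumerate(events):  # k = 0, ..., d - 1
            if method == "breslow":
                denom = risk_sum                        # same for every tied event
            elif method == "efron":
                denom = risk_sum - (k / d) * tied_sum   # shrinks with each tied event
            else:
                raise ValueError("method must be 'breslow' or 'efron'")
            logl += math.log(hr[i]) - math.log(denom)
    return logl

# Hypothetical data: the first two events are tied.
times = [1.0, 1.0, 2.0]
hr = [2.0, 1.0, 1.0]
logl_breslow = log_partial_likelihood_ties(times, hr, "breslow")
logl_efron = log_partial_likelihood_ties(times, hr, "efron")
```

For these values Breslow’s method gives (2/4)(1/4)(1/1) and Efron’s method gives (2/4)(1/2.5)(1/1), because the second tied event uses the denominator 0.5*2 + 0.5*1 + 1 = 2.5. When there are no ties, the two methods coincide.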
The Cox proportional hazards model can incorporate the frequency or weights of observations. Let wi be the weight of the ith observation. Then, the partial likelihoods of the Cox model with weights are as follows:
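Partial likelihood with weights
$$L = \prod_{i=1}^{n} \left[\frac{HR(X_i)}{\sum_{j \in R_i} w_j HR(X_j)}\right]^{w_i}$$
Partial likelihood with weights and Breslow’s method
$$L = \prod_{i=1}^{m} \frac{\prod_{j \in D_i} HR(X_j)^{w_j}}{\left[\sum_{k \in R_i} w_k HR(X_k)\right]^{\sum_{j \in D_i} w_j}}$$
Partial likelihood with weights and Efron’s method
$$L = \prod_{i=1}^{m} \frac{\prod_{j \in D_i} HR(X_j)^{w_j}}{\prod_{k=1}^{d_i} \left[\sum_{j \in R_i} w_j HR(X_j) - \frac{k-1}{d_i} \sum_{j \in D_i} w_j HR(X_j)\right]^{\frac{1}{d_i}\sum_{j \in D_i} w_j}}$$
Here Ri is the risk set at the ith event time. In the two forms for tied events, the product runs over the m distinct event times, Di is the set of events tied at the ith distinct event time, and di is the number of events in Di.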
You can specify the frequency or weights of observations by using a name-value pair argument of coxphfit.
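The following Python sketch illustrates how weights enter the partial likelihood. It uses Breslow-style denominators and assumes no censoring; the function name and data are hypothetical, and this is not the coxphfit implementation. The example checks one interpretation of integer weights: giving an observation a frequency of 2 yields the same value as listing that observation twice with unit weights.

```python
import math

def weighted_log_partial_likelihood(times, hr, weights):
    """Weighted log partial likelihood with Breslow-style denominators.

    Assumes every observation is an event (no censoring). Each subject
    contributes w_i * [log HR_i - log(sum of w_j * HR_j over its risk set)].
    """
    logl = 0.0
    for ti, hi, wi in zip(times, hr, weights):
        risk_sum = sum(w * h for tj, h, w in zip(times, hr, weights) if tj >= ti)
        logl += wi * (math.log(hi) - math.log(risk_sum))
    return logl

# Hypothetical data: the first observation has frequency 2.
logl_weighted = weighted_log_partial_likelihood(
    [1.0, 2.0, 3.0], [2.0, 1.0, 1.0], [2.0, 1.0, 1.0])

# The same data with the first observation listed twice and unit weights.
logl_duplicated = weighted_log_partial_likelihood(
    [1.0, 1.0, 2.0, 3.0], [2.0, 2.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0])
```

Both calls evaluate to the logarithm of (2/6)^2 * (1/2) = 1/18. The agreement holds because duplicated rows form a tied set that Breslow’s method handles with a common denominator.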