For example, taking a drug may halve one's hazard rate for a stroke occurring, or changing the material from which a manufactured component is constructed may double its hazard rate for failure.
Survival models can be viewed as consisting of two parts: the underlying baseline hazard function, often denoted $\lambda_0(t)$, describing how the risk of event per time unit changes over time at baseline levels of covariates; and the effect parameters, describing how the hazard varies in response to explanatory covariates.
A typical medical example would include covariates such as treatment assignment, as well as patient characteristics such as age at start of study, gender, and the presence of other diseases at start of study, in order to reduce variability and/or control for confounding.
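In the simplest case of stationary coefficients, for example, a treatment with a drug may, say, halve a subject's hazard at any given time $t$, while the baseline hazard may vary.
For a continuous covariate $x$, it is typically assumed that the hazard responds exponentially; each unit increase in $x$ results in proportional scaling of the hazard.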
[3] However, Cox also noted that biological interpretation of the proportional hazards assumption can be quite tricky.
This expression gives the hazard function at time $t$ for subject $i$ with covariate vector (explanatory variables) $X_i$.
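As a small numerical illustration, the sketch below assumes the usual Cox form, in which the hazard is the baseline hazard multiplied by the exponential of a linear combination of the covariates. The Weibull-type baseline, the coefficient values, and the function names are all hypothetical choices made for the example. It checks that the hazard ratio between two subjects is the same at every time point, which is what "proportional hazards" means.

```python
import math

def baseline_hazard(t, shape=1.5, scale=10.0):
    """Hypothetical Weibull-type baseline hazard (an assumption, for illustration only)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def hazard(t, x, beta):
    """Cox-form hazard: baseline hazard times exp(linear predictor)."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return baseline_hazard(t) * math.exp(linear_predictor)

# Illustrative coefficients: treatment halves the hazard;
# each year of age adds 0.03 to the log-hazard.
beta = [math.log(0.5), 0.03]
treated = [1, 60]   # (treatment indicator, age)
control = [0, 60]

# The ratio treated/control does not depend on t, although the hazards themselves do.
ratios = [hazard(t, treated, beta) / hazard(t, control, beta) for t in (1.0, 5.0, 20.0)]
print(ratios)
```

The baseline cancels in the ratio, so only the difference in covariates and the coefficients matter.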
In other words, adding an intercept term would make the model unidentifiable.
The first factor is the partial likelihood shown below, in which the baseline hazard has "canceled out".
The second factor is free of the regression coefficients and depends on the data only through the censoring pattern.
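Treating the subjects as statistically independent of each other, the partial likelihood for the order of events [6] is
$$L(\beta) = \prod_{i:\,C_i = 1} \frac{\exp(X_i \cdot \beta)}{\sum_{j:\,Y_j \ge Y_i} \exp(X_j \cdot \beta)},$$
where $Y_i$ denotes the observed time for subject $i$ and $C_i = 1$ indicates that the event, rather than censoring, occurred at that time.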
Using this score function and Hessian matrix, the partial likelihood can be maximized using the Newton-Raphson algorithm.
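The Newton-Raphson iteration can be sketched for the simplest setting: a single covariate and no tied event times. The code below is a minimal illustration under those assumptions, not a production fitting routine; the function names and the six-subject dataset are invented for the example.

```python
import math

def log_partial_likelihood(times, events, x, beta):
    """Cox log partial likelihood for one covariate, assuming no tied event times."""
    total = 0.0
    for t_i, d_i, x_i in zip(times, events, x):
        if d_i:
            risk = sum(math.exp(beta * x_j) for t_j, x_j in zip(times, x) if t_j >= t_i)
            total += beta * x_i - math.log(risk)
    return total

def cox_newton_raphson(times, events, x, tol=1e-10, max_iter=50):
    """Maximize the partial likelihood by Newton-Raphson, starting from beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        score = 0.0        # first derivative of the log partial likelihood
        information = 0.0  # negative second derivative (observed information)
        for t_i, d_i, x_i in zip(times, events, x):
            if not d_i:
                continue   # censored subjects contribute only through risk sets
            s0 = s1 = s2 = 0.0
            for t_j, x_j in zip(times, x):
                if t_j >= t_i:                 # subject j is still at risk at t_i
                    theta = math.exp(beta * x_j)
                    s0 += theta
                    s1 += theta * x_j
                    s2 += theta * x_j * x_j
            mean = s1 / s0                     # risk-set weighted mean of x
            score += x_i - mean
            information += s2 / s0 - mean * mean
        step = score / information
        beta += step
        if abs(step) < tol:
            break
    return beta

# Hypothetical data: event indicator 1 = event observed, 0 = censored.
times = [2, 3, 5, 7, 11, 13]
events = [1, 1, 0, 1, 1, 0]
x = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]

beta_hat = cox_newton_raphson(times, events, x)
print(beta_hat, math.exp(beta_hat))  # coefficient and the implied hazard ratio
```

With several covariates the score becomes a vector and the information a matrix, and the scalar division is replaced by solving a linear system.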
Several approaches have been proposed to handle situations in which there are ties in the time data.
An alternative approach that is considered to give better results is Efron's method.
Note that when $H_j$ is empty (all observations with time $t_j$ are censored), the summands in these expressions are treated as zero.
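To make the treatment of ties concrete, the following sketch evaluates the log partial likelihood for one covariate under two schemes: Efron's method and, for comparison, what is commonly called Breslow's approximation. Breslow's approximation is assumed here to be the simpler approach being contrasted with Efron's. The function name and data are hypothetical. Times at which every observation is censored never enter the sum, which corresponds to treating their summands as zero.

```python
import math
from collections import defaultdict

def tied_log_partial_likelihood(times, events, x, beta, method="efron"):
    """Log partial likelihood for one covariate with tied event times."""
    # Group failing subjects by event time; each group plays the role of H_j.
    failures = defaultdict(list)
    for idx, (t, d) in enumerate(zip(times, events)):
        if d:
            failures[t].append(idx)

    theta = [math.exp(beta * xi) for xi in x]
    loglik = 0.0
    for t_j, h_j in failures.items():
        m_j = len(h_j)                                             # number of tied failures
        risk_sum = sum(th for th, t in zip(theta, times) if t >= t_j)
        tied_sum = sum(theta[i] for i in h_j)
        loglik += sum(beta * x[i] for i in h_j)
        for l in range(m_j):
            if method == "efron":
                # Efron: progressively remove a fraction of the tied subjects' weight.
                loglik -= math.log(risk_sum - (l / m_j) * tied_sum)
            else:
                # Breslow: every tied failure uses the full risk set.
                loglik -= math.log(risk_sum)
    return loglik

# Hypothetical data with ties at t = 1 and a censored observation tied with an event at t = 3.
times_tied = [1, 1, 2, 3, 3, 4]
events_tied = [1, 1, 0, 1, 0, 1]
x_tied = [0.5, -1.0, 0.2, 1.5, 0.0, -0.3]

efron = tied_log_partial_likelihood(times_tied, events_tied, x_tied, 0.4, "efron")
breslow = tied_log_partial_likelihood(times_tied, events_tied, x_tied, 0.4, "breslow")
print(efron, breslow)
```

When no two events share a time, every group has a single member and the two schemes give the same value.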
Suppose the endpoint we are interested in is patient survival during a 5-year observation period after a surgery.
There are important caveats to mention about the interpretation:
To demonstrate a less traditional use case of survival analysis, the next example will be an economics question: what is the relationship between a company's price-to-earnings ratio (P/E) on its first IPO anniversary and its future survival?
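More specifically, if we consider a company's "birth event" to be its first IPO anniversary, and any bankruptcy, sale, going private, etc. to be its "death event", we would like to know how the company's P/E ratio at "birth" influences its survival.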
Provided is a (fake) dataset with survival data from 12 companies: T represents the number of days between the first IPO anniversary and death (or an end date of 2022-01-01, if the company did not die).
Running this dataset through a Cox model produces an estimate of the value of the unknown regression coefficient for P/E.
There are important caveats to mention about the interpretation:
Extensions to time-dependent variables, time-dependent strata, and multiple events per subject can be incorporated by the counting process formulation of Andersen and Gill.
[11][12] In this context, it could also be mentioned that it is theoretically possible to specify the effect of covariates by using additive hazards,[13] i.e. specifying $\lambda(t \mid X_i) = \lambda_0(t) + X_i \cdot \beta$.
If such additive hazards models are used in situations where (log-)likelihood maximization is the objective, care must be taken to restrict $\lambda(t \mid X_i)$ to non-negative values.
The Cox model may be specialized if a reason exists to assume that the baseline hazard follows a particular form.
This was more important in the days of slower computers but can still be useful for particularly large data sets or complex problems.
They note, "we do not assume [the Poisson model] is true, but simply use it as a device for deriving the likelihood."
In high-dimensional settings, when the number of covariates $p$ is large compared to the sample size $n$, the Lasso method is one of the classical model-selection strategies.
Tibshirani (1997) has proposed a Lasso procedure for the proportional hazards regression parameter.[17]
The Lasso estimator of the regression parameter $\beta$ is defined as the minimizer of the negative of the Cox partial log-likelihood under an $L_1$-norm type constraint.
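A rough sketch of such an estimator follows. It deliberately solves the penalized (Lagrangian) form, minimizing the negative partial log-likelihood plus `lam` times the $L_1$ norm of the coefficients, rather than the constrained form; the two are equivalent for matching tuning parameters. The optimizer is plain proximal gradient descent with soft-thresholding, which is a choice made here for brevity and is not the algorithm from Tibshirani's paper. Ties are assumed absent, and the data, step size, and function names are hypothetical.

```python
import math

def neg_log_pl_and_grad(times, events, X, beta):
    """Negative Cox partial log-likelihood (no tie handling) and its gradient."""
    p = len(beta)
    eta = [sum(b * v for b, v in zip(beta, row)) for row in X]
    theta = [math.exp(e) for e in eta]
    value = 0.0
    grad = [0.0] * p
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue
        at_risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        s0 = sum(theta[j] for j in at_risk)
        value -= eta[i] - math.log(s0)
        for k in range(p):
            s1 = sum(theta[j] * X[j][k] for j in at_risk)
            grad[k] -= X[i][k] - s1 / s0
    return value, grad

def soft_threshold(z, a):
    """Proximal operator of a * |z|: shrink z toward zero by a."""
    return math.copysign(max(abs(z) - a, 0.0), z)

def lasso_cox(times, events, X, lam, step=0.05, n_iter=2000):
    """Proximal gradient iterations for the L1-penalized Cox objective."""
    beta = [0.0] * len(X[0])
    for _ in range(n_iter):
        _, grad = neg_log_pl_and_grad(times, events, X, beta)
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return beta

# Hypothetical data: 8 subjects, 2 covariates scaled to roughly [-1, 1].
times = [5, 8, 12, 14, 20, 21, 30, 33]
events = [1, 1, 0, 1, 1, 0, 1, 1]
X = [[0.9, -0.2], [0.4, 0.7], [-0.3, 0.1], [0.8, -0.5],
     [-0.6, 0.3], [0.1, -0.9], [-0.7, 0.6], [-1.0, 0.2]]

# Smallest penalty at which every coefficient is shrunk exactly to zero.
_, grad_at_zero = neg_log_pl_and_grad(times, events, X, [0.0, 0.0])
lam_max = max(abs(g) for g in grad_at_zero)

beta_sparse = lasso_cox(times, events, X, lam=1.01 * lam_max)
beta_hat = lasso_cox(times, events, X, lam=0.5 * lam_max)
print(beta_sparse, beta_hat)
```

A fixed step size only works when it is small relative to the curvature of the likelihood; with larger or unscaled covariates a line search or a dedicated solver would be the safer choice.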