Statistical & Financial Consulting by Stanford PhD

Survival analysis is a collection of methods for modeling the time to an event of a specific type. The event can be a death, a bankruptcy, a hurricane, an outbreak of mass protests or the failure of a mechanical system. It can also be something good, like the invention of a new drug. "Survival" up to a certain time means that the event has not occurred by that time.

In mainstream survival analysis the event in question may happen only once and is not subdivided into different categories. An example would be revocation of a medical license for controversial, borderline practices. In the *recurring events* framework the event may happen multiple times (e.g. engine failure). In the *competing risks* framework the event may be of multiple types. Each type may happen with a different likelihood and may require different treatment (e.g. death from different causes). Important research questions are the following.

- If $T$ is the time until the first event or the time between two recurring events, estimate the *survival curve*, defined as $S(t) = P(T > t)$.
- How is the survival curve related to different characteristics of the system? For example, if we are modeling car accidents in a certain county, does the rate of accidents depend on the time of day? Is the rate different between male and female drivers? Alternatively, if we are modeling the time to the next default of a financial services company, how does the default rate vary with the overall state of the economy (say, GDP) and the debt-to-equity ratio of a randomly sampled company?
- If recurring events are the focus, can we see any pattern? Is the time between two subsequent events independent of what happened before? Or does the system learn from the past, so that the new survival curve incorporates information on previous occurrences? Are there valid contexts for *renewal theory*, which studies systems that reset themselves periodically?
- If there are many competing risks, are they correlated? Which one is likeliest at any specific time?
- What can we say about the survival of the whole system based on the survival of its components?
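
As a rough illustration of the last question, the sketch below computes the survival probability of a system at a fixed time from the survival probabilities of its components. It assumes the components fail independently, which the text above does not claim, and the function names and the numbers are hypothetical.

```python
from math import prod


def series_survival(component_survivals):
    """Series system: it survives only if every component survives.

    Assumes independent components (an assumption made for this sketch).
    """
    return prod(component_survivals)


def parallel_survival(component_survivals):
    """Parallel system: it fails only if every component fails.

    Assumes independent components (an assumption made for this sketch).
    """
    return 1.0 - prod(1.0 - s for s in component_survivals)


# Hypothetical survival probabilities of three components at some time t.
components = [0.9, 0.8, 0.95]
print("series system:  ", series_survival(components))
print("parallel system:", parallel_survival(components))
```

Under independence a series system is never more reliable than its weakest component, while a parallel system is never less reliable than its strongest one. When component failures are correlated, these products no longer apply.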

The estimation methods for survival curves can be split into two categories: nonparametric and parametric.

One can see that the Kaplan-Meier estimate is piecewise constant, while the true survival curve may be somewhat smoother. Nonetheless, it has been shown that as the sample size grows to infinity the Kaplan-Meier estimate converges to the true survival curve. The plot below illustrates the Kaplan-Meier estimates computed for an experiment containing two types of patients: those who received treatment and those who did not.
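
To make the piecewise-constant shape concrete, here is a minimal sketch of the Kaplan-Meier estimator in its standard product-limit form: at each distinct event time the running estimate is multiplied by 1 − d/n, where d is the number of events at that time and n is the number of subjects still at risk just before it. The function name, the argument layout and the sample data are assumptions made for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  : observed times (event time or censoring time)
    events : 1 if the event was observed at that time, 0 if right-censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # subjects leaving the risk set at each time
    at_risk = len(times)          # everyone is at risk at the start
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = event_counts.get(t, 0)
        if d:
            # Subjects censored at t still count as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]     # events and censored subjects both leave
    return curve


# Hypothetical sample: six subjects, two of them right-censored.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(times, events):
    print(f"t = {t}: S(t) = {s:.4f}")
```

The estimate changes only at observed event times and stays flat in between, which is why the curve is a step function. Censored observations never produce a step, but they shrink the risk set used at later event times.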


In parametric survival analysis we assume that the survival curve has a certain functional form, i.e. it is a known function of time and a finite number of unknown parameters.
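
As a minimal sketch of the parametric approach, suppose the survival curve is exponential, S(t) = exp(−λt). This family is chosen here only for illustration and is not necessarily the one the text has in mind. With right-censored data the maximum likelihood estimate of the rate λ is the number of observed events divided by the total observed time. The function names and the data are hypothetical.

```python
import math


def fit_exponential_rate(times, events):
    """MLE of the rate for S(t) = exp(-rate * t) under right censoring.

    times  : observed times (event time or censoring time)
    events : 1 if the event was observed, 0 if right-censored
    """
    n_events = sum(events)
    total_time = sum(times)
    if n_events == 0 or total_time <= 0:
        raise ValueError("need at least one observed event and positive total time")
    return n_events / total_time


def exponential_survival(t, rate):
    """Survival curve implied by the fitted exponential model."""
    return math.exp(-rate * t)


# Hypothetical sample: four subjects, one of them right-censored.
times = [2.0, 4.0, 4.0, 10.0]
events = [1, 1, 0, 1]
rate = fit_exponential_rate(times, events)
print("estimated rate:", rate)
print("estimated S(5):", exponential_survival(5.0, rate))
```

Unlike the Kaplan-Meier estimate, the fitted curve is smooth and is determined entirely by one number, so it is only as good as the assumed functional form.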
