Statistical & Financial Consulting by Stanford PhD

Biostatistics encompasses all statistical methods used in biology and medicine. It is sometimes called biometry or biometrics. The boundary between biostatistics and statistics is subjective, since almost any statistical idea may find its way into the quantitative analysis of biological or medical phenomena. The distinction matters mostly when choosing references or software. A book titled "Biostatistics" is probably filled with examples of statistical methods applied in biology or medicine. Software marketed as biostatistical (that is, implementing various types of biostatistical analyses) most likely caters to MDs, nurses and researchers in biology, and is made user-friendly specifically for them. The inner workings of the statistical methods, and the conclusions they support, are still the same as those in an econometrics book or an artificial intelligence system.

The parts of biology and medicine that rely most heavily on statistical modeling are epidemiology, population genetics, human genetics, environmental health and pharmacology. In fact, the probabilistic and statistical methods used in genetics have grown into a separate field called statistical genetics.

Here is a compilation of problems encountered in the biological and medical sciences: 1) based on the patient's age, gender, race, blood pressure, hemoglobin A1c level, temperature and reaction to several tests, classify the patient into one of several categories requiring different kinds of treatment or diagnostics; 2) use pattern recognition, cluster analysis and classification techniques to determine which groups of genes are associated with certain visible and invisible characteristics of a human being; 3) estimate a model of gene propagation and simulate the model forward to determine likely scenarios of gene exchange in the near and remote future; 4) determine which environmental factors are associated with increased levels of a certain disease in a particular geographic area.

Oftentimes, biostatistical studies are built around longitudinal data: several subjects (patients) observed at multiple points in time. Longitudinal data require relatively complex statistical techniques, because the variation across subjects and the variation over time have to be modeled concurrently. In biostatistics the longitudinal data sets usually contain only a few time points per subject. For that reason, despite the complexity of the modeling, the most ambitious data mining tools, which need large amounts of data, are inapplicable. Outside biostatistics, longitudinal data are also known as panel data.
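To make the idea of modeling subject and time variation concurrently more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a recommended analysis: the blood pressure readings are made up, the design is assumed balanced (every patient seen at the same visits), and the model is assumed to be a simple random-intercept structure with a common linear time trend. The function names are hypothetical. In practice one would normally use a dedicated mixed-effects routine from a statistical package.

```python
from statistics import mean, variance

# Hypothetical long-format data: (patient id, visit number, systolic pressure).
records = [
    ("p1", 0, 150.0), ("p1", 1, 146.0), ("p1", 2, 143.0),
    ("p2", 0, 132.0), ("p2", 1, 129.0), ("p2", 2, 125.0),
    ("p3", 0, 141.0), ("p3", 1, 139.0), ("p3", 2, 134.0),
    ("p4", 0, 160.0), ("p4", 1, 155.0), ("p4", 2, 153.0),
]


def group_by_patient(rows):
    """Collect the (time, outcome) pairs of each patient."""
    grouped = {}
    for pid, t, y in rows:
        grouped.setdefault(pid, []).append((t, y))
    return grouped


def within_patient_slope(grouped):
    """Pooled slope of outcome on time after centering within each patient."""
    num = den = 0.0
    for obs in grouped.values():
        t_bar = mean(t for t, _ in obs)
        y_bar = mean(y for _, y in obs)
        for t, y in obs:
            num += (t - t_bar) * (y - y_bar)
            den += (t - t_bar) ** 2
    return num / den


def variance_components(grouped, slope):
    """Moment estimates of between- and within-patient variance.

    Assumes a balanced design: the same number of visits for every patient.
    """
    n_patients = len(grouped)
    n_visits = len(next(iter(grouped.values())))
    sse = 0.0
    patient_means = []
    for obs in grouped.values():
        t_bar = mean(t for t, _ in obs)
        y_bar = mean(y for _, y in obs)
        patient_means.append(y_bar)
        for t, y in obs:
            residual = (y - y_bar) - slope * (t - t_bar)
            sse += residual ** 2
    # Degrees of freedom: observations minus patient means minus one slope.
    var_within = sse / (n_patients * n_visits - n_patients - 1)
    # Variance of patient means = between variance + within variance / visits.
    var_between = max(variance(patient_means) - var_within / n_visits, 0.0)
    return var_between, var_within


grouped = group_by_patient(records)
slope = within_patient_slope(grouped)
var_between, var_within = variance_components(grouped, slope)
icc = var_between / (var_between + var_within)

print(f"time trend per visit: {slope:.2f}")
print(f"between-patient variance: {var_between:.2f}")
print(f"within-patient variance: {var_within:.2f}")
print(f"share of variance due to patients: {icc:.3f}")
```

With only three visits per patient, the time dimension contributes very little information on its own. Most of what can be learned comes from pooling across patients, which is one way to see why data-hungry methods are a poor fit for such short panels.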

Together with engineering and finance, medicine and biology represent the most lucrative areas for statistical applications.

