Statistical & Financial Consulting by Stanford PhD

Bagging (**b**ootstrap **agg**regat**ing**) is a protocol for building *ensemble methods* for regression and classification. It says: estimate a given model on bootstrap samples of the given data and "average" the estimates. In the regression setting, "averaging" is understood in the direct sense of the word. In the classification setting, "averaging" means averaging the estimated class probabilities or averaging the 0/1 vote indicators of the classes (which is the same as trusting the *majority vote*). Either way, we build an ensemble in which verdicts from many models are combined into one.
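The protocol above can be sketched in a few lines of code. This is a minimal from-scratch illustration, not the text's own implementation: the function names are mine, and the base model is a one-split regression stump chosen because it is simple yet high-variance.

```python
import numpy as np

def fit_stump(X, y):
    """Fit a one-split regression stump on a single feature: pick the
    threshold that minimizes the within-leaf sum of squared errors."""
    best = (np.inf, None, y.mean(), y.mean())
    for t in np.unique(X):
        left, right = y[X <= t], y[X > t]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]  # (threshold, left-leaf prediction, right-leaf prediction)

def predict_stump(model, X):
    t, lo, hi = model
    return np.where(X <= t, lo, hi)

def bag_stumps(X, y, B=50, rng=None):
    """Bagging: estimate the stump on B bootstrap samples of (X, y)."""
    rng = np.random.default_rng(rng)
    n = len(X)
    models = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)   # bootstrap sample (draw with replacement)
        models.append(fit_stump(X[idx], y[idx]))
    return models

def predict_bagged(models, X):
    """'Average' the estimates: the ensemble prediction is the mean of
    the B individual predictions (the regression form of averaging)."""
    return np.mean([predict_stump(m, X) for m in models], axis=0)
```

For classification, the only change is in the last step: average the 0/1 vote indicators (or class probabilities) instead of the numeric predictions.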

The idea of bagging is to reduce the variance of the estimated model through averaging while keeping the bias the same. Since the bootstrapped random models

$$\hat{f}_1(x), \, \hat{f}_2(x), \, \dots, \, \hat{f}_B(x)$$

are (approximately) i.i.d., the variance of the average model

$$\bar{f}(x) = \frac{1}{B} \sum_{b=1}^{B} \hat{f}_b(x)$$

is $\mathrm{Var}\big(\hat{f}_1(x)\big)/B$, so the mean squared error decomposes as

$$\mathrm{MSE}\big(\bar{f}(x)\big) = \mathrm{Bias}^2\big(\bar{f}(x)\big) + \frac{\mathrm{Var}\big(\hat{f}_1(x)\big)}{B}, \qquad (1)$$

and the error goes down with $B$. One might ask: why do we not do this every day? Don't we want the second term in (1) to always go to 0 while keeping the first term constant? Why don't we bootstrap and average in any setting amenable to the bootstrap: linear regression, generalized linear models, survival analysis, etc.? The issue is subtle. Remember we wanted to keep "the first term constant". Well, this is not quite possible. The bias is higher for each bootstrapped model than for the model estimated on the full sample: a bootstrap sample contains, on average, only about 63.2% of the distinct observations. Bagging therefore pays off when the base model is unstable, with low bias and high variance (deep decision trees being the canonical example); for stable procedures such as linear regression, the increase in bias can outweigh the reduction in variance.
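The behavior of the second term in (1) is easy to verify numerically. A quick Monte Carlo sketch (all numbers are illustrative, not from the text): draw $B$ i.i.d. "model predictions" with variance $\sigma^2$ around a common target and check that the variance of their average is close to $\sigma^2/B$.

```python
import numpy as np

rng = np.random.default_rng(42)
sigma2 = 4.0       # variance of a single model's prediction
B = 25             # number of bootstrapped models being averaged
n_trials = 20000   # Monte Carlo repetitions

# Each row holds B i.i.d. "model predictions" around the same target 10.0.
preds = rng.normal(loc=10.0, scale=np.sqrt(sigma2), size=(n_trials, B))
avg = preds.mean(axis=1)   # the bagged prediction in each trial

print(avg.var())           # close to sigma2 / B = 0.16
```

The empirical variance of the averaged prediction lands near $\sigma^2/B = 0.16$, a 25-fold reduction relative to a single model.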

Bagging lends itself naturally to asymptotically unbiased estimation of the predictive error. To ensure unbiasedness we have to test the predictive performance of the model out of sample, not in sample. By design we can do this on the fly, while running the bagging algorithm: for each data point, we predict the dependent variable using the aggregate of only those models estimated on bootstrap samples not containing the given data point. We record the so-called out-of-bag (OOB) error, which serves as an estimate of the predictive error of the bagged model.
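The OOB bookkeeping can be sketched as follows. This is my own minimal illustration, not the text's code: for brevity the base learner is the simplest possible one, a constant model predicting the bootstrap-sample mean; the point is that each observation is scored only by the models whose bootstrap samples left it out.

```python
import numpy as np

def oob_error(y, B=200, rng=None):
    """Squared-error out-of-bag (OOB) estimate for a constant base model
    that predicts the bootstrap-sample mean."""
    rng = np.random.default_rng(rng)
    n = len(y)
    oob_sum = np.zeros(n)   # running sum of OOB predictions per point
    oob_cnt = np.zeros(n)   # number of models that left each point out
    for _ in range(B):
        idx = rng.integers(0, n, size=n)        # bootstrap sample
        pred = y[idx].mean()                    # "fit" the constant model
        out = np.setdiff1d(np.arange(n), idx)   # points left out (~36.8%)
        oob_sum[out] += pred
        oob_cnt[out] += 1
    mask = oob_cnt > 0                          # with B = 200, almost surely all points
    oob_pred = oob_sum[mask] / oob_cnt[mask]    # aggregate of the OOB models
    return np.mean((y[mask] - oob_pred) ** 2)
```

With a real base learner, only the line that "fits" and the line that predicts change; the left-out-point accounting stays the same.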
