Statistical & Financial Consulting by Stanford PhD
AKAIKE INFORMATION CRITERION

Akaike Information Criterion (AIC) is a model selection tool. If a model is estimated on a particular data set (the training set), the AIC score estimates how well the model will perform on a new, fresh data set (the testing set). It can be shown that, for Gaussian models with known residual variance, AIC is equivalent to an estimate of the in-sample error of the estimated model (the true prediction error on the training data set). AIC is given by the formula:

AIC = -2 * loglikelihood + 2 * d,

where loglikelihood is the maximized value of the log-likelihood function and d is the total number of estimated parameters. A lower AIC score signals a better model.
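As a concrete illustration of the formula, the sketch below computes AIC for a Gaussian model whose residual variance is estimated by maximum likelihood. The helper name `gaussian_aic` and the calling convention are assumptions for this example, not part of the original text; the key point is that the maximized Gaussian log-likelihood has the closed form -n/2 * (log(2*pi*sigma2_hat) + 1).

```python
import numpy as np

def gaussian_aic(y, y_hat, n_params):
    """AIC for a Gaussian model fit by maximum likelihood.

    Hypothetical helper for illustration. The log-likelihood is evaluated
    at the MLE of the residual variance (the mean squared residual), and
    n_params counts all estimated parameters, including that variance.
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n = y.size
    sigma2_hat = np.mean((y - y_hat) ** 2)  # MLE of the residual variance
    # Maximized Gaussian log-likelihood in closed form
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2_hat) + 1.0)
    return -2.0 * loglik + 2.0 * n_params
```

Note that, by construction, adding one parameter to a model with an identical fit raises its AIC by exactly 2, which is the penalty term at work.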

To use AIC for model selection, we simply choose the model with the smallest AIC over the whole set of candidates. AIC attempts to mitigate the risk of over-fitting by introducing the penalty term 2 * d, which grows with the number of parameters. This allows us to filter out unnecessarily complicated models, which have too many parameters to be estimated accurately on a given data set of size N. AIC tends to favor more complex models than the Bayesian Information Criterion (BIC), whose penalty term d * log(N) grows with the sample size and is therefore stronger for N > 7.
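The selection procedure above can be sketched as follows: fit each candidate model, score it with AIC, and keep the candidate with the smallest score. This example compares polynomial regressions of increasing degree on synthetic data; the data-generating setup, the parameter count (degree + 1 coefficients plus the residual variance), and all variable names are assumptions made for the illustration.

```python
import numpy as np

# Synthetic training data: a truly linear relationship plus Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=x.size)

def aic(y, y_hat, d):
    """AIC with the Gaussian log-likelihood maximized over the residual variance."""
    n = len(y)
    sigma2_hat = np.mean((y - y_hat) ** 2)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2_hat) + 1.0)
    return -2.0 * loglik + 2.0 * d

# Score each candidate polynomial degree
scores = {}
for degree in range(1, 6):
    coefs = np.polyfit(x, y, degree)
    y_hat = np.polyval(coefs, x)
    scores[degree] = aic(y, y_hat, degree + 2)  # degree+1 coefficients + variance

# Pick the candidate with the smallest AIC
best = min(scores, key=scores.get)
```

Higher-degree polynomials always fit the training data at least as well, so their log-likelihood term is never worse; the penalty 2 * d is what allows a simpler model to win.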

In some textbooks and software packages an alternative version of AIC is used, in which the formula above is divided by the sample size N. This definition makes it easier to compare models estimated on data sets of different sizes.

AKAIKE INFORMATION CRITERION REFERENCES

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. Second International Symposium on Information Theory, pp. 267-281.

Bishop, C. (1995). Neural Networks for Pattern Recognition. Clarendon Press, Oxford.

Cover, T. & Thomas, J. (1991). Elements of Information Theory. Wiley, New York.

Breiman, L., Friedman, J., Olshen, R. & Stone, C. (1984). Classification and Regression Trees. Wadsworth.

Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge University Press.
