Skip to content

8.10 文献笔记

原文 The Elements of Statistical Learning
翻译 szcf-weiya
时间 2017-09-05

关于经典的统计推断有很多书:Cox and Hinkley (1974)1 和Silvey (1975)2给出了通俗的解释。自助法归功于Efron (1979)3,并且在Efron and Tibshirani (1993)4 和 Hall (1992)5中作了完整的描述。贝叶斯推断的一本很好的现代书是Gelman et al. (1995)6。清晰描述贝叶斯方法在神经网络的应用由Neal (1996)给出。Gibbs采样的统计应用归功于Geman and Geman (1984)7,以及Gelfand and Smith (1990)8,还有Tanner and Wong (1987)9相关的工作。马尔科夫蒙特卡洛方法,包括吉布斯采样以及MH算法,在Spiegelhalter et al. (1996)10有讨论。EM算法归功于Dempster et al. (1977)11;虽然之前有很多相关的工作,但因为这篇文章的讨论者将其清晰地描述出来。EM看成完整数据对数似然惩罚的联合最大化由Neal and Hinton (1998)12给出;他们将其归功于Csiszar and Tusnády (1984) 和Hathaway (1986),因为这两篇文章已经注意到了这个联系。Bagging由Breiman (1996a)13提出。Stacking归功于Wolpert (1992)14; Breiman (1996b)包含统计学家们的讨论。Leblanc and Tibshirani (1996)15描述了基于自助法的stacking的波动。在贝叶斯网络中平均模型最近由Madigan and Raftery (1994)16提出。Bumping由Tibshirani and Knight (1999)17 提出。


  1. Cox, D. and Hinkley, D. (1974). Theoretical Statistics, Chapman and Hall, London. 

  2. Silvey, S. (1975). Statistical Inference, Chapman and Hall, London. 

  3. Efron, B. (1979). Bootstrap methods: another look at the jackknife, Annals of Statistics 7: 1–26. 

  4. Efron, B. and Tibshirani, R. (1993). An Introduction to the Bootstrap, Chapman and Hall, London. 

  5. Hall, P. (1992). The Bootstrap and Edgeworth Expansion, Springer, New York. 

  6. Gelman, A., Carlin, J., Stern, H. and Rubin, D. (1995). Bayesian Data Analysis, CRC Press, Boca Raton, FL. 

  7. Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images, IEEE Transactions on Pattern Analysis and Machine Intelligence 6: 721–741. 

  8. Gelfand, A. and Smith, A. (1990). Sampling based approaches to calculating marginal densities, Journal of the American Statistical Association 85: 398–409. 

  9. Tanner, M. and Wong, W. (1987). The calculation of posterior distributions by data augmentation (with discussion), Journal of the American Statistical Association 82: 528–550. 

  10. Spiegelhalter, D., Best, N., Gilks, W. and Inskip, H. (1996). Hepatitis B: a case study in MCMC methods, in W. Gilks, S. Richardson and D. Spegelhalter (eds), Markov Chain Monte Carlo in Practice, Inter- disciplinary Statistics, Chapman and Hall, London, pp. 21–43. 

  11. Dempster, A., Laird, N. and Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion), Journal of the Royal Statistical Society Series B 39: 1–38. 

  12. Neal, R. and Hinton, G. (1998). A view of the EM algorithm that justifies incremental, sparse, and other variants; in Learning in Graphical Models, M. Jordan (ed.), Dordrecht: Kluwer Academic Publishers, Boston, MA., pp. 355–368. 

  13. Breiman, L. (1996a). Bagging predictors, Machine Learning 26: 123–140. 

  14. Wolpert, D. (1992). Stacked generalization, Neural Networks 5: 241–259. 

  15. Leblanc, M. and Tibshirani, R. (1996). Combining estimates in regression and classification, Journal of the American Statistical Association 91: 1641–1650. 

  16. Madigan, D. and Raftery, A. (1994). Model selection and accounting for model uncertainty using Occam’s window, Journal of the American Statistical Association 89: 1535–46. 

  17. Tibshirani, R. and Knight, K. (1999). Model search and inference by bootstrap “bumping, Journal of Computational and Graphical Statistics 8: 671–686. 

Comments