

8.10 Bibliographic Notes

Original	The Elements of Statistical Learning
Translator	szcf-weiya
Published	2017-09-06
Updated	2019-08-01 19:39:45
Status	Done

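Many books cover classical statistical inference: Cox and Hinkley (1974)¹ and Silvey (1975)² give nontechnical accounts. The bootstrap is due to Efron (1979)³ and is described more fully in Efron and Tibshirani (1993)⁴ and Hall (1992)⁵. A good modern book on Bayesian inference is Gelman et al. (1995)⁶. Neal (1996)¹⁸ gives a lucid account of applying Bayesian methods to neural networks. The statistical application of Gibbs sampling is due to Geman and Geman (1984)⁷ and Gelfand and Smith (1990)⁸, with related work by Tanner and Wong (1987)⁹. Markov chain Monte Carlo methods, including Gibbs sampling and the Metropolis-Hastings (MH) algorithm, are discussed in Spiegelhalter et al. (1996)¹⁰. The EM algorithm is due to Dempster et al. (1977)¹¹; as the discussants of that paper make clear, there was a good deal of related earlier work. The view of EM as a joint maximization of a penalized complete-data log-likelihood was set out by Neal and Hinton (1998)¹², who credit Csiszar and Tusnády (1984)¹⁹ and Hathaway (1986)²⁰ with having noticed this connection earlier. Bagging was proposed by Breiman (1996a)¹³. Stacking is due to Wolpert (1992)¹⁴; Breiman (1996b)²¹ contains a discussion accessible to statisticians. Leblanc and Tibshirani (1996)¹⁵ describe variations on stacking based on the bootstrap. Model averaging in the Bayesian framework was more recently advocated by Madigan and Raftery (1994)¹⁶. Bumping was proposed by Tibshirani and Knight (1999)¹⁷.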

Cox, D. and Hinkley, D. (1974). Theoretical Statistics, Chapman and Hall, London. ↩
Silvey, S. (1975). Statistical Inference, Chapman and Hall, London. ↩
Efron, B. (1979). Bootstrap methods: another look at the jackknife, Annals of Statistics 7: 1–26. ↩
Efron, B. and Tibshirani, R. (1993). An Introduction to the Bootstrap, Chapman and Hall, London. ↩
Hall, P. (1992). The Bootstrap and Edgeworth Expansion, Springer, New York. ↩
Gelman, A., Carlin, J., Stern, H. and Rubin, D. (1995). Bayesian Data Analysis, CRC Press, Boca Raton, FL. ↩
Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images, IEEE Transactions on Pattern Analysis and Machine Intelligence 6: 721–741. ↩
Gelfand, A. and Smith, A. (1990). Sampling based approaches to calculating marginal densities, Journal of the American Statistical Association 85: 398–409. ↩
Tanner, M. and Wong, W. (1987). The calculation of posterior distributions by data augmentation (with discussion), Journal of the American Statistical Association 82: 528–550. ↩
Spiegelhalter, D., Best, N., Gilks, W. and Inskip, H. (1996). Hepatitis B: a case study in MCMC methods, in W. Gilks, S. Richardson and D. Spiegelhalter (eds), Markov Chain Monte Carlo in Practice, Interdisciplinary Statistics, Chapman and Hall, London, pp. 21–43. ↩
Dempster, A., Laird, N. and Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion), Journal of the Royal Statistical Society Series B 39: 1–38. ↩
Neal, R. and Hinton, G. (1998). A view of the EM algorithm that justifies incremental, sparse, and other variants, in M. Jordan (ed.), Learning in Graphical Models, Kluwer Academic Publishers, Dordrecht and Boston, MA, pp. 355–368. ↩
Breiman, L. (1996a). Bagging predictors, Machine Learning 24: 123–140. ↩
Wolpert, D. (1992). Stacked generalization, Neural Networks 5: 241–259. ↩
Leblanc, M. and Tibshirani, R. (1996). Combining estimates in regression and classification, Journal of the American Statistical Association 91: 1641–1650. ↩
Madigan, D. and Raftery, A. (1994). Model selection and accounting for model uncertainty using Occam’s window, Journal of the American Statistical Association 89: 1535–1546. ↩
Tibshirani, R. and Knight, K. (1999). Model search and inference by bootstrap “bumping”, Journal of Computational and Graphical Statistics 8: 671–686. ↩
Neal, R. (1996). Bayesian Learning for Neural Networks, Springer, New York. ↩
Csiszar, I. and Tusnády, G. (1984). Information geometry and alternating minimization procedures, Statistics & Decisions Supplement Issue 1: 205–237. ↩
Hathaway, R. J. (1986). Another interpretation of the EM algorithm for mixture distributions, Statistics & Probability Letters 4: 53–56. ↩
Breiman, L. (1996b). Stacked regressions, Machine Learning 24: 51–64. ↩
