
Bayesian Double Descent

Source: arXiv
Abstract

Double descent is a phenomenon of over-parameterized statistical models. Our goal is to view double descent from a Bayesian perspective. Over-parameterized models such as deep neural networks have an interesting re-descending property in their risk characteristics. This is a recent phenomenon in machine learning and has been the subject of many studies. As model complexity increases, there is a U-shaped region corresponding to the traditional bias-variance trade-off; then, as the number of parameters reaches the number of observations and the model becomes an interpolator, the risk can become infinite, before re-descending in the over-parameterized region: the double descent effect. We show that this has a natural Bayesian interpretation. Moreover, we show that it does not conflict with the traditional Occam's razor of Bayesian models, which tend to prefer simpler models when possible. We illustrate the approach with an example of Bayesian model selection in neural networks. Finally, we conclude with directions for future research.
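The re-descending risk curve the abstract describes is easy to reproduce numerically. The sketch below is a minimal illustration, not code from the paper: the sine target function, the random-ReLU-feature model, the sample sizes, and the noise level are all assumptions chosen for demonstration. It fits minimum-norm least squares with a growing number of random features and prints the test risk, which typically follows the classical U-shape for p < n, spikes near the interpolation threshold p = n, and re-descends for p >> n.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (all choices here are assumptions, not from the paper):
# learn y = sin(2*pi*x) + noise with random ReLU features and
# minimum-norm least squares, a standard setting in which the test
# risk peaks near p = n and re-descends in the over-parameterized regime.
n_train, n_test, noise = 40, 1000, 0.1

def make_data(n):
    x = rng.uniform(-1.0, 1.0, size=(n, 1))
    y = np.sin(2 * np.pi * x[:, 0]) + noise * rng.standard_normal(n)
    return x, y

x_tr, y_tr = make_data(n_train)
x_te, y_te = make_data(n_test)

def test_risk(p):
    """Mean squared test error of the minimum-norm fit with p random features."""
    W = rng.standard_normal((1, p))            # random, fixed first-layer weights
    b = rng.uniform(-1.0, 1.0, size=p)         # random biases
    phi = lambda x: np.maximum(x @ W + b, 0.0) # ReLU feature map
    # pinv yields the least-squares solution for p < n and the
    # minimum-norm interpolating solution for p >= n.
    beta = np.linalg.pinv(phi(x_tr)) @ y_tr
    return np.mean((phi(x_te) @ beta - y_te) ** 2)

for p in [5, 10, 20, 35, 40, 45, 80, 200, 1000]:
    print(f"p = {p:5d}  test MSE = {test_risk(p):8.3f}")
# Expect a U-shape for p < n, a spike near p = n = 40, then re-descent.
```

The re-descent arises because, beyond the interpolation threshold, the pseudoinverse selects the minimum-norm interpolant among all solutions, an implicit regularization whose Bayesian reading (as a limit of a posterior mean under a Gaussian prior) is the kind of interpretation the paper pursues.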

Nick Polson, Vadim Sokolov

Computing Technology, Computer Technology

Nick Polson, Vadim Sokolov. Bayesian Double Descent [EB/OL]. (2025-07-09) [2025-08-02]. https://arxiv.org/abs/2507.07338.
