
Why do deep neural networks generalise so well?

Category: Computer Science

Status: Queued

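Classical statistical learning theory, through capacity-based bounds such as those built on VC dimension, suggests that models with many more parameters than training examples should overfit catastrophically. Deep networks routinely have orders of magnitude more parameters than training examples, can fit their training data almost perfectly, and still generalise to unseen data.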

Proposed explanations include implicit regularisation by stochastic gradient descent, the double descent phenomenon, neural tangent kernel theory and feature learning. No single theory yet predicts generalisation performance from architecture alone.
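
One of the ideas above, implicit regularisation, can be illustrated in a much simpler setting than a deep network. The sketch below is an assumption-laden toy: an overparameterised linear least-squares problem with random Gaussian data, where plain (full-batch) gradient descent started from zero is compared with the minimum-norm interpolating solution. The sizes n and d, the random seed, and the helper names min_norm_fit and gd_fit are all hypothetical choices made for illustration. It says nothing directly about deep networks or about stochastic gradient descent; it only shows that, among the infinitely many weight vectors that fit the training data exactly, the optimiser can select a specific low-norm one without any explicit penalty.

```python
import numpy as np


def min_norm_fit(X, y):
    """Minimum-L2-norm least-squares solution via the pseudoinverse."""
    return np.linalg.pinv(X) @ y


def gd_fit(X, y, steps=5000):
    """Full-batch gradient descent on 0.5 * ||Xw - y||^2, starting from w = 0."""
    # Step size 1 / sigma_max^2 keeps the iteration stable.
    lr = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y)
    return w


# Toy data (assumed for illustration): 20 examples, 200 parameters.
rng = np.random.default_rng(0)
n, d = 20, 200
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

w_pinv = min_norm_fit(X, y)
w_gd = gd_fit(X, y)

# Training error of the gradient-descent solution (it interpolates the data).
train_mse = float(np.mean((X @ w_gd - y) ** 2))
# Distance between the gradient-descent and minimum-norm solutions.
gap = float(np.linalg.norm(w_gd - w_pinv))

print(f"train MSE: {train_mse:.2e}, distance to min-norm solution: {gap:.2e}")
```

Both printed quantities should be close to zero: gradient descent from a zero initialisation never leaves the row space of X, so it converges to the minimum-norm interpolant. Whether and how an analogous bias carries over to deep, non-linear networks is exactly what the question above leaves open.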

Sources

Runs

No runs yet — this question is queued.