Delays in generalization match delayed changes in representational geometry

Zheng, X, Daruwalla, K, Benjamin, AS, Klindt, D (January 2024) Delays in generalization match delayed changes in representational geometry. In: UniReps: 2nd Edition of the Workshop on Unifying Representations in Neural Models, 2024 Dec 15, Vancouver, Canada.

52_Delays_in_generalization_ma.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Delayed generalization, also known as “grokking”, has emerged as a well-replicated phenomenon in overparameterized neural networks. Recent theoretical work has associated grokking with the transition from the lazy to the rich learning regime, measured as the change in the Neural Tangent Kernel (NTK) from its initial state. Here, we present an empirical study on image classification tasks. Surprisingly, we demonstrate that the NTK deviates from its initial state significantly before the onset of grokking, i.e., before test performance increases, suggesting that rich learning occurs before generalization. To explain this discrepancy, we instead examine the representational geometry of the network and find that grokking coincides in time with a rapid increase in manifold capacity and improved effective geometry metrics. Notably, this sharp transition is absent when generalization is not delayed. Our findings on real data show that the lazy and rich training regimes can become decoupled from sudden generalization. In contrast, changes in representational geometry remain tightly linked to generalization and may therefore better explain grokking dynamics.
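As an illustration of what "change in the NTK from its initial state" can mean in practice, the following is a minimal sketch, not the authors' code. It assumes a toy two-layer tanh network with a scalar output and uses one minus the cosine similarity between kernels as the distance; the abstract does not specify the architecture or the metric, so both are assumptions. All function names are hypothetical, and the "later" parameters are a random perturbation standing in for a trained checkpoint.

```python
import numpy as np

def jacobian(W1, w2, X):
    """Per-example gradient of f(x) = w2 . tanh(W1 x) w.r.t. all parameters."""
    H = np.tanh(X @ W1.T)                                   # hidden activations, (n, h)
    # df/dW1[j, k] = w2[j] * (1 - tanh^2(W1[j] . x)) * x[k]
    dW1 = (w2 * (1.0 - H**2))[:, :, None] * X[:, None, :]   # (n, h, d)
    # df/dw2[j] = tanh(W1[j] . x), which is H itself
    return np.concatenate([dW1.reshape(len(X), -1), H], axis=1)

def empirical_ntk(W1, w2, X):
    """Empirical NTK Gram matrix: K[i, j] = <grad f(x_i), grad f(x_j)>."""
    J = jacobian(W1, w2, X)
    return J @ J.T

def kernel_distance(K0, Kt):
    """One minus the cosine similarity of two kernels (an assumed metric)."""
    cos = np.sum(K0 * Kt) / (np.linalg.norm(K0) * np.linalg.norm(Kt))
    return 1.0 - cos

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))                       # 8 probe inputs, 5 features
W1_init = rng.normal(size=(16, 5)) / np.sqrt(5)   # hypothetical initial weights
w2_init = rng.normal(size=16) / np.sqrt(16)
K_init = empirical_ntk(W1_init, w2_init, X)

# Stand-in for a later checkpoint: a random perturbation, not real training.
W1_later = W1_init + 0.5 * rng.normal(size=W1_init.shape)
w2_later = w2_init + 0.5 * rng.normal(size=w2_init.shape)
K_later = empirical_ntk(W1_later, w2_later, X)

print("NTK distance from initialization:", kernel_distance(K_init, K_later))
```

Tracking this distance over training checkpoints alongside test accuracy is one way to compare the timing of kernel change against the timing of generalization.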

Item Type: Conference or Workshop Item (Paper)
Subjects: bioinformatics; bioinformatics > quantitative biology; bioinformatics > computational biology
CSHL Authors:
Communities: CSHL labs > Navlakha lab; CSHL labs > Zador lab
SWORD Depositor: CSHL Elements
Depositing User: CSHL Elements
Date: 1 January 2024
Date Deposited: 09 Sep 2025 17:42
Last Modified: 09 Sep 2025 17:42
URI: https://repository.cshl.edu/id/eprint/41958
