Feature Repulsion and Spectral Lock-in: An Empirical Study of Two-Layer Network Grokking
In a study posted to arXiv, Tian (2025) reports findings on grokking in two-layer neural networks. The work examines feature repulsion during the interactive feature-learning stage and asks what it implies for the network's learning dynamics.
The study centers on a repulsion theorem, denoted Theorem 6, which concerns the matrix \( B = (\widetilde{F}^\top \widetilde{F} + \eta I)^{-1} \). The theorem asserts that similar features have negative off-diagonal entries \( B_{j\ell} \), which act as an effective force driving those features apart. This mechanism plays a central role in feature learning. The study asks two questions about it: whether the repulsion can be observed empirically, and what spectral signature it leaves in the parameter updates.
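The sign claim can be checked numerically on a toy case. The sketch below is an illustration under stated assumptions, not the paper's code: it assumes \( \widetilde{F} \) is a centered, column-normalized feature matrix with one column per feature, and the helper name `repulsion_matrix`, the toy features, and the value of \( \eta \) are all hypothetical.

```python
import numpy as np

def repulsion_matrix(F, eta):
    """Return B = (F^T F + eta * I)^{-1} for a feature matrix F (samples x features)."""
    num_features = F.shape[1]
    return np.linalg.inv(F.T @ F + eta * np.eye(num_features))

rng = np.random.default_rng(0)
n = 200
base = rng.standard_normal(n)

# Features 0 and 1 are near-duplicates; feature 2 is unrelated.
f0 = base + 0.1 * rng.standard_normal(n)
f1 = base + 0.1 * rng.standard_normal(n)
f2 = rng.standard_normal(n)
F = np.stack([f0, f1, f2], axis=1)

# Assumption: the tilde denotes centering; unit-norm columns are added for readability.
F = F - F.mean(axis=0)
F = F / np.linalg.norm(F, axis=0)

B = repulsion_matrix(F, eta=0.1)
print("B[0, 1] (similar pair):  ", B[0, 1])
print("B[0, 2] (unrelated pair):", B[0, 2])
```

For two unit-norm features with positive inner product \( c \), the 2-by-2 case gives \( B_{01} = -c / ((1+\eta)^2 - c^2) \), which is negative. The toy example shows the same sign for the similar pair.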
Research Methodology
Tian’s empirical investigation used a modular addition setup with \( M = 71 \) and \( K = 2048 \), trained with a mean squared error (MSE) loss. The goal was to test whether the predictions of the repulsion theorem can be observed during training.
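A setup of this kind can be sketched as follows. The summary does not define \( M \) and \( K \), so the sketch assumes \( M \) is the modulus and \( K \) the hidden width. The concatenated one-hot input encoding, the initialization scale, and the names `X`, `Y`, `W`, `V`, and `forward` are also illustrative assumptions rather than details from the study.

```python
import numpy as np

M = 71      # assumed: modulus of the addition task
K = 2048    # assumed: hidden width of the two-layer network

# All M*M input pairs (a, b) and their targets (a + b) mod M.
a, b = np.divmod(np.arange(M * M), M)

# Assumed encoding: concatenated one-hot vectors for a and b.
X = np.zeros((M * M, 2 * M))
X[np.arange(M * M), a] = 1.0
X[np.arange(M * M), M + b] = 1.0

# One-hot targets, so the MSE loss is taken against a length-M vector.
Y = np.eye(M)[(a + b) % M]

rng = np.random.default_rng(0)
W = rng.standard_normal((2 * M, K)) / np.sqrt(2 * M)  # first layer
V = rng.standard_normal((K, M)) / np.sqrt(K)          # second layer

def forward(X, W, V, activation):
    """Two-layer network: activation(X W) V."""
    return activation(X @ W) @ V

# MSE loss at initialization with the quadratic activation sigma(x) = x^2.
loss = np.mean((forward(X, W, V, np.square) - Y) ** 2)
print("initial MSE:", loss)
```

Passing `lambda z: np.maximum(z, 0.0)` in place of `np.square` gives the ReLU variant discussed in the findings.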
Key Findings
- Structure-Mechanism Dissociation: The sign structure predicted by the repulsion theorem held across activations, but the way repulsion showed up in the weight updates did not. The predicted structure and its observable mechanism therefore come apart.
- Sign Rule Validation: The predicted sign rule held on the top 200 most-similar feature pairs across activations. The empirical sign-match rose from 0.865 to 0.985 for the activation function \( \sigma = x^2 \) across five seeds, and saturated at 1.000 for \( \sigma = \operatorname{ReLU} \).
- Activation Dependency: The spectral signature in the parameter updates depended strongly on the choice of activation function. For \( \sigma = x^2 \), a simple slope detector applied to the rolling eigengap \( \sigma_2 / \sigma_3 \) of the weight updates \( \Delta W \) (here \( \sigma_2 \) and \( \sigma_3 \) are the second and third singular values of \( \Delta W \), not the activation) gave clear evidence of grokking, firing in 15 out of 15 seeds at epoch 174.
- Contrast with Non-Grokking Controls: The same detector never fired on the non-grokking controls, which points to learning dynamics specific to grokking.
- Rank-2 Spectrum vs. Rank-1 Spectrum: The spectral analysis revealed a rank-2 spectrum for the \( x^2 \) activation, while the \( \operatorname{ReLU} \) activation maintained an effectively rank-1 spectrum, underscoring the critical influence of the activation derivative on feature repulsion’s translation into weight updates.
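A slope detector on the rolling eigengap could look like the sketch below. The detector's exact form is not given in this summary, so the window length, slope threshold, log scaling, and the function names `eigengap`, `first_firing_epoch`, and `synthetic_updates` are hypothetical choices. The weight updates are synthetic stand-ins (a fixed rank-1 term, an optional growing rank-2 term, and noise), not data from the study.

```python
import numpy as np

def eigengap(delta_W):
    """Ratio of the second to the third singular value of a weight update."""
    s = np.linalg.svd(delta_W, compute_uv=False)
    return s[1] / s[2]

def first_firing_epoch(gaps, window=10, slope_threshold=0.05):
    """Return the first epoch at which the fitted slope of log(eigengap)
    over a trailing window exceeds the threshold, or None if it never does."""
    log_gaps = np.log(np.asarray(gaps))
    x = np.arange(window)
    for t in range(window, len(log_gaps) + 1):
        slope = np.polyfit(x, log_gaps[t - window:t], 1)[0]
        if slope > slope_threshold:
            return t - 1
    return None

def synthetic_updates(grok, epochs=120, d=30, seed=0):
    """Yield toy updates: rank-1 term + optional growing rank-2 term + noise."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((d, 2)))
    V, _ = np.linalg.qr(rng.standard_normal((d, 2)))
    for t in range(epochs):
        # In the "grokking" toy run a second direction emerges after epoch 50.
        second = min(0.06 * max(0, t - 50), 3.0) if grok else 0.0
        yield (5.0 * np.outer(U[:, 0], V[:, 0])
               + second * np.outer(U[:, 1], V[:, 1])
               + 0.01 * rng.standard_normal((d, d)))

fired_grok = first_firing_epoch([eigengap(dW) for dW in synthetic_updates(True)])
fired_ctrl = first_firing_epoch([eigengap(dW) for dW in synthetic_updates(False)])
print("toy grokking run fires at epoch:", fired_grok)
print("toy control run fires at epoch: ", fired_ctrl)
```

In the toy control the update stays effectively rank-1, so \( \sigma_2 / \sigma_3 \) hovers near 1 and the slope stays below the threshold. That mirrors the rank-1 versus rank-2 contrast described above, though only qualitatively.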
Conclusion
This empirical study validates aspects of Tian’s theoretical framework and shows how strongly the link between feature learning and weight updates depends on the activation function. The sign structure predicted by the repulsion theorem is consistent across activations, but the mechanism by which feature repulsion reaches the weight updates is highly activation-dependent. One direction for future work is choosing activation functions with this dependence in mind.
