Deep Neural Regression Collapse
Source: arXiv:2603.23805v1 (cross-listed)
Abstract: Neural Collapse is a phenomenon that helps identify sparse and low rank structures in deep classifiers. Recent work has extended the definition of neural collapse to regression problems, albeit only measuring the phenomenon at the last layer. In this paper, we establish that Neural Regression Collapse (NRC) also occurs below the last layer across different types of models.
Introduction
Neural Collapse is a phenomenon observed in deep classifiers: late in training, the learned features and weights settle into a simple, highly structured geometry. It has been studied mainly in classification, where it helps identify sparse and low-rank structures. Recent work has extended the definition to regression problems, but measured the phenomenon only at the last layer, leaving open whether similar structure appears deeper in the network.
Key Findings of Neural Regression Collapse (NRC)
The paper establishes that Neural Regression Collapse (NRC) is not limited to the last layer: it also occurs in layers below it, across different types of models. In the collapsed layers, the authors report four properties:
- Subspace Alignment: The features lie in a subspace whose dimension corresponds to the dimension of the target.
- Feature Covariance: The covariance of the learned features aligns with the covariance of the targets.
- Input Subspace Alignment: The input subspace of the layer weights aligns with the feature subspace, so the weights act mainly on the directions the features actually occupy.
- Prediction Error: The linear prediction error of the features tracks the overall prediction error of the model closely, which suggests that a linear readout from a collapsed layer already captures most of what the full network predicts.
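As a rough illustration of how the four properties above could be checked on one layer, the sketch below computes a simple proxy for each on synthetic data. The metric definitions, the function names, and the synthetic features are all assumptions made for this example; the paper's own metrics may be defined differently.

```python
import numpy as np

def top_subspace(H, d):
    """Orthonormal basis (p x d) for the top-d principal directions of features H (N x p)."""
    Hc = H - H.mean(axis=0)
    _, _, Vt = np.linalg.svd(Hc, full_matrices=False)
    return Vt[:d].T

def variance_in_top_d(H, d):
    """Fraction of feature variance captured by the top-d principal directions."""
    s = np.linalg.svd(H - H.mean(axis=0), compute_uv=False)
    return float((s[:d] ** 2).sum() / (s ** 2).sum())

def spectrum_alignment(H, Y):
    """Cosine similarity between the top-d eigenvalues of the feature covariance
    and the eigenvalues of the target covariance (a crude alignment proxy)."""
    d = Y.shape[1]
    ev_h = np.linalg.eigvalsh(np.atleast_2d(np.cov(H, rowvar=False)))[::-1][:d]
    ev_y = np.linalg.eigvalsh(np.atleast_2d(np.cov(Y, rowvar=False)))[::-1]
    return float(ev_h @ ev_y / (np.linalg.norm(ev_h) * np.linalg.norm(ev_y)))

def weight_energy_in_subspace(W, U):
    """Fraction of ||W||_F^2 whose input directions lie in span(U); W is (q x p), U is (p x d)."""
    return float(np.linalg.norm(W @ U) ** 2 / np.linalg.norm(W) ** 2)

def linear_probe_mse(H, Y):
    """Mean squared error of a least-squares linear readout (with bias) from H to Y."""
    X = np.hstack([H, np.ones((H.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return float(np.mean((X @ coef - Y) ** 2))

# Synthetic "collapsed" layer (an assumption for illustration, not data from the paper):
# features are the targets embedded in a d-dimensional subspace plus small noise.
rng = np.random.default_rng(0)
N, p, d = 500, 16, 2
Y = rng.normal(size=(N, d)) * np.array([2.0, 0.5])
Q, _ = np.linalg.qr(rng.normal(size=(p, d)))       # p x d, orthonormal columns
H = Y @ Q.T + 1e-3 * rng.normal(size=(N, p))
W = rng.normal(size=(4, d)) @ Q.T                   # next-layer weights reading only from span(Q)

U = top_subspace(H, d)
metrics = {
    "variance_in_top_d": variance_in_top_d(H, d),
    "spectrum_alignment": spectrum_alignment(H, Y),
    "weight_energy_in_subspace": weight_energy_in_subspace(W, U),
    "linear_probe_mse": linear_probe_mse(H, Y),
}
print(metrics)
```

On a real network, `H` would be the activations of a chosen layer, `W` the weights of the layer that consumes them, and the linear-probe error would be compared against the full model's error on the same data.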
Implications of Deep NRC
Beyond establishing that Deep NRC exists, the study examines what it implies for training and performance. Models that exhibit Deep NRC are found to learn the intrinsic dimension of low-rank targets: when the targets vary along fewer directions than their nominal dimension, the collapsed features reflect that lower dimension. This suggests that NRC could inform the design of more efficient architectures for regression tasks.
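One way to make "learning the intrinsic dimension" concrete is to compare the numerical rank of the targets with that of a layer's features. The sketch below does this on synthetic data; `numerical_rank`, its threshold `rel_tol`, and the data construction are assumptions chosen for illustration, not the paper's procedure.

```python
import numpy as np

def numerical_rank(M, rel_tol=1e-2):
    """Count singular values of the centered matrix that exceed rel_tol times the largest."""
    s = np.linalg.svd(M - M.mean(axis=0), compute_uv=False)
    return int((s > rel_tol * s[0]).sum())

rng = np.random.default_rng(0)
N, k, d, p = 500, 2, 6, 32                      # k = intrinsic dimension, d = nominal target dimension
Z = rng.normal(size=(N, k)) * np.array([3.0, 1.0])

B, _ = np.linalg.qr(rng.normal(size=(d, k)))    # d x k, orthonormal columns
Y = Z @ B.T                                     # 6-dimensional targets of rank 2

C, _ = np.linalg.qr(rng.normal(size=(p, k)))    # p x k, orthonormal columns
H = Z @ C.T + 1e-4 * rng.normal(size=(N, p))    # hypothetical collapsed features

target_rank = numerical_rank(Y)
feature_rank = numerical_rank(H)
print(target_rank, feature_rank)
```

If a layer has collapsed in the sense described above, the two ranks would be expected to agree even though the feature width `p` is much larger than `k`.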
The Role of Weight Decay
The authors also examine the necessity of weight decay for inducing Deep NRC. Their discussion indicates that this form of regularization plays a critical role in shaping the learned representations, and thereby in producing Neural Regression Collapse.
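For reference, weight decay in its common L2 form adds a penalty proportional to the squared norm of the weights, which shrinks every weight slightly at each update. The sketch below shows a single plain gradient step with that penalty; the function name `sgd_step` and the values of `lr` and `weight_decay` are placeholders, since this summary does not state the optimizer or settings used in the paper.

```python
import numpy as np

def sgd_step(W, grad, lr=0.1, weight_decay=1e-2):
    """One gradient step on loss + (weight_decay / 2) * ||W||_F^2."""
    return W - lr * (grad + weight_decay * W)

# With a zero loss gradient, only the decay term acts and the weights shrink uniformly.
W = np.ones((3, 3))
W_next = sgd_step(W, grad=np.zeros_like(W))
print(W_next[0, 0])
```

The decay term pulls weights toward zero in every direction, so directions that the loss gradient does not actively sustain fade over training.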
Conclusion
By characterizing Neural Regression Collapse below the last layer, the paper gives a more complete picture of the structures that deep networks learn in regression. The results advance the theoretical understanding of these models and may also inform the development of more robust and efficient regression models.
