Low-Rank Compression of Pretrained Models via Randomized Subspace Iteration
Summary: arXiv:2604.02659v1
As pretrained models grow larger, their scale makes them costly to run and difficult to deploy. Researchers are therefore exploring ways to compress these models without sacrificing performance. One promising method is low-rank decomposition based on singular value decomposition (SVD), in which each weight matrix is replaced by a lower-rank approximation.
Challenges of Low-Rank Decomposition
Low-rank decomposition is a principled way to reduce model size, but computing an exact SVD can be prohibitively expensive for large weight matrices. Randomized SVD (RSVD) has become a popular alternative because it approximates the SVD quickly. However, RSVD has drawbacks when the singular value spectrum of the model weights decays slowly, which is common in modern pretrained models.
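To make the RSVD step concrete, here is a minimal NumPy sketch of a basic randomized SVD with no power iterations, applied to a synthetic matrix whose singular values decay slowly. The function names, the oversampling of 10, the matrix size, and the decay rate are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def synthetic_weights(m, n, decay, seed=0):
    """Hypothetical stand-in for a weight matrix: singular values i**(-decay)."""
    rng = np.random.default_rng(seed)
    r = min(m, n)
    U, _ = np.linalg.qr(rng.standard_normal((m, r)))  # orthonormal columns
    V, _ = np.linalg.qr(rng.standard_normal((n, r)))  # orthonormal columns
    sigma = np.arange(1, r + 1, dtype=float) ** (-decay)
    return (U * sigma) @ V.T, sigma

def randomized_svd(W, rank, oversample=10, seed=0):
    """Basic randomized SVD (no power iterations); an illustrative sketch."""
    rng = np.random.default_rng(seed)
    m, n = W.shape
    k = min(rank + oversample, min(m, n))
    # Sketch the column space of W with a Gaussian test matrix.
    omega = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(W @ omega)           # orthonormal basis, shape (m, k)
    # Exact SVD of the small projected matrix, then map back.
    U_small, s, Vt = np.linalg.svd(Q.T @ W, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank, :]

def spectral_error(W, U, s, Vt):
    """Spectral-norm error of the factorized approximation U diag(s) Vt."""
    return np.linalg.norm(W - (U * s) @ Vt, ord=2)

rank = 20
W, sigma = synthetic_weights(300, 200, decay=0.3)    # slow spectral decay
U, s, Vt = randomized_svd(W, rank)
rsvd_err = spectral_error(W, U, s, Vt)
optimal_err = sigma[rank]   # Eckart-Young: best rank-20 error is the 21st singular value
print(f"RSVD error: {rsvd_err:.4f}, optimal rank-{rank} error: {optimal_err:.4f}")
```

By the Eckart-Young theorem no rank-20 approximation can beat `optimal_err`, so the gap between the two printed numbers shows how much the random sketch loses on this particular spectrum.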
Addressing Limitations of RSVD
- The paper establishes a theoretical link between low-rank approximation error and predictive performance. By analyzing softmax perturbations, it shows that deviations in class probabilities can be controlled by the spectral error of the compressed weights.
- The authors find that RSVD often falls short in approximation quality, particularly under aggressive compression.
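The link between spectral error and class probabilities can be illustrated numerically. The sketch below relies on a standard fact, namely that softmax is Lipschitz with constant at most 1 in the Euclidean norm, which gives the bound: probability shift <= spectral error of the weights times the input norm. This is a generic argument and may not match the paper's exact bound; the classifier shape, the rank, and the random data are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)            # shift logits for numerical stability
    e = np.exp(z)
    return e / e.sum()

def truncated_svd(W, rank):
    """Best rank-`rank` approximation of W via an exact SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

rng = np.random.default_rng(0)
W = rng.standard_normal((10, 64))    # hypothetical head: 10 classes, 64 features
x = rng.standard_normal(64)          # hypothetical input feature vector
W_low = truncated_svd(W, rank=4)     # compressed weights

# Shift in class probabilities caused by compressing the weights.
prob_shift = np.linalg.norm(softmax(W @ x) - softmax(W_low @ x))

# Upper bound from the Lipschitz argument: ||W - W_low||_2 * ||x||_2.
spectral_err = np.linalg.norm(W - W_low, ord=2)
bound = spectral_err * np.linalg.norm(x)

print(f"probability shift: {prob_shift:.4f}, bound: {bound:.4f}")
```

The bound is typically loose, but it explains why a method with a smaller spectral error gives tighter control over prediction changes.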
Introducing Randomized Subspace Iteration (RSI)
To address the limitations of RSVD, the authors propose randomized subspace iteration (RSI). RSI applies multiple power iterations, which improve spectral separation and thereby the quality of the approximation. The number of iterations gives a controllable mechanism for improving the low-rank approximation at the price of additional matrix multiplications.
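A minimal NumPy sketch of subspace iteration follows, using the common formulation in which the sketch is multiplied alternately by the transposed and original matrix and re-orthonormalized with QR at each step. The paper's exact algorithm may differ; the function name, the parameter `n_iter`, the oversampling of 10, and the synthetic slow-decay matrix are assumptions made for illustration.

```python
import numpy as np

def randomized_subspace_iteration(W, rank, n_iter=2, oversample=10, seed=0):
    """Randomized low-rank factorization with n_iter power iterations (sketch)."""
    rng = np.random.default_rng(seed)
    m, n = W.shape
    k = min(rank + oversample, min(m, n))
    Q, _ = np.linalg.qr(W @ rng.standard_normal((n, k)))
    for _ in range(n_iter):
        # One power iteration; QR after each product keeps the basis stable.
        Z, _ = np.linalg.qr(W.T @ Q)
        Q, _ = np.linalg.qr(W @ Z)
    U_small, s, Vt = np.linalg.svd(Q.T @ W, full_matrices=False)
    return (Q @ U_small)[:, :rank], s[:rank], Vt[:rank, :]

# Hypothetical test matrix with slowly decaying singular values i**(-0.3).
rng = np.random.default_rng(1)
m, n, rank = 300, 200, 20
U0, _ = np.linalg.qr(rng.standard_normal((m, n)))
V0, _ = np.linalg.qr(rng.standard_normal((n, n)))
sigma = np.arange(1, n + 1, dtype=float) ** (-0.3)
W = (U0 * sigma) @ V0.T
optimal = sigma[rank]            # best possible rank-20 spectral error

errors = {}
for n_iter in (0, 1, 4):         # n_iter=0 reduces to basic RSVD
    U, s, Vt = randomized_subspace_iteration(W, rank, n_iter=n_iter)
    errors[n_iter] = np.linalg.norm(W - (U * s) @ Vt, ord=2)
    print(f"n_iter={n_iter}: error={errors[n_iter]:.4f} (optimal {optimal:.4f})")
```

Setting `n_iter=0` recovers plain RSVD, so the loop compares both methods on the same matrix. Each extra iteration costs two more multiplications with the weight matrix, which is the trade-off behind the tunable iteration count.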
Empirical Evaluation
RSI is evaluated across various architectures, including convolutional networks and transformer-based models. In these experiments, RSI achieves near-optimal approximation quality and surpasses RSVD in predictive accuracy, especially under aggressive compression.
Conclusion
The study highlights the importance of efficient compression techniques as pretrained models continue to grow. By identifying where RSVD falls short and introducing RSI, the authors point toward more practical deployments of AI models that maintain high performance at lower computational cost.
The work contributes to the broader effort to make machine learning models both efficient and effective.
