Label-Free Cross-Task LoRA Merging with Null-Space Compression
arXiv:2603.26317v1 (announce type: cross)
Abstract
Model merging combines independently fine-tuned checkpoints without joint multi-task training. With foundation models now dominant, fine-tuning with Low-Rank Adaptation (LoRA) is widespread, which makes LoRA merging a promising direction. Existing methods perform well in homogeneous settings where every task is classification, but they often fail when the task set spans both classification and regression.
Approaches built on entropy-based surrogates do not apply to regression, and they are computationally expensive for large language models because of long token sequences. To address both problems, we introduce Null-Space Compression (NSC) Merging, a label-free, output-agnostic method that sets merge weights from the geometry of the adapters.
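To make "merge weights" concrete, here is a minimal sketch that assumes merging takes the common form of a weighted sum of per-task LoRA updates, ΔW_merged = Σ_i λ_i B_i A_i. The abstract does not spell out the exact merge rule, so the function name `merge_lora_deltas` and the coefficient values are hypothetical and only illustrate where such weights would enter.

```python
import numpy as np

def merge_lora_deltas(adapters, weights):
    """Return sum_i weights[i] * (B_i @ A_i).

    adapters: list of (B, A) pairs, with B of shape (d_out, r) and A of shape (r, d_in).
    weights:  one scalar merge coefficient per adapter (assumed form, not from the source).
    """
    if len(adapters) != len(weights):
        raise ValueError("need exactly one weight per adapter")
    # Each product B @ A is a full (d_out, d_in) low-rank update; scale and add them.
    return sum(w * (B @ A) for (B, A), w in zip(adapters, weights))

# Three toy rank-2 adapters for a 6x8 weight matrix.
rng = np.random.default_rng(0)
d_out, d_in, r = 6, 8, 2
adapters = [(rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in))) for _ in range(3)]
lambdas = [0.5, 0.3, 0.2]  # hypothetical merge weights
merged = merge_lora_deltas(adapters, lambdas)
```

In this framing, a method like NSC would supply the coefficients rather than having them hand-picked or tuned on labels.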
Key Observations
Our key observation is that during LoRA fine-tuning, the down-projection factor A in ΔW = BA compresses its null space, and the degree of compression correlates with model performance. NSC uses this relationship as an optimization signal for merging, one that can generalize across classification, regression, and sequence generation.
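One way to build intuition for the null space of A is to measure how much of an input's energy A discards. The sketch below is an illustrative proxy under that assumption, not the paper's definition of compression: the function `null_space_ratio` and the choice of squared-norm ratio are hypothetical. Any input component in the null space of A is mapped to zero and therefore contributes nothing to ΔW x = BAx.

```python
import numpy as np

def null_space_ratio(A, X, tol=1e-10):
    """Fraction of the squared norm of X that lies in the null space of A.

    A: (r, d_in) LoRA down-projection.
    X: (n, d_in) input activations, one row per sample.
    """
    # Right singular vectors with non-negligible singular values span the row space of A.
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    V_row = Vt[s > tol * s.max()].T      # (d_in, rank)
    X_row = X @ V_row @ V_row.T          # orthogonal projection onto the row space
    X_null = X - X_row                   # what remains lies in the null space of A
    return float(np.sum(X_null**2) / np.sum(X**2))

# Toy check: A keeps the first two coordinates and ignores the third.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
ratio_ignored = null_space_ratio(A, np.array([[0.0, 0.0, 1.0]]))  # input entirely in the null space
ratio_kept = null_space_ratio(A, np.array([[1.0, 0.0, 0.0]]))     # input entirely in the row space
ratio_mixed = null_space_ratio(A, np.array([[1.0, 0.0, 1.0]]))    # half the energy in each
```

A ratio near 0 means A retains almost all of the input, while a ratio near 1 means the adapter barely sees it. How NSC turns a geometric quantity of this kind into an optimization signal is specified in the paper, not here.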
Performance and Applications
NSC achieves state-of-the-art performance across twenty heterogeneous vision tasks, with balanced gains where prior methods tend to overfit subsets of tasks. It also outperforms baselines on six Natural Language Inference (NLI) benchmarks and on vision-language evaluations covering Visual Question Answering (VQA) and image captioning.
Conclusion
Null-Space Compression Merging offers a scalable, label-free way to merge LoRA adapters when the task set mixes classification and regression. Because its signal comes from adapter geometry rather than from task outputs, the same procedure carries over to sequence generation and vision-language tasks.
Future Directions
Further research will focus on the following:
- Exploring the applicability of NSC in even more heterogeneous task environments.
- Investigating the potential for integrating NSC with other emerging techniques in the field.
- Implementing NSC in real-world applications to validate its effectiveness in practical settings.
