ShadowPEFT: Shadow Network for Parameter-Efficient Fine-Tuning
Summary: arXiv:2604.19254v1 Announce Type: cross
Abstract: Parameter-efficient fine-tuning (PEFT) reduces the training cost of full-parameter fine-tuning for large language models (LLMs) by training only a small set of task-specific parameters while freezing the pretrained backbone. However, existing approaches, such as Low-Rank Adaptation (LoRA), achieve adaptation by inserting independent low-rank perturbations directly into individual weights, resulting in a local parameterization of adaptation.
We propose ShadowPEFT, a centralized PEFT framework that instead performs layer-level refinement through a depth-shared shadow module. At each transformer layer, ShadowPEFT maintains a parallel shadow state and evolves it repeatedly for progressively richer hidden states. This design shifts adaptation from distributed weight-space perturbations to a shared layer-space refinement process.
Since the shadow module is decoupled from the backbone, it can be reused across depth, independently pretrained, and optionally deployed in a detached mode, benefiting edge computing scenarios. Experiments on generation and understanding benchmarks show that ShadowPEFT matches or outperforms LoRA and DoRA under comparable trainable-parameter budgets.
Key Features of ShadowPEFT
- Centralized Framework: Unlike LoRA-style methods, which attach an independent low-rank perturbation to each adapted weight, ShadowPEFT routes all adaptation through a single shadow module that operates at the layer level.
- Layer-Level Refinement: At each transformer layer, a parallel shadow state is maintained and evolved repeatedly, yielding progressively richer hidden states.
- Decoupled Shadow Module: Because the shadow module is decoupled from the backbone, it can be reused across depth and pretrained independently.
- Detached Deployment: The shadow module can optionally be deployed in a detached mode, which the authors present as a benefit for edge computing scenarios.
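The features above can be made concrete with a toy forward pass. The abstract does not specify the shadow update rule, so the following is only a minimal sketch under stated assumptions: the names `ShadowModule`, `read`, `mix`, and `write` are hypothetical, plain linear maps stand in for transformer layers, and the residual ReLU update and zero-initialised write-back (borrowed from LoRA's convention of zero-initialising one factor) are illustrative choices rather than the authors' design.

```python
import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]

def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix; stands in for pretrained or freshly initialised weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

class ShadowModule:
    """Hypothetical depth-shared adapter: ONE set of weights reused at every layer."""

    def __init__(self, d_model, d_shadow, rng):
        self.d_shadow = d_shadow
        self.read = rand_matrix(d_shadow, d_model, rng)   # hidden state -> shadow space
        self.mix = rand_matrix(d_shadow, d_shadow, rng)   # shadow state -> shadow space
        # Assumption: zero-initialised write-back, so training starts from the
        # unmodified backbone (the same idea as LoRA's zero-initialised factor).
        self.write = [[0.0] * d_shadow for _ in range(d_model)]

    def step(self, hidden, shadow):
        """Evolve the shadow state once and return a refinement for the hidden state."""
        update = add(matvec(self.read, hidden), matvec(self.mix, shadow))
        new_shadow = add(shadow, [max(0.0, u) for u in update])  # residual ReLU update
        return new_shadow, matvec(self.write, new_shadow)

def backbone_forward(layers, x):
    """Frozen backbone only: a stack of linear maps standing in for transformer layers."""
    hidden = x
    for W in layers:
        hidden = matvec(W, hidden)
    return hidden

def shadow_forward(layers, shadow_module, x):
    """Backbone plus shadow path: the same module refines the output of every layer."""
    hidden = x
    shadow = [0.0] * shadow_module.d_shadow
    for W in layers:                      # backbone weights are never modified
        hidden = matvec(W, hidden)
        shadow, refinement = shadow_module.step(hidden, shadow)
        hidden = add(hidden, refinement)  # layer-level refinement
    return hidden, shadow

rng = random.Random(0)
layers = [rand_matrix(4, 4, rng) for _ in range(3)]
module = ShadowModule(d_model=4, d_shadow=2, rng=rng)
x = [1.0, -0.5, 0.25, 2.0]

refined, final_shadow = shadow_forward(layers, module, x)
# With the zero write-back, the shadow path leaves the backbone output unchanged.
print(refined == backbone_forward(layers, x))
```

In this sketch the trainable parameters live entirely in `module`, so adding more backbone layers adds no adapter weights; that is one way to read "reused across depth", not a claim about the paper's implementation.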
Performance Evaluation
ShadowPEFT is evaluated against LoRA and DoRA on both generation and understanding benchmarks. Under comparable trainable-parameter budgets, the reported results show it matching or outperforming both baselines.
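To see what a "comparable trainable-parameter budget" can mean in practice, the arithmetic below compares a per-weight LoRA budget with a single shared module. The LoRA count, rank × (d_in + d_out) per adapted matrix, is standard. The shared-module shape (three matrices mapping hidden to shadow, shadow to shadow, and shadow to hidden) and the model dimensions are illustrative assumptions, not figures from the paper.

```python
def lora_params(d_in, d_out, rank):
    """Trainable parameters of one LoRA adapter: factors of shape (rank, d_in) and (d_out, rank)."""
    return rank * (d_in + d_out)

def lora_total(num_layers, matrices_per_layer, d_model, rank):
    """LoRA budget when every layer adapts the same number of square d_model x d_model weights."""
    return num_layers * matrices_per_layer * lora_params(d_model, d_model, rank)

def shadow_total(d_model, d_shadow):
    """Hypothetical shared module: read (d_shadow x d_model), mix (d_shadow x d_shadow),
    write (d_model x d_shadow). Independent of depth because the weights are shared."""
    return 2 * d_model * d_shadow + d_shadow * d_shadow

def largest_shadow_width(budget, d_model):
    """Largest d_shadow whose shared module still fits within the given budget."""
    width = 0
    while shadow_total(d_model, width + 1) <= budget:
        width += 1
    return width

# Illustrative setting: 32 layers, hidden size 4096, rank-8 LoRA on two matrices per layer.
budget = lora_total(num_layers=32, matrices_per_layer=2, d_model=4096, rank=8)
width = largest_shadow_width(budget, d_model=4096)
print(budget, width, shadow_total(4096, width))
```

Under these assumptions, the LoRA budget grows linearly with depth while the shared module's does not, so a fixed budget buys a wider shared module as the backbone gets deeper.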
Additional Insights
Further analyses cover several additional aspects of ShadowPEFT:
- Shadow Pretraining: Pretraining the shadow module independently is examined and reported to improve adaptability across tasks.
- Cross-Dataset Transfer: The framework is reported to transfer learned representations across datasets.
- Parameter Scaling: Results are reported as promising when the trainable-parameter budget is scaled, so the shadow module can be sized to the use case.
- Inference Latency: Latency evaluations suggest that centralized adaptation does not compromise inference speed, which matters for real-time applications.
- System-Level Performance: A system-level evaluation suggests that ShadowPEFT is a competitive and flexible alternative to conventional low-rank PEFT methods.
Conclusion
ShadowPEFT shifts parameter-efficient fine-tuning from distributed weight-space perturbations to a centralized, shared layer-space refinement. The reported results suggest that this is a competitive alternative to low-rank methods such as LoRA and DoRA under comparable trainable-parameter budgets, with the added options of independent shadow pretraining and detached deployment.
