Layer-wise Insights for Efficient Supervised Fine-Tuning

A Layer-wise Analysis of Supervised Fine-Tuning

Summary: arXiv:2604.11838v1 (cross-listed)

Abstract: While critical for alignment, Supervised Fine-Tuning (SFT) incurs the risk of catastrophic forgetting, yet the layer-wise emergence of instruction-following capabilities remains elusive. We investigate this mechanism via a comprehensive analysis utilizing information-theoretic, geometric, and optimization metrics across model scales (1B-32B).

Our experiments reveal a distinct depth-dependent pattern: middle layers (20%-80%) are stable, whereas final layers exhibit high sensitivity. Leveraging this insight, we propose Mid-Block Efficient Tuning, which selectively updates these critical intermediate layers.

Empirically, our method outperforms standard LoRA by up to 10.2% on GSM8K (OLMo2-7B) with reduced parameter overhead, demonstrating that effective alignment is architecturally localized rather than distributed. The code is publicly available at https://anonymous.4open.science/r/base_sft.

Introduction

Supervised Fine-Tuning (SFT) is a key step in training models to follow instructions. It also carries the risk of catastrophic forgetting, where capabilities learned during pretraining degrade as the model adapts to the fine-tuning data. How instruction-following ability emerges layer by layer is still poorly understood, and understanding it is the starting point for making SFT more efficient.

Methodology

In this research, we combine information-theoretic, geometric, and optimization metrics to analyze models ranging from 1 billion to 32 billion parameters. Looking at the same layers through all three kinds of metric shows how each depth range of the model responds to fine-tuning.
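As one illustration of what a geometric, layer-wise measurement can look like, the sketch below compares per-layer hidden-state vectors from a base model and its fine-tuned counterpart using cosine distance. It is not the paper's metric suite, which is not detailed here; the function names (`cosine_similarity`, `layerwise_drift`) and the toy vectors are assumptions made purely for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def layerwise_drift(base_states, tuned_states):
    """Cosine distance (1 - similarity) per layer.

    Each argument is a list with one hidden-state vector per layer,
    taken from the same input before and after fine-tuning.
    """
    if len(base_states) != len(tuned_states):
        raise ValueError("both models must expose the same number of layers")
    return [
        1.0 - cosine_similarity(base, tuned)
        for base, tuned in zip(base_states, tuned_states)
    ]


# Toy 3-layer example (made-up numbers, not measurements from any model):
# the first two layers are unchanged, the last one has rotated by 90 degrees.
base = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
tuned = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
drift = layerwise_drift(base, tuned)
for layer_index, value in enumerate(drift):
    print(f"layer {layer_index}: drift = {value:.3f}")
```

A larger drift at a given depth means the fine-tuned representation has moved further from the base model at that layer. In practice the vectors would be averaged over many inputs rather than taken from a single example.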

Key Findings

Layer Stability: The middle layers of the model, those between 20% and 80% of its depth, remain stable during fine-tuning.
Layer Sensitivity: In contrast, the final layers are highly sensitive to fine-tuning updates, which suggests they are where the risk of catastrophic forgetting is concentrated.
Mid-Block Efficient Tuning: Based on these findings, we developed Mid-Block Efficient Tuning, a method that selectively updates the stable middle layers and leaves the sensitive final layers largely unchanged.
Performance Improvement: In our empirical evaluations, this approach outperformed standard Low-Rank Adaptation (LoRA) by up to 10.2% on the GSM8K dataset with the OLMo2-7B model, while using fewer trainable parameters.
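The layer selection behind Mid-Block Efficient Tuning can be sketched as follows. This is a minimal illustration, assuming that the 20%-80% band is measured by relative layer index and treated as a half-open interval; the helper names `mid_block_layers` and `trainable_mask` are hypothetical and are not taken from the released code.

```python
def mid_block_layers(num_layers, start_pct=20, end_pct=80):
    """Indices of layers whose relative depth lies in [start_pct%, end_pct%).

    Integer arithmetic is used so the boundaries are exact
    (no floating-point rounding at 20% or 80%).
    """
    if num_layers <= 0:
        raise ValueError("num_layers must be positive")
    if not 0 <= start_pct < end_pct <= 100:
        raise ValueError("need 0 <= start_pct < end_pct <= 100")
    return [
        index
        for index in range(num_layers)
        if start_pct * num_layers <= 100 * index < end_pct * num_layers
    ]


def trainable_mask(num_layers, start_pct=20, end_pct=80):
    """One boolean per layer: True if the layer would be updated."""
    selected = set(mid_block_layers(num_layers, start_pct, end_pct))
    return [index in selected for index in range(num_layers)]


# Example with a hypothetical 10-layer model.
print(mid_block_layers(10))   # [2, 3, 4, 5, 6, 7]
print(trainable_mask(10))
```

With a training framework, the resulting mask would decide which layers receive adapters or have gradients enabled; everything outside the band stays frozen.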

Conclusion

The results of our study indicate that effective alignment in supervised fine-tuning is not uniformly distributed across the layers of a model. Instead, it is architecturally localized within the middle layers. This insight opens new avenues for optimizing fine-tuning strategies and reducing parameter overhead while maintaining or improving performance.
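To make the parameter-overhead point concrete, the sketch below counts LoRA parameters when adapters are attached to every layer versus only a middle band. A rank-r LoRA adapter on a weight matrix with input size d_in and output size d_out adds r * (d_in + d_out) trainable parameters. The model dimensions, rank, choice of adapted matrices, and layer counts are hypothetical values chosen for illustration, not the configuration used in the paper.

```python
def lora_params_per_matrix(d_in, d_out, rank):
    """Trainable parameters added by one rank-`rank` LoRA adapter.

    The adapter factors the update as B @ A with A of shape (rank, d_in)
    and B of shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def lora_params_total(num_adapted_layers, matrix_shapes, rank):
    """Total LoRA parameters when the same matrices are adapted in each layer.

    matrix_shapes is a list of (d_in, d_out) pairs, one per adapted matrix.
    """
    per_layer = sum(
        lora_params_per_matrix(d_in, d_out, rank)
        for d_in, d_out in matrix_shapes
    )
    return num_adapted_layers * per_layer


# Hypothetical setup: 32 layers, hidden size 4096, rank 16,
# adapters on two square projection matrices per layer.
shapes = [(4096, 4096), (4096, 4096)]
all_layers = lora_params_total(32, shapes, rank=16)
mid_only = lora_params_total(19, shapes, rank=16)  # 19 of 32 layers adapted

print(f"all layers: {all_layers:,} trainable parameters")
print(f"mid block:  {mid_only:,} trainable parameters")
print(f"ratio:      {mid_only / all_layers:.2%}")
```

Because the count scales linearly with the number of adapted layers, restricting adapters to a middle band reduces the trainable parameters in proportion to the layers left out, at the same rank.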

For those interested in exploring this further, our code is publicly available at https://anonymous.4open.science/r/base_sft, so researchers and practitioners can implement and test Mid-Block Efficient Tuning in their own work.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
