Optimizer-Aware Online Data Selection for LLM Fine-Tuning

Two-Stage Optimizer-Aware Online Data Selection for Large Language Models

Summary: arXiv:2604.00001v1

Announce Type: cross

Abstract

Gradient-based data selection offers a principled framework for estimating sample utility in large language model (LLM) fine-tuning. However, existing methods are designed primarily for offline settings, which makes them a poor fit for online fine-tuning, where data arrives sequentially. In the online setting, sample utility is step-dependent, and the effective update geometry is shaped by the adaptive optimizer. To address this, we propose an optimizer-aware framework for gradient-based online data selection and reweighting in LLM fine-tuning.

Introduction

The key idea is to treat online data selection not as a static ranking of samples, but as a process that shapes the next target-oriented update given the current optimizer state. Because a sample's utility depends on the step at which it is seen, the selection adapts as the model and the optimizer state evolve over the data stream.
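
As a rough illustration of why the optimizer state matters, the sketch below scores candidates by how well their preconditioned update aligns with a target gradient. It assumes an Adam-like optimizer whose second-moment estimate rescales each coordinate, and it ignores momentum and bias correction. The function names and the inner-product score are illustrative assumptions, not definitions from the paper.

```python
import numpy as np

def adam_preconditioned(grads, second_moment, eps=1e-8):
    # Assumed Adam-style rescaling: divide each coordinate by sqrt(v) + eps.
    return grads / (np.sqrt(second_moment) + eps)

def optimizer_aware_scores(candidate_grads, target_grad, second_moment, eps=1e-8):
    # candidate_grads: (n, d) per-sample gradients
    # target_grad:     (d,)   gradient of the target (e.g. validation) objective
    # second_moment:   (d,)   the optimizer's current second-moment estimate
    updates = adam_preconditioned(candidate_grads, second_moment, eps)
    # Score = alignment between the update a sample would induce and the target.
    return updates @ target_grad

cands = np.array([[1.0, 0.0], [0.0, 1.0]])
target = np.array([1.0, 1.0])
v = np.array([100.0, 1.0])  # first coordinate is heavily damped by the optimizer

raw_scores = cands @ target                            # both candidates tie
aware_scores = optimizer_aware_scores(cands, target, v)  # second candidate wins
```

In this toy case the two candidates are indistinguishable under a raw gradient dot product, but the optimizer-aware scores are roughly 0.1 and 1.0, because the optimizer shrinks movement along the first coordinate. The scores also change whenever the second-moment estimate changes, which is one way to read "step-dependent" utility.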

Methodology

We formulate online data selection as an optimizer-aware update-matching problem. This formulation connects to second-order target utility and shows that subset-level construction must account for interactions and redundancy among the selected samples, not only for the utility of each sample in isolation. Our proposed solution is a two-stage Filter-then-Weight algorithm:
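
One way to write the update-matching view is given below. The notation is assumed here for illustration and is not taken from the paper: g_i is the gradient of candidate i, P_t is the optimizer's preconditioner at step t (for an Adam-like optimizer, roughly a diagonal rescaling by the second-moment estimate), u_t^* is the update that would best serve the target objective, S is the candidate set, and the non-negativity constraint on the weights w is an assumption.

```latex
\min_{w \ge 0} \; \Big\| \sum_{i \in S} w_i P_t g_i - u_t^{*} \Big\|^2
= \sum_{i,j \in S} w_i w_j \,\langle P_t g_i,\, P_t g_j \rangle
\;-\; 2 \sum_{i \in S} w_i \,\langle P_t g_i,\, u_t^{*} \rangle
\;+\; \| u_t^{*} \|^2
```

The linear term rewards each sample's individual alignment with the target update. The quadratic cross terms grow when two selected samples have overlapping preconditioned gradients, so a subset of near-duplicates is penalized even if each member scores well on its own.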

Filter Stage: This initial stage focuses on filtering candidates that are geometrically useful for the current update.
Weight Stage: In this subsequent stage, we optimize the coefficients of the filtered candidates to maximize their utility in the update process.
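
The two stages can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's algorithm: the filter is taken to be a top-k cut on cosine alignment with a target update, and the weight stage is taken to be a ridge-regularized non-negative least-squares fit solved by projected gradient descent. The function name `filter_then_weight` and the parameters `k`, `ridge`, and `steps` are hypothetical.

```python
import numpy as np

def filter_then_weight(cand_updates, target_update, k, ridge=1e-3, steps=200):
    # cand_updates:  (n, d) preconditioned update direction of each candidate
    # target_update: (d,)   update direction that serves the target objective

    # Stage 1 (filter): keep the k candidates best aligned with the target.
    norms = np.linalg.norm(cand_updates, axis=1) + 1e-12
    alignment = cand_updates @ target_update / norms
    keep = np.argsort(-alignment)[:k]
    U = cand_updates[keep]                       # (k, d)

    # Stage 2 (weight): minimize 0.5 * w^T K w - b^T w subject to w >= 0,
    # i.e. a ridge-regularized fit of U^T w to the target update.
    K = U @ U.T + ridge * np.eye(len(keep))      # pairwise interactions
    b = U @ target_update                        # individual alignment
    lr = 1.0 / np.linalg.eigvalsh(K)[-1]         # step size 1 / largest eigenvalue
    w = np.zeros(len(keep))
    for _ in range(steps):
        w = np.maximum(w - lr * (K @ w - b), 0.0)  # gradient step, then project
    return keep, w

cands = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 0.0]])
target = np.array([1.0, 1.0])
keep, w = filter_then_weight(cands, target, k=3)
mixed = cands[keep].T @ w                        # weighted combination of kept updates
```

In this toy example the filter drops the candidate that points away from the target. The two identical candidates then share the weight that a single copy would have received, which shows how the weight stage handles redundancy that a per-sample ranking would miss.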

Practical Implementation

To make the framework practical for LLMs, we introduce a factorized outer-product gradient representation. This representation keeps the required gradient computations efficient, particularly for long-context data, so the method remains scalable in practice.
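
The summary does not spell out the factorization, so the sketch below only illustrates a standard identity that such representations typically build on. For a linear layer, a sample's weight gradient is a sum of per-token outer products of back-propagated output gradients and input activations, so the inner product between two sample gradients can be computed from the factors without forming either full weight-shaped matrix. The function name and array layout are assumptions, and any further compression the paper applies for long sequences is not shown.

```python
import numpy as np

def factored_grad_dot(acts_a, deltas_a, acts_b, deltas_b):
    # acts_*:   (T, d_in)  input activations of the layer, one row per token
    # deltas_*: (T, d_out) gradients w.r.t. the layer outputs, one row per token
    # The full gradient would be deltas.T @ acts, shape (d_out, d_in).
    # <G_a, G_b>_F = sum over token pairs (s, t) of
    #                (deltas_a[s] . deltas_b[t]) * (acts_a[s] . acts_b[t])
    return float(np.sum((deltas_a @ deltas_b.T) * (acts_a @ acts_b.T)))

rng = np.random.default_rng(0)
acts_a, deltas_a = rng.normal(size=(5, 8)), rng.normal(size=(5, 6))
acts_b, deltas_b = rng.normal(size=(7, 8)), rng.normal(size=(7, 6))

fast = factored_grad_dot(acts_a, deltas_a, acts_b, deltas_b)

# Reference: materialize both (d_out, d_in) gradients explicitly.
grad_a = deltas_a.T @ acts_a
grad_b = deltas_b.T @ acts_b
slow = float(np.sum(grad_a * grad_b))
```

The two values agree up to floating-point error. The factored form stores only per-token factors, which matters when the layer dimensions are large compared with what can be kept per sample.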

Results

We evaluated the proposed method against existing online data selection baselines. Across these experiments, the two-stage Filter-then-Weight algorithm consistently improved convergence and downstream performance under the same data budget.

Conclusion

Optimizer-aware online data selection recasts selection as a dynamic interaction with the optimizer state rather than a static ranking of samples. The resulting approach is theoretically grounded through its link to second-order target utility and practical through its factorized gradient representation. Future work will explore further enhancements and applications of the method in other LLM fine-tuning scenarios.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
