Multi-objective Evolutionary Merging Enables Efficient Reasoning Models
arXiv:2604.06465v1
Abstract: Reasoning models have demonstrated remarkable capabilities in solving complex problems by leveraging long chains of thought. However, this more deliberate reasoning comes with substantial computational overhead at inference time. The Long-to-Short (L2S) reasoning problem seeks to maintain high accuracy using fewer tokens, but current training-free model merging approaches rely on scalarized, fixed-hyperparameter arithmetic methods that are highly brittle and force suboptimal compromises.
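For context, the kind of fixed-hyperparameter arithmetic merging mentioned above can be illustrated with a minimal sketch. It assumes the simplest such scheme, linear interpolation between two checkpoints with a single coefficient. The function name `linear_merge`, the toy dictionary representation of weights, and the roles `long_model` and `short_model` are illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def linear_merge(long_model: Weights, short_model: Weights, alpha: float) -> Weights:
    """Interpolate two checkpoints with one fixed coefficient alpha in [0, 1].

    alpha = 1.0 returns long_model unchanged, alpha = 0.0 returns short_model.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if long_model.keys() != short_model.keys():
        raise ValueError("checkpoints must share the same parameter names")

    merged: Weights = {}
    for name, long_vals in long_model.items():
        short_vals = short_model[name]
        if len(long_vals) != len(short_vals):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # The same alpha is applied to every parameter: this is the single
        # hand-picked knob that a scalarized approach has to tune.
        merged[name] = [
            alpha * a + (1.0 - alpha) * b for a, b in zip(long_vals, short_vals)
        ]
    return merged
```

Because one `alpha` fixes a single point on the accuracy-length trade-off, exploring other trade-offs means re-running the merge and its evaluation for each new value.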
Introduction to Evo-L2S
To address these limitations, we introduce Evo-L2S, a framework that formulates L2S reasoning as a multi-objective optimization problem. It uses evolutionary model merging to explicitly optimize the trade-off between reasoning accuracy and output length, producing a robust Pareto front of merged models.
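To make the notion of a Pareto front concrete, here is a minimal sketch of non-dominated filtering over two objectives: accuracy (higher is better) and mean output length in tokens (lower is better). The `Candidate` type, its field names, and the numbers in the example are hypothetical and are not results from the paper.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    accuracy: float     # fraction of problems solved; higher is better
    mean_tokens: float  # average output length; lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.mean_tokens <= b.mean_tokens
    strictly_better = a.accuracy > b.accuracy or a.mean_tokens < b.mean_tokens
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates (input order preserved)."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up candidates for illustration only.
example = [
    Candidate("A", accuracy=0.80, mean_tokens=4000),
    Candidate("B", accuracy=0.78, mean_tokens=1800),
    Candidate("C", accuracy=0.70, mean_tokens=2500),  # worse than B on both objectives
    Candidate("D", accuracy=0.60, mean_tokens=900),
]
front = pareto_front(example)
```

In this toy example, `C` is dropped because `B` is both more accurate and shorter, while `A`, `B`, and `D` each represent a different trade-off that no other candidate beats on both objectives.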
Key Features of Evo-L2S
- Multi-objective Optimization: Evo-L2S treats L2S reasoning as a multi-objective task, optimizing reasoning accuracy and output length simultaneously rather than collapsing them into a single scalar score.
- Evolutionary Model Merging: The framework searches over merged models with an evolutionary procedure instead of relying on fixed merging hyperparameters.
- Pareto Front Generation: By producing a Pareto front, Evo-L2S provides a set of merged models with different accuracy-length trade-offs, so users can pick the one that fits their application requirements.
- Entropy-based Subset Sampling: To make the search process computationally efficient, we introduce an entropy-based subset sampling technique that significantly reduces the overhead involved in fitness estimation.
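The summary does not say how the entropy-based subset is constructed. Purely as an illustration, the sketch below assumes one plausible reading: each validation problem has a list of past pass/fail outcomes, problems whose outcomes are most uncertain (highest binary entropy) are the most informative for telling candidates apart, and the top `k` of them form the evaluation subset. The function names, the `outcomes` structure, and the deterministic top-`k` rule are all assumptions, not the paper's method.

```python
import math
from typing import Dict, List


def binary_entropy(p: float) -> float:
    """Shannon entropy (in bits) of a Bernoulli variable with success rate p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # always solved or never solved: no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_subset(outcomes: Dict[str, List[int]], k: int) -> List[str]:
    """Return the k problem ids with the highest outcome entropy.

    outcomes maps a problem id to 0/1 results from earlier evaluations
    (hypothetical data layout). Ties are broken by problem id so the
    selection is reproducible.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    scores = {
        pid: binary_entropy(sum(results) / len(results))
        for pid, results in outcomes.items()
        if results  # skip problems with no recorded outcomes
    }
    ranked = sorted(scores, key=lambda pid: (-scores[pid], pid))
    return ranked[:k]


# Hypothetical outcomes for four problems.
outcomes = {
    "p1": [1, 1, 1, 1],  # always solved
    "p2": [1, 0, 1, 0],  # maximally uncertain
    "p3": [0, 0, 0, 0],  # never solved
    "p4": [1, 1, 1, 0],  # mostly solved
}
subset = select_subset(outcomes, k=2)
```

Evaluating candidates only on such a subset reduces the number of generations needed per fitness estimate. A stochastic variant could instead sample problems with probability proportional to their entropy.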
Experimental Results
We ran experiments across model scales, including 1.5B, 7B, and 14B parameters, on six mathematical reasoning benchmarks. The results show that Evo-L2S can reduce the length of generated reasoning traces by over 50% while preserving or even improving the problem-solving accuracy of the original reasoning models.
Conclusion
Evo-L2S replaces scalarized, fixed-hyperparameter merging with a multi-objective evolutionary search, and its entropy-based subset sampling keeps the cost of fitness estimation manageable. In the reported experiments, the merged models cut reasoning-trace length by more than half while preserving or even improving accuracy. This makes the approach relevant to applications that need robust reasoning at lower inference cost.
Future Work
Further work on the Evo-L2S framework will focus on:
- Enhancing the entropy-based sampling methods for even greater computational efficiency.
- Exploring additional benchmarks and real-world applications to validate the effectiveness of the proposed approach.
- Investigating the integration of Evo-L2S with other AI techniques to create hybrid models that can tackle a wider range of complex problems.
