Fine-Tuning Small Reasoning Models for Quantum Field Theory
Summary: arXiv:2604.18936v1 (cross-listed)
Abstract
Despite the growing application of Large Language Models (LLMs) to theoretical physics, there is little academic exploration into how domain-specific physics reasoning ability develops while training these models. To investigate this, we perform the first academic fine-tuning study of small (7B-parameter) reasoning models dedicated specifically to theoretical physics.
Introduction
The rise of Large Language Models has opened new avenues for research and application in various fields, including theoretical physics. However, the understanding of how these models can be fine-tuned to improve their reasoning capabilities in specific domains remains limited. This study aims to bridge that gap by focusing on Quantum Field Theory (QFT), a complex area of theoretical physics.
Methodology
Given the scarcity of open-source, verifiable training data suited to QFT reasoning, we developed a data generation pipeline that creates synthetic problems and adapts existing human-authored problems for model training. The process involved several key steps:
- Data Generation: We generated over 2,500 synthetic problems to train the models.
- Curated Collection: A selection of human-adapted problems was sourced from arXiv and standard pedagogical resources.
- Fine-Tuning Approaches: We ran both Reinforcement Learning (RL) and Supervised Fine-Tuning (SFT) experiments to benchmark performance gains.
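To make "verifiable" concrete, here is a minimal sketch of a synthetic problem with a machine-checkable answer, paired with a binary reward of the kind an RL run could use. It is an illustration under stated assumptions, not the pipeline described in the study: the problem type (superficial degree of divergence in scalar phi^4 theory), the names `Problem`, `generate_problem`, and `verify`, and the last-integer answer extraction are all hypothetical choices made for this example.

```python
import random
import re
from dataclasses import dataclass


@dataclass
class Problem:
    prompt: str
    answer: int  # ground-truth value used by the verifier


def superficial_divergence(d: int, n: int, vertices: int, external: int) -> int:
    """Superficial degree of divergence D = d*L - 2*I for scalar phi^n theory."""
    # Each vertex has n legs; every internal line uses two of them.
    internal, remainder = divmod(n * vertices - external, 2)
    if remainder or internal < 0:
        raise ValueError("no diagram with these vertex and leg counts")
    loops = internal - vertices + 1  # loop count of a connected diagram
    return d * loops - 2 * internal


def generate_problem(rng: random.Random) -> Problem:
    """Hypothetical generator: phi^4 theory in d = 4 with random diagram data."""
    vertices = rng.randint(1, 4)
    # Even leg counts up to 2V + 2 keep the loop count non-negative.
    external = rng.choice(range(2, 2 * vertices + 3, 2))
    prompt = (
        f"In phi^4 theory in 4 dimensions, a connected diagram has "
        f"{vertices} vertices and {external} external legs. "
        f"What is its superficial degree of divergence?"
    )
    return Problem(prompt, superficial_divergence(4, 4, vertices, external))


def verify(response: str, problem: Problem) -> float:
    """Binary reward: 1.0 if the last integer in the response matches the answer."""
    numbers = re.findall(r"-?\d+", response)
    return 1.0 if numbers and int(numbers[-1]) == problem.answer else 0.0
```

Because the ground truth is computed rather than annotated, the same check can filter SFT examples or score RL rollouts. A production verifier would need more careful answer parsing, for example symbolic comparison of expressions.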
Results
Our experiments evaluated the models’ reasoning performance before and after fine-tuning. The analysis included:
- Performance Benchmarking: We assessed the improvements in reasoning accuracy and problem-solving capabilities across different physics domains.
- Generalization: The models were tested for their ability to generalize knowledge from QFT to other areas of physics.
- Chain-of-Thought Analysis: We conducted an extensive analysis of the models’ chains of thought to understand how errors evolved during fine-tuning.
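A before-and-after comparison of this kind can be sketched as a per-domain accuracy table. The helpers `accuracy_by_domain` and `accuracy_gain`, and the record layout of `(domain, is_correct)` pairs, are assumptions made for illustration and do not come from the released code.

```python
from collections import defaultdict


def accuracy_by_domain(records):
    """Map (domain, is_correct) pairs to {domain: fraction answered correctly}."""
    totals = defaultdict(lambda: [0, 0])  # domain -> [correct, attempted]
    for domain, is_correct in records:
        totals[domain][0] += int(is_correct)
        totals[domain][1] += 1
    return {domain: correct / count for domain, (correct, count) in totals.items()}


def accuracy_gain(before, after):
    """Per-domain change in accuracy for domains present in both evaluations."""
    return {d: after[d] - before[d] for d in before if d in after}
```

Running the same held-out problems through the base and fine-tuned checkpoints and passing both result sets through `accuracy_gain` separates in-domain improvement (QFT) from transfer to other physics domains.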
Conclusion
This study is a step forward in the academic exploration of small reasoning models for theoretical physics. Focusing on Quantum Field Theory, we built a verifiable training dataset and demonstrated the potential for improved reasoning through targeted fine-tuning. Our findings suggest that, with the right data and training methodology, small models can improve their reasoning ability in complex domains.
Public Release
In line with open science principles, we are pleased to announce the public release of our data pipeline, verifiable QFT training data, and approximately 200 million tokens of QFT reasoning traces. This release will enable further research and development in the area of physics reasoning models.
