LLM Attribution Analysis Across Different Fine-Tuning Strategies and Model Scales for Automated Code Compliance
arXiv:2604.15589v1 (cross-listed)
Abstract
Existing research on large language models (LLMs) for automated code compliance has focused primarily on performance, treating the models as black boxes and overlooking how training decisions affect their interpretive behavior. This paper addresses that gap by using perturbation-based attribution analysis to compare the interpretive behavior of LLMs across fine-tuning strategies, namely full fine-tuning (FFT), low-rank adaptation (LoRA), and quantized LoRA fine-tuning, and across model scales, that is, LLMs of varying parameter counts.
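To make "perturbation-based attribution" concrete, here is a minimal sketch of one common variant, leave-one-out occlusion: each input token is masked in turn, and the drop in a scalar score is taken as that token's attribution. The abstract does not specify the paper's exact perturbation scheme, so this is an illustration only. The names `occlusion_attribution`, `toy_score`, and `TOY_WEIGHTS`, the `[MASK]` placeholder, and the example clause are all hypothetical; in practice `score_fn` would wrap a fine-tuned LLM (for example, the log-likelihood of the reference rule given the perturbed input).

```python
from typing import Callable, Dict, List

def occlusion_attribution(
    tokens: List[str],
    score_fn: Callable[[List[str]], float],
    mask: str = "[MASK]",
) -> List[float]:
    """Leave-one-out attribution: score drop when each token is masked."""
    base = score_fn(tokens)
    attributions = []
    for i in range(len(tokens)):
        # Replace only token i with the mask and re-score the input.
        perturbed = tokens[:i] + [mask] + tokens[i + 1:]
        attributions.append(base - score_fn(perturbed))
    return attributions

# Hypothetical stand-in for a model score (assumption, not the paper's model):
# it rewards the presence of a numerical constraint and a rule identifier.
TOY_WEIGHTS: Dict[str, float] = {"850": 2.0, "4.2.1": 1.0}

def toy_score(tokens: List[str]) -> float:
    return sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)

clause = ["Clause", "4.2.1", ":", "width", ">=", "850", "mm"]
scores = occlusion_attribution(clause, toy_score)
```

With this toy scorer, only the rule identifier and the numerical constraint receive non-zero attribution, because masking any other token leaves the score unchanged.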
Key Findings
The results show that the interpretive behavior of LLMs varies systematically with both training methodology and model size. The key findings are:
- Full Fine-Tuning (FFT) vs. Parameter-Efficient Methods: FFT produces attribution patterns that are statistically different and more focused than those derived from parameter-efficient fine-tuning methods such as LoRA and quantized LoRA.
- Impact of Model Scale: As model scale increases, LLMs develop distinct interpretive strategies, particularly in how they prioritize numerical constraints and rule identifiers within the code.
- Performance Plateau: Despite the improved interpretive strategies of larger models, gains in semantic similarity between the generated and reference computer-processable rules plateau for models exceeding 7 billion parameters.
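One way to put a number on "more focused" attribution patterns is to measure how concentrated the attribution mass is across input tokens. The sketch below uses one minus the normalized Shannon entropy of the absolute attribution scores. This metric is an assumption chosen for illustration; the summary does not say which measure or statistical test the paper uses, and `attribution_focus` is a hypothetical helper name.

```python
import math
from typing import Sequence

def attribution_focus(scores: Sequence[float]) -> float:
    """Return 1 - normalized entropy of |scores|.

    1.0 means all attribution mass sits on one token; 0.0 means it is
    spread uniformly. Returns 0.0 when the focus is undefined (fewer
    than two tokens, or all-zero scores).
    """
    magnitudes = [abs(s) for s in scores]
    total = sum(magnitudes)
    n = len(magnitudes)
    if n < 2 or total == 0:
        return 0.0
    probs = [m / total for m in magnitudes]
    # Shannon entropy in nats; zero-probability terms contribute nothing.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Dividing by log(n), the maximum entropy, maps the result into [0, 1].
    return 1.0 - entropy / math.log(n)

# Illustrative, made-up attribution vectors (not results from the paper).
focused = [0.0, 0.1, 0.0, 0.0, 0.0, 0.9, 0.0]
diffuse = [0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 0.1]
```

Under this measure, `attribution_focus(focused)` is higher than `attribution_focus(diffuse)`. Comparing per-example focus values between two fine-tuning strategies would then call for a significance test of the reader's choosing.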
Importance of Explainability in LLMs
These results underscore the importance of explainability in LLMs, especially when they are applied to critical, regulation-based tasks in industries such as Architecture, Engineering, and Construction (AEC). Understanding how different fine-tuning strategies influence a model's interpretive behavior can lead to more effective and transparent models for code compliance.
Future Directions
This research sets the stage for future exploration into the optimization of LLMs for regulatory compliance tasks. Further studies could investigate:
- The integration of hybrid fine-tuning methods that combine the benefits of FFT and parameter-efficient strategies.
- The development of metrics that better quantify interpretive behavior in LLMs across various regulatory contexts.
- The exploration of additional model scales beyond 7B parameters to determine if further improvements in interpretive behavior can be achieved without sacrificing performance.
Conclusion
This paper contributes to the ongoing discourse surrounding the interpretability of LLMs and the significance of training methodologies in shaping their behavior. As the demand for automated code compliance solutions grows, understanding these dynamics will be vital for developing LLMs that not only perform effectively but also provide clarity and transparency in their decision-making processes.
