Beyond Accuracy: Introducing a Symbolic-Mechanistic Approach to Interpretable Evaluation
Source: arXiv:2603.23517v1 (cross-listed)
Abstract: Accuracy-based evaluation cannot reliably distinguish genuine generalization from shortcuts like memorization, leakage, or brittle heuristics, especially in small-data regimes. In this position paper, we argue for mechanism-aware evaluation that combines task-relevant symbolic rules with mechanistic interpretability, yielding algorithmic pass/fail scores that show exactly where models generalize versus exploit patterns. We demonstrate this on NL-to-SQL by training two identical architectures under different conditions: one without schema information (forcing memorization), one with schema (enabling grounding).
Standard evaluation shows the memorization model achieves 94% field-name accuracy on unseen data, falsely suggesting competence. Our symbolic-mechanistic evaluation reveals this model violates core schema generalization rules, a failure invisible to accuracy metrics.
The Limitations of Accuracy-Based Evaluation
Evaluation in artificial intelligence has traditionally centered on accuracy metrics. Accuracy alone, however, cannot reliably distinguish genuine generalization from shortcuts, for several reasons:
- Memorization Over Generalization: Models may achieve high accuracy by memorizing training data rather than learning to generalize from it.
- Data Leakage: Models could perform well due to unintentional leakage of information from training to test sets, skewing results.
- Brittle Heuristics: Models may rely on shortcuts or patterns that do not hold true in broader contexts, leading to poor performance in real-world applications.
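As a minimal sketch of why "unseen data" can still reward memorization, the snippet below measures how many identifiers in a test set of SQL queries already appeared in training. Everything here is an assumption made for illustration: the helper names `identifiers` and `seen_fraction`, the toy queries, and the crude regex tokenizer with its incomplete keyword list are not taken from the paper.

```python
import re

# Small, deliberately incomplete keyword list (an assumption of this sketch).
SQL_KEYWORDS = {
    "select", "from", "where", "and", "or", "not", "as", "on", "join",
    "group", "order", "by", "having", "limit", "distinct", "asc", "desc",
}


def identifiers(sql):
    """Return the lowercase non-keyword identifiers in a SQL string."""
    # Drop single-quoted string literals so their words are not counted.
    without_strings = re.sub(r"'[^']*'", "", sql)
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", without_strings)
    return {t.lower() for t in tokens} - SQL_KEYWORDS


def seen_fraction(train_sql, test_sql):
    """Fraction of test-set identifier uses that also occur in training."""
    seen = set()
    for query in train_sql:
        seen |= identifiers(query)
    test_ids = [name for query in test_sql for name in identifiers(query)]
    if not test_ids:
        return 0.0
    return sum(name in seen for name in test_ids) / len(test_ids)


# Hypothetical toy data: the test questions are new, the field names mostly are not.
train = ["SELECT name FROM users WHERE age > 30"]
test = ["SELECT age FROM users", "SELECT email FROM users"]
print(seen_fraction(train, test))
```

A high value means a model could score well on field names by recalling training vocabulary, so accuracy on that test set says little about generalization to new schemas.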
A Symbolic-Mechanistic Approach
To address these shortcomings, we propose a symbolic-mechanistic evaluation framework that integrates symbolic rules with mechanistic interpretability. This approach offers several advantages:
- Algorithmic Pass/Fail Scores: Each task-relevant rule yields a pass or a fail, showing where a model generalizes and where it merely exploits patterns.
- Transparency: Mechanistic interpretability examines the model's internal computation rather than only its outputs, making it easier to see where the model's strengths and weaknesses lie.
- Applicability to Small Data Sets: Accuracy metrics are least reliable in small-data regimes, which is where mechanism-aware checks are most needed.
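The symbolic half of such an evaluation can be sketched as a rule that returns pass or fail for a single model output. The rule below ("every name referenced in the predicted SQL must exist in the schema the model was given") is a hypothetical example chosen for illustration, not one of the paper's rules, and the function names, keyword list, and regex tokenizer are likewise assumptions. The mechanistic half, which inspects model internals, is not covered here.

```python
import re

# Incomplete keyword and function list, assumed for this sketch only.
SQL_KEYWORDS = {
    "select", "from", "where", "and", "or", "not", "as", "on", "join",
    "group", "order", "by", "having", "limit", "distinct", "asc", "desc",
    "count", "sum", "avg", "min", "max", "in", "like", "is", "null",
}


def referenced_names(sql):
    """Return lowercase table and column names used in a SQL string."""
    # Remove single-quoted literals so values are not mistaken for columns.
    without_strings = re.sub(r"'[^']*'", "", sql)
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", without_strings)
    return {t.lower() for t in tokens} - SQL_KEYWORDS


def check_schema_rule(predicted_sql, schema):
    """Pass only if every referenced name exists in the schema.

    `schema` maps table name -> list of column names.
    Returns (passed, unknown_names).
    """
    allowed = {table.lower() for table in schema}
    for columns in schema.values():
        allowed |= {column.lower() for column in columns}
    unknown = referenced_names(predicted_sql) - allowed
    return (not unknown, unknown)


# Hypothetical schema and model outputs.
schema = {"employees": ["full_name", "salary"]}
print(check_schema_rule("SELECT full_name FROM employees WHERE salary > 50000", schema))
print(check_schema_rule("SELECT name FROM employees", schema))
```

The second query fails because `name` is not a column in this schema, even though it is a plausible field name. A real rule set would need a proper SQL parser to handle aliases, quoted identifiers, and nested queries, which this tokenizer does not.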
Case Study: NL-to-SQL
We applied our symbolic-mechanistic evaluation framework to the NL-to-SQL task, training two identical architectures under different conditions:
- Without Schema Information: With no schema to consult, the model was forced to memorize field names, which produced misleadingly high accuracy scores.
- With Schema Information: This configuration allowed the model to ground its understanding in the underlying schema, promoting genuine generalization.
Under standard evaluation, the memorization model achieved 94% field-name accuracy on unseen data, which falsely suggests competence. Our symbolic-mechanistic evaluation showed that the same model violates core schema generalization rules, a failure that is invisible to accuracy metrics.
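The gap between the two views can be illustrated with a toy behavioral probe: score field-name accuracy once on the original schema and once after renaming its columns. This is a sketch under stated assumptions, not the paper's procedure. The two "models" are hand-written stand-ins rather than trained networks, the schema format (column name mapped to a question keyword) is invented for the example, and the numbers it prints are unrelated to the 94% reported above.

```python
# Stand-in for a model trained without schema: a fixed phrase -> field lookup.
MEMORIZED = {"how old": "age", "what is the name": "name"}


def memorizer(question, schema):
    """Ignore the schema and return whatever field was memorized."""
    for phrase, field in MEMORIZED.items():
        if phrase in question.lower():
            return field
    return ""


def grounded(question, schema):
    """Pick the schema column whose keyword appears in the question."""
    for column, keyword in schema.items():
        if keyword in question.lower():
            return column
    return ""


def field_accuracy(model, examples):
    """Fraction of examples where the model returns the gold field name."""
    hits = sum(model(q, schema) == gold for q, schema, gold in examples)
    return hits / len(examples)


# Hypothetical schemas: column name -> keyword that signals it in a question.
original_schema = {"age": "old", "name": "name"}
renamed_schema = {"years": "old", "label": "name"}

ORIGINAL = [
    ("How old is Ana?", original_schema, "age"),
    ("What is the name of user 3?", original_schema, "name"),
]
# Same questions, but the gold fields follow the renamed schema.
RENAMED = [
    ("How old is Ana?", renamed_schema, "years"),
    ("What is the name of user 3?", renamed_schema, "label"),
]

for label, model in [("memorizer", memorizer), ("grounded", grounded)]:
    print(label, field_accuracy(model, ORIGINAL), field_accuracy(model, RENAMED))
```

Both stand-ins are perfect on the original schema, so accuracy alone cannot separate them. Only the renamed schema exposes that the memorizer never reads the schema at all.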
Conclusion
As artificial intelligence continues to evolve, evaluation methods need to go beyond traditional accuracy metrics. A symbolic-mechanistic approach gives researchers and practitioners a way to check whether a model is genuinely proficient or only deceptively accurate.
