Neurosymbolic Repo-level Code Localization
arXiv:2604.16021v1 (cross-listed)
Abstract
Code localization is a cornerstone of autonomous software engineering. Recent advancements have achieved impressive performance on real-world issue benchmarks. However, we identify a critical yet overlooked bias: these benchmarks are saturated with keyword references (e.g. file paths, function names), encouraging models to rely on superficial lexical matching rather than genuine structural reasoning. We term this phenomenon the Keyword Shortcut.
Introduction
To address the limitations posed by the Keyword Shortcut, we formalize the challenge of Keyword-Agnostic Logical Code Localization (KA-LCL) and introduce KA-LogicQuery, a diagnostic benchmark that necessitates structural reasoning without any naming hints. Our evaluation reveals a catastrophic performance drop of state-of-the-art approaches on KA-LogicQuery, exposing their lack of deterministic reasoning capabilities.
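To make the distinction concrete, the following minimal sketch contrasts an issue-style query that names its target with a keyword-agnostic one. It is an illustration only: the function index, file paths, and both query strings are hypothetical and do not reflect the actual KA-LogicQuery format.

```python
import re

# Hypothetical index mapping function names to files (not from the paper).
FUNCTIONS = {
    "load_config": "app/config.py",
    "parse_args": "app/cli.py",
    "run_server": "app/server.py",
}

def lexical_localize(query):
    """Return the indexed function names that literally appear in the query text."""
    words = set(re.findall(r"[a-z_]+", query.lower()))
    return sorted(name for name in FUNCTIONS if name in words)

# An issue-style query that leaks the target's name.
issue_style = "Crash in load_config when the file is missing"

# A keyword-agnostic query that describes structure only.
keyword_agnostic = (
    "Find the function that is reached only indirectly from the entry point "
    "and performs file I/O"
)

named_hits = lexical_localize(issue_style)
agnostic_hits = lexical_localize(keyword_agnostic)
print(named_hits)
print(agnostic_hits)
```

The lexical matcher succeeds only when the query contains an identifier; the structural description gives it nothing to match, which is the gap KA-LCL is meant to expose.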
Proposed Solution: LogicLoc
In response to the challenges identified, we propose LogicLoc, a novel agentic framework that integrates large language models (LLMs) with the rigorous logical reasoning capabilities of Datalog for precise code localization. The framework operates as follows:
- Extraction of Program Facts: LogicLoc extracts essential program facts from the codebase, creating a structured representation that facilitates deeper reasoning.
- LLM Synthesis: An LLM synthesizes Datalog programs that express the localization query as rules over the extracted facts.
- Parser-Gated Validation: Each synthesized program must pass a parser check before it is executed, so malformed Datalog is rejected early.
- Mutation-Based Feedback: Mutation-based diagnostic feedback at the level of intermediate rules is fed back to refine the synthesized programs; together with parser gating, this is intended to ensure correctness and efficiency.
- High-Performance Execution: The validated Datalog programs are executed by a high-performance inference engine, which enables accurate and verifiable localization in a fully automated, closed-loop workflow.
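The steps above can be sketched in miniature. The Python example below is an illustration under stated assumptions, not LogicLoc's implementation: it uses the standard `ast` module to extract hypothetical `defines` and `calls` facts from a toy source string, and a hand-written fixpoint loop stands in for a Datalog engine evaluating the rules `reach(X,Y) :- calls(X,Y).` and `reach(X,Z) :- reach(X,Y), calls(Y,Z).` The fact names, the toy source, and the query are all invented for illustration.

```python
import ast

# Toy codebase (hypothetical) standing in for a repository.
SOURCE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split()

def run(path):
    return parse(load(path))

def helper():
    return 1
'''

def extract_facts(source):
    """Return (defines, calls): defined function names and (caller, callee) pairs."""
    tree = ast.parse(source)
    defines = set()
    calls = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            defines.add(node.name)
            for inner in ast.walk(node):
                # Only direct calls by plain name are recorded; method calls
                # such as text.split() are ignored in this sketch.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    calls.add((node.name, inner.func.id))
    return defines, calls

def transitive_closure(edges):
    """Naive fixpoint: repeatedly join reach with edges until nothing new is derived."""
    reach = set(edges)
    while True:
        derived = {(x, z) for (x, y) in reach for (y2, z) in edges if y == y2}
        if derived <= reach:
            return reach
        reach |= derived

defines, calls = extract_facts(SOURCE)
reach = transitive_closure(calls)

# Keyword-agnostic query: functions that reach the builtin `open` transitively
# but never call it directly.
indirect_io = sorted(
    f for f in defines
    if (f, "open") in reach and (f, "open") not in calls
)
print(indirect_io)
```

The query names no function in the codebase, yet the answer is fully determined by the extracted facts and the rules. In a real system the fixpoint loop would be replaced by a Datalog engine, with the LLM responsible only for writing the rules.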
Experimental Results
Experimental results demonstrate that LogicLoc significantly outperforms state-of-the-art (SOTA) methods on the KA-LogicQuery benchmark while maintaining competitive performance on popular issue-driven benchmarks. Notably, LogicLoc achieves superior performance with significantly lower token consumption and faster execution times. This efficiency is attained by offloading structural traversal tasks to a deterministic engine, thereby reducing the overhead associated with iterative LLM inference.
Conclusion
LogicLoc addresses the Keyword Shortcut by pairing LLM-synthesized Datalog programs with deterministic execution, so that localization rests on structural reasoning rather than superficial keyword matching. The reported gains on KA-LogicQuery, together with competitive results on issue-driven benchmarks, suggest that this neurosymbolic design is a promising direction for code localization in autonomous software engineering.
