Topology-Aware Reasoning over Incomplete Knowledge Graph with Graph-Based Soft Prompting
arXiv:2604.12503v1
Abstract: Large Language Models (LLMs) have shown remarkable capabilities across various tasks but remain prone to hallucinations in knowledge-intensive scenarios. Knowledge Base Question Answering (KBQA) mitigates this by grounding generation in Knowledge Graphs (KGs). However, most multi-hop KBQA methods rely on explicit edge traversal, making them fragile to KG incompleteness. In this paper, we propose a novel graph-based soft prompting framework that shifts the reasoning paradigm from node-level path traversal to subgraph-level reasoning.
Introduction
Large Language Models (LLMs) have transformed natural language processing, yet they remain prone to hallucination, producing incorrect or nonsensical output, especially in knowledge-intensive tasks. Knowledge Base Question Answering (KBQA) mitigates this by grounding LLM generation in the structured facts stored in Knowledge Graphs (KGs).
The Challenge of Knowledge Graph Incompleteness
Most multi-hop KBQA methods rely on explicit edge traversal within the KG, which makes them fragile when the graph is incomplete: if an edge on the reasoning path is missing, traversal cannot reach the answer and the output may be wrong. To address this, we shift the reasoning paradigm from node-level path traversal to subgraph-level reasoning.
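The fragility described above can be illustrated with a toy example. The following is a minimal sketch, not the paper's method: the triples, the entity names, and the helper functions `follow_path` and `k_hop_entities` are all hypothetical and exist only for illustration. It shows that strict path traversal returns nothing once a single edge is dropped, while a 2-hop neighborhood around the topic entity still contains the answer entity as a candidate.

```python
from collections import defaultdict, deque

# Toy triples (hypothetical data, for illustration only).
TRIPLES = [
    ("Alice", "works_at", "AcmeLab"),
    ("AcmeLab", "located_in", "Uppsala"),
    ("Alice", "lives_in", "Uppsala"),
    ("Uppsala", "part_of", "Sweden"),
]

def follow_path(triples, start, relations):
    """Strict node-level traversal: follow each relation in order along explicit edges."""
    frontier = {start}
    for rel in relations:
        frontier = {t for (h, r, t) in triples if h in frontier and r == rel}
    return frontier

def k_hop_entities(triples, start, k):
    """Collect every entity within k undirected hops of `start` (breadth-first search)."""
    neighbors = defaultdict(set)
    for h, _, t in triples:
        neighbors[h].add(t)
        neighbors[t].add(h)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for nxt in neighbors[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen

path = ["works_at", "located_in"]
complete = follow_path(TRIPLES, "Alice", path)

# Simulate KG incompleteness by dropping one edge on the reasoning path.
incomplete_kg = [t for t in TRIPLES if t != ("AcmeLab", "located_in", "Uppsala")]
broken = follow_path(incomplete_kg, "Alice", path)
candidates = k_hop_entities(incomplete_kg, "Alice", 2)

print(complete, broken, "Uppsala" in candidates)
```

Keeping the answer entity among the candidates does not answer the question by itself; it only means that a model reasoning over the whole subgraph still has the entity available to select.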
Proposed Framework
Our graph-based soft prompting framework uses a Graph Neural Network (GNN) to encode extracted structural subgraphs into soft prompts. These prompts give the LLM richer structural context and let it identify relevant entities beyond immediate graph neighbors, which reduces the model's sensitivity to missing edges in the KG.
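To make the idea of encoding a subgraph into soft prompts concrete, here is a minimal sketch in plain Python. It is an assumption-laden illustration, not the architecture used in the paper: the text does not specify the GNN design, so this uses a single generic mean-aggregation message-passing layer, a mean readout, and random untrained weights. All names (`gnn_layer`, `encode_soft_prompt`, the dimensions, the toy entities) are hypothetical.

```python
import random

def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def relu(vec):
    return [max(0.0, x) for x in vec]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def gnn_layer(features, edges, weight):
    """One round of mean-aggregation message passing (each node includes itself)."""
    neighbors = {node: [node] for node in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    return {
        node: relu(matvec(weight, mean([features[n] for n in nbrs])))
        for node, nbrs in neighbors.items()
    }

def encode_soft_prompt(features, edges, gnn_weight, prompt_projections):
    """Encode a subgraph into a fixed number of vectors in the LLM embedding space."""
    hidden = gnn_layer(features, edges, gnn_weight)
    graph_vec = mean(list(hidden.values()))  # readout: pool node states into one vector
    return [matvec(proj, graph_vec) for proj in prompt_projections]

# Assumed toy sizes; real embedding dimensions are far larger.
NODE_DIM, HIDDEN_DIM, LLM_DIM, NUM_PROMPT_TOKENS = 4, 8, 16, 3
rng = random.Random(0)

features = {
    name: [rng.uniform(-1, 1) for _ in range(NODE_DIM)]
    for name in ["Alice", "AcmeLab", "Uppsala"]
}
edges = [("Alice", "AcmeLab"), ("Alice", "Uppsala")]
gnn_weight = random_matrix(HIDDEN_DIM, NODE_DIM, rng)  # untrained, for shape only
prompt_projections = [
    random_matrix(LLM_DIM, HIDDEN_DIM, rng) for _ in range(NUM_PROMPT_TOKENS)
]

soft_prompt = encode_soft_prompt(features, edges, gnn_weight, prompt_projections)
print(len(soft_prompt), len(soft_prompt[0]))
```

In soft prompting generally, vectors like these are concatenated with the token embeddings of the text prompt rather than being decoded into words, so the encoder can be trained to carry structural information that the LLM reads directly.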
Two-Stage Paradigm
To balance efficiency and performance, we adopt a two-stage paradigm. In the first stage, a lightweight LLM uses the soft prompts to identify the entities and relations relevant to the user's question. In the second stage, a more powerful LLM performs evidence-aware answer generation based on what the first stage identified. This split reserves the expensive model for the final generation step, reducing computational cost while maintaining output quality.
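The control flow of such a two-stage pipeline might look like the sketch below. Both stages are replaced by simple stand-in functions so the example runs without any model: `toy_selector` uses keyword overlap in place of the lightweight LLM (and ignores soft prompts entirely), and `toy_generator` only builds the evidence-aware prompt that a stronger LLM would receive. These functions, their names, and the prompt layout are assumptions made for illustration and are not taken from the paper or its repository.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]

def two_stage_answer(
    question: str,
    candidates: List[Triple],
    select_evidence: Callable[[str, List[Triple]], List[Triple]],
    generate_answer: Callable[[str, List[Triple]], str],
) -> str:
    """Stage 1 narrows the candidates; stage 2 answers from the selected evidence."""
    evidence = select_evidence(question, candidates)
    return generate_answer(question, evidence)

def toy_selector(question: str, candidates: List[Triple]) -> List[Triple]:
    """Stand-in for the lightweight LLM: keep triples whose relation words occur in the question."""
    words = set(question.lower().replace("?", "").split())
    return [t for t in candidates if words & set(t[1].split("_"))]

def toy_generator(question: str, evidence: List[Triple]) -> str:
    """Stand-in for the stronger LLM: return the prompt it would be given."""
    lines = [f"({h}, {r}, {t})" for h, r, t in evidence]
    return "Evidence:\n" + "\n".join(lines) + f"\nQuestion: {question}\nAnswer:"

# Hypothetical candidate triples, for illustration only.
candidates = [
    ("Alice", "works_at", "AcmeLab"),
    ("AcmeLab", "located_in", "Uppsala"),
    ("Alice", "lives_in", "Uppsala"),
    ("Uppsala", "part_of", "Sweden"),
]
question = "Where is the lab that Alice works at located?"

evidence = toy_selector(question, candidates)
prompt = two_stage_answer(question, candidates, toy_selector, toy_generator)
print(prompt)
```

Passing the two stages in as callables keeps the orchestration independent of any particular model, so either stand-in could be swapped for a real model call without changing `two_stage_answer`.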
Experimental Results
We evaluated the approach on four multi-hop KBQA benchmarks. The framework achieves state-of-the-art performance on three of them, which supports moving from explicit edge traversal to subgraph-level reasoning.
Conclusion
This work addresses the limitations that KG incompleteness imposes on Knowledge Base Question Answering. By combining a graph-based soft prompting framework with a two-stage reasoning paradigm, we show that the performance of LLMs on knowledge-intensive tasks can be improved even when the underlying graph is missing edges. The code is available at https://github.com/Wangshuaiia/GraSP.
