Beyond RAG for Cyber Threat Intelligence: A Systematic Evaluation of Graph-Based and Agentic Retrieval
arXiv:2604.11419v1
Abstract: Cyber threat intelligence (CTI) analysts must answer complex questions over large collections of narrative security reports. Retrieval-augmented generation (RAG) gives language models access to external knowledge, but standard vector retrieval often struggles when a query requires reasoning about relationships among entities such as threat actors, malware, and vulnerabilities, because the relevant evidence is scattered across many text chunks and documents. Knowledge graphs address this by representing entities and their relations explicitly, which supports structured multi-hop reasoning.
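As an illustration of what multi-hop reasoning over explicit relations looks like, the following minimal sketch follows a chain of relations through a toy graph. The triples, the entity names (`ActorA`, `MalwareX`, `VULN-1`), and the helper names `build_index` and `multi_hop` are all hypothetical and are not taken from the paper or its knowledge graph.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples; not from the paper's data.
TRIPLES = [
    ("ActorA", "uses", "MalwareX"),
    ("ActorA", "uses", "MalwareY"),
    ("MalwareX", "exploits", "VULN-1"),
    ("MalwareY", "exploits", "VULN-2"),
    ("ActorB", "uses", "MalwareY"),
]


def build_index(triples):
    """Map each (subject, relation) pair to the set of objects it points to."""
    index = defaultdict(set)
    for subj, rel, obj in triples:
        index[(subj, rel)].add(obj)
    return index


def multi_hop(index, start, relations):
    """Follow the given relations in order, starting from one entity."""
    frontier = {start}
    for rel in relations:
        # Expand every entity in the frontier by one hop along `rel`.
        frontier = {
            obj
            for node in frontier
            for obj in index.get((node, rel), set())
        }
    return frontier


INDEX = build_index(TRIPLES)

# "Which vulnerabilities are exploited by malware that ActorA uses?"
vulns = multi_hop(INDEX, "ActorA", ["uses", "exploits"])
print(sorted(vulns))
```

In this toy setup the two facts needed for the answer sit in separate triples, which mirrors the situation where the supporting evidence is spread over different passages and a single similarity search may not surface both.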
Several retrieval paradigms have emerged for CTI, including graph-based, agentic, and hybrid approaches. Each carries different assumptions and failure modes, so understanding how they compare in practical CTI scenarios is essential. This study presents a systematic evaluation of four RAG architectures for CTI analysis:
- Standard vector retrieval
- Graph-based retrieval utilizing a CTI knowledge graph
- An agentic variant designed to repair failed graph queries
- A hybrid approach that integrates graph queries with traditional text retrieval
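The four architectures listed above can be contrasted with a minimal sketch. Everything here is an assumption made for illustration: the toy passages and graph, the token-overlap ranking that stands in for embedding search, the alias table used as the "repair" step, and all function names. The abstract does not describe how the paper's systems are implemented.

```python
import re

# Hypothetical toy corpus and graph; none of this comes from the paper.
PASSAGES = [
    "ActorA was observed deploying MalwareX against finance targets.",
    "MalwareX exploits VULN-1 in an exposed web service.",
    "ActorB relies on phishing for initial access.",
]
GRAPH = {
    ("ActorA", "uses"): {"MalwareX"},
    ("MalwareX", "exploits"): {"VULN-1"},
}
# Assumed alias table that the repair step uses to rewrite unknown relations.
RELATION_ALIASES = {"deploys": "uses", "leverages": "exploits"}


def tokens(text):
    """Lowercase word tokens, keeping digits and hyphens."""
    return set(re.findall(r"[a-z0-9-]+", text.lower()))


def vector_retrieve(question, k=2):
    """Stand-in for embedding search: rank passages by token overlap."""
    q = tokens(question)
    ranked = sorted(PASSAGES, key=lambda p: len(q & tokens(p)), reverse=True)
    return ranked[:k]


def graph_retrieve(entity, relation):
    """Exact-match graph lookup; an unknown relation yields an empty set."""
    return set(GRAPH.get((entity, relation), set()))


def agentic_graph_retrieve(entity, relation, max_repairs=1):
    """Retry a failed graph query after rewriting the relation name."""
    result = graph_retrieve(entity, relation)
    repairs = 0
    while not result and repairs < max_repairs:
        repaired = RELATION_ALIASES.get(relation)
        if repaired is None:
            break  # no known repair, give up
        relation = repaired
        result = graph_retrieve(entity, relation)
        repairs += 1
    return result


def hybrid_retrieve(question, entity, relation, k=2):
    """Combine structured graph facts with retrieved text passages."""
    facts = sorted(
        f"{entity} {relation} {obj}" for obj in graph_retrieve(entity, relation)
    )
    return {"facts": facts, "passages": vector_retrieve(question, k)}


# A malformed relation fails on the plain graph path but is repaired once.
print(graph_retrieve("ActorA", "deploys"))
print(agentic_graph_retrieve("ActorA", "deploys"))
print(hybrid_retrieve("What exploits VULN-1?", "MalwareX", "exploits", k=1))
```

The sketch makes the differing failure modes visible: the graph path returns nothing when the query does not match the schema, the agentic path recovers only if a repair exists, and the hybrid path still returns text passages when the graph has no matching fact.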
We evaluated these systems on a dataset of 3,300 CTI question-answer pairs covering a range of query types, including:
- Factual lookups
- Multi-hop relational queries
- Analyst-style synthesis questions
- Unanswerable cases
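One way to report results separately for each query type is sketched below. The exact-match scoring, the `UNANSWERABLE` abstention marker, the example records, and the function name `score_by_type` are assumptions for illustration; the abstract does not state which answer-quality metric the paper uses, and exact match would not be adequate for analyst-style synthesis questions.

```python
from collections import defaultdict

ABSTAIN = "UNANSWERABLE"  # assumed marker a system returns when it declines

# Hypothetical examples; an answer of None marks an unanswerable question.
EXAMPLES = [
    {"type": "factual",
     "question": "Which malware does ActorA use?",
     "answer": "MalwareX"},
    {"type": "multi_hop",
     "question": "Which vulnerability is exploited by malware that ActorA uses?",
     "answer": "VULN-1"},
    {"type": "unanswerable",
     "question": "Which malware does ActorZ use?",
     "answer": None},
]


def score_by_type(examples, answer_fn):
    """Return exact-match accuracy per query type for one system."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for ex in examples:
        predicted = answer_fn(ex["question"])
        # Unanswerable items count as correct only if the system abstains.
        expected = ex["answer"] if ex["answer"] is not None else ABSTAIN
        totals[ex["type"]] += 1
        if predicted.strip().lower() == expected.strip().lower():
            correct[ex["type"]] += 1
    return {qtype: correct[qtype] / totals[qtype] for qtype in totals}
```

Here `answer_fn` is any callable that maps a question string to an answer string, so the same harness can be applied to each retrieval architecture in turn.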
The results indicate that graph grounding markedly improves performance on structured factual queries. The hybrid graph-text approach improves answer quality by up to 35 percent on multi-hop questions compared with vector RAG, and it is more reliable than graph-only systems.
These findings highlight the importance of matching the retrieval paradigm to the CTI task. Knowledge graphs and hybrid retrieval can substantially improve the effectiveness of CTI analysis and support better-informed cybersecurity decisions. Further research is needed to refine these approaches and address the remaining challenges in CTI retrieval.
