Graph-Based vs Agentic Retrieval for Cyber Threat Intel

Beyond RAG for Cyber Threat Intelligence: A Systematic Evaluation of Graph-Based and Agentic Retrieval

Summary: arXiv:2604.11419v1

Abstract: Cyber threat intelligence (CTI) analysts must answer complex questions using large collections of narrative security reports. Retrieval-augmented generation (RAG) systems help language models access external knowledge, but standard vector retrieval often struggles when a query requires reasoning about relationships among entities such as threat actors, malware, and vulnerabilities. The difficulty arises because the relevant evidence is spread across many text fragments and documents. Knowledge graphs address this by representing entities and their relations explicitly, which enables structured multi-hop reasoning.
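To make "structured multi-hop reasoning" concrete, here is a minimal sketch of following a chain of relations through a toy graph. The entity names, relation names, and placeholder CVE identifiers are all invented for illustration, and the functions `build_index` and `follow_path` are hypothetical helpers, not anything described in the paper.

```python
# Toy CTI knowledge graph as (subject, relation, object) triples.
# Every entity and identifier below is a made-up placeholder.
TRIPLES = [
    ("ActorA", "uses", "MalwareX"),
    ("MalwareX", "exploits", "CVE-0000-0001"),
    ("ActorB", "uses", "MalwareY"),
    ("MalwareY", "exploits", "CVE-0000-0002"),
    ("ActorA", "uses", "MalwareY"),
]


def build_index(triples):
    """Index triples by (subject, relation) for constant-time hop lookups."""
    index = {}
    for subj, rel, obj in triples:
        index.setdefault((subj, rel), []).append(obj)
    return index


def follow_path(index, start, relations):
    """Follow a fixed sequence of relations from `start`, one hop per relation."""
    frontier = {start}
    for rel in relations:
        next_frontier = set()
        for node in frontier:
            next_frontier.update(index.get((node, rel), []))
        frontier = next_frontier
    return sorted(frontier)


index = build_index(TRIPLES)
# Two-hop question: which vulnerabilities are exploited by malware that ActorA uses?
print(follow_path(index, "ActorA", ["uses", "exploits"]))
# → ['CVE-0000-0001', 'CVE-0000-0002']
```

In a text corpus, the "uses" fact and the "exploits" fact could sit in different reports, so a single similarity search may surface only one of them. The graph makes the second hop an explicit lookup.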

Several retrieval paradigms have emerged for CTI, including graph-based, agentic, and hybrid approaches. Each makes different assumptions and has different failure modes, so it is important to understand how they compare in practical CTI scenarios. This study presents a systematic evaluation of four RAG architectures for CTI analysis:

Standard vector retrieval
Graph-based retrieval utilizing a CTI knowledge graph
An agentic variant designed to repair failed graph queries
A hybrid approach that integrates graph queries with traditional text retrieval
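The four architectures can be sketched side by side on toy data. This is only an illustration under stated assumptions: the paper's actual retrievers, graph schema, and repair strategy are not given in the abstract. Token overlap stands in for embedding similarity, the synonym table `RELATION_SYNONYMS` is a hypothetical repair mechanism, and having the hybrid path reuse the repaired graph query is a choice made for this sketch.

```python
import re

# Toy corpus and graph; every name below is invented for illustration.
DOCS = [
    "ActorA was observed deploying MalwareX against energy firms.",
    "MalwareX exploits CVE-0000-0001 to gain initial access.",
    "A phishing campaign delivered an unrelated loader.",
]
GRAPH = {
    ("ActorA", "uses"): ["MalwareX"],
    ("MalwareX", "exploits"): ["CVE-0000-0001"],
}
KNOWN_RELATIONS = {"uses", "exploits"}
# Hypothetical repair table: unsupported relation name -> schema relation.
RELATION_SYNONYMS = {"deploys": "uses", "leverages": "exploits"}


def tokens(text):
    return set(re.findall(r"[a-z0-9-]+", text.lower()))


def vector_retrieve(question, docs, k=2):
    """Stand-in for embedding search: rank passages by token overlap."""
    q = tokens(question)
    return sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)[:k]


def graph_query(entity, relation):
    """Graph-only retrieval; raises on relations outside the schema."""
    if relation not in KNOWN_RELATIONS:
        raise ValueError(f"unknown relation: {relation}")
    return GRAPH.get((entity, relation), [])


def agentic_graph_query(entity, relation, max_repairs=1):
    """Retry a failed graph query after rewriting the relation name."""
    for _ in range(max_repairs + 1):
        try:
            return graph_query(entity, relation)
        except ValueError:
            repaired = RELATION_SYNONYMS.get(relation)
            if repaired is None:
                return []  # no repair available; give up
            relation = repaired
    return []


def hybrid_retrieve(question, entity, relation, k=2):
    """Combine structured graph facts with retrieved text passages."""
    return {
        "facts": agentic_graph_query(entity, relation),
        "passages": vector_retrieve(question, DOCS, k=k),
    }


result = hybrid_retrieve("Which malware does ActorA deploy?", "ActorA", "deploys")
print(result["facts"])
print(result["passages"][0])
```

Here the graph-only call fails because "deploys" is not in the schema, the agentic variant recovers by rewriting it to "uses", and the hybrid result carries both the graph fact and the supporting passage.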

To effectively assess these systems, we evaluated them on a dataset comprising 3,300 CTI question-answer pairs. These pairs encompassed a variety of query types, including:

Factual lookups
Multi-hop relational queries
Analyst-style synthesis questions
Unanswerable cases
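One way to report results over a mixed benchmark like this is to score each query type separately, treating abstention as the only correct response to an unanswerable question. The sketch below assumes exact-match scoring and a fixed abstention string; both are stand-ins, since the abstract does not specify the paper's answer-quality metric. The examples and predictions are placeholders, not data from the study.

```python
from collections import defaultdict

ABSTAIN = "I don't know"  # hypothetical abstention marker


def score(example, prediction):
    """Exact match; for unanswerable items only an abstention counts as correct."""
    if example["type"] == "unanswerable":
        return 1.0 if prediction == ABSTAIN else 0.0
    return 1.0 if prediction.strip().lower() == example["answer"].strip().lower() else 0.0


def accuracy_by_type(examples, predictions):
    """Mean score per query type, so one category cannot mask another."""
    scores = defaultdict(list)
    for example, prediction in zip(examples, predictions):
        scores[example["type"]].append(score(example, prediction))
    return {qtype: sum(vals) / len(vals) for qtype, vals in scores.items()}


# Placeholder items, one or two per query type; not taken from the dataset.
examples = [
    {"type": "factual", "answer": "MalwareX"},
    {"type": "multi_hop", "answer": "CVE-0000-0001"},
    {"type": "unanswerable", "answer": None},
    {"type": "unanswerable", "answer": None},
]
predictions = ["malwarex", "CVE-0000-0002", ABSTAIN, "MalwareX"]

print(accuracy_by_type(examples, predictions))
```

Breaking results down this way matters because a system that answers everything confidently can look strong on factual lookups while failing every unanswerable case.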

The results indicate that graph grounding improves performance on structured factual queries. The hybrid graph-text approach improved answer quality by up to 35 percent on multi-hop questions compared with vector RAG, and it performed more reliably than graph-only systems.

In conclusion, our study shows that the choice of retrieval paradigm matters for CTI tasks. The findings suggest that knowledge graphs and hybrid retrieval can substantially improve CTI analysis and support better-informed security decisions. Further research is needed to refine these approaches and address the remaining challenges in CTI retrieval.
