Resolving the Robustness-Precision Trade-off in Financial RAG through Hybrid Document-Routed Retrieval
arXiv:2603.26815v1
Abstract
Retrieval-Augmented Generation (RAG) systems for financial document question answering typically follow a chunk-based paradigm: documents are split into fragments, embedded into vector space, and retrieved via similarity search. While effective in general settings, this approach suffers from cross-document chunk confusion in structurally homogeneous corpora such as regulatory filings. Semantic File Routing (SFR), which uses LLM structured output to route queries to whole documents, reduces catastrophic failures but sacrifices the precision of targeted chunk retrieval.
The Trade-off Challenge
In a controlled evaluation on the FinDER benchmark (1,500 queries across five groups), we identify a robustness-precision trade-off between SFR and chunk-based retrieval (CBR):
- SFR achieves a higher average score than CBR (6.45 vs. 6.02).
- SFR produces fewer failures than CBR (10.3% vs. 22.5%).
- CBR yields more perfect answers than SFR (13.8% vs. 8.5%).
Introducing Hybrid Document-Routed Retrieval (HDRR)
To address this trade-off, we propose Hybrid Document-Routed Retrieval (HDRR), a two-stage architecture that first uses SFR as a document filter and then runs chunk-based retrieval scoped to the identified document(s). This design lets HDRR:
- Eliminate cross-document confusion.
- Preserve targeted chunk precision.
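The two stages described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the names `Chunk`, `embed`, `route_documents`, `retrieve_chunks`, and `hdrr_retrieve` are hypothetical, a toy bag-of-words vector stands in for a dense embedding model, a summary-matching function stands in for the LLM structured-output router, and the two filings are invented.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str


def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a real system would use a dense encoder.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def route_documents(query: str, doc_summaries: dict) -> list:
    # Stage 1 (stand-in for SFR): pick the document whose summary best
    # matches the query. The paper uses LLM structured output here; this
    # similarity match is only a placeholder for that call.
    q = embed(query)
    best = max(doc_summaries, key=lambda d: cosine(q, embed(doc_summaries[d])))
    return [best]


def retrieve_chunks(query: str, chunks: list, top_k: int = 1,
                    allowed_docs: Optional[set] = None) -> list:
    # Chunk-based retrieval; optionally restricted to a set of documents.
    pool = [c for c in chunks if allowed_docs is None or c.doc_id in allowed_docs]
    q = embed(query)
    return sorted(pool, key=lambda c: cosine(q, embed(c.text)), reverse=True)[:top_k]


def hdrr_retrieve(query: str, doc_summaries: dict, chunks: list,
                  top_k: int = 1) -> list:
    # Stage 2: chunk retrieval scoped to the routed document(s).
    routed = set(route_documents(query, doc_summaries))
    return retrieve_chunks(query, chunks, top_k, allowed_docs=routed)


# Invented example: two structurally similar filings.
DOC_SUMMARIES = {
    "acme_10k": "Acme Corp annual report 10-K",
    "globex_10k": "Globex Inc annual report 10-K",
}
CHUNKS = [
    Chunk("acme_10k", "Total revenue for the fiscal year was 4.2 billion dollars."),
    Chunk("globex_10k", "Total revenue in fiscal 2023 was 9.1 billion dollars."),
]
QUERY = "What was Acme total revenue in fiscal 2023?"

global_hit = retrieve_chunks(QUERY, CHUNKS)[0]
scoped_hit = hdrr_retrieve(QUERY, DOC_SUMMARIES, CHUNKS)[0]
print(global_hit.doc_id, scoped_hit.doc_id)  # globex_10k acme_10k
```

In this toy corpus, unscoped chunk retrieval returns the Globex chunk because its wording overlaps the query more closely, which is the cross-document confusion described earlier. Routing to the Acme filing first removes that candidate while chunk-level ranking still operates within the selected document.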
Experimental Results
HDRR achieves the best performance on every evaluated metric:
- An average score of 7.54, which is 25.2% above CBR and 16.9% above SFR.
- A failure rate of 6.4%, compared with 10.3% for SFR and 22.5% for CBR.
- A correctness rate of 67.7%, which is an improvement of 18.7 percentage points over CBR.
- A perfect-answer rate of 20.1%, exceeding CBR by 6.3 percentage points and SFR by 11.6 percentage points.
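The figures above mix relative gains (percent) with absolute gains (percentage points). The short check below recomputes them from the reported values; the function and variable names are hypothetical, and the CBR correctness rate is only implied by the reported 18.7-point improvement, not stated directly.

```python
def relative_gain(new: float, base: float) -> float:
    # Relative improvement of `new` over `base`, in percent.
    return (new - base) / base * 100


def point_gain(new: float, base: float) -> float:
    # Absolute improvement, in percentage points.
    return new - base


# Reported average scores.
cbr_avg, sfr_avg, hdrr_avg = 6.02, 6.45, 7.54
# Reported perfect-answer rates (%).
cbr_perfect, sfr_perfect, hdrr_perfect = 13.8, 8.5, 20.1

print(round(relative_gain(hdrr_avg, cbr_avg), 1))      # 25.2
print(round(relative_gain(hdrr_avg, sfr_avg), 1))      # 16.9
print(round(point_gain(hdrr_perfect, cbr_perfect), 1))  # 6.3
print(round(point_gain(hdrr_perfect, sfr_perfect), 1))  # 11.6

# CBR correctness implied by 67.7% and the 18.7-point improvement.
implied_cbr_correct = round(67.7 - 18.7, 1)
print(implied_cbr_correct)  # 49.0
```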
Conclusion
HDRR resolves the robustness-precision trade-off in financial document question answering. By combining Semantic File Routing with chunk-based retrieval, it achieves both the lowest failure rate and the highest precision across all five experimental groups, making RAG systems in the financial domain more reliable and accurate on complex queries.
