Learning to Retrieve from Agent Trajectories for AI Search

Source: arXiv:2604.04949v1

Abstract

Information retrieval (IR) systems have traditionally been designed and trained for human users, with learning-to-rank methods relying heavily on large-scale human interaction logs such as clicks and dwell time. With the rapid emergence of large language model (LLM) powered search agents, however, retrieval is increasingly consumed by agents rather than human beings, and is embedded as a core component within multi-turn reasoning and action loops. In this setting, retrieval models trained under human-centric assumptions exhibit a fundamental mismatch with the way agents issue queries and consume results.

Introduction

In this work, we argue that retrieval models for agentic search should be trained directly from agent interaction data. We introduce learning to retrieve from agent trajectories as a new training paradigm, where supervision is derived from multi-step agent interactions. This shift in focus is essential for aligning retrieval systems with the unique interaction patterns exhibited by search agents.

Methodology

Through a systematic analysis of search agent trajectories, we identify key behavioral signals that reveal document utility. These signals include:

  • Browsing actions: Which retrieved documents an agent chooses to open and read.
  • Unbrowsed rejections: Retrieved documents that an agent passes over without opening them.
  • Post-browse reasoning traces: The reasoning an agent produces after reading a document, which indicates how useful the document turned out to be.
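
To make the three signals concrete, here is a minimal sketch of how they could be turned into training examples. It is an illustration, not the paper's actual mining procedure: the `SearchStep` record, the `usefulness` score (assumed to come from some scorer applied to the post-browse reasoning trace), and the labeling rules are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SearchStep:
    """One search turn in a hypothetical agent trajectory log."""
    query: str
    retrieved: List[str]  # doc ids returned by the retriever
    browsed: List[str]    # doc ids the agent chose to open
    # Assumed usefulness score in [0, 1] per browsed doc, derived from the
    # agent's post-browse reasoning trace by some external scorer.
    usefulness: Dict[str, float] = field(default_factory=dict)


def mine_examples(steps: List[SearchStep]) -> List[Tuple[str, str, int, float]]:
    """Turn trajectory steps into (query, doc_id, label, weight) tuples."""
    examples = []
    for step in steps:
        browsed = set(step.browsed)
        for doc_id in step.retrieved:
            if doc_id in browsed:
                # Browsing action: treat as a positive, weighted by usefulness
                # (default weight 1.0 when no score is available).
                weight = step.usefulness.get(doc_id, 1.0)
                examples.append((step.query, doc_id, 1, weight))
            else:
                # Unbrowsed rejection: treat as a negative with unit weight.
                examples.append((step.query, doc_id, 0, 1.0))
    return examples
```

For a step where the agent retrieved three documents and opened only one, this yields one weighted positive and two negatives for the same query.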

Proposed Framework: LRAT

Guided by these insights, we propose LRAT, a simple yet effective framework that mines high-quality retrieval supervision from agent trajectories. This framework incorporates relevance intensity through weighted optimization, allowing it to better capture the nuances of agent interactions.
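
The summary does not spell out the loss LRAT uses, so the following is only a sketch of one common way to fold a relevance intensity into training: scale a standard contrastive (InfoNCE) loss by a per-example weight. The function name, the `temperature` default, and the idea that `weight` comes from the mined trajectory signals are assumptions made for illustration.

```python
import math
from typing import List


def weighted_infonce(pos_score: float, neg_scores: List[float],
                     weight: float, temperature: float = 0.05) -> float:
    """Weighted InfoNCE loss for one query (illustrative, not LRAT's exact loss).

    pos_score / neg_scores are query-document similarity scores; weight is an
    assumed relevance intensity for the positive document.
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of the positive, scaled by its weight.
    return weight * (log_denominator - logits[0])
```

Under this formulation, a positive the agent found highly useful contributes more to the gradient than one it opened and quickly set aside, and a weight of zero removes the example entirely.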

Results

Extensive experiments on both in-domain and out-of-domain deep research benchmarks demonstrate that retrievers trained with LRAT consistently improve:

  • Evidence recall: The share of relevant evidence documents that the retriever actually surfaces.
  • End-to-end task success: How often the agent completes the task correctly using the retrieved information.
  • Execution efficiency: The speed and resource usage of the retrieval process.
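
As a rough illustration of the first metric, evidence recall is often computed as recall at a cutoff k. The sketch below assumes that form; the paper's exact definition and cutoff are not given in this summary, and the function and argument names are hypothetical.

```python
from typing import Iterable, List


def recall_at_k(ranked_doc_ids: List[str],
                evidence_doc_ids: Iterable[str], k: int) -> float:
    """Fraction of gold evidence documents found in the top-k results."""
    evidence = set(evidence_doc_ids)
    if not evidence:
        # No gold evidence for this query: define recall as 0.0 by convention.
        return 0.0
    hits = evidence.intersection(ranked_doc_ids[:k])
    return len(hits) / len(evidence)
```

For example, if two documents count as evidence and only one of them appears in the top three results, recall at 3 is 0.5.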

Conclusion

Our results highlight agent trajectories as a practical and scalable supervision source, pointing to a promising direction for retrieval in the era of agentic search. As the landscape of information retrieval continues to evolve with advancements in artificial intelligence, adapting retrieval models to better suit the needs of agents will be crucial for future developments in the field.


Lazarus Omolua (https://richlyai.com/blog)