Environment Maps: Structured Environmental Representations for Long-Horizon Agents
Summary: arXiv:2603.23610v1 Announce Type: new
Abstract
Although large language models (LLMs) have advanced rapidly, robust automation of complex software workflows remains an open problem. In long-horizon settings, agents frequently suffer from cascading errors and environmental stochasticity: a single misstep in a dynamic interface can derail the whole task, leaving the agent to hallucinate or fall back on trial and error. This paper introduces Environment Maps, a persistent, agent-agnostic representation that mitigates these failures by consolidating heterogeneous evidence, such as screen recordings and execution traces, into a structured graph.
Key Components of Environment Maps
The representation consists of four core components:
- Contexts: Abstracted locations that allow agents to navigate and understand their environment more effectively.
- Actions: Parameterized affordances that define the possible operations an agent can perform within specific contexts.
- Workflows: Observed trajectories that capture the sequences of actions that lead to successful task completion.
- Tacit Knowledge: Domain definitions and reusable procedures that provide the foundational understanding necessary for effective decision-making.
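To make the relationship between the four components more concrete, the following is a minimal Python sketch of how they might be held together in a single graph. Every name in it (`EnvironmentMap`, `Context`, `Action`, `Workflow`, `is_valid`, and the toy "orders" site) is hypothetical and chosen for illustration; the summary does not specify the paper's actual schema or storage format.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """A parameterized affordance available in one context (hypothetical schema)."""
    name: str
    parameters: list[str]
    leads_to: str  # id of the context the agent is in after the action


@dataclass
class Context:
    """An abstracted location, for example a page or a dialog."""
    context_id: str
    description: str
    actions: dict[str, Action] = field(default_factory=dict)


@dataclass
class Workflow:
    """An observed trajectory: ordered (context_id, action_name) steps."""
    goal: str
    steps: list[tuple[str, str]]


@dataclass
class EnvironmentMap:
    """Graph of contexts (nodes) and actions (edges), plus workflows and notes."""
    contexts: dict[str, Context] = field(default_factory=dict)
    workflows: list[Workflow] = field(default_factory=list)
    tacit_knowledge: dict[str, str] = field(default_factory=dict)

    def add_context(self, context: Context) -> None:
        self.contexts[context.context_id] = context

    def add_action(self, context_id: str, action: Action) -> None:
        self.contexts[context_id].actions[action.name] = action

    def available_actions(self, context_id: str) -> list[str]:
        return list(self.contexts[context_id].actions)

    def is_valid(self, workflow: Workflow) -> bool:
        """Check that each step exists and leads to the next step's context."""
        steps = workflow.steps
        for i, (context_id, action_name) in enumerate(steps):
            context = self.contexts.get(context_id)
            if context is None or action_name not in context.actions:
                return False
            is_last = i + 1 == len(steps)
            if not is_last and context.actions[action_name].leads_to != steps[i + 1][0]:
                return False
        return True


# Toy example: a two-page site with one recorded workflow.
env_map = EnvironmentMap()
env_map.add_context(Context("home", "Landing page"))
env_map.add_context(Context("orders", "Order history list"))
env_map.add_action("home", Action("open_orders", [], "orders"))
env_map.add_action("orders", Action("filter_by_status", ["status"], "orders"))
env_map.tacit_knowledge["pending order"] = "An order that has not shipped yet."

workflow = Workflow(
    goal="List pending orders",
    steps=[("home", "open_orders"), ("orders", "filter_by_status")],
)
env_map.workflows.append(workflow)

print(env_map.available_actions("orders"))
print(env_map.is_valid(workflow))
```

Keeping actions attached to the context that offers them is one plausible way to let an agent ask "what can I do here?" without replaying a full trajectory, and plain dataclasses keep the structure easy for a person to read and edit by hand.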
Evaluation and Results
We evaluate this framework on the WebArena benchmark across five domains. Agents equipped with Environment Maps achieve a 28.2% success rate, nearly double the 14.2% achieved by baselines limited to session-bound context. They also outperform agents given direct access to the raw trajectory data from which the maps were generated, which reach a 23.3% success rate.
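The comparison can be checked with a few lines of arithmetic. The snippet below uses only the three success rates quoted above; the variable names are illustrative.

```python
# Success rates (percent) as reported in the evaluation above.
with_maps = 28.2
session_bound = 14.2
raw_trajectories = 23.3

ratio_vs_baseline = with_maps / session_bound   # relative improvement factor
gain_vs_baseline = with_maps - session_bound    # absolute gain, percentage points
gain_vs_raw = with_maps - raw_trajectories      # absolute gain, percentage points

print(f"{ratio_vs_baseline:.2f}x the session-bound baseline")
print(f"+{gain_vs_baseline:.1f} points over the session-bound baseline")
print(f"+{gain_vs_raw:.1f} points over raw trajectory access")
```

This gives roughly 1.99x the session-bound baseline, a gain of 14.0 percentage points over it, and a gain of 4.9 percentage points over agents with raw trajectory access.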
Conclusion
By providing a structured interface between the model and the environment, Environment Maps establish a persistent foundation for long-horizon planning that is human-interpretable, editable, and incrementally refinable. This improves the reliability of agents in complex settings and offers a basis for further research into more robust automation systems.
