HELM: Advanced Memory for Long-Horizon Vision-Language Tasks

HELM: Harness-Enhanced Long-horizon Memory for Vision-Language-Action Manipulation

Source: arXiv:2604.18791v1 (announce type: cross)

Abstract

Vision-Language-Action (VLA) models have proven effective on short-horizon manipulation tasks, but they fail systematically on long-horizon ones. Extending the context length alone does not close this gap. The failures stem from three persistent execution-loop deficiencies: the memory gap, the verification gap, and the recovery gap.

Introduction to HELM

To address these deficiencies, we introduce HELM, a model-agnostic framework that improves long-horizon manipulation in VLA models. HELM has three components:

  • Episodic Memory Module (EMM): Retrieves relevant task history from CLIP-indexed keyframes, giving the model context about earlier steps.
  • State Verifier (SV): A learned model that predicts action failure before execution, based on the observation, the action, the subgoal, and memory-conditioned context.
  • Harness Controller (HC): Performs rollback and replanning so the system can recover from problems that arise during task execution.
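
To make the interplay of the three components concrete, here is a minimal sketch of one harness step. It is an illustration under stated assumptions, not the paper's implementation: plain float vectors stand in for CLIP embeddings, and the names EpisodicMemory, harness_step, toy_policy, and toy_verifier are hypothetical. The verifier here is a hand-written stub rather than a learned model, and rollback of the environment state is left out.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class EpisodicMemory:
    """Hypothetical keyframe store; embeddings stand in for CLIP features."""
    keyframes: List[Tuple[Vector, str]] = field(default_factory=list)

    def add(self, embedding: Vector, note: str) -> None:
        self.keyframes.append((embedding, note))

    def retrieve(self, query: Vector, k: int = 2) -> List[str]:
        # Rank stored keyframes by similarity to the current observation.
        ranked = sorted(self.keyframes,
                        key=lambda kf: cosine(query, kf[0]), reverse=True)
        return [note for _, note in ranked[:k]]


def harness_step(
    policy: Callable[[str, List[str], int], str],
    verifier: Callable[[Vector, str, str, List[str]], float],
    memory: EpisodicMemory,
    obs: Vector,
    subgoal: str,
    threshold: float = 0.5,
    max_replans: int = 3,
) -> Tuple[Optional[str], int]:
    """Verify each proposed action before execution; replan if it is rejected.

    Returns (action, number_of_replans), or (None, max_replans) when every
    proposal is rejected and the caller would need to roll back.
    """
    context = memory.retrieve(obs)
    for attempt in range(max_replans):
        action = policy(subgoal, context, attempt)
        # The verifier sees observation, action, subgoal, and memory context.
        if verifier(obs, action, subgoal, context) < threshold:
            return action, attempt
    return None, max_replans


# Toy usage with made-up embeddings and notes.
memory = EpisodicMemory()
memory.add([1.0, 0.0], "drawer opened")
memory.add([0.0, 1.0], "mug placed on plate")
memory.add([0.9, 0.1], "drawer handle grasped")


def toy_policy(subgoal: str, context: List[str], attempt: int) -> str:
    return ["open drawer", "reach into drawer"][min(attempt, 1)]


def toy_verifier(obs: Vector, action: str, subgoal: str,
                 context: List[str]) -> float:
    # Opening a drawer that memory says is already open is predicted to fail.
    return 0.9 if action == "open drawer" and "drawer opened" in context else 0.1


print(memory.retrieve([1.0, 0.0]))
# ['drawer opened', 'drawer handle grasped']
print(harness_step(toy_policy, toy_verifier, memory, [1.0, 0.0],
                   "take item from drawer"))
# ('reach into drawer', 1)
```

In this toy run the first proposal is rejected only because the retrieved memory records that the drawer is already open, which mirrors the idea that the verifier relies on memory-conditioned context.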

The State Verifier: A Core Contribution

The State Verifier is HELM’s core learning contribution. In our evaluations, the SV consistently outperforms rule-based feasibility checks and ensemble-uncertainty baselines, and its effectiveness depends critically on access to episodic memory.

Performance Improvements

On the LIBERO-LONG benchmark, HELM raises the task success rate of OpenVLA from 58.4% to 81.5%, a gain of 23.1 percentage points. By comparison, extending the context window to H=32 yields only a 5.4-point improvement, and same-budget LoRA adaptation remains 12.2 points below HELM.
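
The reported figures can be lined up with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the absolute success rates for the longer context window and for LoRA are derived here, not reported in the text, and assume the 5.4-point gain is measured against the 58.4% baseline and the 12.2-point gap against HELM's 81.5%.

```python
# Figures quoted in the text (LIBERO-LONG, success rate in percent).
baseline = 58.4      # OpenVLA
helm_gain = 23.1     # HELM improvement over OpenVLA, in percentage points
context_gain = 5.4   # improvement from extending the context window to H=32
lora_gap = 12.2      # how far same-budget LoRA sits below HELM

# Derived values (assumptions about reference points are stated above).
helm = round(baseline + helm_gain, 1)
long_context = round(baseline + context_gain, 1)
lora = round(helm - lora_gap, 1)

print(f"HELM:         {helm}%")
print(f"Context H=32: {long_context}% (derived)")
print(f"LoRA:         {lora}% (derived)")
```

Under these assumptions HELM comes out at 81.5%, the longer context window at 63.8%, and LoRA at 69.3%.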

Enhancements Across Various Tasks

HELM also improves long-horizon performance on the CALVIN benchmark and substantially improves recovery success under controlled perturbations. Ablations and mechanism analyses isolate the contribution of each HELM component.

Introducing LIBERO-Recovery

We release LIBERO-Recovery, a perturbation-injection protocol for evaluating failure recovery in long-horizon manipulation tasks, to support further research on VLA capabilities.

Conclusion

In summary, HELM addresses the memory, verification, and recovery gaps that cause VLA models to fail on long-horizon manipulation tasks, and it lays the groundwork for more resilient vision-language-action systems.


Lazarus Omolua (https://richlyai.com/blog)