HaM-World: Soft-Hamiltonian World Models with Selective Memory for Planning
A new arXiv preprint, “HaM-World: Soft-Hamiltonian World Models with Selective Memory for Planning” (arXiv:2605.05951v1), proposes a structured approach to world models, which support model-based planning through learned latent dynamics. The paper targets a known weakness of imagined rollouts: they become unstable as the planning horizon grows or as the dynamics distribution shifts.
The authors attribute this instability to two structures missing from planner-facing latents: history-conditioned memory, which is needed for approximate Markov completeness, and a geometric organization that separates configuration, momentum, and task semantics.
Introducing HaM-World (HMW)
To address these issues, the researchers propose a structured world model called HaM-World (HMW). It decomposes the latent state into two components: a canonical subspace (q, p), covering configuration and momentum, and a context subspace c. HMW uses Mamba selective state-space memory as a history-conditioned input that feeds into the same latent dynamics.
The (q, p) subspace evolves under an energy-derived Hamiltonian vector field combined with learnable residual and control dynamics. The context subspace c captures semantic, dissipative, and non-conservative factors. With this design, the planner works from a single latent state for dynamics prediction, reward/value estimation, imagined rollouts, and CEM (cross-entropy method) action search.
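The following is a minimal NumPy sketch of how such a latent update and a CEM search over it might look. It is not the authors' implementation. The quadratic stand-in energy, the symplectic-Euler ordering, the matrices `W_res`, `B`, `W_c`, the memory summary `h`, and every function name are assumptions introduced only for illustration. In the paper the energy and residual terms are learned and the memory comes from a Mamba module.

```python
import numpy as np

def grad_energy(q, p):
    # Stand-in for the gradients of a learned energy H(q, p).
    # Assumption: H = 0.5*|p|^2 + 0.5*|q|^2, so dH/dq = q and dH/dp = p.
    return q, p

def soft_hamiltonian_step(q, p, c, u, h, W_res, B, W_c, dt=0.05):
    """Advance the latent (q, p, c) one step given action u and memory summary h."""
    z = np.concatenate([q, p, c, h])
    dH_dq, _ = grad_energy(q, p)
    # Momentum: Hamiltonian force, plus a learnable residual, plus control.
    p_next = p + dt * (-dH_dq + W_res @ z + B @ u)
    # Position: uses the updated momentum (symplectic-Euler ordering).
    _, dH_dp = grad_energy(q, p_next)
    q_next = q + dt * dH_dp
    # Context: unconstrained update for semantic and non-conservative factors.
    c_next = np.tanh(W_c @ z)
    return q_next, p_next, c_next

def cem_plan(step, reward, state0, act_dim, horizon=10, pop=64,
             n_elite=8, iters=5, seed=0):
    """Cross-entropy method over open-loop action sequences.

    step(state, action) -> next state; reward(state) -> float.
    Returns the first action of the final mean sequence.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros((horizon, act_dim))
    std = np.ones((horizon, act_dim))
    for _ in range(iters):
        actions = mean + std * rng.standard_normal((pop, horizon, act_dim))
        returns = np.zeros(pop)
        for i in range(pop):
            state = state0
            for t in range(horizon):
                state = step(state, actions[i, t])
                returns[i] += reward(state)
        # Refit the sampling distribution to the best-scoring sequences.
        elite = actions[np.argsort(returns)[-n_elite:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean[0]
```

To use it, wrap `soft_hamiltonian_step` in a `step(state, action)` closure that fixes `h` and the parameter matrices, then pass that closure and a reward function to `cem_plan`. The same latent tuple `(q, p, c)` serves both the rollout and the reward, mirroring the unified-state idea described above.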
Performance and Results
HaM-World was evaluated on four tasks from the DeepMind Control Suite. It achieves the highest average area under the curve (Avg. AUC) at 117.9, a 9.5% improvement. It also reduces long-horizon rollout error to 45% of that of a strong baseline and wins 11 of 12 mean-squared-error (MSE) cells.
- Out-of-Distribution (OOD) Performance: HaM-World was tested under 12 OOD perturbations, including dynamics shifts, action delays, and observation masking.
- Consistent Returns: The model achieved the highest return in every condition tested, with average OOD-return gains of 10.2% on the Finger Spin task and 13.6% on the Reacher Easy task.
Diagnostic Insights
Mechanism diagnostics reported in the paper show three behaviors:
- Bounded Action-Free Hamiltonian-Energy Drift: When no actions are applied, the Hamiltonian energy drifts only within bounds.
- Structured Energy Variation: Under policy rollouts, the energy varies in a structured rather than erratic way.
- Coherent Control-Induced Energy Transfer: Control inputs change the energy coherently, consistent with the intended Soft-Hamiltonian behavior.
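A diagnostic of this kind can be sketched as follows, as an illustration rather than the paper's evaluation code. It assumes a toy unit-mass harmonic-oscillator energy in place of the learned Hamiltonian, a symplectic-Euler update, and a hypothetical constant control force. It compares the relative energy change of an action-free rollout against a driven one.

```python
import numpy as np

def energy(q, p):
    # Stand-in for a learned H(q, p): unit-mass harmonic oscillator (assumption).
    return 0.5 * float(p @ p) + 0.5 * float(q @ q)

def rollout_energy(q, p, actions, dt=0.01):
    """Roll the conservative core forward with additive control.

    Returns an array with the energy at the start and after every step.
    """
    energies = [energy(q, p)]
    for u in actions:
        p = p + dt * (-q + u)  # -dH/dq plus the control force
        q = q + dt * p         # dH/dp evaluated at the updated momentum
        energies.append(energy(q, p))
    return np.array(energies)

def max_relative_drift(energies):
    """Largest deviation from the initial energy, relative to the initial energy."""
    return float(np.max(np.abs(energies - energies[0])) / abs(energies[0]))

q0, p0 = np.array([1.0]), np.array([0.0])
free = rollout_energy(q0, p0, np.zeros((2000, 1)))         # no actions
driven = rollout_energy(q0, p0, 0.5 * np.ones((2000, 1)))  # constant push
print("action-free drift:", max_relative_drift(free))
print("driven drift:", max_relative_drift(driven))
```

In this toy setting the action-free energy stays within a narrow band around its initial value, while the driven rollout shows large energy changes caused by the control term. That is the qualitative contrast the three diagnostics describe.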
In conclusion, HaM-World pairs a structured latent design, built on Soft-Hamiltonian dynamics and selective memory, with reported gains in return, long-horizon rollout accuracy, and OOD robustness. Models like HMW may pave the way for more robust and flexible planning systems.
