Discover HiL-Bench, a benchmark that measures whether AI agents know when to ask for help on uncertain tasks, with the aim of improving their decision-making and performance.
Discover how Spatial-Gym benchmarks spatial reasoning in AI agents step by step, revealing insights that can improve navigation and decision-making models.
Enhance model-based reinforcement learning with advantage-guided diffusion to improve trajectory sampling and boost long-term returns in control tasks.