RAMP: Hybrid DRL for Online Numeric Action Model Learning

RAMP: Hybrid DRL for Online Learning of Numeric Action Models

Source: arXiv:2604.08685v1 (announce type: new)

Abstract: Automated planning algorithms require an action model specifying the preconditions and effects of each action, but obtaining such a model is often hard. Learning action models from observations is feasible, but existing algorithms for numeric domains are offline, requiring expert traces as input. We propose the Reinforcement learning, Action Model learning, and Planning (RAMP) strategy for learning numeric planning action models online via interactions with the environment.

Introduction

Automated planning is used in fields such as robotics and operations research. A central challenge is acquiring an action model, which defines the preconditions and effects of each action. Existing algorithms for learning such models in numeric domains work offline and need expert-generated traces as input, which can be time-consuming or impractical to collect.

The RAMP Strategy

The RAMP framework introduces a novel approach to address these challenges by enabling the online learning of numeric action models through direct interaction with the environment. This hybrid strategy incorporates three primary components:

Deep Reinforcement Learning (DRL) Policy: RAMP trains a DRL policy from environment feedback while the action model is being learned.
Numeric Action Model Learning: From past interactions, RAMP learns a numeric action model that captures each action's preconditions and effects.
Planning: RAMP uses the learned action model to plan future actions, and the resulting plans support the training of the DRL policy.
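To make the idea of a learned numeric action model concrete, here is a minimal sketch in Python. It is a deliberately simplified stand-in, not the learning algorithm used in RAMP: it assumes every effect is a constant additive change, and it approximates preconditions by the range of values seen in states where the action was executed. The names `LearnedAction`, `observe`, `applicable`, and `apply` are hypothetical.

```python
from dataclasses import dataclass, field

State = dict[str, float]  # numeric fluent name -> value


@dataclass
class LearnedAction:
    """Toy numeric action model built from observed transitions (illustrative only)."""
    name: str
    lower: State = field(default_factory=dict)  # smallest value seen per fluent
    upper: State = field(default_factory=dict)  # largest value seen per fluent
    delta: State = field(default_factory=dict)  # assumed constant additive effect

    def observe(self, before: State, after: State) -> None:
        """Update the model from one successful execution of the action."""
        for fluent, value in before.items():
            self.lower[fluent] = min(self.lower.get(fluent, value), value)
            self.upper[fluent] = max(self.upper.get(fluent, value), value)
            # Assumption: the effect is the same in every state.
            self.delta[fluent] = after[fluent] - value

    def applicable(self, state: State) -> bool:
        """Conservative precondition: the state lies inside the observed value ranges."""
        if not self.lower:
            return False  # nothing observed yet, so nothing is known to be safe
        return all(self.lower[f] <= state[f] <= self.upper[f] for f in self.lower)

    def apply(self, state: State) -> State:
        """Predict the next state using the learned additive effects."""
        return {f: v + self.delta.get(f, 0.0) for f, v in state.items()}


move = LearnedAction("move")
move.observe({"fuel": 5.0, "x": 0.0}, {"fuel": 4.0, "x": 1.0})
move.observe({"fuel": 3.0, "x": 2.0}, {"fuel": 2.0, "x": 3.0})
print(move.applicable({"fuel": 4.0, "x": 1.0}))  # inside the observed ranges
print(move.apply({"fuel": 4.0, "x": 1.0}))
```

The conservative precondition means the model only claims an action is applicable in states similar to ones already seen, so a planner relying on it errs toward finding no plan rather than an invalid one.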

Positive Feedback Loop

A key advantage of RAMP is the positive feedback loop between its components. As the DRL policy gathers data from the environment, that data refines the action model. The improved model in turn lets the planner generate more effective plans, which further aid the training of the DRL policy. Each component therefore improves the inputs of the others as learning proceeds.
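The loop described above can be sketched on a toy problem. This is one plausible reading of the cycle, not the paper's implementation: a random choice stands in for the DRL policy, the "model" is just a table of learned additive effects on a single counter, and the planner is a breadth-first search. All names (`plan`, `run_episode`, `ACTIONS`, `GOAL`, `HORIZON`) are hypothetical.

```python
import random
from collections import deque

ACTIONS = {"inc": 1, "dec": -1}  # true effects, hidden from the agent
GOAL, HORIZON = 3, 20            # reach counter value 3 within 20 steps


def plan(model, start, goal, max_depth=10):
    """Breadth-first search that uses only the effects learned so far."""
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        x, path = frontier.popleft()
        if x == goal:
            return path
        if len(path) >= max_depth:
            continue
        for name, delta in model.items():
            nxt = x + delta
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [name]))
    return None  # the current model is not good enough to reach the goal


def run_episode(model, rng):
    """Act for one episode, planning when possible and exploring otherwise."""
    x, steps = 0, 0
    while x != GOAL and steps < HORIZON:
        found = plan(model, x, GOAL)
        # Follow the plan if one exists; otherwise fall back to the policy
        # (a random choice here, standing in for a trained DRL policy).
        action = found[0] if found else rng.choice(list(ACTIONS))
        nxt = x + ACTIONS[action]
        model[action] = nxt - x  # every transition refines the learned model
        x, steps = nxt, steps + 1
    return steps if x == GOAL else None


rng = random.Random(0)
model = {}
for episode in range(5):
    print(episode, run_episode(model, rng), sorted(model))
```

Once exploration has revealed the effect of `inc`, the planner takes over and later episodes reach the goal in the minimum three steps. In a full system the planned trajectories would also be fed back as training data for the policy, which this sketch omits.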

Numeric PDDLGym Framework

To integrate reinforcement learning with numeric planning, RAMP includes Numeric PDDLGym, an automated framework that converts numeric planning problems into Gym environments. This lets researchers and practitioners apply existing RL tools to numeric planning domains without writing a custom environment for each problem.
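As a rough illustration of what such a conversion involves, the sketch below wraps a tiny numeric planning problem in a class that follows the Gymnasium `reset`/`step` calling convention. It is not the Numeric PDDLGym API, which this article does not describe. The class and field names, the restriction to "fluent >= bound" preconditions and goals, and the sparse reward are all assumptions made for the example, and the code uses only the standard library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumericAction:
    name: str
    min_values: dict  # precondition: each fluent must be >= its bound
    deltas: dict      # effect: additive change per fluent


class NumericPlanningEnv:
    """Gym-style wrapper around a numeric planning problem (illustrative only)."""

    def __init__(self, init, actions, goal, max_steps=50):
        self.init, self.goal = dict(init), dict(goal)
        self.actions, self.max_steps = list(actions), max_steps
        self.fluents = sorted(init)  # fixed order for the observation vector

    def _obs(self):
        return tuple(self.state[f] for f in self.fluents)

    def reset(self, seed=None, options=None):
        self.state, self.steps = dict(self.init), 0
        return self._obs(), {}

    def step(self, action_index):
        action = self.actions[action_index]
        self.steps += 1
        applicable = all(self.state[f] >= b for f, b in action.min_values.items())
        if applicable:  # inapplicable actions leave the state unchanged
            for f, d in action.deltas.items():
                self.state[f] += d
        terminated = all(self.state[f] >= v for f, v in self.goal.items())
        truncated = not terminated and self.steps >= self.max_steps
        reward = 1.0 if terminated else 0.0  # assumed sparse goal reward
        return self._obs(), reward, terminated, truncated, {"applicable": applicable}


drive = NumericAction("drive", {"fuel": 1.0}, {"fuel": -1.0, "x": 1.0})
env = NumericPlanningEnv({"fuel": 2.0, "x": 0.0}, [drive], goal={"x": 2.0})
obs, info = env.reset()
done = False
while not done:
    obs, reward, terminated, truncated, info = env.step(0)
    done = terminated or truncated
    print(obs, reward, info)
```

Exposing numeric fluents as a fixed-order vector and actions as integer indices is what allows an off-the-shelf RL algorithm such as PPO to be trained on the problem.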

Experimental Results

In experiments on standard IPC numeric domains, RAMP significantly outperformed DRL algorithms such as Proximal Policy Optimization (PPO). It both solved more planning problems and produced higher-quality plans. These results suggest that combining online model learning with planning can make automated planning more adaptable and efficient.

Conclusion

RAMP advances the online learning of numeric action models. By combining reinforcement learning with planning, it addresses a key limitation of existing offline algorithms, namely their reliance on expert traces. As research continues, the same approach may prove useful in other areas of artificial intelligence and robotics.

RichlyAI Blog AI Guide, Tutorials, Industrial Insights, & more!
