Safe Decentralized Operation of EV Virtual Power Plant with Limited Network Visibility via Multi-Agent Reinforcement Learning
arXiv:2604.03278v1 (cross-listed)
Abstract
As power systems advance toward net-zero targets, behind-the-meter renewables are driving rapid growth in distributed energy resources (DERs). Virtual power plants (VPPs) increasingly coordinate these resources to support power distribution network (PDN) operation, with EV charging stations (EVCSs) emerging as a key asset due to their strong impact on local voltages.
Challenges in VPP Operations
In practice, VPPs must make operational decisions with only partial visibility of PDN states. They therefore rely on:
- Aggregated information shared by the distribution system operator.
- Real-time assessment of voltage levels and demand-satisfaction constraints.
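The voltage and demand-satisfaction constraints mentioned above can be made concrete with a short sketch. The source does not state the voltage limits or how violations are measured, so the 0.95 to 1.05 per-unit band and the function names voltage_violation and demand_shortfall are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

# Assumed per-unit voltage band; the source does not state the limits used.
V_MIN, V_MAX = 0.95, 1.05


def voltage_violation(v_pu):
    """Sum of how far each bus voltage lies outside [V_MIN, V_MAX], in p.u."""
    v = np.asarray(v_pu, dtype=float)
    under = np.clip(V_MIN - v, 0.0, None)  # positive only where v < V_MIN
    over = np.clip(v - V_MAX, 0.0, None)   # positive only where v > V_MAX
    return float(np.sum(under + over))


def demand_shortfall(required_kwh, delivered_kwh):
    """Total energy still owed to EVs, in kWh; over-delivery earns no credit."""
    gap = np.asarray(required_kwh, dtype=float) - np.asarray(delivered_kwh, dtype=float)
    return float(np.sum(np.clip(gap, 0.0, None)))
```

Both functions return zero when the corresponding constraint is met, which makes them usable as cost signals for a constrained learning method.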
Proposed Framework
This work introduces a safety-enhanced VPP framework that coordinates multiple EVCSs under realistic information constraints, with the aim of maintaining voltage security while keeping operation economical.
Transformer-assisted Lagrangian Multi-Agent Proximal Policy Optimization (TL-MAPPO)
At the core of the framework is the TL-MAPPO algorithm, which lets EVCS agents learn decentralized charging policies through:
- Centralized training with Lagrangian regularization, which helps enforce voltage and demand-satisfaction constraints.
- A transformer-based embedding layer deployed on each EVCS agent that captures temporal correlations among prices, loads, and charging demand.
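The Lagrangian regularization in the first item can be illustrated with a minimal sketch. The source does not give the exact update rule, so this assumes the standard Lagrangian relaxation used in constrained reinforcement learning: each constraint cost is subtracted from the reward, weighted by a multiplier, and the multipliers are adjusted by projected dual ascent. The names penalized_reward, update_multipliers, limits, and lr are hypothetical.

```python
def penalized_reward(reward, costs, multipliers):
    """Lagrangian-shaped reward: r - sum_k lambda_k * c_k."""
    return reward - sum(lam * c for lam, c in zip(multipliers, costs))


def update_multipliers(multipliers, avg_costs, limits, lr=0.05):
    """Projected dual ascent on the multipliers (assumed update rule).

    lambda_k grows when the average cost exceeds its limit, shrinks
    otherwise, and is never allowed to drop below zero.
    """
    return [
        max(0.0, lam + lr * (cost - limit))
        for lam, cost, limit in zip(multipliers, avg_costs, limits)
    ]
```

For example, with costs ordered as (voltage violation, demand shortfall), a policy that keeps violating the voltage limit sees its voltage multiplier grow, so later policy updates weigh voltage security more heavily against cost savings.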
Key Benefits
Compared with representative multi-agent deep reinforcement learning (DRL) baselines, TL-MAPPO is reported to deliver:
- Improved decision quality.
- Voltage violations reduced by approximately 45%.
- Operational costs reduced by approximately 10%.
Experimental Validation
Experiments on a realistic 33-bus PDN demonstrate the effectiveness of the proposed framework. The results support the robustness of TL-MAPPO and indicate its potential for practical deployment in VPP settings.
Conclusion
Decentralized frameworks such as TL-MAPPO are a step toward operating virtual power plants safely when only part of the network state is visible. By addressing limited network visibility while enforcing voltage security, the approach supports more efficient and sustainable energy management.
