Enhance agentic code repair with signal reshaping in GRPO, improving accuracy and performance when reinforcement learning feedback is weak.
Explore how RL-trained empathetic agents withstand adversarial emotional scenarios using the Adversarial Empathy Benchmark and Emotional Consistency Score.