Discover LatentRefusal, a lightweight mechanism that improves refusal accuracy in text-to-SQL systems by detecting unanswerable queries safely and efficien...
Discover a new benchmark assessing outcome-driven constraint violations in autonomous AI agents to improve safety and ethical compliance under KPI pressure...