Auditable Agents: Ensuring Accountability in AI Systems
Large language model (LLM) agents can now call tools, query databases, delegate tasks, and trigger external side effects. As these agent systems begin to operate in the real world, a pressing concern arises: not only how to prevent harmful actions, but also how to ensure that the actions an agent takes remain answerable after deployment.
Defining Key Concepts
In the paper “Auditable Agents,” we make a crucial distinction between three interconnected concepts:
- Accountability: The ability to determine compliance with guidelines and assign responsibility for actions taken.
- Auditability: The intrinsic property of a system that enables accountability. Without auditability, accountability cannot be achieved.
- Auditing: The process of reconstructing the behavior of an agent system using trustworthy evidence.
The Importance of Auditability
Our central claim is that no agent system can be accountable unless it is auditable. The practical question is how to operationalize these concepts in a deployed system.
Dimensions of Agent Auditability
To advance the conversation, we define five critical dimensions of agent auditability:
- Action Recoverability: The ability to trace actions taken by the agent back to their origin.
- Lifecycle Coverage: Ensuring that all phases of the agent’s operation are monitored and recorded.
- Policy Checkability: The capacity to verify that the agent acts in accordance with predetermined policies.
- Responsibility Attribution: The mechanism through which responsibility for actions can be assigned to specific entities.
- Evidence Integrity: Ensuring that the evidence collected is tamper-evident and reliable.
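To make the five dimensions concrete, here is a minimal Python sketch of an audit record whose fields map onto them. It is an illustration under stated assumptions, not the paper's implementation: the `AuditRecord` class, its field names, and the helper functions `append`, `verify_chain`, and `trace_origin` are all hypothetical. Evidence integrity is approximated with a SHA-256 hash chain, which makes tampering detectable rather than impossible.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List, Optional

GENESIS = "0" * 64  # placeholder digest that precedes the first record


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical record layout; one field (or more) per dimension."""
    seq: int                   # position in the log
    actor: str                 # Responsibility Attribution: who acted
    phase: str                 # Lifecycle Coverage: e.g. "request", "tool_call"
    action: str                # Action Recoverability: what was done
    parent_seq: Optional[int]  # Action Recoverability: which record caused it
    policy_decision: str       # Policy Checkability: e.g. "allow" or "deny"
    prev_hash: str             # Evidence Integrity: digest of the previous record

    def digest(self) -> str:
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append(log: List[AuditRecord], actor: str, phase: str, action: str,
           parent_seq: Optional[int], policy_decision: str) -> AuditRecord:
    """Append a record that commits to the digest of its predecessor."""
    prev_hash = log[-1].digest() if log else GENESIS
    record = AuditRecord(len(log), actor, phase, action,
                         parent_seq, policy_decision, prev_hash)
    log.append(record)
    return record


def verify_chain(log: List[AuditRecord], head: Optional[str] = None) -> bool:
    """Check every link; if a trusted head digest is given, check it too."""
    expected = GENESIS
    for i, record in enumerate(log):
        if record.seq != i or record.prev_hash != expected:
            return False
        expected = record.digest()
    return head is None or expected == head


def trace_origin(log: List[AuditRecord], seq: int) -> List[AuditRecord]:
    """Walk parent links from a record back to the originating request."""
    path = [log[seq]]
    while path[-1].parent_seq is not None:
        path.append(log[path[-1].parent_seq])
    return path[::-1]
```

In this sketch, editing an earlier record breaks the link stored in its successor, so `verify_chain` returns `False`. Changes to the final record are caught only if the head digest is kept somewhere the agent cannot rewrite, which is why `verify_chain` accepts an optional `head` argument.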
Mechanisms for Achieving Auditability
We categorize mechanisms that facilitate agent auditability into three classes:
- Detect: Mechanisms that identify actions and potential issues as they occur.
- Enforce: Systems that ensure compliance with policies and guidelines.
- Recover: Approaches that allow for the recovery of information and accountability after actions have been taken.
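The three mechanism classes can be sketched as three small functions. This is a simplified illustration, not the taxonomy's reference implementation: the `Event` type, the `POLICY` table, and the artifact format passed to `recover` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Event:
    actor: str
    tool: str
    target: str


# Hypothetical policy: the set of tools each actor may call.
POLICY: Dict[str, Set[str]] = {"agent:planner": {"mail.search", "mail.read"}}


def is_allowed(event: Event) -> bool:
    return event.tool in POLICY.get(event.actor, set())


def detect(events: List[Event]) -> List[Event]:
    """Detect: flag observed events that violate the policy."""
    return [e for e in events if not is_allowed(e)]


def enforce(event: Event, execute: Callable[[Event], object]) -> object:
    """Enforce: check the policy before the side effect is allowed to happen."""
    if not is_allowed(event):
        raise PermissionError(f"{event.actor} may not call {event.tool}")
    return execute(event)


def recover(artifacts: List[dict]) -> List[Event]:
    """Recover: rebuild a partial event list from residual artifacts
    (for example file metadata) when no conventional log exists.
    Missing attribution is marked as unknown rather than guessed."""
    return [
        Event(actor=a.get("author", "unknown"), tool=a["kind"], target=a["path"])
        for a in artifacts
    ]
```

The sketch shows where each class sits relative to the action: `enforce` runs before execution, `detect` examines events as they are observed, and `recover` works only from what is left behind afterwards, so its output may be incomplete.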
Empirical Evidence and Findings
Our position is supported by a layered evidence framework rather than a single benchmark. Key findings include:
- Lower-bound ecosystem measurements indicate that basic security prerequisites for auditability are frequently unmet, with 617 security findings across six notable open-source projects.
- Runtime feasibility studies show that pre-execution mediation with tamper-evident records adds a median overhead of only 8.3 milliseconds.
- Controlled recovery experiments demonstrate that responsibility-relevant information can be partially recovered even in the absence of conventional logs.
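As a rough illustration of what pre-execution mediation involves and how its overhead could be measured, the sketch below writes a hash-chained record before each tool call and times only the recording step. The `Mediator` class and its methods are hypothetical, and the numbers it produces depend on the machine and workload; it does not reproduce the measurement setup behind the figures above.

```python
import hashlib
import json
import statistics
import time


class Mediator:
    """Hypothetical mediator: records each tool call before letting it run."""

    def __init__(self):
        self.head = "0" * 64    # digest of the most recent record
        self.records = []       # hash-chained entries
        self.overheads_ms = []  # time spent recording, per call

    def call(self, tool_name, tool, **kwargs):
        start = time.perf_counter()
        # The entry commits to the previous digest, so later edits are detectable.
        entry = {"tool": tool_name, "args": kwargs, "prev": self.head}
        encoded = json.dumps(entry, sort_keys=True, default=str).encode("utf-8")
        self.head = hashlib.sha256(encoded).hexdigest()
        self.records.append(entry)
        self.overheads_ms.append((time.perf_counter() - start) * 1000.0)
        # Only after the record exists does the tool execute.
        return tool(**kwargs)

    def median_overhead_ms(self):
        return statistics.median(self.overheads_ms)
```

A caller would route every tool invocation through `Mediator.call` and read `median_overhead_ms()` after a run. A real deployment would also persist the records and run a policy check at the same point, both of which add cost that this sketch does not capture.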
Future Directions
We propose an Auditability Card for agent systems and identify six open research problems, organized by mechanism class. Addressing these problems will be vital to improving the auditability of agent systems and ensuring responsible AI deployment.
