ShieldNet: Network-Level Guardrails against Emerging Supply-Chain Injections in Agentic Systems
arXiv:2604.04426v1
Abstract: Existing research on LLM agent security mainly focuses on prompt injection and unsafe input/output behaviors. However, as agents increasingly rely on third-party tools and MCP servers, a new class of supply-chain threats has emerged, where malicious behaviors are embedded in seemingly benign tools, silently hijacking agent execution, leaking sensitive data, or triggering unauthorized actions.
Despite their growing impact, there is currently no comprehensive benchmark for evaluating such threats. To bridge this gap, we introduce SC-Inject-Bench, a large-scale benchmark comprising over 10,000 malicious MCP tools grounded in a taxonomy of 25+ attack types derived from MITRE ATT&CK targeting supply-chain threats.
Introduction
Agents built on Large Language Models (LLMs) increasingly rely on third-party tools and Model Context Protocol (MCP) servers to act on a user's behalf. That reliance introduces a new class of vulnerabilities in agentic systems: a malicious actor can embed harmful behavior in a tool that looks legitimate, and the agent will run it as part of its normal workflow.
The Challenge
Existing security measures primarily focus on prompt injection and direct unsafe behaviors, but they often overlook the complexities of supply-chain threats. As a result, agents may inadvertently execute harmful commands or leak sensitive information through compromised tools.
Introducing SC-Inject-Bench
To address this critical gap, we propose SC-Inject-Bench, a comprehensive benchmark designed to evaluate the security of agentic systems against supply-chain injections. Key features include:
- Over 10,000 malicious MCP tools encompassing various attack vectors.
- A taxonomy of 25+ attack types derived from MITRE ATT&CK, offering a granular view of the supply-chain threat landscape.
- Insights into the effectiveness of existing MCP scanners and semantic guardrails, highlighting their limitations in detecting supply-chain threats.
Proposed Solution: ShieldNet
Motivated by the findings from SC-Inject-Bench, we introduce ShieldNet, a novel network-level guardrail framework aimed at enhancing the security of agentic systems. Key components of ShieldNet include:
- Man-in-the-Middle (MITM) Proxy: This component captures real-time network interactions, allowing for the monitoring of data flows and potential anomalies.
- Event Extractor: This component identifies critical network behaviors in the captured traffic, providing insight into the actions agents take when interacting with MCP servers.
- Lightweight Classifier: A lightweight model processes the extracted events to detect malicious activity.
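The three components above can be pictured as a small pipeline: captured requests are reduced to events, and events are scored. The following is a minimal sketch under stated assumptions, not ShieldNet's implementation. The event fields, the `ALLOWED_HOSTS` allowlist, the integer weights, and the threshold are all hypothetical, and a hand-written scoring rule stands in for the paper's trained classifier, whose features and model are not described here.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class NetworkEvent:
    """Coarse features of one captured request (hypothetical schema)."""
    host: str
    method: str
    body_bytes: int


def extract_event(method: str, url: str, body: bytes) -> NetworkEvent:
    """Reduce a request seen by the proxy to a few features."""
    host = urlparse(url).hostname or ""
    return NetworkEvent(host=host, method=method.upper(), body_bytes=len(body))


# Hypothetical allowlist of hosts the agent's tools are expected to contact.
ALLOWED_HOSTS = {"api.example.com"}


def score_event(event: NetworkEvent) -> int:
    """Stand-in for the classifier: a hand-written rule with integer weights."""
    score = 0
    if event.host not in ALLOWED_HOSTS:
        score += 3  # traffic to a host the tool was not expected to reach
    if event.method in {"POST", "PUT"} and event.body_bytes > 1024:
        score += 2  # large outbound payload, a possible exfiltration signal
    return score


def is_suspicious(events: list[NetworkEvent], threshold: int = 5) -> bool:
    """Flag a tool call if any of its network events reaches the threshold."""
    return any(score_event(e) >= threshold for e in events)


benign = [extract_event("GET", "https://api.example.com/v1/weather", b"")]
exfil = [extract_event("post", "https://collector.attacker.example/upload", b"x" * 4096)]

print(is_suspicious(benign))
print(is_suspicious(exfil))
```

The point of the sketch is the separation of concerns: the decision is made on what a tool does on the network, not on what its description or output says, which is why this kind of check can catch behavior that semantic guardrails miss.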
Performance Evaluation
In the reported experiments, ShieldNet achieves an F1 score of up to 0.995 with a false positive rate of only 0.8%. The framework also introduces minimal runtime overhead, making it a viable option for real-world deployments.
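To make the two metrics concrete, the sketch below computes F1 and false positive rate from confusion-matrix counts. The counts are invented for illustration and chosen only so that the arithmetic lands on values of the same magnitude as those reported; they are not taken from the paper's experiments.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (FP + TN): the share of benign samples flagged as malicious."""
    denom = fp + tn
    return fp / denom if denom else 0.0


# Hypothetical counts: 997 malicious and 1000 benign samples.
tp, fn = 995, 2   # malicious samples caught / missed
fp, tn = 8, 992   # benign samples flagged / passed

print(f"F1  = {f1_score(tp, fp, fn):.3f}")
print(f"FPR = {false_positive_rate(fp, tn):.1%}")
```

With these made-up counts, F1 works out to 1990/2000 and the false positive rate to 8/1000, which shows how a detector can miss very few attacks while still flagging a small fraction of benign tool calls.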
Conclusion
As agentic systems depend on more third-party tools, supply-chain injections become harder to catch with prompt-level defenses alone. ShieldNet addresses this gap by moving detection to the network level, where a compromised tool's behavior is observable regardless of how benign the tool appears.
