Why AI Security Matters
Understanding the new attack surfaces in AI agents and RAG systems.
Traditional security tools focus on inspecting content: they scan prompts for injection patterns and check outputs for policy violations. This remains necessary, but the rise of autonomous AI agents and the Model Context Protocol (MCP) introduces a fundamentally new attack surface at the infrastructure layer.
Infrastructure Attacks on MCP
Recent research has identified critical vulnerabilities in the Model Context Protocol where malicious servers can exploit the trust boundary between tool discovery and execution. Because the threat originates at the infrastructure layer rather than within the text prompt, traditional content inspection cannot detect a server that misrepresents its capabilities.
Key infrastructure threats include:
- Shadowing: A server exposes undeclared tools or hides dangerous functionality in implementations that do not match their advertised descriptions. For example, a search_documents tool might secretly execute arbitrary shell commands.
- Rug Pulls: A server presents a benign tool definition during discovery but executes malicious code when the tool is actually called. The application trusts the description it saw earlier, completely unaware the implementation has changed.
- Confused Deputy: The agent is tricked into calling tools it should not have access to, or tools are invoked with parameters the agent never intended to use.
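One way to narrow the gap between discovery and execution is to pin each approved tool definition and re-check it before every call. The following is a minimal sketch, not Deconvolute's implementation: `fingerprint` and `ToolPolicy` are hypothetical names, and the tool dictionaries are assumed to carry `name`, `description`, and `inputSchema` fields. It catches tools that were never approved and definitions that changed after approval; it cannot, on its own, detect an implementation that silently diverges from an unchanged description.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """Hash the parts of a tool definition the agent relies on.

    Assumes each tool is a dict with "name", "description", and
    "inputSchema" keys (hypothetical shape for this sketch).
    """
    canonical = json.dumps(
        {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "inputSchema": tool.get("inputSchema", {}),
        },
        sort_keys=True,  # stable output regardless of key order
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ToolPolicy:
    """Allowlist of approved tools, pinned by definition hash."""

    def __init__(self, approved_tools: list[dict]):
        self._pins = {t["name"]: fingerprint(t) for t in approved_tools}

    def check(self, current_tools: list[dict], tool_name: str) -> None:
        """Raise PermissionError unless tool_name is approved and unchanged."""
        if tool_name not in self._pins:
            # Undeclared or unapproved tool.
            raise PermissionError(f"tool not approved: {tool_name}")
        current = {t["name"]: t for t in current_tools}
        if tool_name not in current:
            raise PermissionError(f"tool no longer advertised: {tool_name}")
        if fingerprint(current[tool_name]) != self._pins[tool_name]:
            # Definition differs from the one seen at approval time.
            raise PermissionError(f"tool definition changed: {tool_name}")
```

In use, the application would build a `ToolPolicy` from the definitions it reviewed, then call `check` with the server's current tool list immediately before each invocation and refuse the call if it raises.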
Content and Data Attacks
Beyond the infrastructure, the data flowing through your agent workflows also presents significant risks. RAG pipelines and direct user inputs are susceptible to:
- Prompt Injection: Adversarial text designed to hijack the model's instructions.
- Knowledge Corruption: Poisoned documents retrieved during a RAG cycle that silently alter the model's factual output or behavior.
- Instructional Overrides: Inputs that trick the model into discarding its system prompt and safety guidelines.
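To make the content-side checks concrete, here is a deliberately simple sketch of screening retrieved chunks before they reach the model. The patterns and the `split_retrieved` helper are illustrative assumptions, not a complete detector and not Deconvolute's API. Pattern matching of this kind only catches known phrasings, so real deployments typically combine it with other signals.

```python
import re

# Illustrative phrasings only; attackers can easily rephrase around these.
OVERRIDE_PATTERNS = [
    re.compile(
        r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions",
        re.IGNORECASE,
    ),
    re.compile(r"disregard\s+(the\s+)?system\s+prompt", re.IGNORECASE),
]


def split_retrieved(chunks: list[str]) -> tuple[list[str], list[str]]:
    """Separate retrieved chunks into (clean, quarantined).

    A chunk is quarantined if any override pattern matches it.
    """
    clean, quarantined = [], []
    for chunk in chunks:
        if any(p.search(chunk) for p in OVERRIDE_PATTERNS):
            quarantined.append(chunk)
        else:
            clean.append(chunk)
    return clean, quarantined
```

Only the `clean` list would be passed into the prompt; quarantined chunks can be logged for review rather than silently dropped.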
Deconvolute addresses this entire spectrum by combining infrastructure-level firewalls with content validation.
Further Reading
For a comprehensive technical analysis of these threat models, including empirical benchmarks and detailed attack vectors, read the in-depth survey blog post about attacks on MCP and RAG systems.