Content Scanners

Adversarial content flowing through your AI pipelines poses a direct risk to model safety and reliability. Poisoned Retrieval-Augmented Generation (RAG) documents, prompt injection attacks in user input, and policy violations in LLM outputs manipulate the model's behavior through the data it processes.

Deconvolute provides a suite of content scanners to validate untrusted text before it enters your system and to actively monitor the outputs generated by your LLMs.

Architecture Principles

Deconvolute's scanning engine is built on two core philosophies to ensure reliable and interpretable security.

Deterministic Detection

Each content scanner is a deterministic check that analyzes text for a specific class of failure or attack pattern. Scanners do not modify model behavior; they observe and report. Because they test concrete hypotheses (e.g. "Did the output match the expected language?" or "Is the security token present?"), the results are strictly interpretable and actionable.

Defense in Depth Through Composition

No single scanner covers all failure modes. The SDK is designed to be layered so that each component monitors a different attack surface. A failure in one scanner does not invalidate the others, increasing overall system robustness. You can layer these scanners together for comprehensive coverage.

Usage Patterns

Deconvolute provides two primary ways to interact with the scanning engine:

High-Level APIs: The recommended approach for most applications. Use scan() for validating static text (like RAG documents) and llm_guard() for wrapping LLM clients.
Direct Scanner Usage: Instantiate specific scanners directly for advanced, custom security workflows and fine-grained control over the execution lifecycle.

Built-In Scanners

The SDK includes several specialized scanners out of the box:

SignatureScanner: Detects known adversarial patterns, prompt injection.
LanguageScanner: Enforces output language policies and detects payload splitting.
CanaryScanner: Detects instruction overrides and jailbreaks by verifying system prompt adherence.

Architecture Principles

Deterministic Detection

Defense in Depth Through Composition

Usage Patterns

Built-In Scanners

On this page