Regex Scanner
The Regex Scanner analyzes content against regular expression patterns. Customizable pattern rules let it filter text and, optionally, sanitize it by redacting matches.

Figure: Regex pattern matching and content sanitization workflow
How It Works
The scanner evaluates input text against predefined regular expression patterns, enabling identification of specific content formats, keywords, or phrases. It supports both validation and redaction capabilities based on pattern matches.
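As a rough illustration of the evaluation step, the sketch below checks a text against a list of patterns using Python's standard `re` module and reports which patterns matched. The function name `find_matches` and the sample patterns are hypothetical and are not part of the scanner's API.

```python
import re

def find_matches(text, patterns):
    """Return (pattern, matched_text) pairs for each pattern found in text."""
    hits = []
    for pattern in patterns:
        match = re.search(pattern, text)  # first occurrence anywhere in text
        if match:
            hits.append((pattern, match.group(0)))
    return hits

# Hypothetical sample input: one pattern matches, the other does not.
hits = find_matches("order id: 12345, status: shipped", [r"\d{5}", r"cancelled"])
for pattern, matched_text in hits:
    print(f"pattern {pattern!r} matched {matched_text!r}")
```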
Key Features
- Custom regex pattern support
- Flexible matching modes
- Content redaction capabilities
- Multiple pattern processing
- Configurable validation rules
Configuration Options
- patterns: List of regex patterns to match
- is_blocked: Pattern interpretation mode
  - True: Patterns indicate forbidden content
  - False: Patterns indicate allowed content
- match_type: Pattern matching mode
  - SEARCH: Find pattern anywhere in text
  - FULL_MATCH: Pattern must match entire text
- redact: Enable/disable content redaction
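The interaction between is_blocked and match_type can be sketched as follows. This is a minimal illustration, not the scanner's implementation: the `MatchType` enum and `is_valid` function are hypothetical stand-ins. It assumes that with is_blocked=True the text is valid only when no pattern matches, and with is_blocked=False it is valid only when at least one pattern matches.

```python
import re
from enum import Enum

class MatchType(Enum):
    # Hypothetical enum mirroring the two documented modes.
    SEARCH = "search"
    FULL_MATCH = "full_match"

def is_valid(text, patterns, is_blocked=True, match_type=MatchType.SEARCH):
    """Decide validity from pattern matches (assumed semantics, see lead-in)."""
    # SEARCH looks anywhere in the text; FULL_MATCH requires the whole text to match.
    matcher = re.search if match_type is MatchType.SEARCH else re.fullmatch
    matched = any(matcher(p, text) for p in patterns)
    # Blocked patterns: a match makes the text invalid.
    # Allowed patterns: a match is required for the text to be valid.
    return not matched if is_blocked else matched

print(is_valid("my password is hunter2", [r"password"]))          # forbidden word found
print(is_valid("12345", [r"\d+"], is_blocked=False,
               match_type=MatchType.FULL_MATCH))                  # allow-list, whole text
```

With FULL_MATCH, a forbidden pattern such as `password` would not flag the first sample, because the pattern does not cover the entire text.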
Common Pattern Examples
- API Keys:
r"Bearer [A-Za-z0-9-._~+/]+"
- Email Addresses:
r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
- URLs:
r"https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+"
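To see what these patterns capture, the sketch below runs each one against a sample string with Python's `re` module. The email and URL patterns here use the `{2,}` and `{2}` quantifiers, which is an assumption about their intended form. The `PATTERNS` dictionary and the `first_match` helper are hypothetical names used only for illustration.

```python
import re

# Patterns from the list above (quantifiers on email/URL are assumed).
PATTERNS = {
    "api_key": r"Bearer [A-Za-z0-9-._~+/]+",
    "email": r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}",
    "url": r"https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+",
}

def first_match(name, text):
    """Return the first substring matched by the named pattern, or None."""
    match = re.search(PATTERNS[name], text)
    return match.group(0) if match else None

print(first_match("api_key", "Authorization: Bearer abc123.def-456"))
print(first_match("email", "Contact alice@example.com now"))
print(first_match("url", "See https://example.com/path for details"))
```

Note that the URL pattern's character class does not include `/`, so the match for the last sample ends at the host name and does not cover the path.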
Output Format
The scanner returns a tuple containing:
- sanitized_prompt: Text with optional redactions applied
- is_valid: Boolean based on pattern matching results
- risk_score: Pattern match confidence score (0-1)
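A simplified sketch of how such a tuple could be produced is shown below. It is not the scanner's actual code: the `scan` function, the `[REDACTED]` placeholder, and the choice of 1.0 or 0.0 for risk_score are all assumptions made for illustration, and only SEARCH-style matching is covered for brevity.

```python
import re

def scan(prompt, patterns, is_blocked=True, redact=True):
    """Return (sanitized_prompt, is_valid, risk_score) under assumed semantics."""
    compiled = [re.compile(p) for p in patterns]
    matched = [c for c in compiled if c.search(prompt)]

    # Blocked patterns invalidate on a match; allowed patterns require one.
    is_valid = (not matched) if is_blocked else bool(matched)

    sanitized = prompt
    if redact and is_blocked:
        # Replace every occurrence of each matched forbidden pattern.
        for c in matched:
            sanitized = c.sub("[REDACTED]", sanitized)

    # Assumed scoring: 1.0 when the text is invalid, 0.0 otherwise.
    risk_score = 0.0 if is_valid else 1.0
    return sanitized, is_valid, risk_score

sanitized_prompt, is_valid, risk_score = scan(
    "token: Bearer abc.def", [r"Bearer [A-Za-z0-9-._~+/]+"]
)
print(sanitized_prompt, is_valid, risk_score)
```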
Note: Regular expressions can be computationally intensive. Consider pattern optimization and caching strategies for high-volume applications.
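One possible caching approach, sketched here with Python's standard library, is to compile each pattern once and reuse the compiled object across calls. The helper names `compile_pattern` and `any_match` are hypothetical. Python's `re` module also keeps its own internal cache of recently compiled patterns, so an explicit cache mainly helps when many distinct patterns are in play.

```python
import re
from functools import lru_cache

@lru_cache(maxsize=256)
def compile_pattern(pattern):
    """Compile a pattern once; repeated calls return the cached object."""
    return re.compile(pattern)

def any_match(text, patterns):
    """Return True if any pattern is found anywhere in text."""
    return any(compile_pattern(p).search(text) for p in patterns)

patterns = (r"\d{5}", r"Bearer [A-Za-z0-9-._~+/]+")
for prompt in ["id 12345", "no secrets here", "Bearer abc"]:
    print(prompt, any_match(prompt, patterns))

# Cache statistics show how often a compiled pattern was reused.
print(compile_pattern.cache_info())
```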
Tip: Test your regex patterns thoroughly with diverse input samples. Consider using pattern libraries for common use cases like email or URL validation.
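A small table-driven check is one way to exercise a pattern against varied inputs. The sketch below tests an email pattern (written with an assumed `{2,}` quantifier on the top-level domain) against hypothetical positive and negative samples. `CASES` and `run_cases` are illustrative names, not part of the scanner.

```python
import re

EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# (input, should the whole input be accepted as an email address?)
CASES = [
    ("alice@example.com", True),
    ("bob.smith+tag@mail.example.org", True),
    ("no-at-sign.example.com", False),
    ("trailing@dot.", False),
    ("", False),
]

def run_cases(pattern, cases):
    """Return the inputs whose full-match result differs from the expectation."""
    return [
        text for text, expected in cases
        if bool(pattern.fullmatch(text)) != expected
    ]

failures = run_cases(EMAIL, CASES)
print("failures:", failures)
```

An empty list of failures means every sample behaved as expected. Extending `CASES` with inputs drawn from real traffic gives a more representative check.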