Token Limit Scanner
The Token Limit Scanner protects against excessive token usage in LLM prompts. It helps prevent resource exhaustion and potential denial-of-service attacks by enforcing a configurable token count limit.

[Figure: Token Limit Scanner workflow using the tiktoken library]
How It Works
Using the tiktoken library, the scanner accurately calculates token counts for input text. It enforces configurable limits to protect your LLM applications from excessive resource consumption.
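Conceptually, the check boils down to encoding the prompt and comparing the resulting token count against the limit. The following is a minimal sketch using tiktoken directly; the example prompt and variable names are illustrative, not part of the scanner's API:

```python
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4 / GPT-3.5-turbo
prompt = "Summarize the attached incident report in three bullet points."

token_count = len(encoding.encode(prompt))
print(f"Prompt uses {token_count} tokens")

limit = 4096  # configured maximum
if token_count > limit:
    print("Prompt exceeds the configured token limit")
```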
Key Features
- Precise token counting using tiktoken
- Configurable token limits
- Support for multiple encoding schemes
- Real-time token analysis
- Compatible with major LLMs
Configuration Options
- Token Limit: Maximum allowed tokens (default: 4096)
- Encoding Name: Supported encodings (see the encoding-selection sketch after this list):
  - cl100k_base (for GPT-4 and GPT-3.5-turbo)
  - p50k_base (for Codex models and text-davinci-002/003)
  - r50k_base (for earlier GPT-3 models such as davinci)
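For instance, tiktoken can resolve the appropriate encoding from a model name, so the encoding does not have to be hard-coded; the model names below are only examples:

```python
import tiktoken

# Resolve an encoding explicitly by name...
enc = tiktoken.get_encoding("cl100k_base")

# ...or let tiktoken pick the encoding registered for a given model
enc_gpt4 = tiktoken.encoding_for_model("gpt-4")               # cl100k_base
enc_codex = tiktoken.encoding_for_model("code-davinci-002")   # p50k_base

print(enc_gpt4.name, enc_codex.name)
```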
Output Format
The scanner returns a tuple containing (see the sketch after this list):
- sanitized_prompt: Original text if within limits
- is_valid: Boolean indicating if token count is within limit
- risk_score: Ratio of actual tokens to limit (0-1)
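A minimal sketch of how such a tuple can be produced, assuming the counting logic described above; the function name and signature are illustrative, not the scanner's actual API:

```python
import tiktoken

def scan(prompt: str, limit: int = 4096, encoding_name: str = "cl100k_base") -> tuple[str, bool, float]:
    """Return (sanitized_prompt, is_valid, risk_score) for a prompt."""
    encoding = tiktoken.get_encoding(encoding_name)
    token_count = len(encoding.encode(prompt))
    is_valid = token_count <= limit
    # Risk score is the ratio of actual tokens to the limit, capped at 1.0
    risk_score = min(token_count / limit, 1.0)
    # The original text is returned here; handling of over-limit prompts
    # depends on the concrete scanner implementation
    return prompt, is_valid, risk_score

sanitized_prompt, is_valid, risk_score = scan("Hello, world!", limit=100)
print(is_valid, risk_score)
```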
Token Counting Guide
- ~4 characters per token (English text)
- ~75 words ≈ 100 tokens
- Special characters may use multiple tokens
- Non-English text may have different ratios
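These ratios are only rules of thumb; as a sanity check, a character-based estimate can be compared against an exact tiktoken count, as in this sketch:

```python
import tiktoken

text = "Token limits protect LLM applications from oversized prompts."

estimate = len(text) / 4  # rule of thumb: ~4 characters per token for English text
actual = len(tiktoken.get_encoding("cl100k_base").encode(text))

print(f"estimated ~{estimate:.0f} tokens, actual {actual} tokens")
```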
Note: Token counting varies by model and encoding scheme. Always use the appropriate encoding for your target LLM to ensure accurate token counting.
Tip: Consider implementing a buffer below the maximum token limit to account for additional tokens that may be added by the model during processing.
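For example, a prompt budget can be derived by subtracting the expected completion size and a safety margin from the model's context window; the numbers below are purely illustrative:

```python
context_window = 8192        # assumed context window of the target model
expected_completion = 1024   # headroom reserved for the model's response
safety_margin = 256          # extra buffer for system/formatting tokens

prompt_limit = context_window - expected_completion - safety_margin
print(prompt_limit)  # 6912 tokens available for the prompt
```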