Token Limit Scanner

The Token Limit Scanner provides robust protection against excessive token usage in LLM prompts. It helps prevent resource exhaustion and potential denial of service attacks by enforcing predetermined token count limits.

Token Limit Scanner Architecture

[Diagram: Token Limit Scanner workflow using the tiktoken library]

How It Works

Using the tiktoken library, the scanner accurately calculates token counts for input text. It enforces configurable limits to protect your LLM applications from excessive resource consumption.
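
As a rough sketch (not the scanner's actual API), counting tokens with tiktoken and checking them against a limit looks like the following; the function name and defaults are illustrative:

```python
import tiktoken

def within_token_limit(text: str, limit: int = 4096, encoding_name: str = "cl100k_base") -> bool:
    # Load the requested tiktoken encoding and count the tokens in the text.
    encoding = tiktoken.get_encoding(encoding_name)
    token_count = len(encoding.encode(text))
    return token_count <= limit
```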

Key Features

  • Precise token counting using tiktoken
  • Configurable token limits
  • Support for multiple encoding schemes
  • Real-time token analysis
  • Compatible with major LLM models

Configuration Options

  • Token Limit: Maximum allowed tokens (default: 4096)
  • Encoding Name: tiktoken encoding to use (see the configuration sketch after this list). Supported encodings:
    • cl100k_base (for GPT-4, GPT-3.5-turbo)
    • p50k_base (for Codex and GPT-3 text-davinci-002/003)
    • r50k_base (for earlier GPT-3 models such as davinci)
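
To make the options above concrete, here is a minimal sketch of a configurable scanner. TokenLimitScanner is a hypothetical class name used only for illustration; its limit and encoding_name arguments mirror the options listed above:

```python
import tiktoken

class TokenLimitScanner:  # hypothetical class name, for illustration only
    def __init__(self, limit: int = 4096, encoding_name: str = "cl100k_base"):
        self.limit = limit
        self.encoding = tiktoken.get_encoding(encoding_name)

    def scan(self, prompt: str):
        # Count tokens and compare the count against the configured limit.
        token_count = len(self.encoding.encode(prompt))
        is_valid = token_count <= self.limit
        risk_score = min(token_count / self.limit, 1.0)
        return prompt, is_valid, risk_score
```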

Output Format

The scanner returns a tuple containing (see the usage sketch after this list):

  • sanitized_prompt: Original text if within limits
  • is_valid: Boolean indicating if token count is within limit
  • risk_score: Ratio of actual tokens to limit (0-1)
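
Continuing the hypothetical TokenLimitScanner sketch above, a call can unpack the tuple like this:

```python
scanner = TokenLimitScanner(limit=4096, encoding_name="cl100k_base")
sanitized_prompt, is_valid, risk_score = scanner.scan("Summarize the attached report in three bullet points.")

if not is_valid:
    # Reject (or truncate) the prompt before it reaches the model.
    raise ValueError(f"Prompt exceeds the token limit (risk score: {risk_score:.2f})")
```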

Token Counting Guide

  • ~4 characters per token (English text)
  • ~75 words ≈ 100 tokens
  • Special characters may use multiple tokens
  • Non-English text may have different ratios (see the measurement snippet after this list)
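
These ratios are rough rules of thumb; you can measure the actual ratio for your own text with tiktoken, as in this snippet (the sample strings are arbitrary):

```python
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")
for sample in ["The quick brown fox jumps over the lazy dog.", "naïve café ☃ 你好"]:
    tokens = encoding.encode(sample)
    print(f"{len(sample):>3} chars -> {len(tokens):>2} tokens "
          f"({len(sample) / len(tokens):.1f} chars/token)")
```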

Note: Token counting varies by model and encoding scheme. Always use the appropriate encoding for your target LLM to ensure accurate token counting.
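
If you know the target model's name, tiktoken can pick the matching encoding for you:

```python
import tiktoken

# encoding_for_model maps a model name to its encoding (e.g. gpt-4 uses cl100k_base).
encoding = tiktoken.encoding_for_model("gpt-4")
print(encoding.name)
```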

Tip: Consider implementing a buffer below the maximum token limit to account for additional tokens that may be added by the model during processing.
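
One way to apply such a buffer is to scan against a reduced effective limit; the 10% reserve below is an arbitrary example, not a recommendation from the scanner itself:

```python
MODEL_LIMIT = 4096
BUFFER = int(MODEL_LIMIT * 0.10)   # reserve ~10% of the window for tokens added during processing

effective_limit = MODEL_LIMIT - BUFFER  # use this value as the scanner's token limit
```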