Code Scanner Agent

The Code Scanner Agent examines text for embedded code fragments and potential security risks. It identifies programming language syntax, evaluates security vulnerabilities, and helps prevent code injection attacks in applications that process user-generated content.

Figure: Code Scanner component (agent interface and configuration)

Security Notice: The Code Scanner helps identify potentially malicious code, but it should be one part of a multi-layered security approach. Combine it with other security measures for comprehensive protection against code-based attacks.

Component Inputs

  • Input Text: The text content to be analyzed for code fragments

    Example: "Try running this script: System.exec('rm -rf /');"

  • Languages: Programming languages to detect in the content

    Example: "javascript,python,sql,php,bash"

  • Threshold: Sensitivity level for code detection

    Range: 0.0 to 1.0 (default: 0.5)

    Lower values catch more potential code but may increase false positives

  • Is Blocked: Whether content containing code should be blocked

    Options: true (block content that contains code) or false (allow the content but flag the code)
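
The inputs above can be checked before they are handed to the scanner. What follows is a minimal sketch, not part of the Code Scanner API: the helper name `normalizeScannerInputs` is hypothetical, and the default of false for Is Blocked is an assumption (only the threshold default of 0.5 comes from this page).

```javascript
// Hypothetical helper (not part of the Code Scanner API): validates and
// normalizes the four component inputs described above.
// Assumption: isBlocked defaults to false; the page documents no default for it.
function normalizeScannerInputs({ inputText, languages, threshold = 0.5, isBlocked = false }) {
  if (typeof inputText !== "string") {
    throw new TypeError("inputText must be a string");
  }

  // Accept either "javascript,python" or ["javascript", "python"].
  const rawList = Array.isArray(languages) ? languages : String(languages ?? "").split(",");
  const languageList = rawList
    .map((name) => String(name).trim().toLowerCase())
    .filter((name) => name.length > 0);
  if (languageList.length === 0) {
    throw new RangeError("at least one language is required");
  }

  // The documented range is 0.0 to 1.0.
  if (typeof threshold !== "number" || Number.isNaN(threshold) || threshold < 0 || threshold > 1) {
    throw new RangeError("threshold must be a number between 0.0 and 1.0");
  }
  if (typeof isBlocked !== "boolean") {
    throw new TypeError("isBlocked must be true or false");
  }

  return { inputText, languages: languageList, threshold, isBlocked };
}

const normalized = normalizeScannerInputs({
  inputText: "Try running this script: System.exec('rm -rf /');",
  languages: "javascript, Python,sql",
});
console.log(normalized.languages, normalized.threshold, normalized.isBlocked);
```

Rejecting out-of-range thresholds early keeps a typo such as 5 instead of 0.5 from silently disabling detection.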

Component Outputs

  • Safety Status: Overall assessment of code detection results

    Values: Safe (no code detected), Warning (benign code), Unsafe (potentially harmful code)

  • Risk Score: Numerical evaluation of code-based security risk

    Scale: 0.0 (no risk) to 1.0 (highest risk)

  • Detected Languages: List of programming languages identified in the content

    Example: ["javascript", "python"]
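
An application typically turns these outputs into a single decision. The sketch below shows one way to do that; `decideAction` is a hypothetical helper, the sample result values are illustrative, and it assumes that when Is Blocked is true any detected code (Warning or Unsafe) is blocked.

```javascript
// Hypothetical helper: maps a scan result to an application-level action.
// Assumption: with isBlocked = true, both "Warning" and "Unsafe" are blocked;
// with isBlocked = false, they are allowed through but flagged for review.
function decideAction(result, isBlocked) {
  if (result.safetyStatus === "Safe") {
    return "allow";
  }
  return isBlocked ? "block" : "flag";
}

// Sample result shaped like the outputs listed above (values are illustrative).
const sampleResult = {
  safetyStatus: "Warning",
  riskScore: 0.2,
  detectedLanguages: ["javascript", "python"],
};

console.log(decideAction(sampleResult, true));  // block
console.log(decideAction(sampleResult, false)); // flag
```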

Detection Categories

Language Detection

  • JavaScript/TypeScript
  • Python
  • PHP
  • SQL
  • Bash/Shell
  • HTML/CSS
  • Java/C#/C++

Risk Assessment

  • Code Injection Attempts
  • Command Execution
  • Data Access Operations
  • Network Operations
  • System Function Calls
  • Obfuscated Code Patterns
  • Malicious Payloads
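
To make a few of these categories concrete, the sketch below pairs each with one example indicator. It is illustrative only: the patterns, weights, and the "highest matching weight" scoring rule are assumptions chosen for demonstration, not the agent's actual rules.

```javascript
// Illustrative only: example indicators for some of the risk categories above.
// Patterns and weights are assumptions, not the agent's real rule set.
const RISK_RULES = [
  { category: "Command Execution", weight: 0.9, pattern: /\b(?:exec|system|popen|spawn)\s*\(/i },
  { category: "Code Injection Attempts", weight: 0.8, pattern: /\beval\s*\(|<script\b/i },
  { category: "Network Operations", weight: 0.5, pattern: /\b(?:curl|wget)\s+\S+|\bfetch\s*\(/i },
  { category: "Obfuscated Code Patterns", weight: 0.6, pattern: /\b(?:atob|fromCharCode)\s*\(/i },
];

// Returns the matched categories and the highest weight among them.
function assessRisk(text) {
  const matches = RISK_RULES.filter((rule) => rule.pattern.test(text));
  const riskScore = matches.reduce((max, rule) => Math.max(max, rule.weight), 0);
  return { riskScore, categories: matches.map((rule) => rule.category) };
}

console.log(assessRisk("for i in range(10): exec('rm -rf /')"));
// { riskScore: 0.9, categories: [ 'Command Execution' ] }
```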

How It Works

The Code Scanner Agent employs multiple detection techniques including syntax analysis, pattern matching, and language fingerprinting to identify code fragments. It evaluates both the presence of code and its potential security impact based on the operations it attempts to perform.

Detection Techniques

  • Language syntax identification (keywords, operators, formatting)
  • Pattern-based detection of common coding structures
  • Contextual analysis to differentiate code from regular text
  • Semantic analysis of potential operations and their risk level
  • Detection of code snippets and embedded fragments
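
The pattern-matching and fingerprinting ideas above can be sketched in a few lines. This is a toy model under stated assumptions: the pattern table, the "fraction of patterns matched" score, and the function names are all hypothetical and far simpler than what a production scanner would use. It also shows why a lower threshold reports more code at the cost of false positives.

```javascript
// Illustrative only: a toy pattern-based language fingerprint.
// The patterns and scoring rule are assumptions, not the agent's actual logic.
const FINGERPRINTS = {
  python: [/\bdef\s+\w+\s*\(/, /\bfor\s+\w+\s+in\s+range\s*\(/, /\bimport\s+\w+/, /\bprint\s*\(/],
  javascript: [/\b(?:const|let|var)\s+\w+\s*=/, /\bfunction\s+\w*\s*\(/, /=>/, /\bconsole\.log\s*\(/],
  sql: [/\bselect\b[\s\S]+\bfrom\b/i, /\binsert\s+into\b/i, /\bdrop\s+table\b/i, /\bunion\s+select\b/i],
};

// Score each language as the fraction of its patterns found in the text.
function fingerprint(text, languages = Object.keys(FINGERPRINTS)) {
  const scores = {};
  for (const language of languages) {
    const patterns = FINGERPRINTS[language] ?? [];
    const hits = patterns.filter((pattern) => pattern.test(text)).length;
    scores[language] = patterns.length === 0 ? 0 : hits / patterns.length;
  }
  return scores;
}

// Report the languages whose score meets or exceeds the threshold.
function detectLanguages(text, threshold = 0.5, languages) {
  const scores = fingerprint(text, languages);
  return Object.keys(scores).filter((language) => scores[language] >= threshold);
}

const codeSample = "Here's a fix: for i in range(10): print(i)";
console.log(detectLanguages(codeSample, 0.5));  // [ 'python' ]
console.log(detectLanguages(codeSample, 0.75)); // []

// Ordinary prose can trip a loose pattern when the threshold is low.
const proseSample = "Please select a seat from the list";
console.log(detectLanguages(proseSample, 0.25)); // [ 'sql' ]
console.log(detectLanguages(proseSample, 0.5));  // []
```

The prose sample is the false-positive case: a single SQL-like pattern matches, so it is reported only when the threshold is set low.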

Use Cases

  • Forum and Comment Protection: Prevent code injection in user-generated content
  • Chatbot Security: Detect attempts to manipulate AI systems through code inputs
  • Document Processing: Identify potentially malicious code in uploaded documents
  • Email Filtering: Flag emails containing suspicious code snippets
  • Content Moderation: Identify code-sharing that violates platform policies

Implementation Example

    const codeScanner = new CodeScanner({
      languages: ["javascript", "python", "php", "sql", "bash"],
      threshold: 0.6,
      isBlocked: true
    });

    // Text that mixes prose with a destructive Python snippet
    const inputText = "Here's a solution: for i in range(10): exec('rm -rf /')";

    const result = codeScanner.scan(inputText);

    // Output:
    // {
    //   safetyStatus: "Unsafe",
    //   riskScore: 0.89,
    //   detectedLanguages: ["python"],
    //   codeFragments: [
    //     {
    //       fragment: "for i in range(10): exec('rm -rf /')",
    //       language: "python",
    //       riskLevel: "high",
    //       reason: "System command execution with destructive operation"
    //     }
    //   ]
    // }

Best Practices

  • Limit the language list to the languages that pose a realistic risk to your application
  • Adjust thresholds based on your content policies and risk tolerance
  • Supplement code scanning with input sanitization in your application
  • Implement content sandboxing for environments where code sharing is permitted
  • Monitor scanning logs to identify emerging code injection techniques
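
As one example of the input sanitization mentioned above, user-supplied text can be HTML-escaped before it is rendered, so that markup which passes the scanner is displayed as text instead of being interpreted. This is a minimal sketch for HTML element content only; the helper name `escapeHtml` is illustrative, and other contexts (URLs, SQL, shell) need their own encoding or parameterization.

```javascript
// Minimal sketch of output encoding for HTML element content.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Replaces the five HTML-significant characters with their entities.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml("<script>alert('x')</script>"));
// &lt;script&gt;alert(&#39;x&#39;)&lt;/script&gt;
```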