JSON Cleaner Component

The JSON Cleaner component sanitizes and validates JSON strings by removing control characters, normalizing Unicode, and ensuring proper JSON structure. It provides detailed cleaning statistics and validation results.

JSON Cleaner Architecture

[Figure: JSON Cleaner workflow and architecture]

Configuration Parameters

Required Parameters

  • jsonString: Input JSON string to clean

Optional Parameters

  • removeControlCharacters: Remove control characters (default: true)
  • normalizeUnicode: Normalize Unicode characters (default: true)
  • validateJson: Validate JSON structure (default: true)
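
To make the three options concrete, here is a minimal sketch of what each one could do, using only built-in JavaScript features. It is not the component's actual implementation: the function name cleanJsonSketch, the exact set of characters removed, and the choice of NFC as the normalization form are all assumptions made for illustration.

```javascript
// Hypothetical sketch; not the component's actual implementation.
// Assumption: remove C0 control characters (U+0000-U+001F) except tab, line feed,
// and carriage return, which are legal JSON whitespace, plus DEL (U+007F).
const CONTROL_CHARS = /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g;

function cleanJsonSketch(jsonString, options = {}) {
  const {
    removeControlCharacters = true,
    normalizeUnicode = true,
    validateJson = true,
  } = options;

  let cleaned = jsonString;
  let charactersRemoved = 0;

  if (removeControlCharacters) {
    const before = cleaned.length;
    cleaned = cleaned.replace(CONTROL_CHARS, '');
    charactersRemoved = before - cleaned.length;
  }

  if (normalizeUnicode) {
    // Assumption: NFC, which composes "e" + U+0301 into a single "é".
    cleaned = cleaned.normalize('NFC');
  }

  let isValid = null; // null means validation was skipped
  let error = null;
  if (validateJson) {
    try {
      JSON.parse(cleaned);
      isValid = true;
    } catch (err) {
      isValid = false;
      error = err.message;
    }
  }

  return { cleaned, charactersRemoved, isValid, error };
}
```

Tab, line feed, and carriage return are kept in this sketch so that pretty-printed input keeps its layout; a stricter cleaner might strip them as well.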

Output Format

{
  "cleaned_json": string,
  "cleaning_stats": {
    "original_length": number,
    "cleaned_length": number,
    "characters_removed": number,
    "unicode_normalized": number,
    "processing_time": number
  },
  "validation": {
    "is_valid": boolean,
    "errors": Array<{
      "type": string,
      "message": string,
      "position": number
    }>,
    "warnings": Array<string>
  },
  "modifications": {
    "control_chars_removed": Array<{
      "char": string,
      "position": number,
      "count": number
    }>,
    "unicode_changes": Array<{
      "original": string,
      "normalized": string,
      "count": number
    }>
  },
  "performance": {
    "start_time": string,
    "end_time": string,
    "duration": number
  }
}
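
As an illustration of how a caller might read this structure, the sketch below turns a result into a few report lines. The helper name summarizeResult and the sample values are hypothetical; only the field names come from the format above.

```javascript
// Hypothetical helper (not part of the documented API): turns a result shaped
// like the Output Format into human-readable report lines.
function summarizeResult(result) {
  const { cleaning_stats: stats, validation, modifications } = result;
  const lines = [
    `length: ${stats.original_length} -> ${stats.cleaned_length} (${stats.characters_removed} removed)`,
  ];

  for (const entry of modifications.control_chars_removed) {
    // Show the removed character as a code point, since it is not printable.
    const codePoint = entry.char.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
    lines.push(`removed U+${codePoint} at position ${entry.position} (x${entry.count})`);
  }

  if (!validation.is_valid) {
    for (const err of validation.errors) {
      lines.push(`error [${err.type}] at ${err.position}: ${err.message}`);
    }
  }
  for (const warning of validation.warnings) {
    lines.push(`warning: ${warning}`);
  }
  return lines;
}

// Illustrative values only; just the fields the helper reads are filled in.
const sampleResult = {
  cleaning_stats: { original_length: 28, cleaned_length: 27, characters_removed: 1 },
  validation: { is_valid: true, errors: [], warnings: [] },
  modifications: {
    control_chars_removed: [{ char: '\u0000', position: 13, count: 1 }],
  },
};

console.log(summarizeResult(sampleResult).join('\n'));
```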

Features

  • Control character removal
  • Unicode normalization
  • JSON validation
  • Error reporting
  • Performance tracking
  • Modification logging
  • Statistics generation
  • Character tracking

Note: Handle large JSON strings with care. Monitor memory usage for extensive cleaning operations.

Tip: Enable all cleaning options for maximum compatibility. Before using cleaned_json, check validation.is_valid and handle any entries in validation.errors.
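
One possible shape for that error handling is sketched below. It assumes the result follows the Output Format above; the JsonCleaningError class and the parseCleaned function are hypothetical names introduced here, not part of the component.

```javascript
// Hypothetical error type; the component is not documented to throw this.
class JsonCleaningError extends Error {
  constructor(errors) {
    super(errors.map((e) => `${e.type} at ${e.position}: ${e.message}`).join('; '));
    this.name = 'JsonCleaningError';
    this.errors = errors; // keep the structured entries for callers that need them
  }
}

// Returns the parsed value, or throws if validation reported the JSON as invalid.
function parseCleaned(result) {
  if (!result.validation.is_valid) {
    throw new JsonCleaningError(result.validation.errors);
  }
  return JSON.parse(result.cleaned_json);
}
```

A caller can then wrap parseCleaned in try/catch and branch on err.name, keeping the happy path free of validation checks.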

Example Usage

const jsonCleaner = new JsonCleaner();

// Basic cleaning
const result1 = await jsonCleaner.clean({
  jsonString: '{"name":"John\u0000Doe","age":30}',
  removeControlCharacters: true,
  normalizeUnicode: true,
  validateJson: true
});

// Custom cleaning options (these `options` keys are not listed under Configuration Parameters above)
const result2 = await jsonCleaner.clean({
  jsonString: `{
    "title": "Hello\u0008World",
    "description": "Special\u0003chars",
    "items": ["item\u0001", "item\u0002"]
  }`,
  options: {
    preserveFormatting: true,
    removeNullBytes: true,
    normalizeNewlines: true
  }
});

// Handling malformed JSON: trailing comma in the array and no comma before "status"
const result3 = await jsonCleaner.clean({
  jsonString: `{
    "name": "Test",
    "data": [1, 2, 3,]
    "status": "active"
  }`,
  options: {
    autofix: true,
    reportWarnings: true,
    strictMode: false
  }
});

Common Issues:

// Control Characters
\u0000 - Null byte
\u0001-\u001F - Other C0 control characters (must be escaped inside JSON strings)

// Unicode Issues
"caf\u00E9" vs "cafe\u0301" // Render identically; precomposed (NFC) vs decomposed (NFD) code points

// JSON Structure
{
  "key": "value",, // Double comma
  "array": [1, 2,] // Trailing comma; also no comma before the next key
  "missing": "brace"
}
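
The issues listed above can be reproduced with built-in JavaScript alone. The following is a small sketch, independent of the component; the helper name parses is made up for this example.

```javascript
// Unicode: the same visible text can be stored as different code points.
const composed = 'caf\u00E9';    // "é" as a single code point (NFC form)
const decomposed = 'cafe\u0301'; // "e" followed by a combining acute accent (NFD form)

const sameBefore = composed === decomposed; // false
const sameAfter = composed.normalize('NFC') === decomposed.normalize('NFC'); // true

// Structure and control characters: JSON.parse rejects all of these inputs.
function parses(text) {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

const rawNullByte = parses('{"name":"John\u0000Doe"}');  // false: raw control character in a string
const trailingComma = parses('{"array": [1, 2,]}');      // false
const doubleComma = parses('{"key": "value",, "a": 1}'); // false
const missingBrace = parses('{"missing": "brace"');      // false

console.log({ sameBefore, sameAfter, rawNullByte, trailingComma, doubleComma, missingBrace });
```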