Text Embedder

A versatile text embedding component that supports multiple embedding models and provides standardized output format. Ideal for converting text into vector representations for semantic search, clustering, and similarity analysis.
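Embedding vectors are typically compared with cosine similarity: vectors pointing in the same direction score 1.0, orthogonal (unrelated) vectors score 0.0. A minimal, self-contained sketch of that comparison:

```javascript
// Cosine similarity between two embedding vectors. A score near 1.0 means
// the texts are semantically similar; near 0.0 means they are unrelated.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions must match");
  }
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical vectors score exactly 1.
console.log(cosineSimilarity([1, 0, 0], [1, 0, 0])); // 1
```

This is the comparison that semantic search, clustering, and similarity analysis build on; in practice a vector store performs it over many stored embeddings at once.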

Text Embedder Component

Text Embedder component interface and configuration

Model Configuration Notice: When configuring this component, ensure that the embedding model type and dimensions match the expected output format for your downstream applications. Inconsistent dimensions can cause issues with vector stores and similarity searches.

Component Inputs

  • Embedding Model: Configuration object for the embedding model

    Example: modelType: openai, dimensions: 1536, options: {...}

  • Message: Text content to be embedded

    Example: "The quick brown fox jumps over the lazy dog"

  • Model Type: Type of embedding model to use

    Example: "openai", "huggingface", "custom"

  • Dimensions: Size of the embedding vector

    Example: 768, 1024, 1536, 3072

  • Model Options: Specific configuration for the selected model

    Example: model: text-embedding-ada-002, apiKey: your-api-key

Component Outputs

  • Embeddings: Vector representation of the input text

    Example: [0.021, -0.038, 0.075, ...]

  • Metadata: Information about the embedding process

    Example: model_type: openai, dimensions: 1536, processing_time_ms: 178

  • Status: Success or error information

    Example: success: true, error: null
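Given the output fields above (embeddings, status), a result can be validated before it is handed to a vector store. A minimal guard, using the field names shown in the examples; the exact result shape is an assumption based on those examples:

```javascript
// Checks that the embedding call succeeded and the vector has the
// dimensions the downstream application expects (see the Model
// Configuration Notice above). Field names follow the output examples.
function validateEmbeddingResult(result, expectedDimensions) {
  if (!result.status || result.status.success !== true) {
    return { ok: false, reason: result.status?.error ?? "unknown error" };
  }
  if (!Array.isArray(result.embeddings) ||
      result.embeddings.length !== expectedDimensions) {
    return { ok: false, reason: "dimension mismatch" };
  }
  return { ok: true, reason: null };
}
```

Rejecting dimension mismatches early is cheaper than discovering them as insert failures or nonsensical similarity scores inside the vector store.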

Model Type Comparison

OpenAI Models

High-quality embeddings from OpenAI's API service

modelType: 'openai'
dimensions: 1536 (text-embedding-3-small) or 3072 (text-embedding-3-large)
options: {
  model: 'text-embedding-3-small',
  apiKey: 'your-api-key'
}

Ideal for: High-quality embeddings when API access is available

Hugging Face Models

Local or hosted embeddings using Hugging Face Transformers

modelType: 'huggingface'
dimensions: varies by model (typically 384-1024)
options: {
  model: 'sentence-transformers/all-MiniLM-L6-v2',
  quantize: true
}

Ideal for: Local embedding generation without API dependencies

Custom Models

Integrate with your own embedding models or third-party services

modelType: 'custom'
dimensions: specified by implementation
options: {
  // Custom parameters for your model
}

Ideal for: Specialized embedding models or proprietary implementations

Implementation Example

const embedder = new TextEmbedder({
  embeddingModel: {
    modelType: "openai",
    dimensions: 1536,
    options: {
      model: "text-embedding-ada-002",
      apiKey: process.env.OPENAI_API_KEY
    }
  }
});

// Single text embedding
const result = await embedder.embed({
  message: "Your text to embed"
});

// Batch processing
const batchResult = await embedder.embedBatch({
  messages: [
    "First text to embed",
    "Second text to embed",
    "Third text to embed"
  ]
});

// Custom model configuration
const customEmbedder = new TextEmbedder({
  embeddingModel: {
    modelType: "custom",
    dimensions: 768,
    options: {
      modelPath: "path/to/model",
      normalize: true,
      batchSize: 32
    }
  }
});

console.log(result.embeddings);

Use Cases

  • Model Abstraction: Standardize embedding interfaces across multiple providers
  • Flexible RAG Systems: Build retrieval systems that can switch between embedding models
  • Multi-Model Testing: Compare embedding quality across different providers
  • Hybrid Search: Combine results from multiple embedding models
  • Fallback Architecture: Implement model redundancy with automatic fallbacks
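The fallback architecture above can be sketched as a small wrapper that tries each configured embedder in order. The two embedders in the test below are hypothetical stand-ins exposing the same embed({ message }) method as the implementation example; they are not part of the component itself:

```javascript
// Try each embedder in priority order; return the first successful result.
// Each entry is expected to expose embed({ message }) as shown in the
// implementation example above.
async function embedWithFallback(embedders, message) {
  let lastError;
  for (const embedder of embedders) {
    try {
      return await embedder.embed({ message });
    } catch (err) {
      lastError = err; // remember the failure and move to the next model
    }
  }
  throw new Error(`All embedding models failed: ${lastError}`);
}
```

Note that fallback models should produce vectors of the same dimensions, or the downstream index must be partitioned per model: vectors from different embedding models are not directly comparable.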

Best Practices

  • Choose embedding dimensions appropriate for your application's needs
  • Implement proper error handling for model failures
  • Cache embeddings for frequently used text to reduce computation
  • Pre-process text appropriately for the chosen embedding model
  • Use batch processing for multiple texts to improve throughput
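The caching practice above can be implemented as a thin memoizing wrapper. A minimal in-memory sketch, assuming the wrapped function is any async (message) => vector implementation (a production cache would also bound its size and key on the model identity, since different models yield different vectors for the same text):

```javascript
// Memoize embeddings for repeated text so identical inputs hit the
// underlying model only once. Cache is keyed by the exact input string.
function withEmbeddingCache(embedFn) {
  const cache = new Map();
  return async function cachedEmbed(message) {
    if (cache.has(message)) return cache.get(message);
    const embedding = await embedFn(message);
    cache.set(message, embedding);
    return embedding;
  };
}
```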