OpenAI Embeddor

Production-Grade Embeddings

Overview

Generate high-quality embeddings using OpenAI's state-of-the-art models. The embedder offers extensive configuration options, robust error handling, and enterprise-grade reliability.

Available Models

  • text-embedding-3-small
  • text-embedding-3-large
  • text-embedding-ada-002

Key Features

  • Automatic retries
  • Token counting
  • Progress tracking
  • Proxy support

Configuration

Required Parameters

  • openaiApiKey: OpenAI API key
  • model: Model identifier

Optional Parameters

  • openaiApiBase: Custom API endpoint
  • openaiApiType: API type (azure, openai)
  • openaiApiVersion: API version
  • openaiOrganization: Organization ID
  • maxRetries: Maximum retry attempts (Default: 3)
  • requestTimeout: Timeout in milliseconds
  • chunkSize: Batch processing size
  • showProgressBar: Display progress (Default: false)
  • skipEmpty: Skip empty inputs (Default: true)
  • tiktokenEnable: Enable token counting
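The maxRetries and requestTimeout options imply retry-with-backoff behavior around each API call. A minimal sketch of what such a wrapper might look like (withRetries and the backoff schedule are illustrative assumptions, not the library's actual implementation):

```typescript
// Hypothetical retry wrapper: retries a failing async call up to
// maxRetries times with exponential backoff (100ms, 200ms, 400ms, ...).
async function withRetries<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break; // retries exhausted
      await new Promise((resolve) => setTimeout(resolve, 100 * 2 ** attempt));
    }
  }
  throw lastError;
}
```

A real embedder would pair this with a per-request timeout (requestTimeout) so a hung connection also counts as a failed attempt.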

Example Usage

// Basic configuration
const embedder = new OpenAIEmbeddor({
  openaiApiKey: "your-api-key",
  model: "text-embedding-3-small"
});

// Advanced configuration
const advancedEmbedder = new OpenAIEmbeddor({
  openaiApiKey: "your-api-key",
  model: "text-embedding-3-large",
  openaiApiBase: "https://custom-endpoint.com",
  openaiApiType: "azure",
  openaiApiVersion: "2024-02-15",
  openaiOrganization: "org-id",
  maxRetries: 5,
  requestTimeout: 30000,
  chunkSize: 1000,
  showProgressBar: true,
  skipEmpty: true,
  tiktokenEnable: true
});

// Generate embeddings
const result = await embedder.embed({
  input: "Your text to embed"
});

// Batch processing with progress
const batchResult = await advancedEmbedder.embedBatch({
  inputs: [
    "First text to embed",
    "Second text to embed"
  ]
});

Best Practices

  • Use environment variables for keys
  • Implement proper error handling
  • Monitor token usage
  • Cache frequent embeddings
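As a sketch of the caching recommendation above, a simple in-memory memoization layer keyed by input text might look like this (cachedEmbed and the stand-in embed function are hypothetical; a real call would hit the OpenAI API):

```typescript
// In-memory cache so repeated inputs only cost one API call.
const cache = new Map<string, number[]>();
let apiCalls = 0;

// Stand-in for a real embedder call; returns a fake deterministic vector.
async function embed(text: string): Promise<number[]> {
  apiCalls++; // each call here would be a billable request
  return Array.from(text).map((c) => c.charCodeAt(0) / 255);
}

async function cachedEmbed(text: string): Promise<number[]> {
  const hit = cache.get(text);
  if (hit) return hit; // cache hit: no API call
  const vector = await embed(text);
  cache.set(text, vector);
  return vector;
}
```

For production use, bound the cache size (e.g. an LRU policy) and key on both the input text and the model name, since different models produce different vectors.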

Performance Tips

  • Optimize chunk sizes
  • Use appropriate timeouts
  • Enable token counting
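Chunk-size tuning starts from splitting the input list into batches of chunkSize, roughly as follows (chunkInputs is an illustrative helper, not part of the library):

```typescript
// Split inputs into batches of at most chunkSize items,
// mirroring how a chunkSize option would batch API requests.
function chunkInputs<T>(inputs: T[], chunkSize: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < inputs.length; i += chunkSize) {
    chunks.push(inputs.slice(i, i + chunkSize));
  }
  return chunks;
}
```

Larger chunks mean fewer requests but bigger payloads and longer individual calls, so tune chunkSize together with requestTimeout.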

Response Format

{
  "embeddings": {
    "vectors": number[][],
    "dimensions": number,
    "model": string
  },
  "usage": {
    "prompt_tokens": number,
    "total_tokens": number
  },
  "metadata": {
    "model_version": string,
    "processing_time": number,
    "chunk_info": {
      "total_chunks": number,
      "processed_chunks": number
    }
  },
  "status": {
    "success": boolean,
    "error": string | null
  }
}
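If you consume this response from TypeScript, the shape above can be expressed as an interface (the EmbedResponse name and the sample values are assumptions for illustration, not types exported by the library):

```typescript
// Hypothetical interface mirroring the response format shown above.
interface EmbedResponse {
  embeddings: { vectors: number[][]; dimensions: number; model: string };
  usage: { prompt_tokens: number; total_tokens: number };
  metadata: {
    model_version: string;
    processing_time: number;
    chunk_info: { total_chunks: number; processed_chunks: number };
  };
  status: { success: boolean; error: string | null };
}

// Sample value conforming to the interface (illustrative data only).
const sample: EmbedResponse = {
  embeddings: {
    vectors: [[0.1, 0.2]],
    dimensions: 2,
    model: "text-embedding-3-small",
  },
  usage: { prompt_tokens: 4, total_tokens: 4 },
  metadata: {
    model_version: "text-embedding-3-small",
    processing_time: 12,
    chunk_info: { total_chunks: 1, processed_chunks: 1 },
  },
  status: { success: true, error: null },
};
```

Typing the response this way lets the compiler catch a missing null check on status.error before it becomes a runtime bug.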