OpenAI Embeddor
Production-Grade Embeddings
Overview
Generate high-quality embeddings using OpenAI's state-of-the-art models. Features extensive configuration options, robust error handling, and enterprise-grade reliability.
Available Models
- text-embedding-3-small
- text-embedding-3-large
- text-embedding-ada-002
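For reference, the default output dimensionality of each model, per OpenAI's published model specs (the `MODEL_DIMENSIONS` map below is illustrative, not a library export; note that the text-embedding-3 models also accept a request-time `dimensions` parameter to produce shorter vectors):

```typescript
// Default embedding dimensions for each supported model.
const MODEL_DIMENSIONS: Record<string, number> = {
  "text-embedding-3-small": 1536,
  "text-embedding-3-large": 3072,
  "text-embedding-ada-002": 1536,
};
```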
Key Features
- Automatic retries
- Token counting
- Progress tracking
- Proxy support
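The automatic-retry behavior can be pictured with a small sketch. This is a simplified synchronous version for illustration; `withRetries` is a hypothetical helper, not part of the library API, and the real client would retry asynchronously with backoff:

```typescript
// Sketch of retry logic: attempt the operation up to maxRetries + 1
// times, rethrowing the last error only if every attempt fails.
function withRetries<T>(fn: () => T, maxRetries: number = 3): T {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return fn();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}
```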
Configuration
Required Parameters
- openaiApiKey: OpenAI API key
- model: Model identifier
Optional Parameters
- openaiApiBase: Custom API endpoint
- openaiApiType: API type (azure, openai)
- openaiApiVersion: API version
- openaiOrganization: Organization ID
- maxRetries: Maximum retry attempts (Default: 3)
- requestTimeout: Timeout in milliseconds
- chunkSize: Batch processing size
- showProgressBar: Display progress (Default: false)
- skipEmpty: Skip empty inputs (Default: true)
- tiktokenEnable: Enable token counting
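As a sketch of how the documented defaults might be applied, assuming a config shape mirroring the parameter list above (`EmbeddorConfig` and `applyDefaults` are hypothetical names, not library exports):

```typescript
// Hypothetical subset of the config options listed above.
interface EmbeddorConfig {
  openaiApiKey: string;
  model: string;
  maxRetries?: number;
  showProgressBar?: boolean;
  skipEmpty?: boolean;
}

// Fill in the documented defaults for any options left unset;
// values passed explicitly take precedence via the spread.
function applyDefaults(config: EmbeddorConfig): Required<EmbeddorConfig> {
  return {
    maxRetries: 3,
    showProgressBar: false,
    skipEmpty: true,
    ...config,
  } as Required<EmbeddorConfig>;
}
```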
Example Usage
```typescript
// Basic configuration
const embedder = new OpenAIEmbeddor({
  openaiApiKey: "your-api-key",
  model: "text-embedding-3-small"
});

// Advanced configuration
const advancedEmbedder = new OpenAIEmbeddor({
  openaiApiKey: "your-api-key",
  model: "text-embedding-3-large",
  openaiApiBase: "https://custom-endpoint.com",
  openaiApiType: "azure",
  openaiApiVersion: "2024-02-15",
  openaiOrganization: "org-id",
  maxRetries: 5,
  requestTimeout: 30000,
  chunkSize: 1000,
  showProgressBar: true,
  skipEmpty: true,
  tiktokenEnable: true
});

// Generate embeddings
const result = await embedder.embed({ input: "Your text to embed" });

// Batch processing with progress
const batchResult = await advancedEmbedder.embedBatch({
  inputs: [
    "First text to embed",
    "Second text to embed"
  ]
});
```
Best Practices
- Use environment variables for keys
- Implement proper error handling
- Monitor token usage
- Cache frequent embeddings
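The first and last practices can be combined in a short sketch. `requireApiKey` and `getCachedEmbedding` are hypothetical helpers, and the `embed` callback stands in for a call like `embedder.embed` above:

```typescript
// Read the key from the environment rather than hard-coding it;
// fail fast with a clear message if it is missing.
function requireApiKey(): string {
  const key = process.env.OPENAI_API_KEY;
  if (!key) throw new Error("OPENAI_API_KEY is not set");
  return key;
}

// In-memory cache keyed by input text, so repeated inputs are
// embedded only once per process.
const cache = new Map<string, number[]>();

async function getCachedEmbedding(
  text: string,
  embed: (input: string) => Promise<number[]>
): Promise<number[]> {
  const hit = cache.get(text);
  if (hit) return hit;
  const vector = await embed(text);
  cache.set(text, vector);
  return vector;
}
```

For production use, a bounded cache (LRU) or an external store is preferable to an unbounded `Map`.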
Performance Tips
- Optimize chunk sizes
- Use appropriate timeouts
- Enable token counting
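Chunk-size tuning amounts to splitting the input list into batches before sending requests; a minimal sketch (`chunk` is a hypothetical helper, not a library export):

```typescript
// Split inputs into batches of at most `size` items, matching the
// role of the chunkSize option: larger batches mean fewer requests,
// smaller batches keep each request fast and cheap to retry.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```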
Response Format
```typescript
{
  "embeddings": {
    "vectors": number[][],
    "dimensions": number,
    "model": string
  },
  "usage": {
    "prompt_tokens": number,
    "total_tokens": number
  },
  "metadata": {
    "model_version": string,
    "processing_time": number,
    "chunk_info": {
      "total_chunks": number,
      "processed_chunks": number
    }
  },
  "status": {
    "success": boolean,
    "error": string | null
  }
}
```
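The shape above can be written out as a TypeScript type for consumers. The field names mirror the response format shown; the interface itself and `getVectors` are illustrations, not library exports:

```typescript
interface EmbedResponse {
  embeddings: { vectors: number[][]; dimensions: number; model: string };
  usage: { prompt_tokens: number; total_tokens: number };
  metadata: {
    model_version: string;
    processing_time: number;
    chunk_info: { total_chunks: number; processed_chunks: number };
  };
  status: { success: boolean; error: string | null };
}

// Check the status flag before touching the vectors.
function getVectors(res: EmbedResponse): number[][] {
  if (!res.status.success) {
    throw new Error(res.status.error ?? "embedding failed");
  }
  return res.embeddings.vectors;
}
```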