OpenAI Embeddings
Generate high-quality embeddings using OpenAI's embedding models. The component provides extensive configuration options, including custom endpoints, automatic retries, and configurable request timeouts, along with robust error handling.

OpenAI Embeddings component interface and configuration
API Key Notice: Ensure your API key has sufficient quota for your embedding needs. OpenAI rate limits API requests based on your tier, and embedding large volumes of text may require a higher tier subscription.
Component Inputs
- OpenAI API Key: Your OpenAI API authentication key
Example: "sk-abcdefg123456789"
- Model: The embedding model to use
Example: "text-embedding-3-small", "text-embedding-3-large", "text-embedding-ada-002"
- OpenAI API Base: Optional custom API endpoint
Example: "https://api.openai.com/v1" or your custom endpoint
- OpenAI API Type: Type of API service
Example: "openai" or "azure"
- OpenAI API Version: Version of the API
Example: "2024-02-15"
- OpenAI Organization: Organization ID for team accounts
Example: "org-123456789"
Component Outputs
- Embeddings: Vector representations of the input text
Example: [0.012, -0.045, 0.067, ...]
- Token Usage: Number of tokens used for the request
Example: total_tokens: 125, prompt_tokens: 125
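Assuming the same openai SDK client as above, the two outputs correspond to the embedding vector and the usage block of the API response:

// Sketch: requesting an embedding and reading both outputs
const response = await client.embeddings.create({
  model: "text-embedding-3-small",
  input: "Your text to embed"
});

const vector = response.data[0].embedding;       // Embeddings, e.g. [0.012, -0.045, 0.067, ...]
const tokensUsed = response.usage.total_tokens;  // Token Usage, e.g. 125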
Model Comparison
text-embedding-3-small
Efficient model with 1536 dimensions, offering a good balance between quality and cost
Dimensions: 1536
Contextual Understanding: Strong
Languages: Multilingual
Ideal for: Most general embedding tasks
text-embedding-3-large
Highest quality embeddings with 3072 dimensions, optimal for tasks requiring maximum accuracy
Dimensions: 3072
Contextual Understanding: Superior
Languages: Multilingual with enhanced capabilities
Ideal for: High-precision semantic search, knowledge retrieval, and nuanced text comparison
text-embedding-ada-002
Legacy model with 1536 dimensions, maintained for backward compatibility
Dimensions: 1536
Contextual Understanding: Good
Languages: Primarily optimized for English
Ideal for: Legacy systems or when compatibility is required
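Both text-embedding-3 models also accept a dimensions parameter at the API level, which truncates the returned vector, trading some accuracy for smaller storage and faster similarity search. This is an OpenAI API feature and may not be exposed directly by this component; a sketch with the openai SDK:

// Sketch: requesting shortened embeddings from a text-embedding-3 model
// (text-embedding-ada-002 does not support the dimensions parameter)
const shortened = await client.embeddings.create({
  model: "text-embedding-3-large",
  input: "Your text to embed",
  dimensions: 1024  // down from the default 3072
});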
Implementation Example
// Basic configuration
const embedder = new OpenAIEmbeddor({
  openaiApiKey: process.env.OPENAI_API_KEY,
  model: "text-embedding-3-small"
});

// Advanced configuration
const advancedEmbedder = new OpenAIEmbeddor({
  openaiApiKey: process.env.OPENAI_API_KEY,
  model: "text-embedding-3-large",
  openaiApiBase: "https://custom-endpoint.com",
  openaiApiType: "azure",
  openaiApiVersion: "2024-02-15",
  openaiOrganization: "org-id",
  maxRetries: 5,           // retry failed requests up to 5 times
  requestTimeout: 30000,   // request timeout in milliseconds
  chunkSize: 1000,         // number of texts sent per batched request
  showProgressBar: true,
  skipEmpty: true,         // skip empty strings instead of embedding them
  tiktokenEnable: true     // use tiktoken for local token counting
});

// Generate embeddings
const result = await embedder.embed({
  input: "Your text to embed"
});

// The result contains the embedding vectors
console.log(result);
Use Cases
- Semantic Search: Create a vector store for similarity search (see the cosine-similarity sketch after this list)
- RAG Applications: Enhance retrieval-augmented generation with high-quality embeddings
- Document Clustering: Group similar documents based on semantic similarity
- Recommendation Systems: Build content recommendation engines
- Duplicate Content Detection: Identify similar or duplicate content
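For semantic search and duplicate detection, the standard comparison between two embedding vectors is cosine similarity. A minimal, dependency-free sketch (vectorA and vectorB stand in for embeddings produced earlier):

// Sketch: cosine similarity between two embedding vectors.
// OpenAI embeddings are length-normalized, so the plain dot product
// gives the same ranking; the full formula is shown for generality.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const similarity = cosineSimilarity(vectorA, vectorB);
const isDuplicate = similarity > 0.95;  // threshold is application-specific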
Best Practices
- Use environment variables for your API keys in production environments
- Implement caching for embeddings to reduce API costs for repeated content (a combined caching and batching sketch follows this list)
- Use text-embedding-3-small for most use cases, and text-embedding-3-large when precision is critical
- Pre-process text by removing unnecessary whitespace and formatting to reduce token usage
- Batch similar-length texts together for more efficient processing
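A sketch combining the caching and batching practices above, assuming the openai SDK; the in-memory Map is illustrative, and a persistent store would replace it in production:

// Sketch: cache embeddings and send uncached texts in one batched request
const cache = new Map();

async function embedBatch(texts) {
  const missing = texts.filter((t) => !cache.has(t));
  if (missing.length > 0) {
    // The embeddings endpoint accepts an array of inputs in a single request
    const response = await client.embeddings.create({
      model: "text-embedding-3-small",
      input: missing
    });
    // Each result carries an index into the input array it was generated from
    for (const item of response.data) {
      cache.set(missing[item.index], item.embedding);
    }
  }
  return texts.map((t) => cache.get(t));
}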