Vertex AI Embeddor
Enterprise-Grade Embeddings
Overview
Generate high-quality embeddings using Google's Vertex AI platform. Features advanced parameter control, parallel processing, and enterprise-grade reliability.
Available Models
- textembedding-gecko
- textembedding-gecko-multilingual
Key Features
- Parallel processing
- Advanced parameter tuning
- Streaming support
- Enterprise security
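The parallel-processing feature can be approximated client-side with a bounded worker pool; a minimal sketch using only standard Promises (`mapWithParallelism` is a hypothetical helper for illustration, not part of the Vertex AI SDK):

```typescript
// Run an async mapper over items with at most `parallelism` calls in
// flight at once, mirroring what a requestParallelism setting does.
async function mapWithParallelism<T, R>(
  items: T[],
  parallelism: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }

  const workers = Array.from(
    { length: Math.min(parallelism, items.length) },
    () => worker()
  );
  await Promise.all(workers);
  return results;
}
```

Results come back in input order regardless of which worker finishes first, since each worker writes into its claimed slot.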
Configuration
Required Parameters
- credentials: Google Cloud credentials
- location: GCP region
- project: GCP project ID
- modelName: Vertex AI model name
Optional Parameters
- maxOutputTokens (default: 1024)
- maxRetries (default: 3)
- n: number of outputs (default: 1)
- requestParallelism (default: 5)
- stop: stop sequences
- streaming (default: false)
- temperature (default: 0.0)
- topK (default: 40)
- topP (default: 0.95)
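The defaults listed above can be captured in a small options helper; a sketch assuming a plain options object (the `EmbeddorOptions` shape is illustrative, not taken from the SDK):

```typescript
interface EmbeddorOptions {
  maxOutputTokens?: number;
  maxRetries?: number;
  n?: number;
  requestParallelism?: number;
  stop?: string[];
  streaming?: boolean;
  temperature?: number;
  topK?: number;
  topP?: number;
}

// Defaults as documented above; stop has no documented default,
// so an empty list is assumed here.
const DEFAULTS: Required<EmbeddorOptions> = {
  maxOutputTokens: 1024,
  maxRetries: 3,
  n: 1,
  requestParallelism: 5,
  stop: [],
  streaming: false,
  temperature: 0.0,
  topK: 40,
  topP: 0.95,
};

// Merge user-supplied options over the documented defaults.
function withDefaults(opts: EmbeddorOptions): Required<EmbeddorOptions> {
  return { ...DEFAULTS, ...opts };
}
```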
Example Usage
// Basic configuration
const embedder = new VertexAIEmbeddor({
  credentials: {
    client_email: "your-service-account@project.iam.gserviceaccount.com",
    private_key: "your-private-key"
  },
  location: "us-central1",
  project: "your-project-id",
  modelName: "textembedding-gecko"
});

// Advanced configuration
const advancedEmbedder = new VertexAIEmbeddor({
  credentials: {
    client_email: "your-service-account@project.iam.gserviceaccount.com",
    private_key: "your-private-key"
  },
  location: "us-central1",
  project: "your-project-id",
  modelName: "textembedding-gecko-multilingual",
  maxOutputTokens: 2048,
  maxRetries: 5,
  n: 3,
  requestParallelism: 10,
  stop: ["END"],
  streaming: true,
  temperature: 0.7,
  topK: 50,
  topP: 0.8
});

// Generate embeddings
const result = await embedder.embed({
  input: "Your text to embed"
});

// Batch processing with streaming
const streamingResult = await advancedEmbedder.embedBatch({
  inputs: [
    "First text to embed",
    "Second text to embed"
  ],
  streaming: true
});
Best Practices
- Use service account authentication
- Implement proper error handling
- Monitor API quotas
- Cache frequent embeddings
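Caching frequent embeddings can be done with a thin wrapper around any embed function of type `(text) => Promise<number[]>`; a sketch (the wrapper is illustrative, not an SDK feature, and an unbounded in-memory Map is assumed for simplicity):

```typescript
type EmbedFn = (text: string) => Promise<number[]>;

// Wrap an embed function with an in-memory cache so repeated inputs
// reuse the previous vector instead of calling the API again.
function withCache(embed: EmbedFn): EmbedFn {
  const cache = new Map<string, Promise<number[]>>();
  return (text: string) => {
    let hit = cache.get(text);
    if (!hit) {
      // Cache the promise itself so concurrent calls for the
      // same text share one in-flight request.
      hit = embed(text);
      cache.set(text, hit);
    }
    return hit;
  };
}
```

For production use, a bounded cache (LRU eviction, TTLs) is a safer design choice than an unbounded Map.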
Performance Tips
- Optimize request parallelism
- Use appropriate batch sizes
- Monitor token usage
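"Use appropriate batch sizes" usually means chunking a large input list before calling a batch endpoint; a minimal chunking sketch (the helper name and any particular batch size are our choices, not SDK constants):

```typescript
// Split a list of texts into fixed-size batches so each request
// stays within payload and quota limits.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("batch size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch can then be passed to a batch call such as the `embedBatch` usage shown above, optionally combined with bounded parallelism across batches.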
Response Format
{
  "embeddings": {
    "vectors": number[][],
    "dimensions": number,
    "model": string
  },
  "usage": {
    "total_tokens": number,
    "prompt_tokens": number,
    "completion_tokens": number
  },
  "metadata": {
    "project_id": string,
    "location": string,
    "model_version": string,
    "processing_time": number,
    "request_params": {
      "temperature": number,
      "topK": number,
      "topP": number,
      "n": number
    }
  },
  "status": {
    "success": boolean,
    "error": string | null
  }
}
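If this shape holds, it can be mirrored as TypeScript types with a small success check before consuming the vectors; a sketch (the interface and function names are ours, not the SDK's):

```typescript
interface EmbedResponse {
  embeddings: { vectors: number[][]; dimensions: number; model: string };
  usage: { total_tokens: number; prompt_tokens: number; completion_tokens: number };
  metadata: {
    project_id: string;
    location: string;
    model_version: string;
    processing_time: number;
    request_params: { temperature: number; topK: number; topP: number; n: number };
  };
  status: { success: boolean; error: string | null };
}

// Narrow a response to its vectors, surfacing the reported error otherwise.
function vectorsOrThrow(res: EmbedResponse): number[][] {
  if (!res.status.success) {
    throw new Error(res.status.error ?? "embedding request failed");
  }
  return res.embeddings.vectors;
}
```

Checking `status.success` before reading `embeddings` keeps error handling in one place rather than scattered across callers.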