Ollama Embeddings
Generate embeddings using Ollama's local models. Embeddings are computed entirely on your own hardware with no cloud dependencies, making this component well suited to privacy-focused and offline applications that require complete data sovereignty.

Ollama Embeddings component interface and configuration
Local Server Requirement: This component requires an Ollama server running either locally or on an accessible network. Ensure the server is running and the selected model is downloaded before using this component.
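Before adding the component to a flow, you can confirm both conditions with a quick preflight request. The sketch below assumes the default base URL and uses Ollama's /api/tags endpoint, which lists the models already pulled to the local server; the helper name is illustrative.
// Preflight check (sketch): verify the Ollama server is reachable and the
// desired model has already been pulled (ollama pull <model>)
const OLLAMA_BASE_URL = "http://localhost:11434";

async function checkOllamaModel(modelName) {
  const response = await fetch(`${OLLAMA_BASE_URL}/api/tags`);
  if (!response.ok) {
    throw new Error(`Ollama server not reachable at ${OLLAMA_BASE_URL}`);
  }
  const { models } = await response.json();
  // Model names are reported as e.g. "nomic-embed-text:latest"
  if (!models.some((m) => m.name.startsWith(modelName))) {
    throw new Error(`Model "${modelName}" not found. Run: ollama pull ${modelName}`);
  }
}

await checkOllamaModel("nomic-embed-text");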
Component Inputs
- Ollama Model: The name of the Ollama model to use for embeddings
  Example: "nomic-embed-text", "llama2", "mistral"
- Ollama Base URL: The URL where your Ollama server is running
  Example: "http://localhost:11434" (Default)
- Input Text: The text content to convert to embeddings
  Example: "This is a sample text for embedding generation."
Component Outputs
- Embeddings: Vector representation of the input text
  Example: [0.024, -0.015, 0.056, ...]
- Dimensions: Size of the embedding vector
  Example: 768, 1024, 4096 (depends on the model)
- Metadata: Additional information about the embedding process
  Example: model_name: nomic-embed-text, processing_time: 125, total_tokens: 42
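For reference, the inputs and outputs above map onto Ollama's HTTP API. The sketch below assumes the default base URL and the /api/embeddings endpoint; the request and response field names follow Ollama's API rather than the component's input names.
// Sketch: generating one embedding directly against the Ollama REST API
const response = await fetch("http://localhost:11434/api/embeddings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "nomic-embed-text",                                 // Ollama Model
    prompt: "This is a sample text for embedding generation."  // Input Text
  })
});
const { embedding } = await response.json();
console.log(embedding.slice(0, 3)); // Embeddings, e.g. [0.024, -0.015, 0.056]
console.log(embedding.length);      // Dimensions, e.g. 768 for nomic-embed-text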
Model Comparison
nomic-embed-text
Specialized embedding model optimized for text similarity and retrieval tasks
- Dimensions: 768
- Training Focus: Text embeddings for retrieval
- Performance: Excellent for semantic search
- Ideal for: RAG systems, document retrieval, and semantic search applications
llama2
General-purpose language model with embedding capabilities
- Dimensions: 4096
- Training Focus: General language understanding
- Performance: Good for general text understanding
- Ideal for: General-purpose embeddings when a specialized model is not available
mistral
Efficient language model with strong multilingual capabilities
- Dimensions: 4096
- Training Focus: Efficient language modeling
- Performance: Good balance of quality and speed
- Ideal for: Multilingual applications and general text understanding
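Because the vector size depends on the model, an index built with one model cannot be queried with embeddings from another. A small guard like the sketch below (the constant and helper are illustrative) catches accidental model mismatches before vectors are written to a store.
// Sketch: reject vectors whose dimension does not match the configured model
const EXPECTED_DIMENSIONS = { "nomic-embed-text": 768, "llama2": 4096, "mistral": 4096 };

function assertDimension(modelName, vector) {
  const expected = EXPECTED_DIMENSIONS[modelName];
  if (expected !== undefined && vector.length !== expected) {
    throw new Error(`Expected ${expected} dimensions for ${modelName}, got ${vector.length}`);
  }
}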
Implementation Example
// Basic configuration with default base URL
const embedder = new OllamaEmbeddor({
  ollamaModel: "nomic-embed-text"
});

// Custom base URL configuration
const customEmbedder = new OllamaEmbeddor({
  ollamaModel: "llama2",
  ollamaBaseUrl: "http://your-ollama-server:11434"
});

// Generate embeddings
const result = await embedder.embed({
  input: "Your text to embed"
});

// Batch processing
const batchResult = await embedder.embedBatch({
  inputs: [
    "First text to embed",
    "Second text to embed"
  ]
});

console.log(result.embeddings);
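A common next step is scoring the resulting vectors against each other with cosine similarity, for example to rank documents in a semantic search. The sketch below assumes result.embeddings and batchResult.embeddings expose plain numeric arrays (one vector per input), which may differ from the component's actual output shape.
// Sketch: cosine similarity between two embedding vectors
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const [first, second] = batchResult.embeddings; // assumed shape: one vector per input
console.log(cosineSimilarity(first, second));   // ~1 = very similar, ~0 = unrelated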
Use Cases
- Local RAG Systems: Build retrieval systems without cloud dependencies
- Privacy-Focused Applications: Generate embeddings while keeping data on-premise
- Offline Environments: Deploy in air-gapped networks or with limited connectivity
- Edge Computing: Process data locally at the edge for reduced latency
- Cost-Effective RAG: Eliminate API costs for high-volume embedding generation
Best Practices
- Use GPU acceleration when available for significantly faster processing
- Pre-download models before deployment to ensure offline capability
- Monitor system resources as local embedding generation can be resource-intensive
- Implement caching strategies to avoid redundant embedding generation (see the caching sketch after this list)
- Batch similar-length texts together for more efficient processing
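As a concrete example of the caching practice above, a minimal in-memory cache keyed by model and text might look like the sketch below; the helper and cache names are illustrative, and a production system would typically use a persistent store keyed on a hash of the text.
// Sketch: embed each unique (model, text) pair only once
const embeddingCache = new Map();

async function embedWithCache(embedder, modelName, text) {
  const key = `${modelName}:${text}`;
  if (!embeddingCache.has(key)) {
    const { embeddings } = await embedder.embed({ input: text });
    embeddingCache.set(key, embeddings);
  }
  return embeddingCache.get(key);
}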