Ollama Embeddings
Generate embeddings using Ollama's local models. Embeddings are computed entirely on your own hardware with no cloud dependencies, making this component well suited to privacy-focused and offline applications that require complete data sovereignty.

Ollama Embeddings component interface and configuration
Local Server Requirement: This component requires an Ollama server running either locally or on an accessible network. Ensure the server is running and the selected model is downloaded before using this component.
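Before adding the component to a flow, you can confirm both conditions with a quick preflight request. The sketch below assumes the default base URL and uses Ollama's /api/tags endpoint, which lists the models already pulled to the local server; the helper name is illustrative.
// Preflight check (sketch): verify the Ollama server is reachable and the
// desired model has already been pulled (ollama pull <model>)
const OLLAMA_BASE_URL = "http://localhost:11434";

async function checkOllamaModel(modelName) {
  const response = await fetch(`${OLLAMA_BASE_URL}/api/tags`);
  if (!response.ok) {
    throw new Error(`Ollama server not reachable at ${OLLAMA_BASE_URL}`);
  }
  const { models } = await response.json();
  // Model names are reported as e.g. "nomic-embed-text:latest"
  if (!models.some((m) => m.name.startsWith(modelName))) {
    throw new Error(`Model "${modelName}" not found. Run: ollama pull ${modelName}`);
  }
}

await checkOllamaModel("nomic-embed-text");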
Component Inputs
- Ollama Model: The name of the Ollama model to use for embeddings
  Example: "nomic-embed-text", "llama2", "mistral"
- Ollama Base URL: The URL where your Ollama server is running
  Example: "http://localhost:11434" (Default)
- Input Text: The text content to convert to embeddings
  Example: "This is a sample text for embedding generation."
Component Outputs
- Embeddings: Vector representation of the input text
  Example: [0.024, -0.015, 0.056, ...]
- Dimensions: Size of the embedding vector
  Example: 768, 1024, 4096 (depends on the model)
- Metadata: Additional information about the embedding process
  Example: model_name: nomic-embed-text, processing_time: 125, total_tokens: 42
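For reference, the inputs and outputs above map onto Ollama's HTTP API. The sketch below assumes the default base URL and the /api/embeddings endpoint; the request and response field names follow Ollama's API rather than the component's input names.
// Sketch: generating one embedding directly against the Ollama REST API
const response = await fetch("http://localhost:11434/api/embeddings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "nomic-embed-text",                                 // Ollama Model
    prompt: "This is a sample text for embedding generation."  // Input Text
  })
});
const { embedding } = await response.json();
console.log(embedding.slice(0, 3)); // Embeddings, e.g. [0.024, -0.015, 0.056]
console.log(embedding.length);      // Dimensions, e.g. 768 for nomic-embed-text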
Model Comparison
nomic-embed-text
Specialized embedding model optimized for text similarity and retrieval tasks
- Dimensions: 768
- Training Focus: Text embeddings for retrieval
- Performance: Excellent for semantic search
- Ideal for: RAG systems, document retrieval, and semantic search applications
llama2
General-purpose language model with embedding capabilities
- Dimensions: 4096
- Training Focus: General language understanding
- Performance: Good for general text understanding
- Ideal for: General-purpose embeddings when a specialized model is not available
mistral
Efficient language model with strong multilingual capabilities
- Dimensions: 4096
- Training Focus: Efficient language modeling
- Performance: Good balance of quality and speed
- Ideal for: Multilingual applications and general text understanding
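Because the vector size depends on the model, an index built with one model cannot be queried with embeddings from another. A small guard like the sketch below (the constant and helper are illustrative) catches accidental model mismatches before vectors are written to a store.
// Sketch: reject vectors whose dimension does not match the configured model
const EXPECTED_DIMENSIONS = { "nomic-embed-text": 768, "llama2": 4096, "mistral": 4096 };

function assertDimension(modelName, vector) {
  const expected = EXPECTED_DIMENSIONS[modelName];
  if (expected !== undefined && vector.length !== expected) {
    throw new Error(`Expected ${expected} dimensions for ${modelName}, got ${vector.length}`);
  }
}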
Implementation Example
// Basic configuration with default base URL
const embedder = new OllamaEmbeddor({
  ollamaModel: "nomic-embed-text"
});

// Custom base URL configuration
const customEmbedder = new OllamaEmbeddor({
  ollamaModel: "llama2",
  ollamaBaseUrl: "http://your-ollama-server:11434"
});

// Generate embeddings
const result = await embedder.embed({
  input: "Your text to embed"
});

// Batch processing
const batchResult = await embedder.embedBatch({
  inputs: [
    "First text to embed",
    "Second text to embed"
  ]
});

console.log(result.embeddings);
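A common next step is scoring the resulting vectors against each other with cosine similarity, for example to rank documents in a semantic search. The sketch below assumes result.embeddings and batchResult.embeddings expose plain numeric arrays (one vector per input), which may differ from the component's actual output shape.
// Sketch: cosine similarity between two embedding vectors
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const [first, second] = batchResult.embeddings; // assumed shape: one vector per input
console.log(cosineSimilarity(first, second));   // ~1 = very similar, ~0 = unrelated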
Use Cases
- Local RAG Systems: Build retrieval systems without cloud dependencies
- Privacy-Focused Applications: Generate embeddings while keeping data on-premise
- Offline Environments: Deploy in air-gapped networks or with limited connectivity
- Edge Computing: Process data locally at the edge for reduced latency
- Cost-Effective RAG: Eliminate API costs for high-volume embedding generation
Best Practices
- Use GPU acceleration when available for significantly faster processing
- Pre-download models before deployment to ensure offline capability
- Monitor system resources as local embedding generation can be resource-intensive
- Implement caching strategies to avoid redundant embedding generation (see the caching sketch after this list)
- Batch similar-length texts together for more efficient processing
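As a concrete example of the caching practice above, a minimal in-memory cache keyed by model and text might look like the sketch below; the helper and cache names are illustrative, and a production system would typically use a persistent store keyed on a hash of the text.
// Sketch: embed each unique (model, text) pair only once
const embeddingCache = new Map();

async function embedWithCache(embedder, modelName, text) {
  const key = `${modelName}:${text}`;
  if (!embeddingCache.has(key)) {
    const { embeddings } = await embedder.embed({ input: text });
    embeddingCache.set(key, embeddings);
  }
  return embeddingCache.get(key);
}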