Ollama Embeddings

Generate embeddings using Ollama's local models. Embeddings are computed entirely on your own hardware with no cloud dependencies, making this component well suited to privacy-focused and offline applications that require complete data sovereignty.

Ollama Embeddings Component

[Image: Ollama Embeddings component interface and configuration]

Local Server Requirement: This component requires an Ollama server running either locally or on an accessible network. Ensure the server is running and the selected model is downloaded before using this component.
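As a quick preflight check, you can confirm both conditions by querying the server's /api/tags endpoint, which lists the models downloaded on that Ollama instance. The snippet below is a minimal sketch assuming the default base URL; adjust the URL and model name for your setup.

// Preflight: verify the Ollama server is reachable and the model is downloaded.
const baseUrl = "http://localhost:11434"; // default Ollama address
const modelName = "nomic-embed-text";     // example model

const response = await fetch(`${baseUrl}/api/tags`); // lists locally downloaded models
if (!response.ok) {
  throw new Error(`Ollama server not reachable at ${baseUrl}`);
}

const { models } = await response.json();
if (!models.some((m) => m.name.startsWith(modelName))) {
  console.warn(`Model "${modelName}" not found; run: ollama pull ${modelName}`);
}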

Component Inputs

  • Ollama Model: The name of the Ollama model to use for embeddings

    Example: "nomic-embed-text", "llama2", "mistral"

  • Ollama Base URL: The URL where your Ollama server is running

    Example: "http://localhost:11434" (Default)

  • Input Text: The text content to convert to embeddings

    Example: "This is a sample text for embedding generation."

Component Outputs

  • Embeddings: Vector representation of the input text (see the sketch after this list)

    Example: [0.024, -0.015, 0.056, ...]

  • Dimensions: Size of the embedding vector

    Example: 768, 1024, 4096 (depends on the model)

  • Metadata: Additional information about the embedding process

    Example: model_name: nomic-embed-text, processing_time: 125 ms, total_tokens: 42
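To see how these outputs fit together, the sketch below calls Ollama's REST /api/embeddings endpoint directly: the returned vector is the embedding, its length is the dimension count, and the metadata here is assembled client-side for illustration (the field names mirror the example above, not a fixed API contract).

// Request an embedding over Ollama's REST API and inspect the outputs.
const baseUrl = "http://localhost:11434";
const start = Date.now();

const response = await fetch(`${baseUrl}/api/embeddings`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "nomic-embed-text",
    prompt: "This is a sample text for embedding generation.",
  }),
});
const { embedding } = await response.json();

console.log(embedding.slice(0, 3)); // Embeddings, e.g. [0.024, -0.015, 0.056]
console.log(embedding.length);      // Dimensions, e.g. 768 for nomic-embed-text
console.log({
  model_name: "nomic-embed-text",
  processing_time: Date.now() - start, // ms, measured client-side
});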

Model Comparison

nomic-embed-text

Specialized embedding model optimized for text similarity and retrieval tasks

  • Dimensions: 768
  • Training Focus: Text embeddings for retrieval
  • Performance: Excellent for semantic search
  • Ideal for: RAG systems, document retrieval, and semantic search applications

llama2

General-purpose language model with embedding capabilities

  • Dimensions: 4096
  • Training Focus: General language understanding
  • Performance: Good for general text understanding
  • Ideal for: General-purpose embeddings when a specialized model is not available

mistral

Efficient language model with strong multilingual capabilities

  • Dimensions: 4096
  • Training Focus: Efficient language modeling
  • Performance: Good balance of quality and speed
  • Ideal for: Multilingual applications and general text understanding

Implementation Example

// Basic configuration with default base URL
const embedder = new OllamaEmbeddor({
  ollamaModel: "nomic-embed-text"
});

// Custom base URL configuration
const customEmbedder = new OllamaEmbeddor({
  ollamaModel: "llama2",
  ollamaBaseUrl: "http://your-ollama-server:11434"
});

// Generate embeddings
const result = await embedder.embed({
  input: "Your text to embed"
});

// Batch processing
const batchResult = await embedder.embedBatch({
  inputs: [
    "First text to embed",
    "Second text to embed"
  ]
});

console.log(result.embeddings);
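With embeddings in hand, semantic search comes down to comparing vectors. The helper below is a generic cosine-similarity sketch, not part of the component's API; it assumes result.embeddings is a single vector and batchResult.embeddings is an array of vectors, matching the output examples above.

// Cosine similarity between two embedding vectors (hypothetical helper).
// Returns a value in [-1, 1]; higher means more semantically similar.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank the batch results against the single-text embedding from above.
const [first, second] = batchResult.embeddings;
console.log(cosineSimilarity(result.embeddings, first));
console.log(cosineSimilarity(result.embeddings, second));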

Use Cases

  • Local RAG Systems: Build retrieval systems without cloud dependencies
  • Privacy-Focused Applications: Generate embeddings while keeping data on-premise
  • Offline Environments: Deploy in air-gapped networks or with limited connectivity
  • Edge Computing: Process data locally at the edge for reduced latency
  • Cost-Effective RAG: Eliminate API costs for high-volume embedding generation

Best Practices

  • Use GPU acceleration when available for significantly faster processing
  • Pre-download models before deployment to ensure offline capability
  • Monitor system resources as local embedding generation can be resource-intensive
  • Implement caching strategies to avoid redundant embedding generation (see the sketch after this list)
  • Batch similar-length texts together for more efficient processing
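As one way to apply the caching advice above, the sketch below wraps the embedder in an in-memory Map keyed by input text, so repeated inputs reuse the stored vector instead of hitting the model again. This is a hypothetical illustration built on the embed call from the Implementation Example; a production cache would typically be persisted and size-bounded.

// In-memory embedding cache (hypothetical sketch).
const cache = new Map();

async function embedWithCache(embedder, text) {
  if (cache.has(text)) {
    return cache.get(text); // cache hit: skip the model call entirely
  }
  const result = await embedder.embed({ input: text });
  cache.set(text, result.embeddings);
  return result.embeddings;
}

// The second call with identical text is served from the cache.
const v1 = await embedWithCache(embedder, "Your text to embed");
const v2 = await embedWithCache(embedder, "Your text to embed");
console.log(v1 === v2); // true: the cached array instance is reused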