Documentation

HuggingFace Embeddings

Generate embeddings using HuggingFace's extensive collection of open-source models. Supports both hosted inference endpoints and local model deployment for maximum flexibility.

HuggingFace Embeddings Component

HuggingFace Embeddings component interface and configuration

API Key Notice: When using HuggingFace's hosted inference API, you'll need a valid API token with appropriate permissions. Some models have rate limits that may affect high-volume embedding generation.

Component Inputs

  • API Key: Your HuggingFace API token for authentication

    Example: "hf_abcdefghijklmnopqrstuvwxyz"

  • Model Name: The identifier of the embedding model to use

    Examples: "sentence-transformers/all-mpnet-base-v2", "BAAI/bge-large-en-v1.5"

  • Inference Endpoint: Optional custom endpoint for hosted models

    Example: "https://your-endpoint.huggingface.cloud"

  • Input Text: The text to convert to embeddings

    Example: "This is a sample text for embedding generation."

Component Outputs

  • Embeddings: Vector representation of the input text

    Example: [0.018, -0.032, 0.067, ...]

  • Dimensions: Size of the embedding vector

    Example: 384 for all-MiniLM-L6-v2, 768 for all-mpnet-base-v2, 1024 for bge-large-en

  • Model Info: Details about the model used

    Example: { name: "sentence-transformers/all-mpnet-base-v2", version: "1.0.0", type: "huggingface" }
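Embedding vectors like the ones above are typically compared with cosine similarity to measure how semantically close two texts are. A minimal sketch (the vectors below are illustrative values, not real model output):

```javascript
// Cosine similarity between two embedding vectors:
// dot(a, b) / (|a| * |b|), ranging from -1 to 1.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("Vectors must have equal dimensions");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Illustrative vectors; real embeddings have 384-1024 dimensions.
const v1 = [0.018, -0.032, 0.067];
const v2 = [0.020, -0.030, 0.065];
console.log(cosineSimilarity(v1, v2)); // close to 1.0 for near-identical directions
```

Higher scores indicate closer semantic meaning; a threshold (often 0.7-0.8, tuned per model) decides what counts as a match.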

Model Comparison

sentence-transformers/all-mpnet-base-v2

High-quality general-purpose embedding model with strong performance across many tasks

  • Dimensions: 768
  • Performance: Excellent semantic understanding
  • Size: 420MB
  • Language Support: English-focused
  • Ideal for: High-quality general-purpose embeddings

sentence-transformers/all-MiniLM-L6-v2

Lightweight and efficient model with good performance-to-size ratio

  • Dimensions: 384
  • Performance: Good semantic understanding
  • Size: 80MB
  • Language Support: English-focused
  • Ideal for: Resource-constrained environments or applications with speed requirements

BAAI/bge-large-en-v1.5

State-of-the-art embedding model specifically optimized for retrieval tasks

  • Dimensions: 1024
  • Performance: Superior for retrieval and similarity
  • Size: 1.3GB
  • Language Support: English
  • Ideal for: Retrieval-augmented generation and semantic search applications

Implementation Example

// Using the hosted inference API
const embedder = new HuggingFaceEmbeddor({
  apiKey: process.env.HUGGINGFACE_API_KEY,
  modelName: "sentence-transformers/all-mpnet-base-v2"
});

// Using a custom inference endpoint
const customEmbedder = new HuggingFaceEmbeddor({
  apiKey: process.env.HUGGINGFACE_API_KEY,
  modelName: "BAAI/bge-large-en-v1.5",
  inferenceEndpoint: "https://your-endpoint.huggingface.cloud"
});

// Generate embeddings for a single input
const result = await embedder.embed({
  input: "Your text to embed"
});

// Batch processing for multiple inputs
const batchResult = await embedder.embedBatch({
  inputs: [
    "First text to embed",
    "Second text to embed"
  ]
});

console.log(result.embeddings);
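Because hosted models enforce rate limits, embedding calls can intermittently fail with 429 responses. One way to handle this is a retry wrapper with exponential backoff; the sketch below is a generic helper (the `withRetry` name and parameters are hypothetical, not part of the component's API):

```javascript
// Retry an async operation with exponential backoff.
// Hypothetical helper for wrapping rate-limited embedding calls.
async function withRetry(fn, maxAttempts = 3, baseDelayMs = 500) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === maxAttempts) throw err; // out of attempts: surface the error
      const delay = baseDelayMs * 2 ** (attempt - 1); // 500ms, 1s, 2s, ...
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Usage: wrap the embed call so transient 429s are retried.
// const result = await withRetry(() => embedder.embed({ input: "Your text to embed" }));
```

In production you would typically retry only on retryable status codes (429, 5xx) and honor any Retry-After header the API returns.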

Use Cases

  • Open Source Deployments: Use high-quality embeddings with open source models
  • Domain-Specific Applications: Select from specialized models for particular domains
  • Resource-Constrained Environments: Choose smaller models for edge devices or limited compute
  • On-premise RAG Systems: Deploy HuggingFace models entirely within your infrastructure
  • Multi-language Support: Select from models trained on multiple languages

Best Practices

  • Choose appropriate model size based on your performance needs and resource constraints
  • Consider using Inference Endpoints for production deployments with high reliability requirements
  • Implement caching for frequently used embeddings to reduce API calls
  • Monitor API usage limits to avoid rate limiting in high-traffic applications
  • Test different models to find the best performing one for your specific use case
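The caching recommendation above can be as simple as an in-memory map keyed by input text. A minimal sketch (the `CachedEmbedder` wrapper and the embed-function interface are hypothetical; a production cache would also bound its size and key by model name):

```javascript
// In-memory cache wrapper around any async embed function.
// Repeated inputs are served from the cache instead of re-calling the API.
class CachedEmbedder {
  constructor(embedFn) {
    this.embedFn = embedFn; // async (text) => embedding vector
    this.cache = new Map();
  }

  async embed(text) {
    if (this.cache.has(text)) {
      return this.cache.get(text); // cache hit: no API call
    }
    const embedding = await this.embedFn(text);
    this.cache.set(text, embedding);
    return embedding;
  }
}

// Usage: wrap the component's embed call.
// const cached = new CachedEmbedder((text) => embedder.embed({ input: text }));
// const vector = await cached.embed("frequently used text");
```

For high-traffic deployments, an external store such as Redis with a TTL is a common step up from a per-process map.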